v1.0 — Initial Release

Released: June 2026

CrawlPilot v1.0 is the first public release of the extension. It includes the full core toolset for no-code web data extraction.

What's New

3-step visual wizard: pick container → pick item → review schema
Auto-detects column types: Title, Price, Image URL, Link, Text
CSS Selector and XPath support with inline editor
Two auto-scroll strategies:
- Mutation-Aware (default): uses MutationObserver to detect new DOM nodes after each scroll; best for dynamic feeds such as Twitter and LinkedIn
- Indexed: iterates over children by index; best for static paginated lists
Pagination support: Next Button, Load More Button
Configurable max pages and scroll speed
Live item count during extraction
In-memory and database-level deduplication
Resume capability for interrupted extractions
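
The in-memory half of deduplication can be pictured as a set of keys built from each extracted row. The sketch below is illustrative only: the `Row` type, the `rowKey` helper, and the choice to key on every column are assumptions, not CrawlPilot's actual implementation.

```typescript
// Hypothetical row shape: column name -> extracted value.
type Row = Record<string, string>;

// Build a stable key from a row. Column names are sorted first so that
// field order does not affect equality; values are trimmed.
function rowKey(row: Row): string {
  const entries = Object.keys(row)
    .sort()
    .map((name) => [name, (row[name] ?? "").trim()]);
  return JSON.stringify(entries);
}

// Keeps the first occurrence of each row. Passing the same `seen` set
// across pages (or across a resumed run) drops repeats everywhere.
function dedupe(rows: Row[], seen: Set<string> = new Set()): Row[] {
  const unique: Row[] = [];
  for (const row of rows) {
    const key = rowKey(row);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(row);
    }
  }
  return unique;
}
```

Sharing one `seen` set between calls is what lets a second page, or a resumed extraction, skip items that were already collected.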

Bulk multi-URL extraction with configurable concurrency (up to 10 parallel tabs)
Per-URL status tracking: Queued → Extracting → Done / Error
Automatic tab creation and cleanup
Schema definition with Pick on Page for each field
Click actions (for cookie banners, expanders, popups)
Per-URL error logging with retry capability
Unified schema across all processed pages
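
As a rough illustration of how a concurrency cap and per-URL status fit together, the sketch below picks which queued URLs may start next. The status names and the limit of 10 come from the list above; the `Job` type, `nextToStart`, `retryFailed`, and the idea that a retry moves a URL back to Queued are assumptions made for the example.

```typescript
// Status values as listed in the release notes.
type Status = "Queued" | "Extracting" | "Done" | "Error";

// Hypothetical per-URL record.
interface Job {
  url: string;
  status: Status;
}

// Upper bound on parallel tabs stated in the notes.
const MAX_PARALLEL_TABS = 10;

// Returns the queued jobs that may start now without exceeding the cap.
function nextToStart(jobs: Job[], concurrency: number): Job[] {
  const limit = Math.min(Math.max(1, Math.floor(concurrency)), MAX_PARALLEL_TABS);
  const running = jobs.filter((j) => j.status === "Extracting").length;
  const free = Math.max(0, limit - running);
  return jobs.filter((j) => j.status === "Queued").slice(0, free);
}

// Assumed retry behaviour: failed jobs go back to the queue.
function retryFailed(jobs: Job[]): void {
  for (const job of jobs) {
    if (job.status === "Error") job.status = "Queued";
  }
}
```

A caller would mark each returned job as Extracting, open its tab, and call `nextToStart` again whenever a job reaches Done or Error.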

JSON-LD structured data parsing
Open Graph tag extraction (og:title, og:image, og:description, og:url, og:type)
Twitter Card extraction
Standard meta tag extraction (title, description, author, canonical, robots)
HTML content extraction converted to Markdown via Turndown
Table extraction as structured data
Link enumeration (href + anchor text)
Configurable request delay, timeout, and concurrency
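
To make the Open Graph item concrete, here is a minimal sketch that collects `og:*` meta tags from an HTML string. It is not CrawlPilot's code: `extractOpenGraph` and `readAttribute` are hypothetical names, and the regex approach is chosen only to keep the example self-contained. Inside a page, `document.querySelectorAll('meta[property^="og:"]')` would be the more robust route.

```typescript
// Reads one quoted attribute value from a single tag string.
function readAttribute(tag: string, name: string): string | undefined {
  const pattern = new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, "i");
  const match = tag.match(pattern);
  return match ? (match[1] ?? match[2]) : undefined;
}

// Collects og:* meta tags into a plain object; the first value wins.
// Simplification: assumes attribute values are quoted and contain no ">".
function extractOpenGraph(html: string): Record<string, string> {
  const result: Record<string, string> = {};
  const metaTags = html.match(/<meta\b[^>]*>/gi) ?? [];
  for (const tag of metaTags) {
    const property = readAttribute(tag, "property");
    const content = readAttribute(tag, "content");
    if (
      property !== undefined &&
      property.startsWith("og:") &&
      content !== undefined &&
      !(property in result)
    ) {
      result[property] = content;
    }
  }
  return result;
}
```

The same loop could be pointed at `name="twitter:*"` attributes to cover Twitter Cards, again as an assumption about how such tags are usually read rather than a description of the extension.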

Detects: <img> tags (including lazy-loaded), <picture>/<source>, CSS background images, video poster frames, canvas (converted to PNG), SVG
Dimension filtering: Small / Medium / Large
Individual or bulk selection
ZIP export via JSZip
CORS-aware fetching via background extension proxy
Supported formats: JPG, PNG, GIF, SVG, WebP, AVIF

Right-Click Unlocker: Removes JavaScript-based right-click blocking on any page

Language selector: English, Spanish, French, German, Italian, Chinese (Simplified), Japanese, Russian, Portuguese
Storage usage display
Data retention policy (auto-delete data older than N days)
Manual clearing of old data and full database wipe
Integrations UI: Webhook, Airtable, Google Sheets (configuration is saved; outbound sending is planned for v1.1)
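
The retention policy (auto-delete data older than N days) boils down to a cutoff comparison. The sketch below shows that arithmetic under stated assumptions: `StoredRecord`, its `savedAt` field, and `applyRetention` are hypothetical and say nothing about how the extension actually stores data.

```typescript
// Hypothetical stored record; only the timestamp matters here.
interface StoredRecord {
  id: string;
  savedAt: number; // epoch milliseconds
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Splits records into those to keep and those older than `retentionDays`.
function applyRetention(
  records: StoredRecord[],
  retentionDays: number,
  now: number = Date.now()
): { keep: StoredRecord[]; expired: StoredRecord[] } {
  const cutoff = now - retentionDays * MS_PER_DAY;
  const keep = records.filter((r) => r.savedAt >= cutoff);
  const expired = records.filter((r) => r.savedAt < cutoff);
  return { keep, expired };
}
```

Passing `now` as a parameter keeps the function deterministic, which makes the cutoff easy to check in isolation.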

Known Limitations

Scheduled runs: Automated recurring extractions are not yet supported. Use History → Re-run manually.
Integration outbound sending: Webhook, Airtable, and Google Sheets data transmission is not yet active. Credentials you enter are saved for use in v1.1.
Email Extractor: Marked "Coming Soon" — not available in v1.0.
Safari / Firefox: Not supported. Chrome 114+ only.
Cross-origin iframes: Content inside sandboxed iframes on different domains cannot be extracted.
JavaScript SPAs: Page Extractor works best on server-rendered or static pages. Complex SPAs that require multi-step interaction beyond simple clicks may produce incomplete results.

v1.0 is the first release — no migration from a previous version is required.