Metadata Extractor
The Metadata Extractor pulls structured data from a list of URLs in a single job: Open Graph tags, JSON-LD structured data, Twitter Cards, standard meta tags, article text, links, images, and tables.
When to Use It
- Auditing SEO metadata across 100+ pages of your own site
- Researching how competitors structure their Open Graph data
- Pulling clean article text from a reading list
- Collecting schema.org structured data from product pages
- Building a dataset of page titles, descriptions, and canonical URLs
Step 1 — Enter URLs
Paste your list of URLs, one per line:
https://yoursite.com/blog/post-1
https://yoursite.com/blog/post-2
https://yoursite.com/products/widget
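If your URL list comes from a sitemap export or a spreadsheet, it may contain blank lines, duplicates, or stray text. The sketch below is one possible way to tidy such a list before pasting it in. The function name `clean_url_list` is hypothetical and not part of the tool; whether the tool does similar cleanup itself is not stated here.

```python
from urllib.parse import urlparse


def clean_url_list(raw: str) -> list[str]:
    """Return unique http(s) URLs from newline-separated text, keeping first-seen order."""
    seen = set()
    urls = []
    for line in raw.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        parsed = urlparse(url)
        # Keep only absolute http/https URLs that have a host part.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    pasted = """
    https://yoursite.com/blog/post-1
    https://yoursite.com/blog/post-1
    not a url
    https://yoursite.com/products/widget
    """
    print("\n".join(clean_url_list(pasted)))
```

The output is one URL per line, ready to paste into the input box.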
Step 2 — Choose What to Extract
Toggle the data types you want:
| Option | What it pulls |
|---|---|
| Metadata | Title, description, author, canonical URL, robots, Open Graph (og:title, og:image, og:description), Twitter Card, JSON-LD structured data |
| Text Content | Main page text, converted to clean Markdown (removes nav, ads, footers) |
| Links | All hyperlinks found on the page (href + anchor text) |
| Images | All image URLs and their alt text |
| Tables | HTML tables as structured row/column data |
You can enable multiple options in a single run.
Step 3 — Configure Options
| Setting | Default | Notes |
|---|---|---|
| Request delay | 500ms | Milliseconds between requests. Use 500–2000ms for polite crawling on sites you don't own |
| Load timeout | 5s | Maximum time to wait for each page to load before extraction |
| Concurrent requests | 3 | Parallel page loads |
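To make the interaction between these settings concrete, here is a rough sketch of a fetch loop with a delay between request starts and a cap on parallel loads. It is an illustration only, not the tool's implementation: `fetch_all` and the injected `fetch` callable are hypothetical names, and it assumes the delay applies between the start of consecutive requests. The load timeout is left to whatever `fetch` does internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, delay_ms=500, concurrency=3):
    """Call fetch(url) for each URL, spacing request starts and limiting parallelism."""
    results = {}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {}
        for index, url in enumerate(urls):
            if index:
                # Assumed behaviour: wait between the start of consecutive requests.
                time.sleep(delay_ms / 1000)
            futures[url] = pool.submit(fetch, url)
        for url, future in futures.items():
            results[url] = future.result()
    return results


if __name__ == "__main__":
    # Stand-in fetch function so the sketch runs without network access.
    pages = fetch_all(
        ["https://yoursite.com/blog/post-1", "https://yoursite.com/blog/post-2"],
        fetch=lambda url: f"<html>{url}</html>",
        delay_ms=500,
    )
    print(len(pages))
```

Under this model, a longer delay reduces load on the target site at the cost of a slower job, which is why the higher end of the 500–2000ms range suits sites you don't own.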
Step 4 — Run and Export
Click Start. Results appear as pages complete. When done, click Export CSV.
What Each Metadata Field Contains
Open Graph Tags
Fields like og:title, og:description, og:image, og:url, og:type — used by social networks for link previews.
JSON-LD Structured Data
Schema.org markup embedded in <script type="application/ld+json"> blocks. Common types: Article, Product, BreadcrumbList, FAQPage, Organization.
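For readers who want to see what this extraction involves, the following sketch collects JSON-LD blocks from an HTML string using only the Python standard library. It is a minimal illustration, not the tool's code; `JsonLdParser` and `extract_jsonld` are hypothetical names, and blocks that fail to parse as JSON are simply skipped.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed blocks
            # A block may hold a single object or a list of objects.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_jsonld(html: str) -> list:
    parser = JsonLdParser()
    parser.feed(html)
    return parser.items


if __name__ == "__main__":
    sample = (
        '<html><head><script type="application/ld+json">'
        '{"@context": "https://schema.org", "@type": "Article", "headline": "Post 1"}'
        "</script></head></html>"
    )
    for item in extract_jsonld(sample):
        print(item.get("@type"), item.get("headline"))
```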
Twitter Card
twitter:card, twitter:title, twitter:description, twitter:image — used by Twitter/X for link previews.
Standard Meta Tags
<title>, <meta name="description">, <meta name="author">, <link rel="canonical">, <meta name="robots">.
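The Open Graph, Twitter Card, and standard fields described above all live in `<title>`, `<meta>`, and `<link>` tags, so one pass over the page head can gather them. The sketch below shows one way to do that with the Python standard library. `MetaTagParser` and `extract_meta` are hypothetical names, and keeping the first occurrence of a repeated tag is an assumption of this sketch, not documented tool behavior.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <title>, <meta name/property>, and <link rel="canonical"> values."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        a = {key: (value or "") for key, value in attrs}
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses property="og:..."; most other tags use name="...".
            key = a.get("property") or a.get("name")
            if key and "content" in a:
                self.fields.setdefault(key.lower(), a["content"])
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.fields.setdefault("canonical", a.get("href", ""))

    def handle_data(self, data):
        if self._in_title:
            self.fields["title"] = self.fields.get("title", "") + data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def extract_meta(html: str) -> dict:
    parser = MetaTagParser()
    parser.feed(html)
    fields = dict(parser.fields)
    if "title" in fields:
        fields["title"] = fields["title"].strip()
    return fields


if __name__ == "__main__":
    sample = (
        "<head><title>Widget</title>"
        '<meta name="description" content="A useful widget.">'
        '<meta property="og:title" content="Widget">'
        '<meta name="twitter:card" content="summary">'
        '<link rel="canonical" href="https://yoursite.com/products/widget">'
        "</head>"
    )
    for key, value in extract_meta(sample).items():
        print(f"{key}: {value}")
```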
Example: SEO Audit of 50 Blog Posts
Goal: Check title length, meta description, and canonical URL for 50 blog posts.
1. Export your sitemap or URL list to a text file (one URL per line).
2. Open Metadata Extractor, paste all 50 URLs.
3. Enable Metadata only (no need for text or links).
4. Set request delay to 500ms.
5. Click Start. The run completes in 30–60 seconds.
6. Export CSV → open in Google Sheets.
7. Add a formula column: =LEN(B2) on the title column to flag titles over 60 characters.
8. Filter rows where description is blank to find missing meta descriptions.
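If you would rather script the two checks from this example than use spreadsheet formulas, something like the following could work on the exported CSV. The column names `url`, `title`, and `description` are assumptions; check the header row of your actual export and adjust the arguments to match. `audit_rows` is a hypothetical helper, not part of the tool.

```python
import csv
import io


def audit_rows(csv_text, title_col="title", desc_col="description", max_title_len=60):
    """Return (url, issue) pairs for overlong titles and blank descriptions."""
    issues = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = row.get("url", "")  # assumed column name
        title = (row.get(title_col) or "").strip()
        description = (row.get(desc_col) or "").strip()
        if len(title) > max_title_len:
            issues.append((url, "title too long"))
        if not description:
            issues.append((url, "missing description"))
    return issues


if __name__ == "__main__":
    # Inline sample standing in for the exported file.
    sample = (
        "url,title,description\n"
        "https://yoursite.com/blog/post-1,Short title,A summary\n"
        "https://yoursite.com/blog/post-2,Another short title,\n"
    )
    for url, issue in audit_rows(sample):
        print(url, issue)
```

To run it on a real export, read the file contents with `open(path, newline="", encoding="utf-8").read()` and pass the text to `audit_rows`.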