Metadata Extractor

The Metadata Extractor pulls structured data from a list of URLs in one job: Open Graph tags, JSON-LD structured data, Twitter Cards, and standard meta tags, plus article text, links, images, and tables.

When to Use It

Auditing SEO metadata across 100+ pages of your own site
Researching how competitors structure their Open Graph data
Pulling clean article text from a reading list
Collecting schema.org structured data from product pages
Building a dataset of page titles, descriptions, and canonical URLs

Step 1 — Enter URLs

Paste your list of URLs, one per line:

https://yoursite.com/blog/post-1
https://yoursite.com/blog/post-2
https://yoursite.com/products/widget
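
If your URLs live in an XML sitemap, a short script can produce a list in this one-per-line format. The sketch below is only an illustration, not part of the tool: the function name urls_from_sitemap and the sample sitemap are hypothetical, and it assumes a plain urlset sitemap that uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical sample sitemap; the duplicate entry is deliberate.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/blog/post-1</loc></url>
  <url><loc>https://yoursite.com/blog/post-2</loc></url>
  <url><loc>https://yoursite.com/blog/post-1</loc></url>
  <url><loc> https://yoursite.com/products/widget </loc></url>
</urlset>"""


def urls_from_sitemap(xml_text):
    """Return the <loc> values from a sitemap, deduplicated, in document order."""
    root = ET.fromstring(xml_text)
    seen = set()
    urls = []
    for loc in root.iter(f"{SITEMAP_NS}loc"):
        url = (loc.text or "").strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Print one URL per line, ready to paste into the URL box.
    print("\n".join(urls_from_sitemap(SAMPLE_SITEMAP)))
```

In practice you would read the sitemap from a file or download it first; the sample string only keeps the sketch self-contained.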

Step 2 — Choose What to Extract

Toggle the data types you want:

Option	What it pulls
Metadata	Title, description, author, canonical URL, robots, Open Graph (og:title, og:image, og:description), Twitter Card, JSON-LD structured data
Text Content	Main page text, converted to clean Markdown (removes nav, ads, footers)
Links	All hyperlinks found on the page (href + anchor text)
Images	All image URLs and their alt text
Tables	HTML tables as structured row/column data

You can enable multiple options in a single run.

Step 3 — Configure Options

Setting	Default	Notes
Request delay	500ms	Milliseconds between requests. Use 500–2000ms for polite crawling on sites you don't own
Load timeout	5s	Maximum time to wait for each page to load before extraction runs
Concurrent requests	3	Parallel page loads

Pro Tip

On sites you own, you can set Request delay to 0 and Concurrent requests to 10 for maximum speed. On third-party sites, keep a delay of 500–2000ms to avoid being rate-limited.
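
To get a feel for how these settings trade off, here is a rough back-of-the-envelope model. It is an assumption for illustration only and does not describe how the tool actually schedules requests: it treats a job as batches of concurrent page loads, each followed by the request delay. The function name estimate_seconds and the 2-second average load time are hypothetical.

```python
import math


def estimate_seconds(n_urls, delay_ms, concurrency, avg_load_s):
    """Rough job duration: batches of `concurrency` pages, each batch costing
    one average page load plus one request delay. Illustrative model only."""
    batches = math.ceil(n_urls / concurrency)
    return batches * (delay_ms / 1000 + avg_load_s)


if __name__ == "__main__":
    # Default settings (500ms delay, 3 concurrent), assuming ~2s per page load.
    print(estimate_seconds(50, 500, 3, 2.0))
    # "Own site" settings from the tip (0ms delay, 10 concurrent).
    print(estimate_seconds(50, 0, 10, 2.0))
```

Under these assumptions, 50 URLs take about 42 seconds with the defaults and about 10 seconds with the faster settings. Real timings depend on how quickly each page responds.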

Step 4 — Run and Export

Click Start. Results appear as each page completes. When the run finishes, click Export CSV.

What Each Metadata Field Contains

Open Graph Tags

Fields like og:title, og:description, og:image, og:url, og:type — used by social networks for link previews.

JSON-LD Structured Data

Schema.org markup embedded in <script type="application/ld+json"> blocks. Common types: Article, Product, BreadcrumbList, FAQPage, Organization.
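
As an illustration of what this markup looks like and how it can be read, the sketch below collects JSON-LD blocks using only the Python standard library. It is not the tool's own implementation; the class name JsonLdParser, the function extract_json_ld, and the sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser

# Hypothetical page fragment with two JSON-LD blocks.
SAMPLE_HTML = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Widget"}
</script>
<script type="application/ld+json">
[{"@type": "BreadcrumbList"}, {"@type": "Organization", "name": "Your Site"}]
</script>
"""


class JsonLdParser(HTMLParser):
    """Collects the parsed contents of application/ld+json script blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buffer))
            except json.JSONDecodeError:
                return  # skip malformed blocks instead of failing the page
            # A block may hold a single object or a list of objects.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_json_ld(html):
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    return parser.items


if __name__ == "__main__":
    for item in extract_json_ld(SAMPLE_HTML):
        print(item.get("@type"), item)
```

Each collected item carries an @type value such as Product or Organization, which is usually the first thing to check when auditing structured data.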

Twitter Card

twitter:card, twitter:title, twitter:description, twitter:image — used by Twitter/X for link previews.

Standard Meta Tags

<title>, <meta name="description">, <meta name="author">, <link rel="canonical">, <meta name="robots">.
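
The tag-based fields above (Open Graph, Twitter Card, and standard meta tags) all sit in the page head and can be read in a similar way. The following is a minimal sketch using the Python standard library, not a description of how the tool works internally; MetaTagParser, extract_meta, and the sample HTML are hypothetical names and data.

```python
from html.parser import HTMLParser

# Hypothetical page head mixing standard, Open Graph, and Twitter Card tags.
SAMPLE_HTML = """
<html><head>
<title>Widget | Your Site</title>
<meta name="description" content="A very good widget.">
<meta name="robots" content="index, follow">
<link rel="canonical" href="https://yoursite.com/products/widget">
<meta property="og:title" content="Widget">
<meta name="twitter:card" content="summary_large_image">
</head></html>
"""


class MetaTagParser(HTMLParser):
    """Collects <title>, <meta> name/property pairs, and the canonical link."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attr.get("content") is not None:
            # Open Graph tags use property=..., most other tags use name=...
            key = attr.get("property") or attr.get("name")
            if key:
                # Keep the first occurrence if a tag is repeated.
                self.fields.setdefault(key.lower(), attr["content"])
        elif tag == "link" and "canonical" in (attr.get("rel") or "").lower().split():
            self.fields.setdefault("canonical", attr.get("href"))

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def extract_meta(html):
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    fields = dict(parser.fields)
    if parser._title_parts:
        fields["title"] = "".join(parser._title_parts).strip()
    return fields


if __name__ == "__main__":
    for key, value in extract_meta(SAMPLE_HTML).items():
        print(f"{key}: {value}")
```

Keys that start with og: or twitter: correspond to the Open Graph and Twitter Card groups; the rest are standard meta tags.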

Example: SEO Audit of 50 Blog Posts

Goal: Check title length, meta description, and canonical URL for 50 blog posts.

1. Export your sitemap or URL list to a text file (one URL per line).
2. Open Metadata Extractor, paste all 50 URLs.
3. Enable Metadata only (no need for text or links).
4. Set request delay to 500ms.
5. Click Start. The run completes in 30–60 seconds.
6. Export CSV → open in Google Sheets.
7. Add a formula column: =LEN(B2) on the title column to flag titles over 60 characters.
8. Filter rows where description is blank to find missing meta descriptions.
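
If you would rather script the last two checks than use a spreadsheet, they can be sketched in Python. This is an illustration under assumptions: the column names url, title, and description are hypothetical and may not match the headers in your exported CSV, and audit_rows is not part of the tool.

```python
import csv
import io

# Hypothetical export; real column names may differ.
SAMPLE_CSV = (
    "url,title,description\n"
    "https://yoursite.com/blog/post-1,Short title,A description.\n"
    f"https://yoursite.com/blog/post-2,{'T' * 70},\n"
)


def audit_rows(csv_text, max_title_len=60):
    """Return (url, issue) pairs for over-long titles and blank descriptions."""
    issues = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = (row.get("title") or "").strip()
        description = (row.get("description") or "").strip()
        if len(title) > max_title_len:
            issues.append((row["url"], f"title is {len(title)} characters"))
        if not description:
            issues.append((row["url"], "missing meta description"))
    return issues


if __name__ == "__main__":
    for url, issue in audit_rows(SAMPLE_CSV):
        print(f"{url}: {issue}")
```

To run it against a real export, read the file contents into a string and pass that to audit_rows in place of the sample.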