Metadata Extractor

The Metadata Extractor pulls structured data from a list of URLs in one job: Open Graph tags, JSON-LD structured data, Twitter Cards, standard meta tags, article text, links, images, and tables.

When to Use It

  • Auditing SEO metadata across 100+ pages of your own site
  • Researching how competitors structure their Open Graph data
  • Pulling clean article text from a reading list
  • Collecting schema.org structured data from product pages
  • Building a dataset of page titles, descriptions, and canonical URLs

Step 1 — Enter URLs

Paste your list of URLs, one per line:

https://yoursite.com/blog/post-1
https://yoursite.com/blog/post-2
https://yoursite.com/products/widget
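
If your URLs live in a sitemap rather than a text file, a short script can produce the one-per-line list. This is a minimal sketch, assuming a standard sitemaps.org <urlset> document; the sample XML and the urls_from_sitemap helper are illustrative and not part of the tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the <loc> values of a sitemap, de-duplicated, in document order."""
    root = ET.fromstring(xml_text)
    seen = set()
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Hypothetical sample sitemap; note the duplicate entry for post-1.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/blog/post-1</loc></url>
  <url><loc>https://yoursite.com/blog/post-2</loc></url>
  <url><loc>https://yoursite.com/blog/post-1</loc></url>
</urlset>"""

if __name__ == "__main__":
    # Prints one URL per line, ready to paste into the URL box.
    print("\n".join(urls_from_sitemap(SAMPLE_SITEMAP)))
```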

Step 2 — Choose What to Extract

Toggle the data types you want:

| Option | What it pulls |
| --- | --- |
| Metadata | Title, description, author, canonical URL, robots, Open Graph (og:title, og:image, og:description), Twitter Card, JSON-LD structured data |
| Text Content | Main page text, converted to clean Markdown (removes nav, ads, footers) |
| Links | All hyperlinks found on the page (href + anchor text) |
| Images | All image URLs and their alt text |
| Tables | HTML tables as structured row/column data |

You can enable multiple options in a single run.

Step 3 — Configure Options

| Setting | Default | Notes |
| --- | --- | --- |
| Request delay | 500ms | Milliseconds between requests. Use 500–2000ms for polite crawling on sites you don't own |
| Load timeout | 5s | Wait time per page before extraction |
| Concurrent requests | 3 | Parallel page loads |

Pro Tip: On sites you own, you can set delay to 0 and concurrency to 10 for maximum speed. On third-party sites, use a delay to avoid rate-limiting.
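
To make the three settings concrete, here is a rough sketch of how a delay, a per-page timeout, and a concurrency cap can interact. It is not the tool's implementation: the crawl function and the fetch_stub placeholder are hypothetical, and the sketch assumes the delay is applied per parallel slot and that the load timeout acts as an upper bound on each page load.

```python
import asyncio


async def fetch_stub(url):
    """Stand-in for a real page load (hypothetical placeholder)."""
    await asyncio.sleep(0.01)
    return f"<html><title>{url}</title></html>"


async def crawl(urls, delay_ms=500, concurrency=3, timeout_s=5.0, fetch=fetch_stub):
    """Fetch URLs with at most `concurrency` in flight and a pause after each one."""
    gate = asyncio.Semaphore(concurrency)
    results = {}

    async def worker(url):
        async with gate:
            try:
                results[url] = await asyncio.wait_for(fetch(url), timeout=timeout_s)
            except asyncio.TimeoutError:
                results[url] = None  # page did not load within the timeout
            # Hold the slot during the delay so each slot paces its own requests.
            await asyncio.sleep(delay_ms / 1000)

    await asyncio.gather(*(worker(u) for u in urls))
    return results


if __name__ == "__main__":
    own_site = [f"https://yoursite.com/blog/post-{i}" for i in range(1, 6)]
    # Settings from the tip above for a site you own: no delay, 10 parallel loads.
    pages = asyncio.run(crawl(own_site, delay_ms=0, concurrency=10))
    print(len(pages), "pages fetched")
```

Under these assumptions, raising concurrency shortens the run roughly in proportion, while the delay sets a floor on how quickly each slot can issue its next request.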

Step 4 — Run and Export

Click Start. Results appear as pages complete. When done, click Export CSV.

What Each Metadata Field Contains

Open Graph Tags

Fields like og:title, og:description, og:image, og:url, og:type — used by social networks for link previews.
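
For a sense of what ends up in these fields, the sketch below collects og: properties from raw HTML using Python's standard html.parser module. It illustrates the markup only; the extract_open_graph helper and the sample HTML are hypothetical, and keeping just the first value of a repeated property is an assumption, not documented tool behavior.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:") and attr.get("content") is not None:
            # Assumption: keep the first value when a property repeats.
            self.og.setdefault(prop, attr["content"])


def extract_open_graph(html):
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.og


# Hypothetical sample page head.
SAMPLE_OG_HTML = """<head>
<meta property="og:title" content="Widget">
<meta property="og:type" content="product">
<meta property="og:image" content="https://yoursite.com/img/a.png">
<meta property="og:image" content="https://yoursite.com/img/b.png">
<meta name="description" content="Not an Open Graph tag">
</head>"""

if __name__ == "__main__":
    for key, value in extract_open_graph(SAMPLE_OG_HTML).items():
        print(key, "=", value)
```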

JSON-LD Structured Data

Schema.org markup embedded in <script type="application/ld+json"> blocks. Common types: Article, Product, BreadcrumbList, FAQPage, Organization.
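
The following sketch shows one way such blocks can be located and decoded with the standard library. The extract_json_ld and schema_types helpers and the sample page are hypothetical; skipping malformed blocks is a design choice made for this example, not a statement about how the tool behaves.

```python
import json
from html.parser import HTMLParser


class JsonLdParser(HTMLParser):
    """Collect the decoded contents of application/ld+json script blocks."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._in_jsonld = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                parsed = json.loads("".join(self._buf))
            except json.JSONDecodeError:
                return  # skip malformed blocks rather than failing the page
            # A block may hold a single object or a list of objects.
            self.items.extend(parsed if isinstance(parsed, list) else [parsed])


def extract_json_ld(html):
    parser = JsonLdParser()
    parser.feed(html)
    return parser.items


def schema_types(html):
    """Return the @type of each top-level JSON-LD object that declares one."""
    return [i["@type"] for i in extract_json_ld(html)
            if isinstance(i, dict) and "@type" in i]


# Hypothetical sample page with one valid block and one broken block.
SAMPLE_LD_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Post 1"}
</script>
<script type="application/ld+json">{not valid json}</script>
</head></html>"""

if __name__ == "__main__":
    print(schema_types(SAMPLE_LD_HTML))
```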

Twitter Card

twitter:card, twitter:title, twitter:description, twitter:image — used by Twitter/X for link previews.

Standard Meta Tags

<title>, <meta name="description">, <meta name="author">, <link rel="canonical">, <meta name="robots">.
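
As an illustration of where each of these values sits in the markup, here is a minimal sketch that reads them with the standard library. The extract_standard_meta helper, the output key names, and the sample HTML are assumptions made for the example; the tool's own column names may differ.

```python
from html.parser import HTMLParser


class StandardMetaParser(HTMLParser):
    """Read <title>, selected <meta name=...> tags, and <link rel="canonical">."""

    META_NAMES = {"description", "author", "robots"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        rel = (attr.get("rel") or "").lower().split()
        if tag == "title" and "title" not in self.fields:
            # Only the first <title> counts (ignores e.g. titles inside inline SVG).
            self._in_title = True
            self.fields["title"] = ""
        elif tag == "meta" and name in self.META_NAMES:
            self.fields.setdefault(name, attr.get("content") or "")
        elif tag == "link" and "canonical" in rel:
            self.fields.setdefault("canonical", attr.get("href") or "")

    def handle_data(self, data):
        if self._in_title:
            self.fields["title"] += data

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False


def extract_standard_meta(html):
    parser = StandardMetaParser()
    parser.feed(html)
    fields = dict(parser.fields)
    if "title" in fields:
        fields["title"] = fields["title"].strip()
    return fields


# Hypothetical sample page head.
SAMPLE_META_HTML = """<head>
<title> Widget | Your Site </title>
<meta name="description" content="A very good widget.">
<meta name="robots" content="index,follow">
<link rel="canonical" href="https://yoursite.com/products/widget">
</head>"""

if __name__ == "__main__":
    for key, value in extract_standard_meta(SAMPLE_META_HTML).items():
        print(f"{key}: {value}")
```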

Example: SEO Audit of 50 Blog Posts

Goal: Check title length, meta description, and canonical URL for 50 blog posts.

  1. Export your sitemap or URL list to a text file (one URL per line).
  2. Open Metadata Extractor and paste all 50 URLs.
  3. Enable Metadata only (no need for text or links).
  4. Set request delay to 500ms.
  5. Click Start. The run completes in 30–60 seconds.
  6. Export CSV, then open the file in Google Sheets.
  7. Add a formula column with =LEN(B2) on the title column, then filter for values over 60 to flag long titles.
  8. Filter rows where description is blank to find missing meta descriptions.
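
The same two checks can also be scripted against the exported file instead of a spreadsheet. This is a sketch under assumptions: the column names url, title, and description are hypothetical (match them to the headers in your actual export), and the sample CSV stands in for real output.

```python
import csv
import io

TITLE_LIMIT = 60  # character limit used in the audit steps above


def audit_rows(csv_text, title_col="title", desc_col="description", url_col="url"):
    """Return (url, problems) for rows with an over-long title or a blank description."""
    issues = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        title = (row.get(title_col) or "").strip()
        desc = (row.get(desc_col) or "").strip()
        problems = []
        if len(title) > TITLE_LIMIT:
            problems.append(f"title is {len(title)} characters")
        if not desc:
            problems.append("missing description")
        if problems:
            issues.append((row.get(url_col, ""), problems))
    return issues


# Hypothetical export: the second row has a 61-character title and no description.
SAMPLE_CSV = (
    "url,title,description\n"
    "https://yoursite.com/blog/post-1,Short title,A summary.\n"
    f"https://yoursite.com/blog/post-2,{'x' * 61},\n"
)

if __name__ == "__main__":
    for url, problems in audit_rows(SAMPLE_CSV):
        print(url, "->", "; ".join(problems))
```

To run it on a real export, read the file contents with open(path, newline="", encoding="utf-8") and pass the text to audit_rows.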