SCRAPING

Scrape any page. Get clean data.

Five output formats, smart routing, native action macros, and an async pattern that handles batches of any size. One endpoint family for everything from a single URL to a hundred million.

Part of the full 145-endpoint surface — see the scraping bundle for the related endpoints.

Five output formats

Markdown for LLMs, HTML for fidelity, plain text for indexing, link arrays for crawling, original-source for diagnostics. Choose per request — no transcoding pipeline needed.
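As a rough sketch of per-request format selection, the helper below builds a /v1/scrape request body for one of the five format values named in the FAQ on this page (markdown, html, text, links, original) and rejects anything else. The function name scrape_payload is hypothetical, and the body fields mirror the examples on this page rather than a full API reference.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Ollagraph API): build a /v1/scrape
# request body for a URL and one of the five documented format values.
scrape_payload() {
  url=$1
  format=$2
  case "$format" in
    markdown|html|text|links|original) ;;
    *) echo "unsupported format: $format" >&2; return 1 ;;
  esac
  # Assumes the URL contains no characters that need JSON escaping.
  printf '{"url": "%s", "format": "%s"}\n' "$url" "$format"
}

# Print the body for a markdown request.
scrape_payload "https://example.com" "markdown"
```

To send the body, it can be piped into curl with `-d @-`, which reads the request data from standard input, alongside the Authorization and Content-Type headers used in the examples on this page.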

Smart routing

The smart-scrape endpoint detects whether a page needs JavaScript and routes accordingly. Static pages skip the browser and return in milliseconds; dynamic pages get the full stealth browser automatically.

Action macros

Drive clicks, typing, waits, scrolls, and form submissions from a JSON array. No Puppeteer or Playwright code — the same description works for every site.

Async with webhooks

Fire large batches, walk away, get a clean POST to your endpoint when each job completes. Built for production pipelines that should not hold connections open.

Examples that work today.

Drop any of these into a terminal or notebook.

Scrape with actions: POST /v1/scrape
curl -X POST https://api.ollagraph.com/v1/scrape \
  -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "format": "markdown",
    "stealth": true,
    "actions": [
      {"type": "click", "selector": ".load-more"},
      {"type": "wait", "ms": 2000}
    ]
  }'

Smart routing: POST /v1/scrape/smart
# Auto-detect: static page = fast HTTP path; dynamic = full browser.
curl -X POST https://api.ollagraph.com/v1/scrape/smart \
  -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
# Response includes render_mode: "http_only" | "headless" | "headless_fallback"

Batch: POST /v1/scrape/batch
# Up to 100 URLs in parallel, one response.
curl -X POST https://api.ollagraph.com/v1/scrape/batch \
  -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://a.com", "https://b.com", "https://c.com"],
    "concurrency": 10,
    "format": "markdown"
  }'

Scraping questions

What formats does /v1/scrape return?

Five: markdown for LLM-friendly text, html for raw page source, text for stripped plain text, links for the URL graph extracted from the page, and original for the raw response without transformation. Specify the format in the request body.

When should I use /v1/scrape vs /v1/scrape/smart?

Use smart when you do not know whether the target page needs JavaScript — it is faster on static pages and falls back to the full browser when needed. Use scrape directly when you already know the page is dynamic, or when you need to pass a custom action macro.
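The rule of thumb above can be written as a small decision helper. This is only a sketch: pick_endpoint and its yes/no arguments are hypothetical names chosen for illustration, and the two paths are the ones shown in the examples on this page.

```shell
#!/bin/sh
# Hypothetical helper: choose an endpoint path following the guidance above.
# $1 = "yes" if the page is known to need JavaScript, otherwise "no" or "unknown"
# $2 = "yes" if the request carries an action macro, otherwise anything else
pick_endpoint() {
  needs_js=$1
  has_actions=$2
  if [ "$has_actions" = "yes" ] || [ "$needs_js" = "yes" ]; then
    echo "/v1/scrape"
  else
    echo "/v1/scrape/smart"
  fi
}

pick_endpoint unknown no   # prints /v1/scrape/smart
pick_endpoint yes no       # prints /v1/scrape
pick_endpoint unknown yes  # prints /v1/scrape
```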

How big can a batch be?

Up to 100 URLs per synchronous batch call. For larger jobs use the async batch endpoint, which queues the work and posts to a webhook when each chunk completes — no upper bound beyond your monthly quota.
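The 100-URL ceiling means a client has to split longer lists before calling the synchronous endpoint. A minimal sketch of that split follows, assuming URLs arrive one per line and need no JSON escaping; batch_payloads is a hypothetical helper name. The async batch endpoint's path and body are not shown on this page, so the sketch covers only the synchronous case.

```shell
#!/bin/sh
# Hypothetical helper: read URLs (one per line) on stdin and print one
# /v1/scrape/batch request body per line, each holding at most $1 URLs
# (default 100, the synchronous limit stated above).
batch_payloads() {
  awk -v size="${1:-100}" '
    function emit() {
      if (n > 0) printf "{\"urls\": [%s], \"format\": \"markdown\"}\n", list
      n = 0; list = ""
    }
    NF {
      # Append the URL as a quoted JSON string, comma-separated.
      list = list (n ? ", " : "") "\"" $0 "\""
      if (++n == size) emit()
    }
    END { emit() }
  '
}

# Demo with a chunk size of 2 so the split is visible: prints two bodies.
printf '%s\n' https://a.com https://b.com https://c.com | batch_payloads 2
```

Each printed line is a complete request body, so a loop can read them one at a time and send each with the curl invocation from the batch example.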

Does Ollagraph handle JavaScript-rendered single-page apps?

Yes. The default browser engine waits for network idle and DOM stability before extracting. You can also use action macros for click-and-wait patterns common to SPAs.

What about pages that require login?

Use the action macro to drive the login form, then continue with subsequent steps in the same session. For pages that need a persistent authenticated session across many requests, our session endpoints keep state across calls. See the behind-a-login recipe.
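A login macro might look roughly like the body printed below. Only the click and wait action shapes appear in the examples on this page, so the "type" action and its "text" field are assumed names for the typing step, and the selectors are placeholders for whatever the target form uses. login_payload is a hypothetical helper.

```shell
#!/bin/sh
# Sketch of a login macro for POST /v1/scrape. The "type" action and its
# "text" field are ASSUMED names; the selectors are placeholders.
# Assumes the arguments contain no characters that need JSON escaping.
login_payload() {
  url=$1
  user=$2
  pass=$3
  cat <<EOF
{
  "url": "$url",
  "format": "markdown",
  "actions": [
    {"type": "type", "selector": "#username", "text": "$user"},
    {"type": "type", "selector": "#password", "text": "$pass"},
    {"type": "click", "selector": "button[type=submit]"},
    {"type": "wait", "ms": 2000}
  ]
}
EOF
}

# Credentials come from the environment rather than the script itself.
login_payload "https://example.com/login" "${SITE_USER:-demo}" "${SITE_PASS:-secret}"
```

Check the action names against the API reference before relying on this shape.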

How do I scrape paginated content?

Two patterns. For numbered pagination, send the page-2, page-3, ... page-N URLs as a batch. For infinite-scroll or load-more pagination, use an action macro that scrolls (or clicks the load-more button) and waits, repeating the steps until no more content loads.
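For the numbered-pagination pattern, the page URLs can be generated before they are batched. The sketch below assumes the site exposes the page number in the URL and uses a literal {page} token as a placeholder; both the token convention and the page_urls helper are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print page URLs for numbered pagination.
# $1 = URL template containing the literal token {page} (an assumed convention)
# $2 = first page number, $3 = last page number
page_urls() {
  template=$1
  page=$2
  last=$3
  while [ "$page" -le "$last" ]; do
    # Substitute the current page number for the {page} token.
    printf '%s\n' "$template" | sed "s/{page}/$page/"
    page=$((page + 1))
  done
}

# Prints the URLs for pages 2, 3, and 4, one per line.
page_urls "https://example.com/products?page={page}" 2 4
```

The resulting list can be placed in the urls array of a /v1/scrape/batch request, up to the 100-URL limit per synchronous call.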

Is there a request timeout limit?

Each request accepts a per-call timeout parameter. Failed calls, including timeouts, are refunded automatically, so a timeout never costs you a credit.

Start scraping in five minutes.

1,000 free credits, one bearer token, automatic refunds on failed calls.
