WEBHOOKS

Fire jobs. Get a clean callback.

Queue a handful of URLs or a hundred thousand. Walk away. Your endpoint receives a single signed POST when the work completes — no polling, no held connections, no retry plumbing on your side.


Webhooks deliver the results of async jobs from the scraping API. Full reference lives in the API docs.

Fire-and-forget at any size

Queue a job of any size — a handful of URLs or a hundred thousand — and walk away. Your server gets a single clean POST when the work is done. No long-held connections, no polling loops, no retry plumbing on your side.

Every callback is signed

We sign every delivery with an HMAC-SHA256 over the timestamp and the raw body (<ts>.<raw_body>), keyed by your account's webhook secret. Your endpoint verifies the signature before trusting the payload. Forgery is infeasible without the secret, and the signed timestamp lets you reject replayed deliveries.

Idempotent payloads

Each delivery carries the job_id. Treat it as an idempotency key: record it before processing and ignore a repeat. Because a retried delivery is byte-identical to the first, your handler can dedupe safely without losing data.

Verify before you go live

Send a single signed test payload to your endpoint with the test-webhook call. It uses the exact production signing scheme and retry policy, so you confirm your receiver works before any real job depends on it.

Queue a job. Receive the result.

Pass a webhook_url to any async endpoint and we POST the finished result back to you.

Queue an async batch: POST /v1/scrape/batch/async
# Queue an async batch and walk away. No held connection.
curl -X POST https://api.ollagraph.com/v1/scrape/batch/async \
  -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://a.example.com", "https://b.example.com"],
    "format": "markdown",
    "webhook_url": "https://yourapp.com/hooks/ollagraph"
  }'

# Response: { "status": "queued", "job_id": "..." }
# We POST the finished result to your webhook_url when the work completes.
The callback payload: POST to your webhook_url
// What lands on your endpoint when the job finishes.
// Method: POST   Content-Type: application/json
// Header:  X-Ollagraph-Signature: t=<unix_ts>,v1=<hex_hmac_sha256>

{
  "job_id": "job_3f9a...",
  "status": "completed",
  "result": {
    // The same block you would get from the sync endpoint:
    // markdown / html / text / links, plus per-URL status.
  }
}

// The result block mirrors the corresponding sync response, so your
// handler parses it exactly like a direct /v1/scrape/batch call.

Verify the signature.

Recompute the HMAC over the raw bytes and compare in constant time. Never trust an unverified body.

Verify (Node): crypto.createHmac
import crypto from 'crypto';
import express from 'express';

const app = express();
app.use(express.raw({ type: 'application/json' })); // raw bytes for HMAC

app.post('/hooks/ollagraph', (req, res) => {
  const header = req.headers['x-ollagraph-signature'] || '';
  // Header format: "t=<unix_ts>,v1=<hex_hmac_sha256>"
  const parts = Object.fromEntries(header.split(',').map(p => p.split('=')));
  const ts = parts.t, sig = parts.v1;

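  // Reject timestamps that are missing, non-numeric, or more than
  // five minutes away from the current time (defeats replay).
  const age = Math.abs(Date.now() / 1000 - Number(ts));
  if (!ts || !Number.isFinite(age) || age > 300) {
    return res.status(401).send('stale or missing timestamp');
  }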

  // Signed body is: <ts>.<exact_json_bytes>
  const expected = crypto
    .createHmac('sha256', process.env.OLLAGRAPH_WEBHOOK_SECRET)
    .update(`${ts}.`)
    .update(req.body) // raw bytes, not re-serialized JSON
    .digest('hex');

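  // timingSafeEqual throws when the buffers differ in length,
  // so compare lengths first and reject a mismatch outright.
  const sigBuf = Buffer.from(sig || '');
  const expectedBuf = Buffer.from(expected);
  if (sigBuf.length !== expectedBuf.length || !crypto.timingSafeEqual(sigBuf, expectedBuf)) {
    return res.status(401).send('invalid signature');
  }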

  // Acknowledge fast, then process out of band (see the idempotency note).
  const { job_id, result } = JSON.parse(req.body.toString('utf8'));
  res.sendStatus(200);
  // Hand job_id and result to a background worker from here.
});
Verify (Python): hmac.compare_digest
import hmac, hashlib, os, time
from flask import Flask, request, abort

app = Flask(__name__)
SECRET = os.environ["OLLAGRAPH_WEBHOOK_SECRET"].encode()

@app.post("/hooks/ollagraph")
def callback():
    header = request.headers.get("x-ollagraph-signature", "")
    # Header format: "t=<unix_ts>,v1=<hex_hmac_sha256>"
    parts = dict(p.split("=", 1) for p in header.split(",") if "=" in p)
    ts, sig = parts.get("t"), parts.get("v1")
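    # Reject missing, non-numeric, or stale timestamps. Without the
    # isdecimal() check, int() would raise on garbage and return a 500.
    if not ts or not ts.isdecimal() or abs(time.time() - int(ts)) > 300:
        abort(401)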
    body = request.get_data()  # raw bytes
    # Signed body is: <ts>.<exact_json_bytes>
    expected = hmac.new(SECRET, f"{ts}.".encode() + body, hashlib.sha256).hexdigest()
    # Compare as bytes: compare_digest raises TypeError on non-ASCII str input.
    if not sig or not hmac.compare_digest(sig.encode(), expected.encode()):
        abort(401)
    payload = request.get_json()
    # process payload["result"] ... then return fast
    return "", 200

Crawl, or poll if you prefer.

Full-site crawls deliver one webhook when done. Or skip webhooks entirely and read the job directly.

Crawl with a webhook (or poll): POST /v1/crawl
# Crawl an entire site, get a single signed POST when the crawl finishes.
curl -X POST https://api.ollagraph.com/v1/crawl \
  -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "max_pages": 500,
    "webhook_url": "https://yourapp.com/hooks/ollagraph"
  }'

# Prefer to poll instead? Omit webhook_url and read GET /v1/jobs/{job_id}.

See the broader async surface on the scraping API and the crawl API.

Webhook questions

Which endpoints deliver results via webhook?

Three async endpoints accept a webhook_url: /v1/scrape/async for a single page, /v1/scrape/batch/async for a list of URLs, and /v1/crawl for a full site. Each returns a job_id immediately and POSTs the finished result to your webhook_url when the work completes.

What does the callback payload look like?

A POST with a JSON body containing the job_id, a status, and a result block. The result mirrors what you would get from the corresponding synchronous endpoint, so your handler parses the markdown, html, text, or links exactly as it would a direct call. The request also carries the X-Ollagraph-Signature header used for verification.

How are webhooks signed?

Each request carries an X-Ollagraph-Signature header in the form t=<unix_ts>,v1=<hex>. The v1 value is an HMAC-SHA256 of the bytes <ts>.<raw_body> keyed by your account's webhook secret. The verification samples above show the full algorithm. Compare in constant time and reject timestamps older than five minutes to defeat replay attacks.
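The scheme described in this answer can be sketched as a pair of framework-free helpers, which is handy for producing signed test requests against your own receiver. This is a minimal sketch, not the service's implementation: the function names sign_payload and verify_signature are hypothetical, and the header layout and the <ts>.<raw_body> signing input are taken from the description above.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_payload(secret: bytes, body: bytes, ts: Optional[int] = None) -> str:
    """Build a header value of the form t=<unix_ts>,v1=<hex_hmac_sha256>."""
    ts = int(time.time()) if ts is None else ts
    # Signing input is the timestamp, a dot, then the exact raw body bytes.
    digest = hmac.new(secret, f"{ts}.".encode() + body, hashlib.sha256).hexdigest()
    return f"t={ts},v1={digest}"


def verify_signature(secret: bytes, header: str, body: bytes,
                     now: Optional[float] = None, tolerance: int = 300) -> bool:
    """Return True only for a well-formed, fresh, correctly signed delivery."""
    parts = dict(p.split("=", 1) for p in header.split(",") if "=" in p)
    ts, sig = parts.get("t", ""), parts.get("v1", "")
    if not ts.isdecimal() or not sig:
        return False  # missing or malformed header parts
    now = time.time() if now is None else now
    if abs(now - int(ts)) > tolerance:
        return False  # outside the five-minute window
    expected = hmac.new(secret, f"{ts}.".encode() + body, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes; unequal lengths simply return False.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Passing now explicitly keeps the check deterministic in unit tests; in a live handler you would leave it unset so the current time is used.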

Where do I get my webhook secret?

Call GET /v1/me with your API key and read the webhook_secret field. Rotate it at any time via POST /v1/me/webhook-secret/rotate. The previous secret stops working immediately, so drain in-flight jobs before rotating or their deliveries will fail verification.
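Reading the secret might look like the following, using only the Python standard library. This is a hedged sketch: the helper names are hypothetical, the host is taken from the curl examples, and it assumes webhook_secret is a top-level field of the JSON response, which the text above does not spell out.

```python
import json
import urllib.request

API_BASE = "https://api.ollagraph.com"  # host taken from the curl examples


def build_me_request(api_key: str) -> urllib.request.Request:
    """Prepare GET /v1/me with the bearer token used elsewhere on this page."""
    return urllib.request.Request(
        f"{API_BASE}/v1/me",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_webhook_secret(response_body: bytes) -> str:
    """Pull webhook_secret out of the response (assumed to be top-level)."""
    return json.loads(response_body)["webhook_secret"]


def fetch_webhook_secret(api_key: str) -> str:
    """Perform the call and return the secret as a string."""
    with urllib.request.urlopen(build_me_request(api_key), timeout=10) as resp:
        return extract_webhook_secret(resp.read())
```

Load the value into your receiver's environment (OLLAGRAPH_WEBHOOK_SECRET in the samples above) rather than fetching it on every delivery.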

How do I test my endpoint before going live?

Call POST /v1/me/webhooks/test with your webhook_url. We send a single signed payload using the exact production signing scheme, retry behavior, and timeout, so the result reflects what a real delivery looks like. Use it to confirm your receiver verifies the signature correctly before you queue real jobs.

What is the retry policy, and how do I stay idempotent?

If your endpoint returns a non-2xx response or the connection errors, we retry with backoff. Because a retried delivery is byte-identical to the original, the job_id is your idempotency key: record it on first receipt and skip duplicates. Acknowledge quickly with a 200 and do heavy processing out of band so a slow handler does not trigger an unnecessary retry.
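One way to apply this advice is to record the job_id, enqueue the payload, and return 200 straight away. The sketch below is illustrative only: DeliveryDeduper and accept_delivery are hypothetical names, and the in-memory set stands in for a durable store (for example a table with a unique key on job_id) that a real deployment would need to survive restarts.

```python
import queue
import threading


class DeliveryDeduper:
    """Remembers which job_ids have already been accepted."""

    def __init__(self) -> None:
        self._seen = set()
        self._lock = threading.Lock()

    def first_time(self, job_id: str) -> bool:
        # Check and record under one lock so concurrent retries cannot both pass.
        with self._lock:
            if job_id in self._seen:
                return False
            self._seen.add(job_id)
            return True


work_queue: "queue.Queue[dict]" = queue.Queue()  # drained by a background worker
deduper = DeliveryDeduper()


def accept_delivery(payload: dict) -> int:
    """Call after signature verification; returns the HTTP status to send."""
    if deduper.first_time(payload["job_id"]):
        work_queue.put(payload)  # heavy processing happens out of band
    # Duplicates are acknowledged too, so the sender stops retrying.
    return 200
```

Returning 200 for a duplicate is deliberate: the work is already recorded, and any other status would invite another retry.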

Can I use webhooks without HTTPS?

No. Webhook URLs must be HTTPS. Plain HTTP endpoints are rejected at job-creation time. This protects the payload and the signature in transit.
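Since plain HTTP is rejected at job-creation time, you may want to catch a bad URL in your own code before making the request. A minimal client-side check, with is_valid_webhook_url as a hypothetical helper name, could look like this. It only mirrors the HTTPS rule stated above; the API may apply further validation that is not described here.

```python
from urllib.parse import urlparse


def is_valid_webhook_url(url: str) -> bool:
    """Accept only absolute https:// URLs that include a host."""
    try:
        parsed = urlparse(url)
    except ValueError:
        return False  # e.g. a malformed IPv6 literal
    # urlparse lowercases the scheme, so "HTTPS://" is accepted as well.
    return parsed.scheme == "https" and bool(parsed.hostname)
```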

What if a webhook is impractical for my environment?

Omit webhook_url and poll instead. Every async job is readable at GET /v1/jobs/{job_id}, which returns the current status and, once finished, the same result block a webhook would deliver. Webhooks are recommended for production because they remove the constant polling overhead, but polling is always available as a fallback.
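A polling fallback can be sketched as a loop with capped exponential backoff and an overall deadline. The names get_job and poll_job are hypothetical, and the intervals are arbitrary starting points. Only the "queued" and "completed" statuses appear on this page, so the sketch treats "completed" as the sole terminal status by default; pass any failure statuses your jobs can return through the terminal argument.

```python
import json
import time
import urllib.request
from typing import Callable, Iterable

API_BASE = "https://api.ollagraph.com"  # host taken from the curl examples


def get_job(job_id: str, api_key: str) -> dict:
    """Read one job from GET /v1/jobs/{job_id}."""
    req = urllib.request.Request(
        f"{API_BASE}/v1/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def poll_job(job_id: str,
             fetch: Callable[[str], dict],
             terminal: Iterable[str] = ("completed",),
             interval: float = 2.0,
             max_interval: float = 30.0,
             timeout: float = 600.0,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job reaches a terminal status or the deadline passes."""
    terminal = set(terminal)
    deadline = time.monotonic() + timeout
    delay = interval
    while True:
        job = fetch(job_id)
        if job.get("status") in terminal:
            return job
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(delay)
        delay = min(delay * 2, max_interval)  # back off, but cap the wait
```

A typical call would be poll_job(job_id, lambda j: get_job(j, api_key)). The fetch and sleep parameters are injectable so the loop can be tested without network access or real delays.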

Ship production-grade async today.

1,000 free credits, one bearer token, signed webhooks on every plan including the free tier.
