8 vertical actors · 30+ endpoints · normalized JSON

Vertical scrapers your agent
can actually rely on.

Amazon, Yelp, Google Maps, LinkedIn, 7 ATS job boards, Product Hunt, 8 job sites, and an academic graph. One Bearer token. Normalized JSON. Vendor fallback is baked in: when our scraper trips a CAPTCHA, a paid vendor finishes the call with no change to your client code.


25/75 hybrid tiering

~25% of traffic uses our Camoufox+Chromium pool with residential proxy. ~75% falls through to vetted vendor APIs (Scrapingdog, Apify, Outscraper, ScrapingBee). One tier field tells you who served the request.
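A minimal sketch of how a client might use the tier field to see which path served its traffic. The values "qbrowser" and "vendor" are the two that appear on this page; the helper name tier_share and the sample data are hypothetical.

```python
from collections import Counter

def tier_share(responses):
    """Return the fraction of responses served by each tier value."""
    counts = Counter(r.get("tier", "unknown") for r in responses)
    total = sum(counts.values())
    return {tier: n / total for tier, n in counts.items()} if total else {}

# Hypothetical sample: one tier-1 response and three vendor responses.
sample = [
    {"ok": True, "tier": "qbrowser"},
    {"ok": True, "tier": "vendor"},
    {"ok": True, "tier": "vendor"},
    {"ok": True, "tier": "vendor"},
]
print(tier_share(sample))  # {'qbrowser': 0.25, 'vendor': 0.75}
```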

Normalized schema

Every actor returns the same shape regardless of who scraped it. ASIN means ASIN, biz_path means Yelp's slug, place_id means Google's. No vendor lock-in surfacing through the response.

Honest failure modes

When all tiers fail, you get {ok:false, error, tier1_error, tier2_error} — not a 500. Job sites blocked upstream return clean warnings, not TLS errors.
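On the client side, that failure envelope could be turned into an exception roughly like this. The field names come from the envelope above; the class name ActorError, the helper unwrap, and the sample error strings are assumptions made for illustration.

```python
class ActorError(Exception):
    """Raised when the envelope reports ok: false (every tier failed)."""

    def __init__(self, envelope):
        self.error = envelope.get("error")
        self.tier1_error = envelope.get("tier1_error")
        self.tier2_error = envelope.get("tier2_error")
        super().__init__(
            f"{self.error} (tier1: {self.tier1_error}; tier2: {self.tier2_error})"
        )

def unwrap(envelope):
    """Return the envelope when ok is true, otherwise raise ActorError."""
    if envelope.get("ok"):
        return envelope
    raise ActorError(envelope)

# Hypothetical failure envelope; the error strings are invented for the example.
failed = {
    "ok": False,
    "error": "all tiers failed",
    "tier1_error": "captcha",
    "tier2_error": "vendor timeout",
}
try:
    unwrap(failed)
except ActorError as exc:
    print("tier 1:", exc.tier1_error, "| tier 2:", exc.tier2_error)
```

Because the failure arrives as an ordinary JSON body, an agent can inspect both tier errors before deciding whether to retry.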

The 8 actor families

Each card lists the live endpoints and which tier serves them today.

Amazon

3 endpoints

Search by query, product detail by ASIN, top reviews. Eight marketplaces supported (US, UK, DE, FR, IN, JP, CA, AU) via the marketplace param.

Tier 1: Camoufox+BD · Tier 2: Scrapingdog
POST /v1/actors/amazon/search
POST /v1/actors/amazon/product
POST /v1/actors/amazon/reviews

Yelp

3 endpoints

Business search by query+location, business detail by Yelp slug (biz_path), reviews with pagination. Useful for lead-gen and local SEO audits.

Tier 1: Camoufox+BD · Tier 2: Apify yin/yelp-scraper
POST /v1/actors/yelp/search
POST /v1/actors/yelp/business
POST /v1/actors/yelp/reviews

Google Maps

3 endpoints

Places search, place detail by URL or CID, reviews up to 50 pages. Google fights tier-1 hard; production calls go to Outscraper's maps/search-v3 + reviews-v3.

Tier 1: Camoufox+BD · Tier 2: Outscraper
POST /v1/actors/gmaps/search
POST /v1/actors/gmaps/place
POST /v1/actors/gmaps/reviews

LinkedIn

auth required

Person profile + company page. Requires a li_at session cookie (yours, or one from the managed pool). Profiles return name, headline, location, experience, and skills; company pages return industry, size, employees, and followers.

Tier 1: Camoufox + li_at pool
POST /v1/actors/linkedin/profile
POST /v1/actors/linkedin/company

ATS job boards

7 boards · sub-second

Pure public HTTP: no browser, no proxy, no quota. Hits the same JSON endpoint each ATS widget uses. /ats/detect tries all 7 in parallel and returns the live board with the most jobs.

Direct HTTP · no tier-2 needed
GET  /v1/actors/ats/supported
POST /v1/actors/ats  { ats: greenhouse|lever|ashby|workable|smartrecruiters|recruitee|teamtailor }
POST /v1/actors/ats/detect  { company }
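A hedged Python sketch of calling the detect endpoint with the standard library. The host and Bearer scheme are taken from the curl example on this page; the helper name actor_request, the company value "example-co", and the token "osk_example" are placeholders, and whether company expects a name or a slug is not stated here.

```python
import json
import urllib.request

BASE_URL = "https://api.ollagraph.com"  # host taken from the curl example on this page

def actor_request(path, payload, token):
    """Build (but do not send) a POST request for an actor endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# "example-co" and "osk_example" are placeholders, not real values.
req = actor_request("/v1/actors/ats/detect", {"company": "example-co"}, "osk_example")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```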

Product Hunt

2 endpoints

Today's leaderboard (or any historical date) + product detail by slug. Returns title, tagline, upvotes, image, topic tags, maker. PH isn't anti-bot heavy — no proxy needed for most calls.

Tier 1: Camoufox direct
POST /v1/actors/producthunt/daily  { date?, limit? }
POST /v1/actors/producthunt/product  { slug | url }

Jobs

8 boards

Seven boards return results: Indeed, Glassdoor, LinkedIn Jobs, Google Jobs, Bayt, Naukri, and BDJobs. The eighth, zip_recruiter, short-circuits with a clean warning because Cloudflare's WAF blocks it upstream. For Google Jobs, pass google_search_term: the literal string Google's UI generates.

Tier 1: JobSpy fork
POST /v1/actors/jobs/scrape  { sites:[…], query, location, limit }
GET  /v1/actors/jobs/sites

Academic / Scholar

3 endpoints

Paper search, author lookup with h-index + citations + paper list, single publication by title. Tier 1 is Semantic Scholar's free API; requests fall back to Google Scholar (via the scholarly library) for records Semantic Scholar doesn't index.

Tier 1: Semantic Scholar · Tier 2: Google Scholar
POST /v1/actors/scholar/search
POST /v1/actors/scholar/author
POST /v1/actors/scholar/publication

One call, normalized JSON

Same auth, same envelope, same tier field across every actor. Here's Amazon search.

REQUEST
curl https://api.ollagraph.com/v1/actors/amazon/search \
  -H "Authorization: Bearer osk_…" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "macbook air m3",
    "marketplace": "us",
    "limit": 5
  }'
RESPONSE
{
  "ok": true,
  "marketplace": "us",
  "query": "macbook air m3",
  "count": 5,
  "results": [{
    "asin": "B0CX23V2ZK",
    "title": "Apple 2024 MacBook Air …",
    "price": "$1,099.00",
    "rating": "4.8 out of 5 stars",
    "reviews_count": "1,084",
    "image": "https://m.media-amazon.com/…"
  }],
  "egress_ip": "176.98.89.244",
  "tier": "qbrowser"
}
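The response above carries price, rating, and reviews_count as display strings. An agent that wants numbers could convert them along these lines. This is a sketch that assumes the US formatting shown in the example; other marketplaces may format prices differently, and the helper name parse_result is hypothetical.

```python
import re

def parse_result(item):
    """Convert the display strings of one search result into numbers."""
    price = float(re.sub(r"[^\d.]", "", item["price"]))        # "$1,099.00" -> 1099.0
    rating = float(item["rating"].split()[0])                  # "4.8 out of 5 stars" -> 4.8
    reviews = int(item["reviews_count"].replace(",", ""))      # "1,084" -> 1084
    return {
        "asin": item["asin"],
        "price": price,
        "rating": rating,
        "reviews_count": reviews,
    }

# Values copied from the example response on this page.
item = {
    "asin": "B0CX23V2ZK",
    "price": "$1,099.00",
    "rating": "4.8 out of 5 stars",
    "reviews_count": "1,084",
}
print(parse_result(item))
```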

Why not just use the vendors directly?

One bill, one auth

Scrapingdog + Apify + Outscraper + ScrapingBee + LinkedIn cookies all behind one osk_… token. One usage report. One support thread.

Automatic fallback

When tier-1 trips a CAPTCHA, you don't get a 403 — we route to the vendor and mark tier: "vendor". Same response shape, no client code change.
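Conceptually, the fallback described here behaves like the sketch below. It is an illustration of the pattern only, not the service's actual code: the function name call_with_fallback, the two stand-in scrapers, and the error strings are all assumptions, while the tier values and envelope fields are the ones shown on this page.

```python
def call_with_fallback(tier1, tier2, payload):
    """Try tier1, then tier2; return one envelope in every case."""
    try:
        return {"ok": True, **tier1(payload), "tier": "qbrowser"}
    except Exception as exc1:
        try:
            return {"ok": True, **tier2(payload), "tier": "vendor"}
        except Exception as exc2:
            return {
                "ok": False,
                "error": "all tiers failed",
                "tier1_error": str(exc1),
                "tier2_error": str(exc2),
            }

# Stand-in scrapers for the example: tier 1 always hits a CAPTCHA.
def blocked_scraper(payload):
    raise RuntimeError("captcha")

def vendor_scraper(payload):
    return {"query": payload["query"], "results": []}

print(call_with_fallback(blocked_scraper, vendor_scraper, {"query": "macbook air m3"}))
```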

Normalized output

Each vendor returns its own quirky JSON. We map them into one schema per actor so your agent code stays clean — Scrapingdog's customer_reviews and Apify's reviewCount both come out as reviews_count.
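A minimal sketch of that kind of field mapping. The two vendor field names and the target name reviews_count come from the paragraph above; the table FIELD_ALIASES, the vendor keys, and the helper normalize are hypothetical, and a real mapping would cover many more fields.

```python
# Per-vendor renames into the shared schema (illustrative, not exhaustive).
FIELD_ALIASES = {
    "scrapingdog": {"customer_reviews": "reviews_count"},
    "apify": {"reviewCount": "reviews_count"},
}

def normalize(vendor, record):
    """Rename vendor-specific keys to the shared schema; keep other keys unchanged."""
    aliases = FIELD_ALIASES.get(vendor, {})
    return {aliases.get(key, key): value for key, value in record.items()}

print(normalize("scrapingdog", {"asin": "B0CX23V2ZK", "customer_reviews": "1,084"}))
print(normalize("apify", {"asin": "B0CX23V2ZK", "reviewCount": "1,084"}))
```

Both calls produce a record keyed by reviews_count, so downstream code never branches on the vendor.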

Honest about what's broken

Cloudflare WAF blocks ZipRecruiter? You get a clean warning, not a 500. Semantic Scholar rate limit? Clean {tier1_error, tier2_error}. No mystery timeouts.

Skip the vendor sprawl.

1,000 credits on signup. Every actor, day one. No card. Bring your own li_at cookies for LinkedIn; everything else just works.
