Open a persistent session and steer it in plain English — navigate, act, observe, extract — or send a deterministic JSON action macro when you already know the page. Multi-step journeys across login walls and dynamic apps, without a single line of Playwright.
For single-page extraction see the scraping API; for raw headless-session control see the browser API.
Tell the session what to do in plain English — goto, act, observe, extract. The model resolves each instruction to a concrete element and action (click, type, select, scroll), so a copy change on the target site does not break your script. No selectors to babysit.
Open a session once and drive it across many calls. Cookies, storage, and the logged-in context stay alive between steps, so multi-page journeys — log in, navigate, act, extract — run as one coherent flow instead of disconnected requests.
When you already know the page, skip the model and send a deterministic JSON array of click / type / wait / scroll steps to the scraping endpoint. Same hosted browser, fully reproducible, zero per-step model latency. Mix and match per task.
You write instructions, not browser code. There is no headless Chrome to deploy, no Puppeteer or Playwright scripts to maintain, and no stealth tuning; the hosted browser handles fingerprinting and rendering. Use the managed model (no key required), or bring your own.
Open a session, act in plain English, observe, extract — or fall back to deterministic macros and raw session control.
# Open a persistent, LLM-driven browser session.
# Free to open — you only pay per action (goto/act/observe/extract).
curl -X POST https://api.ollagraph.com/v1/stagehand \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{}'
# -> { "stagehand_session_id": "sh_...", "llm_provider": "...",
# "model_name": "...", "expires_at": "..." }
# Bring your own model instead (held in memory for the session only):
# {"llm_provider": "openai", "model_name": "...", "llm_api_key": "sk-..."}

# Drive the session in plain English; no selectors to maintain.
SID=sh_...
# 1) Navigate
curl -X POST https://api.ollagraph.com/v1/stagehand/$SID/goto \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/search"}'
# 2) Act — the model resolves the instruction to a real element + action
curl -X POST https://api.ollagraph.com/v1/stagehand/$SID/act \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{"instruction": "type the phrase wireless headphones into the search box and press Enter"}'
# -> { "ok": true, "success": true, "message": "<selector used>", "url_after": "..." }

# Observe what is actionable, then extract structured data.
SID=sh_...
# List candidate actions on the current page (optionally focus with an instruction)
curl -X POST https://api.ollagraph.com/v1/stagehand/$SID/observe \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{"instruction": "find the add-to-cart buttons"}'
# -> { "ok": true, "candidates": [{"description": "...", "method": "click", "selector": "..."}] }
# Pull structured data, coerced to your JSON Schema
curl -X POST https://api.ollagraph.com/v1/stagehand/$SID/extract \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"instruction": "extract the product name and price",
"schema": {"type": "object", "properties": {
"name": {"type": "string"}, "price": {"type": "number"}}}
}'

# Prefer deterministic JSON macros? Drive a multi-step flow in one shot.
curl -X POST https://api.ollagraph.com/v1/scrape \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/login",
"stealth": true,
"format": "markdown",
"actions": [
{"type": "type", "selector": "#email", "text": "user@example.com"},
{"type": "type", "selector": "#password", "text": "..."},
{"type": "click", "selector": "button[type=submit]"},
{"type": "wait", "ms": 2500},
{"type": "click", "selector": ".load-more"},
{"type": "wait", "ms": 1500}
]
}'

# Raw persistent session: render + run a script across calls, keeping cookies.
SID=$(curl -s -X POST https://api.ollagraph.com/v1/session \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" | jq -r .session_id)
curl -X POST https://api.ollagraph.com/v1/session/$SID/render \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/dashboard"}'
# Reuse the same authenticated context as many times as you need, then close it.
curl -X DELETE https://api.ollagraph.com/v1/session/$SID \
-H "Authorization: Bearer $OLLAGRAPH_API_KEY"

Patterns customers run in production. Each is a handful of readable calls.
goto the search page, act to type a query and submit, observe the result cards, then extract a structured list — name, price, URL — coerced to your JSON Schema. One session, four readable calls.
Open a session, act through the login form, navigate to the protected page, and extract — all on one persistent context. The cookies set during login carry forward to every later step.
When the DOM is stable, send a JSON action macro to the scraping endpoint: type credentials, click submit, wait, click load-more, return markdown. No model in the loop, fully reproducible.
Loop act instructions like “scroll down and load more results” until the feed stops growing, then extract everything rendered so far. The session keeps the full page state between scrolls.
Not sure what is on the page? Call observe to list every actionable element with a description, method, and selector — then decide which act instructions to issue next. Great for building resilient agent loops.
Need lower-level control? Open a session, render URLs and run scripts inside it across calls while cookies persist, then delete it when done. See the browser API for the full session surface.
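The infinite-scroll pattern above can be sketched as a small shell function. This is an illustration under stated assumptions, not a reference implementation: the function name `load_all_then_extract` is hypothetical, and it assumes that act reports `"success": false` once there is nothing left to load, which this page does not confirm. A maximum attempt count guards against feeds that never end.

```shell
# Sketch: repeat a "load more" act until it stops succeeding, then extract once.
# Assumption (not confirmed by the docs above): act returns "success": false
# when there is nothing left to load.
load_all_then_extract() {
  # $1 = session id, $2 = max scroll attempts, $3 = extract instruction
  # The instruction must not contain double quotes (no JSON escaping here).
  base="https://api.ollagraph.com/v1/stagehand/$1"
  i=0
  while [ "$i" -lt "$2" ]; do
    res=$(curl -s -X POST "$base/act" \
      -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"instruction": "scroll down and load more results"}')
    # Stop as soon as the act call no longer reports success.
    printf '%s' "$res" | grep -Eq '"success":[[:space:]]*true' || break
    i=$((i + 1))
  done
  # One extract at the end covers everything rendered so far.
  curl -s -X POST "$base/extract" \
    -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"instruction\": \"$3\"}"
}
```

A call might look like `load_all_then_extract "$SID" 20 "extract every result title and URL"`. Each act counts as a metered action, so keep the attempt cap modest.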
Working through a login wall? See the behind-a-login recipe, browse all recipes, or wire results to your stack with webhooks.
This automation page is about multi-step, agent-style flows: a persistent session you drive with natural-language instructions (goto, act, observe, extract) or deterministic JSON action macros. The scraping API is single-shot — send a URL, optionally a JSON action macro, get clean data back in one call. The browser API exposes the raw persistent-session surface: open a session, render and run scripts inside it across calls, then close it. Use automation when the task is a journey across several pages; use scraping when it is one page; use the browser API when you want low-level session control.
You do not need to write selectors. With act and extract you describe what you want in plain English and the model resolves it to a precise element and action under the hood. The response from act even returns the selector it used, so you can inspect or pin it later. If you prefer to be explicit, the JSON action macro on the scraping endpoint takes selectors directly.
By default a managed model handles act, observe, and extract with no key required from you. If you would rather use your own provider, pass your provider, model name, and key when you open the session, or point at any OpenAI-compatible base URL. Bring-your-own keys are held in memory for the life of the session only and are never logged or persisted.
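As a rough sketch of the bring-your-own-model option, the function below opens a session with a caller-supplied provider, model, and key. The field names (`llm_provider`, `model_name`, `llm_api_key`) come from the session-open example on this page; the function name `open_byo_session` and the `LLM_API_KEY` variable are hypothetical. The field for an OpenAI-compatible base URL is not shown on this page, so it is left out rather than guessed.

```shell
# Sketch: open a session that uses your own model instead of the managed one.
# open_byo_session and LLM_API_KEY are illustrative names, not part of the API.
open_byo_session() {
  # $1 = provider (for example "openai"), $2 = model name
  # Values must not contain double quotes, since no JSON escaping is done here.
  body=$(printf '{"llm_provider": "%s", "model_name": "%s", "llm_api_key": "%s"}' \
    "$1" "$2" "$LLM_API_KEY")
  curl -s -X POST https://api.ollagraph.com/v1/stagehand \
    -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$body"
}
```

Reading the key from an environment variable keeps it out of the function's argument list and out of most shell history files.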
Sessions persist between calls. Opening a session is free; you are billed per action (goto, act, observe, extract, screenshot). The session keeps cookies, storage, and the logged-in context alive between calls, so a login on step one carries through to an extract on step five. Each session has an idle timeout that you can set when you open it, and a keepalive call resets the timer.
Set an idle timeout when you open the session, and send a keepalive call to reset the idle timer during long pauses. When you are finished, delete the session to free its browser context immediately rather than waiting for the timeout.
Deterministic, model-free runs are supported. Send a JSON action macro to the scraping endpoint: an ordered array of click, type, wait, and scroll steps that runs exactly the same way every time, with no per-step inference latency. Use natural-language actions when the page changes often; use macros when the DOM is stable and you want determinism.
Observe lists the actionable elements on the current page — each with a human-readable description, the method (such as click or type), and a selector. It is the discovery step in an agent loop: observe first to see what is possible, then issue the act instructions that make sense. You can pass an optional instruction to focus the search, or omit it to list everything.
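The observe-then-act step of an agent loop might be sketched like this. It is a minimal illustration, not the product's recommended client: `sh_call` and `observe_then_act` are hypothetical helper names, and the check for candidates is a plain text match on the `candidates` array shown in the observe example on this page rather than real JSON parsing.

```shell
API=https://api.ollagraph.com/v1/stagehand

# POST a JSON body to one session endpoint (goto, act, observe, extract).
sh_call() {
  # $1 = session id, $2 = endpoint name, $3 = JSON body
  curl -s -X POST "$API/$1/$2" \
    -H "Authorization: Bearer $OLLAGRAPH_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$3"
}

# Observe with a focus instruction; act only if at least one candidate came back.
observe_then_act() {
  # $1 = session id, $2 = observe instruction, $3 = act instruction
  # Instructions must not contain double quotes (no JSON escaping here).
  found=$(sh_call "$1" observe "{\"instruction\": \"$2\"}")
  # A non-empty candidates array starts with '[' followed by '{'.
  if printf '%s' "$found" | grep -Eq '"candidates":[[:space:]]*\[[[:space:]]*\{'; then
    sh_call "$1" act "{\"instruction\": \"$3\"}"
  else
    echo "no candidates for: $2" >&2
    return 1
  fi
}
```

For example, `observe_then_act "$SID" "find the add-to-cart buttons" "click the first add-to-cart button"` skips the act call when observe finds nothing, which avoids paying for an action that is likely to miss.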
Extract can return structured, typed data. It takes a plain-English instruction and an optional JSON Schema. When you pass a schema, the result is coerced and validated against it, so you get typed fields like name and price instead of free-form text. Omit the schema for quick free-form extraction.
Opening a session is free. Each goto, act, observe, extract, and screenshot counts as a single metered action; keepalive and closing the session are free. JSON action macros on the scraping endpoint are billed as the scrape call that carries them. Failed calls are auto-refunded, so a timeout or a missed element never costs you. See pricing for the current credit packs and the free monthly grant.
1,000 free credits, one bearer token, failed calls auto-refund. Sessions are free to open — you only pay per action.