Blog

In-depth writing on the open web — for AI teams, RevOps, and data engineers.

2026-05-16·14 min read·AI, RAG, training data

Building a web data pipeline for LLM training in 2026

A practical guide to collecting, cleaning, and shipping training data at scale — what works, what fails, and what to outsource.

2026-05-16·12 min read·sales intelligence, technographic, RevOps

Sales intelligence APIs in 2026: a buyer's guide to DNS, WHOIS, and technographic data

What technographic data really is, what it isn't, and how to choose among BuiltWith, ZoomInfo, Clearbit, and the new wave of API-first providers.

2026-05-16·11 min read·real estate, Zillow, anti-bot

Scraping Zillow in 2026: what works, what fails, what to do about it

An honest look at the bot defenses, embedded payload extraction, and the three working strategies for getting Zillow data into a production pipeline.