On this page
- The old contract: push-time data
- Why "just scrape more often" fails
- The new contract: pull-time data
- MCP is the pull-time surface
- The cost of stale data, with numbers
- What "live" actually means in practice
- What we changed to make this work
- When push-time still works
- Three seconds, then back to the conversation
- Next steps
Your agent does not need yesterday's CSV. It needs the page that exists right now, at the moment the user asks the question. The gap between those two things is where most AI products quietly fail.
Teams have been scraping the web for twenty years, and the default architecture is still the same. A cron job runs at 3am. It walks a list of URLs. It writes rows to a warehouse. In the morning, a human opens a dashboard. That loop worked when the consumer was a person. It does not work when the consumer is a Claude session that started forty seconds ago.
This post is about the contract between an agent and its data source. Why that contract has to be pull-time, not push-time. Why the fix is not a faster cron job. And how we wired Stekpad's MCP server so a model can call scrape mid-conversation and get rows back in three seconds.
The old contract: push-time data
The push-time contract is simple. A scheduler runs. A worker fetches. A row lands in a table. A human reads the table. Everything is batched, everything is pre-computed, everything is slightly stale on purpose because the storage is the product.
That worked because the human was the bottleneck. Even if the data was eight hours old, the human was going to spend thirty minutes reading it, so the freshness window never mattered. The human is slow. The cron is slow. The numbers line up.
Agents break the first half of that equation. An agent is not slow. An agent is a function call with a deadline. When a user asks Claude "what's the current pricing on this competitor's enterprise plan", the agent has about three seconds before the user starts feeling latency and about ten seconds before they open a new tab. The dashboard you scraped at 3am last Tuesday is not going to help.
Why "just scrape more often" fails
The obvious fix is wrong. You cannot paper over a contract mismatch by running your cron every five minutes instead of every night. Three things break.
First, cost. If your agent hits 200 different URLs across a week of conversations, you do not want to scrape 200 URLs every five minutes. That is 57,600 scrapes a day for data that might get read once. At 1 credit per scrape, you spent your month's budget on Monday.
Second, coverage. You do not know in advance which URLs the agent will care about. A user asks about a competitor you have never seen before. Your pre-populated table has nothing. The agent makes something up, or worse, returns "I don't have information about that", and the user loses trust.
Third, freshness is a lie if you can't prove it. Even a five-minute cron is stale. The user expects the page the agent describes to match the page they see when they click. If your cache says one thing and the live page says another, the agent just lied. You cannot undo that moment.
The freshness problem is not a performance problem. It is an architecture problem. You cannot solve it by scheduling harder.
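The cost failure, at least, is easy to check with arithmetic. A back-of-envelope sketch using the numbers from the example above (200 URLs, a five-minute cron, 1 credit per scrape):

```python
# Cost of polling every known URL on a short cron,
# using the numbers from the example above.
urls = 200
runs_per_day = 24 * 60 // 5       # a five-minute cron fires 288 times a day
scrapes_per_day = urls * runs_per_day

print(scrapes_per_day)            # 57,600 scrapes a day
print(scrapes_per_day * 30)       # over 1.7 million credits a month
```

Almost all of that spend is for rows nobody ever reads, which is the contract mismatch in one number.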
The new contract: pull-time data
The new contract is the opposite shape. The agent pulls. The source fetches on demand. The response comes back in under ten seconds or it comes back with a typed error. Nothing is pre-computed. Everything is fresh at request time, because the request is the trigger.
This is what a call against Stekpad looks like from inside a Claude session.
```shell
curl -X POST https://api.stekpad.com/v1/scrape \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/pricing",
    "formats": ["markdown", "json"]
  }'
```

The response arrives in about 2 to 4 seconds for a static page, 4 to 8 seconds for a rendered page, and the JSON shape looks like this.
```json
{
  "run_id": "run_01HNXZ3KQ8",
  "url": "https://example.com/pricing",
  "status": "succeeded",
  "markdown": "# Pricing\n\nStarter — $49/month...",
  "json": { "title": "Pricing", "plans": [...] },
  "credits_charged": 1,
  "fetched_at": "2026-04-14T10:22:17Z"
}
```

That fetched_at is not yesterday. It is now. The agent reads the markdown, answers the user, and keeps going. The conversation never paused long enough for anyone to notice a fetch happened.
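From the agent side, all that matters is reading two fields out of that payload: the content and the timestamp. A minimal Python sketch, with the response shape above inlined instead of fetched over the network, so the field names are the only thing carried over from the API:

```python
import json
from datetime import datetime

# The scrape response shape from above, inlined for illustration.
payload = json.loads("""{
  "run_id": "run_01HNXZ3KQ8",
  "url": "https://example.com/pricing",
  "status": "succeeded",
  "markdown": "# Pricing\\n\\nStarter $49/month...",
  "credits_charged": 1,
  "fetched_at": "2026-04-14T10:22:17Z"
}""")

# The agent cares about exactly two things: the content, and proof of freshness.
content = payload["markdown"]
fetched = datetime.fromisoformat(payload["fetched_at"].replace("Z", "+00:00"))

assert payload["status"] == "succeeded"
print(fetched.isoformat())  # timezone-aware UTC timestamp, same clock as the answer
```

The point of pulling fetched_at into a real datetime is that the agent can compare it against the conversation's own clock before trusting the row.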
MCP is the pull-time surface
REST works, but it is not where agents live. Agents live inside MCP clients (Claude Desktop, Cursor, Claude Code, VS Code). MCP gives the model a tool surface it can call without the developer writing a function per tool. That is the right place to put a pull-time data source.
Stekpad ships every verb as an MCP tool from day one: scrape, crawl, map, extract, search, plus read-only tools for datasets. Install once, done.
```json
{
  "mcpServers": {
    "stekpad": {
      "command": "npx",
      "args": ["-y", "@stekpad/mcp"],
      "env": { "STEKPAD_API_KEY": "stkpd_live_..." }
    }
  }
}
```

Paste that into Claude Desktop's config, restart the app, and the model has the new tools. When the user asks a question that needs live web data, the model calls scrape, the tool returns JSON, the answer lands in the same message.
No intermediate database. No overnight batch. No dashboard a human has to open. The agent fetches what it needs when it needs it, and the user sees fresh data in the same reply.
The cost of stale data, with numbers
Hypothetical, but not unrealistic. You run a competitor intelligence feature for a B2B SaaS product. You have 5,000 paying customers. Each of them asks the AI assistant 3 questions a week about competitor pricing, roadmap, or feature launches. That is 780,000 questions a year.
If 5 percent of those answers are wrong because the cache is stale, that is 39,000 wrong answers. If each wrong answer has a 2 percent chance of being the moment a user decides your product is unreliable, that is 780 customers you lost to freshness. At $500 ACV, you burned $390,000 in churn on a cron job.
Pull-time is cheaper than that. Pull-time is 780,000 scrapes at 1 credit each. At Stekpad's 9 euro pack price, 780,000 credits is about 3,510 euros a year. You spend 3,510 to save 390,000. The math is not subtle.
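The arithmetic behind those figures, as a checkable sketch (every input is a hypothetical from the scenario above, not measured data):

```python
# Stale-cache churn math from the hypothetical scenario above.
customers = 5_000
questions_per_week = 3
questions_per_year = customers * questions_per_week * 52   # 780,000

stale_rate = 0.05        # share of answers wrong because the cache was stale
churn_per_wrong = 0.02   # chance a wrong answer is the trust-breaking one
acv = 500                # dollars per customer per year

wrong_answers = round(questions_per_year * stale_rate)     # 39,000
lost_customers = round(wrong_answers * churn_per_wrong)    # 780
churn_cost = lost_customers * acv                          # $390,000

print(questions_per_year, wrong_answers, lost_customers, churn_cost)
```

Change any input and the conclusion survives wide error bars: even at half the stale rate, the churn cost dwarfs the scrape bill.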
What "live" actually means in practice
People use the word "live" loosely. We want to be precise about it, because the distinction is the product.
Live is not streaming. We are not claiming that every page on the web is pushed to you the moment it changes. There is no WebSocket from Wikipedia to your agent. Live means "fetched at the moment the question was asked", not "continuously mirrored".
Live is not cached-with-a-short-TTL. A one-minute cache is still stale relative to the question. The user asked at second 59, the cache was populated at second 0, the user got a 59-second-old answer. For most questions that is fine. For pricing, availability, status pages, news, live scores, campaign pages, it is not.
Live is not "we scraped this recently". Recency is not freshness. A dataset row from an hour ago is useful for building a corpus, but it is the wrong primitive to hand back to a conversation that is happening right now.
Pull-time is the only contract that survives the question "what time was this?". You fetch when you are asked. The timestamp on the row is the same as the timestamp on the answer. The user trusts the answer because it can be trusted.
What we changed to make this work
Getting a scrape back in under three seconds is not free. Several things have to line up.
First, the verb has to be sync. POST /v1/scrape returns the response in the same HTTP call. No run id to poll, no webhook to wire, no second round trip. The agent calls, the backend fetches, the response comes back. Async is still available for crawls where the work genuinely takes minutes, but the single-page verb is synchronous by default.
Second, the pipeline has to be short. Fetch, parse, return. No intermediate job queue for the happy path, no cold-start penalty on a serverless function that has not run in an hour, no ten-step transformation before the row is visible. Our edge workers hold the whole call in memory and return as soon as the page is rendered.
Third, the MCP tool has to budget itself. Every Stekpad MCP call returns a credits_charged field inline with the data. The agent can read it, decide whether the next call is worth it, and stop itself if the workspace is running low on credits. No surprise bills, no open-ended loops.
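What "budget itself" means in code might look like the sketch below. This is not the MCP SDK; it is plain Python with a stubbed tool call, and the only field borrowed from the real response is credits_charged:

```python
# Hypothetical agent-side loop: stop scraping when the budget runs out.
# scrape() is a stub standing in for the real MCP tool call.

def scrape(url: str) -> dict:
    return {"url": url, "markdown": "...", "credits_charged": 1}

def scrape_within_budget(urls: list[str], budget: int) -> list[dict]:
    results: list[dict] = []
    spent = 0
    for url in urls:
        if spent >= budget:   # read the running total, decide the next call
            break             # is not worth it, and stop
        result = scrape(url)
        spent += result["credits_charged"]
        results.append(result)
    return results

pages = scrape_within_budget(
    [f"https://example.com/p{i}" for i in range(10)], budget=3
)
print(len(pages))  # 3: the loop stopped itself, no open-ended spend
```

Because the cost rides inline with the data, the model needs no second API call to find out what it just spent.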
Fourth, writes have to fail loud. If your workspace runs out of credits mid-conversation, you do not want the tool to silently return stale rows. You want an insufficient_credits error the model can read and relay to the user. Every error in the Stekpad API is typed and shaped the same way, which means the model can branch on it without parsing English.
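Branching on a typed error instead of parsing English might look like this. The error envelope here is a sketch built around the insufficient_credits example above, not a verbatim copy of the API's schema:

```python
# Hypothetical error envelope: every failure carries a machine-readable code
# the model can branch on without parsing English.

def handle_scrape_response(payload: dict) -> str:
    if payload.get("status") == "succeeded":
        return payload["markdown"]
    code = payload.get("error", {}).get("code")
    if code == "insufficient_credits":
        # Fail loud: relay the condition, never return stale rows.
        return "The workspace is out of credits; I cannot fetch live data right now."
    raise RuntimeError(f"unhandled scrape error: {code}")

ok = handle_scrape_response({"status": "succeeded", "markdown": "# Pricing"})
msg = handle_scrape_response(
    {"status": "failed", "error": {"code": "insufficient_credits"}}
)
```

The branch on a stable code string is the whole point: the happy path and the failure path are both data the model can act on.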
When push-time still works
We are not saying cron is dead. If the consumer of your data is a human reading a dashboard once a day, schedule a crawl, land rows in a dataset, and get on with your life. That is what POST /v1/crawl is for.
```shell
curl -X POST https://api.stekpad.com/v1/crawl \
  -H "Authorization: Bearer stkpd_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "max_pages": 500,
    "webhook_url": "https://your-app.com/webhooks/stekpad"
  }'
```

Crawls are async. You get a run_id back immediately, you poll or receive a webhook, and the result lands in a dataset you can query later. This is the right tool when the consumer is slow. It is the wrong tool when the consumer is a language model holding its breath.
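The consumer side of "poll or receive a webhook" can be sketched as below. The status names and the idea of a run-status lookup are assumptions for illustration (the real endpoint may differ); the status fetcher is injected so the loop runs without a network:

```python
import time
from typing import Callable

# Hypothetical poll loop for an async crawl run. get_status stands in for a
# run-status lookup against the API; the endpoint shape is an assumption.
def wait_for_crawl(run_id: str, get_status: Callable[[str], str],
                   interval: float = 0.0, max_polls: int = 50) -> str:
    for _ in range(max_polls):
        status = get_status(run_id)
        if status in ("succeeded", "failed"):
            return status
        time.sleep(interval)   # a slow consumer can afford to wait
    raise TimeoutError(f"crawl {run_id} did not finish in {max_polls} polls")

# Fake status sequence standing in for the API during illustration.
statuses = iter(["queued", "running", "running", "succeeded"])
final = wait_for_crawl("run_01HNXZ3KQ8", lambda _id: next(statuses))
print(final)  # succeeded
```

Notice the loop is fine with waiting: the whole design assumes the reader of the result is a dashboard tomorrow morning, not a conversation in flight.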
The right mental model: use crawl for the dataset your humans read every morning, use scrape via MCP for the questions your agent answers in real time. Same product, two clocks.
Three seconds, then back to the conversation
Here is the experience we are optimizing for. A user is deep in a Claude conversation. They ask a question that needs live web data. Claude calls scrape. Three seconds later, the rendered markdown comes back. Claude reads it, answers, and the user never notices the tool call happened, except that the answer is correct.
That is the contract. The agent needs live data. The data source has to meet it where it lives, at request time, on the user's clock, without a cached row from last Tuesday lying about the world.
Cron is fine. Cron is not enough. The agent needs a phone number it can call, not a binder it can flip through.
Next steps
- Read the scrape API reference and the MCP setup guide to see every tool argument.
- See how the cookie bridge lets agents fetch authenticated pages without a server-side cookie jar.
- Compare pricing at /pricing — PAYG credits, no subscription, credits last 12 months.
- Keep reading: Beyond cron jobs goes deeper on the architectural mismatch.