
API discovery

Find the JSON APIs a website uses under the hood and call them directly — faster, more reliable, and cleaner than DOM scraping.

Most modern web apps are React / Vue / Svelte front-ends talking to a JSON backend. The HTML you see is rendered from API responses — which means if you can find the API, you can skip the DOM entirely and get clean, structured data.

Fabric Agents' built-in browser records every network request it makes. Ask the agent to find the API, and it reads that log to discover URLs, methods, and status codes. From there, evaluate lets it call the API directly from inside the browser (with your session cookies) and you get JSON back instead of HTML to parse.

How to use it

The mental model is:

  1. Open the page the normal way and do the action that loads the data you want.
  2. Check the network log: network returns the most recent requests, optionally filtered by status or resource type.
  3. Identify the call you want. Usually it's an XHR or fetch to a /api/... path with a JSON response.
  4. Call it directly via evaluate: fetch('/api/...').

In practice you just say what you want and the agent does it:

"Open our admin panel, go to the users page, find the API it uses to load users, and pull the top 100."

  1. open, then navigate to the admin panel.
  2. navigate /users.
  3. wait network-idle — let the page finish its initial loads.
  4. network 50 2xx — returns the last 50 successful requests.
  5. The agent scans and finds GET /api/v1/users?limit=20.
  6. evaluate fetch('/api/v1/users?limit=100').then(r => r.json()) — returns the data directly.
  7. Hands you a data table.

No DOM parsing, no waiting for the table to render, no brittleness from CSS class changes.
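
The scanning step in that walkthrough can be pictured as a small filter over log entries. The sketch below is an illustration only, not the agent's actual logic: the sample entries, the admin.example.com host, and the findApiCandidates helper are all hypothetical, and the "/api/" path check is just one plausible heuristic.

```javascript
// Sample entries in the shape the network log stores (all values are made up).
const entries = [
  { timestamp: 1700000000000, method: 'GET', url: 'https://admin.example.com/users', status: 200, resourceType: 'document', ok: true },
  { timestamp: 1700000000200, method: 'GET', url: 'https://admin.example.com/static/app.js', status: 200, resourceType: 'script', ok: true },
  { timestamp: 1700000000900, method: 'GET', url: 'https://admin.example.com/api/v1/users?limit=20', status: 200, resourceType: 'fetch', ok: true },
  { timestamp: 1700000001100, method: 'POST', url: 'https://admin.example.com/telemetry', status: 204, resourceType: 'fetch', ok: true },
];

// Hypothetical heuristic: keep successful xhr/fetch calls whose path contains "/api/".
function findApiCandidates(log) {
  return log
    .filter((e) => e.ok && (e.resourceType === 'xhr' || e.resourceType === 'fetch'))
    .filter((e) => new URL(e.url).pathname.includes('/api/'))
    .map((e) => {
      const u = new URL(e.url);
      return `${e.method} ${u.pathname}${u.search}`;
    });
}

const apiCandidates = findApiCandidates(entries);
console.log(apiCandidates);
```

Real sites do not always use an /api/ prefix, so a human or agent reading the log will usually also weigh the resource type and how the URL relates to the data on screen.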

What gets captured

For each request the browser makes, the network log stores:

| Field | Content |
| --- | --- |
| timestamp | When the request started (ms since epoch). |
| method | GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS. |
| url | Full URL including query string. |
| status | HTTP status code. 0 for requests that were blocked or failed before reaching the network. |
| resourceType | xhr, fetch, document, script, image, stylesheet, media, … |
| ok | True for 2xx. |

The log is kept in memory per browser instance, capped at the last 500 requests. New requests push out old ones past that limit.
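
To make the eviction behaviour concrete, here is a minimal sketch of a capped, in-memory log. It is not Fabric's implementation; the CappedLog class and its method names are hypothetical and only model the "last 500, oldest dropped first" rule described above.

```javascript
// Illustrative only: a fixed-capacity log that evicts the oldest entry first.
class CappedLog {
  constructor(capacity = 500) {
    this.capacity = capacity;
    this.entries = []; // oldest first
  }

  // Append an entry and drop the oldest one once the cap is exceeded.
  push(entry) {
    this.entries.push(entry);
    if (this.entries.length > this.capacity) this.entries.shift();
  }

  // Return up to `limit` entries, newest first.
  latest(limit) {
    const start = Math.max(this.entries.length - limit, 0);
    return this.entries.slice(start).reverse();
  }
}

const cappedLog = new CappedLog(500);
for (let i = 1; i <= 502; i++) {
  cappedLog.push({ timestamp: i, method: 'GET', url: `https://example.com/api/items/${i}`, status: 200, resourceType: 'fetch', ok: true });
}
console.log(cappedLog.entries.length, cappedLog.entries[0].timestamp);
```

After 502 pushes the two oldest entries are gone, which is why a request you care about should be read from the log soon after the page makes it.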

What is not captured

  • Request bodies — what you sent.
  • Response bodies — what came back.
  • Request / response headers — including Authorization, Cookie.

That's deliberate: bodies can be large, headers contain credentials, and keeping them all in memory would be both privacy-hostile and slow. When the agent needs a response body, it replays the request with evaluate and captures the return value of fetch directly.

Filtering the log

The network command accepts two optional arguments:

network [limit] [filter]
  • limit: how many entries to return, newest first. The default is a small number of entries.
  • filter: failed, 2xx, 3xx, 4xx, 5xx, or a resource type (xhr, fetch, document, script, etc.).

Common patterns:

| Call | What you'll see |
| --- | --- |
| network | Latest requests, all of them. |
| network 20 failed | Last 20 failed requests; useful for debugging. |
| network 50 2xx | Last 50 successful requests; how you find the real APIs. |
| network 20 xhr | Last 20 AJAX calls; the most useful filter when the page uses XHR. |
| network 20 fetch | Same, for fetch()-based apps. |
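
One way to reason about these filters is as a predicate applied to each entry before the newest-first cut. The sketch below is an assumption about the semantics, not the command's real code: matchesFilter and queryLog are hypothetical names, the default of 20 is a placeholder, and "failed" is assumed to mean status 0 or any 4xx/5xx.

```javascript
// Assumed matching rules for the filter argument; the real command may differ.
function matchesFilter(entry, filter) {
  if (!filter) return true;
  // Assumption: "failed" covers blocked requests (status 0) and 4xx/5xx responses.
  if (filter === 'failed') return entry.status === 0 || entry.status >= 400;
  const statusClass = /^([2-5])xx$/.exec(filter);
  if (statusClass) return Math.floor(entry.status / 100) === Number(statusClass[1]);
  // Anything else is treated as a resource type such as xhr or fetch.
  return entry.resourceType === filter;
}

// `log` is assumed to be ordered oldest first; the result is newest first.
function queryLog(log, limit = 20, filter) {
  const matched = log.filter((e) => matchesFilter(e, filter));
  return matched.slice(Math.max(matched.length - limit, 0)).reverse();
}

// Made-up entries for demonstration.
const sampleLog = [
  { method: 'GET', url: 'https://example.com/users', status: 200, resourceType: 'document', ok: true },
  { method: 'GET', url: 'https://example.com/api/v1/users?limit=20', status: 200, resourceType: 'fetch', ok: true },
  { method: 'GET', url: 'https://example.com/api/v1/me', status: 401, resourceType: 'xhr', ok: false },
  { method: 'GET', url: 'https://example.com/blocked.js', status: 0, resourceType: 'script', ok: false },
  { method: 'POST', url: 'https://example.com/api/v1/issues', status: 201, resourceType: 'fetch', ok: true },
];

console.log(queryLog(sampleLog, 2, '2xx').map((e) => e.url));
console.log(queryLog(sampleLog, 20, 'failed').map((e) => e.status));
```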

Replaying requests

Once the agent spots a URL like /api/v1/users?limit=20, it replays via evaluate:

fetch('/api/v1/users?limit=100', { credentials: 'include' }).then(r => r.json())

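With credentials: 'include', the browser attaches the session's cookies to the request, so authenticated APIs behave the same way they do for the page. (Same-origin requests send cookies by default; the option matters when the API lives on a different origin.) The value the promise resolves to is what the agent gets back from evaluate: the parsed JSON.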
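
A bare fetch(...).then(r => r.json()) will happily try to parse an error page, so a replay can benefit from a status check. The helper below is a sketch under that assumption; fetchJson and its fetchImpl parameter are hypothetical names introduced here, with fetchImpl existing only so the function can be exercised without a network.

```javascript
// Hypothetical helper: replay a logged URL and fail loudly on a non-2xx response.
async function fetchJson(url, init = {}, fetchImpl = globalThis.fetch) {
  // Send cookies by default; caller-supplied options override this.
  const response = await fetchImpl(url, { credentials: 'include', ...init });
  if (!response.ok) {
    throw new Error(`${init.method || 'GET'} ${url} failed with status ${response.status}`);
  }
  return response.json();
}
```

Inside evaluate, the call would then read fetchJson('/api/v1/users?limit=100'), and a 401 or 500 surfaces as an error instead of a confusing JSON parse failure.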

For POST / PUT / PATCH calls, the agent can include a body:

fetch('/api/v1/issues', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  credentials: 'include',
  body: JSON.stringify({ title: 'New issue', priority: 'P2' }),
}).then(r => r.json())

Why it's faster than scraping

  • One network round-trip instead of "render the page, wait for idle, parse the DOM".
  • Structured data — you get the original JSON, not HTML-serialised copies of it.
  • Stable — CSS class names and page layouts change every release; backend APIs are usually versioned and change slowly.
  • Paginable — you can bump limit= or pass a cursor directly. The UI might force 20-at-a-time; the API often lets you pull 100 or 500.
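
Cursor-based pagination can be sketched as a loop that keeps requesting pages until the API stops returning a cursor. Everything about the response shape here is an assumption: the items and nextCursor fields, the limit and cursor query parameters, and the fetchAllPages name are hypothetical, and a real API will have its own conventions that you can read off the logged URLs.

```javascript
// Sketch: follow a cursor until the API reports no further page.
// Assumptions: responses look like { items: [...], nextCursor: string | null },
// and the endpoint accepts `limit` and `cursor` query parameters.
async function fetchAllPages(path, { limit = 100, maxPages = 50, fetchImpl = globalThis.fetch } = {}) {
  const items = [];
  let cursor = null;
  // maxPages is a safety stop so a misbehaving cursor cannot loop forever.
  for (let page = 0; page < maxPages; page++) {
    const params = new URLSearchParams({ limit: String(limit) });
    if (cursor) params.set('cursor', cursor);
    const response = await fetchImpl(`${path}?${params}`, { credentials: 'include' });
    if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
    const body = await response.json();
    items.push(...body.items);
    cursor = body.nextCursor;
    if (!cursor) break;
  }
  return items;
}
```

Since these calls count toward normal quotas, a modest limit and a page cap are usually safer than requesting everything in one shot.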

When to use it

  • Bulk extraction — you need more records than the UI shows per page.
  • Data that's hidden behind interactions — it loads only after you click a tab, or only appears inside a collapsed section.
  • Repeated runs — automations and scheduled jobs benefit most.
  • Debugging — "why does this fail?" often becomes "what's the 4xx the page is quietly swallowing?".

When not to use it

  • The service has a public MCP server or REST API. Add it as a source and skip the browser entirely — clean, reproducible, not reliant on your cookies.
  • You only need to do it once. For a single screenshot or a one-time lookup, DOM interaction is fine.
  • The API uses signed bodies or request signing. Re-calling from fetch won't produce the right signature; the browser flow still works because it goes through the page's own signing code.

Caveats

  • Requests may contain sensitive query parameters. URLs with ?token=... end up in the network log. Don't ship the log anywhere, and don't share a session transcript that includes raw network output.
  • Rate limits apply. The server cannot tell replayed calls from ordinary page traffic, so your requests count toward normal user quotas.
  • Session cookies rotate. If the auth cookie expires mid-session, follow-up fetch calls will return 401. Re-navigate to refresh the session, or log in again.
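
If network output does need to be shared, one mitigation for the first caveat is to mask sensitive query parameters beforehand. This is a sketch, not a built-in feature: redactUrl and the SENSITIVE_PARAMS list are hypothetical, and the list would need to match whatever parameter names your services actually use.

```javascript
// Hypothetical list of query parameter names to mask; extend it for your services.
const SENSITIVE_PARAMS = ['token', 'access_token', 'api_key', 'signature'];

// Replace the value of any sensitive query parameter with a placeholder.
function redactUrl(rawUrl) {
  const url = new URL(rawUrl);
  for (const name of [...url.searchParams.keys()]) {
    if (SENSITIVE_PARAMS.includes(name.toLowerCase())) {
      url.searchParams.set(name, 'REDACTED');
    }
  }
  return url.toString();
}

console.log(redactUrl('https://example.com/api/v1/export?token=abc123&limit=20'));
```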

Related

  • Browser overview: how the browser session works.
  • Examples: general browser automation recipes.
  • Sources: the proper way to use an API when the service offers one.
