How We Taught Our Scanner to Detect IDOR

Our scanner missed one flaw: a broken-access-control IDOR. Here's how we taught its AI agent to detect IDOR on every run, without crying wolf on public keys.

Barret · 8 min read

In our scanner test, we did something most security vendors avoid: we built a deliberately broken app, planted 21 flaws in it, wrote the answer key first, and graded our own scanner against it. It caught 19 of the 20 catchable flaws. Then we published the one it missed.

That miss was an IDOR — broken access control — on an endpoint that handed any stranger a real customer's order. We called it the frontier of automated security: the flaw class a pattern-matcher can't reach, because catching it takes judgment, not a regex. We also said that's where the reasoning-agent and human work begins, instead of pretending a one-click scan catches everything.

This is that work. We went back and taught our scanner to detect the IDOR it walked past. Here's what it took, what changed, and — the part we care about most — what it still can't do.

⚡ TL;DR

  • Our scanner caught 19 of the 20 catchable flaws. The miss was an IDOR: an endpoint returning a stranger's private data, which no pattern match can flag because the response looks fine. The missing authorization check is the bug.
  • We fixed it by changing how the scanner's AI agent reasons, not by adding another rule. It now probes like an anonymous attacker and flags an endpoint that returns real, different customers' records with no login. It caught the IDOR on each of six consecutive re-runs, bringing the test to 20 of 20.
  • It did not start crying wolf. It still holds your public Supabase and Firebase keys at "info," and it never copies customer data into a report.

What it takes to detect an IDOR

IDOR stands for Insecure Direct Object Reference. The example is boring and everywhere: an app has an endpoint like /api/orders/1042. That's your order. Change the number to 1043 and a different customer's name, email, and shipping address come back — with no login at all. Anyone who can count can read every record.

Here is why a scanner struggles. An exposed key or a missing header is visible from the outside: the string is in the response or it isn't. An IDOR is not. The request /api/orders/1043 succeeds and returns clean, well-formed data. Nothing about that response is malformed. The only thing wrong is that it should have required a login and didn't — and "should have" is not a pattern you can grep for.
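To make that concrete, here is a minimal Python sketch of the same lookup with and without an ownership check. The ORDERS table, the function names, and the session_user parameter are all hypothetical; this is an illustration, not code from our test app.

```python
from typing import Optional

# Hypothetical in-memory order table; ids and fields are made up.
ORDERS = {
    1042: {"owner": "alice", "email": "alice@example.com"},
    1043: {"owner": "bob", "email": "bob@example.com"},
}


def get_order_vulnerable(order_id: int, session_user: Optional[str]) -> tuple[int, Optional[dict]]:
    """Returns the record for any id; the caller's identity is never checked."""
    order = ORDERS.get(order_id)
    if order is None:
        return 404, None
    return 200, order


def get_order_checked(order_id: int, session_user: Optional[str]) -> tuple[int, Optional[dict]]:
    """Requires a session, and only returns orders the caller owns."""
    if session_user is None:
        return 401, None
    order = ORDERS.get(order_id)
    # Same 404 for "missing" and "not yours", so ids can't be enumerated.
    if order is None or order["owner"] != session_user:
        return 404, None
    return 200, order
```

Both functions return a well-formed 200 for a valid request. Only the second one asks who is calling, and that difference never shows up in the bytes of a successful response.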

So a naive tool has two bad options. Flag every endpoint that returns data to an anonymous request, and you're back to a wall of red — most of those are public on purpose. Stay quiet, and you miss the real ones. That tension is exactly what our first scan ran into. It saw the endpoint, could not prove the intent, and — to avoid a false alarm — said nothing.

The agent that reads like an attacker

Our scanner has three layers. A free passive pass reads only what a browser downloads. A deeper active pass, which runs only after you verify you own the domain, checks the things you can confirm against your own live app. And on top of that sits a small, bounded AI agent: a reasoning model that probes the verified app the way a curious stranger would, one read-only request at a time.

The agent is where an IDOR should get caught, because catching it is a reasoning task. Fetch /api/orders/1042, note it's a person's order. Fetch /api/orders/1043, note it's a different person's order. No login was presented for either. A human looking at that concludes "broken access control" in seconds. The agent should too.
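The comparison itself can be sketched in a few lines of Python. This is a deterministic stand-in for the reasoning step, not the agent: the fetch callable, the PERSONAL_FIELDS list, and the function names are assumptions made for the example, and a real reasoning model judges far more context than a field-name lookup.

```python
from typing import Callable, Optional

# A fetcher takes a path and returns (status_code, parsed_json_or_None).
# It is assumed to send no cookies and no Authorization header.
Fetcher = Callable[[str], tuple[int, Optional[dict]]]

# Hypothetical field names treated as personal data.
PERSONAL_FIELDS = {"name", "email", "address", "phone"}


def looks_personal(body: Optional[dict]) -> bool:
    """True if the response body carries at least one personal field."""
    return bool(body) and bool(PERSONAL_FIELDS & set(body))


def probe_for_unauth_idor(fetch: Fetcher, template: str, ids: list[int]) -> bool:
    """True if anonymous requests for different ids return different people's records."""
    distinct_people = set()
    for object_id in ids:
        status, body = fetch(template.format(id=object_id))
        if status == 200 and looks_personal(body):
            # Identify a person by the personal values returned, held in memory only.
            identity = tuple(sorted((k, str(body[k])) for k in PERSONAL_FIELDS & set(body)))
            distinct_people.add(identity)
    return len(distinct_people) >= 2
```

Requiring two distinct people is deliberate in this sketch: a single public profile page would not trip it, while two different customers' records handed to a caller with no session would.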

In the first test, it didn't. Not because it couldn't reach the endpoint — it did — but because of how it had been told to weigh what it found.

The real fix was less caution, not more code

The brand promise that makes our scanner worth trusting is that it doesn't cry wolf. We tuned the agent hard toward that discipline: report only on solid proof, treat public-by-design values as non-issues, never inflate a maybe into a critical. Good instincts — except we'd overcorrected. The agent had become so afraid of a false positive that it treated a stranger reading a real customer's private record as "possibly intended," and stayed quiet.

The fix was to change the question the agent asks itself. Instead of "can I be certain this is a vulnerability?" it now reasons from the position it's actually in: an anonymous caller with no account and no session. From there, the test is direct — this endpoint answered me, a stranger, with real customer data. Should it have demanded a login first? An order record with a named person's email and address, handed over with no auth, answers that question on its own.

That single reframing is the whole fix. No new signature, no rule that says "flag /api/orders" — the agent works it out from the responses, the way an attacker would.

🐺 Not a real problem

Teaching an agent to be more aggressive is the easy way to wreck a scanner: now it flags everything. So the test that mattered wasn't "does it catch the IDOR." It was "does it catch the IDOR and still leave the harmless stuff alone." In every run, the agent kept ignoring the Supabase anon key and Firebase web config sitting right there in the page source: public by design, held at info, exactly as before. If your scanner flags those as critical, that's the amateur tell. The skill isn't finding more; it's finding the right things and nothing else.
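One way a tool can tell a public-by-design key from a dangerous one is to read what the key says about itself. The Python sketch below assumes a JWT-format Supabase key, whose payload carries a role claim of "anon" or "service_role". The function names and the "critical" and "review" labels are assumptions for illustration; only the "info" level for the anon key comes from how our scanner behaves.

```python
import base64
import json
from typing import Optional


def jwt_role(token: str) -> Optional[str]:
    """Read the unverified `role` claim from a JWT-shaped key, or None."""
    try:
        payload_b64 = token.split(".")[1]
        # JWT segments drop base64 padding; restore it before decoding.
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        claims = json.loads(base64.urlsafe_b64decode(padded))
        return claims.get("role")
    except (IndexError, ValueError, AttributeError):
        return None


def key_severity(token: str) -> str:
    """Map a key found in page source to a severity label."""
    role = jwt_role(token)
    if role == "anon":
        return "info"      # public by design: meant to ship to the browser
    if role == "service_role":
        return "critical"  # assumption: a server-only key in client code
    return "review"        # anything else needs a closer look
```

The signature is never verified here, because the question is only what kind of key was exposed, not whether it is still valid.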

What changed, measured

We put the reworked agent back against the same live test app and ran it repeatedly, because a reasoning model can be inconsistent and one lucky catch proves nothing.

It caught the IDOR on every one of six consecutive runs. Each time, the trace shows the same move: it pulled two different order ids, saw two different real customers come back with no authentication, and reported broken object-level authorization. That takes the test from 19 of 20 to 20 of 20 catchable flaws.

Two details matter as much as the catch. First, no new false positives — the public keys stayed at info across every run. Second, the finding it writes contains no customer data at all: it records the method, the path template, the status code, and how many records came back — never the names, emails, or card details it read to reach the conclusion. The agent has to see the private data to reason about it, but that data never lands in a report or a stored record. Proving a data-exposure bug shouldn't create a second copy of the exposure.
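A redacted finding of that shape might look like the Python sketch below. The Finding class, to_path_template, and build_finding are hypothetical names; the four fields are the ones listed above, and how our scanner actually stores them is not shown here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    method: str
    path_template: str
    status_code: int
    record_count: int


def to_path_template(path: str) -> str:
    """Replace numeric path segments: /api/orders/1043 becomes /api/orders/{id}."""
    return re.sub(r"/\d+(?=/|$)", "/{id}", path)


def build_finding(method: str, path: str, status_code: int, records: list[dict]) -> Finding:
    """Keep the count of records read, and nothing from inside them."""
    return Finding(method, to_path_template(path), status_code, len(records))
```

The records are passed in only to be counted. Nothing from their contents is copied into the finding, so a stored report cannot become a second copy of the leak.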

The frontier we haven't crossed yet

Being precise about this is the point of the whole exercise, so here's the honest edge.

The IDOR we now catch is the unauthenticated kind: an endpoint that hands private data to anyone, no login required. That's catchable because a single anonymous caller is enough to prove it — a stranger got a record they shouldn't have.

The harder case is cross-tenant IDOR, where both users are logged in: user A signs in and, by changing an id, reads user B's data — and both are supposed to be able to use the app. Proving that needs two real, seeded identities and a comparison of what each is allowed to see. Our anonymous agent can't stand up two logged-in users. That case belongs to deeper active testing with seeded credentials, and at the top tier, to a human audit. We'd rather tell you where the map ends than draw coastline that isn't there.
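For contrast, here is a rough Python sketch of what the two-identity check involves. The fetch_as callable, which would send a request using a given test user's own session, is an assumption, as are the user names; it describes the shape of the test, not something our anonymous agent does.

```python
from typing import Callable

# fetch_as(user, path) -> HTTP status code, using that user's own session.
FetchAs = Callable[[str, str], int]


def cross_tenant_leak(fetch_as: FetchAs, owner: str, other: str, path: str) -> bool:
    """True if a second logged-in user can read a resource only the owner should see."""
    owner_status = fetch_as(owner, path)
    other_status = fetch_as(other, path)
    # If the owner can't read it either, the comparison proves nothing.
    return owner_status == 200 and other_status == 200
```

Even this result needs a human look, since some resources are shared between accounts on purpose. That judgment is part of why the case sits with seeded credentials and an audit.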

How to check your own app for IDOR

You can run this exact check by hand today, and you don't need a login for the most dangerous version:

  1. Find an id in a URL or API call. Open your app, watch the network tab, and look for a request with a number or short id in it — /api/orders/1042, /api/users/8, /api/invoices/551.
  2. Change the id while logged out. Copy that request, alter the id, and send it with no session — an incognito window or a plain curl works. If a different person's data comes back, you have an unauthenticated IDOR, and you found it before anyone else did.
  3. Then test it logged in as a second user. Sign in as one test account, note an id you own, and try to read an id owned by another account. Data you shouldn't see is the cross-tenant case — the deeper one.
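Step 2 can be scripted with nothing but the Python standard library. This is a minimal sketch under stated assumptions: the function names and result strings are made up, the base URL is whatever your own app uses, and it should only be pointed at an app you own.

```python
import urllib.error
import urllib.request
from typing import Callable


def fetch_anonymous(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """GET a URL with no cookies and no Authorization header."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 401, 403, and 404 arrive as exceptions; for this check they are good news.
        return err.code, ""


def compare_ids(
    base_url: str,
    own_id: int,
    other_id: int,
    fetch: Callable[[str], tuple[int, str]] = fetch_anonymous,
) -> str:
    """Request two ids with no session and report whether the other one was readable."""
    own_status, own_body = fetch(f"{base_url}/{own_id}")
    other_status, other_body = fetch(f"{base_url}/{other_id}")
    if other_status == 200 and other_body and other_body != own_body:
        return "possible unauthenticated IDOR: inspect the response by hand"
    return "no anonymous read of the other id"
```

A call such as compare_ids("https://your-app.example/api/orders", 1042, 1043) covers the logged-out case. The result is worded as "possible" on purpose: an endpoint that is public by design returns different data per id too, so a person still has to decide whether that data should have been private.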

FAQ

Can an automated scanner detect IDOR?

Partly, and the distinction matters. The unauthenticated kind — an endpoint that returns private data to a caller with no login — is catchable by a reasoning agent that probes as a stranger and swaps ids to compare responses, which is what ours now does. The cross-tenant kind, where two logged-in users can read each other's data, needs two seeded identities to prove and is the job of deeper active testing and a human audit. A tool that claims it catches every IDOR automatically is overselling.

Why can't a normal scanner catch broken access control?

Because the response looks correct. An IDOR endpoint returns clean, valid data — the flaw is that it should have required authorization and didn't, and "should have" isn't visible in the bytes. Catching it takes reasoning about intent, not matching a pattern, which is why it sits a layer above exposed keys and missing headers.

Does the scanner store the customer data it reads to find an IDOR?

No. To reason about an exposure, the agent has to read the exposed record, but the finding it saves contains only the method, path template, status code, and record count — never names, emails, or card data. Confirming a leak shouldn't create another copy of it.

The bottom line

We missed an IDOR, said so in public, and then closed the gap, not by bolting on a rule but by teaching the scanner's agent to reason like the stranger poking at your API: this answered me with someone's private data, and it shouldn't have. It now catches that IDOR on every run, holds the line on false positives, and keeps the data it reads out of the report. What's left, the fully authenticated cross-tenant case, we've named plainly instead of papering over. An attacker can find an open endpoint in minutes by counting, so find it first, on every deploy. For the wider picture, start with the complete guide to vibe-coding security.
