We Built a Broken Vibe-Coded App to Test Our Scanner
We built a deliberately-broken vibe-coded app with 21 planted flaws to test our security scanner. It caught 19 of 20 — here's the one it missed, and why.
You can point an automated security scanner at your app and get a verdict in under a minute. The harder question is whether to trust it — the wall of red, the "critical" that turns out to be nothing, the real hole it walks right past. A scan is only worth what its judgment is worth.
So we tested ours the hard way. Instead of scanning real apps and hoping they came back clean, we did the opposite: we built a deliberately-broken vibe-coded app, planted 21 specific security flaws in it, wrote down the answer key first, and pointed our own scanner at it. Then we graded the scan against the list.
Here's what it caught, what it refused to panic about, and — the part that matters most — the one thing it missed.
⚡ TL;DR
- We planted 21 flaws in a throwaway app. The hosting platform auto-fixed one before the scan even ran, leaving 20 to find. The scanner caught 19.
- It flagged every real secret — Stripe, OpenAI, AWS, a Supabase service-role key — as critical, and correctly left the Supabase anon key and Firebase config at "info," because those are public by design. Telling those apart is the entire job.
- The one miss was an IDOR (broken access control): the class of flaw no passive scan catches, because proving it needs a second logged-in user. That's the frontier — and we'll show you why.
How we rigged the test
The method is the credibility of any number like this, so here it is.
We built a small analytics app the way an AI tool spits one out over a weekend — a Next.js front end, a handful of API routes, no security review — and planted 21 flaws across every category a scanner should cover: real-looking secret keys hardcoded in the code, config files left reachable, missing security headers, API endpoints that answer without a login, and one broken-access-control bug.
Two rules kept it honest: every "secret" was fake — formatted exactly like the real thing so the detector would treat it as live, but not an actual working key — and we wrote the answer key before running anything, so grading was a checklist, not a vibe.
Then we ran the full scan: the free passive pass that reads only what a browser downloads, plus the deeper active checks a verified owner can run against their own live app.
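The grading step can be sketched as plain set arithmetic. This is a minimal illustration, not the actual grading script: the flaw IDs are hypothetical and the toy key is far smaller than the real 21-item list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grade:
    catchable: int
    caught: int
    missed: frozenset
    unexpected: frozenset


def grade(answer_key: set, fixed_upstream: set, reported: set) -> Grade:
    """Compare scanner findings against an answer key written before the scan."""
    # Flaws removed before deploy (e.g. by the host) cannot be found by the scan.
    catchable = answer_key - fixed_upstream
    return Grade(
        catchable=len(catchable),
        caught=len(catchable & reported),
        missed=frozenset(catchable - reported),
        # Findings that are not in the key are candidates for false positives.
        unexpected=frozenset(reported - answer_key),
    )


# Toy data: every ID below is made up for illustration.
answer_key = {"stripe-secret-key", "env-file-reachable", "missing-csp",
              "idor-orders", "outdated-framework"}
fixed_upstream = {"outdated-framework"}
reported = {"stripe-secret-key", "env-file-reachable", "missing-csp"}

result = grade(answer_key, fixed_upstream, reported)
print(f"{result.caught} of {result.catchable} caught; missed: {sorted(result.missed)}")
# prints: 3 of 4 caught; missed: ['idor-orders']
```

Tracking the unexpected set alongside the misses matters because over-flagging is graded too, not only recall.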
What it caught: the whole common surface
The scanner found what a competent tool should, and graded it correctly.
The dangerous secrets — flagged critical. We buried four keys that can end your week: a Stripe sk_live_ secret key (charges money), an OpenAI key (runs up someone else's bill on your card), an AWS access key, and a Supabase service-role key (the god-mode key that ignores every access rule). All four came back critical.
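As a rough sketch of the first detection step, candidate keys can be pulled out of bundle text with a few patterns. The regular expressions below are approximations chosen for illustration, not the scanner's actual rules, and real key formats vary over time. A hit is only a candidate until it is validated.

```python
import re

# Approximate shapes only (assumption): tune these before relying on them.
CANDIDATE_PATTERNS = {
    "stripe_live_secret": re.compile(r"\bsk_live_[0-9A-Za-z]{16,}"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "openai_style_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
}


def find_candidates(bundle_text: str) -> list:
    """Return (pattern_name, matched_text) pairs found in the bundle text."""
    hits = []
    for name, pattern in CANDIDATE_PATTERNS.items():
        for match in pattern.finditer(bundle_text):
            hits.append((name, match.group(0)))
    return hits


# Fake values built at runtime so no key-shaped literal sits in this file.
# AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
sample = (
    'const stripe = "sk_live_' + "x" * 24 + '";\n'
    'const publishable = "pk_live_' + "y" * 24 + '";\n'
    'const aws = "AKIAIOSFODNN7EXAMPLE";\n'
)
for name, value in find_candidates(sample):
    print(name, value[:12] + "...")
```

The publishable pk_live_ value is not reported, but a long CSS class that starts with sk- would be, which is why pattern matching alone is not a verdict.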
The files that shouldn't be public — flagged. We left /.env, /.git/HEAD, /.git/config, and a config.json reachable at the app's root. A live .git directory can hand an attacker your entire source history; a reachable .env hands them the keys. All flagged, critical and high.
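A status code alone is a weak signal here, since some hosts answer unknown paths with a 200 and the app's index page. One hedged way to reduce that noise is to check whether the body actually looks like the file in question. The function names and heuristics below are illustrative assumptions, not the scanner's implementation.

```python
import re


def looks_like_git_head(status: int, body: str) -> bool:
    """True if the response resembles a real .git/HEAD file."""
    if status != 200:
        return False
    text = body.strip()
    # HEAD is either a symbolic ref or a detached 40-character commit hash.
    return text.startswith("ref: refs/") or re.fullmatch(r"[0-9a-f]{40}", text) is not None


def looks_like_env_file(status: int, body: str) -> bool:
    """True if the response resembles KEY=value lines rather than an HTML page."""
    if status != 200 or "<html" in body.lower():
        return False
    lines = [ln for ln in body.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    assignment = re.compile(r"\s*(export\s+)?[A-Za-z_][A-Za-z0-9_]*\s*=")
    return bool(lines) and all(assignment.match(ln) for ln in lines)


print(looks_like_git_head(200, "ref: refs/heads/main\n"))        # prints: True
print(looks_like_git_head(200, "<!doctype html><html></html>"))  # prints: False
```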
The hardening gaps — flagged. Missing Content-Security-Policy (CSP, which limits what scripts a page can run), missing X-Frame-Options (which stops your page being embedded to trick users), missing X-Content-Type-Options, and a CORS (Cross-Origin Resource Sharing) rule allowing any website to make credentialed requests — a real misconfiguration, not a nitpick. It also caught that the production bundle shipped source maps, which republish your source code to anyone who looks.
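The header checks reduce to inspecting one response. The sketch below assumes the response came from a request sent with a made-up Origin value (probe_origin); if the server echoes that origin back and also allows credentials, any website could make credentialed requests. The function name and finding strings are hypothetical.

```python
def audit_headers(headers: dict, probe_origin: str) -> list:
    """List hardening gaps visible in a single response's headers."""
    # Header names are case-insensitive, so normalise them first.
    h = {k.lower(): v for k, v in headers.items()}
    findings = []
    for name in ("content-security-policy", "x-frame-options", "x-content-type-options"):
        if name not in h:
            findings.append(f"missing {name}")
    allow_origin = h.get("access-control-allow-origin", "")
    allow_credentials = h.get("access-control-allow-credentials", "").lower() == "true"
    if allow_credentials and allow_origin == probe_origin:
        findings.append("CORS reflects an arbitrary origin with credentials allowed")
    return findings


response_headers = {
    "Content-Type": "text/html",
    "Access-Control-Allow-Origin": "https://probe.example",
    "Access-Control-Allow-Credentials": "true",
}
for finding in audit_headers(response_headers, "https://probe.example"):
    print(finding)
```

This version treats a missing X-Frame-Options as a gap even when a CSP frame-ancestors directive would cover the same risk, so a stricter tool might fold those two checks together.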
The endpoints that answer to no one — flagged. Two API routes returned real user data to anonymous requests, no login required — the active scan flagged both. It also caught a page that spilled a raw stack trace on malformed input, the kind of leak that tells an attacker exactly what your server runs.
Tally: 20 catchable flaws on the deployed app, 19 caught.
What it refused to panic about
Here is the part most scanners get wrong, and the reason we ran this test at all.
Sitting in the page source, right next to the four dangerous keys, were a Supabase anon key and a Firebase web config. A scanner that flags on presence — "I see a key-shaped string, sound the alarm" — would have reported six critical secrets. Ours reported four. The other two it marked info.
🐺 Not a real problem
Your Supabase anon key and Firebase web config are meant to live in the browser. They're identifiers ("talk to this project"), not passwords. The real risk is one level deeper: whether your Row Level Security (RLS) rules actually restrict who can read which rows, or are missing entirely. That gap is invisible from the outside and is what exposed data in 170 apps in CVE-2025-48757. The anon key isn't the emergency; the open rules are. (More on which key actually ends your company: service_role vs anon.)
That single distinction — four criticals, not six — is the difference between a report you act on and a wall of red you learn to tune out. A regex that matches a key-shaped string is trivial. Knowing which strings are supposed to be there is the work.
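One way to tell the two Supabase keys apart, sketched under the assumption that the key is in the JWT format and carries a role claim of anon or service_role: decode the payload and read the role. No signature check is needed, because the goal is classification, not authentication. Keys that are not JWTs would need a different check, and the severity labels and helper names here are illustrative.

```python
import base64
import json
from typing import Optional


def jwt_role(token: str) -> Optional[str]:
    """Read the 'role' claim from a JWT payload without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore stripped padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:  # covers bad base64, bad UTF-8 and bad JSON
        return None
    return claims.get("role") if isinstance(claims, dict) else None


def supabase_key_severity(token: str) -> str:
    role = jwt_role(token)
    if role == "service_role":
        return "critical"  # bypasses Row Level Security
    if role == "anon":
        return "info"      # public by design
    return "unknown"


def make_demo_token(role: str) -> str:
    """Build an unsigned, non-working token purely for this demonstration."""
    def enc(obj: dict) -> str:
        return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()
    return ".".join([enc({"alg": "HS256", "typ": "JWT"}), enc({"role": role}), "sig"])


print(supabase_key_severity(make_demo_token("anon")))          # prints: info
print(supabase_key_severity(make_demo_token("service_role")))  # prints: critical
```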
The plot twist: the platform blocked our broken app
A detail too good to leave out: when we first tried to deploy the app, the host refused — it detected a Next.js version with a known CVE and blocked the deploy until we upgraded. Our intentionally-insecure test app got stopped by a supply-chain gate before it could go live.
A useful reminder that some of your security is upstream of you — the framework, the host, and the dependency scanner each catch a slice. But none of them look at your code, your access rules, or your config. That's the slice this scan is for.
The one it missed — and why that's the honest part
Now the miss. We planted a classic IDOR (Insecure Direct Object Reference), one of the most common forms of broken access control. The app had an endpoint like /api/orders/1042. That's your order. Change the number to 1043 and it returns a different customer's name, email, and shipping address, with no login at all. Anyone can read any order by counting.
The scanner did not flag it. And the reason is the most useful thing in this whole post.
An exposed key or a missing header is visible from the outside — it's either in the response or it isn't. Broken access control is not. To prove it, you have to show that user A can read data belonging to user B, which means being two logged-in users at once and comparing what each is allowed to see. A tool with no credentials can request /api/orders/1043 and get data back, but it can't know that endpoint isn't meant to be public. Flag every anonymous data read and you're back to crying wolf; stay quiet and you miss the real ones. That tension is the frontier of automated security testing.
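The two-identity comparison can be sketched in a few lines. Everything here is hypothetical: the fetch callable stands in for an authenticated HTTP request, and the in-memory "app" exists only so the example runs offline. The point is that the check needs two sessions and knowledge of who owns the object, which a credential-less scan does not have.

```python
from typing import Callable, Tuple

# (session_token, path) -> (status_code, body); a stand-in for a real HTTP call.
Fetch = Callable[[str, str], Tuple[int, str]]


def can_read_others_object(fetch: Fetch, owner_token: str, other_token: str, path: str) -> bool:
    """True if a second user receives the same object its owner does."""
    owner_status, owner_body = fetch(owner_token, path)
    if owner_status != 200:
        raise ValueError("owner cannot read their own object; check the test setup")
    other_status, other_body = fetch(other_token, path)
    return other_status == 200 and other_body == owner_body


# Toy in-memory app with made-up users and orders.
ORDERS = {"/api/orders/1042": ("alice", "order 1042"), "/api/orders/1043": ("bob", "order 1043")}
SESSIONS = {"token-alice": "alice", "token-bob": "bob"}


def broken_fetch(token: str, path: str) -> Tuple[int, str]:
    if path not in ORDERS:
        return (404, "")
    return (200, ORDERS[path][1])  # never checks who is asking


def fixed_fetch(token: str, path: str) -> Tuple[int, str]:
    if path not in ORDERS:
        return (404, "")
    owner, body = ORDERS[path]
    return (200, body) if SESSIONS.get(token) == owner else (403, "")


print(can_read_others_object(broken_fetch, "token-bob", "token-alice", "/api/orders/1043"))  # prints: True
print(can_read_others_object(fixed_fetch, "token-bob", "token-alice", "/api/orders/1043"))   # prints: False
```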
This is the honest map: automated scanning is excellent at the wide, common surface — exposed keys, reachable files, missing headers, weak CORS, the presence of a database key — and, done right, at not over-flagging. The deepest layer — broken access control and business-logic flaws — is the last mile, which is why deeper active testing exists and why, at the top, a human audit still does. Anyone selling you a one-click scan that "catches everything" is selling the opposite of the truth.
How to check your own app
You can run every one of these checks on your own deployed app today:
- Search your bundle for secret keys. In dev tools, search the JavaScript for sk_live, sk-, and AKIA, then confirm a hit is a real credential, not a publishable pk_ key or a CSS class name. The validation is the part that matters.
- Check your headers and reachable files. In the network tab, look for Content-Security-Policy and X-Frame-Options. Then visit /.env and /.git/config on your live domain: you should get nothing.
- Test access control by hand. Log in as one test user, note an object id in a URL or API call, then try to read a different id (or hit it logged out). If a stranger's data comes back, you have an IDOR, and you found it first.
Want the same check on your own app?
Run a free, read-only scan of your live app — no install, results in under a minute.
Scan my app free →
FAQ
Can a security scanner find every vulnerability in my app?
No — and any tool that claims it can is the one to distrust. Automated scanning reliably catches the wide, common surface: exposed secret keys, reachable config files, missing headers, weak CORS, and whether a database key is public — in our test, 19 of 20. What it can't reliably catch alone is broken access control and business-logic flaws, which need two logged-in identities to prove. That's what deeper active checks and a human audit are for.
What is an IDOR, and why is it hard to detect?
IDOR (Insecure Direct Object Reference) is when an app lets you reach data you don't own by changing an identifier — order 1042 is yours, but 1043 returns a stranger's. It's hard for a scanner because confirming it requires acting as two different users and proving one can read the other's data. From the outside, an endpoint that returns data could be a leak or could be public on purpose; only the app's intended access rules say which, and those aren't visible in the page source.
My Supabase anon key shows up in the page source — did the scan miss a leak?
No. The anon key is public by design and belongs in the browser. A scanner marking it "info" instead of "critical" is doing its job. Spend the effort confirming your Row Level Security rules actually restrict access — that, not the key, is what protects your data.
The bottom line
We built a broken app, wrote the answer key, and graded our own scanner against it: 19 of 20 catchable flaws found, every dangerous secret escalated, every public-by-design key held at info, and one honest miss — an IDOR — that no passive scan catches because proving it takes a second logged-in user. That's the real shape of automated security: it nails the common surface and the false-positive discipline, and it tells you where the human-and-agent work begins instead of pretending it doesn't exist. An attacker can find your exposed keys and open endpoints in minutes — find them first, on every deploy, and start with the complete guide to vibe-coding security.
Find your gaps before an attacker does.
Is My Site Hackable? scans your deployed app for the exact issues in this article — exposed keys, missing RLS, open buckets — and tells you what's real and what's a false alarm.
Run a free scan →