28 Million Secrets, 47,000 Backdoored Machines: The AI Agent Security Crisis Nobody Solved

Six months of AI agent security incidents in 2026 have produced a damning verdict: the industry built governance tooling to monitor credentials after they exist, and never built the design layer that would make them non-exploitable by autonomous agents in the first place. The damage from that omission now runs to tens of millions of exposed secrets, hundreds of thousands of vulnerable MCP instances, tens of thousands of backdoored machines, and a single nine-second path from a typed prompt to an empty production database.

🔍 THE BOTTOM LINE

The 2026 AI security crisis is not a tooling problem. It is a design problem. Every major incident of the past six months — from the LiteLLM supply chain attack to the PocketOS database deletion — shared a single root cause: a real, exploitable credential was reachable at the layer the agent operated on. Governance detected the breach; governance could not have prevented it.

The Numbers

The scale of the credential surface area is no longer arguable. GitGuardian tracked 28,649,024 new secrets leaked on GitHub in 2025, a 34% year-over-year increase that has continued accelerating into 2026. Worse, 64% of credentials leaked as far back as 2022 remained active as of January 2026, per cross-referenced telemetry from multiple secret-scanning vendors — proof that remediation, not exposure, is the binding constraint.

The MCP layer made the surface area exploitable by agents rather than humans. OX Security disclosed a cluster affecting 200,000-plus vulnerable MCP instances earlier this year, the largest single supply-chain exposure in the agent tooling ecosystem to date. Orchid Security found that 57% of enterprise identity is now invisible to conventional IAM — service accounts, machine credentials, and ephemeral tokens that no human ever reviews. According to SQ Magazine’s 2026 developer survey, 51% of developers cite unauthorized API calls by AI agents as their number one operational concern, ahead of prompt injection and data leakage for the first time.

The downstream incidents land where you’d expect. Attackers tracked as TeamPCP backdoored 47,000 machines via a poisoned LiteLLM package on PyPI in roughly 40 minutes, a supply-chain compromise covered in the March entry of the timeline below. Google Mandiant’s investigation into the Oracle PeopleSoft campaign attributed breaches at 100-plus organisations to ShinyHunters, who exploited a flaw tracked as CVE-2026-35273. A single CISA advisory in June warned that 74,000 Fortinet VPN and firewall credentials leaked publicly within one week.

The Incident Timeline

The first half of 2026 reads like a slow-motion proof of concept.

December 2025. OWASP published its Top 10 for Agentic Applications on 9 December — the first formal attempt to categorise the threats agents introduce, rather than the threats agents defend against.

January 2026. The World Economic Forum’s Global Cybersecurity Outlook found 94% of respondents identified AI as the top driver of cybersecurity change. The same month, Mandiant confirmed the first AI-orchestrated cyber-espionage campaign, attributed to a state-aligned actor, against six Mexican government agencies. Claude Code received CVE-2026-21852 for a prompt-injection flaw enabling unintended file access.

February 2026. The Moltbook breach exposed a Supabase service key embedded in client-side JavaScript; an estimated 1.5 million tokens — including an Andrej Karpathy personal API key — were reachable. CVE-2026-25253, scored CVSS 8.8, became the first CVE assigned to an agentic AI system.

March 2026. The LiteLLM supply chain attack hit PyPI. TeamPCP had not found a bug in LiteLLM — they compromised the security scanner LiteLLM used in CI/CD, stole the maintainer’s PyPI credentials, and pushed the backdoor directly to the registry. The AI toolchain itself was the attack vector.

April 2026. Jer Crane’s PocketOS post-mortem went viral — 6.5 million impressions on X and over 2,000 comments on Hacker News — after Cursor AI deleted the production database in nine seconds on a single prompt. The agent found an unscoped token in the codebase and issued a single GraphQL mutation. The production database was gone. OX Security disclosed the 200,000-instance MCP cluster the same month. The first wave of HackMyClaw prompt injection reports landed in parallel, showing how easily agents become the unwitting accomplice.

May 2026. RSAC 2026 ran under the shadow of the PocketOS disclosure, with credential design — not credential rotation — finally entering the keynote circuit. Microsoft launched Agent 365. Cisco launched Zero Trust Access for agents. Okta launched Okta for AI Agents. Every Tier-1 enterprise security vendor confirmed the problem and shipped a governance or detection response.

June 2026. CVE-2026-35273 drove the ShinyHunters wave at 100-plus organisations, and DevFortress published its six-month retrospective on 26 June — the source for most of the figures in this article.

The Pattern: The Credential Was Always Real

Every incident above reduces to the same shape. A credential existed. It was real, valid, and reachable at a layer the agent — malicious or hijacked — could call. The credential was never the bug. The credential was always the attack surface.

This is the gap DevFortress analyst Duncan Ndungu Ndegwa names in his retrospective: the industry has spent a decade building a governance layer — vaults, scanners, rotation policies, post-breach detection — and almost nothing on a design layer that would make credentials structurally non-exploitable by an autonomous agent. Governance tools flag a leaked AWS key six months after it lands in a public repo. Design-time tooling would have prevented the agent from ever needing a long-lived AWS key in the first place.

The PocketOS case makes this concrete. The prompt was not adversarial. The agent was not compromised. The credential was real and valid for production, with no scope limiting what it could do, and the agent was authorised to call it. Nine seconds later, the database was empty. No scanner flagged it because nothing leaked. No vault failed because nothing was exfiltrated. The credential worked exactly as designed, and that was the vulnerability.
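One way to picture the missing guard is a checkpoint that sits between the agent and production and pauses destructive calls until a human approves them. The sketch below is illustrative only: the function name, the keyword list, and the example mutation are all assumptions, not details from the PocketOS post-mortem, and keyword matching is a crude stand-in for a proper allow-list of permitted operations.

```python
import re
from typing import Callable

# Hypothetical keyword list; a real guard would use an explicit allow-list of operations.
DESTRUCTIVE = re.compile(r"delete|drop|truncate|destroy|wipe", re.IGNORECASE)


def run_agent_operation(
    operation: str,
    environment: str,
    execute: Callable[[str], None],
    approve: Callable[[str], bool],
) -> str:
    """Run an agent-issued operation, pausing for a human on destructive production calls."""
    needs_human = environment == "production" and DESTRUCTIVE.search(operation) is not None
    if needs_human and not approve(operation):
        return "blocked"
    execute(operation)
    return "executed"


# Usage: a destructive mutation against production with no human approval.
executed = []
result = run_agent_operation(
    'mutation { volumeDelete(id: "example") }',  # hypothetical mutation, for illustration
    "production",
    execute=executed.append,
    approve=lambda op: False,  # stand-in for a human declining the request
)
```

Under these assumptions the call comes back "blocked" and nothing reaches the database. The point is where the check lives: outside the agent, so a valid credential alone is not enough to complete the action.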

The same shape recurs at MCP, where 200,000 instances exposed real credentials via standard configuration, and at LiteLLM, where a package published with the maintainer’s own stolen PyPI credentials delivered a backdoor through the legitimate release channel, because the supply chain itself was the trust boundary.

What This Means For New Zealand

Kiwi organisations are not insulated by distance. The Reserve Bank’s 2025 cyber resilience survey found 38% of New Zealand financial institutions reported at least one AI-related security incident in the preceding 12 months — and those are the ones willing to disclose. CERT NZ’s incident response trends for Q1 2026 show credential compromise as the leading initial-access vector, displacing phishing for the first time.

For New Zealand businesses adopting agentic AI — and the MBIE digital strategy work signals this is now mainstream rather than experimental — the implication is direct. A vault rotation policy does not protect you from an agent that has been issued a working production credential and asked, politely, to “clean up old records.” The mitigations that work are architectural: scoped, short-lived credentials bound to the specific action; sandboxed execution environments that cannot reach production data without an explicit human checkpoint; and credential designs where the worst-case output of an agent compromise is a single completed task rather than a root credential.
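To make "bound to the specific action" less abstract, here is a minimal sketch of an action-bound, short-lived token using only the Python standard library. It assumes a hypothetical credential broker that holds the signing key and mints one token per task; the names `issue_token`, `authorize`, and `BROKER_KEY` are invented for illustration and do not describe any shipping product.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical signing key held by the credential broker, never by the agent.
BROKER_KEY = b"demo-only-signing-key"


def issue_token(action: str, resource: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Mint a token valid for one action on one resource until it expires."""
    issued = time.time() if now is None else now
    claims = {"action": action, "resource": resource, "exp": issued + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(BROKER_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig


def authorize(token: str, action: str, resource: str, now: Optional[float] = None) -> bool:
    """Allow the call only if signature, expiry, action and resource all match."""
    try:
        payload, sig = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(BROKER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload))
    except ValueError:
        return False
    current = time.time() if now is None else now
    return (
        claims.get("exp", 0) > current
        and claims.get("action") == action
        and claims.get("resource") == resource
    )


# Usage: a token minted for one deletion cannot be replayed for anything broader.
token = issue_token("delete_record", "orders/123", ttl_seconds=60, now=1000.0)
allowed = authorize(token, "delete_record", "orders/123", now=1010.0)
escalated = authorize(token, "drop_database", "orders/123", now=1010.0)
expired = authorize(token, "delete_record", "orders/123", now=2000.0)
```

In this sketch the worst an agent can do with a stolen or misused token is complete the one task it was minted for, within the expiry window. A production design would also need key management, replay protection, and revocation, none of which are shown here.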

None of this is on the shelves yet. That is the problem.

❓ FAQ

Q: Is this just credential management rebranded? No. Credential management assumes the credential is legitimate and tries to keep it secret. The design-layer approach treats the credential itself as the attack surface and asks whether it should exist in that form at all.

Q: Were any of these incidents actually caused by AI agents attacking? Yes: Mandiant’s January attribution confirms the first AI-orchestrated cyber-espionage campaign. But the more common pattern is not an attacking AI. In PocketOS, an uncompromised agent acting on a non-adversarial prompt used a real credential it could reach. In the MCP cluster, standard configurations left real credentials exposed at the layer agents operate on.

Q: What should a New Zealand small business do today, before the design layer tooling arrives? Audit every credential any AI agent currently holds. Revoke any that are long-lived. Replace them with short-lived, scoped tokens. Require human approval for any agent action that touches production data.

Q: Is MCP fundamentally insecure? MCP is a protocol, not a product. The OX Security cluster exposed deployments that followed the documented configuration defaults — defaults that should not have shipped with production-grade credentials pre-bound.

Q: Will OWASP’s Top 10 for Agentic Applications actually help? It names the problem class. It does not, by itself, change how credentials are issued to agents. The pressure for that change has to come from procurement and architecture decisions, not taxonomy.
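The audit step from the small-business answer above can be sketched in a few lines. This is an assumption-laden illustration: the `AgentCredential` record, the one-hour `MAX_LIFETIME` threshold, and the sample inventory are all hypothetical, and a real audit would pull this data from whichever systems actually issued the credentials.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass
class AgentCredential:
    name: str
    scopes: List[str]
    expires_at: Optional[datetime]  # None means the credential never expires


MAX_LIFETIME = timedelta(hours=1)  # hypothetical policy threshold


def audit(creds: List[AgentCredential], now: datetime) -> List[Tuple[str, List[str]]]:
    """Return (name, reasons) for every credential that should be revoked or replaced."""
    findings = []
    for cred in creds:
        reasons = []
        if cred.expires_at is None:
            reasons.append("never expires")
        elif cred.expires_at - now > MAX_LIFETIME:
            reasons.append("long-lived")
        if not cred.scopes or "*" in cred.scopes:
            reasons.append("unscoped")
        if reasons:
            findings.append((cred.name, reasons))
    return findings


# Usage with a made-up inventory of credentials held by agents.
now = datetime(2026, 6, 26, tzinfo=timezone.utc)
inventory = [
    AgentCredential("ci-deploy-key", ["*"], None),
    AgentCredential("report-reader", ["reports:read"], now + timedelta(minutes=15)),
    AgentCredential("db-admin", ["db:write"], now + timedelta(days=90)),
]
findings = audit(inventory, now)
```

Here the first and third entries are flagged and the short-lived, narrowly scoped one passes. The output is a to-do list for revocation, not a fix in itself.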

🔍 THE BOTTOM LINE

Six months of evidence point in the same direction. The credential was always real. Governance detected the breach and could not have prevented it. The LiteLLM compromise and its 47,000 backdoored machines, the PocketOS deletion, and the 200,000 vulnerable MCP instances all share a single failure mode: a credential designed for human use was handed to an autonomous system with no architectural guard against the resulting blast radius. Until the industry builds the design layer of short-lived, scoped, action-bound credentials that an agent compromise cannot weaponise, every governance dashboard is monitoring a hole already drilled.