The VibeSec Reckoning: Martin Fowler's Team Found the Cracks in Vibe Coding

ThoughtWorks’ Global Marketing AI team tried to scale a vibe-coded video assembly prototype and hit two critical security failures: the AI recommended making storage buckets public and assigning excessive token permissions. The incident, documented in a Martin Fowler article published 27 May 2026, comes with devastating industry stats: 25% of AI-generated code has confirmed vulnerabilities, 1 in 5 enterprise breaches is now caused by AI-generated code, and 73% of AI systems have prompt injection exposure.

🔍 THE BOTTOM LINE

Prompting an AI to “be secure” is a suggestion, not a gate. The VibeSec reckoning is that vibe coding without deterministic security controls isn’t rapid prototyping — it’s rapid vulnerability generation.

What Happened at ThoughtWorks

The story is refreshingly concrete. ThoughtWorks’ AI applications team in Global Marketing was asked to scale a video assembly prototype — a tool built with Gemini, Replit AI, and Claude AI to create on-brand videos for 10,000 employees. They hit two security failures that stopped work cold:

Failure 1: Public storage access. The AI recommended making the storage bucket public, or setting cloud file storage to “anyone with the link.” When challenged, it justified this by saying every company does it. Only a firm rejection prompted a secure alternative. This could have leaked unreleased brand assets and audience data to the public internet.
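
A deterministic check for this failure fits in a few lines. The sketch below is illustrative, not the team’s actual fix: it assumes a Google Cloud-style IAM policy already loaded as a Python dict (the article does not say which cloud provider was involved), and `find_public_bindings` and the sample policy are hypothetical names introduced here.

```python
# Hypothetical helper: flags IAM bindings that expose a bucket to the public.
# Assumes a GCP-style policy dict: {"bindings": [{"role": ..., "members": [...]}]}.
PUBLIC_PRINCIPALS = {"allUsers", "allAuthenticatedUsers"}


def find_public_bindings(policy: dict) -> list[tuple[str, str]]:
    """Return (role, member) pairs that grant access to a public principal."""
    findings = []
    for binding in policy.get("bindings", []):
        for member in binding.get("members", []):
            if member in PUBLIC_PRINCIPALS:
                findings.append((binding.get("role", "<unknown>"), member))
    return findings


# Illustrative policy resembling the AI's suggestion: public read access.
suggested_policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
        {"role": "roles/storage.objectAdmin", "members": ["user:dev@example.com"]},
    ]
}

for role, member in find_public_bindings(suggested_policy):
    print(f"BLOCKED: {role} is granted to {member}")
```

Wired into a pipeline step that exits non-zero whenever the list is non-empty, the check no longer depends on how the AI was prompted.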

Failure 2: Excessive token permissions. A service account was assigned the Access Token Creator role, granting it the ability to create short-lived tokens and access databases and resources far beyond what the task required. This would have allowed a compromised service account to move laterally through an entire cloud workspace. The team caught it before running the code.
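
Excessive permissions can be caught the same way: compare what each service account has been granted against an explicit allowlist. This is a minimal sketch under assumptions. The policy shape and role names follow Google Cloud conventions, the service account address is made up, and `excess_grants` is a hypothetical helper rather than a cloud SDK call.

```python
# Hypothetical least-privilege check: a service account may hold only the
# roles on its allowlist. The account name and roles are illustrative.
SERVICE_ACCOUNT = "serviceAccount:video-tool@example.iam.gserviceaccount.com"

ALLOWED_ROLES = {
    SERVICE_ACCOUNT: {"roles/storage.objectViewer"},
}


def excess_grants(policy: dict, allowed: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return (member, role) pairs where a service account exceeds its allowlist."""
    excess = []
    for binding in policy.get("bindings", []):
        role = binding.get("role", "")
        for member in binding.get("members", []):
            if not member.startswith("serviceAccount:"):
                continue  # this sketch only audits service accounts
            if role not in allowed.get(member, set()):
                excess.append((member, role))
    return excess


generated_policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer", "members": [SERVICE_ACCOUNT]},
        # Assumed to correspond to the "Access Token Creator" role in the article.
        {"role": "roles/iam.serviceAccountTokenCreator", "members": [SERVICE_ACCOUNT]},
    ]
}

for member, role in excess_grants(generated_policy, ALLOWED_ROLES):
    print(f"BLOCKED: {member} should not hold {role}")
```

An unknown service account has an empty allowlist here, so any role it holds is reported. Defaulting to deny is a deliberate choice for a gate.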

The key insight: AI tools consistently suggest the path of least resistance. That path is not always the secure one.

The Numbers Are Brutal

The Fowler article compiles research that paints a grim picture:

Statistic	Number	Source
AI-generated code with confirmed vulnerabilities	25%	AppSec Santa, 2026
Rise in attacks exploiting app vulnerabilities (YoY)	44%	SQ Magazine, 2026
Codebases with high/critical severity vulnerabilities	78%	Black Duck OSSRA 2026
Organisations with no sensitive data policies for AI	50%	AppSec Santa, 2026
Enterprise breaches caused by AI-generated code	1 in 5	Aikido Security, 2026
New CVEs from AI-generated code, March 2026 alone	35	Georgia Tech Vibe Security Radar
AI systems with prompt injection exposure	73%	SQ Magazine, 2026
Share of all new enterprise software that is AI-generated	42%	Sonar Developer Survey, 2026

Read that again: 1 in 5 enterprise breaches is now caused by AI-generated code. That’s not a future risk. That’s a present cost.

The Real Problem: Prompts Aren’t Gates

After sharing the incidents internally, the message from ThoughtWorks engineering leadership was blunt:

“It is not sufficient to merely tell the LLM the desired behavior of your output artifacts. If you absolutely do not want something to be true, it must be codified in non-negotiable rules somewhere in your development lifecycle.”

Prompting for test-driven development is not the same as enforcing code coverage thresholds in your build tool. One is a suggestion. The other is a gate. The moment a user pushes back on a prompted restriction, or phrases a request differently, the suggestion evaporates; the gate does not.
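
The difference between a suggestion and a gate can be shown in a few lines. The sketch below is a hypothetical `coverage_gate` function with an assumed 80% threshold; in practice the same effect usually comes from a build tool setting, such as the `--fail-under` option of coverage.py’s `coverage report` command.

```python
def coverage_gate(covered_lines: int, total_lines: int, threshold: float = 80.0) -> int:
    """Return a process exit code: 0 if coverage meets the threshold, else 1."""
    if total_lines <= 0:
        return 1  # nothing measurable counts as a failure, not a pass
    percent = 100.0 * covered_lines / total_lines
    print(f"coverage {percent:.1f}% (threshold {threshold:.1f}%)")
    return 0 if percent >= threshold else 1


# A CI step would pass this value to sys.exit(); a non-zero code fails the
# build no matter what the prompt said. The line counts are made up.
exit_code = coverage_gate(covered_lines=742, total_lines=1000)
print("build passes" if exit_code == 0 else "build fails")
```

No rephrased prompt changes the return value; only more tested code does.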

The Proposed Fix: Harness Engineering

ThoughtWorks proposes “harness engineering” — wrapping AI coding agents in an outer harness with two axes of control:

Feedforward vs. Feedback:

Guides (feedforward) — Anticipate unwanted behaviour and steer the model before it acts. Think: system prompts with security rules, allowed library lists, architecture constraints baked into the agent’s context.
Sensors (feedback) — Observe the code after the agent acts to flag errors. Think: linters, static analysis, security scanners that catch what the guides missed.
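
As an illustration of a sensor, the sketch below uses Python’s standard `ast` module to flag two well-known risky patterns in generated code: calls to `eval` or `exec`, and any call passing `shell=True`. The rule set and the `scan_source` name are assumptions chosen for the example; dedicated security scanners cover far more cases.

```python
import ast

RISKY_CALLS = {"eval", "exec"}


def scan_source(source: str) -> list[str]:
    """Return one finding per risky pattern found in Python source text."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Bare calls such as eval(...) or exec(...)
        if isinstance(node.func, ast.Name) and node.func.id in RISKY_CALLS:
            findings.append(f"line {node.lineno}: call to {node.func.id}()")
        # Any call with a literal shell=True keyword, e.g. subprocess.run(cmd, shell=True)
        for kw in node.keywords:
            if kw.arg == "shell" and isinstance(kw.value, ast.Constant) and kw.value.value is True:
                findings.append(f"line {node.lineno}: shell=True")
    return sorted(findings)


# Illustrative snippet standing in for AI-generated code.
generated = (
    "import subprocess\n"
    "result = eval(user_input)\n"
    "subprocess.run(cmd, shell=True)\n"
)

for finding in scan_source(generated):
    print(finding)
```

The scanner parses the code rather than running it, so it gives the same answer every time regardless of what the agent was told.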

Computational vs. Inferential:

Computational controls — Deterministic, fast, CPU-run. Linters, test suites, dependency checks. These don’t rely on AI judgement — they’re rules.
Inferential controls — Semantic analysis and AI-driven judgment. Prompt constraints, architectural reviews. These rely on the same AI that might miss the problem in the first place.

The model is useful because it maps what’s actually happening: most teams rely almost entirely on inferential feedforward controls (i.e., “prompt the AI to be safe”) and skip computational feedback controls entirely. That’s how you end up with public storage buckets.
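
One way to make that gap visible is to list a team’s controls against the two axes and see which quadrants are empty. The sketch below is an assumption about how one might model this; the `Control`, `Timing`, and `Kind` names are invented for illustration and do not come from the article.

```python
from dataclasses import dataclass
from enum import Enum


class Timing(Enum):
    FEEDFORWARD = "guide"   # steers the agent before it acts
    FEEDBACK = "sensor"     # inspects the output after the agent acts


class Kind(Enum):
    COMPUTATIONAL = "deterministic rule"
    INFERENTIAL = "model judgement"


@dataclass(frozen=True)
class Control:
    name: str
    timing: Timing
    kind: Kind


def missing_quadrants(controls: list[Control]) -> list[tuple[Timing, Kind]]:
    """Return every (timing, kind) combination that no control covers."""
    covered = {(c.timing, c.kind) for c in controls}
    return [(t, k) for t in Timing for k in Kind if (t, k) not in covered]


# The common setup described above: a security prompt and nothing else.
prompt_only = [Control("security rules in system prompt", Timing.FEEDFORWARD, Kind.INFERENTIAL)]

for timing, kind in missing_quadrants(prompt_only):
    print(f"no control for: {timing.name.lower()} / {kind.name.lower()}")
```

For the prompt-only team, three of the four quadrants come back empty, including computational feedback, the quadrant where linters and scanners sit.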

Why This Matters Beyond Engineering

The ThoughtWorks piece makes a point that’s easy to miss: business functions building with AI aren’t exempt from security obligations. Marketing teams, HR teams, operations teams — they’re all building with AI now, and they don’t have security engineers reviewing every output.

Even lightweight internal prototypes must comply with enterprise security standards. Without the right guardrails, AI-assisted development can expose sensitive data long before an application reaches production.

42% of all new enterprise software is now AI-generated. The attack surface is expanding faster than security teams can scan it: 62% of security teams say they can’t keep up with the volume of AI-generated code.

NZ Lens

New Zealand’s small business sector has embraced AI tools enthusiastically, often without dedicated security teams. A sole trader using Claude to build a customer-facing app faces the same risks as ThoughtWorks’ marketing prototype, but has nobody to catch the public bucket recommendation before it goes live.

The Computer Emergency Response Team (CERT NZ) hasn’t issued specific guidance on the security of AI-generated code. Given that 1 in 5 enterprise breaches is now attributed to AI-generated code, that guidance needs to come sooner rather than later.

❓ Frequently Asked Questions

Q: Is vibe coding itself the problem, or is it the lack of security controls?
A: The coding isn’t the problem; the speed without guardrails is. Vibe coding produces useful prototypes fast. The issue is treating those prototypes as production-ready without deterministic security checks.

Q: What’s the single most important control to add?
A: Computational feedback controls: automated security scanning in your CI/CD pipeline that catches vulnerabilities regardless of what the AI was told. A linter doesn’t care about your prompt.

Q: Should teams stop using AI for coding?
A: No. But teams building anything that handles sensitive data or faces the internet need to treat AI-generated code the same way they’d treat code from an untrusted contributor: verify everything, trust nothing.

🔍 THE BOTTOM LINE

Martin Fowler’s team didn’t discover a new vulnerability class. They discovered that the existing vulnerability class, AI suggesting insecure shortcuts, scales with adoption. The more teams vibe code, the more public buckets and excessive permissions slip through. The fix isn’t better prompts. It’s deterministic gates: linters, scanners, and automated checks that don’t rely on the same AI that created the problem. If 42% of new enterprise software is AI-generated and 25% of AI-generated code has confirmed vulnerabilities, then taking both figures at face value, roughly a tenth of new enterprise software (0.42 × 0.25 ≈ 10.5%) is vulnerable AI-generated code. With 62% of security teams already saying they can’t keep up, we’re likely adding vulnerabilities faster than we’re finding them. The VibeSec reckoning isn’t about whether vibe coding is good or bad. It’s about whether we can put guardrails on fast enough to matter.