AI Makes You Write Better Code — If You Slow Down

Nolan Lawson’s essay “Using AI to Write Better Code More Slowly” hit 695 points on Hacker News this week by arguing the exact opposite of the dominant AI coding narrative. Instead of using LLMs to ship faster, Lawson uses Claude, Codex, and Cursor Bugbot to find bugs, prioritise fixes, and go on side-quests that improve the codebase — even when it means writing less code overall.

🔍 THE BOTTOM LINE

AI’s most valuable role in software development right now might not be generating code — it’s catching bugs you didn’t know existed and forcing you to understand your own codebase better. The counter-intuitive result: slower development, but higher quality.

The argument

Lawson, a well-known open-source developer, starts with a direct challenge:

“A lot of people seem convinced that the point of AI coding is to write low-quality code as fast as possible. Spew out barely-passable slop, open massive PRs, and merge them unvetted. Ship it!”

His counter: LLMs are flexible tools, and you can use them just as effectively to write high-quality code more slowly.

The key insight comes from Mythos — Anthropic’s security-focused model that found 10,000+ vulnerabilities in its first month under Project Glasswing. Lawson cites it directly:

“If Mythos taught us anything, it’s that LLM agents are really good at finding bugs. Throw them at a codebase enough times, and they will find so many bugs that you’ll barely know what to do with them.”

The multi-model review technique

Lawson’s approach is deceptively simple. He has a Claude skill that does this:

1. Run Claude sub-agent to find bugs in a PR, ranked by critical/high/medium/low
2. Run Codex on the same PR with the same ranking
3. Run Cursor Bugbot on the same PR
4. Review all findings, do his own research to rule out false positives
5. Write a final report
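
The steps above can be sketched in a few lines of Python. This is only an illustration of the shape of the pipeline, not Lawson's actual skill: the `Finding` type, the `run_reviewers` and `write_report` helpers, and the three stub reviewers are all hypothetical, and the stubs return canned findings rather than calling Claude, Codex, or Cursor Bugbot. The `confirmed` callback stands in for the human research step that rules out false positives.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Ranking used by every reviewer, most severe first.
SEVERITIES = ["critical", "high", "medium", "low"]


@dataclass(frozen=True)
class Finding:
    reviewer: str   # which tool reported the bug
    severity: str   # one of SEVERITIES
    location: str   # hypothetical "file:line" format
    summary: str


# A reviewer is any callable that takes a diff and returns findings.
Reviewer = Callable[[str], List[Finding]]


def run_reviewers(diff: str, reviewers: Dict[str, Reviewer]) -> List[Finding]:
    """Run every reviewer on the same diff and sort findings by severity."""
    findings: List[Finding] = []
    for review in reviewers.values():
        findings.extend(review(diff))
    return sorted(findings, key=lambda f: SEVERITIES.index(f.severity))


def write_report(findings: List[Finding],
                 confirmed: Callable[[Finding], bool]) -> str:
    """Keep only findings that survived manual checking, one per line."""
    kept = [f for f in findings if confirmed(f)]
    return "\n".join(
        f"[{f.severity.upper()}] {f.location}: {f.summary} (via {f.reviewer})"
        for f in kept
    )


# Stub reviewers with made-up findings, purely for illustration.
def stub_claude(diff: str) -> List[Finding]:
    return [Finding("claude", "high", "cache.py:42", "stale entry never evicted")]


def stub_codex(diff: str) -> List[Finding]:
    return [
        Finding("codex", "critical", "auth.py:10", "token check can be bypassed"),
        Finding("codex", "low", "cache.py:7", "unused import"),
    ]


def stub_bugbot(diff: str) -> List[Finding]:
    return [Finding("bugbot", "high", "cache.py:42", "stale entry never evicted")]


reviewers = {"claude": stub_claude, "codex": stub_codex, "bugbot": stub_bugbot}
findings = run_reviewers("<PR diff text>", reviewers)
# Pretend the human check dismissed the low-severity finding.
report = write_report(findings, confirmed=lambda f: f.severity != "low")
print(report)
```

Keeping each reviewer behind the same callable signature means a fourth tool could be added without touching the rest of the sketch.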

The core principle comes from a Milvus article on AI code review: the more different models you throw at a PR review, the less likely you are to get hallucinations or bogus bugs.
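
A rough way to see why agreement helps, under an assumption the source does not make: suppose each of k models independently invents a particular bogus bug with probability p. The chance that all k report that same bogus bug is then:

```latex
P(\text{all } k \text{ models report the same bogus bug}) = p^{k},
\qquad \text{e.g. } p = 0.2,\; k = 3 \;\Rightarrow\; 0.2^{3} = 0.008
```

Real models share training data, so their mistakes are correlated and the true figure will be higher than this idealised bound. The argument also only applies to findings that several models agree on. Taking the union of everything each model reports would add noise rather than remove it.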

What is multi-model code review? It’s the practice of having several AI models review the same code independently. Each model identifies bugs and issues on its own, then a human (or another model) cross-references the findings. Where models agree, confidence is high. Where they disagree, human judgement fills the gap. The aim is fewer false positives than any single-model review would produce.
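
A minimal sketch of the cross-referencing step, assuming findings can be matched by a shared "file:line" location. That matching rule, the `cross_reference` function, the `min_agree` threshold, and the sample data are all hypothetical. Real tools describe the same bug in different words, so matching would need to be fuzzier in practice.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def cross_reference(reports: Dict[str, List[str]],
                    min_agree: int = 2) -> Tuple[List[str], List[str]]:
    """Split flagged locations into (agreed, needs_human_check).

    `reports` maps a reviewer name to the locations it flagged.
    A location counts as agreed when at least `min_agree` distinct
    reviewers flagged it; everything else is left for a human.
    """
    flagged_by: Dict[str, Set[str]] = defaultdict(set)
    for reviewer, locations in reports.items():
        for location in locations:
            flagged_by[location].add(reviewer)

    agreed = sorted(loc for loc, who in flagged_by.items() if len(who) >= min_agree)
    disputed = sorted(loc for loc, who in flagged_by.items() if len(who) < min_agree)
    return agreed, disputed


# Made-up reports from three unnamed models.
reports = {
    "model_a": ["cache.py:42", "auth.py:10"],
    "model_b": ["cache.py:42"],
    "model_c": ["cache.py:42", "utils.py:3"],
}
agreed, disputed = cross_reference(reports)
print("high confidence:", agreed)
print("needs a human look:", disputed)
```

Here only `cache.py:42` is flagged by more than one model, so the other two locations fall to human judgement.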

The results

Lawson reports:

Metric	Finding
False positive rate	Near zero with multi-model review
Bug finding rate	“Always finds tons of bugs”
Velocity impact	No measurable increase — often finds pre-existing bugs that spawn side-quests
Codebase health	Improves tangentially through side-quest fixes
Developer understanding	Deepens — “the happy-path of a complex architecture is less interesting than its failure modes”

His typical workflow:

1. Have an agent fix all the criticals and highs (with human guidance on the proper solution)
2. Repeat until no criticals/highs remain
3. Skip highs/mediums where the fix isn’t worth the effort (e.g., 100 lines of code for a narrow edge case)
4. Abandon the PR entirely if it has so many criticals that the whole approach is misguided
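
The triage rules in this workflow can be written down as a small decision function. What follows is a sketch under stated assumptions, not Lawson's process: the `Bug` fields, the `abandon_at` threshold of five criticals, and the 100-line cut-off are hypothetical numbers chosen for illustration (the source gives only the "100 lines for a narrow edge case" example and no threshold for abandoning a PR). Mediums and lows are simply left unfixed here, which is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Bug:
    title: str
    severity: str           # "critical", "high", "medium" or "low"
    fix_lines: int          # rough size of the fix, in lines of code
    narrow_edge_case: bool  # True if it only bites in a rare scenario


def not_worth_fixing(bug: Bug, max_fix_lines: int = 100) -> bool:
    """A large fix for a narrow edge case is judged not worth the effort."""
    return bug.narrow_edge_case and bug.fix_lines >= max_fix_lines


def triage(bugs: List[Bug], abandon_at: int = 5) -> Tuple[str, List[Bug], List[Bug]]:
    """Return (decision, to_fix, skipped) for one review round.

    decision is "abandon" when there are too many criticals,
    "fix" when something should be fixed, and "done" otherwise.
    """
    criticals = [b for b in bugs if b.severity == "critical"]
    if len(criticals) >= abandon_at:
        return "abandon", [], list(bugs)

    to_fix: List[Bug] = []
    skipped: List[Bug] = []
    for bug in bugs:
        if bug.severity == "critical":
            to_fix.append(bug)
        elif bug.severity == "high" and not not_worth_fixing(bug):
            to_fix.append(bug)
        else:
            skipped.append(bug)
    return ("fix" if to_fix else "done"), to_fix, skipped


# Made-up findings from one review round.
bugs = [
    Bug("race in cache refresh", "critical", 20, False),
    Bug("overflow on huge inputs", "high", 100, True),
    Bug("missing null check", "high", 5, False),
    Bug("confusing log message", "medium", 2, False),
]
decision, to_fix, skipped = triage(bugs)
print(decision, [b.title for b in to_fix])
```

In a real loop the review would be re-run after each round of fixes, with `triage` called again until it returns "done" or "abandon".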

The side-quest problem

This is where Lawson’s argument gets genuinely interesting. He doesn’t claim AI makes him faster:

“I haven’t necessarily seen my velocity go up. If anything, the review process often finds pre-existing bugs, so I end up on a tangential side-quest where I’m writing unit tests and fixing subtle flaws that pre-date the PR.”

This is the opposite of the “10x productivity” narrative. Lawson is arguing that AI’s value isn’t in output velocity — it’s in codebase comprehension. The side-quests aren’t wasted time; they’re how you learn a codebase’s actual failure modes.

“Pre-LLMs, this is usually how I got familiar with a codebase anyway: understanding where the assumptions break down, and then getting my hands dirty to fix it.”

The cultural moment

The essay resonated because it names something many developers experience but few articulate: the dominant AI coding narrative is wrong, but not in the way critics think.

The “AI = slop cannon” critics are wrong because LLMs genuinely can find bugs. The “AI = 10x productivity” boosters are wrong because the actual experience is slower, more careful, and full of detours. Lawson occupies the space between: AI is genuinely useful, but the value is in quality, not quantity.

He also recommends Matt Pocock’s /grill-me skill — which forces the AI to interrogate you about your own PR until you understand it front-to-back — as another quality-over-velocity technique.

What this means for software engineering

Narrative	Claim	Reality (Lawson)
Vibe coding boosters	AI = 10x velocity	AI often reduces velocity by finding pre-existing bugs
AI skeptics	AI = slop cannon	AI finds real bugs with near-zero false positives (multi-model)
Lawson’s thesis	AI = better code, more slowly	The side-quests improve codebase health and developer understanding

For New Zealand SMEs, where AI adoption is growing (the 2degrees report found measurable results from businesses using AI), Lawson’s approach offers a practical template: don’t use AI to ship faster. Use it to understand what you’ve already built.

❓ Frequently Asked Questions

Q: What exactly is Lawson’s technique?
He runs three different AI models (Claude, Codex, Cursor Bugbot) as code reviewers on the same PR, then cross-references their findings. Where models agree, confidence is high. He then uses another agent to fix criticals/highs with human guidance.

Q: Does this actually make you faster?
No, and that’s the point. Lawson explicitly says he hasn’t seen velocity go up. The value is in finding bugs you didn’t know existed and understanding your codebase better through side-quests.

Q: What’s the false positive rate?
Lawson reports “near zero” when using multi-model review. The key is having multiple models independently flag the same issues: where they agree, the finding is almost always real.

Q: How does this relate to Mythos?
Lawson directly references Mythos, Anthropic’s security model that found 10,000+ vulnerabilities in its first month. His argument is that public models are already good enough for similar (if less dramatic) bug-finding work. You don’t need Mythos; Claude, Codex, and Cursor Bugbot catch plenty.

Q: What does this mean for NZ developers?
The multi-model review approach is immediately implementable with available tools. Lawson’s insight (that AI’s value is in understanding, not speed) is particularly relevant for smaller teams maintaining legacy codebases, where side-quests that fix pre-existing bugs have outsized impact.