Cloud AI Bill Shock: AWS and Google Users Hit With Surprise Charges Worth Tens of Thousands

Cloud AI bills are spiralling out of control, with users hit by surprise charges in the tens of thousands. A developer who owed $58K for a week tells the inside story.

Tags: cloud computing, AWS, Google Cloud, AI billing, business risk

An AWS developer is testing autonomous agents. He’s spending $3 to $5 a day, monitoring costs, setting up budget alarms like a good cloud citizen. Then two emails arrive in his inbox after lunch: one for $56,265.59, the other forecasting $70,161.62 if he doesn’t act.

A three-person startup in Mexico has their Gemini API key stolen. Their normal monthly bill is $180. In 48 hours, the thief runs up $82,314.44.

A Wellington man opens his Google Cloud bill and finds $16,000 he didn’t expect.

Cloud AI bill shock is here, and it’s hitting businesses everywhere — including New Zealand, where cloud AI adoption is accelerating faster than cost governance can keep up.

🔍 THE BOTTOM LINE

Cloud providers are making it dangerously easy to rack up enormous AI charges with zero safeguards. Google just added spending caps, but the fix has holes you can drive a truck through. NZ businesses deploying cloud AI without hard cost controls are one bug or one stolen key away from a cash flow crisis.


The $58,000 bug

Damiano Giorgi is an AWS Community Builder. In February 2026, he started testing autonomous agents on Bedrock, using Kimi K2.5 by Moonshot AI through Amazon’s new “Project Mantle” — an OpenAI-compatible API layer announced at re:Invent 2025.

He was doing everything right. Costs were showing $3 to $5 a day. He’d set budget alarms at $100. He was logging into his organisation’s main account daily to check billing. Then the alarm fired — for $56,265.59 — with a forecast of $70,161.62.

The culprit? A bug in Bedrock’s OpenAI-compatible API layer that wasn’t accurately tracking token consumption. Reported costs showed a few dollars a day while the real meter was running up thousands.

“I got pretty lucky,” Giorgi wrote. “I knew the infrastructure well enough to track down the issue and shut down the experiment before costs got worse. Not everyone would be able to do that.”

He was able to get AWS to issue credits. But he had to know exactly what buttons to push and who to call. Most businesses don’t have that luxury.

The stolen key that cost $82K

In February 2026, a three-person startup in Mexico had their Gemini API key stolen. Their normal monthly spend: $180. In 48 hours, the thief ran up $82,314.44 in AI inference charges.

Security researchers at Truffle Security had previously found 2,863 live Gemini API keys exposed in public repositories. The problem was architectural: Google’s original API key system was designed for non-secret identifiers, meaning there was no easy way to make them secure by default.
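
For teams that want to check their own code before a scanner finds it for them, a minimal sketch of a local key scan follows. It assumes the commonly cited shape of Google API keys ("AIza" followed by 35 URL-safe characters); that pattern, and the helper names `find_candidate_keys` and `scan_tree`, are assumptions for illustration and not a description of how Truffle Security ran its research. A match is only a candidate and still has to be verified and rotated by hand.

```python
import re
from pathlib import Path

# Assumption: Google API keys look like "AIza" plus 35 URL-safe characters.
GOOGLE_KEY_PATTERN = re.compile(r"AIza[0-9A-Za-z_\-]{35}")


def find_candidate_keys(text: str) -> list[str]:
    """Return the unique strings in `text` that look like Google API keys."""
    return sorted(set(GOOGLE_KEY_PATTERN.findall(text)))


def scan_tree(root: str) -> dict[str, list[str]]:
    """Walk a directory and map each file path to the candidate keys it contains."""
    hits: dict[str, list[str]] = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        found = find_candidate_keys(text)
        if found:
            hits[str(path)] = found
    return hits
```

Calling `scan_tree(".")` from the root of a repository returns a dictionary of files that contain key-shaped strings. A sketch like this only looks at the working tree, so keys buried in old commits need a history-aware scanner.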

Google eventually issued credits. But developers who disputed charges through their bank — rather than waiting for Google’s refund process — found their entire Google accounts suspended, including Cloud, Play, and YouTube, and were required to upload government ID for reinstatement.

Google’s response: billing caps with a 10-minute hole

In April 2026, Google finally introduced spending caps for the Gemini API, the last major AI provider to do so. Anthropic and OpenAI had offered configurable developer limits for years.

The caps work in tiers: $250/month for Tier 1, $2,000 for Tier 2, with higher tiers requiring negotiation with Google Cloud sales.

But here’s the kicker: when you hit your cap, Google’s system takes up to 10 minutes to detect it and block requests. During that window, API calls keep processing and keep billing. Google is explicit: “users are responsible for overages incurred during that period.”

For a heavy Gemini 3.1 Pro workload at $12 per million output tokens, 10 minutes of uncapped high-throughput traffic is real money. The practical workaround? Set a project-level cap at 80% of your tier ceiling — meaning users have to hack their own safety net because the platform’s isn’t good enough.
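
To put a rough number on that window, the arithmetic can be sketched as below. The $12 per million output tokens, the 10-minute delay, the 80% buffer and the tier ceilings come from the figures above. The throughput values are hypothetical, input-token charges are ignored, and the function names are illustrative only.

```python
OUTPUT_PRICE_PER_MILLION_USD = 12.00   # output-token price quoted above
DETECTION_WINDOW_SECONDS = 10 * 60     # up to 10 minutes before requests are blocked


def overage_usd(output_tokens_per_second: float) -> float:
    """Worst-case output-token spend during the detection window."""
    tokens = output_tokens_per_second * DETECTION_WINDOW_SECONDS
    return tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION_USD


def project_cap_usd(tier_ceiling_usd: float, fraction: float = 0.8) -> float:
    """Project-level cap set at a fraction of the tier ceiling."""
    return tier_ceiling_usd * fraction


if __name__ == "__main__":
    for rate in (10_000, 100_000):  # hypothetical sustained throughput
        print(f"{rate:>7} tok/s -> ${overage_usd(rate):,.2f} during the window")
    for ceiling in (250, 2_000):    # Tier 1 and Tier 2 ceilings from the article
        print(f"${ceiling} tier -> project cap ${project_cap_usd(ceiling):,.2f}")
```

Under those assumptions, a sustained 10,000 output tokens per second costs about $72 over the window and 100,000 per second about $720, while the 80% rule gives project caps of $200 and $1,600 for the two published tiers.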

The wider picture: wasted cloud spend is rising

It’s not just freak billing bugs. Flexera’s 2026 State of the Cloud Report found that wasted cloud spending has climbed to 29% of total cloud spend, the first increase in five years, driven directly by generative AI adoption. Some 81% of organisations now use generative AI, up from 72% in 2025, and 45% say they use it extensively.

The shift from pilots to production is creating a cost forecasting nightmare. AI workloads are variable, data-intensive, and difficult to predict. “Cloud is maturing,” said Flexera CTO Brian Shannon, “but the cost picture is getting worse, not better.”

What this means for NZ

New Zealand businesses are in a particularly exposed position:

  • A Wellington man was hit for $16,000 after using a Google AI-ready BigQuery tool with unclear pricing signals, as reported by the NZ Herald.
  • 29.6% of cloud spend is wasted in NZ organisations, and AI is the primary driver of the increase.
  • The NZ dollar probing 15-year lows makes USD-denominated cloud AI pricing even more painful.
  • AI API pricing is shifting to metered models — a Layer3 NZ analysis warns the trend is “AI as a utility billed by the sip,” making cost forecasting harder.

The pattern from our earlier reporting on NZ’s AI adoption governance gap holds: NZ organisations are deploying fast and governing slow. Cost governance is even further behind security governance.

The trust problem

Here’s the uncomfortable truth: cloud providers have no real incentive to make spending caps easy. More usage equals more revenue. The friction around refunds isn’t a bug — it’s a feature of their business model.

As one NZ-based cloud architect told us: “The alerts tell you you’re on fire, but they don’t put the fire out. You’re still writing the cheque.”

Practical steps

If you’re using cloud AI services:

  • Don’t trust alerts to stop charges. They tell you after the money’s gone.
  • Use separate billing accounts for experimentation, with the lowest possible service quotas.
  • Set project-level caps at 80% of your tier ceiling. Google’s 10-minute billing gap means you need your own buffer.
  • Rotate API keys aggressively and scan for exposed keys. Truffle Security found 2,863 live Gemini API keys in public repositories.
  • Budget for AI API costs at 3-5× your initial estimate. That’s the realistic range during experimentation.
  • If you’re in NZ, factor in the exchange rate. Every USD-denominated API call gets more expensive as the kiwi dollar slides.
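
Several of the steps above can be combined into a small client-side guard that refuses a request before it is sent, instead of alerting after the fact. The sketch below is a minimal illustration under stated assumptions: `SpendGuard`, `BudgetExceeded`, `token_cost_usd`, `padded_budget_usd` and `usd_to_nzd` are hypothetical names, the prices and exchange rate are inputs you supply, and the guard only knows what your own code tells it. As the Bedrock case shows, client-side numbers can diverge from the provider's meter, so treat this as a second line of defence and not a replacement for provider-side caps.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a request would push tracked spend past the limit."""


@dataclass
class SpendGuard:
    limit_usd: float
    spent_usd: float = 0.0

    def authorize(self, worst_case_usd: float) -> None:
        """Call before each request with its worst-case cost; raises if over limit."""
        if self.spent_usd + worst_case_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f}, request could add "
                f"${worst_case_usd:.2f}, limit is ${self.limit_usd:.2f}"
            )

    def record(self, actual_usd: float) -> None:
        """Call after each request with the cost computed from reported usage."""
        self.spent_usd += actual_usd


def token_cost_usd(input_tokens: int, output_tokens: int,
                   input_price_per_million: float,
                   output_price_per_million: float) -> float:
    """Cost of one request from token counts and per-million-token prices."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000


def padded_budget_usd(estimate_usd: float, multiplier: float = 3.0) -> float:
    """Pad an initial estimate by the 3-5x range suggested above."""
    return estimate_usd * multiplier


def usd_to_nzd(usd: float, usd_per_nzd: float) -> float:
    """Convert a USD amount to NZD given how many USD one NZD buys."""
    return usd / usd_per_nzd
```

In use, the worst-case figure passed to `authorize` would typically come from `token_cost_usd` with the request's maximum output tokens, and the value passed to `record` from the usage the API reports back. Checking before the call is the design choice that matters here: it is what separates a cap from an alert.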

❓ Frequently Asked Questions

Q: Can I set a hard spending limit on AWS Bedrock or Google Cloud?

Not really. AWS doesn’t offer hard spending caps, only alerts. Google’s Gemini caps arrived in April 2026 but have a 10-minute overage window. Anthropic lets you set per-workspace budget limits on Claude that actually stop requests. OpenAI has usage limits, but they’re soft. The industry is catching up, but slowly.

Q: What happens if I dispute a charge with my bank?

If you have a Google account, be warned: Truffle Security documented cases where bank disputes of Gemini API charges triggered suspension of the entire Google account — Cloud, Play, YouTube, everything — plus a government ID verification requirement to get it back.

Q: Are billing bugs common?

More common than providers admit. The $58K Bedrock bug from February 2026, the $82K stolen key incident, the Gemini 2.5 Flash pricing bug from August 2025 that charged for never-generated tokens — these are documented, cross-verified incidents, not edge cases.


🔍 THE BOTTOM LINE

Cloud AI bill shock is the predictable outcome when providers that profit from runaway spending offer customers the illusion of control without the levers to stop the bleeding. Google’s new caps are a step forward, but a 10-minute billing gap isn’t a safety net. It’s a trap door with a slow trigger. NZ businesses diving into cloud AI need to treat cost governance as seriously as security governance. One bug or one stolen key is all it takes.

Sources: Damiano Giorgi (first-hand AWS Bedrock billing bug account), NZ Herald, IT Brief NZ (Flexera State of the Cloud 2026), TokenCost (Google Gemini API billing caps analysis), The Register, Truffle Security (exposed Gemini API keys research), trussed.ai (AI API cost overrun prevention), Layer3 NZ (AI metered pricing)