OpenAI's 'Spud' Model Nears Launch — Two Years of Research Compressed Into One Release

Spud isn't just another iteration — it's OpenAI's first fresh pretrain since GPT-4o, and it could reset the AI benchmark leaderboard.

OpenAI’s next flagship model, internally codenamed “Spud,” completed pretraining on March 24, 2026, and prediction markets now give it 78% odds of shipping by April 30. If it lands on schedule, it won’t just be another incremental update. According to OpenAI co-founder Greg Brockman, Spud represents roughly two years of accumulated research compressed into a single model launch.

That framing matters. OpenAI hasn’t completed a successful full-scale pretraining run since GPT-4o in May 2024. Every model since — GPT-4.5, the o-series — has been built on existing foundations. Spud is different. It’s a fresh pretrain, which means new architecture, new training data, new scaling decisions baked in from the ground up.


What Brockman Actually Said

In a recent interview, Brockman was unusually specific about what Spud represents within OpenAI’s development process:

“The way that our development process works is you have pre-training. You produce a new base model, and that then is the foundation that we build further improvements on top of. And that is always a huge effort across many people in the company.”

On his personal involvement: “That’s where I’ve actually been spending most of my efforts over the past eighteen months — really focused on our GPU infrastructure, on supporting the teams that do all of the training frameworks to scale up at these big runs.”

And on Spud specifically: “I think of Spud as a new base, as a new pre-train, and I’d say it’s like we have maybe two years worth of research that is coming to fruition in this model.”

Eighteen months of infrastructure work. Two years of research. This isn’t a side project. It’s OpenAI’s answer to the criticism that they’ve been coasting on GPT-4’s foundation for too long.


Why a Fresh Pretrain Matters

Semiconductor research firm SemiAnalysis flagged this exact issue last year: OpenAI hadn’t completed a full-scale pretraining run since GPT-4o. The models in between — GPT-4.5, o1, o3 — were built on existing foundations, not new ones. That’s not inherently bad, but it limits how much capability can improve. You can only squeeze so much juice from an existing architecture.

A fresh pretrain changes the calculus. Google credited pretraining improvements as a key driver of Gemini 3’s performance gains. Brockman’s comments suggest OpenAI is now positioned to make the same leap, potentially overtaking competitors who’ve held the benchmark crown in recent months.

The competitive timing is sharp. OpenAI has lost ground to Google and xAI on benchmarks over the past year. Google I/O is May 19-20. Shipping Spud before then preempts whatever Gemini 3.1 Ultra announcement Google has planned.


What Spud Is Expected to Improve

OpenAI hasn’t published a spec sheet. But based on researcher commentary, eval leaks, and the community tracking OpenAI’s pipeline, here’s what’s expected:

Larger context window: GPT-5 maxes out at 128K tokens. Spud is expected to extend this to 256K or even 512K tokens, addressing the coherence degradation that GPT-5 exhibits in the final 20% of long contexts. For RAG pipelines stuffing 100K+ tokens of retrieved documents, this is a meaningful improvement.

Better multi-step tool use: In agentic workflows requiring 5+ sequential tool calls, GPT-5’s error rate compounds. Spud reportedly improves reliability on multi-step tool chains, the specific capability that agent frameworks like LangChain and agentic coding tools like Claude Code depend on.

Lower hallucination on factual queries: The expected gains are on obscure-fact benchmarks such as TriviaQA and PopQA, where GPT-5 still hallucinates at rates that matter in production RAG pipelines.

Code generation improvement: Pass@1 on HumanEval and LiveCodeBench reportedly improves by 8-12 percentage points over GPT-5.

JSON output consistency: GPT-5 produces malformed JSON at low but non-trivial rates on complex schemas. Spud reportedly brings this below 0.1% — which matters at millions of API calls.

None of these are confirmed benchmarks. They’re informed estimates from the community. But the pattern is clear: Spud targets production reliability, not just benchmark scores.
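To see why multi-step reliability and sub-0.1% JSON failure rates matter at scale, here is a minimal back-of-the-envelope sketch. The per-step success rates and the call volume are illustrative assumptions, not reported figures for GPT-5 or Spud, and the model assumes each tool call fails independently.

```python
def chain_success_rate(per_step_success: float, steps: int) -> float:
    """Probability that every step in a sequential tool chain succeeds,
    assuming steps fail independently and at the same rate."""
    return per_step_success ** steps


def expected_malformed(calls: int, malformed_rate: float) -> float:
    """Expected number of malformed JSON responses across a batch of calls."""
    return calls * malformed_rate


if __name__ == "__main__":
    # Hypothetical per-step success rates, chosen only for illustration.
    for p in (0.95, 0.99):
        for n in (1, 5, 10):
            rate = chain_success_rate(p, n)
            print(f"per-step {p:.2f}, {n:>2} steps: chain success {rate:.3f}")

    # 0.1% is the threshold cited above; one million calls is an assumed volume.
    print(expected_malformed(1_000_000, 0.001))
```

Under these assumptions, a 95% per-step success rate drops to about 77% across five sequential calls, and even a 0.1% malformed-JSON rate still means roughly 1,000 bad responses per million calls.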


Sam Altman’s Comments

OpenAI CEO Sam Altman has been characteristically measured about Spud publicly, but his recent statements point to significance. He described it as “a very strong model” that “could really accelerate the economy” — language that echoes how he framed GPT-4 before its launch.

Brockman went further, suggesting OpenAI is “70-80% of the way to AGI” with Spud. That’s a bold claim, and one that should be taken with appropriate skepticism given OpenAI’s incentive to build hype ahead of a major release. But even stripped of marketing, the technical case for a significant capability jump is solid: fresh pretrain, two years of research, eighteen months of infrastructure investment.


The Timeline: Why 78%, Not Higher

Polymarket’s 78% probability of a release by April 30 sounds confident. The remaining 22% reflects two legitimate concerns, partly offset by one factor that pushes the other way:

Post-training safety evaluations: OpenAI’s red-teaming process after pretraining has historically added at least 3-8 weeks, and sometimes far longer. GPT-4 finished pretraining in August 2022 and was released in March 2023, roughly seven months later. GPT-5 had a shorter cycle, but regulatory scrutiny has increased since the EU AI Act took effect.

Regulatory review pressure: The EU AI Act’s general-purpose AI provisions require OpenAI to file model cards and capability disclosures for models above the compute threshold. GPT-5 was the first model subject to this requirement. Spud will be too. Additional documentation requests could push the timeline.

Competitive timing: This factor cuts the other way. OpenAI has historically shipped to blunt competitor announcements, and Google I/O on May 19-20 gives it an incentive to release before mid-May, which pushes toward April, not May.

The base case remains mid-to-late April. The tail risk is early May.
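As a rough check on that timeline, the dates above can be run through simple calendar arithmetic. This sketch assumes the 3-8 week post-training range is counted from the March 24 pretraining completion date, which is a simplifying assumption; the function and constant names are made up for illustration.

```python
from datetime import date, timedelta

PRETRAIN_DONE = date(2026, 3, 24)    # pretraining completion date cited above
MARKET_DEADLINE = date(2026, 4, 30)  # prediction-market cutoff
GOOGLE_IO_START = date(2026, 5, 19)  # first day of Google I/O


def release_window(start: date, min_weeks: int, max_weeks: int) -> tuple[date, date]:
    """Earliest and latest release dates for a post-training range given in weeks."""
    return start + timedelta(weeks=min_weeks), start + timedelta(weeks=max_weeks)


earliest, latest = release_window(PRETRAIN_DONE, 3, 8)
days_to_deadline = (MARKET_DEADLINE - PRETRAIN_DONE).days

if __name__ == "__main__":
    print(f"Release window: {earliest} to {latest}")
    print(f"Days from pretraining to market cutoff: {days_to_deadline}")
    print(f"Window ends on or after Google I/O: {latest >= GOOGLE_IO_START}")
```

On these assumptions the window runs from April 14 to May 19. The April 30 cutoff falls 37 days in, a little past the five-week mark, and the eight-week upper bound lands on the first day of Google I/O, which is consistent with a base case of mid-to-late April and a tail into May.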


What This Means for the AGI Timeline

Brockman’s “70-80% of the way to AGI” claim is the most aggressive public positioning from OpenAI leadership since Altman’s “we are at the doorstep of AGI” remarks earlier this year. Whether that’s genuine assessment or pre-launch positioning is hard to separate.

What’s clear is that Spud represents a different category of release than GPT-4.5 or the o-series. Those were iterative improvements on existing foundations. Spud is a fresh foundation — and that’s the kind of step change that could reshape the competitive landscape, at least until the next model from Google or Anthropic ships 6-10 weeks later.

For anyone tracking the path to AGI, Spud is the next data point. Not because of what OpenAI says about it, but because of what its actual benchmarks reveal when it ships.


SOURCES

  • OfficeChai — OpenAI’s New ‘Spud’ Model Is A Fresh Pretrain, Outcome Of 2 Years Of Research
  • Abhishek Gautam — OpenAI Spud: GPT-5.5 Pretraining Done, April Release Likely
  • Polymarket — Prediction market odds for Spud release by April 30, 2026
  • Greg Brockman interview — YouTube, April 2026