Answer-First Lead
Microsoft dropped seven in-house AI models at Build 2026, headlined by MAI-Thinking-1 — a 35B-active/~1T-total MoE reasoning model that matches Claude Opus 4.6 on SWE-Bench Pro and scores 97% on AIME 2025 — and MAI-Code-1-Flash, which beats Claude Haiku 4.5 by 16 points on SWE-Bench Pro (51.2% vs 35.2%) while using up to 60% fewer tokens. Both were trained from scratch without distillation on clean, commercially licensed data. The subtext isn’t subtle.
🔍 THE BOTTOM LINE
Microsoft is building its own models to power its own products. The OpenAI dependency that has defined Microsoft’s AI strategy is being systematically replaced. When your coding model beats Anthropic’s and your reasoning model matches it, you don’t need to keep paying for someone else’s.
The Seven Models
Microsoft’s MAI family now spans the full stack:
| Model | What It Does | Status |
|---|---|---|
| MAI-Thinking-1 | Reasoning (35B-active/~1T MoE) | Private preview, Foundry |
| MAI-Code-1-Flash | Coding, tuned for GitHub Copilot | Available in Copilot & VS Code |
| MAI-Image-2.5 | Image generation (top 3 on Arena) | Available in Foundry |
| MAI-Image-2.5 Flash | Faster image generation | Available in Foundry |
| MAI-Transcribe-1.5 | Speech-to-text, 43 languages | Announced |
| MAI-Voice-2 | Speech generation | Announced |
| MAI-Voice-2 Flash | Faster speech generation | Announced |
The two headliners are the reasoning and coding models. Everything else is supporting infrastructure.
MAI-Thinking-1: The Reasoning Model
This is Microsoft’s first reasoning model, and the benchmarks are serious:
- Matches Claude Opus 4.6 on SWE-Bench Pro — not a small claim
- 97.0% on AIME 2025 and 94.5% on AIME 2026 — elite mathematical reasoning
- Preferred over Claude Sonnet 4.6 in blind human evaluations — Microsoft’s claim
- 128K context window
- 35B active parameters / ~1T total (sparse MoE) — efficient for its performance class
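To put the "35B active / ~1T total" figure in perspective, here is a back-of-the-envelope sketch using only the two numbers quoted above. The helper name is made up for illustration, and treating per-token compute as proportional to active parameters is a rough rule of thumb, not something Microsoft has published: it ignores routing overhead, and the full ~1T parameters still have to sit in memory.

```python
# Figures quoted in the article; "~1T" is approximate, so the results are too.
ACTIVE_PARAMS = 35e9   # parameters used for any single token
TOTAL_PARAMS = 1e12    # parameters stored across all experts


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that run for each token."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active per token: {fraction:.1%}")  # Active per token: 3.5%

# Rough assumption: per-token compute scales with active parameters,
# so a dense model of the same total size would do about this much more work.
dense_ratio = TOTAL_PARAMS / ACTIVE_PARAMS
print(f"Dense-equivalent compute ratio: ~{dense_ratio:.1f}x")
```

In other words, only about one parameter in 29 does any work on a given token, which is where the "efficient for its performance class" claim comes from.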
The key differentiator Microsoft is pushing: the model was trained from scratch, with no distillation from third-party models, on clean, commercially licensed data with no AI-generated content in pre-training. In an era where model provenance and copyright are increasingly contentious, that’s a real selling point.
What is distillation? Training a smaller AI model using outputs from a larger model (like using GPT-4’s answers to train a student model). It’s fast but creates models that inherit the teacher’s limitations and design choices. Microsoft specifically avoided this for MAI-Thinking-1.
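For readers who want to see what distillation looks like mechanically, here is a minimal sketch of one common variant: the student is penalized for disagreeing with the teacher's softened output distribution. This is a generic textbook formulation, not anything Microsoft or any other lab has described using. The logits are invented numbers, and with API-only access to a teacher you would more likely fine-tune on its generated text instead.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores for three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 2.0, 1.0]

print(distillation_loss(teacher, student))  # positive: the student disagrees
print(distillation_loss(teacher, teacher))  # 0.0: a perfect imitation
```

Minimizing this loss pulls the student toward the teacher's answers, which is exactly why a distilled model tends to inherit the teacher's habits and blind spots.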
MAI-Code-1-Flash: The Coding Model
The coding benchmarks are where this gets genuinely competitive:
| Benchmark | MAI-Code-1-Flash | Claude Haiku 4.5 | Delta |
|---|---|---|---|
| SWE-Bench Pro | 51.2% | 35.2% | +16 pts |
| SWE-Bench Verified | Higher pass rate | Lower | ✅ |
| SWE-Bench Multilingual | Higher pass rate | Lower | ✅ |
| Terminal Bench 2 | Higher pass rate | Lower | ✅ |
MAI-Code-1-Flash gets there with up to 60% fewer tokens on SWE-Bench Verified. That’s not just better; it’s cheaper and faster.
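The arithmetic behind the headline numbers is worth spelling out, since "16 points" and "60% fewer tokens" measure different things. The sketch below uses only the figures quoted above. The function names are illustrative, and the cost line assumes both models charge the same price per token, which is a simplification: real pricing differs by model and provider.

```python
# Figures quoted in the article.
MAI_SCORE = 51.2        # MAI-Code-1-Flash on SWE-Bench Pro (%)
HAIKU_SCORE = 35.2      # Claude Haiku 4.5 on SWE-Bench Pro (%)
TOKEN_REDUCTION = 0.60  # "up to 60% fewer tokens": a best case, not an average


def point_delta(a: float, b: float) -> float:
    """Absolute gap in percentage points."""
    return round(a - b, 1)


def relative_gain(a: float, b: float) -> float:
    """Gap expressed relative to the baseline score."""
    return (a - b) / b


def relative_cost(token_reduction: float) -> float:
    """Token spend versus the baseline, assuming equal per-token prices."""
    return 1.0 - token_reduction


print(point_delta(MAI_SCORE, HAIKU_SCORE))             # 16.0
print(f"{relative_gain(MAI_SCORE, HAIKU_SCORE):.1%}")  # 45.5%
print(f"{relative_cost(TOKEN_REDUCTION):.0%}")         # 40%
```

So a 16-point gap is roughly a 45% relative improvement over Haiku's score, and in the best case the token bill drops to about 40% of the baseline.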
The model was trained directly with GitHub Copilot production harnesses, evaluated on real Copilot usage patterns, and tuned for the workflows developers actually use. This is Microsoft leveraging its ownership of GitHub as a training data advantage.
The Hill-Climbing Machine
Microsoft isn’t just releasing models — it’s describing a system. The “Hill-Climbing Machine” is their term for a co-designed pipeline where every component of model development is “climbable” — meaning capabilities improve continually and reliably over time.
Three pillars:
- Capabilities learned, not inherited. No distillation. Models learn tasks from scratch, making them more steerable and adaptable.
- Clean data. No AI-generated content in pre-training. Enterprise-grade, commercially licensed data throughout.
- Self-sufficiency across the stack. From custom accelerators through to RL frameworks, Microsoft controls the full pipeline.
This is a long-term independence play. Every pillar is designed to reduce Microsoft’s dependence on external model providers.
Why This Matters
Microsoft has spent years as OpenAI’s cloud partner and investor. That relationship gave Microsoft exclusive access to GPT models for Copilot, Azure, and its productivity suite. But the terms of that deal have been increasingly strained — and with OpenAI now competing directly with Microsoft through ChatGPT Enterprise, the strategic calculation has changed.
MAI-Code-1-Flash is already in GitHub Copilot and VS Code. MAI-Thinking-1 is in private preview on Foundry. The models will also be available through Fireworks AI, Baseten, and OpenRouter.
When Microsoft replaces OpenAI’s models in Copilot with its own, the relationship shifts from “strategic partner” to “customer who happens to also be a competitor.” These models suggest that transition is underway.
The Competitive Landscape
Microsoft’s MAI launch comes at a crowded moment:
- Anthropic just filed for an IPO and is pushing Claude deeper into knowledge work
- OpenAI just expanded Codex into a full enterprise platform with Sites, Annotations, and plugins
- Nvidia launched Nemotron 3 Ultra at GTC Taipei as the top US open-weight model
- Google continues to push Gemini across its product suite
Microsoft’s angle is self-sufficiency. Not the best model in every category (yet), but good enough across the board and owned end-to-end. For enterprises that want one vendor for their entire AI stack — from hardware to models to deployment — that’s increasingly Microsoft.
❓ Frequently Asked Questions
Q: Is MAI-Thinking-1 better than Claude Opus 4.6? Microsoft says it matches Opus 4.6 on SWE-Bench Pro and is preferred over Sonnet 4.6 in blind evaluations. “Matches” and “preferred” are different from “better.” Independent benchmarks will tell the full story. The training-from-scratch and clean data claims are notable regardless.
Q: What does this mean for OpenAI? Microsoft has been OpenAI’s largest investor and cloud partner. These MAI models are designed to replace OpenAI’s models inside Microsoft products like Copilot. That doesn’t mean the partnership ends — but it does mean Microsoft has a credible alternative and is building towards self-sufficiency.
Q: Are the models open source? No. MAI-Thinking-1 is in private preview on Microsoft Foundry. MAI-Code-1-Flash is available through GitHub Copilot and VS Code. Third-party access is through Fireworks AI, Baseten, and OpenRouter — all commercial arrangements.
🔍 THE BOTTOM LINE
Microsoft built seven models that cover its entire product line — and two of them are competitive with the best from Anthropic. The message isn’t subtle: Microsoft is building its own AI stack, top to bottom, and it no longer needs to rent someone else’s brain to power its products.