Apache 2.0. 218B Parameters. Two GPUs. Done.
Cohere, the Canadian AI company founded by ex-Google Brain researchers, has open-sourced Command A+ under the Apache 2.0 license. The model is a 218B-parameter mixture-of-experts architecture with 25B active parameters, lossless quantization, and native citation support — and it runs on as few as two H100 GPUs.
This is the most capable fully open-weight model ever released. Not “open-ish” with commercial restrictions. Not “weights available but you can’t use this in production.” Full Apache 2.0 — use it, modify it, build products on it, no permission required.
Why This Matters
The open-source AI debate has been stuck in a loop: Meta’s Llama models are “open” but carry usage restrictions. Mistral’s open releases are solid but cap out below frontier capability. Google has released Gemma weights but with size limits that keep them in the “small model” category.
Command A+ breaks the pattern. It’s a genuinely large, genuinely capable model — and it’s genuinely open. The implications:
- Startups can build production AI products without paying per-token API fees
- Researchers get full access to study a frontier-scale architecture
- Governments can run sovereign AI without depending on US cloud providers
- Enterprises can fine-tune on proprietary data without sending it to anyone’s API
The Technical Details
| Feature | Command A+ |
|---|---|
| Total parameters | 218B |
| Active parameters | 25B (MoE) |
| License | Apache 2.0 |
| Min GPUs (inference) | 2× H100 |
| Quantization | Lossless |
| Citations | Native (built-in RAG) |
| Context window | 256K tokens |
What is mixture-of-experts (MoE)? MoE is a model architecture that activates only a subset of its parameters for each input. Instead of running all 218B parameters on every query, Command A+ routes each token through 25B “active” experts, keeping inference fast and cheap while retaining the knowledge capacity of a much larger model. It’s like having a large company where only the relevant department works on each project.
The native citation feature is particularly notable. Command A+ doesn’t just generate text — it attributes claims to source documents. This makes it immediately viable for enterprise use cases where hallucination liability matters: legal research, medical information, financial analysis.
The Cohere Strategy
Cohere has been the quiet fourth player in the frontier model race — behind OpenAI, Anthropic, and Google. While the Big Three chased consumer chatbots and billion-user platforms, Cohere focused on enterprise AI with a particular emphasis on RAG (retrieval-augmented generation).
Open-sourcing Command A+ is a strategic masterstroke. It:
- Undercuts Meta’s Llama on openness (Apache 2.0 vs. Llama’s custom licence)
- Undercuts OpenAI/Anthropic on cost (run it yourself vs. pay per token)
- Builds a developer ecosystem around Cohere’s architecture
- Forces competitors to justify why their closed models are worth paying for
The business model shift is clear: Cohere makes money on enterprise deployment, custom fine-tuning, and managed infrastructure — not on per-token API fees. Open-sourcing the weights doesn’t cannibalise revenue; it creates customers.
The GPU Economics
Two H100 GPUs for inference is remarkably cheap for a model this capable. At current cloud GPU prices, that’s roughly $4-6/hour on-demand — cheaper than most frontier API pricing for comparable query volumes.
For organisations that already own GPU infrastructure (and many do, for training or existing workloads), Command A+ could run at near-zero marginal cost. That’s a fundamentally different economic proposition than “pay OpenAI $15 per million output tokens.”
What This Means for NZ
NZ organisations running AI workloads on overseas APIs should take note. Command A+ makes it economically feasible to run a frontier-capable model on local infrastructure — either in a NZ data centre or on-premises. For government agencies handling sensitive data, or healthcare organisations bound by privacy regulations, this is a genuine alternative to sending data through US-based APIs. The native citation feature also addresses the auditability requirements that public-sector AI deployments typically need.
❓ Frequently Asked Questions
Q: Is Command A+ really as capable as GPT-4 or Claude? On most enterprise benchmarks, it’s competitive — particularly for RAG-heavy tasks where its native citations give it an advantage. For raw reasoning, the closed-source frontier models still hold an edge, but the gap is narrowing.
Q: What does Apache 2.0 actually allow? Commercial use, modification, distribution, patent use — essentially everything. You can build and sell products using Command A+ without paying Cohere anything or asking permission.
Q: Can I run this on consumer hardware? Not full precision — you need at least two H100 GPUs. But Cohere’s lossless quantization means you can compress the model significantly without performance degradation, potentially reducing hardware requirements further.
Q: Why would Cohere open-source their best model? Because their revenue comes from enterprise deployment services, not model access fees. More developers using Command A+ means more potential enterprise customers for Cohere’s managed offerings.
🔍 THE BOTTOM LINE
Command A+ under Apache 2.0 is the moment open-source AI stopped being a compromise. You get frontier capability, full commercial rights, and two-GPU inference. The question for OpenAI and Anthropic just got harder: what exactly are you charging for?
SOURCES
- VentureBeat — Cohere open-sources Command A+
- The Decoder — Cohere open-sources its strongest model yet
- MarkTechPost — Command A+ technical details and benchmarks