AI Model Pricing Guide

Because tokens cost money and you're not made of it | Updated July 6, 2026

CHEAPEST: Qwen 3.6 Plus ($0.10/$0.30) / Llama 4 Scout ($0.15/$0.55) / GPT-5.4 nano ($0.20/$1.25)

BEST VALUE: Claude Sonnet 5 ($2/$10 intro) / Grok 4.3 ($1.25/$2.50)

SMARTEST: Claude Fable 5 ($10/$50) / GPT-5.6 Sol ($5/$30, preview) / Claude Opus 4.8 ($5/$25)

NEW THIS WEEK: Claude Fable 5 restored (Jul 1) • Sonnet 5 (Jun 30) • GPT-5.6 Sol/Terra/Luna preview (Jun 26)

OpenAI

The OG of AI APIs. GPT kicked off the revolution and they're still leading.

BUDGET

GPT-5 mini

Fast & cheap

In: $0.25 per 1M

Out: $2.00 per 1M

128K ctx • Fast

Lightweight champion. Surprisingly capable for simple tasks and high-volume apps.

Best: Chatbots, simple QA, data extraction

BUDGET

o4-mini

Reinforcement tuned

In: $1.10 per 1M

Out: $4.40 per 1M

200K ctx • Fine-tuning

Price dropped 70%. Optimized for reinforcement fine-tuning workflows. Create custom reasoning patterns.

Best: Fine-tuning, custom reasoning

NEW

GPT-5.4 mini

Coding & subagents

In: $0.75 per 1M

Out: $4.50 per 1M

Cached input • Coding

New GPT-5.4-class model. Stronger than GPT-5 mini for coding and subagent workflows.

Best: Coding, subagents, mid-tier apps

NEW

GPT-5.4 nano

Cheapest 5.4-class

In: $0.20 per 1M

Out: $1.25 per 1M

Cached input • Budget

Cheapest way into the GPT-5.4 family. Cheaper input than GPT-5 mini.

Best: High-volume, budget apps

POWER

GPT-5.4

Still elite — now #2

In: $2.50 per 1M

Out: $15.00 per 1M

270K ctx • Reasoning

OpenAI's previous #1. Still elite for complex, multi-step problems.

Best: Hardest problems, professional work

POWER

GPT-5.2

Reasoning beast

In: $1.75 per 1M

Out: $14.00 per 1M

6.6h horizon • 200K ctx

Top 3 on METR. Excels at complex tasks, code, and multi-step reasoning.

Best: Code, analysis, agent workflows

POWER

GPT-5.2 Pro

Reasoning premium

In: $21.00 per 1M

Out: $168.00 per 1M

200K ctx • Premium

OpenAI's most precise reasoning model. For when you need the absolute best reasoning.

Best: Hardest problems, precision work

NEW

GPT-5.6 Sol

New flagship — limited preview

In: $5.00 per 1M

Out: $30.00 per 1M

Preview • Agents • Coding • Cybersecurity

OpenAI's strongest model yet. Limited preview with US government coordination. Three tiers: Sol (flagship), Terra (balanced, $2.50/$15), Luna (fast, $1/$6). Cerebras version at 750 tok/s coming July.

Best: Frontier agentic tasks, coding, biology, cybersecurity

NEW

GPT-5.6 Terra

Balanced 5.6, half the price of Sol

In: $2.50 per 1M

Out: $15.00 per 1M

Preview • Balanced

Competitive with GPT-5.5 ($5/$30) at half the cost. The everyday work model of the 5.6 family.

Best: General purpose, production apps

NEW

GPT-5.6 Luna

Fastest 5.6 — lowest cost

In: $1.00 per 1M

Out: $6.00 per 1M

Preview • Fast

Strong capability at the lowest cost in the 5.6 family. The high-volume tier of the lineup.

Best: High-volume, cost-sensitive apps

POWER

GPT-5.5

Now #2 — still elite

In: $5.00 per 1M

Out: $30.00 per 1M

1.05M ctx • Reasoning • Agents

Previous #1. 82.7% Terminal-Bench, 84.9% GDPval, 78.7% OSWorld. 1.05M context window. Still elite, now behind GPT-5.6.

Best: Coding, agents, research, multi-step tasks

#1 RANKED

GPT-5.5 Pro

Maximum intelligence

In: $30.00 per 1M

Out: $180.00 per 1M

1.05M ctx • Premium • Deep Research

90.1% BrowseComp, 52.4% FrontierMath Tier 1-3. 1.05M context window — read entire codebases and research libraries. The ceiling for what AI can do right now.

Best: Hardest problems, deep research, scientific discovery

FLAGSHIP

GPT-5.4 Pro

Previous premium — now #3

In: $30.00 per 1M

Out: $180.00 per 1M

270K ctx • Premium

Former #1, now behind GPT-5.5. Still incredibly powerful for demanding tasks.

Best: Most demanding tasks, unlimited budget

Anthropic

Safety-first company. Claude is beloved by developers for being genuinely helpful.

FAST

Claude Haiku 4.5

Speed demon

In: $1.00 per 1M

Out: $5.00 per 1M

200K ctx • Fastest

Optimized for fast responses. Perfect for real-time apps and bulk processing.

Best: Real-time chat, bulk processing

NEW

Claude Sonnet 5

Most agentic Sonnet yet — intro pricing

In: $2.00 per 1M

Out: $10.00 per 1M

200K ctx • Agentic • Intro price

Close to Opus 4.8 performance at Sonnet prices. Most agentic Sonnet ever — plans, uses tools, runs autonomously. Intro pricing $2/$10 through Aug 31, then $3/$15. Default model for Free and Pro plans.

Best: Coding, agents, tool use, autonomous tasks — the new default Sonnet
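
Since the Sonnet 5 price changes after the intro window, a budget estimate depends on the date the job runs. A minimal sketch, assuming the intro period ends in 2026 (the year comes from the guide's "Updated July 6, 2026" line; the card only says "Aug 31"). The function names and the sample workload are illustrative, not part of any Anthropic SDK.

```python
from datetime import date

# Last day of intro pricing. The year is an assumption taken from the
# guide's update date; the card itself only says "through Aug 31".
INTRO_END = date(2026, 8, 31)

def sonnet5_prices(on: date) -> tuple[float, float]:
    """Return (input, output) USD per 1M tokens for a given day."""
    if on <= INTRO_END:
        return (2.00, 10.00)   # intro pricing
    return (3.00, 15.00)       # regular pricing

def job_cost(in_tokens: int, out_tokens: int, on: date) -> float:
    """Cost in USD of one job run on the given day."""
    p_in, p_out = sonnet5_prices(on)
    return (in_tokens * p_in + out_tokens * p_out) / 1_000_000

# Hypothetical workload: 1M input tokens, 200K output tokens.
print(job_cost(1_000_000, 200_000, date(2026, 8, 31)))  # 4.0
print(job_cost(1_000_000, 200_000, date(2026, 9, 1)))   # 6.0
```

Same workload, 50% more expensive the day after the intro window closes.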

BEST

Claude Sonnet 4.6

Previous Sonnet — still solid

In: $3.00 per 1M

Out: $15.00 per 1M

Balanced • 200K ctx

Previous Sonnet default. Still excellent but superseded by Sonnet 5 at lower intro pricing.

Best: Most tasks, code, writing, general use

NEW

Claude Opus 4.8

New Opus flagship

In: $5.00 per 1M

Out: $25.00 per 1M

1M ctx • 128K output • Self-verify

New Opus top tier. 1M context, 128K output, autonomous self-verification. Same $5/$25 pricing as 4.7 but newer tokenizer and stronger reasoning.

Best: Complex coding, agents, long-horizon tasks

POWER

Claude Opus 4.7

Previous Opus SOTA — still elite

In: $5.00 per 1M

Out: $25.00 per 1M

1M ctx • xhigh reasoning • Self-verify

Previous Anthropic best. 1M context, autonomous self-verification. Beat GPT-5.4 on BrowseComp. Now superseded by Opus 4.8 at the same price.

Best: Complex coding, agents, long-horizon tasks

POWER

Claude Opus 4.6

Proven workhorse

In: $5.00 per 1M

Out: $25.00 per 1M

14.5h horizon • 200K ctx • Fast mode

Still one of the best. 14+ hour autonomous tasks. Reliable, consistent, now the value play vs 4.7/4.8.

Best: Hard problems, research, complex agents

NEW

Claude Fable 5

Anthropic's most capable widely released model

In: $10.00 per 1M

Out: $50.00 per 1M

1M ctx • 128K output • Safety classifiers • Restored Jul 1

Public version of Mythos. $10/$50 — double Opus 4.8. 1M context, 128K output, autonomous self-verification. Includes safety classifiers that can refuse requests (refusals are not billed; fallback credits refund prompt-cache cost on retry). Suspended by US government Jun 12, restored globally Jul 1.

Best: Hardest reasoning, long-horizon agentic work, cybersecurity

LIMITED

Claude Mythos 5

Fable 5 without safety classifiers — limited

In: $10.00 per 1M

Out: $50.00 per 1M

1M ctx • 128K output • Glasswing only

Same capabilities as Fable 5 without the safety classifiers. Limited availability through Project Glasswing (cybersecurity initiative). Restored Jul 1 after US government suspension. Successor to Claude Mythos Preview.

Best: Cybersecurity, frontier research, if you can get access

xAI

Elon's AI. Grok has real-time X data access and absurdly low pricing.

NEW

Grok 4.20

Same price as 4.3, more features

In: $1.25 per 1M

Out: $2.50 per 1M

2M ctx • Reasoning • Multi-agent • Vision

Same pricing as Grok 4.3 with multi-agent orchestration. Cached input at $0.125/1M. 2M context window for complex agent swarms.

Best: Complex multi-agent workflows
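
To see what the $0.125/1M cached-input rate does to a bill, here is a rough sketch. It assumes cached tokens are simply billed at the cached rate instead of the normal input rate; how xAI actually counts cache hits is not described here, and the function name and workload numbers are made up for illustration.

```python
def cost_with_cache(in_tokens, cached_tokens, out_tokens,
                    in_price=1.25, cached_price=0.125, out_price=2.50):
    """Estimate USD cost when part of the input is served from cache.

    Prices are USD per 1M tokens, taken from the Grok 4.20 card.
    Assumption: cached tokens replace full-price input tokens one for one.
    """
    fresh_tokens = in_tokens - cached_tokens
    total = (fresh_tokens * in_price
             + cached_tokens * cached_price
             + out_tokens * out_price)
    return total / 1_000_000

# Hypothetical agent run: 1M input tokens, 100K output tokens.
print(cost_with_cache(1_000_000, 0, 100_000))        # 1.5 (no cache hits)
print(cost_with_cache(1_000_000, 800_000, 100_000))  # 0.6 (80% cached)
```

With long shared prompts across agents, the cache rate matters more than the headline input price.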

NEW

Grok 4.3

New recommended base model

In: $1.25 per 1M

Out: $2.50 per 1M

Best value • 2M ctx • Reasoning • Vision

xAI's recommended Grok 4 model after retiring old variants. Beats Grok 4.1 on coding, agents, and reasoning. This is the migration target for the retired models.

Best: High-volume apps, X analysis, multi-agent — the new default Grok

RETIRED

Grok 4 / 4.1 Fast

Retired May 15

In: $0.20 per 1M

Out: $0.50 per 1M

2M ctx • Retired

Retired May 15, 2026. Migrated? Good. If not, move to Grok 4.3 ($1.25/$2.50) or Grok 4.20 ($1.25/$2.50).

Best: Migrate to Grok 4.3

RETIRED

Grok Code Fast 1

Retired May 15

In: $0.20 per 1M

Out: $1.50 per 1M

256K ctx • Retired

Retired May 15, 2026. Migrate to Grok 4.3 for coding.

Best: Migrate to Grok 4.3

BUDGET

Grok 3 Mini

Older gen cheap

In: $0.30 per 1M

Out: $0.50 per 1M

131K ctx • Reasoning

Budget fallback if Grok 4's 2M context is overkill for your use case.

Best: Simple tasks, testing

POWER

Grok 4-0709

Premium tier

In: $3.00 per 1M

Out: $15.00 per 1M

256K ctx • Reasoning • Vision

Premium Grok. Smaller context but more reasoning power.

Best: Grok style with more smarts

Google DeepMind

Gemini has quietly become excellent. Massive context, strong multimodal, and a generous free tier.

NEW

Gemini 3.1 Flash-Lite

Cheapest Gemini 3

In: $0.25 per 1M

Out: $1.50 per 1M

Preview • Budget

Cheapest way into Gemini 3.1. Preview tier with budget-friendly pricing.

Best: Budget Gemini 3 apps, prototyping

NEW

Gemini 3 Flash

New budget

In: $0.50 per 1M

Out: $3.00 per 1M

Preview • Fast

Gemini 3 Flash preview. Balanced performance at budget pricing.

Best: Budget apps, prototyping

VALUE

Gemini 2.5 Flash

Best value

In: $0.30 per 1M

Out: $2.50 per 1M

1M ctx • Multimodal • Free tier

One of the cheapest ways to process 1M context; only Gemini 2.5 Flash-Lite and Qwen 3.6 Plus list lower prices in this guide. Free tier available. Multimodal: images, video, audio.

Best: High-volume, multimodal, prototypes

NEW

Gemini 2.5 Flash-Lite

Ultra-cheap Flash

In: $0.10 per 1M

Out: $0.40 per 1M

1M ctx • Budget

Flash-Lite tier for Gemini 2.5. Cheaper than standard Flash with 1M context support. Best for high-volume simple tasks.

Best: High-volume, simple tasks, cost-sensitive apps

Gemini 3 Pro

3rd gen flagship

In: $2.00 per 1M

Out: $12.00 per 1M

1M ctx

Third-generation Gemini Pro. Now stable, with no Preview tag. Same pricing as the preview tier. Strong general-purpose flagship.

Best: General production apps, stable Pro performance

FLAGSHIP

Gemini 3.1 Pro Preview

New flagship

In: $2.00 per 1M

Out: $12.00 per 1M

4h horizon • Preview • Video

77.1% ARC-AGI-2. Price increased from $1.25/$10. Batch and Flex tiers at 50% off.

Best: Video analysis, complex reasoning

LONG

Gemini 2.5 Pro

Long outputs

In: $1.25 per 1M

Out: $10.00 per 1M

1M ctx • 64K output

Cheaper than 3.1 Pro ($1.25/$10 vs $2/$12) and offers 64K max output vs 16K. Choose for long-form content generation.

Best: Long-form writing, large outputs

DEPRECATED

Gemini 2.0 Flash

Shutdown date: Jun 1

In: $0.15 per 1M

Out: $0.60 per 1M

1M ctx • 8K output • Shutdown Jun 1

Deprecated, with shutdown scheduled for June 1, 2026, a date already past as of this guide's July 6 update. Migrate to Gemini 2.5 Flash or 3.1 Flash-Lite.

Best: Migrate away from this model

Open Source & Local

Open-weight models you can run yourself or call via cheap APIs. The frontier is no longer closed.

NEW

Kimi K2.6

88% cheaper than Opus

In: $0.60 per 1M

Out: $2.50 per 1M

256K ctx • Open weight • MoE 1T/32B

Beats GPT-5.4 and Opus 4.6 on SWE-Bench Pro. 1T params, 32B active. 300 sub-agent orchestration. OpenAI-compatible API.

Best: Coding, agents, long-horizon tasks
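
The "88% cheaper than Opus" headline can be checked against the card prices. A quick sketch, assuming the comparison is with the $5/$25 Opus pricing listed in the Anthropic section; the helper name is illustrative.

```python
def percent_cheaper(price: float, baseline: float) -> float:
    """How much cheaper `price` is than `baseline`, in percent."""
    return (1 - price / baseline) * 100

# Kimi K2.6 ($0.60/$2.50) vs Opus ($5/$25), prices per 1M tokens.
print(round(percent_cheaper(0.60, 5.00)))   # 88 (input)
print(round(percent_cheaper(2.50, 25.00)))  # 90 (output)
```

So the 88% figure matches the input price; on output the gap is slightly larger.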

CHEAPEST

Qwen 3.6 Plus

1M context, free tier

In: $0.10 per 1M

Out: $0.30 per 1M

1M ctx • Reasoning • Free tier

Alibaba's latest. Mandatory chain-of-thought reasoning. Free tier available. Topped 6 coding benchmarks on release.

Best: Budget coding, massive context

NEW

Llama 4 Scout

10M context MoE

In: $0.15 per 1M

Out: $0.55 per 1M

10M ctx • Open weight • MoE 109B

Longest context of any open model. 109B total, 17B active. Multimodal. Runs on 24GB VRAM.

Best: Massive context, multimodal, local

Llama 4 Maverick

Frontier coding MoE

In: $0.20 per 1M

Out: $0.80 per 1M

1M ctx • Open weight • MoE 400B

Beats GPT-4o on coding. 400B total, 17B active. 128 experts. Frontier quality at MoE prices.

Best: Coding, complex reasoning

DeepSeek V3.2

Matches GPT-4o

In: $0.27 per 1M

Out: $1.10 per 1M

128K ctx • Open weight • MoE 685B

94.2% MMLU matching GPT-4o. 685B MoE with 37B active. Best open model for general knowledge.

Best: General knowledge, research

FREE

Qwen3-Coder 8B

Local coding king

In: $0.00 per 1M (free, local)

Out: $0.00 per 1M (free, local)

32K ctx • Local only • 8B dense

Runs on any 8GB GPU. 92 programming languages. 80-150 tok/s. Best local coding model under 10B.

Best: Local coding, autocomplete

FREE

DeepSeek R1 Distill 14B

Local reasoning

In: $0.00 per 1M (free, local)

Out: $0.00 per 1M (free, local)

Local only • Reasoning • 10GB VRAM

Chain-of-thought reasoning on 10GB VRAM. The sweet spot for local reasoning. 55 tok/s on modern GPUs.

Best: Local reasoning, budget hardware

💡 Did You Know?

1M tokens = 750K words

That's roughly 1,500 pages. Process it for $0.05 with GPT-5 nano, which undercuts every paid input price listed in this guide.
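
The conversion is simple enough to script. A minimal sketch: the 0.75 words-per-token ratio comes straight from the line above, the 500 words-per-page figure is what 750K words over 1,500 pages implies, and both are rules of thumb rather than exact values. Function names are illustrative.

```python
WORDS_PER_TOKEN = 0.75  # from "1M tokens = 750K words"
WORDS_PER_PAGE = 500    # implied by 750K words being about 1,500 pages

def tokens_to_pages(tokens: int) -> float:
    """Rough page count for a given number of tokens."""
    return tokens * WORDS_PER_TOKEN / WORDS_PER_PAGE

def input_cost(tokens: int, price_per_million: float) -> float:
    """USD cost of sending `tokens` input tokens at a per-1M price."""
    return tokens / 1_000_000 * price_per_million

print(tokens_to_pages(1_000_000))   # 1500.0
print(input_cost(1_000_000, 0.05))  # 0.05
```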

Fable 5's wild month

Claude Fable 5 launched Jun 9, got suspended by the US government Jun 12, and was restored globally Jul 1. Same model, same $10/$50 price. Government export controls are now part of the model release cycle.

Sonnet 5 intro pricing

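Claude Sonnet 5 launched at $2/$10, cheaper than Sonnet 4.6 ($3/$15) despite being significantly better. Intro price ends Aug 31, then $3/$15. Note: the new tokenizer produces ~30% more tokens for the same text, so much of the intro discount cancels out: $2/$10 works out to roughly $2.60/$13 for text that Sonnet 4.6 would count as 1M tokens.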
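
Here is the tokenizer arithmetic spelled out. It is a sketch that treats the "~30% more tokens" figure as exact, which it is not; real inflation varies with the text. The function name is illustrative.

```python
INFLATION = 0.30  # "~30% more tokens", an approximation from the note above

def effective_price(list_price: float, token_inflation: float) -> float:
    """Price for the amount of text that used to fit in 1M old-tokenizer tokens."""
    return list_price * (1 + token_inflation)

intro_in = effective_price(2.00, INFLATION)    # Sonnet 5 intro input price
regular_in = effective_price(3.00, INFLATION)  # Sonnet 5 input price after Aug 31

print(f"intro:        ${intro_in:.2f} vs Sonnet 4.6 at $3.00")    # $2.60
print(f"after Aug 31: ${regular_in:.2f} vs Sonnet 4.6 at $3.00")  # $3.90
```

Under that assumption the intro price is still a little cheaper than Sonnet 4.6 for the same text, and the regular price is about 30% more expensive.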

GPT-5.6 government gating

GPT-5.6 Sol ships in limited preview with US government coordination. Trusted partners first, broader release 'in coming weeks'. Government as gatekeeper for frontier AI.

Grok retirement done

Grok 4 / 4.1 Fast / Code Fast 1 retired May 15. Grok 4.3 ($1.25/$2.50) is the replacement: better performance and 2M context, but pricier than the retired Fast tier ($0.20/$0.50). It costs the same as Grok 4.20.
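
If you were on the retired Fast tier, it is worth re-running your budget with the card prices listed in the xAI section ($0.20/$0.50 for Grok 4 / 4.1 Fast, $1.25/$2.50 for Grok 4.3). A rough before/after sketch; the monthly token volumes are hypothetical and the function name is illustrative.

```python
def monthly_cost(in_tokens, out_tokens, in_price, out_price):
    """USD cost for a month of traffic; prices are per 1M tokens."""
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000

# Hypothetical workload: 100M input tokens, 20M output tokens per month.
IN_TOKENS, OUT_TOKENS = 100_000_000, 20_000_000

old = monthly_cost(IN_TOKENS, OUT_TOKENS, 0.20, 0.50)  # retired Fast tier
new = monthly_cost(IN_TOKENS, OUT_TOKENS, 1.25, 2.50)  # Grok 4.3

print(f"retired Fast tier: ${old:.2f}")  # $30.00
print(f"Grok 4.3:          ${new:.2f}")  # $175.00
```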