Select up to 4 models. Compare pricing, context, benchmarks, capabilities, and deployment options. Use the cost estimator to find the best value for your use case.
Comparing 3 models
Last updated: March 9, 2026
💰 Pricing
| Metric | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| ⚡ Serverless Input (per 1M tokens) | $3.00 | $1.25 | $0.27 👑 |
| ⚡ Serverless Output (per 1M tokens) | $15.00 | $10.00 | $1.10 👑 |
| Standard Input (direct API, per 1M tokens) | $3.00 | $1.25 | $0.27 👑 |
| Standard Output (direct API, per 1M tokens) | $15.00 | $10.00 | $1.10 👑 |
| Free Tier | — | — | ✓ (limited) |
| Prompt Caching (write / read per 1M tokens) | $3.75 / $0.30 | $3.13 / $0.31 | — |
| Batch Discount | −50% | −50% | — |
| Providers (where available) | 11 providers | 8 providers | 6 providers |
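To make the prompt caching row concrete, here is a minimal sketch of the break-even arithmetic. It assumes a billing model that is not stated on this page: the first request pays the cache write price for the shared prefix, and every later request pays the cache read price. The keys `model_1` and `model_2` are placeholder labels that follow column order, since the models are not named here, and the function names are illustrative.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing and prompt caching rows.
# "model_1" / "model_2" are placeholder labels (first and second column).
PRICES = {
    "model_1": {"input": Decimal("3.00"), "cache_write": Decimal("3.75"), "cache_read": Decimal("0.30")},
    "model_2": {"input": Decimal("1.25"), "cache_write": Decimal("3.13"), "cache_read": Decimal("0.31")},
}


def prefix_cost(price, prefix_tokens, requests, cached):
    """Cost in USD of sending the same prefix `requests` times."""
    millions = Decimal(prefix_tokens) / Decimal(1_000_000)
    if not cached or requests == 0:
        return price["input"] * millions * requests
    # Assumption: one cache write, then a cache read on every later request.
    return (price["cache_write"] + price["cache_read"] * (requests - 1)) * millions


def break_even_requests(price, max_requests=1000):
    """Smallest request count at which caching is strictly cheaper, or None."""
    for n in range(1, max_requests + 1):
        with_cache = prefix_cost(price, 1_000_000, n, cached=True)
        without_cache = prefix_cost(price, 1_000_000, n, cached=False)
        if with_cache < without_cache:
            return n
    return None


if __name__ == "__main__":
    for name, price in PRICES.items():
        print(name, break_even_requests(price))
```

Under these assumptions, caching pays off from the second request for the first model and from the fourth for the second, because the second model's write price is a larger multiple of its input price. `Decimal` is used so that the tie at three requests for the second model (3.75 either way) is not blurred by floating-point rounding.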
📏 Context & Performance
| Metric | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Context Window | 200K | 1M 👑 | 64K |
| Max Output Tokens | 16,384 👑 | 8,192 | 8,192 |
| Throughput, avg (tokens / second) | 85 t/s | 110 t/s | 120 t/s 👑 |
| TTFT P50 (time to first token) | 430 ms | 510 ms | 280 ms 👑 |
| Uptime (30d) | 99.98% | 99.95% | 99.90% |
| Training Cutoff | Jan 2026 👑 | Jan 2026 👑 | Oct 2024 |
| Parameter Count | ~200B est. | Undisclosed | 671B (MoE) |
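Throughput and TTFT can be combined into a rough end-to-end latency estimate. The sketch below assumes that total response time is approximately the time to first token plus output tokens divided by average throughput; that formula is an approximation introduced here, not something the page states, and real latency varies with load and prompt length. The class name and the `model_N` keys are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyProfile:
    ttft_ms: float            # TTFT P50, in milliseconds
    tokens_per_second: float  # average throughput

    def response_seconds(self, output_tokens: int) -> float:
        """Approximate seconds to stream `output_tokens` tokens."""
        return self.ttft_ms / 1000 + output_tokens / self.tokens_per_second


# Values copied from the throughput and TTFT rows; keys follow column order.
PROFILES = {
    "model_1": LatencyProfile(ttft_ms=430, tokens_per_second=85),
    "model_2": LatencyProfile(ttft_ms=510, tokens_per_second=110),
    "model_3": LatencyProfile(ttft_ms=280, tokens_per_second=120),
}

if __name__ == "__main__":
    for name, profile in PROFILES.items():
        print(f"{name}: {profile.response_seconds(1000):.2f} s for 1,000 output tokens")
```

For a 1,000-token response this gives roughly 12.2 s, 9.6 s, and 8.6 s respectively, so throughput dominates TTFT for long outputs, while TTFT matters more for short, interactive replies.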
🖼 Input / Output Modalities
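| Modality | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Text Input | ✓ | ✓ | ✓ |
| Image Input (vision) | ✓ | ✓ (video too) | — |
| Audio Input | — | ✓ | — |
| File / PDF Input | ✓ | ✓ | — |
| Text Output | ✓ | ✓ | ✓ |
| Code Output | ✓ | ✓ | ✓ |
| Embedding Output | — | ✓ | — |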
⚙️ Capabilities & Parameters
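| Capability | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Tools / Function Calling | ✓ | ✓ | ✓ |
| JSON Mode | ✓ | ✓ | ✓ |
| Structured Outputs | ✓ | ✓ | — |
| Streaming | ✓ | ✓ | ✓ |
| System Prompt | ✓ | ✓ | ✓ |
| Reasoning Tokens (extended thinking) | Partial | ✓ 👑 | Partial |
| FIM Completion (code infill) | — | — | ✓ |
| Distillable | — | ✓ | — |
| Fine-Tuning | — | — | — |
| Zero Data Retention | ✓ (ZDR) | — | — |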
📊 Benchmark Scores
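| Benchmark | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| MMLU (general knowledge) | 88.7 | 89.2 👑 | 86.7 |
| HumanEval (code generation) | 92.4 👑 | 88.1 | 87.6 |
| GSM8K (math reasoning) | 96.2 👑 | 95.8 | 93.5 |
| GPQA Diamond (expert reasoning) | 68.3 | 72.4 👑 | 62.1 |
| MT-Bench (multi-turn quality) | 9.1 👑 | 8.8 | 8.7 |
| MATH (competition math) | 81.5 👑 | 80.0 | 79.2 |
| Benchmark Wins | 4 / 6 👑 | 2 / 6 | 0 / 6 |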
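The Benchmark Wins row can be reproduced from the six scores. The following is a small sketch of that tally, assuming a win means the highest score on a benchmark and that a tie would credit every tied model (no ties occur in this data). The tuple order follows the column order, and the function name is illustrative.

```python
# Scores copied from the benchmark rows; tuple order follows column order.
SCORES = {
    "MMLU": (88.7, 89.2, 86.7),
    "HumanEval": (92.4, 88.1, 87.6),
    "GSM8K": (96.2, 95.8, 93.5),
    "GPQA Diamond": (68.3, 72.4, 62.1),
    "MT-Bench": (9.1, 8.8, 8.7),
    "MATH": (81.5, 80.0, 79.2),
}


def count_wins(scores):
    """Return the number of benchmarks each column wins (higher is better)."""
    n_models = len(next(iter(scores.values())))
    wins = [0] * n_models
    for row in scores.values():
        best = max(row)
        for i, value in enumerate(row):
            if value == best:  # a tie would credit every tied model
                wins[i] += 1
    return wins


if __name__ == "__main__":
    print(count_wins(SCORES))  # [4, 2, 0]
```

A raw win count ignores margins: several of these benchmarks are separated by less than one point, so the tally is best read alongside the individual scores.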
🚀 Deployment & Compliance
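| Feature | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| Open Weights | — | — | ✓ |
| Serverless Deploy | ✓ | ✓ | ✓ |
| Private Deploy (your infra) | ✓ (via key routing) | ✓ | ✓ (self-host) |
| Air-Gap Ready (Metaprise) | Via substitute | Via substitute | ✓ (native) |
| Zero Data Retention | ✓ | — | — |
| HIPAA Ready | ✓ | ✓ | — |
| SOC 2 Type II | ✓ | ✓ | — |
| GDPR Compliant | ✓ | ✓ | — |
| AuditChain Logging (★ Metaprise) | ✓ | ✓ | ✓ |
| License | Commercial API | Commercial API | MIT (Open) |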
💡 Cost Estimator
Enter your expected usage to compare real costs across all selected models.
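The page does not show how the estimator computes its figures, so the following is only a sketch of one plausible calculation: token counts multiplied by the per-1M prices listed under Pricing, with the batch discount optionally applied to both input and output. The discount scope, the function names, and the `model_N` keys (placeholders following column order) are all assumptions.

```python
# USD per 1M tokens and batch discount, copied from the pricing rows.
# Keys are placeholder labels; the models are not named on this page.
PRICING = {
    "model_1": {"input": 3.00, "output": 15.00, "batch_discount": 0.50},
    "model_2": {"input": 1.25, "output": 10.00, "batch_discount": 0.50},
    "model_3": {"input": 0.27, "output": 1.10, "batch_discount": 0.0},
}


def estimate_cost(model, input_tokens, output_tokens, batch=False):
    """Estimated cost in USD for the given token volumes."""
    p = PRICING[model]
    cost = (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
    if batch:
        # Assumption: the discount applies to input and output alike.
        cost *= 1 - p["batch_discount"]
    return cost


def rank_by_cost(input_tokens, output_tokens, batch=False):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: estimate_cost(m, input_tokens, output_tokens, batch) for m in PRICING}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Example usage: 50M input tokens and 10M output tokens per month.
    for model, cost in rank_by_cost(50_000_000, 10_000_000):
        print(f"{model}: ${cost:,.2f}")
```

With 50M input and 10M output tokens, this yields about $24.50, $162.50, and $300.00 for the third, second, and first models. Price alone does not settle value, since the cheapest model here also has the smallest context window and the lowest benchmark scores.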