# 📊 AI Model Benchmarks
Compare 9 AI models side by side — pricing, context, and performance
| Model | Provider | Context window | Input ($/1M tokens) | Output ($/1M tokens) | Rank | Best for |
|---|---|---|---|---|---|---|
| Claude Opus 4 | Anthropic | 200K | $15.00 | $75.00 | #1 | Complex reasoning, coding, agentic tasks |
| GPT-4o | OpenAI | 128K | $2.50 | $10.00 | #2 | General purpose, coding, analysis |
| Claude Sonnet 4 | Anthropic | 200K | $3.00 | $15.00 | #3 | Coding, writing, long documents |
| Gemini 2.5 Pro | Google DeepMind | 1M | $1.25 | $10.00 | #4 | Long documents, multimodal, reasoning |
| Grok 3 | xAI | 128K | $3.00 | $15.00 | #5 | Real-time info, reasoning, math |
| DeepSeek V3 (OSS) | DeepSeek | 128K | $0.27 | $1.10 | #6 | Cost-efficient reasoning, code, math |
| Llama 4 Maverick (OSS) | Meta | 1M | Free | Free | #7 | Open source, self-hosted, multilingual |
| Mistral Large | Mistral AI | 128K | $2.00 | $6.00 | #8 | European privacy, multilingual, code |
| Command R+ | Cohere | 128K | $2.50 | $10.00 | #12 | Enterprise RAG, search, multilingual |
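To see how the per-million-token prices above translate into per-request cost, here is a minimal sketch. The prices are copied from the table; the model subset, token counts, and function name are illustrative assumptions, not a real billing API.

```python
# Per-million-token prices (USD), copied from the table above.
# Only a subset of models is included for illustration.
PRICES = {
    "Claude Opus 4": {"input": 15.00, "output": 75.00},
    "GPT-4o":        {"input": 2.50,  "output": 10.00},
    "DeepSeek V3":   {"input": 0.27,  "output": 1.10},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request: (tokens / 1,000,000) x price per 1M."""
    p = PRICES[model]
    return (input_tokens / 1e6) * p["input"] + (output_tokens / 1e6) * p["output"]

# Hypothetical workload: 10K input tokens, 2K output tokens per request.
print(f"{request_cost('Claude Opus 4', 10_000, 2_000):.4f}")  # 0.3000
print(f"{request_cost('DeepSeek V3', 10_000, 2_000):.4f}")    # 0.0049
```

At this workload the cheapest paid model in the table costs roughly 60x less per request than the most expensive one, which is why "Best for" matters as much as rank.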