Live Rankings
AI Model Benchmarks
Compare 113 AI models side by side — arena rankings, pricing, context windows, and capabilities. Updated weekly.
Total Models
113
tracked & ranked
Open Source
58
freely available
Avg Input Price
$3.39
per 1M tokens
Providers
36
AI companies
Arena Top 3
1
#1
arena rank
Claude Opus 4
Anthropic
Context
200K
Input Price
$5.00
Parameters
Undisclosed
Best For
Complex reasoning, coding, agentic tasks
2
#2
arena rank
GPT-o3
OpenAI
Context
200K
Input Price
$2.00
Parameters
Undisclosed
Best For
Advanced reasoning, agentic tasks, research
3
#2
arena rank
GPT-4o
OpenAI
Context
128K
Input Price
$2.50
Parameters
~200B (est.)
Best For
General purpose, coding, analysis
Largest Context Windows
Llama 4 Scout: 10M
MiniMax-01: 4M
Most Affordable (Input)
FREE
FREE
FREE
FREE
FREE
Models by Provider
OpenAI: 12
Google DeepMind: 11
Mistral AI: 10
Anthropic: 8
Meta AI: 7
Cohere: 6
xAI: 4
Microsoft: 4
All Models (113)
| Model | Provider | Context | Input / 1M | Output / 1M | Rank | Best For |
|---|---|---|---|---|---|---|
| Claude Opus 4 | Anthropic | 200K | $5.00 | $25.00 | #1 | Complex reasoning, coding, agentic tasks |
| GPT-o3 | OpenAI | 200K | $2.00 | $8.00 | #2 | Advanced reasoning, agentic tasks, research |
| GPT-4o | OpenAI | 128K | $2.50 | $10.00 | #2 | General purpose, coding, analysis |
| GPT-o1 | OpenAI | 200K | $15.00 | $60.00 | #3 | Complex reasoning, math, science, coding |
| Claude Sonnet 4 | Anthropic | 200K | $3.00 | $15.00 | #3 | Coding, writing, long documents |
| DeepSeek R1 (OSS) | DeepSeek | 128K | $0.55 | $2.19 | #3 | Complex reasoning, math, science, coding |
| GPT-4.5 | OpenAI | 128K | $75.00 | $150.00 | #4 | Creative writing, nuanced understanding, EQ tasks |
| Gemini 1.5 Pro | Google DeepMind | 1M | $3.50 | $10.50 | #4 | Long documents, multimodal analysis, coding |
| Gemini 2.5 Pro | Google DeepMind | 1M | $1.25 | $10.00 | #4 | Long documents, multimodal, reasoning |
| Grok 3 | xAI | 128K | $3.00 | $15.00 | #5 | Real-time info, reasoning, math |
| GPT-4 Turbo | OpenAI | 128K | $10.00 | $30.00 | #5 | Complex tasks, coding, analysis, vision |
| DeepSeek V3 (OSS) | DeepSeek | 128K | $0.27 | $1.10 | #5 | Coding, math, general reasoning |
| Qwen 2.5 72B (OSS) | Alibaba DAMO | 128K | Free (open) | Free (open) | #6 | Multilingual, coding, math, reasoning |
| GPT-o4 Mini | OpenAI | 200K | $1.10 | $4.40 | #6 | Affordable reasoning, coding, STEM tasks |
| Llama 4 Maverick (OSS) | Meta | 1M | Free | Free | #7 | Open source, self-hosted, multilingual |
| Claude 3 Opus | Anthropic | 200K | $15.00 | $75.00 | #7 | Complex analysis, research, nuanced writing |
| Grok-2 | xAI | 128K | $2.00 | $10.00 | #7 | Real-time information, reasoning, coding |
| Qwen 3 (OSS) | Alibaba | 128K | Free | Free | #7 | Multilingual, reasoning, agentic tasks |
| Mistral Large | Mistral AI | 256K | $0.50 | $1.50 | #8 | European privacy, multilingual, code |
| Gemini 2.0 Flash | Google DeepMind | 1M | $0.10 | $0.40 | #8 | Agentic tasks, multimodal, tool use |
| Mistral Large 2 (OSS) | Mistral AI | 128K | $2.00 | $6.00 | #8 | Multilingual, coding, complex reasoning |
| Moonshot Kimi k2 (OSS) | Moonshot AI | 131K | $0.55 | $2.20 | #8 | Coding, agentic tasks, reasoning |
| Claude 3.5 Sonnet | Anthropic | 200K | $3.00 | $15.00 | #8 | Coding, analysis, writing, vision |
| Llama 3.1 405B (OSS) | Meta AI | 128K | Free (open) | Free (open) | #9 | Complex reasoning, coding, multilingual tasks |
| Qwen 2.5 Max | Alibaba | 32K | $1.60 | $6.40 | #9 | Multilingual, Chinese/English, reasoning |
| Gemini 1.5 Flash | Google DeepMind | 1M | $0.075 | $0.30 | #10 | High-volume tasks, summarization, chat |
| Gemini 2.5 Flash | Google DeepMind | 1M | $0.30 | $2.50 | #10 | Fast reasoning, cost-efficient, multimodal |
| MiniMax-01 (OSS) | MiniMax | 4M | $0.50 | $1.10 | #11 | Ultra-long context, document analysis |
| Llama 3.2 90B Vision (OSS) | Meta AI | 128K | Free (open) | Free (open) | #11 | Image understanding, visual QA, multimodal tasks |
| Llama 4 Scout (OSS) | Meta | 10M | Free | Free | #12 | Long context, open source, multilingual |
| Claude 3.5 Haiku | Anthropic | 200K | $0.80 | $4.00 | #12 | Fast coding, data extraction, classification |
| Llama 3.3 70B (OSS) | Meta AI | 128K | Free (open) | Free (open) | #13 | Instruction following, coding, reasoning |
| Llama 3.3 (OSS) | Meta | 128K | Free | Free | #13 | General purpose, multilingual, coding |
| Llama 3.1 70B (OSS) | Meta AI | 128K | Free (open) | Free (open) | #14 | Balanced performance, fine-tuning, deployment |
| GPT-4o Mini | OpenAI | 128K | $0.15 | $0.60 | #15 | Fast responses, cost-efficient tasks, lightweight apps |
| Claude Haiku 4.5 | Anthropic | 200K | $1.00 | $5.00 | #15 | Fast responses, classification, extraction |
| Mixtral 8x22B (OSS) | Mistral AI | 64K | $0.90 | $2.70 | #16 | Efficient reasoning, multilingual, coding |
| Mistral Medium | Mistral AI | 128K | $0.40 | $2.00 | #16 | Enterprise tasks, European languages |
| Grok 3 Mini | xAI | 128K | $0.30 | $0.50 | #16 | Lightweight reasoning, fast responses, chat |
| Grok-2 Mini | xAI | 128K | $0.30 | $1.50 | #16 | Fast responses, chat, lightweight tasks |
| Reka Core | Reka AI | 128K | $2.00 | $2.00 | #17 | Multimodal reasoning, video understanding, multilingual |
| Command R+ (OSS) | Cohere | 128K | $2.50 | $10.00 | #17 | RAG, enterprise search, multilingual |
| Gemma 2 27B (OSS) | Google DeepMind | 8K | Free (open) | Free (open) | #18 | Research, fine-tuning, on-premise deployment |
| Qwen 2.5 Coder (OSS) | Alibaba | 128K | Free | Free | #18 | Code generation, code review, debugging |
| Gemma 3 (OSS) | Google DeepMind | 128K | Free | Free | #19 | Open source, on-device, research |
| Mistral Small (OSS) | Mistral AI | 32K | $0.20 | $0.60 | #19 | Fast inference, cost-effective tasks, chat |
| DBRX (OSS) | Databricks | 32K | Free (open) | Free (open) | #20 | Enterprise AI, data analysis, coding |
| Claude 3 Haiku | Anthropic | 200K | $0.25 | $1.25 | #20 | Quick tasks, chatbots, content moderation |
| Gemini 2.0 Flash Lite | Google DeepMind | 1M | $0.075 | $0.30 | #22 | High-volume, low-cost tasks |
| Llama 3.1 8B (OSS) | Meta AI | 128K | Free (open) | Free (open) | #22 | Edge deployment, mobile, fast inference |
| Command R (OSS) | Cohere | 128K | $0.15 | $0.60 | #23 | Cost-effective RAG, summarization, chat |
| Baichuan 4 | Baichuan AI | 128K | $2.00 | $2.00 | #24 | Chinese language, enterprise tasks |
| GPT-3.5 Turbo | OpenAI | 16K | $0.50 | $1.50 | #25 | Fast responses, chatbots, simple tasks |
| Gemma 2 (OSS) | Google DeepMind | 8K | Free | Free | #26 | On-device AI, research, fine-tuning |
| Mistral Nemo (OSS) | Mistral AI | 128K | $0.30 | $0.30 | #27 | Lightweight tasks, drop-in replacement |
| Phi-4 (OSS) | Microsoft | 16K | Free | Free | #28 | Small model research, edge deployment, reasoning |
| Falcon 180B (OSS) | Technology Innovation Institute | 4K | Free (open) | Free (open) | — | Research, multilingual generation, fine-tuning |
| Pika 1.5 | Pika | N/A (video) | Credits-based | Credits-based | — | Video generation, video editing, effects |
| Whisper V3 (OSS) | OpenAI | N/A (audio) | Free | Free | — | Speech-to-text, transcription, translation |
| Jamba 1.5 Large (OSS) | AI21 Labs | 256K | $2.00 | $8.00 | — | Long documents, enterprise RAG, analysis |
| Falcon 40B (OSS) | Technology Innovation Institute | 2K | Free (open) | Free (open) | — | General tasks, fine-tuning, research |
| Whisper Large v3 (OSS) | OpenAI | N/A (audio) | Free (open) | Free (open) | — | Speech recognition, transcription, translation |
| Llama 3 8B (OSS) | Meta AI | 8K | Free (open) | Free (open) | — | Edge deployment, fast inference, fine-tuning |
| Llama 3 70B (OSS) | Meta AI | 8K | Free (open) | Free (open) | — | General tasks, fine-tuning, instruction following |
| Mistral 7B (OSS) | Mistral AI | 32K | Free (open) | Free (open) | — | Efficient tasks, fine-tuning, edge deployment |
| Mixtral 8x7B (OSS) | Mistral AI | 32K | Free (open) | Free (open) | — | Efficient inference, multilingual, coding |
| GLM-4 | Zhipu AI | 128K | Undisclosed | Undisclosed | — | Chinese language tasks, reasoning, coding |
| Jamba 1.5 Mini (SSM) (OSS) | AI21 Labs | 256K | $0.20 | $0.40 | — | Efficient long-context processing, throughput |
| Stable Diffusion 3.5 Large (OSS) | Stability AI | N/A (image) | Free | Free | — | Open source image generation, customization, fine-tuning |
| FLUX.1 Pro | Black Forest Labs | N/A (image) | API-based | API-based | — | Professional image generation, design, marketing |
| Pixtral Large (OSS) | Mistral AI | 128K | $2.00 | $6.00 | — | Image understanding, visual reasoning, documents |
| Stable Diffusion 3 (OSS) | Stability AI | N/A (image) | Free (open) | Free (open) | — | Image generation, art creation, design |
| Inflection 2.5 | Inflection AI | 8K | N/A | N/A | — | Conversational AI, emotional intelligence, empathy |
| Cohere Embed v4 | Cohere | 128K | $0.12 | $0.12 | — | Semantic search, RAG embeddings, document retrieval |
| Stable Video Diffusion (OSS) | Stability AI | N/A (video) | Free (open) | Free (open) | — | Video generation, animation, visual effects |
| Gen-3 Alpha | Runway | N/A (video) | Credits-based | Credits-based | — | Professional video generation, filmmaking |
| Eleven Turbo v2.5 | ElevenLabs | N/A (audio) | Credits-based | Credits-based | — | Real-time speech synthesis, conversational AI |
| Nemotron 4 340B (OSS) | NVIDIA | 4K | Free (open) | Free (open) | — | Synthetic data generation, training pipelines |
| Phi-3 Medium (OSS) | Microsoft | 128K | Free (open) | Free (open) | — | Balanced performance, reasoning, coding |
| Arctic (OSS) | Snowflake | 4K | Free (open) | Free (open) | — | SQL generation, enterprise data tasks, coding |
| WizardLM-2 8x22B (OSS) | Microsoft | 64K | Free (open) | Free (open) | — | Complex instructions, reasoning, coding |
| Claude 2.1 | Anthropic | 200K | $8.00 | $24.00 | — | Long documents, analysis, reduced hallucinations |
| Gemini 1.0 Ultra | Google DeepMind | 32K | Subscription-based | Subscription-based | — | Complex reasoning, multimodal understanding |
| Krutrim | Krutrim | 128K | $0.10 | $0.30 | — | Hindi, Indian languages |
| Flux 1.1 Pro | Black Forest Labs | — | $0.04/img | — | — | Image generation, design |
| FLUX.1 Schnell (OSS) | Black Forest Labs | N/A (image) | Free (open) | Free (open) | — | Fast image generation, prototyping, development |
| Midjourney V6.1 | Midjourney | N/A (image) | $10/month (Basic) | $30/month (Standard) | — | Photorealistic images, artistic creation, visual design |
| Aya 23 35B (OSS) | Cohere | 8K | Free (open) | Free (open) | — | Multilingual tasks, low-resource languages |
| Aya Expanse (OSS) | Cohere | 128K | Free | Free | — | Multilingual (23 languages), research |
| Solar 10.7B (OSS) | Upstage | 4K | Free (open) | Free (open) | — | Korean-English bilingual, fine-tuning, enterprise |
| StarCoder2 15B (OSS) | Hugging Face | 16K | Free (open) | Free (open) | — | Code completion, code generation, development |
| Yi-Large | 01.AI | 32K | Undisclosed | Undisclosed | — | Complex reasoning, multilingual, analysis |
| Eleven Multilingual v2 | ElevenLabs | N/A (audio) | Credits-based | Credits-based | — | Multilingual TTS, audiobooks, dubbing |
| Zephyr 7B (OSS) | Hugging Face | 32K | Free (open) | Free (open) | — | Chat, instruction following, lightweight deployment |
| Sora | OpenAI | — | — | — | — | Video generation from text |
| Codestral | Mistral AI | 32K | $0.30 | $0.90 | — | Code generation, code completion, debugging |
| Veo 2 | Google DeepMind | — | — | — | — | Video generation, cinematic shots |
| DALL-E 3 | OpenAI | N/A (image) | $0.04/image (1024x1024) | $0.08/image (1792x1024) | — | Image generation, creative design, illustration |
| Phi-3 Mini (OSS) | Microsoft | 128K | Free (open) | Free (open) | — | Edge deployment, mobile, on-device AI |
| Qwen 2.5 Coder 32B (OSS) | Alibaba DAMO | 128K | Free (open) | Free (open) | — | Code generation, code review, debugging |
| Cohere Embed v3 | Cohere | 512 tokens | $0.10/1M tokens | N/A (embeddings) | — | Search, RAG, semantic similarity, clustering |
| Sarvam-M (OSS) | Sarvam AI | 32K | $0.20 | $0.20 | — | Indian languages, Indic NLP |
| SDXL Turbo (OSS) | Stability AI | N/A (image) | Free (open) | Free (open) | — | Real-time image generation, rapid prototyping |
| DeepSeek Coder V2 (OSS) | DeepSeek | 128K | $0.14 | $0.28 | — | Code generation, debugging, code review |
| Jamba 1.5 Mini (OSS) | AI21 Labs | 256K | $0.20 | $0.40 | — | Cost-effective long-context, summarization |
| Kimi | Moonshot AI | 2M | Undisclosed | Undisclosed | — | Ultra-long documents, research, analysis |
| Midjourney v6 | Midjourney | N/A (image) | Subscription-based | Subscription-based | — | Artistic image generation, creative design |
| Yi-1.5 34B (OSS) | 01.AI | 4K | Free (open) | Free (open) | — | Bilingual tasks, fine-tuning, research |
| Ernie 4.0 | Baidu AI | 128K | Undisclosed | Undisclosed | — | Chinese language, enterprise AI, search |
| QwQ 32B (OSS) | Alibaba DAMO | 32K | Free (open) | Free (open) | — | Reasoning, math, logical problem-solving |
| Hailuo AI | MiniMax | N/A (video) | Free tier available | Free tier available | — | Video generation, Chinese market content |
| Gen-2 | Runway | N/A (video) | Credits-based | Credits-based | — | Video generation, creative content, effects |
| Dream Machine | Luma AI | N/A (video) | Credits-based | Credits-based | — | Video generation, 3D content, visual effects |
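The Input / 1M and Output / 1M columns can be turned into a rough per-request cost. The sketch below is illustrative only: the function name `request_cost` and the example token counts are assumptions, and the prices are copied from the Claude Opus 4 and GPT-o3 rows above.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
opus_cost = request_cost(10_000, 2_000, 5.00, 25.00)  # Claude Opus 4 row
o3_cost = request_cost(10_000, 2_000, 2.00, 8.00)     # GPT-o3 row

print(f"Claude Opus 4: ${opus_cost:.3f}")
print(f"GPT-o3:        ${o3_cost:.3f}")
```

Because output tokens are priced higher than input tokens in most rows, responses with long outputs can dominate the total even when the prompt is much larger.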
Popular Comparisons
Claude Opus 4 vs GPT-o3
Claude Opus 4 vs GPT-4o
Claude Opus 4 vs GPT-o1
Claude Opus 4 vs Claude Sonnet 4
Claude Opus 4 vs DeepSeek R1
Claude Opus 4 vs GPT-4.5
Claude Opus 4 vs Gemini 1.5 Pro
GPT-o3 vs GPT-4o
GPT-o3 vs GPT-o1
GPT-o3 vs Claude Sonnet 4
GPT-o3 vs DeepSeek R1
GPT-o3 vs GPT-4.5
Frequently Asked Questions
What is the best AI model in 2026?
Based on arena rankings, Claude Opus 4 by Anthropic holds the #1 position. It excels at complex reasoning, coding, and agentic tasks, and has a 200K context window.
Which AI model is cheapest?
Llama 4 Maverick has the lowest listed input price: free per 1M tokens. There are also 11 free models available.
Which AI model has the largest context window?
Llama 4 Scout leads with a 10M context window, followed by MiniMax-01 at 4M.
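To compare context windows programmatically, labels such as "128K" and "10M" first need to be converted to numbers. The following is a minimal sketch, assuming K means 1,000 and M means 1,000,000 tokens (the page does not say whether it uses decimal or binary units); the helper name `parse_context` is hypothetical, and the sample rows are copied from the table above.

```python
import re
from typing import Optional

# Assumption: decimal multipliers; the page does not define K and M.
_UNITS = {"K": 1_000, "M": 1_000_000}


def parse_context(label: str) -> Optional[int]:
    """Convert labels like '128K' or '10M' to a token count.

    Returns None for labels that are not a plain size, such as 'N/A (video)'.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KM])", label.strip())
    if match is None:
        return None
    return int(float(match.group(1)) * _UNITS[match.group(2)])


rows = [
    ("Claude Opus 4", "200K"),
    ("MiniMax-01", "4M"),
    ("Pika 1.5", "N/A (video)"),
    ("Llama 4 Scout", "10M"),
    ("Kimi", "2M"),
]

# Keep only rows with a parseable size, then sort largest first.
sized = [(name, parse_context(ctx)) for name, ctx in rows]
ranked = sorted(
    [(name, size) for name, size in sized if size is not None],
    key=lambda item: item[1],
    reverse=True,
)

for name, size in ranked:
    print(f"{name}: {size:,} tokens")
```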
How many open-source AI models are there?
Awaira tracks 58 open-source models out of 113 total. Open-source models include DeepSeek R1, DeepSeek V3, Qwen 2.5 72B and 55 more.
What is the average AI model pricing?
The average input pricing across paid models is $3.39 per 1M tokens. Output pricing is typically 2-5x higher than input pricing.
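The averaging described above can be sketched as follows. This example uses only five rows copied from the table, so it will not reproduce the $3.39 figure, which covers all paid models. The helper `parse_price` is hypothetical, and the choice to treat any label that does not start with "$" (for example "Free") as unpaid is an assumption.

```python
from typing import Optional


def parse_price(label: str) -> Optional[float]:
    """Return the USD price for labels like '$2.50'; None for anything else."""
    label = label.strip()
    if not label.startswith("$"):
        return None  # e.g. 'Free', 'Undisclosed', 'Credits-based'
    try:
        return float(label[1:])
    except ValueError:
        return None  # e.g. '$0.04/img'


# (model, input price, output price) copied from a few table rows.
sample = [
    ("Claude Opus 4", "$5.00", "$25.00"),
    ("GPT-o3", "$2.00", "$8.00"),
    ("GPT-4o", "$2.50", "$10.00"),
    ("DeepSeek R1", "$0.55", "$2.19"),
    ("Llama 4 Maverick", "Free", "Free"),
]

paid = [
    (name, parse_price(inp), parse_price(out))
    for name, inp, out in sample
    if parse_price(inp) is not None and parse_price(out) is not None
]

avg_input = sum(inp for _, inp, _ in paid) / len(paid)
ratios = {name: out / inp for name, inp, out in paid}

print(f"Average input price (sample): ${avg_input:.2f} per 1M tokens")
for name, ratio in ratios.items():
    print(f"{name}: output is {ratio:.1f}x input")
```

In this sample every output-to-input ratio falls between 2x and 5x, consistent with the range stated above.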