Google DeepMind

Released May 14, 2024

Gemini 1.5 Flash

Key Specifications

Arena Rank: #10
Context Window: 1M tokens
Input Price: $0.075 per 1M tokens
Output Price: $0.30 per 1M tokens
Parameters: Undisclosed
Open Source: No

Best For

High-volume tasks, summarization, chat

About Gemini 1.5 Flash

Gemini 1.5 Flash is Google DeepMind's speed-optimized model. It keeps the 1 million token context window of Gemini 1.5 Pro while offering faster inference and lower cost. It was produced by distilling the capabilities of the larger Pro model into a lighter architecture. Flash is designed for high-volume production workloads where cost and latency matter most, and it retains multimodal understanding.

Pricing per 1M tokens

Input Tokens: $0.075
Output Tokens: $0.30
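
As a rough illustration of how these per-token prices translate into a per-request cost, the sketch below applies the two rates listed above. The function name `estimate_cost` and the example token counts are hypothetical, and the calculation assumes flat pricing with no tiers, caching discounts, or free quota.

```python
# Prices in USD per 1M tokens, copied from the pricing listed above.
INPUT_PRICE_PER_M = 0.075
OUTPUT_PRICE_PER_M = 0.30


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes flat per-token pricing; any tiers or discounts are ignored.
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token document summarized into 2,000 tokens.
    print(f"${estimate_cost(200_000, 2_000):.4f}")
```

Because output tokens cost four times as much as input tokens, workloads with long inputs and short outputs (such as summarization) benefit most from this pricing.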

Frequently Asked Questions

What is Gemini 1.5 Flash?
Gemini 1.5 Flash is Google DeepMind's speed-optimized model. It keeps the 1 million token context window of Gemini 1.5 Pro while offering faster inference and lower cost. It was produced by distilling the capabilities of the larger Pro model into a lighter architecture, and it is designed for high-volume production workloads that still need multimodal understanding.
How much does Gemini 1.5 Flash cost?
Gemini 1.5 Flash costs $0.075 per 1 million input tokens and $0.30 per 1 million output tokens. Billing is per token, so cost scales directly with usage.
What is Gemini 1.5 Flash's context window?
Gemini 1.5 Flash has a context window of 1M tokens. This determines how much text the model can process in a single request. A larger context window lets the model handle longer documents, keep more conversation history, and reason over bigger codebases.
Is Gemini 1.5 Flash open source?
No. Gemini 1.5 Flash is a proprietary model whose weights are not published; it is available only through Google's API. Proprietary models are typically accessed via API endpoints, with infrastructure, support, and updates managed by the provider.
What is Gemini 1.5 Flash best for?
Gemini 1.5 Flash is best suited for high-volume tasks, summarization, and chat. These use cases play to its speed and low cost relative to other models in Google DeepMind's lineup.
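
To make the context-window answer above more concrete, here is a minimal sketch of a pre-flight check for whether a document is likely to fit in a single request. It assumes roughly 4 characters per token, which is only a rule of thumb for English text and not the model's real tokenizer, and it treats "1M" as exactly 1,000,000 tokens. The names `estimate_tokens` and `fits_in_context` are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # "1M" as listed on this page (assumed exact)
CHARS_PER_TOKEN = 4  # ASSUMPTION: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of text using ceiling division."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str) -> bool:
    """Return True if the estimated token count is within the context window."""
    return estimate_tokens(text) <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    document = "word " * 100_000  # 500,000 characters of sample text
    print(estimate_tokens(document), fits_in_context(document))
```

For billing or hard limits, an exact count from the provider's own token-counting facility is preferable to this estimate.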