OpenAI · Released November 6, 2023

Whisper Large v3

Open Source · 1.5B parameters

Whisper Large v3 is OpenAI's open-source automatic speech recognition model for transcription and speech translation.

Context: N/A (audio)

Input: Free (open)

Key Specifications

Arena Rank: Not disclosed

Context Window: N/A (audio)

Input Price (per 1M tokens): Free (open)

Output Price (per 1M tokens): Free (open)

Parameters: 1.5B

Open Source: Yes

Best For

Speech recognition, transcription, translation

About Whisper Large v3

Whisper Large v3, developed by OpenAI, is an open-source automatic speech recognition model with about 1.5 billion parameters that covers roughly 100 languages. It is designed to stay accurate on noisy recordings, accented speech, and technical vocabulary. It supports two tasks: speech-to-text transcription in the spoken language, and speech translation from other languages into English. Compared with Large v2, it is reported to hallucinate less on silence, produce more accurate timestamps, and perform better on low-resource languages. The weights are free to download, and the model can be deployed locally on consumer GPUs for privacy-sensitive transcription. It has become a common default for open-source speech recognition, powering transcription services, meeting-note applications, accessibility tools, and podcast processing pipelines. Its combination of broad language support, accuracy, and zero licensing cost has made it one of the most widely deployed open-source ASR models.
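
A minimal sketch of running the two tasks locally, assuming the open-source openai-whisper Python package is installed (with ffmpeg available) and that meeting.mp3 is a hypothetical audio file. The build_options and run_whisper helpers are illustrative names, not part of the library; whisper.load_model and model.transcribe are the package's own entry points.

```python
from typing import Optional

VALID_TASKS = {"transcribe", "translate"}


def build_options(task: str = "transcribe", language: Optional[str] = None) -> dict:
    """Hypothetical helper: assemble keyword arguments for model.transcribe()."""
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}, got {task!r}")
    options = {"task": task}
    if language is not None:
        # Language code of the audio, e.g. "de"; omit it to let Whisper detect it.
        options["language"] = language
    return options


def run_whisper(audio_path: str, task: str = "transcribe",
                language: Optional[str] = None) -> str:
    """Load Whisper Large v3 and return the recognized (or English-translated) text."""
    import whisper  # imported lazily so build_options works without the package

    model = whisper.load_model("large-v3")  # downloads the weights on first use
    result = model.transcribe(audio_path, **build_options(task, language))
    return result["text"]


# Example calls (assumption: meeting.mp3 exists in the working directory):
#   run_whisper("meeting.mp3")                    # transcript in the spoken language
#   run_whisper("meeting.mp3", task="translate")  # English translation
```

Keeping the option handling separate from the model call makes it easy to validate the task name before paying the cost of loading the weights.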

Built by OpenAI

Pricing per 1M tokens

Input Tokens: Free (open)

Output Tokens: Free (open)

Frequently Asked Questions

What is Whisper Large v3?
Whisper Large v3 is OpenAI's open-source automatic speech recognition model. It has about 1.5 billion parameters, covers roughly 100 languages, and handles two tasks: transcribing speech in the spoken language and translating speech from other languages into English. Because the weights are free to download, it can run locally on consumer GPUs, which suits privacy-sensitive transcription work such as meeting notes, accessibility tools, and podcast processing.
How much does Whisper Large v3 cost?
Whisper Large v3 is free: the weights are openly available, so there is no per-token input or output price. When you self-host, the cost is the compute you run it on; hosted services that offer the model set their own rates.
What is Whisper Large v3's context window?
A token context window does not apply to Whisper Large v3, because its input is audio rather than text. The model processes audio in 30-second windows, so longer recordings are split into segments and transcribed window by window.
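
For an audio model, the practical input limit is the audio window rather than a token count. Whisper models consume 16 kHz audio in 30-second windows (a property of the Whisper architecture, not stated on this page), so longer recordings are split before decoding. A minimal sketch of that split follows; split_into_windows and window_count are hypothetical helper names, and real pipelines usually also align window boundaries to predicted timestamps, which this omits.

```python
import math

SAMPLE_RATE = 16_000                            # samples per second
WINDOW_SECONDS = 30                             # length of one model input window
WINDOW_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS   # 480,000 samples per window


def split_into_windows(samples, window=WINDOW_SAMPLES):
    """Split a sequence of audio samples into fixed-size windows.

    The final window is zero-padded so every window has the same length.
    """
    windows = []
    for start in range(0, len(samples), window):
        chunk = list(samples[start:start + window])
        chunk.extend([0.0] * (window - len(chunk)))  # pad the last chunk only
        windows.append(chunk)
    return windows


def window_count(duration_seconds: float) -> int:
    """Number of 30-second windows needed to cover a recording."""
    return math.ceil(duration_seconds / WINDOW_SECONDS)


# A 95-second clip needs four windows; the last one is mostly padding.
print(window_count(95))  # 4
```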
Is Whisper Large v3 open source?
Yes. The Whisper Large v3 weights are openly released, so you can download them, run the model on your own hardware, and fine-tune it for specific tasks. That matters for teams with strict data requirements, because audio never has to leave their own infrastructure.
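
A back-of-the-envelope check on whether the weights fit on a given GPU, using the 1.5B parameter count listed on this page. The bytes-per-parameter figures are the standard sizes of each numeric format. The estimate covers weights only and ignores activations, decoding caches, and framework overhead, so treat it as a lower bound rather than a measured requirement.

```python
# Storage size of one parameter in each numeric format, in bytes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "float16") -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


PARAMS = 1.5e9  # parameter count as listed on this page

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(PARAMS, dtype):.1f} GB")
# float32: 6.0 GB
# float16: 3.0 GB
# int8: 1.5 GB
```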
What is Whisper Large v3 best for?
Whisper Large v3 is best suited to speech recognition, transcription, and speech translation into English. If your workload fits one of these categories, it is worth benchmarking against alternatives on your own audio.