
Distillation

Last updated: April 2026

Definition

Distillation is a model compression technique that transfers knowledge from a large teacher model to a smaller student model by training the student to match the teacher's output probability distributions, enabling deployment of capable models on resource-constrained devices with minimal quality loss.
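
Concretely, the standard objective (from Hinton et al., 2015, discussed below) mixes a softened teacher-matching term with the usual hard-label loss. The form below is a common sketch, not the only variant; the temperature T and mixing weight α are free hyperparameters:

```latex
% Illustrative knowledge-distillation objective.
% z_s, z_t: student and teacher logits; y: the hard label;
% T: softening temperature; \alpha: mixing weight.
\mathcal{L} =
  \alpha \, T^{2} \,
  \mathrm{KL}\!\left(
    \mathrm{softmax}(z_t / T) \,\big\|\, \mathrm{softmax}(z_s / T)
  \right)
  + (1 - \alpha) \,
  \mathrm{CE}\!\left(y, \mathrm{softmax}(z_s)\right)
```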

Understanding Distillation is key if you're evaluating AI companies or products.

Knowledge distillation, introduced by Hinton et al. in 2015, transfers the knowledge captured by a large, expensive model into a smaller, faster model suitable for deployment. The student model learns not just from the hard labels (correct answers) but also from the teacher's soft probability distributions, which contain richer information about the relationships between classes. For example, a teacher's output might indicate that a cat image has a small probability of being a dog but almost zero probability of being a car; this relative information helps the student generalize better. Distillation has become crucial for deploying LLMs efficiently: model families such as Gemma and Phi, along with many specialized models, are distilled from larger ones. The technique enables running capable AI on mobile devices, edge hardware, and in latency-sensitive applications.
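
A minimal sketch of this loss in PyTorch, assuming teacher and student logits for a batch are already computed; the temperature and mixing weight shown are illustrative defaults, not recommended settings:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Hinton-style KD loss: softened KL term plus hard-label cross-entropy."""
    # Soften both distributions with temperature T; a higher T exposes the
    # teacher's relative probabilities across incorrect classes.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    student_log_probs = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between the softened distributions, scaled by T^2 so
    # gradient magnitudes stay comparable to the hard-label term.
    soft_loss = F.kl_div(student_log_probs, soft_targets,
                         reduction="batchmean") * (T ** 2)
    # Standard cross-entropy against the hard labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Illustrative usage with random logits (8 examples, 100 classes):
student_logits = torch.randn(8, 100, requires_grad=True)
teacher_logits = torch.randn(8, 100)
labels = torch.randint(0, 100, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

In practice the teacher's logits are produced by a frozen forward pass over the same inputs, and the T² scaling keeps the soft-target gradients from vanishing as the temperature rises.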

Training methodologies involving Distillation are essential to producing capable AI models. Practitioners at companies ranging from OpenAI and Anthropic to smaller startups rely on these techniques to shrink models and cut inference cost without sacrificing much quality. The computational cost and data requirements of training remain active areas of research and optimization.

Understanding Distillation is essential for anyone working in artificial intelligence, whether as a researcher, engineer, investor, or business leader. As AI systems become more sophisticated and widely deployed, concepts like distillation increasingly influence product development decisions, investment theses, and regulatory frameworks. The rapid pace of innovation in this area means that today's best practices may evolve significantly within months, making continuous learning a requirement for AI practitioners.

The continued evolution of Distillation reflects the broader trajectory of artificial intelligence from research curiosity to production-critical technology. Industry analysts project that investments in distillation capabilities and related infrastructure will accelerate as organizations across sectors recognize the competitive advantages offered by AI-native approaches to long-standing business challenges.
