MLOps
Definition
The set of practices, tools, and principles for managing the full lifecycle of machine learning systems in production, from data preparation through deployment, monitoring, and maintenance.
MLOps (Machine Learning Operations) applies DevOps principles to the challenges specific to ML systems, where behavior depends on data as well as code. Key concerns include data versioning, experiment tracking, model training pipelines, automated testing, deployment orchestration, monitoring for data drift and model degradation, and retraining workflows. Popular tools include MLflow (experiment tracking), Weights & Biases (training visualization), Kubeflow (ML pipelines on Kubernetes), DVC (data version control), and Seldon (model deployment). MLOps matters because ML systems can fail silently: a model that was accurate at deployment may degrade as real-world data distributions shift, without raising any error. The field has grown rapidly as organizations move from ML experiments to production systems, and dedicated MLOps platforms and roles are increasingly common in data-driven organizations.
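To make the monitoring concern concrete, here is a minimal sketch of one common way to check a single numeric feature for data drift: the Population Stability Index (PSI), which compares the binned distribution of a reference sample (for example, training data) with a current sample (for example, recent production inputs). The function name, bin count, and the 0.1 / 0.25 thresholds are illustrative assumptions rather than part of any particular tool; the thresholds are a widely quoted rule of thumb, not a standard.

```python
import math
import random


def population_stability_index(reference, current, n_bins=10):
    """Return the PSI between two samples of one numeric feature.

    Bins are equal-width over the range of the reference sample.
    (Hypothetical helper for illustration; not taken from a library.)
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic data: the "shifted" sample simulates production inputs whose mean has moved.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(reference, same)
psi_shifted = population_stability_index(reference, shifted)

# Assumed rule of thumb: below 0.1 is stable, above 0.25 suggests significant drift.
for name, psi in [("same", psi_same), ("shifted", psi_shifted)]:
    status = "drift suspected" if psi > 0.25 else "stable"
    print(f"{name}: PSI={psi:.3f} ({status})")
```

In a production setting a check like this would typically run on a schedule per feature, with results feeding an alert or a retraining workflow. PSI only detects changes in input distributions; it does not by itself show that model accuracy has dropped.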
Related Terms
Inference
The process of using a trained AI model to generate predictions or outputs on new data, as opposed t...
Model Serving
The infrastructure and systems for deploying trained AI models in production to handle real-time req...
Cloud AI
AI services and infrastructure provided through cloud computing platforms, allowing organizations to...
Training Data
The dataset used to teach a machine learning model, consisting of examples from which the model lear...