Back to GlossaryInfrastructure

MLOps

Definition

The set of practices, tools, and principles for managing the full lifecycle of machine learning systems in production, from data preparation through deployment, monitoring, and maintenance.

MLOps (Machine Learning Operations) applies DevOps principles to the unique challenges of ML systems. Key concerns include data versioning, experiment tracking, model training pipelines, automated testing, deployment orchestration, monitoring for data drift and model degradation, and retraining workflows. Popular tools include MLflow (experiment tracking), Weights & Biases (training visualization), Kubeflow (ML pipelines on Kubernetes), DVC (data version control), and Seldon (model deployment). MLOps is critical because ML systems can fail silently — a model that was accurate at deployment may degrade as real-world data distributions shift. The field has grown rapidly as organizations move from ML experiments to production systems, with dedicated MLOps platforms and roles becoming standard in data-driven organizations.

Companies in Infrastructure

View Infrastructure companies →