Model Deployment Strategies in MLOps
Introduction
Model deployment is the process of making a trained model available to the users, systems, jobs, or APIs that consume its predictions. The strategy should match the workload's latency needs, its risk tolerance, and its rollback requirements.
Why This Matters
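A high-quality model can still fail in production if it is released badly. The deployment strategy determines how many users a bad release can affect (the blast radius), how much traffic the new version sees, how it is validated, and how quickly you can recover.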
Core Concepts
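Common strategies, plus the recovery mechanism that backs them up:
- Batch inference: score records on a schedule and store the predictions for later use.
- Real-time inference: serve predictions on request, typically behind an API.
- Shadow deployment: send a copy of live traffic to the new model and log its output without returning it to callers.
- Canary deployment: route a small share of live traffic to the new model and widen that share as metrics hold.
- Blue/green deployment: run the old and new environments side by side and switch all traffic at once.
- Rollback: return traffic to the previous known-good version when a release misbehaves.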
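The canary strategy named above can be sketched as a routing decision. This is a minimal illustration, not a production router: the function name `route_request`, the `canary_percent` parameter, and the choice of hashing a request ID are all assumptions made for the example. Hashing keeps the decision deterministic, so the same ID always reaches the same version.

```python
import hashlib


def route_request(request_id: str, canary_percent: float) -> str:
    """Return "canary" or "stable" for a request, deterministically.

    Hypothetical helper: canary_percent is the share of traffic (0 to 100)
    that should reach the new model version.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    # Hash the ID so the same caller always falls into the same bucket.
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10_000  # bucket in 0..9999
    return "canary" if bucket < canary_percent * 100 else "stable"


if __name__ == "__main__":
    targets = [route_request(f"user-{i}", 10) for i in range(10_000)]
    print("canary share:", targets.count("canary") / len(targets))
```

Setting `canary_percent` to 0 sends everything to the stable version, which makes the same function usable as a crude rollback switch.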
Practical Example
A canary can start with one replica of the new version:
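apiVersion: apps/v1
kind: Deployment
metadata:
  name: churn-api-v2
spec:
  replicas: 1  # one canary replica of the new version
  # apps/v1 requires a selector that matches the pod template labels;
  # the label names and values below are illustrative assumptions.
  selector:
    matchLabels:
      app: churn-api
      version: v2
  template:
    metadata:
      labels:
        app: churn-api
        version: v2
    spec:
      containers:
        - name: api
          image: registry.example.com/churn-api:v2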
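How much traffic a one-replica canary receives depends on how requests are balanced. As a rough sketch, assuming a single Service spreads requests evenly across all ready pods of both versions (the guide does not describe the Service, so this is an assumption), the canary's share is its fraction of the total replica count. The function name `canary_traffic_share` is hypothetical.

```python
def canary_traffic_share(stable_replicas: int, canary_replicas: int) -> float:
    """Estimate the fraction of requests that reach the canary.

    Assumes requests are spread evenly across every pod of both versions.
    """
    if stable_replicas < 0 or canary_replicas < 0:
        raise ValueError("replica counts must be non-negative")
    total = stable_replicas + canary_replicas
    if total == 0:
        raise ValueError("at least one replica is required")
    return canary_replicas / total


if __name__ == "__main__":
    # Nine stable replicas next to the single canary replica from the manifest.
    print(canary_traffic_share(stable_replicas=9, canary_replicas=1))
```

Under this assumption, a finer-grained split than replica ratios allow would need weighted routing at an ingress or service-mesh layer instead.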
How This Fits in a Production Workflow
Choose the strategy before release, not during it. A real-time API needs health checks, latency SLOs, error-rate alerts, logs that record the model version for each prediction, and a tested rollback command.
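One way to connect the latency SLO and error-rate alerts to a rollback is an explicit gate that compares canary metrics with the stable version. The sketch below is illustrative only: the `ReleaseMetrics` type, the `should_roll_back` function, and the default thresholds (300 ms, one percentage point) are assumptions, not values from this guide.

```python
from dataclasses import dataclass


@dataclass
class ReleaseMetrics:
    error_rate: float      # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float  # 95th percentile latency in milliseconds


def should_roll_back(
    canary: ReleaseMetrics,
    stable: ReleaseMetrics,
    latency_slo_ms: float = 300.0,      # hypothetical SLO
    max_error_increase: float = 0.01,   # hypothetical tolerance
) -> bool:
    """Return True when the canary breaches the SLO or errors rise too far."""
    if canary.p95_latency_ms > latency_slo_ms:
        return True
    return canary.error_rate - stable.error_rate > max_error_increase


if __name__ == "__main__":
    stable = ReleaseMetrics(error_rate=0.01, p95_latency_ms=110.0)
    canary = ReleaseMetrics(error_rate=0.05, p95_latency_ms=120.0)
    print("roll back:", should_roll_back(canary, stable))
```

Comparing against the stable version rather than a fixed error budget keeps the gate from firing on an incident that affects both versions equally.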
Common Mistakes
- Using real-time serving when batch scoring is enough.
- Sending all traffic to a new model without rollback.
- Comparing canary models without logging model version.
- Ignoring business metrics after a technically successful release.
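The version-logging mistake in the list above is cheap to avoid. A minimal sketch, assuming structured JSON log lines and a hypothetical `log_prediction` helper (the field names and the `churn_api` logger name are illustrative):

```python
import json
import logging
import time

logger = logging.getLogger("churn_api")  # hypothetical logger name


def log_prediction(
    model_version: str, request_id: str, score: float, latency_ms: float
) -> str:
    """Emit one JSON log line per prediction and return it."""
    record = {
        "ts": time.time(),
        "model_version": model_version,  # lets canary and stable be compared
        "request_id": request_id,
        "score": score,
        "latency_ms": latency_ms,
    }
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    log_prediction("v2", "req-123", 0.73, 12.5)
```

With the version on every line, canary and stable traffic can be separated after the fact, and business metrics can later be joined on `request_id`.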
Quick Checklist
- Is inference batch or real-time?
- Is model version logged?
- Is rollback tested?
- Are latency and error-rate alerts active?
- Is canary or shadow evaluation needed?
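If the last checklist item points to shadow evaluation, the core idea fits in a few lines: the caller always receives the primary model's answer, while the shadow model's output is only recorded for comparison. This is a simplified, synchronous sketch; `predict_with_shadow` and the `comparisons` list are hypothetical names, and the models are stand-in callables.

```python
from typing import Any, Callable, List, Optional, Tuple

Model = Callable[[Any], float]


def predict_with_shadow(
    features: Any,
    primary: Model,
    shadow: Model,
    comparisons: List[Tuple[float, Optional[float]]],
) -> float:
    """Serve the primary prediction and record the shadow one alongside it."""
    result = primary(features)
    try:
        shadow_result: Optional[float] = shadow(features)
    except Exception:
        # A failing shadow model must never affect the caller.
        shadow_result = None
    comparisons.append((result, shadow_result))
    return result


if __name__ == "__main__":
    seen: List[Tuple[float, Optional[float]]] = []
    answer = predict_with_shadow({"tenure": 12}, lambda f: 0.2, lambda f: 0.9, seen)
    print(answer, seen)
```

A real service would typically run the shadow call off the request path so that it adds no latency for callers.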
Summary
This guide covers practical model deployment strategies: batch inference, real-time APIs, shadow deployments, canaries, blue/green releases, and rollback.