Model Monitoring in Production

Introduction

Production model monitoring combines normal service monitoring with ML-specific signals. You need to know whether the API is healthy and whether predictions still make sense.

What Can Go Wrong in Production

An inference service can return HTTP 200 while silently producing poor predictions. Inputs may shift, upstream systems may change units, labels may arrive late, or traffic may move to a population the model never saw.
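A unit change is one of the easier silent failures to catch with a range check. The sketch below is a minimal illustration, not a full validation layer: the `TRAINING_RANGES` values and the `out_of_range_features` helper are hypothetical and would come from your own training data.

```python
# Hypothetical per-feature ranges observed in training data (assumed values).
TRAINING_RANGES = {
    "tenure": (0, 72),
    "monthly_charges": (15.0, 120.0),
}


def out_of_range_features(features, ranges=TRAINING_RANGES):
    """Return names of features that are missing or outside the training range."""
    flagged = []
    for name, (low, high) in ranges.items():
        value = features.get(name)
        if value is None or not (low <= value <= high):
            flagged.append(name)
    return flagged


ok = out_of_range_features({"tenure": 12, "monthly_charges": 89.9})
# Upstream starts sending cents instead of dollars: the request still succeeds,
# but the value is far outside anything the model saw in training.
bad = out_of_range_features({"tenure": 12, "monthly_charges": 8990})
print(ok, bad)  # [] ['monthly_charges']
```

Counting flagged requests over time, rather than rejecting them outright, is usually enough to surface an upstream change.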

Key Metrics to Monitor

Monitor four groups of signals: service health (availability, latency, error rate, request volume), model behavior (model version, input feature distributions, prediction distribution), business metrics, and resource usage (CPU, memory, GPU).
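As a rough illustration of the service-health signals, the sketch below derives request volume, error rate, and latency percentiles from a batch of request records. The `status` field and the `summarize` helper are assumptions made for this example; in practice a metrics system would normally compute these for you.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile; pct is in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(records):
    """Aggregate service-health metrics over a non-empty batch of records."""
    if not records:
        raise ValueError("need at least one record")
    latencies = [r["latency_ms"] for r in records]
    errors = sum(1 for r in records if r["status"] >= 500)
    return {
        "request_count": len(records),
        "error_rate": errors / len(records),
        "p50_latency_ms": percentile(latencies, 50),
        "p95_latency_ms": percentile(latencies, 95),
    }


# Hypothetical request records; "status" is an assumed HTTP status field.
records = [
    {"latency_ms": 18.4, "status": 200},
    {"latency_ms": 22.1, "status": 200},
    {"latency_ms": 250.0, "status": 500},
    {"latency_ms": 19.7, "status": 200},
]
summary = summarize(records)
print(summary)
```

Percentiles are used instead of the mean because a handful of slow requests can hide behind a healthy-looking average.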

Practical Example

Structured inference logs make monitoring possible:

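import json
import time

# One structured record per inference request.
log = {
    "timestamp": time.time(),  # Unix epoch seconds
    "model_version": "churn:17",  # ties each prediction to a model release
    "latency_ms": 18.4,
    "features": {"tenure": 12, "monthly_charges": 89.9},
    "prediction": 0.73,
}
print(json.dumps(log))  # emit the record as a single JSON line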

Abbreviated output (timestamp and features omitted):

{"model_version":"churn:17","latency_ms":18.4,"prediction":0.73}
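In a real service the latency would be measured rather than hard-coded. One possible way to do that is to wrap the prediction call, as in the sketch below. The `predict_with_logging` wrapper and the stand-in `fake_model` function are hypothetical names used only for illustration.

```python
import json
import time


def predict_with_logging(model_fn, features, model_version):
    """Call model_fn, time it, and emit one structured log line."""
    start = time.perf_counter()
    prediction = model_fn(features)
    latency_ms = (time.perf_counter() - start) * 1000
    record = {
        "timestamp": time.time(),
        "model_version": model_version,
        "latency_ms": round(latency_ms, 3),
        "features": features,
        "prediction": prediction,
    }
    print(json.dumps(record))
    return prediction, record


def fake_model(features):
    """Stand-in for a real model; always returns the same score."""
    return 0.73


prediction, record = predict_with_logging(
    fake_model, {"tenure": 12, "monthly_charges": 89.9}, "churn:17"
)
```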

Detection Strategy

Create separate alerts for service health and model behavior. A 5xx spike is an API incident; a prediction distribution shift may be a data or business change.
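The sketch below shows one way to keep the two alert types apart. It uses the population stability index (PSI) as the shift measure, which is a common choice but not the only one. The thresholds (1% error rate, PSI above 0.2) are illustrative assumptions, and the function names are hypothetical.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population stability index for scores assumed to lie in [0, 1]."""

    def fractions(scores):
        counts = [0] * bins
        for s in scores:
            idx = min(int(s * bins), bins - 1)  # a score of 1.0 goes in the last bin
            counts[idx] += 1
        # eps avoids log(0) for empty bins
        return [max(c / len(scores), eps) for c in counts]

    b, c = fractions(baseline), fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


def classify_alerts(error_rate, psi_value, error_threshold=0.01, psi_threshold=0.2):
    """Return separate alert names for service health and model behavior."""
    alerts = []
    if error_rate > error_threshold:
        alerts.append("service_health")  # e.g. a 5xx spike: page the API owner
    if psi_value > psi_threshold:
        alerts.append("model_behavior")  # prediction shift: investigate data or business change
    return alerts


# Hypothetical prediction scores from a baseline window and a recent window.
baseline = [0.1, 0.2, 0.3, 0.4, 0.5] * 20
shifted = [0.6, 0.7, 0.8, 0.9, 0.95] * 20

print(classify_alerts(error_rate=0.0, psi_value=psi(baseline, shifted)))
print(classify_alerts(error_rate=0.05, psi_value=psi(baseline, baseline)))
```

Keeping the two checks independent means a healthy API cannot mask a drifting model, and the reverse.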

Common Mistakes

Monitoring only CPU and memory.
Not logging model version.
Collecting feature values without considering privacy.
Alerting on noisy metrics without baselines.
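For the privacy point, one option is to filter features before they reach the log. The sketch below assumes a hypothetical allowlist (`LOGGABLE_FEATURES`), a hypothetical set of identifiers to hash (`HASHED_FEATURES`), and a placeholder salt. Note that salted hashing is pseudonymization rather than anonymization, so it may not satisfy every policy.

```python
import hashlib

# Hypothetical policy: which features may be logged as-is, and which only hashed.
LOGGABLE_FEATURES = {"tenure", "monthly_charges"}
HASHED_FEATURES = {"customer_id"}


def sanitize_features(features, salt="replace-with-secret"):
    """Keep allowlisted features, hash identifiers, and drop everything else."""
    safe = {}
    for name, value in features.items():
        if name in LOGGABLE_FEATURES:
            safe[name] = value
        elif name in HASHED_FEATURES:
            digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).hexdigest()
            safe[name] = digest[:16]  # truncated for readability in logs
        # Any feature not listed above is intentionally not logged.
    return safe


raw = {
    "tenure": 12,
    "monthly_charges": 89.9,
    "customer_id": "C-1001",
    "email": "user@example.com",
}
safe = sanitize_features(raw)
print(safe)
```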

Quick Checklist

Is model version logged?
Are latency and error-rate alerts active?
Are input and prediction distributions tracked?
Are labels joined later for quality checks?
Are dashboards reviewed after each release?
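Joining late labels to logged predictions could look like the sketch below. It assumes each log record carries a `request_id` and that labels arrive with the same key. Neither field appears in the example log above, so treat both as hypothetical, along with the `churned` label name.

```python
def join_labels(predictions, labels):
    """Pair each logged prediction with its label, if one has arrived yet."""
    label_by_id = {row["request_id"]: row["churned"] for row in labels}
    return [
        (p["prediction"], label_by_id[p["request_id"]])
        for p in predictions
        if p["request_id"] in label_by_id
    ]


def accuracy_at_threshold(pairs, threshold=0.5):
    """Share of labeled predictions that are correct at the given threshold."""
    if not pairs:
        return None  # no labels have arrived yet
    correct = sum(1 for score, label in pairs if (score >= threshold) == label)
    return correct / len(pairs)


# Hypothetical logged predictions and labels that arrived later.
predictions = [
    {"request_id": "a", "prediction": 0.73},
    {"request_id": "b", "prediction": 0.20},
    {"request_id": "c", "prediction": 0.91},  # label not available yet
]
labels = [
    {"request_id": "a", "churned": True},
    {"request_id": "b", "churned": True},
]

pairs = join_labels(predictions, labels)
print(accuracy_at_threshold(pairs))  # 0.5
```

Tracking how many predictions are still unlabeled is worth doing too, since a quality metric computed on a small labeled fraction can be misleading.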

Summary

Monitoring a production ML model means watching service health, latency, and errors alongside inputs, predictions, and business outcomes.
