CloudsArk

Kubernetes for MLOps

Introduction

Learn how Kubernetes helps deploy ML APIs, scale inference workloads, manage resources, and operate model-serving services.

Before You Start

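Before deploying, you need a container image for the model API, CPU and memory requests for it, a health endpoint the cluster can probe, and a Service that routes traffic to the pods. Use Kubernetes to run the serving workload; training pipelines can run as Kubernetes Jobs or in an external workflow system.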
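
If training does run inside the cluster, a Job is the usual fit because it runs to completion instead of serving traffic. The manifest below is only a sketch: the Job name, the training image, the command, and the resource figures are all hypothetical and not part of this guide's setup.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: churn-train                 # hypothetical Job name
spec:
  backoffLimit: 2                   # retry a failed training pod at most twice
  template:
    spec:
      restartPolicy: Never          # Jobs require Never or OnFailure
      containers:
      - name: train
        image: registry.example.com/ml-train:churn-17   # hypothetical training image
        command: ["python", "train.py"]                 # hypothetical entry point
        resources:
          requests:
            cpu: "1"                # placeholder values; size them from real training runs
            memory: "2Gi"
```

Keeping training in a Job and serving in a Deployment lets each be sized and restarted independently.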

Project Structure

container image -> Deployment -> Service -> Ingress or Route -> monitoring
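
For the Ingress stage of this chain, a minimal sketch could look like the following. It assumes an ingress controller is installed and registered under the class name nginx, and the hostname ml.example.com is hypothetical. It routes to the ml-api Service on port 80 that the deployment step defines. On OpenShift, a Route would take the place of this object.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ml-api
spec:
  ingressClassName: nginx          # assumption: a controller with this class exists
  rules:
  - host: ml.example.com           # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix           # match every path under /
        backend:
          service:
            name: ml-api           # Service created in the deployment step
            port:
              number: 80           # Service port, forwarded to containerPort 8000
```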

Step-by-Step Deployment

Save the following Deployment and Service as ml-api.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ml-api
  template:
    metadata:
      labels:
        app: ml-api
    spec:
      containers:
      - name: api
        image: registry.example.com/ml-api:churn-17
        ports:
        - containerPort: 8000
        resources:
          requests:
            cpu: "250m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: ml-api
spec:
  selector:
    app: ml-api
  ports:
  - port: 80
    targetPort: 8000

Apply the manifest, wait for the rollout to complete, and list the pods:

kubectl apply -f ml-api.yaml
kubectl rollout status deployment/ml-api
kubectl get pods -l app=ml-api

Testing the Deployment

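Test from inside the cluster with a temporary curl pod. The hostname ml-api resolves to the Service from any pod in the same namespace, and the second line below is the expected response: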

kubectl run curl --rm -it --image=curlimages/curl --restart=Never -- curl -s http://ml-api/health
{"status":"ok","model_version":"churn:17"}

Production Considerations

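Before going to production, add readiness and liveness probes, autoscaling, a PodDisruptionBudget, centralized logs, metrics, a NetworkPolicy, and conservative rollout settings (for example, a rolling update that keeps existing pods serving until their replacements are ready).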
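
As a rough sketch of three of these items, the manifests below add CPU-based autoscaling, a PodDisruptionBudget, and a NetworkPolicy for the ml-api pods. The replica bounds, the 70% target, and the access-ml-api label are illustrative assumptions, not recommendations. The autoscaler needs a metrics source such as metrics-server, and the NetworkPolicy only takes effect if the cluster's network plugin enforces policies.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ml-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ml-api
  minReplicas: 2                   # matches the Deployment's starting replica count
  maxReplicas: 6                   # illustrative upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70     # percent of the CPU request (250m), illustrative
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ml-api
spec:
  minAvailable: 1                  # keep at least one pod up during voluntary disruptions
  selector:
    matchLabels:
      app: ml-api
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: ml-api-ingress
spec:
  podSelector:
    matchLabels:
      app: ml-api
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          access-ml-api: "true"    # hypothetical label that client pods must carry
    ports:
    - protocol: TCP
      port: 8000                   # the container port, not the Service port
```

Once the NetworkPolicy is enforced, pods without the client label can no longer reach the API, including a temporary curl pod used for testing.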

Common Mistakes

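  • Deploying without resource requests, so the scheduler cannot reserve capacity for the pods and utilization-based autoscaling has nothing to measure against.
  • Scaling out inference before measuring model load time and memory use, so new replicas may start slowly or be killed for exceeding their memory limit.
  • Omitting readiness probes, so the Service can send traffic to pods whose model is not loaded yet.
  • Not exposing the model version in the health response or metrics, which makes it hard to tell which model is serving.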
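
The probe and load-time mistakes above can be addressed with a fragment like the one below, meant to be merged into the ml-api Deployment rather than applied on its own. It assumes the /health endpoint on port 8000 is suitable for all three probes and that the model loads within about two minutes. Both are assumptions to check against the real service, and the timing values are illustrative.

```yaml
# Fragment to merge into the ml-api Deployment; not a complete manifest.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0            # never remove a serving pod before its replacement is ready
      maxSurge: 1                  # add one new pod at a time
  template:
    spec:
      containers:
      - name: api
        startupProbe:              # gives the model time to load before other probes start
          httpGet:
            path: /health
            port: 8000
          periodSeconds: 5
          failureThreshold: 24     # 24 x 5 s = up to 120 s of startup time
        readinessProbe:            # gates Service traffic on a healthy response
          httpGet:
            path: /health
            port: 8000
          periodSeconds: 10
        livenessProbe:             # restarts the container if it stops responding
          httpGet:
            path: /health
            port: 8000
          periodSeconds: 20
          failureThreshold: 3
```

A separate startup probe keeps a slow model load from being mistaken for a hung container, which a short liveness timeout alone would do.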

Summary

A reliable model deployment is versioned, testable, observable, and reversible.