CloudsArk
Troubleshooting Kubernetes

Fix Startup Probe Failed

Learn practical fix startup probe failed with kubectl commands, manifests, verification steps, common mistakes, and production-focused guidance.

Fix Startup Probe Failed

Introduction

This guide explains fix startup probe failed with practical kubectl commands, realistic output, and production-focused checks. Use this workflow when an application is failing and you need evidence before changing manifests.

Symptoms

You may see pods stuck in a waiting state, failed rollouts, 4xx or 5xx responses, missing endpoints, failed probes, denied API calls, or repeated events in the namespace.

Common Causes

Common causes include events, previous logs, probes, image pulls, resource limits, and rollout state. Always confirm with events and logs before editing the workload.

Step 1: Check Current State

kubectl get pods -n app -o wide
kubectl describe pod web-7d9f8c-abcde -n app

Expected output:

NAME                  READY   STATUS             RESTARTS   AGE
web-7d9f8c-abcde      0/1     CrashLoopBackOff   6          8m

Step 2: Inspect Events and Logs

kubectl describe pod web-7d9f8c-abcde -n app
kubectl logs web-7d9f8c-abcde -n app --previous

Events show scheduler, kubelet, image pull, mount, and probe errors. Previous logs are critical when the container restarts quickly.

Step 3: Verify the Manifest or Runtime Setting

kubectl rollout status deployment/web -n app
kubectl get pod web-7d9f8c-abcde -n app -o yaml

Check selectors, image names, probes, resource limits, service accounts, volumes, and namespace references.

Step 4: Apply the Fix

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  namespace: app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27
        ports:
        - containerPort: 80

Apply only the corrected field, then let the controller reconcile the desired state.

kubectl apply -f manifest.yaml
kubectl rollout status deployment/web -n app

Step 5: Confirm Recovery

kubectl get pods -n app
kubectl get events -n app --sort-by=.lastTimestamp

Common Mistakes

  • Deleting pods before reading the events that explain why they failed.
  • Changing probes, resources, images, and RBAC at the same time.
  • Troubleshooting only the pod while ignoring the service, PVC, node, or service account.

Quick Checklist

  • Check pod status and restart count.
  • Read describe output and recent events.
  • Inspect current and previous container logs.
  • Verify dependent objects such as Secrets, ConfigMaps, PVCs, Services, and RBAC.
  • Apply one fix and watch the rollout.

Summary

Treat fix startup probe failed as an evidence-driven debugging task. Events identify the failing layer, logs explain application behavior, and rollout checks prove the fix worked.