CloudsArk

Fix CoreDNS CrashLoopBackOff

Learn how to fix a CoreDNS CrashLoopBackOff with kubectl commands, manifests, verification steps, common mistakes, and production-focused guidance.

Introduction

This guide explains how to fix a CoreDNS CrashLoopBackOff with practical kubectl commands, realistic output, and production-focused checks. Use this workflow when cluster DNS is failing and you need evidence before changing manifests.

Symptoms

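CoreDNS pods in kube-system show CrashLoopBackOff with a rising restart count, and the kube-dns Service may have no ready endpoints. Workloads that depend on cluster DNS then fail name lookups, which surfaces as failed rollouts, failed probes, 5xx responses from services that cannot resolve their backends, or repeated events in the namespace.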

Common Causes

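Common causes include a forwarding loop through the node's upstream resolvers (for example, a resolv.conf that points at a local stub such as 127.0.0.53), an invalid Corefile in the CoreDNS ConfigMap, memory limits that are too low for the cluster's query load, pod DNS policy, and NetworkPolicy rules that block CoreDNS from reaching the API server or its upstreams. Always confirm with events and logs before editing the workload.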
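
To check the upstream-resolver cause, it can help to look for loopback nameservers in the resolv.conf that CoreDNS inherits from the node. The following is a minimal sketch; the function name loopback_nameservers is hypothetical, and it assumes the relevant file is the node's /etc/resolv.conf (the kubelet can be configured to use a different one).

```shell
# Print nameserver entries that point at a loopback address.
# Reads resolv.conf-formatted text on stdin.
# Exit status: 0 if at least one loopback nameserver was found, 1 otherwise.
loopback_nameservers() {
  awk '$1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { print $2; found = 1 }
       END { exit !found }'
}
```

Run on a node, loopback_nameservers < /etc/resolv.conf prints any loopback entries. A hit does not prove a loop on its own, but it is a strong hint to compare against the CoreDNS logs.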

Step 1: Check Current State

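kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl describe pod -n kube-system -l k8s-app=kube-dns

Expected output from the first command once CoreDNS is healthy (a failing pod shows CrashLoopBackOff in the STATUS column and a rising RESTARTS count):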

NAME                         READY   STATUS    RESTARTS   AGE
coredns-7db6d8ff4d-abcde     1/1     Running   0          3d
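
When there are several replicas, a small filter over the pod listing makes the unhealthy ones stand out. This is a sketch only: crashing_pods is a hypothetical helper, and it assumes the default column layout shown above (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Read `kubectl get pods` output on stdin and print the name, status, and
# restart count of every pod that is crash-looping or has restarted at least once.
crashing_pods() {
  awk 'NR > 1 && ($3 == "CrashLoopBackOff" || $4 + 0 > 0) { print $1, $3, $4 }'
}
```

Usage: kubectl get pods -n kube-system -l k8s-app=kube-dns | crashing_pods. No output means no pod in the listing is crash-looping or has restarted.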

Step 2: Inspect Events and Logs

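kubectl get events -n kube-system --sort-by=.lastTimestamp
kubectl logs -n kube-system deployment/coredns
kubectl logs -n kube-system deployment/coredns --previous
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never -- nslookup kubernetes.default

Events show scheduler, kubelet, image pull, mount, and probe errors. Logs from the previous container (--previous) are critical when the container restarts quickly, because the current container may not have logged the failure yet. The nslookup test shows whether a pod can resolve the in-cluster name kubernetes.default.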
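
The two most useful facts to pull out of this evidence are whether the previous logs report a forwarding loop and why the last container terminated. The helpers below are a hedged sketch: both function names are hypothetical, the log pattern assumes the wording the CoreDNS loop plugin uses when it detects a loop, and the second function assumes the "Last State" layout of kubectl describe pod output. Adjust the patterns if your versions print something different.

```shell
# Succeeds if CoreDNS log text on stdin contains a loop-plugin report.
# Assumption: the loop plugin logs a line containing "plugin/loop: Loop".
has_forward_loop() {
  grep -q 'plugin/loop: Loop'
}

# Print the termination reason recorded under "Last State" in
# `kubectl describe pod` output read from stdin (for example OOMKilled or Error).
last_termination_reason() {
  awk '/Last State:/ { in_last = 1; next }
       in_last && $1 == "Reason:" { print $2; in_last = 0 }'
}
```

Usage: kubectl logs -n kube-system deployment/coredns --previous | has_forward_loop && echo "loop detected", and kubectl describe pod -n kube-system -l k8s-app=kube-dns | last_termination_reason.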

Step 3: Verify the Manifest or Runtime Setting

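kubectl get configmap coredns -n kube-system -o yaml
kubectl get deployment coredns -n kube-system -o yaml

Check the Corefile's forward target and plugin list, the image tag, liveness and readiness probes, resource limits, the service account, and the ConfigMap volume mount. The ConfigMap name coredns matches a kubeadm-style install; other distributions may name it differently.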
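
Since the forward target is usually the first thing to verify, it can be extracted from the Corefile directly. A minimal sketch, assuming the ConfigMap is named coredns and stores the configuration under the Corefile key; forward_upstreams is a hypothetical helper name.

```shell
# Read Corefile text on stdin and print each upstream listed on a
# `forward` line (everything after the zone, ignoring the opening brace).
forward_upstreams() {
  awk '$1 == "forward" { for (i = 3; i <= NF; i++) if ($i != "{") print $i }'
}
```

Usage: kubectl get configmap coredns -n kube-system -o jsonpath='{.data.Corefile}' | forward_upstreams. If the result is /etc/resolv.conf, CoreDNS forwards to whatever resolvers the node is configured with, so the node's resolver configuration is the next thing to inspect.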

Step 4: Apply the Fix

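The example below assumes the previous logs reported a forwarding loop; adapt it to whatever your evidence shows. It is modeled on the default Corefile that kubeadm generates, with one change: forward points at an explicit upstream resolver instead of /etc/resolv.conf. The address 10.0.0.2 is a placeholder for a resolver reachable from your nodes. Compare the rest with your existing ConfigMap so you do not overwrite local customizations.

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        # Placeholder upstream; replaces "forward . /etc/resolv.conf"
        forward . 10.0.0.2 {
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }

Apply only the corrected field, then restart CoreDNS and let the controller reconcile the desired state.

kubectl apply -f manifest.yaml
kubectl rollout restart deployment/coredns -n kube-system
kubectl rollout status deployment/coredns -n kube-system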

Step 5: Confirm Recovery

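kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get events -n kube-system --sort-by=.lastTimestamp
kubectl run dns-test --rm -it --image=busybox:1.36 --restart=Never -- nslookup kubernetes.default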
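
Recovery is rarely instant, so a bounded retry is often more useful than a single check. The helper below is a generic sketch, not part of kubectl; the name retry_until_ok and its argument order (attempts, delay in seconds, then the command) are assumptions made for this example.

```shell
# Run a command until it succeeds or the attempt budget is used up.
# Usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt failed.
retry_until_ok() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Wait between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, retry_until_ok 10 5 kubectl rollout status deployment/coredns -n kube-system --timeout=10s polls the rollout up to ten times before giving up, which gives a clear pass or fail signal for scripts and runbooks.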

Common Mistakes

  • Deleting pods before reading the events that explain why they failed.
  • Changing probes, resources, images, and RBAC at the same time.
  • Troubleshooting only the pod while ignoring the kube-dns Service, the CoreDNS ConfigMap, the node's resolv.conf, or the service account.

Quick Checklist

  • Check pod status and restart count.
  • Read describe output and recent events.
  • Inspect current and previous container logs.
  • Verify dependent objects such as Secrets, ConfigMaps, PVCs, Services, and RBAC.
  • Apply one fix and watch the rollout.

Summary

Treat a CoreDNS CrashLoopBackOff as an evidence-driven debugging task. Events identify the failing layer, logs explain why the container exits, and rollout checks prove the fix worked.