OpenShift Must Gather Guide
Introduction
oc adm must-gather collects cluster diagnostics for support cases or deep troubleshooting. Run it from a workstation with enough free disk space, and review the resulting archive for sensitive data before sharing it.
Symptoms
Typical symptoms include failed pods, route errors, denied requests, unhealthy operators, or command errors that repeat after retries.
Common Causes
- Running must-gather from a nearly full filesystem.
- Uploading diagnostics without reviewing sensitive project data.
- Collecting too late after logs have rotated.
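The first cause above can be guarded against with a quick free-space check before collecting. The sketch below is only an illustration: has_free_space is a hypothetical helper name, and the 5 GiB threshold is an assumed figure, not a documented requirement.

```shell
# Hypothetical helper: succeed only if DIR has at least MIN_KB kilobytes free.
has_free_space() {
  dir="$1"
  min_kb="$2"
  # df -Pk prints POSIX-format output in 1K blocks;
  # field 4 of the second line is the available space.
  avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge "$min_kb" ]
}

# Assumed threshold: about 5 GiB free in the current directory.
if has_free_space . 5242880; then
  echo "enough free space to run must-gather here"
else
  echo "low disk space; pick another directory first" >&2
fi
```

If the check fails, change to a directory on a larger filesystem before running the collection.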
Step 1: Check the Current Status
oc adm must-gather
ls -lh must-gather.local.*
du -sh must-gather.local.*
tar -czf must-gather.tar.gz must-gather.local.*
Example output:
INFO Gathering data for ns/openshift-cluster-version...
INFO Wrote gather data to must-gather.local.1234567890
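When several must-gather.local.* directories accumulate, it can help to archive only the newest one. The following is a minimal sketch under that assumption; latest_gather_dir and archive_gather are hypothetical helper names, not part of oc.

```shell
# Print the most recently modified must-gather.local.* directory
# under BASE (default: current directory).
latest_gather_dir() {
  base="${1:-.}"
  # ls -d lists the directories themselves; -t sorts newest first.
  ls -td "$base"/must-gather.local.* 2>/dev/null | head -n 1
}

# Archive one gather directory as <name>.tar.gz in the current
# directory and print the archive name on success.
archive_gather() {
  dir="${1%/}"
  name=$(basename "$dir")
  # -C keeps paths inside the archive relative to the parent directory.
  tar -czf "$name.tar.gz" -C "$(dirname "$dir")" "$name" && echo "$name.tar.gz"
}

# Usage, after oc adm must-gather has finished:
#   archive_gather "$(latest_gather_dir)"
```

Naming the archive after the directory keeps the collection timestamp suffix visible in the file name.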
Step 2: Inspect Logs and Events
ls -lh must-gather.local.*
find must-gather.local.* -maxdepth 2 -type d | head
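While inspecting the directory, it may also be worth listing files that contain sensitive-looking strings before the archive is shared. This is a rough sketch: find_sensitive is a hypothetical helper, the pattern list is an assumption and far from exhaustive, and the -I option (skip binary files) is a GNU/BSD grep extension.

```shell
# Hypothetical helper: list files under DIR that contain strings
# which often indicate credentials. The pattern is illustrative only.
find_sensitive() {
  # -r recurse, -I skip binary files, -l print file names only,
  # -E extended regex, -i case-insensitive
  grep -rIlEi 'password|token|BEGIN [A-Z ]*PRIVATE KEY' "$1"
}

# Usage:
#   find_sensitive must-gather.local.1234567890
```

A match is only a prompt for manual review; an empty result does not prove the data is safe to share.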
Step 3: Verify Configuration
Compare the object selectors, service account, image reference, route target, or operator status with the failing symptom. In OpenShift, events often show the exact admission, scheduling, pull, SCC, or route reason.
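Because events often carry the reason, one way to narrow the search is to list the event dumps that contain warnings. The sketch below assumes the gathered tree stores events in files named events.yaml and that warning events carry the Kubernetes field type: Warning; warning_event_files is a hypothetical helper name.

```shell
# Hypothetical helper: list events.yaml files under DIR that contain
# at least one Warning event. The file name is an assumption about
# the must-gather layout.
warning_event_files() {
  find "$1" -type f -name 'events.yaml' \
    -exec grep -l 'type: Warning' {} +
}

# Usage:
#   warning_event_files must-gather.local.1234567890
```

Opening the listed files and reading the reason and message fields then ties each warning to a specific object.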
Step 4: Apply the Fix
Apply the smallest targeted fix: correct the selector, update the route or service port, link the pull secret, grant the specific RBAC or SCC permission, or repair the unhealthy operator dependency.
Step 5: Confirm the Problem Is Resolved
Run the verification commands again and confirm the status, events, and user-facing test all agree.
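For operator health, the status check can be reduced to a small filter over the output of oc get clusteroperators. This sketch assumes the usual column order (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED) and that every row has a VERSION value; unhealthy_operators is a hypothetical helper name.

```shell
# Hypothetical helper: read `oc get clusteroperators` output on stdin
# and print operators that are not Available or are Degraded.
# Assumes columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED ...
unhealthy_operators() {
  awk 'NR > 1 && ($3 != "True" || $5 != "False") { print $1 }'
}

# Usage:
#   oc get clusteroperators | unhealthy_operators
```

An empty result suggests every operator reports Available and not Degraded; it should still be checked against events and the user-facing test.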
Quick Checklist
- Confirm the active project.
- Inspect the exact object named in the error.
- Read recent events.
- Apply one focused fix.
- Verify status after the change.
Summary
Getting value from must-gather means matching the symptom to the OpenShift object that owns it. Use oc status commands, events, logs, and focused verification so that every fix is tied to evidence.