# Kubernetes Troubleshooting

## Debugging Workflow

```bash
# 1. Overview
kubectl get pods -o wide
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

# 2. Details
kubectl describe pod <pod>

# 3. Logs
kubectl logs <pod>
kubectl logs <pod> --previous      # Previous (crashed) instance
kubectl logs <pod> -c <container>  # Specific container in a multi-container pod
```

## Common Pod States

| State | Cause | Solution |
|-------|-------|----------|
| Pending | No node resources | Check node capacity |
| ContainerCreating | Image pulling | Check image URI |
| CrashLoopBackOff | Container exits | Check logs, health checks |
| ImagePullBackOff | Failed image pull | Verify credentials |
| OOMKilled (137) | Out of memory | Increase memory limit |

## Service & Network

```bash
kubectl exec -it <pod> -- nslookup kubernetes.default
kubectl exec -it <pod> -- curl http://myservice:8080
kubectl get endpoints
kubectl port-forward service/myservice 8080:8080
kubectl get networkpolicies -A
```

## Quick Fixes

| Problem | Command |
|---------|---------|
| Pod stuck | `kubectl delete pod <pod> --grace-period=0 --force` |
| High CPU | `kubectl top pods -A --sort-by=cpu` |
| High memory | `kubectl top pods -A --sort-by=memory` |
| Restart | `kubectl rollout restart deployment/<name>` |
| Rollback | `kubectl rollout undo deployment/<name>` |

See `kubernetes-troubleshooting-advanced.md` for node issues, HPA, anti-patterns.
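
## Example: Scripted Triage

The debugging workflow and the pod state table above could be wrapped in two small shell functions. This is only a sketch: `suggest_fix` and `triage_pod` are hypothetical helper names, not part of `kubectl`, and the `default` namespace fallback is an assumption.

```bash
#!/usr/bin/env bash

# Hypothetical helper: map a pod state to the first check from the
# "Common Pod States" table.
suggest_fix() {
  case "$1" in
    Pending)           echo "Check node capacity" ;;
    ContainerCreating) echo "Check image URI" ;;
    CrashLoopBackOff)  echo "Check logs, health checks" ;;
    ImagePullBackOff)  echo "Verify credentials" ;;
    OOMKilled)         echo "Increase memory limit" ;;
    *)                 echo "Run kubectl describe pod for details" ;;
  esac
}

# Hypothetical wrapper: run the workflow steps in order for one pod.
# Arguments: pod name, then namespace (assumed to default to "default").
triage_pod() {
  local pod="$1" ns="${2:-default}"

  # 1. Overview
  kubectl get pod "$pod" -n "$ns" -o wide
  kubectl get events -n "$ns" --sort-by='.lastTimestamp'

  # 2. Details
  kubectl describe pod "$pod" -n "$ns"

  # 3. Logs, including the previous instance if the container restarted.
  kubectl logs "$pod" -n "$ns" --all-containers
  kubectl logs "$pod" -n "$ns" --previous || true  # errors if there was no restart
}
```

After sourcing the file, `triage_pod <pod> <namespace>` prints the overview, details, and logs in one pass, and `suggest_fix CrashLoopBackOff` prints the matching first check from the table.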