Interactive guide to common pod problems, how to read events and status, and targeted fixes—no cluster required to learn the patterns.
A Pod moves through phases shown below. Failures often appear as Pending, repeated restarts, or terminal Failed—each with distinct events and fixes.
Init containers and multiple restarts add detail; this view highlights where scheduling, image pull, config, runtime, and OOM issues usually surface.
| State / symptom | One-line description | Severity |
|---|---|---|
| Pending | Scheduler or kubelet prerequisites not met; pod not placed or not starting. | Warn |
| ImagePullBackOff / ErrImagePull | Registry, auth, tag, or network prevents pulling the container image. | High |
| CrashLoopBackOff | Container exits repeatedly; kubelet backs off restarts. | High |
| OOMKilled | Process exceeded its cgroup memory limit and was killed. | High |
| CreateContainerConfigError | Env or volume references point to missing ConfigMaps, Secrets, or keys. | Warn |
| RunContainerError | Runtime could not start the container (security, mounts, binary path). | High |
Run kubectl describe pod <name> and match the Events or State output against the table above to find the relevant section.
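As a minimal sketch of that matching step, the function below maps a status reason to the corresponding row of the table and its severity. The function name `triage_reason` and the output format are hypothetical; only the reasons and severities come from the table above.

```shell
# Hypothetical helper: print "<row of the state table>: <severity>" for a reason string.
triage_reason() {
  case "$1" in
    Pending|FailedScheduling)      echo "Pending: Warn" ;;
    ImagePullBackOff|ErrImagePull) echo "ImagePullBackOff / ErrImagePull: High" ;;
    CrashLoopBackOff)              echo "CrashLoopBackOff: High" ;;
    OOMKilled)                     echo "OOMKilled: High" ;;
    CreateContainerConfigError)    echo "CreateContainerConfigError: Warn" ;;
    RunContainerError)             echo "RunContainerError: High" ;;
    *)                             echo "unknown: inspect Events manually" ;;
  esac
}

# Possible usage against a live cluster (reads the first container's waiting reason):
#   triage_reason "$(kubectl get pod <pod-name> -n <namespace> \
#     -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}')"
```

Reasons not listed in the table fall through to the last branch, so an unfamiliar status still prompts a manual look at Events.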
Pending means the pod is accepted but not running—often scheduling, resource, or volume binding.
1. Inspect pod events:
kubectl describe pod <pod-name> -n <namespace>
Scroll to Events; look for FailedScheduling with reasons.
2. Filter cluster events for the pod:
kubectl get events --field-selector involvedObject.name=<pod-name> -n <namespace> --sort-by='.lastTimestamp'
3. Compare node capacity and allocation:
kubectl describe nodes
Check Allocatable vs. running pod requests; note taints and conditions.
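The comparison in step 3 can be sketched as simple arithmetic on CPU quantities. This is an illustration only: the function names are hypothetical, the inputs are values you would read off kubectl describe nodes by hand, and the real scheduler also weighs memory, pod count, taints, and affinity.

```shell
# Convert a Kubernetes CPU quantity ("500m", "2", "0.5") to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) printf '%s\n' "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
  esac
}

# Succeeds when the pod's CPU request fits into what is left on the node.
# Arguments: <node allocatable> <sum of requests already on node> <pod request>
fits_on_node() {
  allocatable=$(cpu_to_millicores "$1")
  requested=$(cpu_to_millicores "$2")
  pod=$(cpu_to_millicores "$3")
  [ $((allocatable - requested)) -ge "$pod" ]
}

# Example: a node with 2 CPUs allocatable and 1500m already requested
# leaves 500m, so a pod requesting 250m fits:
#   fits_on_node 2 1500m 250m && echo "fits"
```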
Events:
Type Reason Message
---- ------ -------
Warning FailedScheduling 0/3 nodes are available: 1 Insufficient cpu, 2 node(s) had taint {key: value}, that the pod didn't tolerate.
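A FailedScheduling message packs one reason per group of nodes into a single line. The sketch below splits it into one reason per line, assuming the message has the shape shown above (a "N/M nodes are available: " prefix, then reasons that each start with a node count). The function name is hypothetical, and messages from other Kubernetes versions may carry extra text that this does not handle.

```shell
# Hypothetical helper: print each scheduling reason on its own line.
split_scheduling_reasons() {
  printf '%s\n' "$1" |
    # Drop the leading "N/M nodes are available: " summary.
    sed -e 's|^[0-9]*/[0-9]* nodes are available: ||' |
    # Break the line at every ", <digit>" boundary; commas inside a
    # reason (not followed by a count) are left alone.
    awk '{
      out = ""
      while (match($0, /, [0-9]/)) {
        out = out substr($0, 1, RSTART - 1) "\n"
        $0 = substr($0, RSTART + 2)
      }
      print out $0
    }'
}

# Usage:
#   split_scheduling_reasons "0/3 nodes are available: 1 Insufficient cpu, 2 node(s) had taint {key: value}, that the pod didn't tolerate."
```

Each output line then maps to one row of the cause table: the count tells you how many nodes were rejected for that reason.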
| Cause | Fix |
|---|---|
| No schedulable node | Fix NotReady nodes; adjust workloads. |
| Insufficient CPU/memory | Lower the pod's requests (the scheduler places pods by requests, not limits), scale the cluster, or remove noisy neighbors. |
| Taints / tolerations | Add tolerations to the pod spec or remove the taint. |
| Affinity mismatch | Relax nodeSelector, affinity, or label nodes correctly. |
| PVC not bound | Fix StorageClass, provisioner, or capacity. |
| Too many pods | Spread across nodes, raise kubelet max pods (ops), or reduce replicas. |
The kubelet cannot pull the image; Kubernetes retries with exponential backoff (ImagePullBackOff).
kubectl describe pod <pod-name> -n <namespace>
Under Events, find Failed to pull image with the underlying error (404, 401, timeout, etc.).
State: Waiting
Reason: ImagePullBackOff
Events:
Warning Failed Error: ErrImagePull
Warning Failed failed to pull image "myregistry/app:badtag": rpc error: Not Found
Correct the image — fix deployment/pod image: field to a valid name:tag or digest.
Registry credentials — create a pull secret and reference it:
kubectl create secret docker-registry regcred \
--docker-server=<registry> \
--docker-username=<user> \
--docker-password=<token> \
--docker-email=<email> \
-n <namespace>
DNS and connectivity — on a node, test: crictl pull or nerdctl pull (depending on runtime), and verify DNS resolves the registry host.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  imagePullSecrets:
    - name: regcred
  containers:
    - name: app
      image: myregistry.example.com/myapp:1.2.3
The container starts, exits non-zero (or is killed), and kubelet restarts it—backing off after repeated failures.
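To get a feel for how long a crash-looping container waits between restarts, the sketch below prints a doubling delay sequence with a cap. The base of 10 seconds and cap of 300 seconds match the restart back-off described in the Kubernetes documentation; they are passed as parameters here rather than hard-coded, and the function name is hypothetical.

```shell
# Hypothetical helper: print <count> back-off delays in seconds,
# starting at <base>, doubling each time, never exceeding <cap>.
backoff_schedule() {
  base=$1
  cap=$2
  count=$3
  delay=$base
  out=""
  i=0
  while [ "$i" -lt "$count" ]; do
    out="${out:+$out }$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt "$cap" ]; then
      delay=$cap
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

# Usage:
#   backoff_schedule 10 300 7
```

With these inputs the delay reaches the cap on the sixth restart, which is why a pod that has been crash-looping for a while appears to retry only every five minutes.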
Common cause: wrong command / args or working directory.
kubectl logs <pod-name> -n <namespace> --previous
kubectl describe pod <pod-name> -n <namespace>
Check restartCount, Last State, and Reason (e.g. Error, OOMKilled).
kubectl get pod <pod-name> -n <namespace> \
-o jsonpath='{.status.containerStatuses[0].lastState}' | jq .
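The terminated state also carries an exit code, which narrows down how the container ended. The sketch below applies the usual convention that codes above 128 mean the process was terminated by signal (code minus 128), so 137 corresponds to SIGKILL (signal 9). The function name and the wording of the messages are hypothetical.

```shell
# Hypothetical helper: give a rough reading of a container exit code.
explain_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited cleanly"
  elif [ "$code" -gt 128 ]; then
    echo "terminated by signal $((code - 128))"
  else
    echo "application error (exit code $code)"
  fi
}

# Possible usage against a live cluster:
#   explain_exit_code "$(kubectl get pod <pod-name> -n <namespace> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')"
```

A signal 9 result alongside Reason: OOMKilled points at the memory limit rather than the application's own error handling.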
panic: runtime error: invalid memory address
exit status 2
Error: connect ECONNREFUSED 10.0.0.5:5432
Read the --previous logs; fix the application code or configuration causing the exit.
Compare command/args with the image's ENTRYPOINT documentation.
Tune probes: raise initialDelaySeconds and separate readiness from liveness.
If the reason is OOMKilled, treat it as a memory limit issue (see OOMKilled below).
The Linux OOM killer terminated the container process when cgroup memory exceeded the container limit.
Common cause: resources.limits.memory set too low for the real workload.
kubectl describe pod <pod-name> -n <namespace>
Under container status, lastState.terminated.reason: OOMKilled.
kubectl top pod -n <namespace>
On the node (if permitted):
dmesg | grep -i oom
Raise limits (and usually requests) based on profiling.
For JVM workloads, set -Xmx (and related flags) below the container memory limit, leaving headroom.
containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        memory: "256Mi"
      limits:
        memory: "512Mi"
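For the JVM case, the headroom rule can be turned into a quick calculation. The sketch below derives an -Xmx value from a memory limit in Mi. The 75% default is an illustrative assumption, not a recommendation from this guide, and the function name is hypothetical; the right fraction depends on how much non-heap memory (metaspace, threads, direct buffers) the workload uses.

```shell
# Hypothetical helper: suggest a JVM max heap flag for a container memory limit.
# Arguments: <limit in Mi> [heap percent, default 75 (assumed)]
xmx_for_limit() {
  limit_mi=$1
  heap_percent=${2:-75}
  heap_mi=$((limit_mi * heap_percent / 100))
  printf '%s\n' "-Xmx${heap_mi}m"
}

# Usage with the 512Mi limit from the example above:
#   xmx_for_limit 512
```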
Kubelet cannot build the container environment—usually missing ConfigMap, Secret, or a referenced key.
Common cause: a referenced key is not found in the ConfigMap/Secret.
kubectl describe pod <pod-name> -n <namespace>
Events often show: Error: configmap "app-config" not found or similar for secrets/keys.
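When the event names the missing object, you can pull out its kind and name and check for it directly. The sketch below handles only the message shape quoted above (configmap or secret, a quoted name, then "not found"); messages about missing keys use a different wording and are not covered. The function name is hypothetical.

```shell
# Hypothetical helper: print "<kind> <name>" for a "not found" event message,
# or nothing if the message does not match.
missing_object() {
  printf '%s\n' "$1" |
    sed -E -n 's/.*(configmap|secret) "([^"]+)" not found.*/\1 \2/p'
}

# Usage:
#   missing_object 'Error: configmap "app-config" not found'
# The result can be passed to kubectl to confirm the object is absent:
#   kubectl get configmap app-config -n <namespace>
```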
kubectl apply -f configmap.yaml -n <namespace>
kubectl apply -f secret.yaml -n <namespace>
Verify name, namespace, and key references in the pod spec.
# Container-level fields
envFrom:
  - configMapRef:
      name: app-config
  - secretRef:
      name: app-secret
volumeMounts:
  - name: cfg
    mountPath: /config
# Pod-level field
volumes:
  - name: cfg
    configMap:
      name: app-config
      items:
        - key: app.properties
          path: app.properties
Container runtime failed to start the container—distinct from in-app crashes (CrashLoop) or image pull failures.
Common causes:
runAsNonRoot: true but the image only runs as root.
Binary not found on PATH, or wrong command.
kubectl describe pod <pod-name> -n <namespace>
Container state Waiting, reason RunContainerError; message often names permission, mount, or executable issues.
Set a runAsUser/fsGroup the image supports, or use a non-root image.
Relax runAsNonRoot only if policy allows; prefer fixing the image.
Ensure readOnlyRootFilesystem allows required writes (tmpfs/extra volume).
Set command to the full path if needed: ["/usr/local/bin/myapp"].
Example security context adjustment (illustrative; match your policy):
securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
| Status / hint | First command | Common cause | Fix direction |
|---|---|---|---|
| Pending + FailedScheduling | kubectl describe pod | Resources, taints, affinity, PVC | Adjust requests/tolerations/affinity; fix PVC |
| ImagePullBackOff | kubectl describe pod | Bad tag, auth, rate limit, network | Fix image; add imagePullSecret; check node DNS |
| CrashLoopBackOff | kubectl logs --previous | App error, bad cmd, probes | Fix app/config; tune probes |
| OOMKilled | kubectl describe pod | Low limit, leak, JVM heap | Raise limit; profile; set -Xmx |
| CreateContainerConfigError | kubectl describe pod | Missing CM/Secret/key | Create objects; fix references |
| RunContainerError | kubectl describe pod | Security, mounts, binary | Adjust securityContext; fix volumes/cmd |