Unknown status means Kubernetes lost contact with the node running the pod. Check node health, network connectivity, and kubelet status. The pod may still be running but unreachable.
A pod in Unknown status indicates that the Kubernetes API server cannot communicate with the kubelet on the node where the pod is running. This is a communication problem, not necessarily a pod problem—the container may still be running, but Kubernetes has no visibility into its state. The node controller marks pods as Unknown when it hasn't received a status update from the node within the node-monitor-grace-period (default 40 seconds). This typically indicates node failure, network partition, or kubelet issues.
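If your cluster uses node leases (Kubernetes 1.14 and later), you can see when the kubelet last reported in by inspecting the node's Lease object; its renewTime field records the most recent heartbeat:
# Check the node's heartbeat lease in the kube-node-lease namespace
kubectl get lease <node-name> -n kube-node-lease -o yaml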
First, verify the node health:
kubectl get nodes
Look for nodes in NotReady status. Get details:
kubectl describe node <node-name>
Check the Conditions section for:
- Ready: False or Unknown
- MemoryPressure, DiskPressure, PIDPressure
- NetworkUnavailable
If the node is NotReady, that's the root cause of Unknown pods.
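As a quick shortcut, you can pull just the Ready condition with a jsonpath query (using the same <node-name> placeholder as above):
# Prints True, False, or Unknown for the node's Ready condition
kubectl get node <node-name> -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'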
List all pods in Unknown state:
kubectl get pods --all-namespaces -o wide | grep Unknown
If all Unknown pods are on the same node, the issue is node-level, not pod-level:
# Find which nodes the Unknown pods are on (current namespace; column 7 is NODE)
kubectl get pods -o wide | grep Unknown | awk '{print $7}' | sort | uniq -c
If you have node access, check kubelet:
# SSH to node
ssh <node-ip>
# Check kubelet status
systemctl status kubelet
# View kubelet logs
journalctl -u kubelet -f
# Check node resources
free -h
df -h
Common issues:
- Kubelet crashed due to OOM
- Disk full preventing kubelet from operating
- Certificate expired for kubelet-API communication
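If an expired kubelet client certificate is suspected, you can check its expiry from the node. The path below is the usual location on kubeadm-provisioned nodes and may differ on other distributions:
# Show the expiry date of the kubelet's client certificate (kubeadm default path)
openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -noout -enddate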
Kubernetes will automatically recover when communication resumes:
1. If node comes back online, pod status updates automatically
2. If node stays offline past pod-eviction-timeout (default 5 minutes), pods are rescheduled
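On clusters with taint-based eviction (the default in current Kubernetes releases), the control plane taints an unreachable node and evicts pods once their node.kubernetes.io/unreachable toleration expires (300 seconds by default, which matches the five-minute figure above). You can confirm the taint is present:
# An unreachable node carries the node.kubernetes.io/unreachable:NoExecute taint
kubectl describe node <node-name> | grep -i taints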
Monitor recovery:
kubectl get pods -w # Watch for status changes
kubectl get nodes -w
For managed Kubernetes (EKS, GKE, AKS), the cloud provider may automatically replace unhealthy nodes.
If the node won't recover and pods remain Unknown:
kubectl delete pod <pod-name> --grace-period=0 --force
Caution: Force deletion tells Kubernetes to forget about the pod, but:
- The container may still be running on the failed node
- Attached volumes may not be properly released
- PersistentVolumes might show as still attached
Only use force delete when you're certain the node won't recover or has been terminated.
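Before or after a force delete, it can be worth checking whether the pod's volumes are still attached to the failed node. VolumeAttachment objects are cluster-scoped and show which node each volume is attached to:
# List volume attachments and the nodes they are bound to
kubectl get volumeattachments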
If the node is permanently lost:
# Drain remaining pods (evictions may fail on an unreachable node, but this cordons it and records intent)
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data --force
# Delete the node
kubectl delete node <node-name>
After node deletion:
- Pods are marked for rescheduling
- PersistentVolumes should detach (may need cloud provider cleanup)
- New nodes can join to replace capacity
For cloud providers, terminated instances are usually cleaned up automatically.
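Once the node object is gone, a quick sanity check is to confirm the affected pods are Running on other nodes and that their PersistentVolumes look healthy:
# Confirm pods were rescheduled onto healthy nodes
kubectl get pods -o wide
# Check PersistentVolume status (stuck volumes may need cloud-provider cleanup)
kubectl get pv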
Unknown status in managed Kubernetes services (EKS, GKE, AKS) often resolves automatically as the cloud provider replaces unhealthy nodes. Wait several minutes before manual intervention.
For on-premises clusters, Unknown status may indicate:
- Network switch failures
- DNS resolution problems
- Firewall blocking kubelet-to-API communication
- Load balancer issues in front of API servers
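A basic connectivity test run from the affected node can help narrow these down. The sketch below assumes the API server is reachable at a placeholder <api-server-host> on the conventional port 6443; adjust for your environment:
# From the node: verify DNS resolution and reachability of the API server
nslookup <api-server-host>
curl -k https://<api-server-host>:6443/healthz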
To prevent data loss with stateful workloads, ensure:
- PodDisruptionBudgets limit simultaneous failures (an example follows this list)
- StatefulSets use volumeClaimTemplates for persistent storage
- Backup strategies exist for critical data
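As an illustration, a PodDisruptionBudget can be created imperatively; the name, label selector, and threshold below are placeholders, not values from this article:
# Keep at least 2 pods matching the selector available during voluntary disruptions
kubectl create poddisruptionbudget <pdb-name> --selector=app=<label-value> --min-available=2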
The node-monitor-grace-period and pod-eviction-timeout controller-manager flags control how quickly Unknown pods are rescheduled. Shorter timeouts mean faster recovery but more false positives during transient network issues.
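These settings are kube-controller-manager flags; on kubeadm clusters the controller manager's static pod manifest usually lives at /etc/kubernetes/manifests/kube-controller-manager.yaml. On versions that use taint-based eviction (the default in current releases), pod-eviction-timeout may no longer take effect and the pods' tolerationSeconds govern eviction instead. A sketch of the relevant flags with their default values:
# Defaults shown; tune with care to balance recovery speed against false positives
--node-monitor-grace-period=40s
--pod-eviction-timeout=5m0s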