The node.kubernetes.io/unreachable taint is applied when the Kubernetes control plane cannot communicate with a worker node and the node's status becomes "Unknown". The taint triggers pod eviction and prevents new pods from being scheduled on that node.
When a node becomes unreachable (network-isolated, crashed, or unresponsive), the Kubernetes control plane automatically applies the node.kubernetes.io/unreachable:NoExecute taint. This is a self-healing mechanism: it prevents new pods from being scheduled on a dead node and evicts existing pods so they can be rescheduled elsewhere. The node status shows "Unknown" when the control plane has not heard from the kubelet for longer than the node controller's grace period (typically 40-50 seconds). By default, pods are given a matching toleration with tolerationSeconds: 300, so they are evicted about five minutes after the taint is applied unless they declare their own toleration.
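You can see both sides of this mechanism directly; <node-name> and <pod-name> below are placeholders for your own resources:
kubectl get node <node-name> -o jsonpath='{.spec.taints}' # shows the unreachable taint while the node is down
kubectl get pod <pod-name> -o jsonpath='{.spec.tolerations}' # shows the automatically added 300-second tolerations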
Run:
kubectl get nodes
kubectl describe node <node-name>
Look for the taint "node.kubernetes.io/unreachable:NoExecute". Check the "Ready" condition; it should show "False" or "Unknown". Examine recent events for clues about what caused the unreachability.
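To list just the taints without scanning the full describe output, a jsonpath query like this works (the node name is a placeholder):
kubectl get node <node-name> -o jsonpath='{range .spec.taints[*]}{.key}={.effect}{"\n"}{end}'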
Try to SSH to the node and check if kubelet is running:
ssh <node-ip>
sudo systemctl status kubelet
sudo journalctl -u kubelet -n 50 # Last 50 kubelet log lines
If kubelet is down, restart it:
sudo systemctl restart kubelet
Monitor status:
kubectl get nodes -w # Watch for status changes
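Kubelet failures are frequently caused by the container runtime being down, so it is worth checking that too. A quick sketch, assuming containerd is the runtime (substitute the unit name if you run Docker or CRI-O):
sudo systemctl status containerd
sudo systemctl enable --now kubelet # ensure kubelet is enabled so it starts on boot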
From the control plane, verify that you can reach the node:
ping <node-ip>
kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics
Check the kubelet port (10250):
netstat -tlnp | grep 10250 # On the node
curl -k https://<node-ip>:10250/metrics # From control plane
If a firewall is blocking traffic, open the required ports: 10250 (kubelet API) and, if enabled, 10255 (the read-only kubelet port).
IPv4 forwarding is required for pod networking. Check:
cat /proc/sys/net/ipv4/ip_forward
If it returns 0, enable it:
sudo sysctl -w net.ipv4.ip_forward=1
Make the change permanent by adding this line to /etc/sysctl.conf:
net.ipv4.ip_forward = 1
Then apply it: sudo sysctl -p
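A drop-in file under /etc/sysctl.d/ achieves the same thing without editing /etc/sysctl.conf; the file name here is only an example:
echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-kubernetes.conf
sudo sysctl --system # reload settings from all sysctl configuration files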
Check installed CNI plugin:
kubectl get ds -n kube-system # DaemonSets including CNI
kubectl logs -n kube-system <cni-pod> # Check logs for errors
Common CNI plugins: Flannel, Calico, Weave. Verify:
- Pods are running: kubectl get pods -n kube-system | grep <cni>
- Node CIDR assignment: kubectl get node <node-name> -o jsonpath="{.spec.podCIDR}"
If the CNI plugin is misconfigured, redeploy it from its official manifests.
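The exact manifest URL depends on the plugin and version, so check the project's documentation; as one example, Flannel publishes kube-flannel.yml with its GitHub releases:
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml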
Review firewall rules:
sudo iptables -L -n # On the node
sudo firewall-cmd --list-all # If using firewalld
Ensure these ports are open:
- 10250/TCP (kubelet)
- 10255/TCP (read-only kubelet port, if enabled)
- 8285/UDP and 8472/UDP (Flannel UDP and VXLAN backends, if using Flannel)
- 179/TCP (Calico BGP, if using Calico)
For testing, temporarily disable the firewall:
sudo systemctl stop firewalld
If this fixes the problem, start firewalld again and add the proper rules instead of leaving it disabled.
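A minimal sketch of the permanent rules, assuming firewalld and the Flannel VXLAN backend; adjust the ports to match your CNI plugin:
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --permanent --add-port=8472/udp
sudo firewall-cmd --reload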
If temporary unreachability is expected, increase the toleration period in the pod spec:
spec:
  tolerations:
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 6000 # 100 minutes instead of the default 300s
  # ... rest of pod spec
For critical pods that must stay running, increase tolerationSeconds even further, or omit it entirely so the pod tolerates the taint indefinitely (not recommended for most workloads).
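Rather than editing manifests by hand, the same toleration can be patched into a Deployment's pod template; note that this strategic-merge patch replaces any tolerations already declared in the template, and <deployment-name> is a placeholder:
kubectl patch deployment <deployment-name> -p '{"spec":{"template":{"spec":{"tolerations":[{"key":"node.kubernetes.io/unreachable","operator":"Exists","effect":"NoExecute","tolerationSeconds":6000}]}}}}'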
If the node cannot be recovered, gracefully remove it:
kubectl drain <node-name> --ignore-daemonsets
kubectl delete node <node-name>
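If drain hangs because of pods that use emptyDir volumes or are not managed by a controller, recent kubectl versions accept these flags to force the eviction (data in emptyDir volumes is lost):
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data --force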
For cloud providers (AWS, GCP, Azure), the node may auto-rejoin if the instance is still running. Stop the instance if you want permanent removal:
# AWS EC2
aws ec2 stop-instances --instance-ids <instance-id>
Watch the node status after recovery attempts:
kubectl get nodes -w
kubectl describe node <node-name>
Once the Ready condition returns to True, the taint is removed automatically and pods can be scheduled on the node again. Check pod eviction status:
kubectl get pods --all-namespaces --field-selector=status.phase=Failed # evicted pods show STATUS "Evicted"
Evicted pods are not restarted in place; controllers such as Deployments recreate them automatically, while bare pods must be recreated manually.
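Once replacement pods are running, the leftover Failed/Evicted pod records can be cleaned up in bulk; review the list from the previous command before deleting:
kubectl delete pods --all-namespaces --field-selector=status.phase=Failed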
This taint is Kubernetes' self-healing mechanism—it's not an error to panic about, but rather the system protecting workloads from dead nodes. For stateful workloads, increase tolerationSeconds to allow time for node recovery. For cloud-managed Kubernetes (AWS EKS, GKE, AKS), check cloud-specific monitoring—the issue might be at the infrastructure layer (VM unavailable, network misconfiguration) rather than Kubernetes. In multi-zone clusters, node unreachability may indicate zone failure; monitor zone health. WSL2-based Kubernetes may experience temporary unreachability during VM suspension/resume—ensure proper VM configuration. For on-premises clusters, ensure stable network between control plane and all worker nodes; consider redundant network paths for critical clusters.