The "node not found" error occurs when kubelet fails to register with the Kubernetes API server during cluster initialization or after a node restart. This prevents the node from joining the cluster and scheduling pods.
When you see "Error getting node err=node not found" in kubelet logs, it means the kubelet on a worker node cannot find itself registered in the API server. This typically happens during cluster initialization when kubelet tries to register before the API server is ready, or after a reboot when node registration fails. Kubelet registers itself with the API server by creating a Node object. If this fails, the node cannot join the cluster and cannot schedule pods. The error appears in kubelet logs, not in kubectl commands.
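Unless the --hostname-override flag is set, kubelet derives its node name from the machine's hostname, lowercased. A minimal sketch of that defaulting (default_node_name is an illustrative helper, not part of kubelet):

```shell
#!/bin/sh
# Approximate kubelet's default node-name derivation: the machine
# hostname, trimmed of whitespace and lowercased (overridable with
# the --hostname-override kubelet flag).
default_node_name() {
  name=$(printf '%s' "$1" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')
  printf '%s\n' "$name"
}

default_node_name "Worker-01"   # prints "worker-01"
```

This is why a hostname change (or a hostname that differs from what kubeadm recorded) makes kubelet look for a Node object that does not exist.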
SSH to the affected node and verify kubelet is running:
ssh <node-ip>
sudo systemctl status kubelet
If kubelet is not running, check why:
sudo journalctl -xeu kubelet # Detailed logs with explanations
sudo journalctl -u kubelet -n 100 # Last 100 kubelet log lines
Look for errors about API server connectivity, configuration, or the container runtime.
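To isolate the registration failure in a large log, filter for the error string. A sketch, where the sample variable is an illustrative stand-in for real journalctl output:

```shell
#!/bin/sh
# Count kubelet log lines that report the registration failure.
# On a live node, pipe journalctl instead of the sample variable:
#   sudo journalctl -u kubelet --no-pager | grep -c 'node not found'
sample='E0101 12:00:00.000000 kubelet.go:123] "Error getting node" err="node not found"'
printf '%s\n' "$sample" | grep -c 'node not found'   # prints 1
```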
Kubelet should automatically re-register with the API server after restart:
sudo systemctl restart kubelet
Monitor registration from the control plane:
kubectl get nodes -w # Watch for the node to appear
Watch the API server logs from the control plane:
kubectl logs -n kube-system <api-server-pod> # Check for registration events
Give it 30-60 seconds to register.
Check the system hostname:
hostnamectl
hostname -f
Verify that it matches the kubeadm configuration or the kubelet flags. If using kubeadm, check the node name embedded in the kubelet kubeconfig:
grep "system:node" /etc/kubernetes/kubelet.conf # Node name appears as system:node:<name>
If there's a mismatch, either:
- Change the hostname: sudo hostnamectl set-hostname <new-name>
- Update kubelet config to match the actual hostname
Ensure every node in the cluster has a unique hostname; in particular, control plane and worker nodes must not share one.
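In a kubeadm cluster, the node's identity is recorded in /etc/kubernetes/kubelet.conf as a system:node:<name> user. A mismatch check can be sketched as follows (check_node_name is an illustrative helper, and the demo uses a sample file standing in for the real kubelet.conf):

```shell
#!/bin/sh
# Compare the node name recorded in a kubeadm kubelet.conf-style
# kubeconfig against the machine's hostname (lowercased).
# On a real node: check_node_name /etc/kubernetes/kubelet.conf "$(hostname)"
check_node_name() {
  conf=$1; host=$2
  registered=$(grep -o 'system:node:[^ "]*' "$conf" | head -n1 | cut -d: -f3)
  current=$(printf '%s' "$host" | tr '[:upper:]' '[:lower:]')
  if [ "$registered" = "$current" ]; then
    echo "match: $registered"
  else
    echo "MISMATCH: kubelet.conf says '$registered', hostname is '$current'"
  fi
}

# Demo with a sample file standing in for the real kubelet.conf:
conf=$(mktemp)
printf 'users:\n- name: system:node:worker-01\n' > "$conf"
check_node_name "$conf" "Worker-01"   # prints "match: worker-01"
rm -f "$conf"
```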
From the node, test API server connectivity:
kubectl cluster-info # From control plane
kubectl get nodes # From control plane
kubectl --kubeconfig=/etc/kubernetes/kubelet.conf get nodes # Using kubelet credentials
Test port 6443:
netstat -tlnp | grep 6443 # On control plane
curl -k https://<control-plane-ip>:6443/api/v1 # From the node
If the API server isn't listening or responding, fix that first.
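Raw TCP reachability to the API server port can also be probed without curl, using bash's built-in /dev/tcp (a sketch; port_open is an illustrative helper, and the demo probes a locally closed port rather than a real control plane):

```shell
#!/bin/bash
# Probe a TCP port with a timeout; the exit status signals reachability.
# Usage: port_open <host> <port>
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Real use: port_open <control-plane-ip> 6443
# Demo against port 1, which is closed on virtually every host:
port_open 127.0.0.1 1 || echo "unreachable"   # prints "unreachable"
```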
Check Docker or other container runtime:
sudo systemctl status docker # For Docker
sudo systemctl status containerd # For containerd
sudo systemctl status crio # For CRI-O
If the runtime is not running, start it (Docker shown here):
sudo systemctl start docker
Test container runtime connectivity:
sudo docker version
sudo ctr version # For containerd
If the pod network (CNI) plugin is missing, kubelet may log network-plugin errors and the node may stay NotReady. Check from the control plane:
kubectl get pods -n kube-system | grep -E "flannel|calico|weave" # Check for CNI pods
If missing, install the network plugin:
# Flannel example:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Then restart kubelet on the node:
sudo systemctl restart kubelet
Verify the node can reach the API server on port 6443:
sudo firewall-cmd --list-all # If using firewalld
sudo iptables -L -n # If using iptables
Open the port if needed (on the control plane, which must accept inbound 6443):
sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --reload
For testing, temporarily disable the firewall:
sudo systemctl stop firewalld
Some Kubernetes versions have compatibility issues with specific Docker versions. Check both:
docker version
kubectl version # From control plane
Each Kubernetes release validates only specific container runtime versions, and kubeadm's preflight checks warn when the installed Docker version is unvalidated. (Kubernetes 1.24+ removed dockershim entirely, so using Docker there requires cri-dockerd.) If the versions are incompatible, upgrade Docker:
sudo apt-get update
sudo apt-get install docker-ce=<version> # Specific version
Then restart kubelet.
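A scripted version of this check can compare the installed Docker version against an allow-list. A sketch; docker_version_ok is an illustrative helper, and the version list is a placeholder, not authoritative: take the real list from your Kubernetes release notes or kubeadm preflight output.

```shell
#!/bin/sh
# Check a Docker version string (major.minor) against an allow-list.
# The VALIDATED list below is illustrative only.
docker_version_ok() {
  validated=" 19.03 20.10 "
  major_minor=$(printf '%s' "$1" | cut -d. -f1,2)
  case "$validated" in
    *" $major_minor "*) echo "docker $1: validated" ;;
    *)                  echo "docker $1: not on the validated list" ;;
  esac
}

# Real use: docker_version_ok "$(docker version --format '{{.Server.Version}}')"
docker_version_ok "20.10.7"   # prints "docker 20.10.7: validated"
docker_version_ok "18.03.1"   # prints "docker 18.03.1: not on the validated list"
```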
When joining workers to a new cluster, control plane must be fully ready first:
# On control plane, wait for this to succeed:
kubectl get nodes
# Only then run on worker:
kubeadm join <control-plane-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
Wait 2-3 minutes for the control plane to stabilize before joining workers. Monitor kubelet logs during the join:
sudo journalctl -u kubelet -f
Node registration happens automatically: kubelet polls the API server with backoff until it succeeds. If the logs show repeated registration attempts failing, one of the underlying issues (API server down, hostname mismatch, firewall) must be fixed first.

For kubeadm-managed clusters, the kubeadm join command handles most of the configuration automatically; for self-managed clusters, ensure the kubelet systemd unit or startup script sets the API server address correctly. On cloud providers (AWS, GCP, Azure), node registration typically succeeds automatically unless there are IAM or network issues at the cloud layer. WSL2-based Kubernetes may see temporary registration delays during VM startup, so give it extra time. In air-gapped environments, pre-download all required images and binaries before starting cluster initialization.
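The poll-with-backoff behavior just described can be mirrored when scripting your own readiness checks, for example waiting for a node to appear after a restart. A generic sketch (wait_for is a hypothetical helper, not a kubectl or kubeadm command):

```shell
#!/bin/sh
# Retry a command with exponentially growing delay, the way kubelet
# keeps retrying registration. Usage: wait_for <max_attempts> <cmd...>
wait_for() {
  max=$1; shift
  delay=1
  attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$@"; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    sleep "$delay"
    delay=$((delay * 2))   # exponential backoff
    attempt=$((attempt + 1))
  done
  echo "gave up after $max attempts"
  return 1
}

# Real use: wait_for 10 kubectl get node "$(hostname)"
# Demo with a command that succeeds immediately:
wait_for 3 true   # prints "succeeded on attempt 1"
```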