Container runtime errors indicate the kubelet cannot communicate with Docker, containerd, or another CRI runtime. Pods fail to start and nodes become degraded when the runtime is unavailable or misconfigured.
The kubelet relies on a container runtime (Docker, containerd, CRI-O) to:
1. Create and run containers
2. Manage the container lifecycle (start, stop, restart)
3. Handle image pulls
4. Configure container networking
When the runtime fails, the kubelet cannot run any pods. Common causes:
- Runtime daemon crashed or is not responding
- Runtime socket file missing or has wrong permissions
- Image pull failures in the runtime
- Incompatible kubelet and runtime versions
- Node disk full, preventing runtime operations
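Before digging into the node itself, you can confirm the kubelet is reporting runtime trouble from the API side; <node-name> below is the affected node:
kubectl describe node <node-name> | grep -A10 Conditions   # Ready should be True; look for runtime-related messages
kubectl get events -A --field-selector involvedObject.name=<node-name>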
Check which runtime your cluster uses:
kubectl get nodes -o wide # Check the CONTAINER-RUNTIME column
SSH into the affected node and verify runtime status:
For Docker:
sudo systemctl status docker
sudo systemctl start docker # If stopped
sudo docker ps # Test connectivity
For containerd:
sudo systemctl status containerd
sudo systemctl start containerd
sudo ctr -a /run/containerd/containerd.sock version
For CRI-O:
sudo systemctl status crio
sudo systemctl start crio
If the daemon is running, proceed to the next step.
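If the daemon reports as running but pods still fail, you can also probe the CRI endpoint directly with crictl. The socket path below assumes containerd; substitute unix:///var/run/crio/crio.sock for CRI-O or unix:///run/cri-dockerd.sock for Docker via cri-dockerd:
sudo crictl --runtime-endpoint unix:///run/containerd/containerd.sock info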
Review runtime logs for errors:
Docker:
sudo journalctl -u docker -f
sudo tail -f /var/log/docker.log
containerd:
sudo journalctl -u containerd -f
sudo tail -f /var/log/containerd/containerd.log
CRI-O:
sudo journalctl -u crio -f
Look for errors like:
- "failed to create container"
- "out of memory"
- "no space left on device"
- "permission denied"
- "connection refused"
Check that the runtime socket exists and is accessible:
Docker:
ls -la /var/run/docker.sock
# Should output: srw-rw---- root docker /var/run/docker.sock
containerd:
ls -la /run/containerd/containerd.sock
# Should output: srw-rw---- root root /run/containerd/containerd.sock
CRI-O:
ls -la /var/run/crio/crio.sock
If the socket file is missing:
1. Restart the runtime daemon
2. Check if the runtime directory exists
3. Verify mount points if using unusual storage
If permissions are wrong:
sudo chmod 660 /var/run/docker.sock
sudo chown root:docker /var/run/docker.sock
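After fixing ownership, you can confirm the Docker socket is actually serving the API rather than just existing on disk (curl 7.40+ supports --unix-socket):
sudo curl --unix-socket /var/run/docker.sock http://localhost/version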
The container runtime typically needs space in:
- /var/lib/docker (Docker)
- /var/lib/containerd (containerd)
- /var/lib/crio (CRI-O)
Check disk usage:
df -h /var/lib/docker
df -h /var/lib/containerd
sudo du -sh /var/lib/docker/* # Breakdown by component
If the disk is more than 90% full:
# Clean up unused images
sudo docker rmi $(sudo docker images -q -f "dangling=true")
sudo ctr -n k8s.io i rm $(sudo ctr -n k8s.io i ls -q | tail -20) # Removes the 20 most recently listed images; confirm they are unused first
# Remove unused containers
sudo docker container prune
# Clear build cache
sudo docker builder prune
For persistent space issues, increase the volume or move the runtime's data directory to larger storage:
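One way to move storage for Docker is to set data-root in /etc/docker/daemon.json; /mnt/docker-data below is only an example mount point, and containerd has an equivalent root setting in /etc/containerd/config.toml:
# Example /etc/docker/daemon.json
{ "data-root": "/mnt/docker-data" }
sudo systemctl restart docker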
sudo docker info | grep "Storage Driver" # Confirm the storage driver in use
Verify the kubelet is configured to use the correct runtime socket:
ps aux | grep kubelet | grep -E "(container-runtime|container-runtime-endpoint)"
For containerd (Kubernetes 1.24+):
kubectl describe node <node> | grep -i "container"
Edit the kubelet config:
For kubeadm clusters:
sudo nano /etc/sysconfig/kubelet               # RHEL-based distros (Debian/Ubuntu: /etc/default/kubelet)
# or
sudo nano /var/lib/kubelet/kubeadm-flags.env   # Flags kubeadm writes at init/join time
# or, on some installers:
sudo nano /etc/kubernetes/kubelet.env
Ensure it includes:
--container-runtime=remote            # Flag removed in Kubernetes 1.27; omit it on newer versions
--container-runtime-endpoint=unix:///var/run/containerd/containerd.sock
For Docker on Kubernetes 1.24 and later, dockershim is gone, so --container-runtime=docker no longer works: install cri-dockerd and point --container-runtime-endpoint at its socket (typically unix:///run/cri-dockerd.sock) instead.
Restart kubelet:
sudo systemctl restart kubelet
sudo journalctl -u kubelet -f
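If the kubelet log shows errors such as the CRI runtime API not being implemented, a frequent cause on containerd nodes is the CRI plugin being disabled in the shipped config; this check assumes the default /etc/containerd/config.toml location:
sudo grep disabled_plugins /etc/containerd/config.toml    # "cri" must not appear in this list
sudo containerd config dump | grep 'io.containerd.grpc.v1.cri'   # Plugin should show up in the active config
sudo systemctl restart containerd                         # After removing "cri" from disabled_plugins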
Test runtime API connectivity:
containerd:
sudo ctr -a /run/containerd/containerd.sock version
sudo ctr -a /run/containerd/containerd.sock images list
Docker:
sudo docker version
sudo docker ps
CRI-O:
sudo crictl version
sudo crictl images
If API errors appear:
- Check socket permissions (see Step 3)
- Restart runtime daemon (Step 1)
- Check for stale connections:
sudo lsof | grep containerd.sock
sudo kill <stale-pid>
As a last resort, restart the entire runtime:
# Cordon the node (prevent new pods)
kubectl cordon <node-name>
# Drain existing pods
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# SSH into node and restart runtime
sudo systemctl stop kubelet
sudo systemctl restart docker # or containerd, crio
sudo systemctl start kubelet
# Monitor recovery
sudo journalctl -u kubelet -f
kubectl uncordon <node-name>
kubectl get nodes -w
Pods will be rescheduled to other nodes. Monitor their status:
kubectl get pods -A -o wide -w | grep <node-name>
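If the grep output is noisy, you can list only the pods scheduled on that node; kubectl supports field selectors on spec.nodeName for pods:
kubectl get pods -A -o wide --field-selector spec.nodeName=<node-name>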
Check for version compatibility:
kubectl version # Kubernetes version
sudo docker --version # Docker version
sudo containerd --version # containerd version
Ref: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
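You can also read the kubelet and runtime versions the API server reports for every node at once; the column names here are just labels:
kubectl get nodes -o custom-columns=NODE:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion,RUNTIME:.status.nodeInfo.containerRuntimeVersion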
If versions are incompatible, upgrade the runtime:
Docker (Ubuntu/Debian):
sudo apt-get update
sudo apt-get install -y docker-ce=<version>
sudo systemctl restart docker
containerd (Ubuntu/Debian):
sudo apt-get install -y containerd.io=<version>
sudo systemctl restart containerd
CRI-O:
sudo apt-get install -y cri-o=<version>
sudo systemctl restart crio
Always test in a staging environment first.
Container runtime errors on many nodes at once suggest a cluster-wide issue: a container registry outage, a storage backend failure, or a network partition. On managed services (EKS, GKE, AKS), node runtime issues are usually the platform's responsibility, so report them via support.
Kubernetes 1.24 removed the dockershim, dropping built-in Docker support in favor of CRI runtimes such as containerd, which is lighter and faster. Docker Desktop includes a runtime suitable for development, but production nodes should run a dedicated CRI runtime (containerd is the usual choice). Runtime errors can also cascade: if a pod repeatedly fails to start, its restart churn consumes kubelet resources and can destabilize the node, so rely on the kubelet's exponential backoff (CrashLoopBackOff) rather than aggressive external restart loops.
Monitoring runtime health is critical: the kubelet exposes Prometheus metrics for container creation and runtime operation errors. For stateless workloads, automatic node replacement (managed scaling) is often safer than manual recovery. Custom OCI runtimes (Kata Containers, gVisor) have their own troubleshooting paths; consult the runtime-specific docs.
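As a starting point for the runtime-health monitoring mentioned above, the kubelet's metrics endpoint exposes runtime operation error counters. A minimal check, assuming you can reach the kubelet through the API server proxy and that your RBAC allows kubectl get --raw, is:
kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics | grep kubelet_runtime_operations_errors_total
A sustained increase in these counters usually points at the runtime rather than at individual workloads.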