The etcd context deadline exceeded error indicates API requests are timing out when communicating with the etcd cluster. This is critical because etcd stores all Kubernetes cluster data and its unavailability affects the entire cluster.
Etcd client requests (from kube-apiserver or etcdctl) are failing to reach or receive responses from the etcd cluster within the default timeout window (typically 5 seconds). Since etcd is the backing database for all Kubernetes state, this error prevents API server operations like creating pods, updating deployments, or querying cluster state.
Check if etcd members are responding:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint health
Each endpoint should report "is healthy". If any member reports unhealthy or the command times out, that member is the problem.
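If the cluster has several members, you can check them all in one call (a sketch assuming etcd 3.3 or newer, where the --cluster flag is available):
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
endpoint health --cluster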
Test reachability between etcd members:
ping <etcd-member-ip>
telnet <etcd-member-ip> 2379 # Client port
telnet <etcd-member-ip> 2380 # Peer port
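If telnet is not installed, nc or a direct request to etcd's /health endpoint works as well; this is a sketch assuming the standard kubeadm certificate paths:
nc -zv <etcd-member-ip> 2379
curl --cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/etcd/server.crt \
--key /etc/kubernetes/pki/etcd/server.key \
https://<etcd-member-ip>:2379/health
A healthy member typically responds with {"health": "true"}.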
Check firewall rules:
sudo iptables -L -n | grep 2379
sudo iptables -L -n | grep 2380
Open the ports if blocked:
sudo iptables -A INPUT -p tcp --dport 2379 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 2380 -j ACCEPT
Verify TLS certs are correct:
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--endpoints=https://127.0.0.1:2379 \
member list
If cert errors occur, the certificates may be expired or misconfigured.
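Before regenerating anything, it helps to confirm whether the certificates have actually expired:
# Recent kubeadm versions can list expiry dates for all control plane certs:
kubeadm certs check-expiration
# Or inspect a single certificate directly:
openssl x509 -in /etc/kubernetes/pki/etcd/server.crt -noout -enddate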
Regenerate them if needed:
kubeadm certs renew etcd-server
kubeadm certs renew etcd-peer
Then restart etcd.
Give etcd more time to respond:
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--command-timeout=120s \
--endpoints=https://127.0.0.1:2379 \
endpoint health
If this succeeds with the longer timeout, etcd is reachable but slow. Check for performance issues (disk I/O, CPU).
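To get a rough measure of how slow the cluster is, etcdctl also includes a built-in performance check; it writes temporary test data to the cluster, so prefer running it during a quiet period:
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--endpoints=https://127.0.0.1:2379 \
check perf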
Ensure majority of members are healthy:
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--endpoints=https://127.0.0.1:2379 \
member list
Count how many members are healthy, cross-checking the member list against the endpoint health output above (the status table sketched after the list below also helps). For quorum:
- 3-member cluster needs 2 healthy
- 5-member cluster needs 3 healthy
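To see member IDs, the current leader, and database sizes in one view before deciding what to remove, the status table output is useful:
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--endpoints=https://127.0.0.1:2379 \
endpoint status --cluster -w table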
If an unhealthy member is dragging the cluster down, remove it (note that member removal itself requires quorum; if quorum is already lost, restore etcd from a snapshot instead):
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key member remove <member-id>
Check if etcd is slow:
# Monitor metrics:
kubectl logs -n kube-system -l component=etcd | grep -iE "took too long|slow"
# Check CPU/memory:
kubectl top pod -n kube-system -l component=etcd
# Check database size:
ls -lh /var/lib/etcd/member/snap/db
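Etcd is very sensitive to fsync latency, so it is also worth benchmarking the disk that backs /var/lib/etcd. This is a sketch based on the fio write test commonly recommended for etcd; the directory name is a placeholder, so adjust it to a path on the same disk as /var/lib/etcd and delete it afterwards:
mkdir -p /var/lib/etcd-disk-test
fio --rw=write --ioengine=sync --fdatasync=1 \
--directory=/var/lib/etcd-disk-test --size=22m --bs=2300 --name=etcd-disk-test
# Check the fsync/fdatasync latency percentiles; the 99th percentile should ideally stay below ~10ms
rm -rf /var/lib/etcd-disk-test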
If slow, defragment the database (defragmentation briefly blocks the member it runs against, so run it one member at a time):
ETCDCTL_API=3 etcdctl \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--endpoints=https://127.0.0.1:2379 \
defrag
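If the database previously hit its space quota, etcd raises a NOSPACE alarm that keeps rejecting writes until it is explicitly cleared, even after defragmentation. Check and clear it with the same flags as above:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
alarm list
# If a NOSPACE alarm is listed, clear it after freeing space:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
alarm disarm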
Get more details on the timeout:
kubectl logs -n kube-system -l component=kube-apiserver | grep deadline
kubectl logs -n kube-system -l component=kube-apiserver | grep etcd
Look for:
- "context deadline exceeded" - request timed out
- "connection refused" - etcd not reachable
- "x509" - certificate issue
Rolling restart of etcd members (one at a time, wait for recovery):
sudo systemctl restart etcd
# or for containerized:
sudo docker restart <etcd-container>
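On kubeadm clusters etcd usually runs as a static pod under containerd rather than as a systemd service or a Docker container. In that case a sketch using crictl (the container ID is a placeholder) would be:
sudo crictl ps --name etcd            # find the etcd container ID
sudo crictl stop <etcd-container-id>  # kubelet recreates the static pod automatically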
watch "ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key --endpoints=https://127.0.0.1:2379 endpoint health"Wait 10-30 seconds between restarts for leader election.
Context deadline exceeded is a request timeout, not a software bug; the root cause is usually infrastructure: network latency, slow disk I/O, or resource constraints. Etcd performance is critical because even small delays cascade as kube-apiserver retries failed requests. Monitor etcd metrics continuously in production. For large clusters (100+ nodes), consider a dedicated external etcd cluster, separate from the control plane nodes, to prevent resource contention. Socket exhaustion (Issue #90664) can cause deadline exceeded errors in some versions; upgrade to a patched Kubernetes release. Time synchronization between control plane nodes is also critical for leader elections; use NTP or chrony.
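To verify time synchronization on each control plane node, either of the standard tools can be checked directly:
timedatectl status   # shows whether the system clock is synchronized
chronyc tracking     # chrony's current offset from its time source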