The Kubernetes API server is down or unreachable, blocking all kubectl operations and cluster management. Causes include API server pod crashes, etcd backing store failures, network connectivity problems, or node resource exhaustion. Fix by checking API server logs, validating cluster networking, verifying etcd health, and restarting the control plane if needed.
This critical error means kubectl (or any client) cannot establish a connection to the Kubernetes API server, which is the central control point for the entire cluster. The API server may be completely down (pod crashed), unresponsive (stuck processing requests), or unreachable due to network issues. This blocks all cluster operations including deployments, pod creation, and monitoring.
Check if the kube-apiserver pod exists and is healthy:
# SSH into a control plane node
ssh <control-plane-node>
# Check if API server pod is running
sudo docker ps | grep kube-apiserver
# or for containerd:
sudo crictl ps | grep kube-apiserver
# Check pod status in kube-system namespace (only works if the API server still responds at least intermittently)
kubectl get pod -n kube-system | grep kube-apiserver
# View pod logs
kubectl logs -n kube-system kube-apiserver-<node-name>

If the pod is not running, check for crash loops or termination events.
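When the API server is down, kubectl itself may fail, so the crash-loop check is best done from the node with the container runtime directly. A sketch (container IDs and node names are placeholders):

```shell
# List all containers, including exited ones, to spot repeated restarts
sudo crictl ps -a | grep kube-apiserver
# Inspect the most recent (possibly exited) container's logs for the crash reason
sudo crictl logs <container-id> 2>&1 | tail -50
# If kubectl still works, the Events section shows OOMKills, probe failures, etc.
kubectl describe pod -n kube-system kube-apiserver-<node-name>
```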
If the API server pod is stuck or unresponsive, restart it:
# For kubeadm clusters the API server runs as a static pod, so deleting the
# mirror pod object alone may not restart the container. Move the manifest
# out of the manifests directory and back to force a full restart:
sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sleep 5
sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
# Wait 30-60 seconds for it to come back online
sleep 30
kubectl get pod -n kube-system kube-apiserver-<node-name>

For managed clusters (GKE, AKS, EKS), the control plane is provider-managed: check the provider's status page and open a support case if the control plane is unresponsive.
The API server cannot function without etcd:
# SSH to control plane node
ssh <control-plane-node>
# Check etcd pod status
sudo crictl ps | grep etcd
# Test etcd health (certificate paths shown are the kubeadm defaults)
sudo crictl exec <etcd-container-id> etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health
# Check etcd logs
sudo crictl logs <etcd-container-id> | tail -50

If etcd is down or unhealthy, restart it or investigate backing store issues (disk space, corruption).
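A few checks that narrow down backing-store trouble, run on the etcd node (a sketch; the certificate paths assume a kubeadm layout):

```shell
# Disk space where etcd stores its data; a full disk halts all writes
df -h /var/lib/etcd
# On-disk database size
sudo du -sh /var/lib/etcd/member
# Active alarms; NOSPACE means etcd hit its DB quota and has gone read-only
sudo crictl exec <etcd-container-id> etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  alarm list
```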
Test if the API server endpoint is reachable:
# Get the API server endpoint
grep server ~/.kube/config
# Example: https://192.168.1.10:6443
# Test connectivity from your machine
curl -k https://<api-server-ip>:6443/version
# Test from a pod inside the cluster
kubectl run -it debug --image=alpine --restart=Never -- sh
# Inside pod:
apk add curl
curl -k https://kubernetes.default:6443/version
# On the control plane node, check if port 6443 is listening
sudo ss -tlnp | grep 6443   # or: netstat -tlnp | grep 6443

If unreachable, check firewall rules, security groups, and network policies.
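When curl is not installed, plain bash can probe the port via its `/dev/tcp` feature. A minimal sketch (127.0.0.1 is a stand-in for your API server address):

```shell
# Probe TCP reachability of the API server port; replace host with your
# API server address (127.0.0.1 here is only a placeholder)
host=127.0.0.1
port=6443
if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
  reachable=yes
else
  reachable=no
fi
echo "port ${port} on ${host}: reachable=${reachable}"
```

A "no" here with a healthy API server pod usually points at a firewall or security-group rule rather than the control plane itself.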
Ensure kubeconfig points to the correct API server:
# View current kubeconfig
kubectl config view
# Check the server endpoint
grep server ~/.kube/config
# If incorrect, update it
kubectl config set-cluster kubernetes --server=https://correct-ip:6443
# Verify with a simple test
kubectl cluster-info

For managed clusters, regenerate the kubeconfig with the provider's CLI (gcloud container clusters get-credentials, aws eks update-kubeconfig, az aks get-credentials).
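The kubeconfig regeneration commands for the major providers look like this (cluster, region, and resource-group names are placeholders):

```shell
# Amazon EKS
aws eks update-kubeconfig --name <cluster-name> --region <region>
# Google GKE (requires the gke-gcloud-auth-plugin to be installed)
gcloud container clusters get-credentials <cluster-name> --region <region>
# Azure AKS
az aks get-credentials --resource-group <resource-group> --name <cluster-name>
```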
API server may be killed by OOMKiller if resource-constrained:
# SSH to control plane
ssh <control-plane-node>
# Check system resources
free -h # Memory
df -h # Disk space
# Check if the API server was OOM-killed (kernel log)
journalctl -k | grep -i -E "out of memory|oom"
# Edit API server resource requests/limits. For kubeadm clusters the API
# server is a static pod, not a Deployment; edit its manifest and the
# kubelet restarts it automatically on save
sudo vim /etc/kubernetes/manifests/kube-apiserver.yaml
# Increase resources:
spec:
  containers:
  - name: kube-apiserver
    resources:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        memory: 2Gi

The API server requires valid TLS certificates:
# Check certificate expiry
echo | openssl s_client -connect <api-server-ip>:6443 2>/dev/null | openssl x509 -noout -dates
# If expired, regenerate cluster certificates
kubeadm certs renew all
# Restart control plane components so they pick up the new certificates
sudo systemctl restart kubelet
# kubeadm also rewrites admin.conf; refresh your local kubeconfig
sudo cp /etc/kubernetes/admin.conf ~/.kube/config

For managed clusters, certificate rotation is handled by the cloud provider. Expired certificates are a common cause of API server unavailability.
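The expiry check can also be scripted directly against the certificate file with openssl's `-checkend` flag. The sketch below generates a throwaway one-day certificate purely to demonstrate the check; on a real control plane, point `-in` at /etc/kubernetes/pki/apiserver.crt instead:

```shell
# Demo only: create a self-signed cert valid for a single day
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" -days 1 \
  -keyout /tmp/demo.key -out /tmp/demo.crt 2>/dev/null
# -checkend exits non-zero if the cert expires within the given window
# (2592000 seconds = 30 days), which makes it easy to alert on
if openssl x509 -checkend 2592000 -noout -in /tmp/demo.crt >/dev/null; then
  status="certificate ok"
else
  status="certificate expires within 30 days"
fi
echo "$status"
```

Because the demo cert is only valid for one day, this prints the warning branch; wired into cron against the real apiserver.crt, it gives early warning well before an outage.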
The control plane node may be in trouble:
# SSH to control plane
ssh <control-plane-node>
# Check system status
systemctl status kubelet
systemctl status docker # or containerd
# View kubelet logs
journalctl -u kubelet -n 50
# Check kernel logs for crashes
journalctl -xe | tail -100
# Monitor disk space (etcd issue if full)
df -h /var/lib/etcd
# Restart kubelet if necessary
sudo systemctl restart kubelet

If the node is corrupted, consider recreating the control plane or failing over to another control plane node.
API server unavailability is a critical cluster issue. In production, run a highly available control plane (three or more control plane nodes with a clustered etcd). Monitor control plane health (API server pod status, etcd latency). For kubeadm clusters, keep /etc/kubernetes/manifests backed up so a failed control plane node can be rebuilt. In cloud-managed services (GKE, AKS, EKS) this is rare; if it occurs, check the cloud provider's status page and your firewall/network security group rules. Always keep adequate free disk space on control plane nodes for etcd (roughly 10x the etcd DB size is a good margin).
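The disk-space rule of thumb above is easy to turn into quick shell arithmetic. The 512 MiB DB size below is a made-up example; on a real node, substitute the output of `du -sb /var/lib/etcd | cut -f1`:

```shell
# Placeholder etcd DB size in bytes (512 MiB); measure with du on a real node
db_bytes=536870912
# Recommended free space: ~10x the DB size
needed_bytes=$((db_bytes * 10))
echo "etcd DB: ${db_bytes} bytes; keep at least ${needed_bytes} bytes free"
```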