The "Eviction Hard Threshold" error occurs when a node reaches critical resource thresholds (memory, disk, or inodes) and the kubelet forcibly evicts pods to free resources. When a hard threshold is breached, pods are immediately terminated without graceful shutdown, potentially causing data loss.
Kubernetes nodes have configurable resource thresholds to prevent the node OS from running out of memory or disk. When available resources drop below a hard eviction threshold, the kubelet terminates (evicts) pods immediately to free space. Unlike soft thresholds, which allow a grace period, hard thresholds trigger instant termination. Common hard thresholds:
- Memory: `memory.available < 100Mi` (default)
- Disk: `nodefs.available < 10%` (default)
- Inodes: `nodefs.inodesFree < 5%` (default)
When breached, pods are evicted starting with the lowest priority, causing service disruptions and potential data loss for stateful applications.
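To confirm that hard evictions are actually happening, list recent eviction events (note that events are retained for only about an hour by default):
kubectl get events -A --field-selector reason=Evicted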
View available resources on the node:
kubectl top nodes
kubectl describe node <node-name> # Shows Allocatable and current pressure
kubectl get nodes -o wide
Check for resource pressure conditions:
kubectl describe node <node-name> | grep -A 10 "Conditions:"
Look for MemoryPressure, DiskPressure, PIDPressure conditions marked as True.
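To check pressure conditions across every node at once, a one-liner sketch using jq (assumes jq is installed locally):
kubectl get nodes -o json | jq -r '.items[] | .metadata.name as $n | .status.conditions[] | select(.type | test("Pressure")) | "\($n): \(.type)=\(.status)"'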
SSH into the node and check disk usage:
ssh <node-ip>
df -h # Filesystem disk space usage
du -sh /* # Directory sizes
# Check kubelet root directory:
du -sh /var/lib/kubelet
du -sh /var/log/pods
# Find large files:
find / -type f -size +500M 2>/dev/null
Identify what's consuming disk:
- Pod logs: /var/log/pods
- Container storage: /var/lib/kubelet/pods
- Docker layers: /var/lib/docker (containerd uses /var/lib/containerd)
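To find the noisiest pods quickly, rank the pod log directories by size:
du -sh /var/log/pods/* 2>/dev/null | sort -rh | head -10 # Largest pod log directories first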
View memory consumption:
kubectl describe node <node-name> | grep -E "Allocated|Requested|Limits"
kubectl top nodes --no-headers | awk '{print $1, $5}' # Memory usage %
# On the node itself:
free -h
cat /proc/meminfo
ps aux --sort=-%mem | head -10 # Top memory consumers
Check which pods are using the most memory:
kubectl top pods -A --sort-by=memory
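Pods without memory limits are the usual culprits under memory pressure; to list them, a jq sketch (assumes jq is installed):
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits.memory == null)) | "\(.metadata.namespace)/\(.metadata.name)"'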
Check current eviction thresholds:
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" | jq .kubeletconfig.evictionHard # Live kubelet config
# Check kubelet config:
kubectl get -n kube-system cm kubelet-config -o yaml # kubeadm clusters (named kubelet-config-1.xx before v1.24)
# SSH to node and check:
ssh <node-ip>
ps aux | grep kubelet # Show kubelet command line args
cat /var/lib/kubelet/config.yaml # Kubelet config file (kubeadm default path)
Default eviction thresholds:
- --eviction-hard=memory.available<100Mi,nodefs.available<10%,imagefs.available<15%,nodefs.inodesFree<5% (defaults)
- --eviction-soft=memory.available<500Mi,nodefs.available<5% (example; the kubelet sets no soft thresholds by default)
- --eviction-soft-grace-period=memory.available=1m
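In the kubelet config file, the same hard thresholds appear as the evictionHard map; for example (values shown are the defaults):
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"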
Free up space on the node:
ssh <node-ip> # Or, if SSH is unavailable: kubectl debug node/<node-name> -it --image=ubuntu
# Remove old container images (careful!):
docker image prune -a --filter "until=168h" # Docker nodes: remove images older than 7 days
crictl rmi --prune # containerd/CRI-O nodes: remove all unused images
# Clear old pod logs:
find /var/log/pods -type f -name "*.log" -mtime +7 -delete # Logs older than 7 days
# Clear kubelet temporary files:
rm -rf /var/lib/kubelet/pods/*/volume-subpaths/*
# Clear old evicted pod data:
kubectl get pods -A --field-selector=status.phase=Failed -o json | \
jq -r '.items[] | select(.status.reason=="Evicted") | "\(.metadata.namespace) \(.metadata.name)" ' | \
xargs -I {} sh -c 'kubectl delete pod -n {} --grace-period=0 --force'
Verify space was freed: df -h
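A blunter alternative that needs no jq: delete every Failed pod in all namespaces (this also removes failures that were not evictions):
kubectl delete pods -A --field-selector=status.phase=Failed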
Update deployments with memory and CPU limits:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bounded-app
spec:
  selector:
    matchLabels:
      app: bounded-app
  template:
    metadata:
      labels:
        app: bounded-app
    spec:
      containers:
      - name: app
        image: myapp:latest
        resources:
          requests:
            memory: 256Mi
            cpu: 100m
          limits:
            memory: 512Mi # Container cannot exceed 512Mi
            cpu: 500m
Apply:
kubectl apply -f deployment.yaml
Pods exceeding their memory limit are OOMKilled (restarted). Set limits conservatively to allow headroom.
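To enforce defaults across a whole namespace instead of per Deployment, a LimitRange injects requests and limits into containers that omit them. A minimal sketch (the namespace and values are illustrative):
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-namespace # Illustrative namespace
spec:
  limits:
  - type: Container
    default: # Applied as the limit when a container sets none
      memory: 512Mi
      cpu: 500m
    defaultRequest: # Applied as the request when a container sets none
      memory: 256Mi
      cpu: 100m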
Prevent logs from filling the disk:
ssh <node-ip>
# Check current container log rotation settings in the kubelet config:
grep -i containerLogMax /var/lib/kubelet/config.yaml
# Edit the kubelet config (the kubelet runs as a systemd service, not a pod):
sudo nano /var/lib/kubelet/config.yaml
Add log rotation:
containerLogMaxSize: 10Mi # Rotate each container log file at 10 MiB
containerLogMaxFiles: 3 # Keep at most 3 rotated files per container
Then restart the kubelet: sudo systemctl restart kubelet
These kubelet settings handle rotation for CRI runtimes such as containerd. If the node runs Docker with the json-file log driver, configure rotation in /etc/docker/daemon.json instead:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
Restart Docker after editing: sudo systemctl restart docker
If your workload genuinely needs more headroom before eviction, you can increase the thresholds (not recommended as a first choice). Eviction thresholds are kubelet settings and cannot be changed through the Node object or annotations, so edit the kubelet configuration on the node itself:
ssh <node-ip>
sudo nano /var/lib/kubelet/config.yaml
Add or adjust the evictionHard map:
evictionHard:
  memory.available: "50Mi"
  nodefs.available: "1%"
OR set the flags via the kubelet environment file:
sudo nano /etc/sysconfig/kubelet # /etc/default/kubelet on Debian/Ubuntu
Update:
KUBELET_EXTRA_ARGS="--eviction-hard=memory.available<50Mi,nodefs.available<1%"
Restart kubelet:
sudo systemctl restart kubelet
Hard eviction thresholds are a safety mechanism and should rarely be reached in properly configured clusters. The best solution is to right-size nodes and limit pod resource consumption. Keep in mind:
- Soft thresholds evict with a configurable grace period (1 minute in the example above), allowing pods to shut down gracefully; hard thresholds do not. The kubelet configures no soft thresholds by default.
- Don't lower hard thresholds to the point where only a few percent of capacity remains free; that leaves no buffer for the OS and the kubelet itself.
- In cloud environments, enable cluster autoscaling so nodes are added before eviction thresholds are hit.
- For stateful apps (databases, message queues), pod eviction can cause data corruption; avoid hard eviction through proper capacity planning.
- Configure container log rotation to prevent unbounded log growth; unrotated logs fill disks quickly in high-log-volume environments.
- Use LimitRanges and ResourceQuotas to enforce resource limits across namespaces.
- Monitor node disk and memory pressure with Prometheus metrics such as node_filesystem_avail_bytes and node_memory_MemAvailable_bytes.
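Those two metrics translate directly into alerting. A Prometheus rule sketch, assuming node-exporter metrics are available (the alert names and thresholds are illustrative):
groups:
- name: node-pressure
  rules:
  - alert: NodeDiskAlmostFull
    expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} < 0.15
    for: 10m
    annotations:
      summary: "Node {{ $labels.instance }} has less than 15% disk free (eviction risk)"
  - alert: NodeMemoryLow
    expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.10
    for: 5m
    annotations:
      summary: "Node {{ $labels.instance }} has less than 10% memory available (eviction risk)"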