The "Eviction Hard Threshold" error occurs when a node reaches critical resource thresholds (memory, disk, or inodes) and the kubelet forcibly evicts pods to free resources. When a hard threshold is breached, pods are immediately terminated without graceful shutdown, potentially causing data loss.
Kubernetes nodes have configurable resource thresholds to prevent the node OS from running out of memory or disk. When available resources drop below a hard eviction threshold, the kubelet terminates (evicts) pods immediately to free space. Unlike soft thresholds, which allow a grace period, hard thresholds trigger instant termination. Common hard thresholds:
- Memory: `memory.available < 100Mi` (default)
- Disk: `nodefs.available < 10%` (default)
- Inodes: `nodefs.inodesFree < 5%` (default)
When breached, pods are evicted starting with the lowest priority, causing service disruptions and potential data loss for stateful applications.
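To confirm that hard evictions are actually happening, list recent eviction events (note that events are retained for only about an hour by default):
kubectl get events -A --field-selector reason=Evicted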
View available resources on the node:
kubectl top nodes
kubectl describe node <node-name> # Shows Allocatable and current pressure
kubectl get nodes -o wide
Check for resource pressure conditions:
kubectl describe node <node-name> | grep -A 10 "Conditions:"
Look for MemoryPressure, DiskPressure, PIDPressure conditions marked as True.
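To check pressure conditions across every node at once, a one-liner sketch using jq (assumes jq is installed locally):
kubectl get nodes -o json | jq -r '.items[] | .metadata.name as $n | .status.conditions[] | select(.type | test("Pressure")) | "\($n): \(.type)=\(.status)"'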
SSH into the node and check disk usage:
ssh <node-ip>
df -h # Filesystem disk space usage
du -sh /* # Directory sizes
# Check kubelet root directory:
du -sh /var/lib/kubelet
du -sh /var/log/pods
# Find large files:
find / -type f -size +500M 2>/dev/null
Identify what's consuming disk:
- Pod logs: /var/log/pods
- Container storage: /var/lib/kubelet/pods
- Docker layers: /var/lib/docker (containerd uses /var/lib/containerd)
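To find the noisiest pods quickly, rank the pod log directories by size:
du -sh /var/log/pods/* 2>/dev/null | sort -rh | head -10 # Largest pod log directories first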
View memory consumption:
kubectl describe node <node-name> | grep -E "Allocated|Requested|Limits"
kubectl top nodes --no-headers | awk '{print $1, $5}' # Memory usage %
# On the node itself:
free -h
cat /proc/meminfo
ps aux --sort=-%mem | head -10 # Top memory consumers
Check which pods are using the most memory:
kubectl top pods -A --sort-by=memory
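Pods without memory limits are the usual culprits under memory pressure; to list them, a jq sketch (assumes jq is installed):
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits.memory == null)) | "\(.metadata.namespace)/\(.metadata.name)"'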
Check current eviction thresholds:
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" | jq .kubeletconfig.evictionHard # Live kubelet config
# Check kubelet config:
kubectl get -n kube-system cm kubelet-config -o yaml # kubeadm clusters (named kubelet-config-1.xx before v1.24)
# SSH to node and check:
ssh <node-ip>
ps aux | grep kubelet # Show kubelet command line args
cat /var/lib/kubelet/config.yaml # Kubelet config file (kubeadm default path)
Default eviction thresholds:
- --eviction-hard=memory.available<100Mi,nodefs.available<10%,imagefs.available<15%,nodefs.inodesFree<5% (defaults)
- --eviction-soft=memory.available<500Mi,nodefs.available<5% (example; the kubelet sets no soft thresholds by default)
- --eviction-soft-grace-period=memory.available=1m
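In the kubelet config file, the same hard thresholds appear as the evictionHard map; for example (values shown are the defaults):
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"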
Free up space on the node:
ssh <node-ip> # Or, if SSH is unavailable: kubectl debug node/<node-name> -it --image=ubuntu
# Remove old container images (careful!):
docker image prune -a --filter "until=168h" # Docker nodes: remove images older than 7 days
crictl rmi --prune # containerd/CRI-O nodes: remove all unused images
# Clear old pod logs:
find /var/log/pods -type f -name "*.log" -mtime +7 -delete # Logs older than 7 days
# Clear kubelet temporary files:
rm -rf /var/lib/kubelet/pods/*/volume-subpaths/*
# Clear old evicted pod data:
kubectl get pods -A --field-selector=status.phase=Failed -o json | \
jq -r '.items[] | select(.status.reason=="Evicted") | "\(.metadata.namespace) \(.metadata.name)" ' | \
xargs -I {} sh -c 'kubectl delete pod -n {} --grace-period=0 --force'
Verify space was freed: df -h
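A blunter alternative that needs no jq: delete every Failed pod in all namespaces (this also removes failures that were not evictions):
kubectl delete pods -A --field-selector=status.phase=Failed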
Update deployments with memory and CPU limits:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bounded-app
spec:
  selector:
    matchLabels:
      app: bounded-app
  template:
    metadata:
      labels:
        app: bounded-app
    spec:
      containers:
      - name: app
        image: myapp:latest
        resources:
          requests:
            memory: 256Mi
            cpu: 100m
          limits:
            memory: 512Mi # Container cannot exceed 512Mi
            cpu: 500m
Apply:
kubectl apply -f deployment.yaml
Pods exceeding their memory limit are OOMKilled (restarted). Set limits conservatively to allow headroom.
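To enforce defaults across a whole namespace instead of per Deployment, a LimitRange injects requests and limits into containers that omit them. A minimal sketch (the namespace and values are illustrative):
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-namespace # Illustrative namespace
spec:
  limits:
  - type: Container
    default: # Applied as the limit when a container sets none
      memory: 512Mi
      cpu: 500m
    defaultRequest: # Applied as the request when a container sets none
      memory: 256Mi
      cpu: 100m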
Prevent logs from filling the disk:
ssh <node-ip>
# Check current container log rotation settings in the kubelet config:
grep -i containerLogMax /var/lib/kubelet/config.yaml
# Edit the kubelet config (the kubelet runs as a systemd service, not a pod):
sudo nano /var/lib/kubelet/config.yaml
Add log rotation:
containerLogMaxSize: 10Mi # Rotate each container log file at 10 MiB
containerLogMaxFiles: 3 # Keep at most 3 rotated files per container
Then restart the kubelet: sudo systemctl restart kubelet
These kubelet settings handle rotation for CRI runtimes such as containerd. If the node runs Docker with the json-file log driver, configure rotation in /etc/docker/daemon.json instead:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
Restart Docker after editing: sudo systemctl restart docker
If your workload genuinely needs more headroom before eviction, you can increase the thresholds (not recommended as a first choice). Eviction thresholds are kubelet settings and cannot be changed through the Node object or annotations, so edit the kubelet configuration on the node itself:
ssh <node-ip>
sudo nano /var/lib/kubelet/config.yaml
Add or adjust the evictionHard map:
evictionHard:
  memory.available: "50Mi"
  nodefs.available: "1%"
OR set the flags via the kubelet environment file:
sudo nano /etc/sysconfig/kubelet # /etc/default/kubelet on Debian/Ubuntu
Update:
KUBELET_EXTRA_ARGS="--eviction-hard=memory.available<50Mi,nodefs.available<1%"
Restart kubelet:
sudo systemctl restart kubelet
Hard eviction thresholds are a safety mechanism and should rarely be reached in properly configured clusters. The best solution is to right-size nodes and limit pod resource consumption. Keep in mind:
- Soft thresholds evict with a configurable grace period (1 minute in the example above), allowing pods to shut down gracefully; hard thresholds do not. The kubelet configures no soft thresholds by default.
- Don't lower hard thresholds to the point where only a few percent of capacity remains free; that leaves no buffer for the OS and the kubelet itself.
- In cloud environments, enable cluster autoscaling so nodes are added before eviction thresholds are hit.
- For stateful apps (databases, message queues), pod eviction can cause data corruption; avoid hard eviction through proper capacity planning.
- Configure container log rotation to prevent unbounded log growth; unrotated logs fill disks quickly in high-log-volume environments.
- Use LimitRanges and ResourceQuotas to enforce resource limits across namespaces.
- Monitor node disk and memory pressure with Prometheus metrics such as node_filesystem_avail_bytes and node_memory_MemAvailable_bytes.
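Those two metrics translate directly into alerting. A Prometheus rule sketch, assuming node-exporter metrics are available (the alert names and thresholds are illustrative):
groups:
- name: node-pressure
  rules:
  - alert: NodeDiskAlmostFull
    expr: node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} < 0.15
    for: 10m
    annotations:
      summary: "Node {{ $labels.instance }} has less than 15% disk free (eviction risk)"
  - alert: NodeMemoryLow
    expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.10
    for: 5m
    annotations:
      summary: "Node {{ $labels.instance }} has less than 10% memory available (eviction risk)"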