DiskPressure indicates a node is running critically low on available disk space. Kubernetes stops scheduling new pods onto the node, and the kubelet begins evicting existing pods to reclaim disk space.
DiskPressure is a Kubernetes node condition that indicates a node is running critically low on available disk space. The kubelet continuously tracks disk usage on two filesystems: the node filesystem (nodefs), which the kubelet uses for pod volumes and logs, and the image filesystem (imagefs), which the container runtime uses for container images and writable container layers. When disk usage crosses the configured eviction threshold (by default nodefs.available<10% and imagefs.available<15%), the kubelet marks the node with DiskPressure=True and taints it with node.kubernetes.io/disk-pressure, so the scheduler stops placing new pods there.
Once DiskPressure is triggered, the kubelet initiates node-pressure eviction. It attempts to free disk space in a specific order: first by garbage collecting dead pods and containers, then by deleting unused container images. If disk pressure persists, the kubelet evicts running pods, prioritizing those whose disk usage exceeds their requests. This condition is particularly critical because it can impact essential cluster components like the API server, etcd, and the kubelet itself, potentially causing cascading failures across your entire Kubernetes cluster.
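To see at a glance which nodes currently report the condition, a jsonpath query can print each node's DiskPressure status (a convenience one-liner; the step-by-step diagnosis follows below):
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'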
Run kubectl get nodes to list all nodes, then check which ones have DiskPressure:
kubectl describe nodes | grep -A 5 DiskPressure
For a specific node:
kubectl describe node <node-name>
Look for DiskPressure=True in the Conditions section. Also check the kubelet logs:
ssh <node-ip>
sudo journalctl -u kubelet -n 100
SSH into the affected node and use crictl to remove unused images:
ssh <node-ip>
# List all images
sudo crictl images
# Remove unused images (prune)
sudo crictl rmi --prune
For Docker-based nodes:
sudo docker image prune -a --force
This will delete all dangling images and images not used by any container. Check the freed space with:
df -h /var/lib/kubelet
Adjust the kubelet arguments to prevent disk pressure from occurring so quickly. Edit the kubelet service configuration:
ssh <node-ip>
sudo nano /etc/default/kubelet
Add or modify these arguments:
KUBELET_EXTRA_ARGS="--image-gc-high-threshold=70 --image-gc-low-threshold=60 --eviction-hard=nodefs.available<5Gi --eviction-soft=nodefs.available<10Gi --eviction-soft-grace-period=nodefs.available=2m"
Restart the kubelet:
sudo systemctl restart kubelet
Container logs (under /var/log/pods) and pod ephemeral data (under /var/lib/kubelet/pods) can grow very large. Configure log rotation. For Docker, edit /etc/docker/daemon.json:
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "10"
  }
}
Restart Docker:
sudo systemctl restart docker
Also apply log rotation at the OS level using logrotate:
sudo tee /etc/logrotate.d/kubernetes > /dev/null << EOF
/var/log/pods/*/*/*.log {
  rotate 5
  daily
  compress
  delaycompress
  missingok
  notifempty
}
EOF
Update your pod manifests to define ephemeral storage limits:
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:latest
    resources:
      requests:
        ephemeral-storage: "1Gi"
      limits:
        ephemeral-storage: "5Gi"
The ephemeral storage limit should account for container logs. If log rotation is set to max-size=100m with max-file=10, set the ephemeral limit to at least 1Gi.
If recurring disk pressure indicates insufficient total disk capacity:
For cloud-managed nodes (EKS, GKE, AKS), scale up the node group to use larger instances with more disk.
For on-premises/self-managed clusters, add storage to nodes or provision new nodes with larger disks.
# Check current disk usage
df -h /var/lib/kubelet
# Drain the node before making changes
kubectl drain <old-node> --ignore-daemonsets --delete-emptydir-data
Monitor the migration to ensure pods are successfully rescheduled.
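As a quick check that nothing is left behind on the drained node, and to return a resized node to service (using <old-node> as a placeholder for your node name):
# Confirm no application pods remain on the drained node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=<old-node>
# After resizing or replacing the disk, allow scheduling again
kubectl uncordon <old-node>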
The kubelet supports both soft and hard eviction thresholds. A soft threshold only triggers eviction after it has been exceeded for a configured grace period, and evicted pods are given time to terminate gracefully (capped by --eviction-max-pod-grace-period); a hard threshold triggers immediate eviction with no grace period. Configure soft eviction with: --eviction-soft=nodefs.available<10Gi --eviction-soft-grace-period=nodefs.available=2m.
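On nodes where the kubelet reads a configuration file (for example /var/lib/kubelet/config.yaml on kubeadm-provisioned nodes), the same thresholds can be expressed there instead of command-line flags. A minimal sketch, assuming that file path and the example values used above:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Evict immediately when less than 5Gi remains
evictionHard:
  nodefs.available: "5Gi"
# Evict only if less than 10Gi remains for longer than 2 minutes
evictionSoft:
  nodefs.available: "10Gi"
evictionSoftGracePeriod:
  nodefs.available: "2m"
# Cap the termination grace period used during soft evictions
evictionMaxPodGracePeriod: 60
Restart the kubelet after editing the file for the changes to take effect.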
When image GC runs, it first marks images as unused if no container references them, then deletes them starting with the least recently used. However, if the node is already critically low on space, the kubelet may not be able to free enough through image GC alone. To prevent this, set --image-gc-high-threshold to a disk-usage percentage 10-15 points below the level at which your eviction threshold fires, so image GC kicks in well before eviction does.
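The image GC settings are also available as KubeletConfiguration fields if you prefer the config file over flags; for example, with the illustrative values from the flags shown earlier:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Start image GC when disk usage reaches 70%...
imageGCHighThresholdPercent: 70
# ...and keep deleting images until usage drops to 60%
imageGCLowThresholdPercent: 60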
Container logs, writable container layers, and emptyDir volumes all consume ephemeral storage (local node disk), while data stored in PersistentVolumes does not count against node-level limits. If your pods need to write large temporary datasets, explicitly request ephemeral storage.
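For scratch data, a size-limited emptyDir volume keeps a single pod from filling the node disk. A minimal sketch (the pod, volume, and mount names are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: my-app:latest
    volumeMounts:
    - name: scratch
      mountPath: /tmp/scratch
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 2Gi   # the pod is evicted if it writes more than this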
Set up Prometheus alerts for disk usage trending upward. Query metrics such as kubelet_volume_stats_used_bytes (from the kubelet) and node_filesystem_avail_bytes (from node-exporter) to catch pressure before it becomes critical. Consider setting a Kubernetes ResourceQuota on ephemeral-storage at the namespace level, as shown below.
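A namespace-level quota on ephemeral storage might look like the following (the name, namespace, and values are illustrative; size them per workload):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-storage-quota
  namespace: my-namespace
spec:
  hard:
    # Total ephemeral-storage requests and limits allowed across the namespace
    requests.ephemeral-storage: 10Gi
    limits.ephemeral-storage: 20Gi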