PIDPressure indicates the node is running out of available process IDs (PIDs). Once the limit is reached, new processes and containers cannot start. This typically affects nodes running many concurrent processes or containers without per-pod PID limits.
The kubelet sets PIDPressure=True when the node's available PIDs fall below its eviction threshold. The Linux kernel has a configurable maximum PID count, exposed at /proc/sys/kernel/pid_max. When this limit is reached:
1. New processes cannot be spawned
2. Container startups fail with "cannot allocate memory" or "fork() failed" errors
3. New pods cannot start
4. The kubelet itself may become unresponsive
This is distinct from memory pressure: the system has free memory but cannot create new process structures.
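The kubelet derives this condition from its pid.available eviction signal, which can be tuned in the kubelet configuration. A minimal sketch, assuming a file-based KubeletConfiguration; the 10% threshold is illustrative, and setting evictionHard may replace the kubelet's default thresholds, so include any others you rely on:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Report PIDPressure and start evicting when fewer than 10% of PIDs remain available
evictionHard:
  pid.available: "10%"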
Check node status:
kubectl get nodes
kubectl describe node <node-name> | grep PIDPressure
Look for "PIDPressure: True". Also check:
kubectl get nodes -o custom-columns='NAME:.metadata.name,PID:.status.conditions[?(@.type=="PIDPressure")].status'
SSH into the affected node:
cat /proc/sys/kernel/pid_max # Maximum PIDs allowed
ps aux | wc -l # Approximate current process count
ps -eo pid= | wc -l # Exact process count (no header line)
Alternatively, count threads, since each thread consumes a PID:
grep -s ^Threads /proc/[0-9]*/status | awk '{sum += $2} END {print sum}' # Total threads
If current usage exceeds 85-90% of pid_max, act before the pressure condition triggers.
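To turn those raw numbers into a utilization figure, the two can be combined into one small sketch (it counts threads, since each thread occupies a PID slot):
used=$(grep -s ^Threads /proc/[0-9]*/status | awk '{sum += $2} END {print sum}')
max=$(cat /proc/sys/kernel/pid_max)
echo "PID usage: ${used}/${max} ($(( used * 100 / max ))%)"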
Find which processes/containers use the most PIDs:
for p in /proc/[0-9]*/; do
echo "$(basename $p): $(cat $p/status 2>/dev/null | grep ^Threads | awk '{print $2}')"
done | sort -t: -k2 -rn | head -10
Or use pstree:
pstree -p | head -50 # Process tree showing all PIDs
ps -eo comm= | sort | uniq -c | sort -rn | head -10 # Process count per command
Look for container processes spawning many children (e.g., shell, Python with multiprocessing).
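To tie a heavy PID back to a pod, inspect its cgroup path or go through the container runtime. The commands below assume containerd with crictl installed on the node; <pid> and <container-id> are placeholders:
cat /proc/<pid>/cgroup # The cgroup path usually embeds the pod UID and container ID
crictl ps # List running containers
crictl inspect <container-id> | grep -i '"pid"' # The container's init PID on the host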
Zombie processes consume PIDs until reaped:
ps aux | grep "<defunct>" # Show zombies
ps aux | grep -c "<defunct>" # Count zombies
To see which parent created zombies:
ps -eo ppid=,stat= | awk '$2 ~ /^Z/ {print $1}' | sort | uniq -c | sort -rn # Zombie count per parent PID
If many zombies exist, the parent process isn't reaping its children. This is a bug in the application or container runtime:
- Restart the parent process
- Restart the container runtime (Docker/containerd)
- Kill the zombie's parent: kill -9 <ppid> (the zombies are re-parented to init/PID 1 and reaped)
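If the application itself cannot be fixed quickly, one Kubernetes-level mitigation is to share the pod's PID namespace so the pause container acts as PID 1 and reaps orphaned zombies. A minimal sketch (shareProcessNamespace is a standard Pod spec field; the name and image are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: zombie-prone-app
spec:
  # The pause container becomes PID 1 for all containers and reaps zombies
  shareProcessNamespace: true
  containers:
  - name: app
    image: myapp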
Check and increase pid_max:
cat /proc/sys/kernel/pid_max # Current limit
sudo sysctl kernel.pid_max # View via sysctl
Increase it:
# Temporary (until reboot)
sudo sysctl -w kernel.pid_max=4194303 # Maximum value on 64-bit
# Permanent (survives reboot)
echo "kernel.pid_max = 4194303" | sudo tee -a /etc/sysctl.d/99-kubernetes.conf
sudo sysctl -p /etc/sysctl.d/99-kubernetes.conf
Verify:
cat /proc/sys/kernel/pid_max
For production, use this value in your node provisioning/IaC (Terraform, Ansible).
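If you provision nodes with Ansible, for example, the sysctl module captures the same change. A sketch, assuming the ansible.posix collection is installed:
- name: Raise kernel.pid_max on Kubernetes nodes
  ansible.posix.sysctl:
    name: kernel.pid_max
    value: "4194303"
    sysctl_file: /etc/sysctl.d/99-kubernetes.conf
    state: present
    reload: true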
To keep a runaway container contained, set resource limits on the pod. Note that the Pod spec has no PID limit field:
apiVersion: v1
kind: Pod
metadata:
  name: limited-app
spec:
  securityContext:
    runAsUser: 1000
  containers:
  - name: app
    image: myapp
    resources:
      limits:
        cpu: "1"
        memory: "512Mi"
    # Note: bounds CPU and memory only; PIDs need a kubelet-level limit (see below)
For an actual per-pod PID cap, configure the kubelet's podPidsLimit, which it enforces through the pids cgroup controller; see the sketch below.
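A minimal KubeletConfiguration sketch (podPidsLimit is the standard field; the 1024 value is illustrative and should be sized to your workloads):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Maximum number of PIDs any single pod may use, enforced via the pids cgroup controller
podPidsLimit: 1024
Restart the kubelet after changing its configuration file (or use the --pod-max-pids flag if you configure the kubelet via flags).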
With containerd, /etc/containerd/config.toml does not expose a per-container PID limit; the pod-level cap set by the kubelet is what gets enforced. With Docker, you can additionally set a default nproc ulimit in /etc/docker/daemon.json:
{
  "default-ulimits": {
    "nproc": {
      "Name": "nproc",
      "Hard": 1024,
      "Soft": 1024
    }
  }
}
Restart the container runtime after changes. Keep in mind that nproc is a per-user process limit, not a strict per-container PID cap.
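Typical restart commands, assuming systemd-managed runtimes; drain the node first if the workloads are sensitive to disruption:
kubectl drain <node-name> --ignore-daemonsets
sudo systemctl restart containerd # or: sudo systemctl restart docker
kubectl uncordon <node-name>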
If a specific pod is consuming excessive PIDs, terminate it:
# First, identify the pod
kubectl get pods --all-namespaces
# Delete the pod
kubectl delete pod <pod-name> -n <namespace>
For Deployments, the pod will be recreated. If you want to prevent recreation:
kubectl delete deployment <deployment-name>
For temporary relief, scale down workloads:
kubectl scale deployment <name> --replicas=0
Then investigate why the pod uses so many PIDs before scaling back up.
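One way to check where the PIDs go, assuming the container image includes a shell (<pod-name> and <namespace> are placeholders):
# Count the processes running inside the suspect container via /proc (works without ps)
kubectl exec <pod-name> -n <namespace> -- sh -c 'ls -d /proc/[0-9]* | wc -l'
# If the image ships ps, list processes with their parents
kubectl exec <pod-name> -n <namespace> -- ps -ef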
Set up monitoring and alerting:
# Check PID usage regularly
watch -n 5 "ps aux | wc -l"
Enable kubelet event alerts:
kubectl get events -A --sort-by=.metadata.creationTimestamp | grep -i pid
Add Prometheus monitoring; kube-state-metrics exports the node condition:
kube_node_status_condition{condition="PIDPressure",status="true"}
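If you run the Prometheus Operator, the same expression can drive an alert. A sketch; the rule name and 5-minute window are illustrative:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-pid-pressure
spec:
  groups:
  - name: node-pressure
    rules:
    - alert: NodePIDPressure
      expr: kube_node_status_condition{condition="PIDPressure",status="true"} == 1
      for: 5m
      annotations:
        summary: "Node {{ $labels.node }} reports PIDPressure"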
Implement resource limits on deployments (CPU and memory limits do not cap PIDs, but they keep runaway workloads contained):
resources:
  limits:
    cpu: "2"
    memory: "1Gi"
For applications that spawn many processes, use process pooling or async patterns to reduce PID usage. Review application logs for unexpected process creation.
PIDPressure is less common than memory or disk pressure, but it appears more often in clusters running batch jobs or message queues that spawn many workers. Bash and Python applications using multiprocessing are major PID consumers. Docker Desktop on macOS/Windows may have an artificially low pid_max inside its VM; increase it in Docker Desktop settings. With systemd and cgroup v2, PID limits are enforced per cgroup hierarchy via the pids controller. Zombie processes are a sign of poor process reaping; ensure parent processes handle SIGCHLD properly. In high-throughput applications, consider event-driven architectures (goroutines in Go, async/await in Python) over process-per-task models. WSL2 distributions may also ship with a low kernel.pid_max, so check /proc/sys/kernel/pid_max there as well. Long-running containers can accumulate PIDs from terminated but un-reaped processes; monitor container exit codes and restart policies. Kubernetes has no native PID requests/limits in the Pod spec the way it does for memory and CPU, so PID management is primarily a node-level (kubelet and runtime) concern.
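On a cgroup v2 node using the systemd cgroup driver, the pids controller can be inspected directly. The kubepods.slice path below is typical but varies by cgroup driver and distro, and the files only exist where the controller is enabled:
cat /sys/fs/cgroup/kubepods.slice/pids.current # PIDs currently used by the Kubernetes pods subtree
cat /sys/fs/cgroup/kubepods.slice/pids.max # Limit for that subtree ("max" means unlimited)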