DaemonSet pods may time out during creation or get stuck in a not-ready state due to node issues, resource constraints, or broken admission webhooks. Fix it by checking node health, reviewing admission controller logs, ensuring sufficient resources, and resolving node taints.
DaemonSets ensure a pod runs on every eligible node. When a DaemonSet pod creation times out, it means the kubelet on that node couldn't create the pod within the default timeout period (usually 5 minutes). Common causes include:
- Node unable to pull the container image
- Node resource exhaustion (memory, disk, inodes)
- Admission webhooks rejecting pod creation
- Node kernel issues or instability
- Network problems preventing image pulls
Unlike Deployments, you can't simply delete and recreate DaemonSet pods; they're automatically recreated by the DaemonSet controller.
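Before digging into individual nodes, it helps to confirm how far behind the DaemonSet actually is. A quick rollout and event check (names in angle brackets are placeholders) usually points at the node or webhook blocking creation:
# Report how far the DaemonSet rollout has progressed across nodes:
kubectl rollout status daemonset/<daemonset-name> -n <namespace>
# Recent events often name the node or webhook that is blocking creation:
kubectl describe daemonset -n <namespace> <daemonset-name> | tail -20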
Identify which nodes have issues:
# See all nodes and their conditions:
kubectl get nodes -o wide
kubectl describe nodes
# Focus on nodes without the DaemonSet pod:
kubectl get pods -n <namespace> -o wide | grep <daemonset-name>
# Check specific node conditions:
kubectl describe node <node-name>
Look for conditions like MemoryPressure, DiskPressure, or NotReady.
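If the cluster has many nodes, a one-line summary of each node's active conditions is easier to scan than repeated describe output. This is a sketch using kubectl's jsonpath output; a healthy node should show only "Ready":
# Print each node followed by every condition currently reporting True:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{range .status.conditions[?(@.status=="True")]}{.type}{" "}{end}{"\n"}{end}'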
SSH into the problematic node and check kubelet logs:
# View kubelet logs:
sudo journalctl -u kubelet -n 100 -f
# Or check syslog:
sudo tail -100 /var/log/syslog | grep kubelet
# Look for timeout-related messages:
# - "timeout"
# - "deadline"
# - "context canceled"
# - "webhook"This reveals the exact failure reason.
Check if node is under memory/disk pressure:
# Check from control plane:
kubectl describe node <node-name> | grep -A 5 "Allocated resources"
# SSH to node and check:
sudo df -h # Disk usage
free -h # Memory
sudo df -i # Inode usage per filesystem
# If disk usage is above 85%, clean up:
sudo docker system prune -a
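# If the node runs containerd instead of Docker, prune unused images with
# crictl instead (the --prune flag is available in recent crictl releases):
sudo crictl rmi --prune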
sudo rm -rf /var/cache/*
If webhooks are timing out, check their status:
# List ValidatingWebhooks:
kubectl get validatingwebhookconfigurations
# Check webhook details:
kubectl describe validatingwebhookconfigurations <webhook-name>
# Check webhook service is running:
kubectl get endpoints -n <webhook-namespace> <webhook-service-name>
# Check webhook pod logs:
kubectl logs -n <webhook-namespace> <webhook-pod>
If the webhook is slow or unresponsive, increase timeoutSeconds or disable it temporarily.
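If a misbehaving webhook is blocking pod creation cluster-wide and you need immediate relief, one option is to temporarily set its failurePolicy to Ignore. The sketch below patches the first webhook entry; index 0 is an assumption, so adjust the path for your configuration:
# Temporarily let requests through when the webhook fails or times out:
kubectl patch validatingwebhookconfiguration <webhook-name> --type='json' \
  -p='[{"op": "replace", "path": "/webhooks/0/failurePolicy", "value": "Ignore"}]'
# Remember to revert to "Fail" once the webhook is healthy again.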
Ensure the DaemonSet image can be pulled on all nodes:
# Check the image in the DaemonSet:
kubectl get daemonset -n <namespace> <name> -o yaml | grep -i image
# Try pulling the image manually on the node:
sudo crictl pull <image> # For containerd
sudo docker pull <image> # For Docker
# Check image pull secrets:
kubectl get daemonset -n <namespace> <name> -o yaml | grep -i imagePullSecret
If pull fails, fix credentials or image references.
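If the failure is an authentication error against a private registry, recreate the pull secret and confirm the DaemonSet's pod template references it. The secret name regcred and the flag values below are placeholders:
# Create (or recreate) the registry credential in the DaemonSet's namespace:
kubectl create secret docker-registry regcred \
  --docker-server=<registry-url> \
  --docker-username=<username> \
  --docker-password=<password> \
  -n <namespace>
# Confirm the pod template lists it under imagePullSecrets:
kubectl get daemonset -n <namespace> <name> -o jsonpath='{.spec.template.spec.imagePullSecrets}'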
Sometimes kubelet becomes unresponsive. Restart it:
# SSH to node:
sudo systemctl restart kubelet
# Monitor restart:
sudo systemctl status kubelet
sudo journalctl -u kubelet -f
# From control plane, watch pod creation:
kubectl get pods -n <namespace> -w | grep <daemonset-name>
Give the pod 2-3 minutes to be created and start.
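Recent events for the namespace confirm whether the recreated pod is progressing or failing again:
# Show the most recent events, newest last:
kubectl get events -n <namespace> --sort-by=.lastTimestamp | tail -20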
If the problem is a slow admission webhook:
# Edit the webhook configuration:
kubectl edit validatingwebhookconfigurations <webhook-name>
# Increase timeoutSeconds (default is 10s):
webhooks:
  - name: webhook.example.com
    timeoutSeconds: 30  # Increased from 10
    clientConfig: {}
After increasing the timeout, DaemonSet pods should get created successfully.
Once pods are created, check they're actually ready:
# Check all DaemonSet pods:
kubectl get pods -n <namespace> -l app=<daemonset-label> -o wide
# Check individual pod status:
kubectl describe pod -n <namespace> <pod-name>
# Check pod logs:
kubectl logs -n <namespace> <pod-name>
Pods should show Running with all containers ready.
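A quick way to confirm the rollout is fully healthy is to compare the DaemonSet's desired and ready counts straight from its status; the two numbers should match:
# Desired vs. ready pod counts:
kubectl get daemonset -n <namespace> <name> -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}{"\n"}'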
### DaemonSet Pod Lifecycle
DaemonSet pods have special handling:
1. Created automatically on every node matching the DaemonSet's node selector
2. Recreated immediately by the DaemonSet controller if deleted, so deletion alone doesn't fix anything
3. Given default tolerations for node-condition taints (not-ready, unreachable, memory/disk/PID pressure, unschedulable); custom taints still require explicit tolerations
4. Can be delayed by node initialization scripts
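You can see the tolerations the controller injects by inspecting any of the DaemonSet's pods (the pod name below is a placeholder):
# Show the tolerations automatically added to a DaemonSet pod:
kubectl get pod -n <namespace> <daemonset-pod-name> -o jsonpath='{.spec.tolerations}'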
### Webhook Timeout Tuning
Safe webhook timeout values:
- 5s: For very simple webhooks (not recommended)
- 10s: Default, good for most webhooks
- 30s: For complex validation or external API calls
- 60s+: Usually indicates a webhook performance problem
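To compare your cluster against these values, a one-line survey of every configured webhook timeout is useful (a sketch; entries with no number printed are using the 10s default):
# List each webhook configuration and its configured timeouts:
kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.webhooks[*].timeoutSeconds}{"\n"}{end}'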
### Node Initialization
On new nodes, DaemonSet pods may wait for:
- kubelet to fully start
- CNI plugin to be ready
- Other init containers to complete
This is normal and expected (up to 30-60 seconds).
### Debugging Slow Node Initialization
# Check if kubelet is starting:
sudo systemctl status kubelet
# Check CNI plugin status:
ls -la /etc/cni/net.d/
ls -la /opt/cni/bin/
# Check for init scripts blocking startup:
ps aux | grep kubelet
### Quota and Limit Ranges
DaemonSet pods may fail creation if LimitRange limits are too strict:
kubectl get limitranges -A
kubectl describe limitrange -n <namespace> <name>
If the DaemonSet's container requests and limits fall outside the LimitRange min/max, pods won't start.
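To compare, pull the resource requests and limits from the DaemonSet's pod template and check them against the LimitRange output above:
# Resource requests/limits declared in the DaemonSet pod template:
kubectl get daemonset -n <namespace> <name> -o jsonpath='{.spec.template.spec.containers[*].resources}'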
### Recovery
If a node is permanently broken, cordon and drain it:
# Prevent new pods on the node:
kubectl cordon <node-name>
# Note: DaemonSet pods tolerate the node.kubernetes.io/unschedulable taint by
# default, so cordoning alone won't keep the DaemonSet pod off the node.
# After fixing the node, delete the stuck pod to force recreation, then
# allow normal scheduling again:
kubectl uncordon <node-name>
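If you need to clear the node for repairs, draining evicts the other workloads while leaving DaemonSet pods in place. The flags below are the common ones; on older kubectl versions the second flag is --delete-local-data instead:
# Evict regular pods; DaemonSet pods are skipped with --ignore-daemonsets:
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data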
How to fix "HTTP/2 connection refused" error in Kubernetes
missing request for cpu in container
How to fix "missing request for cpu in container" in Kubernetes HPA
error: invalid configuration
How to fix "error: invalid configuration" in Kubernetes
etcdserver: cluster ID mismatch
How to fix "etcdserver: cluster ID mismatch" in Kubernetes
running with swap on is not supported
How to fix "running with swap on is not supported" in kubeadm