A container fails its liveness probe repeatedly, so the kubelet restarts it continuously, which can put the pod into a CrashLoopBackOff state. Fix it by increasing initialDelaySeconds to allow time for startup, using a startup probe for slow-starting apps, or tuning timeouts and thresholds.
A liveness probe is a health check that runs periodically (default every 10 seconds) while a container is running. If the probe fails enough times (default 3 failures), kubelet restarts the container. Liveness probes detect deadlocks, hangs, or applications that are running but unresponsive. The error means the probe couldn't communicate with the application or the application returned a failure status. This differs from readiness probes, which remove pods from service load balancing without restarting.
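For illustration, the two probe types can sit side by side on the same container. The /healthz and /ready paths and port 8080 below are hypothetical, and Kubernetes applies periodSeconds: 10, timeoutSeconds: 1, and failureThreshold: 3 when those fields are omitted:
livenessProbe:        # failure here makes the kubelet restart the container
  httpGet:
    path: /healthz
    port: 8080
readinessProbe:       # failure here only removes the pod from Service endpoints
  httpGet:
    path: /ready
    port: 8080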
View pod status:
kubectl get pod <pod-name> -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
Look for restart count and "Last State: Terminated" with reason "Unhealthy".
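If you prefer raw fields to the describe output, the restart count and recent probe events can also be pulled directly (the JSONPath below assumes a single-container pod):
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.containerStatuses[0].restartCount}'
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name> --sort-by=.lastTimestamp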
Check the pod spec:
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A15 livenessProbe
Note initialDelaySeconds, timeoutSeconds, periodSeconds, and failureThreshold.
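Alternatively, the probe configuration can be extracted as a single object (the index assumes the first container in the pod):
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.containers[0].livenessProbe}'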
The default initialDelaySeconds (0) may be too low. Allow more time for startup:
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 90   # Increased from default
  periodSeconds: 10
  failureThreshold: 3
For Java apps or apps with migrations, use 120-180 seconds.
A startup probe gives the container an initial window to start and blocks liveness checks until it succeeds:
startupProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 30   # 300 seconds total = 30 * 10s
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 3
The startup probe runs until it succeeds, then liveness takes over.
Increase tolerances for slow endpoints:
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 60
  timeoutSeconds: 5    # Increased from default 1
  periodSeconds: 15    # Increased from default 10
  failureThreshold: 5  # Increased from default 3
Exec into the container and test:
kubectl exec -it <pod-name> -n <namespace> -- /bin/bash
curl -v http://localhost:8080/health
Or test from your local machine using port forwarding:
kubectl port-forward <pod-name> 8080:8080
curl -v http://localhost:8080/health
Resource starvation can also cause probe timeouts. Set appropriate requests and limits:
resources:
  requests:
    cpu: 250m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
Check pod resource usage:
kubectl top pod <pod-name> -n <namespace>
The application may be hanging or throwing errors; check its logs:
kubectl logs <pod-name> -n <namespace> --tail=100
kubectl logs <pod-name> -n <namespace> --previous   # Previous restart
Look for database connection errors, deadlocks, or resource exhaustion messages.
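To narrow long logs, a quick filter over the previous container's output can surface these messages (the search terms are only examples):
kubectl logs <pod-name> -n <namespace> --previous | grep -iE 'error|timeout|deadlock|refused|exhaust'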
Liveness probes should only be used for true "is the application alive" checks; use readiness probes for "can accept traffic" checks. Avoid expensive operations (full health scans, database queries) in liveness probes; use lightweight endpoints. Best practice: separate /health (readiness) from /liveness endpoints. Use a higher failureThreshold for liveness (e.g., 3-5) to tolerate temporary blips. For stateful apps, ensure data consistency before declaring readiness. Monitor probe success rates in cluster metrics; sudden failures indicate application or infrastructure issues. Use TCP socket probes for simple connectivity checks and HTTP probes for application logic. For gRPC apps, use the gRPC probe type (Kubernetes 1.24+) instead of HTTP.
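As a sketch of the TCP and gRPC variants mentioned above (the ports are hypothetical, and the grpc field requires Kubernetes 1.24+ plus an app that implements the gRPC health checking protocol):
livenessProbe:
  grpc:
    port: 9090        # gRPC health checking protocol, Kubernetes 1.24+
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  tcpSocket:
    port: 8080        # simple "is the port accepting connections" check
  periodSeconds: 10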