The "Failed Pod Hook" error occurs when a container lifecycle hook (PostStart or PreStop) fails or times out. PostStart hooks run after the container starts, while PreStop hooks run before termination. A failed hook blocks the pod from becoming Ready or prevents graceful shutdown.
Kubernetes supports lifecycle hooks to run custom logic during container state transitions:

- **PostStart Hook**: Runs immediately after the container is created, with no guarantee that it runs before the container's entrypoint. If it fails, the kubelet kills the container and restarts it according to the pod's restart policy, so the pod won't become Ready.
- **PreStop Hook**: Runs before container termination. If it is still running when the termination grace period expires, the container is force-killed.

Hooks are useful for:

- Initializing data on startup
- Warming up caches
- Graceful connection draining before shutdown
- Cleanup tasks (closing connections, releasing locks)

But failed hooks can prevent pod startup or block graceful shutdown.
View what's happening with the hooks:
```shell
kubectl describe pod <pod-name> -n <namespace>          # Shows Events
kubectl logs <pod-name> -n <namespace>                  # Container logs
kubectl logs <pod-name> -n <namespace> --previous       # Previous failed container
```

Look for messages like:
- "PostStart hook failed"
- "command not found"
- "connection refused"
- "timeout waiting for hook"
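The Events section shows the exact failure reason, recorded under the event reason `FailedPostStartHook` or `FailedPreStopHook`. Output from a hook handler is not written to the container logs, so the event message is usually the best source.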
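To narrow the event list to hook failures only, a small wrapper along these lines may help. It is a sketch that assumes hook failures are recorded under the event reasons `FailedPostStartHook` and `FailedPreStopHook`; the function name `hook_events` is made up for illustration.

```shell
# hook_events NAMESPACE POD
# Lists only the events that report a failed lifecycle hook for one pod.
# (hook_events is a hypothetical helper name, not a kubectl command.)
hook_events() {
  ns="$1"
  pod="$2"
  for reason in FailedPostStartHook FailedPreStopHook; do
    kubectl get events -n "$ns" \
      --field-selector "involvedObject.name=$pod,reason=$reason"
  done
}
```

Call it as `hook_events <namespace> <pod-name>`. Filtering on the server side with `--field-selector` keeps the output short in busy namespaces.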
Test the hook command inside the container:
```shell
# If the PostStart hook runs a script, open a shell in the container:
kubectl exec -it <pod-name> -n <namespace> -- /bin/sh

# Inside the container, run the hook command directly:
/app/init.sh            # Or whatever the PostStart hook specifies
echo $?                 # Check exit code (0 = success)

# Verify it's executable:
ls -la /app/init.sh
chmod +x /app/init.sh   # Temporary fix; set the bit in the image to make it permanent

# Check if dependency commands exist:
which curl
which wget
which sh
```

Hook commands are exec'd directly rather than run through a shell, so use absolute paths that exist in the container filesystem.
Examine the actual hook definition:
```shell
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 10 "lifecycle:"
```

Example hook definition:

```yaml
lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "echo 'Hello World' > /tmp/msg.txt"]
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 15"]  # Delay so load balancers can drop the pod
```

Also check:

- `terminationGracePeriodSeconds`: the total time allowed for the PreStop hook plus SIGTERM handling
- Hook timeout: hooks have no explicit timeout of their own; for PreStop, the grace period acts as the limit
Debug the hook by running it manually:
```shell
kubectl debug <pod-name> -n <namespace> -it --image=<base-image>
# Or if pod is running:
kubectl exec -it <pod-name> -n <namespace> -- /bin/sh

# Run the PostStart command step-by-step:
/bin/sh -c "echo 'Hello World' > /tmp/msg.txt"
cat /tmp/msg.txt      # Verify output

# If it's a complex script:
sh -x /app/init.sh    # Run with debug tracing

# Check exit code:
echo $?               # 0 = success, non-zero = failure
```

Break the command into smaller parts to identify which step fails.
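One way to break a hook command into smaller parts is to run each step separately and stop at the first failure. The sketch below does this with a hypothetical helper named `run_steps`; it is not part of kubectl or Kubernetes, and the steps you pass in are up to you.

```shell
# run_steps STEP...
# Runs each step in its own `sh -c` and reports the first one that fails.
# (run_steps is an illustrative helper name.)
run_steps() {
  n=0
  for step in "$@"; do
    n=$((n + 1))
    if sh -c "$step"; then
      echo "step $n ok: $step" >&2
    else
      rc=$?
      echo "step $n failed (exit $rc): $step" >&2
      return "$rc"
    fi
  done
}
```

Inside the container shell, a call such as `run_steps "test -x /app/init.sh" "which curl" "/app/init.sh"` would show whether the script is missing its executable bit, lacks a dependency, or fails on its own.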
Make the hook more robust:
Instead of passing the hook a bare command (for example through an environment variable set with `kubectl set env deployment <name> POSTSTART_CMD="curl http://init-service/setup"`), use a script with error handling:

```yaml
postStart:
  exec:
    command:
      - /bin/sh
      - -c
      - |
        set -e  # Exit on error
        set -x  # Debug output
        echo "Starting PostStart hook at $(date)"
        # Test curl in the `if` itself: under `set -e`, a separate `$?` check
        # would never run because the shell exits as soon as curl fails.
        if curl -f --retry 3 --retry-delay 2 http://init-service/setup; then
          echo "PostStart hook succeeded"
        else
          echo "PostStart hook failed" >&2
          exit 1
        fi
```

Log output helps with debugging. Use `>&2` for error output.
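When the setup step is not a curl call, the same retry idea can live in the hook script itself. This is a minimal sketch, assuming a POSIX shell is available in the image; the function name `retry` and its argument order are illustrative choices.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS is used up.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    i=$((i + 1))
    # Sleep only if another attempt is coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

For example, `retry 3 2 /app/init.sh` would try the script up to three times with a two-second pause between attempts. Keep the total short, since a PostStart hook that runs for a long time holds the container out of the Running state.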
Allow more time for PreStop hooks to complete:
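```shell
# terminationGracePeriodSeconds cannot be raised on an existing pod,
# so patch the pod template of the owning Deployment instead:
kubectl patch deployment <name> -n <namespace> -p '{"spec":{"template":{"spec":{"terminationGracePeriodSeconds":60}}}}'
# Or edit the pod template interactively:
kubectl edit deployment <name> -n <namespace>
```

Example:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app
spec:
  selector:
    matchLabels:
      app: graceful-app  # Required; must match the template labels
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      terminationGracePeriodSeconds: 60  # Default is 30
      containers:
        - name: app
          image: myapp:latest
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 15 && /app/shutdown.sh"]
```

The grace period is the maximum combined time for the PreStop hook and SIGTERM handling.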
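Because the PreStop hook and SIGTERM handling share one budget, it can help to check the arithmetic before choosing a value. The sketch below is a plain shell calculation; the function name `fits_grace_period` and the example durations are assumptions for illustration, not measurements.

```shell
# fits_grace_period PRESTOP_SECONDS SHUTDOWN_SECONDS GRACE_SECONDS
# Succeeds when the hook time plus the app's SIGTERM shutdown time fits
# inside terminationGracePeriodSeconds, and prints the remaining slack.
fits_grace_period() {
  needed=$(( $1 + $2 ))
  slack=$(( $3 - needed ))
  echo "needed=${needed}s grace=${3}s slack=${slack}s"
  [ "$slack" -ge 0 ]
}
```

With a 15-second PreStop sleep and an application that takes roughly 20 seconds to shut down, `fits_grace_period 15 20 30` prints `needed=35s grace=30s slack=-5s` and returns non-zero, while `fits_grace_period 15 20 60` prints `needed=35s grace=60s slack=25s` and succeeds.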
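For simpler use cases, use an HTTP handler instead of exec:

```yaml
lifecycle:
  # PostStart HTTP call:
  postStart:
    httpGet:
      path: /init
      port: 8080
      scheme: HTTP
```

Generally only a single delivery is made, so do not rely on the request being retried. A hook accepts one handler, so `postStart` cannot be declared twice. The `tcpSocket` handler still appears in the API but is deprecated and not supported for lifecycle hooks; to test a TCP port such as MySQL on 3306, use a probe or an init container instead. An HTTP handler is simpler than exec because it needs no shell or extra tools in the image. Downside: less flexible than custom scripts.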
Use init containers or startup probes to ensure readiness:
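```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init
spec:
  initContainers:
    - name: wait-for-db
      image: busybox
      command: ['sh', '-c', 'until nc -z db:3306; do echo waiting for db; sleep 2; done']
  containers:
    - name: app
      image: myapp:latest
      startupProbe:  # Holds back liveness/readiness probes until the app responds
        httpGet:
          path: /health
          port: 8080
        failureThreshold: 30
        periodSeconds: 1
      lifecycle:
        postStart:
          exec:
            command: ["/bin/sh", "-c", "# Now dependencies are ready"]
```

Init containers run to completion before regular containers start, so the PostStart hook can rely on the database being reachable. The startup probe does not delay the PostStart hook; it only holds back the liveness and readiness probes while the application starts.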
Hook failures are hard to debug because the pod doesn't reach Running state. Always include logging and error handling in hooks. PostStart hooks should complete quickly; long-running setup should use init containers instead.

PreStop hooks are crucial for graceful shutdown: implement proper connection draining or cleanup. If a PreStop hook times out, the container is force-killed (SIGKILL), losing in-flight requests. For stateful applications, PreStop hooks should flush buffers and close connections. Use `sleep` in PreStop hooks to allow load balancers to remove the pod from rotation.

Hook commands have access to environment variables and volumes from the pod spec. Avoid using loops in hooks; use Kubernetes native features (init containers, startup probes) instead. In production, hook commands should be idempotent (safe to run multiple times, e.g., if retried).
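As one way to make a hook idempotent, the command can be guarded with a marker file so that a repeated delivery becomes a no-op. This is a minimal sketch; `run_once` and the marker path in the usage note are hypothetical, and the marker should live on storage with the lifetime you want (a container-local path is reset when the container restarts).

```shell
# run_once MARKER_FILE COMMAND [ARGS...]
# Runs COMMAND only if MARKER_FILE does not exist yet, and creates the
# marker after a successful run so repeated hook deliveries are no-ops.
run_once() {
  marker="$1"
  shift
  if [ -e "$marker" ]; then
    echo "already done, skipping: $*" >&2
    return 0
  fi
  "$@" || return $?
  : > "$marker"
}
```

A hook script could then call `run_once /tmp/.init-done /app/init.sh`. The marker is written only after the command succeeds, so a failed run is attempted again on the next delivery.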