DeadlineExceeded occurs when a Kubernetes Job exceeds the time limit specified by activeDeadlineSeconds. Kubernetes terminates all running Pods and marks the Job as failed.
The Kubernetes DeadlineExceeded error occurs when a Job runs longer than the limit set in its .spec.activeDeadlineSeconds field. Once a Job reaches its active deadline, Kubernetes automatically terminates all of its running Pods and marks the Job as failed with reason: DeadlineExceeded. The deadline applies to the total duration of the entire Job, not to individual Pod attempts, and it takes precedence over backoffLimit: once the deadline is exceeded, no further Pod retries are scheduled, no matter how many retries remain.
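For illustration, a minimal Job that reproduces the error (the name and image are arbitrary choices): a 10-second deadline wrapped around a 60-second workload.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: deadline-demo            # hypothetical name
spec:
  activeDeadlineSeconds: 10      # far shorter than the workload
  backoffLimit: 3                # ignored once the deadline passes
  template:
    spec:
      containers:
      - name: sleeper
        image: busybox
        command: ["sleep", "60"]
      restartPolicy: Never
```

After roughly 10 seconds, kubectl describe job deadline-demo shows a Failed condition with reason DeadlineExceeded, even though backoffLimit retries remained.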
Inspect the Job YAML and current status to understand what deadline was set:
# Describe the job to see status and events
kubectl describe job <job-name> -n <namespace>
# View the job YAML specification
kubectl get job <job-name> -n <namespace> -o yaml
# Check recent events
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

Look for the .spec.activeDeadlineSeconds value and verify that the reason in .status.conditions shows DeadlineExceeded.
Determine how long your job actually needs by examining logs from previous attempts:
# Get logs from the terminated pod
kubectl logs <pod-name> -n <namespace> --tail=100
# Check pod start and end times
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.containerStatuses[0].state}'

Factor in container image pull time, application startup time, and the actual workload execution.
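The terminated container state from the command above includes startedAt and finishedAt timestamps; a small helper (a sketch, assuming GNU date on Linux) converts them into the elapsed-seconds figure you size the deadline against:

```shell
#!/bin/bash
# Compute elapsed seconds between two RFC3339 timestamps, as reported in
# .status.containerStatuses[0].state.terminated.{startedAt,finishedAt}.
elapsed_seconds() {
  local started="$1" finished="$2"
  # GNU date parses RFC3339/ISO 8601 timestamps via -d
  echo $(( $(date -d "$finished" +%s) - $(date -d "$started" +%s) ))
}

# Example with fixed timestamps (not taken from a live cluster):
elapsed_seconds "2024-05-01T10:00:00Z" "2024-05-01T10:23:45Z"   # prints 1425
```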
Update the Job specification with a higher timeout:
apiVersion: batch/v1
kind: Job
metadata:
  name: long-running-backup
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 3600  # Increased from 600 to 3600 (1 hour)
  template:
    spec:
      containers:
      - name: backup-container
        image: myregistry/backup:latest
        command: ["./backup-script.sh"]
      restartPolicy: Never

As a rule of thumb, set activeDeadlineSeconds to 120-150% of your measured execution time, plus a buffer for scheduling delays.
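The rule of thumb above can be sketched as a small calculation (assumptions: a 150% multiplier and a fixed 120-second scheduling buffer, both arbitrary starting points):

```shell
#!/bin/bash
# Suggest an activeDeadlineSeconds value from a measured runtime:
# 150% of the measured execution time plus a buffer for scheduling
# and image pulls (the 120s default is an arbitrary choice).
suggest_deadline() {
  local measured="$1" buffer="${2:-120}"
  echo $(( measured * 150 / 100 + buffer ))
}

suggest_deadline 1425   # prints 2257
```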
Check that your Job has adequate CPU and memory to complete within the deadline:
# Check node capacity and allocation
kubectl top nodes
kubectl describe nodes
# View resource requests/limits in your job
kubectl get job <job-name> -n <namespace> -o yaml | grep -A 5 'resources:'

If resources are constrained, either increase node capacity or add resource requests to ensure scheduling.
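For example, explicit requests and limits on the Job's container might look like this (the values are illustrative, not recommendations):

```yaml
spec:
  template:
    spec:
      containers:
      - name: backup-container
        resources:
          requests:
            cpu: "500m"
            memory: "512Mi"
          limits:
            cpu: "1"
            memory: "1Gi"
```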
Implement timeout protection within your container for graceful shutdown:
#!/bin/bash
# Wrap your command with GNU timeout
timeout 1800 ./my-long-running-process.sh
exit_code=$?
if [ $exit_code -eq 124 ]; then
echo "Process timed out after 1800 seconds"
exit 1
fi
exit $exit_code

This ensures your application has time to log completion status, release locks, and close connections before the Kubernetes deadline is enforced. Keep the inner timeout comfortably below activeDeadlineSeconds so the cleanup path runs before Kubernetes terminates the Pod.
Verify that you're setting the deadline at the Job spec level, not the Pod level:
# Correct: activeDeadlineSeconds at Job spec level
kubectl get job <job-name> -n <namespace> -o jsonpath='{.spec.activeDeadlineSeconds}'
# Check if Pod spec also has activeDeadlineSeconds (applies per-pod)
kubectl get job <job-name> -n <namespace> -o jsonpath='{.spec.template.spec.activeDeadlineSeconds}'

The Job-level activeDeadlineSeconds controls the total Job lifetime. For most cases, set only the Job-level timeout:
spec:
  activeDeadlineSeconds: 3600  # Job-level only
  template:
    spec:
      # Do NOT set activeDeadlineSeconds here unless you need per-Pod control
      containers: ...

The Job-level activeDeadlineSeconds is measured from the Job's start time, not from when your container begins executing, so container image pulls, scheduler delays, and waits for node resources all count against the deadline.
The field takes absolute precedence over backoffLimit—even if you're allowed 5 retries and only used 2, no further Pods will be scheduled once the deadline is exceeded.
For batch workloads with highly variable execution times, consider using CronJobs with shorter individual Job deadlines and external coordination (queue-based processing), rather than single long-deadline Jobs.
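A sketch of that pattern: a CronJob whose jobTemplate carries a short per-run deadline, with each run draining work from an external queue (the name, schedule, image, and script are all hypothetical):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: queue-worker             # hypothetical name
spec:
  schedule: "*/15 * * * *"       # run every 15 minutes
  concurrencyPolicy: Forbid      # don't let slow runs overlap
  jobTemplate:
    spec:
      activeDeadlineSeconds: 600 # short deadline per run
      backoffLimit: 1
      template:
        spec:
          containers:
          - name: worker
            image: myregistry/worker:latest
            command: ["./drain-queue.sh"]  # hypothetical queue-processing script
          restartPolicy: Never
```

Unfinished work stays in the queue and is picked up by the next run, so no single Job needs a long deadline.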
When a Job hits the deadline, Kubernetes sends SIGTERM to Pods; set terminationGracePeriodSeconds appropriately to allow graceful shutdown (default 30 seconds), otherwise Pods are killed with SIGKILL after that period.
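A minimal sketch of a SIGTERM handler that makes use of that grace period (the cleanup step is a placeholder, and the self-delivered signal stands in for the kubelet):

```shell
#!/bin/bash
# Handle the SIGTERM Kubernetes sends at the deadline, so cleanup runs
# inside terminationGracePeriodSeconds instead of being cut off by SIGKILL.
terminated=0
cleanup() {
  echo "SIGTERM received: flushing state and releasing locks"  # placeholder cleanup
  terminated=1
}
trap cleanup TERM

# Simulate the kubelet: deliver SIGTERM to this shell after 1 second.
( sleep 1; kill -TERM $$ ) &

# Work loop: run work in the background and 'wait' on it, so the trap
# fires immediately instead of after the current step finishes.
while [ "$terminated" -eq 0 ]; do
  sleep 5 &
  wait $! 2>/dev/null || true
done
echo "graceful shutdown complete"
```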