Pod CPU limits cause the kernel to throttle container processes when they exceed configured limits. Unlike memory, throttled pods continue running but with reduced performance. Fix by adjusting CPU requests/limits, enabling HPA, or removing inappropriate limits.
When a Kubernetes pod specifies a CPU limit, the Linux CFS (Completely Fair Scheduler) enforces that limit by throttling: the kernel pauses the container's processes so they cannot exceed the allocated CPU share. Unlike memory limits, which kill pods that exceed them, CPU throttling only reduces throughput and increases latency. The container keeps running, but at degraded performance, causing:
- Increased API response times
- Timeout failures
- Queue backlogs
- Cascading failures in dependent services
CPU throttling becomes apparent when you compare the container's actual CPU usage (from metrics) against its observed performance degradation.
Identify if throttling is actually happening:
# Check pod resource usage:
kubectl top pod -n <namespace> <pod-name>
# View current limits:
kubectl describe pod -n <namespace> <pod-name> | grep -A 4 "Limits\|Requests"
# Check throttling metrics in Prometheus:
container_cpu_cfs_throttled_seconds_total
container_cpu_cfs_throttled_periods_total
Compare the pod's actual CPU usage against its limit.
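To quantify how much throttling is happening, a rough sketch using the counters above plus the standard container_cpu_cfs_periods_total counter computes the fraction of scheduler periods in which the container was throttled:
# Fraction of CFS periods throttled per pod (values approaching 1 mean near-constant throttling):
sum by (pod) (rate(container_cpu_cfs_throttled_periods_total[5m]))
  / sum by (pod) (rate(container_cpu_cfs_periods_total[5m]))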
Determine what CPU your application actually needs:
# Monitor during typical load:
kubectl top pod -n <namespace> <pod-name> --containers
# Track over time using Prometheus:
rate(container_cpu_usage_seconds_total[5m])
# Check 95th percentile usage:
quantile_over_time(0.95, rate(container_cpu_usage_seconds_total[5m])[1d:5m])
Use this data to set appropriate requests and limits.
Edit the deployment to increase CPU limits:
kubectl edit deployment -n <namespace> <deployment-name>
Update the limits:
resources:
  limits:
    cpu: "2"        # Increase from current limit
  requests:
    cpu: "1"        # Set to your baseline
New pods will respect the updated limits:
kubectl rollout restart deployment -n <namespace> <deployment-name>
Best practice is to set requests at or slightly above typical sustained usage, and limits higher to absorb burst traffic:
resources:
  requests:
    cpu: "500m"     # What the pod consistently needs
  limits:
    cpu: "1000m"    # Maximum burst capacity
The difference between request and limit allows for temporary spikes without throttling.
Scale pods based on CPU usage instead of trying to give each pod unlimited CPU:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
When average CPU usage exceeds 70% of the requested CPU, Kubernetes adds more pod replicas.
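A short usage sketch (assumes metrics-server or another metrics API is installed; the filename is illustrative):
kubectl apply -f app-hpa.yaml
# Watch current utilization and replica count as load changes:
kubectl get hpa app-hpa -n <namespace> -w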
Review the application for inefficiencies:
- Profile CPU usage: perf, py-spy, pprof
- Check for infinite loops or busy-wait patterns
- Reduce unnecessary computations in request paths
- Use caching (Redis, memcached) to avoid repeated work
- Batch operations to reduce per-request overhead
- Use async/await patterns for I/O-bound work
Example with pprof (Go):
import _ "net/http/pprof" // registers /debug/pprof handlers on http.DefaultServeMux
// Expose them with: go func() { http.ListenAndServe("localhost:6060", nil) }()
// Then visit http://localhost:6060/debug/pprof/
After making changes, monitor performance:
# Watch real-time pod CPU:
kubectl top pod -n <namespace> -w
# Check metrics over longer periods:
# - API response times (p50, p95, p99)
# - Error rates
# - Throughput (requests/second)
# Verify throttling has stopped:
# (throttled_seconds should stop increasing)
container_cpu_cfs_throttled_seconds_total
Performance should normalize within minutes.
Set up alerts to prevent future throttling:
# Prometheus alert:
- alert: PodCPUThrottling
  expr: |
    rate(container_cpu_cfs_throttled_seconds_total[5m]) > 0.1
  for: 5m
  annotations:
    summary: "Pod {{ $labels.pod }} is being throttled"
Monitor in Grafana:
- Graph: rate(container_cpu_cfs_throttled_seconds_total[5m])
- Set threshold alert at 10% throttling rate
### Understanding CFS Throttling
The Linux CFS scheduler divides CPU time into periods (default 100ms). When a container reaches its limit within a period, the kernel throttles it until the next period starts. This is why latency becomes unpredictable—requests that cross period boundaries experience severe delays.
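To see the numbers behind this, a sketch assuming cgroup v2 (a 500m limit maps to a 50ms quota per 100ms period):
# cgroup v2: prints "<quota_us> <period_us>", e.g. "50000 100000" for a 500m limit
kubectl exec -n <namespace> <pod-name> -- cat /sys/fs/cgroup/cpu.max
# cgroup v1 equivalent:
kubectl exec -n <namespace> <pod-name> -- cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us /sys/fs/cgroup/cpu/cpu.cfs_period_us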
### CPU Requests vs Limits
- Request: Guaranteed minimum CPU (used by scheduler for pod placement)
- Limit: Hard ceiling on CPU usage (enforced by kernel throttling)
Setting request=limit removes burst capacity. Better practice:
- request = 80th percentile of your usage
- limit = 95th percentile or 2x request
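For example, with illustrative measurements of roughly 400m at the 80th percentile and 700m at the 95th, that guidance would translate to:
resources:
  requests:
    cpu: "400m"     # ~80th percentile of observed usage
  limits:
    cpu: "800m"     # 2x request, comfortably above the observed 95th percentile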
### QoS Classes
Kubernetes assigns QoS based on requests/limits:
- Guaranteed: request == limit (protected from eviction)
- Burstable: request < limit (can be evicted under pressure)
- BestEffort: no request or limit (evicted first)
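To confirm which class a pod received, check its status field:
kubectl get pod -n <namespace> <pod-name> -o jsonpath='{.status.qosClass}'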
### Removing CPU Limits
Some organizations remove CPU limits entirely to avoid throttling:
resources:
  requests:
    cpu: "500m"
  # No limits - the pod can burst up to the node's available CPU
This is only safe if you have:
- Good node resource monitoring
- Pod Disruption Budgets configured
- Sufficient cluster headroom
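One way to gauge that headroom (a sketch; kubectl top requires metrics-server):
# Current CPU/memory consumption per node:
kubectl top nodes
# Requested vs allocatable resources per node:
kubectl describe nodes | grep -A 8 "Allocated resources"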
### Multi-core Applications
Applications using multiple threads/processes should request multiple CPUs:
resources:
  requests:
    cpu: "4"        # 4 cores
  limits:
    cpu: "8"        # Burst to 8 cores
### Metric Interpretation
- High throttled_seconds = significant CPU contention
- A throttled_periods count close to the total periods count (container_cpu_cfs_periods_total) indicates severe throttling
- Zero throttling with low usage = limits set too high
### Alternative: Remove Limits Entirely
For environments where noisy neighbors are not a concern:
resources:
  requests:
    cpu: "1"
  # limits: omitted - no throttling
How to fix "eks subnet not found" in Kubernetes
unable to compute replica count
How to fix "unable to compute replica count" in Kubernetes HPA
error: context not found
How to fix "error: context not found" in Kubernetes
default backend - 404
How to fix "default backend - 404" in Kubernetes Ingress
serviceaccount cannot list resource
How to fix "serviceaccount cannot list resource" in Kubernetes