The Horizontal Pod Autoscaler cannot calculate the desired number of replicas because it is missing critical input data: either metrics are unavailable or resource requests are not defined on its containers.
The Kubernetes Horizontal Pod Autoscaler (HPA) determines the desired number of replicas with a specific formula: desiredReplicas = ceil[currentReplicas × (currentMetricValue / desiredMetricValue)]. When the HPA reports 'unable to compute replica count', the controller cannot complete this calculation because it is missing critical input data.

This error typically occurs at the metrics-gathering stage. The HPA requires real-time metrics from the Kubernetes Metrics Server to calculate utilization ratios. For CPU and memory scaling, it needs both the current usage (from metrics) and the requested resources (from the pod spec) to compute utilization as a percentage. Without either piece, the division fails and no scaling decision can be made. The error can also occur when there are zero running replicas or when custom metrics endpoints are misconfigured.
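The formula can be worked through with hypothetical numbers. This sketch (illustrative, not the controller's actual code) shows why missing inputs make the calculation impossible:

```python
import math

def desired_replicas(current_replicas: int, current_metric: float, desired_metric: float) -> int:
    """HPA core formula: ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]."""
    return math.ceil(current_replicas * (current_metric / desired_metric))

# 3 pods at 80% average CPU utilization against a 50% target:
print(desired_replicas(3, 80, 50))  # -> 5 (ceil of 4.8)

# Without resource requests, current utilization (a percentage of the request)
# is undefined, so this calculation cannot run at all -- that is the
# "unable to compute replica count" error.
```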
Check if the Metrics Server deployment exists and is running:
kubectl get deployment metrics-server -n kube-system
kubectl get pods -n kube-system -l k8s-app=metrics-server
kubectl logs -n kube-system -l k8s-app=metrics-server | head -50

If the Metrics Server is missing, install it:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Test that the Metrics Server can gather and serve metrics:
kubectl top nodes
kubectl top pods -n <namespace>

If these commands return 'unknown' or 'Metrics not available yet', metrics collection is not working. Check the Metrics Server logs for errors such as TLS failures, 'connection refused', or 'unauthorized'.
The HPA cannot calculate utilization without knowing how much CPU/memory each pod requested. Update your Deployment with explicit requests:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myimage:1.0
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi

Apply and roll out:
kubectl apply -f deployment.yaml
kubectl rollout status deployment/myapp -n <namespace>

If you use Istio, Linkerd, or another service mesh, sidecar proxies are injected automatically but often lack resource requests of their own.
For Linkerd, add this annotation to your pod spec:
spec:
  template:
    metadata:
      annotations:
        config.linkerd.io/proxy-cpu-request: 100m
        config.linkerd.io/proxy-memory-request: 64Mi

For Istio, configure the sidecar injector to include resource requests:
kubectl get configmap istio-sidecar-injector -n istio-system -o yaml | grep -A 20 "resources:"

Recreate the pods after updating so the sidecar changes take effect.
Examine the HPA object to understand what's failing:
kubectl get hpa -n <namespace>
kubectl describe hpa <hpa-name> -n <namespace>

Look for:
- Metrics: Check the current/target values; a current value of '<unknown>' means the metric cannot be read
- Conditions: Look for 'ScalingActive = False' which indicates metrics cannot be fetched
- Events: Recent events will show exact errors like 'missing request for cpu'
The HPA cannot compute metrics if there are zero replicas running:
kubectl get pods -n <namespace> -l app=myapp
kubectl describe pod <pod-name> -n <namespace>

If the deployment has been scaled to zero or all pods have terminated, scale it back to at least 1:
kubectl scale deployment myapp --replicas=1 -n <namespace>

Wait for the pod to reach the Ready state, then verify that metrics appear:
kubectl top pods -n <namespace>

Note: The standard HPA cannot scale down to zero. For scale-to-zero, consider KEDA.
HPA calculates desired replicas as: desiredReplicas = ceil[currentReplicas × (currentMetricValue / desiredMetricValue)]. The controller applies a default tolerance of 10% before scaling, meaning it ignores recommendations when the ratio is between 0.9 and 1.1 to prevent thrashing.
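A minimal sketch of that tolerance check (the 10% default is configurable on the controller; the function name here is illustrative):

```python
import math

TOLERANCE = 0.10  # controller default

def recommend(current_replicas: int, current_metric: float, desired_metric: float) -> int:
    ratio = current_metric / desired_metric
    # Within +/-10% of the target: keep the current replica count to avoid thrashing.
    if abs(ratio - 1.0) <= TOLERANCE:
        return current_replicas
    return math.ceil(current_replicas * ratio)

print(recommend(4, 52, 50))  # ratio 1.04 -> within tolerance, stays at 4
print(recommend(4, 80, 50))  # ratio 1.6  -> scales to ceil(6.4) = 7
```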
When an HPA specifies multiple metrics (e.g., CPU and memory), the controller calculates desired replicas for each metric independently, then selects the largest value. If any metric cannot be fetched, the HPA skips scale-down decisions but can still scale up if other metrics indicate demand.
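That multi-metric behavior can be sketched as taking the maximum per-metric recommendation, with failed metrics only able to block scale-down (again illustrative, not the controller's actual code):

```python
import math

def desired_from_metrics(current_replicas: int, metrics: list) -> int:
    """metrics: list of (current, target) pairs; None marks an unfetchable metric."""
    proposals = []
    for m in metrics:
        if m is None:
            continue  # a failed metric contributes no proposal
        current, target = m
        proposals.append(math.ceil(current_replicas * current / target))
    if not proposals:
        raise RuntimeError("unable to compute replica count: no readable metrics")
    desired = max(proposals)  # the largest per-metric recommendation wins
    # If any metric failed, never recommend fewer replicas than we have now.
    if any(m is None for m in metrics) and desired < current_replicas:
        return current_replicas
    return desired

# CPU suggests 5 replicas, memory suggests 3 -> scale to the larger value:
print(desired_from_metrics(3, [(80, 50), (55, 60)]))  # -> 5
```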
If the deployment is scaled to 0 replicas, no pods exist to provide metrics, creating a catch-22. The standard HPA cannot scale from zero (requires at least minReplicas = 1). For workloads that should completely shut down during idle, use KEDA with event-driven scaling.
Resource Request Validation: HPA will not scale if resource.requests is missing or zero. Using only resource.limits (without requests) is insufficient. Setting identical requests and limits is valid but defeats HPA's purpose by preventing flexible scaling.
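That requirement can be expressed as a quick check: given a container spec as parsed from the manifest, utilization-based HPA needs a nonzero request for the scaled resource. This validator is illustrative, not part of kubectl:

```python
def can_autoscale_on(container: dict, resource: str) -> bool:
    """True only if the container declares a nonzero request for the resource.
    Limits alone are not enough for utilization-based HPA."""
    requests = container.get("resources", {}).get("requests", {})
    value = requests.get(resource)
    return value not in (None, "0", 0)

app = {"resources": {"requests": {"cpu": "250m"}, "limits": {"cpu": "500m"}}}
limits_only = {"resources": {"limits": {"cpu": "500m"}}}

print(can_autoscale_on(app, "cpu"))          # True
print(can_autoscale_on(limits_only, "cpu"))  # False -- triggers the HPA error
```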
After fixing metrics issues, allow 30-60 seconds for metrics to propagate, then another sync period (default 15 seconds) for HPA controller to re-evaluate.