HorizontalPodAutoscaler will not scale down pods even when demand decreases. Scale-down is disabled by policy, stabilization window, or autoscaling config.
HPA can scale up quickly but intentionally limits scale-down to prevent thrashing (rapid up/down cycles that waste resources). The scale-down behavior is controlled by stabilization windows and scale-down policies. When these are too conservative, HPA will not scale down even when metrics indicate fewer replicas are needed.
First diagnostic step
Second diagnostic step
Third diagnostic step
Fourth diagnostic step
Additional notes and platform-specific considerations.
Failed to connect to server: connection refused (HTTP/2)
How to fix "HTTP/2 connection refused" error in Kubernetes
missing request for cpu in container
How to fix "missing request for cpu in container" in Kubernetes HPA
error: invalid configuration
How to fix "error: invalid configuration" in Kubernetes
Fifth diagnostic step