Helm deployments often time out when Kubernetes resources take longer than expected to reach their desired state. This error typically indicates insufficient cluster resources, slow image pulls, or misconfigured health checks that prevent pods from becoming ready.
The "timed out waiting for the condition" error occurs when a Helm chart installation or upgrade cannot complete within the specified timeout window. Kubernetes waits for pods to reach their ready state (based on readiness probes), but if this takes too long or the pods fail to become ready, Helm aborts the operation and marks the release as failed. This error is common in Terraform Helm provider configurations because the default timeout (5 minutes) may be insufficient for: - Large deployments with many pods - Slow image registries or network conditions - Complex initialization hooks that run during installation - Resource-constrained clusters that prioritize pod scheduling The release enters a failed state, and subsequent install/upgrade attempts will also fail until the issue is resolved.
Use kubectl to examine which pods are pending or failing:
kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>

Look for messages in the pod events section and logs that explain why the pod hasn't started. Common issues are ImagePullBackOff, pending resource requests, or failed health checks.
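A namespace-wide event listing can also surface scheduling and image-pull failures across all pods at once; this is an optional complement to the per-pod commands above:

# Show recent events in the namespace, newest last
kubectl get events -n <namespace> --sort-by=.lastTimestamp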
In your Terraform code, add or increase the timeout argument on the helm_release resource:
resource "helm_release" "my_release" {
name = "my-release"
chart = "my-chart"
namespace = "default"
timeout = 600 # Increase from default 5 minutes to 10 minutes (in seconds)
# For even longer deployments, increase further
# timeout = 1800 # 30 minutes
}Apply the Terraform configuration:
terraform apply

Check if your cluster has sufficient resources for the deployment:
# View node status and resource allocation
kubectl get nodes
kubectl describe nodes
# Check for resource pressure or disk pressure
kubectl describe node <node-name>
# View resource requests vs available capacity
kubectl top nodes
kubectl top pods -n <namespace>

If nodes are full or show pressure conditions, scale up your cluster or adjust pod resource requests in the chart's values.
Check for unrealistic readiness probes or initialization delays:
# Preview what will be deployed without installing
helm template my-release my-chart -n default
# Lint the chart for issues
helm lint my-chart

Look for:
- A readiness probe initialDelaySeconds that is too short
- Large replica counts that slow the rollout
- Memory/CPU requests that exceed node capacity
Adjust the values in your Terraform values block:
resource "helm_release" "my_release" {
values = [
yamlencode({
replicaCount = 1 # Start with fewer replicas
resources = {
requests = {
cpu = "100m"
memory = "128Mi"
}
}
readinessProbe = {
initialDelaySeconds = 30 # Give more time to start
}
})
]
}If the release remains in a failed or pending state, clean up the Helm release secret:
# Find the stuck release
kubectl get secrets -n <namespace> | grep sh.helm.release
# Delete the stuck release secret
kubectl delete secret sh.helm.release.v1.<release-name>.v<revision> -n <namespace>

Then retry your Terraform apply after fixing the underlying issue.
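Note that deleting the release secret happens behind Terraform's back, so the resource recorded in state may no longer match the cluster. If a plain terraform apply still fails afterwards, one option is to force replacement of the release (the resource address below assumes the example name used earlier):

# Destroy and recreate the Helm release on the next apply
terraform apply -replace=helm_release.my_release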
For production deployments, add the atomic flag to automatically roll back if the deployment fails:
resource "helm_release" "my_release" {
name = "my-release"
chart = "my-chart"
timeout = 600
atomic = true # Rollback automatically if installation fails
}This prevents your cluster from being left in a partial deployment state. However, ensure your timeout is realistic for the full deployment duration.
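The provider also has a cleanup_on_fail argument, which deletes resources newly created by a failed upgrade so that retries do not collide with leftovers; a minimal sketch combining it with atomic:

resource "helm_release" "my_release" {
  name    = "my-release"
  chart   = "my-chart"
  timeout = 600
  atomic  = true

  # Remove resources created by a failed upgrade before the next attempt
  cleanup_on_fail = true
}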
Timeout Calculation for Multi-Pod Deployments: When deploying charts with multiple replicas or complex dependencies, consider the total startup time. Each pod needs time to pull its image (potentially several seconds per pod), initialize (run init containers), and pass readiness checks. Multiply the time for a single pod by the number of replicas, and add a buffer for image pulls.
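As a rough illustration of that arithmetic, the sketch below derives a timeout from per-pod startup time; all numbers are placeholders to replace with your own measurements:

locals {
  replica_count   = 10  # pods in the deployment
  per_pod_seconds = 45  # time for one pod to pull its image, init, and pass readiness
  buffer_seconds  = 150 # slack for slow registries and scheduling
}

resource "helm_release" "my_release" {
  name    = "my-release"
  chart   = "my-chart"
  timeout = local.replica_count * local.per_pod_seconds + local.buffer_seconds # = 600 seconds
}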
Hook Timeouts: Pre-install, post-install, and other lifecycle hooks are included in the overall deployment timeout. If your chart runs database migrations or complex initialization, ensure hooks have sufficient time to complete.
Network and Registry Performance: Image pull times vary significantly based on image size, registry location, and network bandwidth. In environments with constrained networks or slow registries, expect slower pulls and adjust timeouts accordingly.
Kubernetes Version Differences: Different Kubernetes versions may have different default scheduler behaviors, so don't assume timings measured on one cluster transfer directly to another.
Testing with Helm Directly: Before relying on Terraform's Helm provider, test the chart installation directly with Helm CLI and measure actual deployment times to set realistic timeout values in Terraform.
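One way to measure that is to time a plain Helm install with the same wait behavior the Terraform provider uses (chart and release names here match the examples above):

# Time a direct install with the timeout you plan to configure in Terraform
time helm install my-release my-chart -n default --wait --timeout 10m

# Remove the test release afterwards
helm uninstall my-release -n default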