The Terraform operation was terminated by the system, usually due to memory exhaustion or timeout. This occurs when the process consumes more resources than available, commonly with large state files or high-concurrency operations.
The 'Operation failed: Killed' error indicates that the Terraform process was forcibly terminated by the operating system or execution environment. This typically happens when the process exceeds available system memory (Out of Memory/OOM kill), reaches a timeout threshold in CI/CD pipelines, or is manually terminated. The error appears at the end of Terraform output when the entire process is terminated mid-execution rather than exiting cleanly.
First, verify whether the process was killed due to memory exhaustion:
# On Linux, check dmesg for OOM killer events
sudo dmesg | grep -i 'out of memory\|killed process' | tail -20
# Check system memory usage during operation
free -h # View total/available memory
vmstat 1 10 # Monitor memory in real time
If you see 'killed process' messages, memory exhaustion is the cause. Proceed to step 2.
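Before launching a large plan or apply, you can also gate on available memory up front. A minimal pre-flight sketch (the 2048 MB threshold is an assumption to tune, not a Terraform default):

```shell
#!/bin/sh
# Sketch: pre-flight memory check before a large plan/apply.
# MIN_MB is an assumed threshold -- tune it to your configuration's
# observed peak usage; it is not a Terraform setting.
MIN_MB=2048
avail_mb=$(awk '/MemAvailable/ {print int($2 / 1024)}' /proc/meminfo)
if [ "$avail_mb" -lt "$MIN_MB" ]; then
  echo "WARNING: only ${avail_mb} MB available; the run may be OOM-killed"
else
  echo "OK: ${avail_mb} MB available"
fi
```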
By default, Terraform executes 10 parallel operations. Reducing this value lowers peak memory consumption:
# Apply with reduced parallelism (use 2-3 for constrained environments)
terraform plan -parallelism=3
terraform apply -parallelism=3
# Persist a lower parallelism via environment variables
# (the CLI config file ~/.terraformrc has no parallelism setting)
export TF_CLI_ARGS_plan="-parallelism=3"
export TF_CLI_ARGS_apply="-parallelism=3"
Start with -parallelism=3 and increase it if the operation completes successfully. Lower values are safer but slower.
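Rather than guessing a single value, one recovery pattern is to retry with progressively lower parallelism until a run survives. A sketch (the `run_apply` and `apply_with_backoff` helpers are illustrative names, not Terraform features):

```shell
#!/bin/sh
# Sketch: retry apply with progressively lower parallelism.
# run_apply wraps the real command so it is easy to stub or swap out.
run_apply() {
  terraform apply -auto-approve -parallelism="$1"
}

apply_with_backoff() {
  for p in 10 5 2; do
    if run_apply "$p"; then
      echo "succeeded at parallelism=$p"
      return 0
    fi
    echo "killed/failed at parallelism=$p; retrying lower" >&2
  done
  return 1
}
```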
If running in CI/CD or cloud environments, increase allocated resources:
GitHub Actions:
jobs:
  terraform:
    runs-on: ubuntu-latest
    # Standard ubuntu-latest runners provide ~7 GB RAM;
    # switch to a larger hosted runner for more
Docker/Container:
# Allocate more memory to container
docker run --memory=4g my-terraform-image
Cloud VMs (AWS/Azure/GCP):
- Upgrade instance type to one with more RAM
- Consider memory-optimized instance families
Local development:
# Monitor and ensure sufficient free memory
free -h
df -h # Check disk space too
Breaking infrastructure into separate state files/workspaces reduces per-operation memory demand:
Recommended structure:
terraform/
├── networking/   # VPCs, subnets, security groups
│   └── main.tf
├── compute/      # EC2, containers, compute resources
│   └── main.tf
├── databases/    # RDS, DynamoDB, storage
│   └── main.tf
└── monitoring/   # CloudWatch, logs, alerts
    └── main.tf
Benefits:
- Each module runs independently with smaller state file
- Faster plans and applies
- Easier to parallelize and debug
- Lower memory per operation
Run each independently:
cd terraform/networking && terraform apply
cd ../compute && terraform apply
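The per-directory applies can be scripted. A sketch assuming the directory layout shown above, with each directory a separate root module applied in dependency order (networking first, since later layers typically read its outputs):

```shell
#!/bin/sh
# Sketch: apply each split root module in dependency order.
# Adjust the directory list to match your tree.
apply_all() {
  for dir in networking compute databases monitoring; do
    echo "==> terraform/$dir"
    (cd "terraform/$dir" && terraform init -input=false \
       && terraform apply -auto-approve -parallelism=3) || return 1
  done
}
```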
cd ../databases && terraform apply
If a full apply fails but you must make progress, target specific resource groups:
# Apply only compute resources
terraform apply -target='module.compute' -parallelism=2
# Apply specific resources
terraform apply -target='aws_instance.web' -target='aws_instance.app'
# Combine with state file backup
terraform state pull > backup.tfstate
terraform apply -target='module.database' -parallelism=2
Caution: -target is a debugging tool. Use it temporarily to recover from failures, but plan to refactor your configuration long-term (see step 4).
Memory efficiency improves in newer versions. Update to latest stable releases:
# Check current version
terraform version
# Upgrade Terraform (macOS with Homebrew)
brew upgrade terraform
# Upgrade Terraform (manually)
# Download from https://www.terraform.io/downloads
# Upgrade provider constraints in your configuration
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # Latest major version
    }
  }
}
# Run init to update providers
terraform init -upgrade
Newer versions often include memory optimizations and bug fixes for large configurations.
If 'Killed' occurs due to timeout rather than memory:
GitHub Actions:
jobs:
  terraform:
    timeout-minutes: 120 # Increase if a lower value was set (job default is 360 minutes)
    runs-on: ubuntu-latest
GitLab CI:
terraform_apply:
  timeout: 2h # Project default is 1h
  script:
    - terraform apply -auto-approve
Jenkins:
timeout(time: 2, unit: 'HOURS') {
  sh 'terraform apply -auto-approve'
}
Also set the TFE_PARALLELISM environment variable if using Terraform Cloud/Enterprise:
export TFE_PARALLELISM=3
terraform plan
For Terraform Enterprise/Cloud users: If remote operations run in TFE, memory is constrained per workspace. Contact HashiCorp support to increase the worker instance class, or enable agent-based runs, which use your own compute. The Terraform Enterprise troubleshooting guide recommends checking dmesg via SSH on the node and increasing the maximum memory allocation.
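Independent of the CI system, wrapping the apply in coreutils `timeout` makes a stalled run fail fast with a distinguishable exit code (124) instead of an opaque pipeline kill. A sketch (`timed_apply` and `APPLY_TIMEOUT` are illustrative names, not Terraform options):

```shell
#!/bin/sh
# Sketch: run apply under a local timeout so a hang surfaces as exit 124
# with a clear message, rather than a silent pipeline kill.
APPLY_TIMEOUT="${APPLY_TIMEOUT:-90m}"  # assumed knob; tune per pipeline
timed_apply() {
  timeout "$APPLY_TIMEOUT" terraform apply -auto-approve "$@"
  rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "terraform apply exceeded $APPLY_TIMEOUT" >&2
  fi
  return "$rc"
}
```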
Linux-specific considerations: Check cgroup memory limits and ulimit settings, which may artificially constrain available memory even when the host has plenty:
# cgroup v2: effective memory limit (or "max" if unlimited)
cat /sys/fs/cgroup/memory.max
# cgroup v1 equivalent
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
# Which memory cgroup is this shell in?
grep memory /proc/self/cgroup
Provider-specific issues: Some providers (the AWS provider among them) have shipped memory-leak regressions in specific releases. Always check provider changelogs and upgrade to patch releases.
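Because the memory controller sits at different paths under cgroup v1 and v2, a small helper can report the effective limit either way (a sketch; `cgroup_mem_limit` is an illustrative name, not a standard utility):

```shell
#!/bin/sh
# Sketch: print the effective cgroup memory limit for this process,
# handling both cgroup v2 (memory.max) and v1 (memory.limit_in_bytes).
cgroup_mem_limit() {
  if [ -f /sys/fs/cgroup/memory.max ]; then                      # cgroup v2
    cat /sys/fs/cgroup/memory.max
  elif [ -f /sys/fs/cgroup/memory/memory.limit_in_bytes ]; then  # cgroup v1
    cat /sys/fs/cgroup/memory/memory.limit_in_bytes
  else
    echo "unknown"
  fi
}
cgroup_mem_limit
```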
For rootless Docker or constrained environments: Consider using -refresh=false to skip state refresh (speeds up operations and uses less memory), but only if you're certain remote state matches:
terraform apply -refresh=false -parallelism=2