The OfferReplacePending error occurs when Terraform tries to change Cosmos DB throughput (RU/s) while a previous scaling operation is still in progress. Azure only allows one throughput change at a time per container or database. Resolve this by waiting for pending operations or restructuring your Terraform configuration to apply changes sequentially.
Azure Cosmos DB throughput scaling operations (changing RU/s) can take significant time to complete, especially when scaling up requires adding new physical partitions. While such a change is in progress, the offer is flagged `offerReplacePending` and Azure temporarily blocks new throughput change requests. If Terraform attempts to create or modify a Cosmos DB resource during that window, the Azure API returns an error indicating that an offer replacement is already in progress. This safeguard prevents conflicting or overlapping scaling operations that could corrupt the database state.
The error typically manifests as:
- `OfferReplacePending` when checking the offer status
- HTTP 423 (Locked) response from the Azure API
- Terraform apply failure with the message "OfferReplacePending"
First, verify that there's actually a pending operation. You can check this in the Azure Portal:
1. Navigate to your Cosmos DB account
2. Open Data Explorer, select the database or container, and open Scale or Scale & Settings (depending on your resource type)
3. Look for any indication of ongoing scaling or throughput changes
4. Alternatively, use the Azure CLI:
# Get database details
az cosmosdb sql database show --resource-group <RG> --account-name <account-name> --name <database-name>
# Check for offer replacement status
az cosmosdb sql database throughput show --resource-group <RG> --account-name <account-name> --name <database-name>

If the output shows `offerReplacePending` set to true, an operation is in progress.
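The CLI check above covers database-level (shared) throughput. A container with dedicated throughput has its own offer, so it may be worth checking separately. The following is a minimal sketch, assuming the Azure CLI is logged in and that the flag appears under `resource.offerReplacePending` in the JSON output; the function name `container_offer_pending` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the offerReplacePending flag for one container.
# Assumption: the flag is exposed at `resource.offerReplacePending`
# in the output of `az cosmosdb sql container throughput show`.
container_offer_pending() {
  local rg="$1" account="$2" database="$3" container="$4"
  az cosmosdb sql container throughput show \
    --resource-group "$rg" \
    --account-name "$account" \
    --database-name "$database" \
    --name "$container" \
    --query "resource.offerReplacePending" \
    --output tsv
}
```

Calling `container_offer_pending <RG> <account-name> <database-name> <container-name>` should print the flag's value for that container.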
If a throughput scaling operation is pending, you must wait for it to finish before Terraform can apply new changes. Scaling duration depends on the size of the change:
- Small increases (within current partition capacity): Usually 30 seconds to a few minutes
- Large increases (requiring partition splits): Can take 4-6 hours
Do not interrupt the operation. Check the Activity Log in Azure Portal to monitor progress:
1. Go to your Cosmos DB account
2. Click Activity Log (usually in the left sidebar)
3. Look for entries like "Update Cosmos DB Throughput" or "Replace Offer"
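4. Note the timestamp and status of each entry (the Activity Log shows when an operation started and whether it has finished, not an estimated completion time)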
Once the operation completes, the offerReplacePending status will clear.
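Rather than checking by hand, the wait can be scripted. The sketch below polls the database's throughput settings until the flag clears or a timeout is reached. It assumes the flag is exposed as `resource.offerReplacePending`; the function name `wait_for_db_offer` and the default interval and timeout are illustrative choices, not values taken from Azure documentation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: blocks until a database's pending throughput change clears.
# Usage: wait_for_db_offer <rg> <account> <database> [interval_s] [timeout_s]
wait_for_db_offer() {
  local rg="$1" account="$2" database="$3"
  local interval="${4:-60}" timeout="${5:-21600}"   # assumed defaults: 60 s poll, 6 h limit
  local waited=0 pending

  while :; do
    pending="$(az cosmosdb sql database throughput show \
      --resource-group "$rg" \
      --account-name "$account" \
      --name "$database" \
      --query "resource.offerReplacePending" \
      --output tsv)" || return 2        # CLI failure (auth, typo, network)

    # Anything other than "true" is treated as "no pending change".
    case "$pending" in
      true|True) ;;
      *) return 0 ;;
    esac

    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${waited}s waiting for throughput change" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

For example, `wait_for_db_offer <RG> <account-name> <database-name> && terraform apply` would hold the apply until the flag clears. The six-hour default mirrors the upper end of the scaling times quoted above and can be lowered for small changes.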
Prevent parallel apply operations by adding explicit dependencies in your Terraform configuration:
Instead of:
resource "azurerm_cosmosdb_sql_database" "main" {
  # ...
  throughput = 400
}
resource "azurerm_cosmosdb_sql_container" "container1" {
  # ...
  throughput = 400
}
resource "azurerm_cosmosdb_sql_container" "container2" {
  # ...
  throughput = 400
}

Use explicit depends_on to serialize changes:
resource "azurerm_cosmosdb_sql_database" "main" {
  # ...
  throughput = 400
}
resource "azurerm_cosmosdb_sql_container" "container1" {
  # ...
  throughput = 400
  depends_on = [azurerm_cosmosdb_sql_database.main]
}
resource "azurerm_cosmosdb_sql_container" "container2" {
  # ...
  throughput = 400
  depends_on = [azurerm_cosmosdb_sql_container.container1]
}

This ensures Terraform waits for each resource to fully provision before creating the next one.
Limit how many resources Terraform creates simultaneously by using the -parallelism flag:
terraform apply -parallelism=1

Setting parallelism to 1 forces Terraform to apply resources one at a time, completely avoiding concurrent throughput operations. This is slower but eliminates conflicts.
For a balance between speed and safety, you can use:
terraform apply -parallelism=2

This creates up to 2 resources in parallel while still reducing conflicts.
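If the flag is easy to forget, for instance in a CI job, Terraform also reads extra arguments for a subcommand from a `TF_CLI_ARGS_<subcommand>` environment variable. A minimal sketch, assuming a POSIX-style shell:

```shell
# Every `terraform apply` started from this shell session picks up the flag.
export TF_CLI_ARGS_apply="-parallelism=1"
```

This keeps the setting next to the pipeline configuration instead of relying on each caller to pass it.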
If you're changing throughput at both the database and container level, consider splitting this into two Terraform applies:
First, apply database-level changes:
resource "azurerm_cosmosdb_sql_database" "main" {
  # ... other config
  throughput = 1000 # Database-level throughput
}

Then, after the database scaling completes, apply container-level changes:
resource "azurerm_cosmosdb_sql_container" "example" {
  # ... other config
  throughput = 500 # Container-level throughput
}

You can automate this with a time-based delay, using the time_sleep resource from the hashicorp/time provider:
resource "time_sleep" "wait_for_cosmosdb" {
  depends_on      = [azurerm_cosmosdb_sql_database.main]
  create_duration = "5m"
}
resource "azurerm_cosmosdb_sql_container" "example" {
  # ...
  depends_on = [time_sleep.wait_for_cosmosdb]
}

Once pending operations finish, retry your Terraform deployment:
terraform plan
terraform apply

If you continue to see OfferReplacePending errors, it may indicate:
1. Another operation just started (wait longer)
2. A bug in the Azure provider version (update it)
3. Conflicting manual changes (sync state with terraform apply -refresh-only, which replaces the deprecated terraform refresh)
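For the first case (another operation is still running), a retry with backoff can stand in for manual re-runs. The wrapper below is a sketch under stated assumptions: it treats any failure whose output contains the string "OfferReplacePending" as retryable and returns every other failure immediately. The function name `apply_with_retry`, the attempt count, and the delays are hypothetical. Because output is captured to a log file, it runs non-interactively with `-auto-approve`.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: re-runs `terraform apply` only when the failure
# output mentions OfferReplacePending.
# Usage: apply_with_retry [max_attempts] [initial_delay_s]
apply_with_retry() {
  local max_attempts="${1:-5}" delay="${2:-120}"
  local attempt=1 log status
  log="$(mktemp)"

  while :; do
    # Capture stdout and stderr so the error text can be inspected.
    if terraform apply -auto-approve -input=false >"$log" 2>&1; then
      status=0
    else
      status=$?
    fi
    cat "$log"

    # Stop on success or on any error unrelated to a pending offer.
    if [ "$status" -eq 0 ] || ! grep -q "OfferReplacePending" "$log"; then
      rm -f "$log"
      return "$status"
    fi
    # Stop once the attempt budget is used up.
    if [ "$attempt" -ge "$max_attempts" ]; then
      rm -f "$log"
      return "$status"
    fi

    echo "Attempt ${attempt}/${max_attempts}: throughput change still pending; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))   # exponential backoff
  done
}
```

Short delays suit small throughput changes; for a change that triggers partition splits, waiting for the pending flag to clear is likely a better fit than repeated applies.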
Update your Azure provider if it's outdated:
terraform init -upgrade
terraform apply

If the issue persists, review the Azure Activity Log for any ongoing operations you may have missed.
Understanding offer replacement pending: The offerReplacePending flag in Cosmos DB's internal state indicates that a PUT request to the offer resource has been received but not yet fully processed. Azure needs to validate the change, check partition limits, and potentially rebalance physical partitions. During this time, subsequent modifications are blocked.
Scaling time expectations: Small throughput increases (e.g., 400 RU/s to 500 RU/s) complete quickly. However, if your new throughput requirement exceeds what your current physical partition layout can support, Azure must split partitions, which can take 4-6 hours or longer depending on data volume and complexity.
Autoscale vs. Manual throughput: If using autoscale, Azure automatically manages throughput scaling based on demand. Mixing manual Terraform changes with autoscale can cause unexpected pending operations. Consider using either pure autoscale (managed by Azure) or pure manual (managed by Terraform), not both.
Terraform provider version: Older versions of the Azure provider (azurerm) may not handle OfferReplacePending gracefully. Update to the latest version to get better retry logic and error handling:
terraform init -upgrade

State management: Ensure your Terraform state is stored in a remote backend (Azure Storage, Terraform Cloud, etc.). If your state is lost or corrupted, Terraform may try to re-create resources, triggering additional OfferReplacePending operations.