This error occurs when the Elasticsearch cluster state has not yet been fully recovered or initialized after startup or node failures. The cluster is in a transitional state and temporarily blocks operations. To resolve it, wait for the cluster to recover, verify node connectivity, ensure adequate disk space, and check the cluster configuration.
The "ClusterBlockException: blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized]" error indicates that the Elasticsearch cluster is in the process of recovering its state and is not yet ready to serve requests. When Elasticsearch starts up or after node failures occur, the cluster must go through a state recovery phase. During this phase, the master node collects metadata from all nodes and rebuilds the cluster state. Until this process completes, all data operations are blocked with a SERVICE_UNAVAILABLE exception (HTTP 503). This is a protective mechanism to prevent clients from receiving incomplete or incorrect results while the cluster is unstable. Once all nodes join the cluster and the state is fully recovered, the block is automatically removed and operations resume normally.
First, examine what state the cluster is in and what operations are pending:
# Check cluster health
curl -X GET "localhost:9200/_cluster/health?pretty" -u "username:password"
# List pending cluster tasks (may timeout during recovery)
curl -X GET "localhost:9200/_cluster/pending_tasks?pretty" -u "username:password"
# Check cluster state version
curl -X GET "localhost:9200/_cluster/state/metadata?pretty" -u "username:password"
# Get node discovery status
curl -X GET "localhost:9200/_nodes?pretty" -u "username:password"
# Check node count
curl -X GET "localhost:9200/_cat/nodes?v" -u "username:password"Look for:
- All expected nodes present in the node list
- Active master node elected
- Number of pending tasks (high number = still recovering)
- Cluster version number (should be incrementing)
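To pull just those fields in a single call, the health request can be combined with the standard filter_path response filter (the field names below are part of the normal _cluster/health response):
# Show only the recovery-relevant health fields
curl -X GET "localhost:9200/_cluster/health?filter_path=status,number_of_nodes,number_of_pending_tasks,initializing_shards,unassigned_shards&pretty" -u "username:password"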
Insufficient disk space can prevent nodes from starting and cause state recovery to fail:
# Check disk usage on all nodes
curl -X GET "localhost:9200/_nodes/stats/fs?pretty" -u "username:password"
# More detailed allocation information
curl -X GET "localhost:9200/_cat/allocation?v&pretty" -u "username:password"
# On each node, check filesystem directly
df -h /path/to/elasticsearch/data
Ensure:
- All nodes have at least 10% free disk space
- No node is above 85% disk usage
- Elasticsearch data directories are not full
If disk space is low:
# Delete old indices to free space
curl -X DELETE "localhost:9200/old-index-name" -u "username:password"
# Add new disk volume and restart node
# Or delete expired snapshots if present
curl -X DELETE "localhost:9200/_snapshot/repo-name/snapshot-name" -u "username:password"Network connectivity issues can prevent cluster formation:
# Check node connectivity from primary
curl -X GET "localhost:9200/_nodes?pretty" -u "username:password"
# On each node, check the logs for connection errors
tail -f /var/log/elasticsearch/elasticsearch.log | grep -i "connection\|error\|discovery"
# Verify network connectivity between nodes manually
ping <other-node-ip>
telnet <other-node-ip> 9300
# Check discovery settings in elasticsearch.yml:
# discovery.seed_hosts: ["node1:9300", "node2:9300", "node3:9300"]
# cluster.initial_master_nodes: ["node1", "node2", "node3"]
Ensure:
- All nodes can reach the transport port (default 9300) on all other nodes
- Discovery seed hosts are correctly configured
- Firewall rules allow node-to-node communication
- Nodes are on the same network or have proper routing
In many cases, the cluster will recover automatically given enough time:
# Monitor recovery progress
watch -n 5 'curl -s "localhost:9200/_cluster/health?pretty" -u "username:password" | head -20'
# Wait for the cluster to reach at least yellow status (returns when reached or when the timeout expires)
curl -X GET "localhost:9200/_cluster/health?wait_for_status=yellow&timeout=30s&pretty" -u "username:password"
# For large clusters, recovery can take several minutes to hours
# Increase the timeout and monitor logs
Recovery timeline:
- Small clusters (< 5 nodes): Usually 30 seconds to 2 minutes
- Medium clusters (5-20 nodes): 2-10 minutes
- Large clusters (> 20 nodes): 10+ minutes depending on shard count
Do NOT force restart nodes while recovery is in progress.
Incorrect gateway settings can cause the cluster to wait indefinitely:
# Check current gateway settings
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&filter_path=**.gateway.*&pretty" -u "username:password"
# For a 3-node cluster, check these settings in elasticsearch.yml:
# gateway.recover_after_nodes: 2
# gateway.recover_after_time: 5m
# gateway.expected_nodes: 3
# Verify the minimum master nodes setting (Elasticsearch 6.x and earlier)
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&filter_path=**.discovery.zen.*&pretty" -u "username:password"
# For Elasticsearch 7.0+, check the node-level setting instead:
curl -X GET "localhost:9200/_nodes/settings?filter_path=**.cluster.initial_master_nodes&pretty" -u "username:password"
Guidelines for gateway settings:
- gateway.recover_after_nodes: Set to (total_nodes / 2) + 1
- gateway.expected_nodes: Set to actual number of nodes
- gateway.recover_after_time: Set to 5-10 minutes for initial recovery
Example for a 3-node cluster (the gateway.* settings are static node settings, so they belong in elasticsearch.yml on each master-eligible node and take effect after a restart; they cannot be changed through the cluster settings API):
# elasticsearch.yml on each master-eligible node:
# gateway.recover_after_nodes: 2
# gateway.expected_nodes: 3
# gateway.recover_after_time: 5m
# Restart the node for the change to take effect:
systemctl restart elasticsearch
As a last resort, if the cluster is completely stuck and unresponsive:
# CAUTION: Only use these steps if recovery has not completed after 30+ minutes
# and all nodes are present and reachable
# Option 1: Restart the master node
# 1. Identify current master
curl -X GET "localhost:9200/_nodes?filter_path=nodes.*.name,nodes.*.master_node&pretty" -u "username:password"
# 2. Stop the master node gracefully
# On the master node server:
pkill -TERM -f "org.elasticsearch.bootstrap.Elasticsearch"
# 3. Wait 30 seconds, then restart it
systemctl restart elasticsearch
# or
./bin/elasticsearch
# Option 2: Temporarily lower the minimum master nodes requirement (Elasticsearch 6.x and earlier only; the setting is ignored in 7.0+)
curl -X PUT "localhost:9200/_cluster/settings?pretty" -u "username:password" -H 'Content-Type: application/json' -d'
{
"transient": {
"discovery.zen.minimum_master_nodes": 1
}
}
'
# Option 3: Reset cluster state (destructive - last resort only!)
# Stop all nodes completely, delete data/nodes directory, restart fresh
# This will lose all cluster state but allows starting over:
systemctl stop elasticsearch
rm -rf /var/lib/elasticsearch/nodes/0
systemctl start elasticsearch
Warning: Options 2 and 3 can cause data loss or corruption. Only attempt after confirming normal recovery will not work.
## Advanced Cluster Recovery Topics
### Understanding Cluster State Recovery
The Elasticsearch cluster state contains:
- Cluster metadata (indices, shards, mappings)
- Node information
- Cluster settings
- Index settings and aliases
After a master election, the new master must:
1. Collect metadata from all nodes
2. Rebuild the full cluster state
3. Determine shard allocation
4. Assign shards to nodes
5. Announce the new state to all nodes
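To see how far the master has gotten, selected sections of the cluster state can be requested directly; master_node, nodes, and blocks are standard metrics of the _cluster/state API, and the blocks section lists the SERVICE_UNAVAILABLE/1 block while it is still active (note that this call may itself be refused until enough of the state is available):
# Inspect the elected master, known nodes, and any active cluster blocks
curl -X GET "localhost:9200/_cluster/state/master_node,nodes,blocks?pretty" -u "username:password"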
### Gateway Recovery Process
The gateway module persists cluster state to disk on all nodes:
- When cluster starts, nodes load persisted state
- Master waits for minimum nodes to report their state
- After gateway.recover_after_time, recovery proceeds even if not all nodes present
- If a node is still missing after recovery, its shards become unassigned
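If recovery proceeded without some nodes, the resulting unassigned shards can be listed directly; unassigned.reason is a standard _cat/shards column:
# List shards left unassigned after recovery, with the reason they are unassigned
curl -X GET "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason" -u "username:password" | grep UNASSIGNED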
### Common Scenarios
Multi-Node Failure Scenario:
If multiple nodes crash simultaneously:
1. Remaining nodes detect failures via heartbeat timeout (default 30s)
2. Master initiates re-election if quorum still exists
3. New master waits for gateway.recover_after_nodes to join
4. Once met, recovery proceeds and shards are re-allocated
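After the re-election completes, the winner can be confirmed with the _cat/master endpoint:
# Confirm which node is currently the elected master
curl -X GET "localhost:9200/_cat/master?v" -u "username:password"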
Single-Node Cluster:
Single-node clusters have special requirements:
- No quorum needed
- Recovery happens immediately on startup
- gateway.recover_after_nodes: 1 is typical
- No replication, so data loss if node fails
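A minimal configuration sketch for such a node, assuming Elasticsearch 7.0+ where discovery.type: single-node skips master election and quorum checks (cluster and node names are placeholders):
# elasticsearch.yml on the standalone node:
# cluster.name: my-single-node-cluster
# node.name: node-1
# discovery.type: single-node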
Split Brain Prevention:
Elasticsearch prevents split-brain via quorum:
- Quorum of master-eligible nodes = (total_master_eligible_nodes / 2) + 1
- Avoid running the cluster with an even number of master-eligible nodes
- Three, five, or seven master-eligible nodes are ideal
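For Elasticsearch 6.x and earlier the quorum had to be configured explicitly; a sketch for three master-eligible nodes (7.0+ computes the quorum automatically and ignores this setting):
# elasticsearch.yml on each master-eligible node (6.x and earlier):
# discovery.zen.minimum_master_nodes: 2   # (3 / 2) + 1 = 2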
### Monitoring Recovery Progress
# Check recovery status
curl -X GET "localhost:9200/_recovery?human&pretty" -u "username:password"
# Monitor shard allocation
curl -X GET "localhost:9200/_cat/shards?h=index,shard,prirep,state,node&v" -u "username:password"
# Check cluster routing decisions
curl -X GET "localhost:9200/_cluster/allocation/explain?pretty" -u "username:password"### Performance Tuning for Recovery
# Increase recovery parallelism for faster recovery
curl -X PUT "localhost:9200/_cluster/settings?pretty" -u "username:password" -H 'Content-Type: application/json' -d'
{
"transient": {
"cluster.routing.allocation.node_concurrent_recoveries": 5,
"indices.recovery.max_bytes_per_sec": "200mb",
"indices.recovery.concurrent_streams": 5
}
}
'
# For very large indices, reduce recovery pressure
curl -X PUT "localhost:9200/_cluster/settings?pretty" -u "username:password" -H 'Content-Type: application/json' -d'
{
"transient": {
"indices.recovery.max_bytes_per_sec": "50mb",
"cluster.routing.allocation.node_concurrent_recoveries": 2
}
}
'
### Disk-Based Block vs State Recovery Block
- Disk block: Related to disk space thresholds (FORBIDDEN/12)
- State recovery block: Related to cluster formation (SERVICE_UNAVAILABLE/1)
The SERVICE_UNAVAILABLE/1 (state recovery) block applies when:
- Master hasn't recovered cluster state yet
- Not enough nodes in cluster
- Cluster metadata is corrupted
FORBIDDEN (disk and read-only) blocks apply when:
- Disk usage exceeds thresholds
- Index-level blocks applied
- Read-only settings enforced
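If the block turns out to be the disk-based one rather than state recovery, the flood-stage read-only flag can be cleared once disk space has been freed; index.blocks.read_only_allow_delete is the standard setting behind that block:
# Clear the flood-stage read-only block on all indices after freeing disk space
curl -X PUT "localhost:9200/_all/_settings?pretty" -u "username:password" -H 'Content-Type: application/json' -d'
{
"index.blocks.read_only_allow_delete": null
}
'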
### Long-Running Recovery Scenarios
For very large clusters:
1. Create a separate recovery cluster if possible
2. Temporarily reduce shard count
3. Use forced merge to consolidate segments
4. Increase heap size during recovery (Xms and Xmx equal)
5. Monitor CPU and I/O, not just network
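Heap size is set through JVM options rather than elasticsearch.yml; a sketch assuming a 7.7+ package install that supports the jvm.options.d directory (the 16g value is only an example; keep heap at roughly half of RAM and below ~30 GB):
# /etc/elasticsearch/jvm.options.d/heap.options
# -Xms16g
# -Xmx16g
# Restart the node afterwards for the new heap size to apply:
systemctl restart elasticsearch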