This error occurs when Elasticsearch cannot complete a search because every shard responsible for the search operation has failed. Shards are the basic units of data storage and search execution in Elasticsearch; when all of them fail, no part of the query can be answered.

The "SearchPhaseExecutionException: all shards failed" error indicates that a search operation has failed completely because every shard involved in processing the query encountered an error. In Elasticsearch's distributed architecture, indices are divided into primary and replica shards distributed across the nodes of the cluster. When you execute a search, the query is sent to all relevant shards and their results are aggregated.

Unlike partial failures, where some shards succeed and others fail, a complete failure suggests a systemic issue with the cluster state, shard allocation, or the data being searched. Common scenarios include red cluster status (unassigned primary shards), network partitions cutting nodes off from the cluster, corrupted indices, and resource exhaustion on the nodes hosting the shards. The error is particularly critical because no search results at all can be returned for the affected indices.
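To make the failure mode concrete, here is a minimal Python sketch (purely illustrative, not Elasticsearch internals) of that scatter-gather flow: partial shard failures still return results, and only the all-shards case fails the search:

```python
# Illustrative scatter-gather, not Elasticsearch source: a coordinating
# node fans a query out to every shard and aggregates results. Only when
# every shard fails does the whole search fail.

def ok_shard(query):
    return [f"{query}:hit"]

def failing_shard(query):
    raise IOError("shard unavailable")

def search_across_shards(shards, query):
    """Collect per-shard results; raise only if all shards failed."""
    results, failures = [], []
    for shard in shards:
        try:
            results.extend(shard(query))
        except Exception as exc:
            failures.append(exc)
    if shards and len(failures) == len(shards):
        raise RuntimeError("all shards failed")
    return results  # partial results are still returned

print(search_across_shards([ok_shard, failing_shard], "q"))  # ['q:hit']
```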
First, examine the overall cluster health and identify which shards are failing:
```bash
# Check cluster health (look for RED status)
curl -X GET "localhost:9200/_cluster/health?pretty" -u "username:password"

# Get detailed shard allocation information
curl -X GET "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,docs,store,ip,node,unassigned.reason&s=state" -u "username:password"

# Check for unassigned shards specifically (the _cat API has no state filter, so grep)
curl -X GET "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason&s=state" -u "username:password" | grep UNASSIGNED

# View cluster allocation explanation
curl -X GET "localhost:9200/_cluster/allocation/explain?pretty" -u "username:password"
```

Look for:

- RED cluster health: indicates missing primary shards
- UNASSIGNED shards: shards not allocated to any node
- Specific error messages in the allocation explanation
- Patterns: are failures limited to specific indices or nodes?
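The health response can also be triaged programmatically; a minimal Python sketch over a hand-written sample body (field names follow the `_cluster/health` API, values are made up):

```python
import json

# Sample body in the shape returned by GET _cluster/health
# (values here are invented for illustration).
sample = json.loads('''
{
  "cluster_name": "my-cluster",
  "status": "red",
  "active_shards": 20,
  "unassigned_shards": 4
}
''')

def triage(health):
    """Map a cluster-health body to a short verdict."""
    status = health["status"]
    unassigned = health.get("unassigned_shards", 0)
    if status == "red":
        return f"critical: primaries missing ({unassigned} unassigned shards)"
    if status == "yellow":
        return f"degraded: replicas unassigned ({unassigned} unassigned shards)"
    return "healthy: all shards assigned"

print(triage(sample))  # critical: primaries missing (4 unassigned shards)
```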
Check individual node health and resource availability:
```bash
# Check node stats and resource usage
curl -X GET "localhost:9200/_nodes/stats?pretty" -u "username:password"

# Check node disk usage (look for high usage)
curl -X GET "localhost:9200/_cat/nodes?v&h=name,heap.percent,ram.percent,cpu,load_1m,disk.used_percent,disk.avail" -u "username:password"

# Check thread pool statistics for search queues
curl -X GET "localhost:9200/_cat/thread_pool/search?v&h=node_name,name,active,queue,rejected,completed" -u "username:password"

# Inspect hot threads (CPU-heavy or GC-bound threads; returns plain text)
curl -X GET "localhost:9200/_nodes/hot_threads" -u "username:password"
```

Common resource issues:

- Disk space above the watermarks: past the high watermark (90% by default) shards are moved away; at the flood stage (95% by default) indices are marked read-only
- High heap usage: can cause long JVM GC pauses and shard failures
- Thread pool rejections: search queues overwhelmed
- Network connectivity: check node-to-node communication
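Elasticsearch's default disk watermarks are 85% (low), 90% (high), and 95% (flood stage). A small Python sketch of that classification logic; treat the thresholds as assumptions if your cluster overrides the defaults:

```python
# Classify a node's disk usage against the default disk watermarks
# (low 85%, high 90%, flood_stage 95%). Adjust these constants if your
# cluster configures different thresholds.

WATERMARKS = (("flood_stage", 95.0), ("high", 90.0), ("low", 85.0))

def disk_watermark(used_percent):
    """Return the highest watermark breached, or None if disk is healthy."""
    for name, threshold in WATERMARKS:
        if used_percent >= threshold:
            return name
    return None

print(disk_watermark(96.5))  # flood_stage (indices get a read-only block)
print(disk_watermark(91.0))  # high (shards are relocated off the node)
print(disk_watermark(70.0))  # None (healthy)
```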
Based on the investigation, take appropriate action:
If disk space is low:
```bash
# Check disk watermarks
curl -X GET "localhost:9200/_cluster/settings?include_defaults=true&filter_path=*.disk.*" -u "username:password"

# Free up disk space or add more storage.
# Temporarily adjust watermarks (only if absolutely necessary):
curl -X PUT "localhost:9200/_cluster/settings" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "98%"
  }
}
'
```

If shards are unassigned due to allocation issues:
```bash
# Retry allocation of shards that failed too many times
curl -X POST "localhost:9200/_cluster/reroute?retry_failed=true" -u "username:password"

# Manually allocate a specific shard (use with caution)
curl -X POST "localhost:9200/_cluster/reroute" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "commands": [
    {
      "allocate_stale_primary": {
        "index": "my-index",
        "shard": 0,
        "node": "target-node-name",
        "accept_data_loss": true
      }
    }
  ]
}
'
```

Warning: manual allocation with `accept_data_loss: true` can cause data loss, since a stale copy may be missing the most recent writes. Only use it when you have backups or can tolerate losing data.
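Before rerouting anything, it helps to enumerate exactly which shards are unassigned and why. A Python sketch over sample `_cat/shards` text output (column layout assumed from the `h=` parameters used in the diagnosis step; real reason codes include `NODE_LEFT` and `REPLICA_ADDED`):

```python
# Parse text output of:
#   GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason
# The sample mimics the _cat column layout; real column widths vary.

sample = """index      shard prirep state      unassigned.reason
my-index   0     p      STARTED
my-index   1     p      UNASSIGNED NODE_LEFT
my-index   1     r      UNASSIGNED REPLICA_ADDED
"""

def unassigned_shards(cat_output):
    """Return (index, shard, prirep, reason) for each unassigned shard."""
    rows = []
    for line in cat_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) >= 4 and parts[3] == "UNASSIGNED":
            reason = parts[4] if len(parts) > 4 else "unknown"
            rows.append((parts[0], parts[1], parts[2], reason))
    return rows

for index, shard, prirep, reason in unassigned_shards(sample):
    print(index, shard, prirep, reason)
```

Unassigned primaries (`p`) are the ones that make searches fail outright; unassigned replicas only reduce redundancy.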
If shards are corrupted and cannot be recovered:
Option 1: Restore from snapshot
```bash
# List available snapshots
curl -X GET "localhost:9200/_snapshot/my-repository/_all?pretty" -u "username:password"

# Restore specific indices
curl -X POST "localhost:9200/_snapshot/my-repository/my-snapshot/_restore" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "indices": ["my-index-*"],
  "ignore_unavailable": true,
  "include_global_state": false
}
'
```

Option 2: Reindex from source (if data exists elsewhere)
```bash
# Create new index with proper settings
curl -X PUT "localhost:9200/my-index-new" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
'

# Reindex from source (if you have another data source)
curl -X POST "localhost:9200/_reindex" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "my-index-backup"
  },
  "dest": {
    "index": "my-index-new"
  }
}
'
```

Option 3: Delete and recreate (last resort, data loss)
```bash
# Only if you can afford to lose the data
curl -X DELETE "localhost:9200/my-corrupted-index" -u "username:password"
```

Set up monitoring and preventive configurations:
```yaml
# Example: Elasticsearch configuration (elasticsearch.yml)
cluster.routing.allocation.disk.threshold_enabled: true
cluster.routing.allocation.disk.watermark.low: "85%"
cluster.routing.allocation.disk.watermark.high: "90%"
cluster.routing.allocation.disk.watermark.flood_stage: "95%"

# Enable shard allocation awareness
cluster.routing.allocation.awareness.attributes: rack,zone

# Set appropriate shard limits
cluster.max_shards_per_node: 1000
cluster.max_shards_per_node.frozen: 3000

# Configure thread pools
thread_pool.search.queue_size: 1000
thread_pool.search.size: 50
```

Monitoring setup:
1. Cluster health alerts: Monitor for RED/YELLOW status
2. Disk space alerts: Alert at 80% usage
3. Shard allocation monitoring: Track unassigned shards
4. JVM monitoring: Heap usage, GC pauses
5. Search error rate: Alert on increased search failures
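The monitoring checklist above can be sketched as a single evaluation function; the thresholds here (80% disk, 85% heap, 5% search failure rate) are illustrative assumptions to tune for your environment:

```python
# Evaluate cluster metrics against alerting thresholds. The threshold
# values are assumptions, not Elasticsearch defaults; tune as needed.

def alerts(metrics):
    """metrics: dict with status, disk_used_percent, heap_used_percent,
    unassigned_shards, and search_failure_rate (0..1). Returns alert texts."""
    out = []
    if metrics.get("status") in ("red", "yellow"):
        out.append(f"cluster status is {metrics['status']}")
    if metrics.get("disk_used_percent", 0) >= 80:
        out.append("disk usage above 80%")
    if metrics.get("heap_used_percent", 0) >= 85:
        out.append("heap usage above 85%")
    if metrics.get("unassigned_shards", 0) > 0:
        out.append("unassigned shards present")
    if metrics.get("search_failure_rate", 0.0) > 0.05:
        out.append("elevated search failure rate")
    return out

print(alerts({"status": "red", "disk_used_percent": 92, "unassigned_shards": 3}))
```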
Regular maintenance:
- Regular snapshot backups
- Index lifecycle management for old indices
- Regular cluster health checks
- Capacity planning and scaling
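As a sketch of the snapshot-retention half of that maintenance routine (snapshot names and the 14-day window below are hypothetical; Elasticsearch's snapshot lifecycle management can enforce retention natively):

```python
from datetime import datetime, timedelta, timezone

# Retention rule for regular snapshot backups: keep snapshots newer than
# `keep_days`, flag the rest for deletion. Names/timestamps are made up.

def expired(snapshots, keep_days, now):
    """Return names of snapshots older than the retention window."""
    cutoff = now - timedelta(days=keep_days)
    return [name for name, taken in snapshots if taken < cutoff]

now = datetime(2024, 6, 30, tzinfo=timezone.utc)
snaps = [
    ("nightly-2024-06-01", datetime(2024, 6, 1, tzinfo=timezone.utc)),
    ("nightly-2024-06-29", datetime(2024, 6, 29, tzinfo=timezone.utc)),
]
print(expired(snaps, keep_days=14, now=now))  # ['nightly-2024-06-01']
```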
## Advanced Troubleshooting
### Shard Corruption Recovery
When dealing with corrupted shards, consider these advanced recovery techniques:
1. Use the elasticsearch-shard tool: for Lucene index corruption, use the `elasticsearch-shard` tool to diagnose and potentially remove the corrupted data (stop the node holding the shard first):

```bash
bin/elasticsearch-shard remove-corrupted-data --index my-index --shard-id 0
```

2. Segment merges: force merge to consolidate segments:

```bash
curl -X POST "localhost:9200/my-index/_forcemerge?max_num_segments=1" -u "username:password"
```

3. Translog recovery: if the corruption is in the write-ahead log, shrink translog retention:

```bash
# Close index
curl -X POST "localhost:9200/my-index/_close" -u "username:password"

# Shrink translog retention (dangerous - can lose recent writes)
curl -X PUT "localhost:9200/my-index/_settings" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "index": {
    "translog.durability": "async",
    "translog.retention.size": "1b"
  }
}
'

# Reopen index
curl -X POST "localhost:9200/my-index/_open" -u "username:password"
```

### Distributed System Considerations
- Network Partitions: prevent split brain; pre-7.x clusters need discovery.zen.minimum_master_nodes set to a quorum of master-eligible nodes, while 7.x and later manage the master voting configuration automatically
- Zone Awareness: Configure awareness attributes to ensure replica shards are in different failure domains
- Slow Logs: Enable search slow logs to identify problematic queries before they cause shard failures
### Performance Tuning
- Search Circuit Breakers: configure circuit breakers to prevent resource exhaustion:

```yaml
indices.breaker.total.limit: 70%
indices.breaker.fielddata.limit: 60%
indices.breaker.request.limit: 60%
```

- Query Optimization: use `"profile": true` in the search body to analyze slow queries and optimize them
- Caching: ensure field data and request caches are appropriately sized
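Since breaker limits are percentages of the JVM heap, it can help to translate them into byte budgets. A small sketch, assuming an 8 GB heap (the heap size is an assumption, and the percentages mirror the example settings above):

```python
# Convert circuit-breaker percentage limits into byte budgets for a
# given JVM heap. The 8 GB heap is an assumed example value.

HEAP_BYTES = 8 * 1024**3  # 8 GiB heap

limits = {
    "indices.breaker.total.limit": 70,
    "indices.breaker.fielddata.limit": 60,
    "indices.breaker.request.limit": 60,
}

budgets = {name: pct * HEAP_BYTES // 100 for name, pct in limits.items()}
for name, budget in budgets.items():
    print(f"{name}: {budget / 1024**3:.1f} GiB")
```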
### Security Considerations
- Role-Based Access Control: Ensure search roles have appropriate permissions
- Audit Logging: Enable audit logs to track access patterns
- TLS/SSL: Ensure node-to-node communication is encrypted to prevent MITM attacks
### Cloud-Specific Considerations
- AWS/GCP/Azure: Use managed disk snapshots for backups
- Kubernetes: Ensure persistent volume claims have sufficient storage
- Auto-scaling: Configure auto-scaling policies based on search load