This Elasticsearch error occurs when one or more primary shards in an index are not allocated or active in the cluster, and affected requests fail with an HTTP 503 Service Unavailable response. The cluster is in a red health state: writes to the affected shards fail, and searches may return incomplete results. Common causes include cluster recovery delays, insufficient disk space, and node failures.
UnavailableShardsException with "primary shard is not active" indicates that the cluster cannot find or activate a required primary shard for an index. Elasticsearch returns HTTP 503 (Service Unavailable) and rejects operations that depend on that shard. Unlike replica shards (which can be rebuilt from the primary), a missing primary shard means the data it holds is unavailable. This error typically occurs after cluster restarts, node failures, or when shards fail to reallocate due to resource constraints. Cluster health becomes RED when any primary shard is unassigned, meaning Elasticsearch cannot guarantee full index availability. This is a critical issue that requires immediate investigation.
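Note that write requests do not fail immediately: they wait for the primary shard to become active, up to the request timeout (one minute by default), before returning this exception. The wait can be adjusted per request; a minimal sketch, where my-index and the document body are placeholders:
PUT my-index/_doc/1?timeout=30s
{
  "message": "example document"
}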
First, verify the cluster health and identify which indices have unassigned shards:
GET _cluster/health
This returns the overall cluster state. If status is RED, you have unassigned primary shards. To see detailed shard allocation info:
GET _cat/shards?v&h=index,shard,prirep,state,node,unassigned.reason
Look for rows with state=UNASSIGNED and prirep=p (primary). The unassigned.reason column explains why the shard is unassigned.
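To narrow the problem down to specific indices first, the cluster health API can also report status per index; a quick sketch:
GET _cluster/health?level=indices
Indices reported as red are the ones with unassigned primary shards.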
Verify that all nodes have adequate disk space available:
GET _cat/nodes?v&h=name,disk.used,disk.avail,disk.used_percent
If any node shows disk usage at or above 85%, Elasticsearch will not allocate new shards to that node (the default low watermark threshold). Free up disk space by:
- Deleting old indices
- Increasing disk capacity
- Archiving data to external storage
Once disk usage drops below 85%, shards should begin reallocating automatically.
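The watermark thresholds themselves are dynamic cluster settings. As a stopgap while you free up space, they can be raised; the values below are illustrative, not recommendations:
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%"
  }
}
Set these keys back to null once disk usage is under control; running close to full disks risks hitting the flood-stage watermark, which makes indices read-only.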
After a node failure or restart, Elasticsearch delays reallocating the shards that were on the departed node (index.unassigned.node_left.delayed_timeout, one minute by default) to avoid unnecessary rebalancing if the node comes back quickly. Monitor progress:
GET _cat/recovery?v&h=index,shard,stage,files_recovered,files_total
If shards show a stage other than done (for example index or translog), recovery is in progress; let it complete. This can take minutes to hours for large indices. Do not interrupt the process. Check recovery progress periodically until all shards reach the done stage.
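If nodes routinely take longer than the default delay to return (for example during planned restarts), the delay can be lengthened so Elasticsearch does not start rebuilding shards prematurely. A minimal sketch that applies the setting to all indices:
PUT _all/_settings
{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "5m"
  }
}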
If shards remain unassigned after 5-10 minutes despite adequate disk space, manually trigger reroute to retry allocation:
POST _cluster/reroute?retry_failed=true
This retries allocation of shards that have exceeded the maximum number of allocation failures (index.allocation.max_retries, 5 by default). To see which deciders blocked allocation, use the allocation explain API described under Advanced Diagnostics below; common blocking deciders include:
- disk_threshold: the node is above a disk watermark
- awareness: allocation awareness rules (zones/racks) prevent placement on the node
- same_shard: a copy of the same shard is already on the node or host
After reroute, monitor cluster health again:
GET _cluster/health
If a primary shard is permanently lost and all replicas are unavailable, you must restore from a snapshot. First, list the registered snapshot repositories:
GET _snapshot
To list the available snapshots in a repository:
GET _snapshot/my-repo/_all
To restore a specific index:
POST _snapshot/my-repo/my-snapshot/_restore
{
"indices": "my-index"
}
This creates a new index from the snapshot. A restore fails if an open index with the same name already exists, so if the original broken index is blocking the restore, delete it first:
DELETE my-index
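Alternatively, if you want to keep the broken index around for inspection, the restore request can write the data under a new name using rename_pattern and rename_replacement; a sketch reusing the placeholder repository and snapshot names from above:
POST _snapshot/my-repo/my-snapshot/_restore
{
  "indices": "my-index",
  "rename_pattern": "my-index",
  "rename_replacement": "my-index-restored"
}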
Verify that all nodes in the cluster are running the same Elasticsearch version:
GET _cat/nodes?v&h=name,version
If versions differ, upgrade all nodes to the same version. Mixed-version clusters are supported only temporarily during a rolling upgrade, and shard copies cannot be allocated to nodes running an older version than the node holding the primary.
Perform a rolling upgrade (see the allocation-settings sketch after these steps):
1. Stop one node
2. Upgrade Elasticsearch and plugins
3. Restart the node
4. Wait for cluster to stabilize (shards rebalance)
5. Repeat for each remaining node
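During a rolling upgrade it is common practice to pause replica allocation while each node is down, so the cluster does not start copying shards around unnecessarily. A minimal sketch using the cluster settings API:
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
After the node rejoins, set cluster.routing.allocation.enable back to null (the default, all) and wait for the cluster to stabilize before moving on to the next node.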
Advanced Diagnostics:
Use the Cluster Allocation Explain API to get detailed reasons why a specific shard cannot be allocated:
GET _cluster/allocation/explain
{
"index": "my-index",
"shard": 0,
"primary": true
}
This shows all allocation deciders and which ones rejected the shard.
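If you are not sure which shard to ask about, the same API can be called without a body; Elasticsearch then picks an unassigned shard and explains it (it returns an error if no shards are unassigned):
GET _cluster/allocation/explain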
For persistent allocation issues, check the elected master node's logs: allocation decider decisions are logged there in detail.
If using managed Elasticsearch (AWS, Elastic Cloud), check service status pages and contact support if node health is affected.
For very large indices, recovery can take hours. Monitor progress using the _recovery endpoint, and do not trigger unnecessary reroutes during active recovery, as they can extend recovery time.
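For a per-index view limited to ongoing recoveries, the index recovery API accepts an active_only flag; a quick sketch, with my-index as a placeholder:
GET my-index/_recovery?active_only=true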
Consider setting index.number_of_replicas=0 as a temporary workaround on critical single-node clusters (where replicas can never be assigned anyway), but this disables fault tolerance and should only be used in testing or when the index can be deleted and recreated.
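A minimal sketch of that settings change (again, my-index is a placeholder; apply it only when you accept the loss of redundancy):
PUT my-index/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}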