How to fix SnapshotException: [repository:snapshot] Snapshot could not be read in Elasticsearch


This error occurs when Elasticsearch cannot read a snapshot from a repository during restore operations. Snapshots are backups of indices and cluster state stored in repository locations like shared filesystems, S3, Azure, or GCS. The error indicates that the snapshot metadata or data files are corrupted, inaccessible, or missing from the repository.

What this error means

The "SnapshotException: [repository:snapshot] Snapshot could not be read" error occurs when Elasticsearch attempts to restore a snapshot but cannot read the snapshot metadata or data files from the repository. Snapshots in Elasticsearch consist of two main components: metadata files (which describe what's in the snapshot) and data files (the actual index segments).

This error typically happens during snapshot restore operations when:

1. The snapshot metadata files are corrupted or incomplete
2. Repository permissions have changed, preventing read access
3. The repository location has been modified or files have been manually deleted
4. There's a version mismatch between the snapshot and the Elasticsearch cluster
5. Network or storage issues prevent accessing the repository

The error message includes "[repository:snapshot]", where "repository" is the name of your snapshot repository and "snapshot" is the specific snapshot name that cannot be read. This identifies exactly which snapshot is problematic.
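When the error shows up in logs, it can help to pull the repository and snapshot names out of the message so they can be fed into the API calls used in the steps below. The following is a minimal sketch; the function name `parse_snapshot_ref` is hypothetical, and the handling of a trailing `/uuid` after the snapshot name is an assumption about how recent versions print snapshot identifiers.

```bash
# Print "<repository> <snapshot>" for a message containing "[repository:snapshot]".
# Returns 1 if the message has no such reference.
parse_snapshot_ref() {
  local ref repo snap
  case $1 in
    *\[*:*\]*) ;;          # must contain "[...:...]"
    *) return 1 ;;
  esac
  ref=${1#*\[}             # text after the first "["
  ref=${ref%%\]*}          # text before the following "]"
  repo=${ref%%:*}          # part before the first ":"
  snap=${ref#*:}           # part after the first ":"
  snap=${snap%%/*}         # assumption: drop a trailing "/uuid" if present
  printf '%s %s\n' "$repo" "$snap"
}

# Example:
# parse_snapshot_ref 'SnapshotException: [my_repository:nightly-01] Snapshot could not be read'
```

The two printed words can then be substituted for `my_repository` and `snapshot_name` in the curl commands that follow.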

How to fix "SnapshotException: [repository:snapshot] Snapshot could not be read"

1. Verify repository connectivity and permissions

First, check if the repository is accessible and Elasticsearch has proper permissions:

bash

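# Show the repository's registered settings (read from cluster state; storage is not contacted)
curl -X GET "localhost:9200/_snapshot/my_repository" -u "username:password"

# List the snapshots in the repository (this reads from the repository itself)
curl -X GET "localhost:9200/_snapshot/my_repository/_all" -u "username:password"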

# For filesystem repositories, check permissions
ls -la /path/to/repository/
# Ensure Elasticsearch user has read access to all files

# For any repository type (including S3/GCS/Azure), verify that every node can access the repository
curl -X POST "localhost:9200/_snapshot/my_repository/_verify" -u "username:password"

If the repository verification fails, check:
- Filesystem permissions (read access for Elasticsearch user)
- Cloud storage credentials and permissions
- Network connectivity to repository location
- Repository configuration matches original settings
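For filesystem repositories, scanning `ls -la` output by eye gets unwieldy once a repository holds thousands of files. The sketch below lists every regular file under a directory that the current user cannot read. The function name `list_unreadable` is hypothetical, and the service account name `elasticsearch` in the usage comment is an assumption about your installation.

```bash
# Print the path of every regular file under "$1" that the current user cannot read.
list_unreadable() {
  [ -d "$1" ] || { echo "not a directory: $1" >&2; return 2; }
  find "$1" -type f -exec sh -c '
    for f; do
      [ -r "$f" ] || printf "%s\n" "$f"
    done
  ' sh {} +
}

# Run it as the account Elasticsearch uses (assumed here to be "elasticsearch"):
# sudo -u elasticsearch bash -c "$(declare -f list_unreadable); list_unreadable /path/to/repository"
```

Empty output means every file that `find` could reach is readable. Directories that cannot be entered are reported by `find` on stderr rather than in the list.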

2. Check snapshot metadata integrity

Examine the snapshot metadata to identify corruption:

bash

# Get detailed snapshot information
curl -X GET "localhost:9200/_snapshot/my_repository/snapshot_name?verbose=true" -u "username:password"

# Look for these key fields in the response:
# - "state": Should be "SUCCESS" for valid snapshots
# - "failures": Any failures during snapshot creation
# - "shards": Check for failed shards

# List all snapshots in repository to see which are readable
curl -X GET "localhost:9200/_snapshot/my_repository/_all" -u "username:password"

# For filesystem repositories, manually check metadata files
ls -la /path/to/repository/ /path/to/repository/indices/
# Look for missing index-N, snap-*.dat or meta-*.dat files

Common metadata issues:
- Missing or corrupted index-N file (the repository's list of snapshots)
- Missing snap-*.dat files (snapshot details)
- Corrupted meta-*.dat files (cluster and index metadata)
- Incomplete snapshot (partial files due to interrupted creation)
- Version mismatch in metadata
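A quick structural check of a filesystem repository can rule out the most obvious cases before digging further. This sketch assumes the layout used by current blob-store repositories (an `index-N` file and `snap-*.dat` files at the root, plus an `indices/` directory); treat that layout as an assumption and compare against a known-good repository for your version. The function names are hypothetical.

```bash
# Return 0 if at least one argument names an existing path.
# An unmatched glob is passed through literally and fails the -e test.
any_exists() {
  local p
  for p in "$@"; do
    [ -e "$p" ] && return 0
  done
  return 1
}

# Report top-level repository entries that are absent; return 1 if any are.
check_repo_layout() {
  local root=$1 status=0
  any_exists "$root"/index-[0-9]* || { echo "missing: index-N (repository snapshot list)"; status=1; }
  any_exists "$root"/snap-*.dat   || { echo "missing: snap-*.dat (snapshot details)"; status=1; }
  [ -d "$root/indices" ]          || { echo "missing: indices/ directory"; status=1; }
  return "$status"
}

# Example:
# check_repo_layout /path/to/repository
```

This only confirms that the expected entries exist. It does not validate file contents, so a clean result does not prove the snapshot is intact.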

3. Attempt snapshot repair or recreation

If metadata is corrupted, try these repair steps:

bash

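# 1. Unregister and re-register the repository so Elasticsearch re-reads its contents
# Unregistering removes only the cluster's reference; snapshot files in storage are NOT deleted.
# Re-register with the same type and settings the repository was created with.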
curl -X DELETE "localhost:9200/_snapshot/my_repository" -u "username:password"
curl -X PUT "localhost:9200/_snapshot/my_repository" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "type": "fs",
  "settings": {
    "location": "/path/to/repository",
    "compress": true
  }
}
'

# 2. For S3/GCS/Azure repositories, check if you can access files directly
# Use cloud provider tools to verify file existence and integrity

# 3. If you have multiple snapshots, identify which ones are corrupted
# and delete only the problematic ones:
curl -X DELETE "localhost:9200/_snapshot/my_repository/corrupted_snapshot" -u "username:password"

# 4. Create a new snapshot from healthy indices
curl -X PUT "localhost:9200/_snapshot/my_repository/new_snapshot?wait_for_completion=true" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "indices": "index1,index2",
  "ignore_unavailable": true,
  "include_global_state": false
}
'

Important: Always test with a small, non-critical index first before attempting repairs on production data.

4. Restore from alternative snapshot or backup

If the snapshot cannot be repaired, restore from an alternative:

bash

# 1. Check for other available snapshots
curl -X GET "localhost:9200/_snapshot/my_repository/_all" -u "username:password"

# 2. Restore from a different snapshot
curl -X POST "localhost:9200/_snapshot/my_repository/alternative_snapshot/_restore?wait_for_completion=true" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "indices": "index1,index2",
  "ignore_unavailable": true,
  "include_global_state": false,
  "include_aliases": false,
  "rename_pattern": "(.+)",
  "rename_replacement": "restored_$1"
}
'

# 3. Partial restore if only some indices are corrupted
# List indices in the corrupted snapshot first:
curl -X GET "localhost:9200/_snapshot/my_repository/corrupted_snapshot" -u "username:password"
# Then restore only healthy indices

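# 4. Restore from another repository
# If a copy of the same snapshot exists in a different repository.
# With no request body this restores every index in the snapshot; open indices with the same names must be closed or deleted first.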
curl -X POST "localhost:9200/_snapshot/other_repository/same_snapshot/_restore" -u "username:password"

Restoration strategies:
- Restore to a temporary index first for validation
- Use index aliases to minimize downtime
- Consider reindexing from source data if available
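Restoring under a temporary name first relies on the `rename_pattern` and `rename_replacement` fields shown in the restore request above. A small helper that builds the request body keeps the prefix consistent between the restore and the later validation. This is a sketch: `restore_body` is a hypothetical name, and it does no JSON escaping, so it assumes plain index names and prefixes.

```bash
# Print a restore request body that restores the indices in "$1" (comma-separated)
# under names prefixed with "$2". No JSON escaping is performed.
restore_body() {
  printf '{"indices":"%s","include_global_state":false,"rename_pattern":"(.+)","rename_replacement":"%s$1"}\n' "$1" "$2"
}

# Usage against a running cluster:
# curl -X POST "localhost:9200/_snapshot/my_repository/alternative_snapshot/_restore?wait_for_completion=true" \
#   -u "username:password" -H 'Content-Type: application/json' \
#   -d "$(restore_body "index1,index2" "restored_")"
#
# Then compare document counts before switching aliases:
# curl -X GET "localhost:9200/index1/_count" -u "username:password"
# curl -X GET "localhost:9200/restored_index1/_count" -u "username:password"
```

Once the counts match and spot checks pass, an alias can be pointed at the restored index and the temporary copy removed later.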

5. Implement preventive measures and monitoring

Prevent future snapshot read errors with these practices:

bash

# 1. Regular repository verification (checks node access to the repository, not the contents of existing snapshots)
curl -X POST "localhost:9200/_snapshot/my_repository/_verify" -u "username:password"

# 2. Implement snapshot lifecycle management (SLM)
curl -X PUT "localhost:9200/_slm/policy/daily-backups" -u "username:password" -H 'Content-Type: application/json' -d'
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "my_repository",
  "config": {
    "indices": ["*"],
    "ignore_unavailable": false,
    "include_global_state": false
  },
  "retention": {
    "expire_after": "30d",
    "min_count": 5,
    "max_count": 50
  }
}
'

# 3. Monitor snapshot health
curl -X GET "localhost:9200/_slm/stats" -u "username:password"
curl -X GET "localhost:9200/_snapshot/my_repository/_status" -u "username:password"

# 4. Regular repository maintenance
# For filesystem repositories:
df -h /path/to/repository  # Check disk space

Best practices:
- Maintain multiple snapshots (keep 7-30 days of history)
- Use multiple repositories for redundancy
- Regularly test restore procedures
- Monitor repository storage capacity
- Use storage that detects silent corruption (for example filesystem or object-store checksums)
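Monitoring repository storage capacity can be automated with a small threshold check built on the same `df` command used above. The sketch below is one way to do it; the function names and the 80 percent threshold in the usage comment are illustrative assumptions, and it assumes the filesystem name reported by `df` contains no spaces.

```bash
# Print the used-space percentage (without "%") for the filesystem holding "$1".
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if usage for the filesystem holding "$1" is below "$2" percent.
disk_below() {
  local pct
  pct=$(disk_used_pct "$1")
  [ -n "$pct" ] && [ "$pct" -lt "$2" ]
}

# Example for a cron job:
# disk_below /path/to/repository 80 || echo "snapshot repository above 80% full" >&2
```

A failed check is a prompt to tighten SLM retention or expand storage before a snapshot is cut short by a full disk.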

Advanced notes

## Advanced Troubleshooting

### Filesystem Repository Specific Issues
For filesystem repositories, common problems include:
- NFS/CIFS mounting issues: Timeouts, stale file handles, or permission problems
- Disk corruption: Run filesystem checks (fsck) on repository volumes
- Inode exhaustion: Check with df -i on Linux systems
- Symbolic link issues: Ensure all paths resolve correctly

### Cloud Repository Considerations
For S3, GCS, and Azure repositories:
- IAM/Role permissions: For S3, reading snapshots requires s3:GetObject and s3:ListBucket; creating and deleting snapshots also requires s3:PutObject and s3:DeleteObject
- Network connectivity: Check VPC endpoints, security groups, and firewalls
- Storage class issues: Objects moved to archival classes (like S3 Glacier) cannot be read directly, and Elasticsearch does not support repositories stored in them
- Cross-region access: Latency and permissions for cross-region repository access

### Snapshot Version Compatibility
Elasticsearch snapshots have version compatibility rules:
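- Snapshots can be restored to a cluster running the same version or a newer one, up to the next major version (7.x to a later 7.y, or 7.x to 8.x); restoring to an older version is not supported
- Compatibility is decided per index: an index created more than one major version before the target cluster cannot be restored directly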
- Some features may not be available across minor versions
- Always test restore compatibility before upgrading clusters
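- No API upgrades a snapshot in place; to cross more than one major version, restore on an intermediate version and reindex, or reindex from a remote cluster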
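A version check before attempting a restore can catch an incompatible pairing early. The sketch below encodes a simplified form of the rule, assuming a snapshot restores to the same major version at an equal or later release, or to the next major version. It ignores per-index creation versions and any archive features, so treat it as a first filter and confirm against the official compatibility matrix. The function names are hypothetical, and versions must be plain `major.minor.patch` strings.

```bash
# Convert "7.17.3" to a comparable integer (7017003).
ver_num() {
  local IFS=.
  set -- $1
  echo $(( ${1:-0} * 1000000 + ${2:-0} * 1000 + ${3:-0} ))
}

# Succeed if a snapshot taken on version "$1" is expected to restore on cluster version "$2".
# Simplified assumption: same major and cluster >= snapshot, or cluster is the next major.
can_restore() {
  local smaj=${1%%.*} cmaj=${2%%.*}
  if [ "$cmaj" -eq "$smaj" ]; then
    [ "$(ver_num "$2")" -ge "$(ver_num "$1")" ]
  else
    [ "$cmaj" -eq $((smaj + 1)) ]
  fi
}

# Example:
# can_restore 7.17.3 8.11.0 && echo "version pairing looks compatible"
```

The snapshot's version appears in the `version` field of the snapshot details, and the cluster's version is returned by `GET /` on any node.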

### Corruption Recovery Techniques
If metadata is corrupted but data files exist:
1. Manual metadata reconstruction: In extreme cases, metadata can be reconstructed from data files
2. Segment file recovery: Individual segment files can sometimes be extracted
3. Professional data recovery services: For critical data without backups

### Repository Encryption Issues
If the repository's storage is encrypted (for example with server-side encryption in the storage service):
- Verify encryption keys are available and accessible
- Check key rotation hasn't made old snapshots unreadable
- Ensure encryption settings match between repository creation and access

### Performance Impact During Restore
Large snapshot restores can impact cluster performance:
- Monitor the snapshot thread pool: GET _cat/thread_pool/snapshot?v
- Adjust restore throttling: indices.recovery.max_bytes_per_sec
- Consider restoring during maintenance windows
- Restore large snapshots in batches by listing only a subset of indices in each restore request
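The restore throttle mentioned above is a dynamic cluster setting, so it can be changed through the cluster settings API without a restart. The sketch below builds the request body; `recovery_throttle_body` is a hypothetical helper, and the `100mb` value in the usage comment is only an illustration, not a recommendation.

```bash
# Print a cluster settings body that sets indices.recovery.max_bytes_per_sec to "$1".
recovery_throttle_body() {
  printf '{"persistent":{"indices.recovery.max_bytes_per_sec":"%s"}}\n' "$1"
}

# Usage against a running cluster:
# curl -X PUT "localhost:9200/_cluster/settings" -u "username:password" \
#   -H 'Content-Type: application/json' -d "$(recovery_throttle_body 100mb)"
#
# Watch the snapshot thread pool while the restore runs:
# curl -X GET "localhost:9200/_cat/thread_pool/snapshot?v" -u "username:password"
```

Raising the limit shortens the restore at the cost of more I/O and network load on the nodes, so it is usually worth lowering it again afterwards.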

### Legal and Compliance Considerations
- Ensure snapshot retention complies with data governance policies
- Implement access controls for sensitive data in snapshots
- Consider encryption for snapshots containing PII/PHI
- Maintain audit logs of snapshot creation and restore operations
