Redis Sentinel logs "+try-failover: New failover in progress" when multiple failover attempts overlap, usually because a previous failover is still underway or quorum cannot be reached. This prevents duplicate promotions and ensures only one master election occurs at a time. To resolve, wait for the current failover to complete, verify Sentinel quorum and network connectivity, then retry if needed.
The "+try-failover: New failover in progress" message appears in Redis Sentinel logs when a Sentinel instance attempts to initiate a failover while another failover is already underway. Redis Sentinel uses a distributed consensus protocol to elect a new master when the current master fails, and it prevents concurrent failovers to avoid split-brain scenarios and data inconsistency. This message indicates one of the following:

1. A failover is currently in progress and hasn't completed yet
2. Multiple Sentinels are trying to initiate failovers simultaneously due to network partitions or timing issues
3. The failover timeout hasn't expired from a previous attempt

Sentinel rejects new failover attempts until the current one completes, times out, or is aborted. This is a protective mechanism that ensures only one master election occurs at a time.
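The "one failover at a time" rule can be illustrated with a small toy model in Python. This is a sketch of the behavior described above, not Redis source code: a new attempt is rejected while a previous attempt is still inside its failover-timeout window.

```python
class FailoverGuard:
    """Toy model of Sentinel's rule that only one failover may run at a time.
    Illustrative only -- not taken from the Redis implementation."""

    def __init__(self, failover_timeout=180.0):
        self.failover_timeout = failover_timeout  # Sentinel's default is 180 s
        self.failover_start = None                # None => no failover running

    def try_failover(self, now):
        # Reject if a failover started recently and has not timed out yet.
        if (self.failover_start is not None
                and now - self.failover_start < self.failover_timeout):
            return "+try-failover rejected: failover in progress"
        self.failover_start = now
        return "+try-failover accepted"

guard = FailoverGuard(failover_timeout=180.0)
print(guard.try_failover(now=0.0))    # accepted: no failover running
print(guard.try_failover(now=30.0))   # rejected: previous attempt still within timeout
print(guard.try_failover(now=200.0))  # accepted: previous attempt timed out
```

This is why simply retrying immediately never helps: until the timeout expires or the failover finishes, every new attempt hits the same guard.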
First, check if a failover is already in progress and monitor its status:
```
# Connect to any Sentinel instance
redis-cli -p 26379

# Check current master and failover state
SENTINEL masters
SENTINEL sentinels <master-name>
SENTINEL get-master-addr-by-name <master-name>

# Look for failover-related fields (run from the shell, not the redis-cli prompt)
redis-cli -p 26379 SENTINEL master <master-name> | grep -E "failover|epoch|state"
```

If failover-state shows wait_start, select_slave, promote_slave, reconf_slaves, or update_config, a failover is actively progressing. Wait for it to complete (typically 30-60 seconds) before attempting any new actions.
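To automate this check, the flat field/value reply that redis-cli prints for SENTINEL master can be parsed in a few lines of Python. This sketch assumes the hyphenated field name `failover-state` used by recent Redis versions; the helper names are my own:

```python
# States that mean a failover state machine is actively running.
ACTIVE_FAILOVER_STATES = {"wait_start", "select_slave", "promote_slave",
                          "reconf_slaves", "update_config"}

def parse_sentinel_master(lines):
    """Turn redis-cli's flat output for `SENTINEL master <name>` into a dict.
    Field names and values alternate line by line, so pair them up."""
    it = iter(lines)
    return dict(zip(it, it))

def failover_in_progress(info):
    # `failover-state` reads "no-failover" when idle; treat a missing
    # field the same way.
    return info.get("failover-state") in ACTIVE_FAILOVER_STATES

sample = ["name", "mymaster", "failover-state", "select_slave", "config-epoch", "5"]
print(failover_in_progress(parse_sentinel_master(sample)))  # True
```

A wrapper script could poll this every few seconds and only proceed with maintenance once the state returns to idle.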
Ensure Sentinels can communicate and reach quorum for decision-making:
```
# Check if Sentinels can see each other
redis-cli -p 26379 SENTINEL sentinels <master-name>

# Verify quorum requirements are met
redis-cli -p 26379 SENTINEL master <master-name> | grep -E "quorum|num-other-sentinels"

# Test network connectivity between Sentinels
redis-cli -h <sentinel-ip> -p 26379 PING

# Check for split-brain scenarios (pass runid "*" to query the down state
# without requesting a leader vote)
redis-cli -p 26379 SENTINEL is-master-down-by-addr <master-ip> <master-port> <current-epoch> *
```

If num-other-sentinels is less than quorum minus one, Sentinels cannot reach consensus. Fix network issues or adjust the quorum in the Sentinel configuration. Ensure all Sentinels can reach the master and each other on port 26379.
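The quorum arithmetic is worth making explicit, because two different thresholds are in play: `quorum` Sentinels must agree the master is objectively down (ODOWN), but a majority of all known Sentinels must authorize the actual failover. A minimal sketch (function name is my own):

```python
def quorum_status(total_sentinels, reachable_sentinels, quorum):
    """Check both Sentinel thresholds: `quorum` agreeing Sentinels to mark
    the master ODOWN, and a majority of all Sentinels to authorize failover."""
    majority = total_sentinels // 2 + 1
    return {
        "can_mark_odown": reachable_sentinels >= quorum,
        "can_authorize_failover": reachable_sentinels >= majority,
    }

# 5 Sentinels with quorum 3, but a partition leaves only 2 reachable:
print(quorum_status(5, 2, 3))
# -> {'can_mark_odown': False, 'can_authorize_failover': False}
```

Note that even a quorum set lower than the majority cannot bypass the majority requirement for failover authorization, which is why a minority partition can never promote a replica.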
Check if the failover timeout is too short, causing overlapping attempts:
```
# View current failover-timeout setting
redis-cli -p 26379 SENTINEL master <master-name> | grep failover-timeout

# Check the Sentinel configuration file
grep -i failover-timeout /etc/redis/sentinel.conf

# Typical configuration (adjust as needed)
sentinel failover-timeout <master-name> 60000  # 60 seconds
```

The default failover-timeout is 180 seconds (3 minutes). If set too low (e.g., 30 seconds), Sentinels may retry before previous attempts complete. Keep it between 60 and 180 seconds for stable clusters. After editing the file, restart Sentinel; to change it at runtime instead, use SENTINEL SET <master-name> failover-timeout <milliseconds>, and Sentinel will rewrite its configuration file automatically.
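A small sanity check can catch timeout values that invite overlapping attempts before they are deployed. The thresholds below are heuristics based on the guidance above, not official Redis rules, and the function name is my own:

```python
def check_failover_timeout(failover_timeout_ms, down_after_ms):
    """Heuristic lint for Sentinel timing settings (not an official Redis
    rule): the retry window should comfortably exceed failure-detection
    time, or Sentinels may retry while a previous attempt is converging."""
    warnings = []
    if failover_timeout_ms < 60_000:
        warnings.append("failover-timeout under 60s risks overlapping attempts")
    if failover_timeout_ms < 2 * down_after_ms:
        warnings.append("failover-timeout should be well above down-after-milliseconds")
    return warnings

print(check_failover_timeout(30_000, 30_000))   # two warnings
print(check_failover_timeout(180_000, 30_000))  # []
```

Such a check fits naturally into a configuration-management pipeline that renders sentinel.conf.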
If a failover appears stuck, you may need to intervene carefully:
```
# First, check whether the failover is truly stuck (epochs not changing)
redis-cli -p 26379 SENTINEL master <master-name> | grep -E "epoch"

# If stuck, try resetting the master state (CAUTION: may cause downtime)
redis-cli -p 26379 SENTINEL reset <master-name>

# As a last resort, restart Sentinel instances one by one
sudo systemctl restart redis-sentinel
```

Use SENTINEL reset cautiously: it clears all state for that master and forces re-discovery of replicas and other Sentinels. Only restart Sentinels one at a time to maintain quorum. Monitor logs closely after any reset: tail -f /var/log/redis/sentinel.log.
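The "one at a time" rule follows directly from the majority requirement: taking a Sentinel down must still leave a majority of the original deployment running. A one-line check (function name is my own):

```python
def safe_to_restart_one(total_sentinels):
    """True if taking one Sentinel offline still leaves a majority of the
    original deployment running, so failovers can still be authorized."""
    majority = total_sentinels // 2 + 1
    return total_sentinels - 1 >= majority

for n in (2, 3, 5):
    print(n, safe_to_restart_one(n))
# 2 -> False (losing one breaks majority), 3 -> True, 5 -> True
```

This is also the quantitative reason two-Sentinel deployments are unsafe for maintenance: any single restart leaves the survivor unable to form a majority.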
Set up monitoring to detect and prevent overlapping failovers:
```
# Monitor Sentinel logs for failover messages
grep -E "\+try-failover|\+vote-for-leader|\+elected-leader" /var/log/redis/sentinel.log

# Alert on consecutive failover attempts within short windows
# Example Prometheus alert rule:
# - alert: RedisSentinelFailoverOverlap
#   expr: increase(redis_sentinel_failovers_total[5m]) > 2
#   for: 1m
```

Deploy an odd number of Sentinel instances (3 or 5 for production) to ensure clean quorum decisions, and place them across different availability zones or hosts. Use monitoring tools to alert on rapid successive failover attempts, which may indicate network flapping or configuration issues.
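If no Prometheus exporter is available, the same "more than two attempts in five minutes" condition can be checked directly against parsed log events. This sketch takes (unix timestamp, log line) pairs rather than raw log lines, so it stays independent of the exact Sentinel log timestamp format; the function name is my own:

```python
def overlapping_attempts(events, window=300, threshold=2):
    """Given (unix_ts, log_line) pairs, return the timestamps at which more
    than `threshold` '+try-failover' events fall inside a `window`-second
    sliding window -- the same condition as the alert rule above."""
    times = sorted(t for t, line in events if "+try-failover" in line)
    alerts = []
    start = 0
    for end, t in enumerate(times):
        while t - times[start] > window:   # slide window start forward
            start += 1
        if end - start + 1 > threshold:
            alerts.append(t)
    return alerts

events = [(0, "+try-failover master mymaster 10.0.0.1 6379"),
          (60, "+try-failover master mymaster 10.0.0.1 6379"),
          (90, "+sdown master mymaster 10.0.0.1 6379"),
          (120, "+try-failover master mymaster 10.0.0.1 6379")]
print(overlapping_attempts(events))  # [120]
```

A cron job feeding this from the Sentinel log would flag flapping before it causes repeated promotions.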
Redis Sentinel failover logic uses an epoch-based system to prevent concurrent promotions. Each Sentinel maintains a currentEpoch and configEpoch; failovers increment these epochs. The "+try-failover: New failover in progress" message occurs when a Sentinel's currentEpoch hasn't yet caught up with the latest known failover epoch.
In complex network partition scenarios (split-brain), multiple Sentinels may believe they're the leader and attempt failovers. The Raft-inspired consensus algorithm ensures only one succeeds, but logs may show multiple "+try-failover" attempts before consensus is reached.
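The core of that consensus rule is that each Sentinel grants its vote to at most one candidate per epoch. A toy model of the voting rule (illustrative only, not the Redis implementation):

```python
class SentinelVoter:
    """Toy model of the Raft-style voting rule: one vote per epoch, so at
    most one leader can win each election."""

    def __init__(self):
        self.current_epoch = 0
        self.voted_for = None

    def request_vote(self, candidate, epoch):
        if epoch > self.current_epoch:
            # Newer epoch: reset and grant the vote to the first requester.
            self.current_epoch = epoch
            self.voted_for = candidate
            return True
        # Same or stale epoch: only the already-chosen candidate is confirmed.
        return epoch == self.current_epoch and self.voted_for == candidate

v = SentinelVoter()
print(v.request_vote("sentinel-A", epoch=1))  # True: first request in epoch 1
print(v.request_vote("sentinel-B", epoch=1))  # False: vote already cast this epoch
print(v.request_vote("sentinel-B", epoch=2))  # True: new epoch, new vote
```

With each voter bound this way, two candidates in the same epoch cannot both collect a majority, which is why concurrent "+try-failover" attempts resolve to a single winner.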
For large-scale deployments, consider Redis Cluster instead of Sentinel for automatic sharding and more robust failover. Sentinel is best for single-master setups with 1-5 replicas. Also note that Sentinel failovers are not instantaneous—they involve slave promotion, configuration propagation, and client reconnection, which can take 10-30 seconds even under ideal conditions.