This error occurs when a Docker Swarm node cannot communicate with the manager node, typically due to network connectivity issues, firewall rules blocking required ports, or when the manager node has lost quorum. The fix involves verifying network connectivity, ensuring required ports are open, and potentially recovering the swarm cluster.
Docker Swarm uses a distributed consensus protocol called Raft to maintain cluster state across manager nodes. When you see the "cannot reach manager node" error, it indicates that the Docker daemon on your current node is unable to establish communication with any manager node in the swarm.

Manager nodes in Docker Swarm are responsible for:

- Maintaining the cluster state and configuration
- Scheduling services across worker nodes
- Serving the Swarm management API
- Storing secrets and configs

Communication between swarm nodes requires specific ports to be open:

- **Port 2377/tcp**: Cluster management and Raft consensus
- **Port 7946/tcp+udp**: Node discovery and container network discovery
- **Port 4789/udp**: Overlay network traffic (VXLAN)

When these ports are blocked, or network connectivity is otherwise impaired, nodes cannot communicate with managers, leading to this error. The error can also occur when attempting to promote a worker to manager, join a new node as manager, or when the swarm has lost quorum (a majority of managers unavailable).
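As a quick first check, the TCP ports above can be probed from the failing node. This is a sketch, assuming `nc` is available and using a placeholder manager address; the UDP ports cannot be verified with a simple connect, so they are left to `tcpdump` on the receiving side:

```shell
# Probe the Swarm TCP ports on a manager.
# MANAGER_IP is a placeholder; substitute your manager's address.
MANAGER_IP=${1:-192.0.2.10}
for port in 2377 7946; do
  if nc -z -w 3 "$MANAGER_IP" "$port" 2>/dev/null; then
    echo "tcp/$port open"
  else
    echo "tcp/$port BLOCKED"
  fi
done
# UDP 7946 and 4789 cannot be reliably probed this way; confirm them with
# tcpdump on the receiving node instead.
```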
First, if you have access to any working manager node, check the status of all nodes:
```
docker node ls
```

Look for managers with status "Unreachable" or "Down". The output shows:
- MANAGER STATUS: "Leader", "Reachable", or "Unreachable"
- AVAILABILITY: "Active", "Pause", or "Drain"
- STATUS: "Ready" or "Down"
If a manager shows as "Unreachable", that's the problematic node.
Check the swarm state:
```
docker info --format '{{.Swarm.LocalNodeState}}'
# Expected: active

docker info --format '{{.Swarm.ControlAvailable}}'
# Expected: true (on managers)
```

Test basic network connectivity between nodes:
```
# From the failing node, ping the manager
ping <MANAGER_IP>

# Test TCP connectivity to the management port
nc -zv <MANAGER_IP> 2377

# Or use telnet
telnet <MANAGER_IP> 2377
```

If ping works but port 2377 doesn't connect, it's likely a firewall issue.
For more detailed network debugging:
```
# Capture traffic to see what's happening
sudo tcpdump -i any port 2377

# Check routing
traceroute <MANAGER_IP>
```

Docker Swarm requires these ports to be open between all nodes.
On Linux with iptables:
```
# Cluster management (Swarm join, Raft)
sudo iptables -A INPUT -p tcp --dport 2377 -j ACCEPT

# Node communication
sudo iptables -A INPUT -p tcp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -p udp --dport 7946 -j ACCEPT

# Overlay network
sudo iptables -A INPUT -p udp --dport 4789 -j ACCEPT

# Save rules (varies by distro); note the redirect must run as root,
# so pipe through tee rather than redirecting directly
sudo iptables-save | sudo tee /etc/iptables/rules.v4
```

On Linux with firewalld:
```
sudo firewall-cmd --permanent --add-port=2377/tcp
sudo firewall-cmd --permanent --add-port=7946/tcp
sudo firewall-cmd --permanent --add-port=7946/udp
sudo firewall-cmd --permanent --add-port=4789/udp
sudo firewall-cmd --reload
```

On Linux with ufw:
```
sudo ufw allow 2377/tcp
sudo ufw allow 7946/tcp
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp
```

On cloud platforms: also check security groups (AWS), network security groups (Azure), or VPC firewall rules (GCP).
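AWS and GCP commands appear later in this guide; for Azure, a network security group rule along these lines could open the Swarm ports within a subnet. This is a sketch: the resource group, NSG name, priority, and address prefix are all placeholder assumptions.

```shell
# Hypothetical Azure NSG rule allowing Swarm traffic between nodes in
# 10.0.0.0/24 (my-swarm-rg, swarm-nsg, and the prefix are placeholders)
az network nsg rule create \
  --resource-group my-swarm-rg \
  --nsg-name swarm-nsg \
  --name allow-swarm \
  --priority 200 \
  --direction Inbound --access Allow --protocol '*' \
  --source-address-prefixes 10.0.0.0/24 \
  --destination-port-ranges 2377 7946 4789
```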
If the manager has multiple network interfaces, it may be advertising the wrong IP:
Check current advertise address:
```
docker info --format '{{.Swarm.NodeAddr}}'
```

If it's wrong, you need to leave and rejoin with the correct address.
On the problematic node:
```
docker swarm leave --force
```

On an existing manager, get the join token:
```
docker swarm join-token manager
# or for workers:
docker swarm join-token worker
```

Rejoin with an explicit advertise address:
```
docker swarm join \
  --advertise-addr <CORRECT_IP>:2377 \
  --token <TOKEN> \
  <MANAGER_IP>:2377
```

For the initial manager, if reinitializing:
```
docker swarm init --advertise-addr <CORRECT_IP>
```

If the manager daemon became unresponsive, restarting it may help:
```
# Restart the Docker daemon
sudo systemctl restart docker

# Wait for it to rejoin
sleep 30

# Check node status
docker node ls
```

If the manager doesn't come back as "Reachable" after the restart:
```
# Check Docker logs for errors
sudo journalctl -u docker -n 100 --no-pager

# Look for Raft or swarm-related errors
sudo journalctl -u docker | grep -i "raft\|swarm\|manager"
```

If a majority of managers are down and the swarm has lost quorum, management operations fail. You'll see errors like "context deadline exceeded" even on the remaining managers.
Check quorum status:
- 3 managers: need at least 2 up (tolerates 1 failure)
- 5 managers: need at least 3 up (tolerates 2 failures)
- 7 managers: need at least 4 up (tolerates 3 failures)
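The tolerance figures above follow from Raft's majority rule: a cluster of N managers needs floor(N/2) + 1 of them up to keep quorum, so it tolerates floor((N-1)/2) failures. A quick shell sanity check:

```shell
# Raft quorum math: N managers need floor(N/2)+1 up
# and tolerate floor((N-1)/2) failures
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  tolerated=$(( (n - 1) / 2 ))
  echo "$n managers: quorum $quorum, tolerates $tolerated failure(s)"
done
```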
Force recovery on a surviving manager:
```
# CAUTION: Run on ONE manager only
# This forces a new single-manager cluster
docker swarm init --force-new-cluster
```

After recovery:
1. Other managers must leave and rejoin
2. Re-add managers to restore fault tolerance
```
# On old managers that need to rejoin
docker swarm leave --force

# Get a new join token from the recovered manager
docker swarm join-token manager

# Join the old managers back
docker swarm join --token <NEW_TOKEN> <RECOVERED_MANAGER_IP>:2377
```

If a manager is permanently gone and you need to remove it:
```
# Demote the unreachable manager first, then force-remove it
# (Docker refuses to remove a node that is still a manager)
docker node demote <NODE_ID>
docker node rm --force <NODE_ID>
```

Important: after removing a manager, add a new one to maintain an odd number of managers:
```
# Promote an existing worker
docker node promote <WORKER_NODE_ID>

# Or add a new manager
docker swarm join-token manager
# Run the output command on the new node
```

Best practices for manager count:
- Development: 1 manager (no fault tolerance)
- Production: 3 managers (tolerates 1 failure)
- Large production: 5-7 managers (max recommended)
Confirm all nodes can reach managers:
```
# Check all nodes are healthy
docker node ls

# Expected output shows all managers as "Reachable" or "Leader"
# ID       HOSTNAME   STATUS   AVAILABILITY   MANAGER STATUS
# abc123 * manager1   Ready    Active         Leader
# def456   manager2   Ready    Active         Reachable
# ghi789   worker1    Ready    Active

# Test swarm operations
docker service create --name test-nginx --replicas 3 nginx
docker service ps test-nginx
docker service rm test-nginx

# Check overlay network creation
docker network create --driver overlay test-overlay
docker network rm test-overlay
```

If all commands succeed, swarm communication is restored.
### Raft Consensus and Quorum
Docker Swarm uses the Raft consensus algorithm to maintain a consistent cluster state across managers. Understanding Raft helps diagnose "cannot reach manager" issues:
1. Leader Election: One manager is elected leader; others are followers
2. Heartbeats: The leader sends periodic heartbeats to maintain authority
3. Log Replication: State changes are replicated to a majority before being committed
When network partitions occur, managers on the minority side lose quorum and cannot process requests, resulting in "cannot reach manager" errors.
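A quick way to tell which side of a partition a node is on, sketched below: any management read must reach the Raft leader, so it fails on the minority side.

```shell
# If this node can complete a management read, it can reach a Raft leader
# and is therefore on the quorum side of any partition.
if docker node ls >/dev/null 2>&1; then
  echo "quorum OK: this node can reach a Raft leader"
else
  echo "quorum LOST (or this node is not a manager)"
fi
```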
### Diagnosing Raft Issues
```
# Check Raft status
docker node inspect <MANAGER_NODE_ID> --format '{{.ManagerStatus}}'

# View Raft logs (requires debug mode; enable it by setting
# {"debug": true} in /etc/docker/daemon.json and restarting Docker,
# rather than launching a second dockerd by hand)
sudo journalctl -u docker | grep -i raft
```

### Split-Brain Prevention
To prevent split-brain scenarios where two groups of managers think they're the cluster:
- Always use an odd number of managers
- Distribute managers across failure domains (AZs, racks)
- Don't use auto-scaling for manager nodes
### Cloud-Specific Considerations
AWS:
```
# Security group rules needed
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxx \
  --protocol tcp --port 2377 --source-group sg-xxx
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxx \
  --protocol tcp --port 7946 --source-group sg-xxx
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxx \
  --protocol udp --port 7946 --source-group sg-xxx
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxx \
  --protocol udp --port 4789 --source-group sg-xxx
```

GCP:
```
gcloud compute firewall-rules create docker-swarm \
  --allow tcp:2377,tcp:7946,udp:7946,udp:4789 \
  --source-tags docker-swarm \
  --target-tags docker-swarm
```

### Using Static IPs for Managers
To prevent issues with changing IP addresses:
1. Use static IPs or reserved IPs for manager nodes
2. Use DNS names with short TTLs if IPs must change
3. Consider overlay networks for service communication (independent of host IPs)
### Manager Node Recovery Procedure
If a manager's Docker data is corrupted:
```
# Stop Docker
sudo systemctl stop docker

# Back up the existing swarm data
sudo mv /var/lib/docker/swarm /var/lib/docker/swarm.bak

# Start Docker (the node will no longer be part of the swarm)
sudo systemctl start docker

# Rejoin the swarm (use the manager join token to rejoin as a manager)
docker swarm join --token <TOKEN> <MANAGER_IP>:2377
```

### Monitoring Swarm Health
Set up monitoring to detect manager issues early:
```
# Check swarm health programmatically: alert if any node is Down or any
# manager is Unreachable. (Matching the bad states directly avoids false
# alerts on workers, whose ManagerStatus column is empty.)
docker node ls --format '{{.Hostname}} {{.Status}} {{.ManagerStatus}}' | \
  grep -E 'Down|Unreachable' && \
  echo "ALERT: Swarm node issue detected"
```