How to fix health check failed in Docker

DockerINTERMEDIATEMEDIUM

This error indicates that Docker's HEALTHCHECK command could not verify your container's service is working correctly. The container is marked as 'unhealthy' and dependent services may fail to start. Fix this by debugging the health check command, adjusting timing parameters, or ensuring your application responds properly.

What this error means

The "health check failed" error in Docker occurs when a container's HEALTHCHECK instruction cannot successfully verify that the service inside the container is operational. Docker runs the health check command periodically and if it fails consecutively (typically 3 times by default), the container is marked as "unhealthy." This error commonly appears in several scenarios: 1. When running `docker ps` and seeing "(unhealthy)" next to your container 2. When Docker Compose services fail to start due to `depends_on` with `condition: service_healthy` 3. In Docker Swarm or Kubernetes when containers are being replaced due to failed health checks 4. In CI/CD pipelines when health check timeouts cause deployment failures The health check itself is a command that Docker executes inside your container. If the command exits with code 0, the container is healthy. If it exits with code 1 (or any non-zero code), the container is unhealthy. The actual error could be anything from a network timeout to a missing tool to an application that hasn't started yet.

How to fix "health check failed"

1Inspect the health check status and error output

First, examine what Docker recorded from the failed health checks:

bash

# View container status
docker ps -a

# Get detailed health check information including failure logs
docker inspect --format='{{json .State.Health}}' <container_name> | jq

# Alternative: view just the last 5 health check logs
docker inspect <container_name> | jq '.State.Health.Log[-5:]'

This shows you:
- Status: "unhealthy", "healthy", or "starting"
- FailingStreak: Number of consecutive failures
- Log: Array of recent health check results with ExitCode and Output

Example output showing a failing health check:

json

{
  "Status": "unhealthy",
  "FailingStreak": 5,
  "Log": [
    {
      "Start": "2024-01-15T10:00:00.000Z",
      "End": "2024-01-15T10:00:05.000Z",
      "ExitCode": 1,
      "Output": "curl: (7) Failed to connect to localhost port 8080: Connection refused"
    }
  ]
}

The "Output" field often reveals exactly what went wrong.

2Test the health check command manually inside the container

Run the health check command yourself to see what's happening:

bash

# Get a shell inside the container
docker exec -it <container_name> sh

# Run your health check command manually
curl -f http://localhost:8080/health
echo "Exit code: $?"

# Check if the service is listening on the expected port
netstat -tlnp || ss -tlnp

If you can't get an interactive shell:

bash

# Run command directly
docker exec <container_name> curl -f http://localhost:8080/health

# Check what ports are listening
docker exec <container_name> netstat -tlnp 2>/dev/null || docker exec <container_name> ss -tlnp

Common issues you might discover:
- "curl: command not found" - curl isn't installed in the image
- "Connection refused" - service not running or wrong port
- "Connection timed out" - service too slow or blocked
- "404 Not Found" - health endpoint doesn't exist
- No ports listening - application failed to start

3Increase start_period for slow-starting applications

If your application needs time to initialize (loading data, warming caches, etc.), the health check may fail before it's ready. Add or increase the start_period:

In Dockerfile:

dockerfile

HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

In docker-compose.yml:

yaml

services:
  myapp:
    image: myapp:latest
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      start_period: 60s
      retries: 3

Parameter guidelines:
- start_period: Grace period during which failures don't count (set to your app's typical startup time + buffer)
- interval: Time between checks (10-30s is typical)
- timeout: Maximum time for a single check (5-10s)
- retries: Number of consecutive failures before unhealthy (3-5)

For Java applications, databases, or apps loading large datasets, you may need start_period of 2-5 minutes.

4Fix health check command syntax

Docker Compose health check syntax is a common source of errors. Here are the correct formats:

CMD form (recommended for simple commands):

yaml

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/health"]

CMD-SHELL form (for commands with shell features like || or pipes):

yaml

healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]

String form (implicitly uses shell):

yaml

healthcheck:
  test: curl -f http://localhost:8080/health || exit 1

Common syntax mistakes:

yaml

# WRONG - arguments not separated
test: ["CMD", "/healthcheck.sh ready"]

# CORRECT - each argument is separate
test: ["CMD", "/healthcheck.sh", "ready"]

# WRONG - using || in CMD form (no shell to interpret it)
test: ["CMD", "curl", "-f", "http://localhost:8080/health", "||", "exit", "1"]

# CORRECT - use CMD-SHELL for shell operators
test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]

Why `|| exit 1` matters: curl and other tools may return various exit codes (e.g., 7 for connection refused, 22 for HTTP errors). Docker only recognizes 0 (healthy) and 1 (unhealthy), so normalize exit codes with || exit 1.

5Ensure health check tools are installed in the image

Many minimal Docker images don't include curl or wget. Check what's available and install what you need:

Check available tools:

bash

docker exec <container_name> which curl wget nc

Install curl in your Dockerfile:

dockerfile

# Debian/Ubuntu
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Alpine
RUN apk add --no-cache curl

# RHEL/CentOS
RUN yum install -y curl && yum clean all

Alternatives if you can't modify the image:

Use wget (often pre-installed):

dockerfile

HEALTHCHECK CMD wget --spider -q http://localhost:8080/health || exit 1

Use netcat for TCP checks:

dockerfile

HEALTHCHECK CMD nc -z localhost 8080 || exit 1

Use a built-in health endpoint (many apps have this):

dockerfile

# For apps with built-in health commands
HEALTHCHECK CMD /app/healthcheck

Use /dev/tcp in bash:

dockerfile

HEALTHCHECK CMD bash -c 'echo > /dev/tcp/localhost/8080' || exit 1

6Verify the application is listening on the correct address

Health checks run inside the container, so the service must be accessible from localhost:

bash

# Check what the app is listening on
docker exec <container_name> netstat -tlnp
# or
docker exec <container_name> ss -tlnp

Common binding issues:

1. App binds to 127.0.0.1 only - This is correct for health checks but may cause issues if you also need external access

2. App binds to a specific IP - Update health check to use that IP:

dockerfile

HEALTHCHECK CMD curl -f http://10.0.0.5:8080/health || exit 1

3. App uses a different port internally - Match the health check port:

yaml

healthcheck:
  # Use the internal port, not the mapped port
  test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
ports:
  - "8080:3000"  # External:Internal

4. App isn't running at all - Check container logs:

bash

docker logs <container_name>

7Configure proper health endpoints in your application

For reliable health checks, implement a dedicated health endpoint in your application:

Node.js/Express:

javascript

app.get('/health', (req, res) => {
  // Quick check - just confirm the server is responding
  res.status(200).json({ status: 'ok' });
});

// Or with dependency checks
app.get('/health', async (req, res) => {
  try {
    await db.query('SELECT 1');
    res.status(200).json({ status: 'ok', db: 'connected' });
  } catch (err) {
    res.status(503).json({ status: 'error', db: 'disconnected' });
  }
});

Python/Flask:

python

@app.route('/health')
def health():
    return {'status': 'ok'}, 200

Common database health checks:

yaml

# PostgreSQL
healthcheck:
  test: ["CMD-SHELL", "pg_isready -U $POSTGRES_USER -d $POSTGRES_DB"]
  interval: 10s
  timeout: 5s
  retries: 5
  start_period: 30s

# MySQL
healthcheck:
  test: ["CMD", "mysqladmin", "ping", "-h", "localhost", "-u", "root", "-p$MYSQL_ROOT_PASSWORD"]
  interval: 10s
  timeout: 5s
  retries: 5

# Redis
healthcheck:
  test: ["CMD", "redis-cli", "ping"]
  interval: 10s
  timeout: 5s
  retries: 3

# MongoDB
healthcheck:
  test: ["CMD", "mongosh", "--eval", "db.adminCommand('ping')"]
  interval: 10s
  timeout: 5s
  retries: 3

8Handle Docker Compose service dependencies

When using depends_on with health checks, ensure the dependency chain is correct:

yaml

services:
  db:
    image: postgres:15
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s

  app:
    image: myapp:latest
    depends_on:
      db:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

If services are stuck waiting for health:

1. Check the dependency's health status:

bash

docker compose ps
docker inspect --format='{{json .State.Health}}' <service_name> | jq

2. Start services without waiting:

bash

# Temporarily disable health check waiting
docker compose up --no-deps app

3. Increase timeouts if the dependency is slow to start:

yaml

depends_on:
  db:
    condition: service_healthy
    restart: true  # Restart if dependency becomes unhealthy

How to fix health check failed in Docker

What this error means

Typical symptoms

Common causes

How to fix "health check failed"

Advanced notes

Related errors

Official resources & further reading