This streaming replication error occurs when a standby server requests a WAL segment that has been deleted from the primary server. It indicates the standby has fallen too far behind and cannot catch up without rebuilding or using WAL archiving.
PostgreSQL uses Write-Ahead Log (WAL) files to ensure data durability and to enable streaming replication. When a standby (replica) server requests a specific WAL segment, the primary may have already removed that segment if the standby has fallen too far behind, i.e. when the replication lag exceeds the amount of WAL the primary is configured to retain. The error means replication has failed because the standby cannot fetch the needed WAL segment from the primary: it has been recycled according to the wal_keep_size (PostgreSQL 13+) or wal_keep_segments (older versions) setting, or, when replication slots are in use, because the max_slot_wal_keep_size limit was exceeded.
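Before changing anything, it can help to see how much WAL the primary is currently allowed to retain. A minimal check, run on the primary (wal_keep_segments only exists on PostgreSQL 12 and earlier, wal_keep_size and max_slot_wal_keep_size on 13 and later, so some rows may be absent):
-- Current WAL retention settings on the primary
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('wal_keep_size', 'wal_keep_segments', 'max_slot_wal_keep_size', 'max_wal_size');
A max_slot_wal_keep_size of -1 means WAL retained for replication slots is unlimited.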
Connect to the primary server and check how far the standby has fallen behind:
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
If using streaming replication without slots, check the pg_stat_replication view:
SELECT client_addr, state, flush_lsn
FROM pg_stat_replication;
A restart_lsn (for a slot) or flush_lsn (in pg_stat_replication) far behind pg_current_wal_lsn(), or no row at all for the standby, indicates it is severely behind or disconnected.
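On PostgreSQL 13 and later, pg_replication_slots also exposes wal_status and safe_wal_size, which answer the question directly; a quick check on the primary:
-- 'lost' means WAL required by this slot has already been removed
SELECT slot_name, active, wal_status, safe_wal_size
FROM pg_replication_slots;
A wal_status of 'lost' confirms the standby can no longer catch up by streaming alone.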
If using PostgreSQL 13 or later, increase wal_keep_size (in MB) in postgresql.conf:
wal_keep_size = 4096   # Retain 4 GB of WAL; adjust based on your replication lag tolerance
For PostgreSQL 12 and earlier, use wal_keep_segments (in 16 MB segments):
wal_keep_segments = 256   # Retain 256 * 16 MB = 4 GB of WAL
Reload the configuration:
sudo systemctl reload postgresql
# or
sudo -u postgres pg_ctl reload -D /var/lib/postgresql/data
Be cautious: setting this too high can exhaust disk space on the primary.
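To pick a sensible value, one rough approach is to measure how much WAL the primary generates over a known interval and multiply by the longest standby outage you want to survive. A sketch, where the LSN literal is a placeholder for the value captured in the first step:
-- Step 1: record the current WAL insert position, then wait e.g. one hour
SELECT pg_current_wal_lsn();
-- Step 2: compute MB of WAL generated since the recorded position
-- (replace '0/3000000' with the LSN from step 1)
SELECT pg_wal_lsn_diff(pg_current_wal_lsn(), '0/3000000') / 1024 / 1024 AS wal_generated_mb;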
Replication slots make the primary retain WAL until the standby has consumed it (indefinitely, unless capped), instead of relying on a guessed wal_keep_size value. Create a slot on the primary:
SELECT * FROM pg_create_physical_replication_slot('standby_slot');
Then configure the standby to use this slot in recovery.conf (PostgreSQL 11 and earlier) or postgresql.conf (PostgreSQL 12+):
primary_slot_name = 'standby_slot'
Optionally, limit WAL buildup with max_slot_wal_keep_size (PostgreSQL 13+) to prevent disk exhaustion:
max_slot_wal_keep_size = 4096   # In MB
If the standby has fallen too far behind and you don't have WAL archiving set up, the only reliable recovery is to rebuild it:
# Stop the standby
sudo systemctl stop postgresql
# Remove the old data directory
sudo rm -rf /var/lib/postgresql/data/*
# On the standby, take a fresh base backup from the primary
sudo -u postgres pg_basebackup -h <primary_ip> -D /var/lib/postgresql/data -U replication -v -P -W
# Start the standby
sudo systemctl start postgresql
For pg_basebackup, use the -Xs flag (streaming) instead of -Xf (fetch) to avoid this issue in the future:
sudo -u postgres pg_basebackup -h <primary_ip> -D /var/lib/postgresql/data -U replication -v -P -W -Xs
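The -U replication option assumes a role with replication privileges already exists on the primary. If it does not, a minimal sketch for creating one (role name and password are placeholders, and the primary's pg_hba.conf needs a matching replication entry for the standby's address):
-- Run on the primary: dedicated role for streaming replication (placeholder password)
CREATE ROLE replication WITH REPLICATION LOGIN PASSWORD 'change_me';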
If you have an archive location (S3, NFS, etc.), enable WAL archiving on the primary to give standby servers a fallback source:
# postgresql.conf on primary
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://my-bucket/wal-archive/%f'
archive_timeout = 300   # Archive every 5 minutes regardless of size
Enabling archive_mode (and changing wal_level) requires a restart, not just a reload:
sudo systemctl restart postgresql
On the standby, configure recovery to fetch from the archive:
# recovery.conf (PostgreSQL 11 and earlier) or postgresql.conf (PostgreSQL 12+)
restore_command = 'aws s3 cp s3://my-bucket/wal-archive/%f %p'
This allows the standby to recover missing WAL segments without requiring a full rebuild.
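To confirm that archiving actually works after the restart, one way is to watch the archiver statistics on the primary; a growing failed_count or a stale last_archived_time usually points at a broken archive_command:
-- Archiver health: successful vs. failed archive attempts
SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;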
After recovery, monitor to prevent future issues:
-- View replication slot status
SELECT slot_name, slot_type, restart_lsn, confirmed_flush_lsn,
pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) / 1024.0 / 1024.0 AS retained_wal_mb
FROM pg_replication_slots;
-- View standby progress
SELECT pid, usename, state, write_lsn, flush_lsn, replay_lsn,
pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn) / 1024.0 / 1024.0 AS replication_lag_mb
FROM pg_stat_replication;
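The byte-based lag above is measured on the primary. On the standby, lag can also be expressed as time, which is often easier to reason about (returns NULL until the standby has replayed at least one transaction):
-- Run on the standby: how far behind the last replayed transaction is
SELECT now() - pg_last_xact_replay_timestamp() AS replication_delay;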
Based on the replication lag observed, set wal_keep_size (or wal_keep_segments) to a value that accommodates normal slowdowns plus a safety margin:
# Set based on peak replication lag observed + 1-2 GB buffer
wal_keep_size = 6144   # 6 GB
Replication slots come with a caveat: if a slot stalls and max_slot_wal_keep_size is set, the primary will still remove WAL beyond that limit and invalidate the slot (its wal_status becomes 'lost') to prevent disk exhaustion. Monitor pg_replication_slots (the active flag, and inactive_since on PostgreSQL 17+) to identify stalled slots.
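An abandoned slot, for example one left over after a standby was decommissioned, retains WAL forever unless max_slot_wal_keep_size caps it. A minimal cleanup sketch; only drop a slot once you are sure no standby still needs it:
-- List slots that are not currently streaming
SELECT slot_name, active, restart_lsn FROM pg_replication_slots WHERE NOT active;
-- Irreversibly drop an unneeded slot (here the slot created earlier)
SELECT pg_drop_replication_slot('standby_slot');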
For Greenplum Database (GPDB) environments, run gprecoverseg -F to perform a full recovery of the affected mirror segments.
When using pg_basebackup, the difference between -Xf (fetch) and -Xs (stream) matters: -Xf collects the required WAL only at the end of the backup, which fails if the primary has already recycled those segments, while -Xs streams WAL over a second connection while the backup runs, so the needed segments cannot disappear underneath it.
On cloud platforms like AWS RDS, the equivalent to replication slots is often managed automatically, but you can still tune max_slot_wal_keep_size via parameter groups. Azure Database for PostgreSQL also supports replication slots.
If running in a Kubernetes environment with operators like Zalando postgres-operator or Patroni, ensure the operator is configured with appropriate wal_keep_size and uses replication slots to prevent this failure mode during node failovers.