How to fix vacuum failsafe triggered in PostgreSQL

PostgreSQLADVANCEDCRITICAL

PostgreSQL has activated its vacuum failsafe mechanism to prevent transaction ID wraparound. This emergency response occurs when a table's transaction ID age exceeds the safety threshold and autovacuum hasn't kept up with freezing old transactions, risking database shutdown.

What this error means

The "vacuum failsafe triggered" message appears when PostgreSQL detects that one or more tables are dangerously close to transaction ID (XID) wraparound—a critical condition where transaction IDs run out and the database can become unstable or fail. The failsafe mechanism is PostgreSQL's emergency response to prevent this catastrophic scenario. When triggered, VACUUM abandons normal resource-friendly behavior and aggressively freezes old tuples to prevent XID wraparound. This typically indicates that normal autovacuum maintenance wasn't able to keep up with transaction load on your database. The failsafe activates when a table's age exceeds the vacuum_failsafe_age parameter (default 1.6 billion transactions), which is well beyond the normal autovacuum_freeze_max_age threshold (default 200 million).

How to fix "vacuum failsafe triggered"

1Check transaction ID age of affected tables

First, identify which tables are approaching transaction ID wraparound and are triggering failsafe:

sql

SELECT
  schemaname,
  tablename,
  age(relfrozenxid) AS age,
  2147483647 - age(relfrozenxid) AS xids_remaining
FROM pg_class
JOIN pg_tables ON pg_class.relname = pg_tables.tablename
ORDER BY age(relfrozenxid) DESC
LIMIT 20;

Tables showing age over 1.6 billion or remaining XIDs under 500 million need immediate action. If any table shows remaining XIDs under 100 million, the database is in critical danger.

2Verify autovacuum is enabled and working

Check that autovacuum is enabled globally and on affected tables:

sql

-- Check global setting
SHOW autovacuum;

-- Check per-table setting and reloptions
SELECT schemaname, tablename, reloptions FROM pg_tables
WHERE tablename IN (SELECT tablename FROM pg_tables);

If autovacuum is disabled, enable it immediately:

sql

ALTER SYSTEM SET autovacuum = on;
SELECT pg_reload_conf();

Even if autovacuum is enabled, verify it's actually running. A failsafe trigger means normal autovacuum hasn't kept up.

3Identify and close long-running transactions

Long-running transactions block VACUUM from freezing old tuples. Find blocking transactions:

sql

SELECT
  pid,
  usename,
  state,
  xact_start,
  now() - xact_start AS transaction_duration,
  query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active')
  AND xact_start IS NOT NULL
ORDER BY xact_start;

Terminate long-running idle transactions (after confirming they're not critical):

sql

SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE pid != pg_backend_pid()
  AND state = 'idle in transaction'
  AND xact_start < now() - interval '1 hour';

Coordinate with application teams before terminating production transactions.

4Tune autovacuum parameters for more aggressive vacuum

Adjust autovacuum settings to work faster and more frequently:

sql

ALTER SYSTEM SET autovacuum_naptime = '10s';  -- Check for work every 10 seconds
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '1ms';  -- Process faster
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 5000;  -- More work per cycle
ALTER SYSTEM SET autovacuum_max_workers = 4;  -- More parallel workers
SELECT pg_reload_conf();

For specific problem tables with high write volume:

sql

ALTER TABLE your_high_volume_table SET (
  autovacuum_vacuum_cost_delay = 0,
  autovacuum_vacuum_cost_limit = 10000
);

These changes take effect immediately after pg_reload_conf() for new autovacuum workers.

5Let failsafe vacuum complete or manually accelerate if critical

When failsafe is triggered, PostgreSQL automatically runs aggressive VACUUM with high priority. Monitor progress:

sql

SELECT
  pid,
  usename,
  application_name,
  state,
  query_start,
  now() - query_start AS duration,
  query
FROM pg_stat_activity
WHERE query ILIKE '%vacuum%'
  OR state = 'in vacuum';

For PostgreSQL 13+, check detailed progress:

sql

SELECT p.pid, p.datname, p.relid::regclass AS table_name,
       p.phase, p.heap_blks_total, p.heap_blks_scanned,
       round(100.0 * p.heap_blks_scanned / NULLIF(p.heap_blks_total, 0), 1) AS percent_complete
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a ON p.pid = a.pid;

Do NOT cancel failsafe VACUUM - it will restart from the beginning. The aggressive mode is necessary to prevent wraparound.

6Monitor vacuum progress in system logs

Watch PostgreSQL logs for vacuum failsafe progress:

bash

tail -f /var/log/postgresql/postgresql.log | grep -E 'vacuum|failsafe|freeze'

For systemd-based systems:

bash

journalctl -u postgresql -f | grep -E 'vacuum|failsafe'

Look for messages indicating:
- "vacuum failsafe triggered" when failsafe activated
- Completion messages showing tables frozen
- Recovery of transaction ID space

Wait for all VACUUM operations to complete before considering the issue resolved. Once remaining XID space is above 500 million, you have breathing room.

7Optimize autovacuum_freeze_max_age after recovery

Once the failsafe event is resolved and vacuum completes, consider adjusting autovacuum_freeze_max_age:

sql

-- Increase from default 200M to a value your system can handle
-- E.g., if you have high transaction rate, try 300-500M
ALTER SYSTEM SET autovacuum_freeze_max_age = 300000000;
SELECT pg_reload_conf();

Important: Only increase this if:
1. Your monitoring shows steady transaction rates can handle it
2. You have adequate disk space (larger values = more pg_xact usage)
3. Your autovacuum workers are tuned to keep pace

The absolute maximum value is 2 billion transactions, but safety margins matter.

Advanced notes

Transaction ID Wraparound Mechanics: PostgreSQL uses 32-bit transaction IDs (XID), providing approximately 4 billion unique IDs. The system divides this into windows: 2 billion past, present, and 2 billion future. When XIDs approach wraparound, tuple visibility checks could incorrectly mark live rows as dead, causing data corruption. The autovacuum_freeze_max_age parameter (default 200M) triggers preventative VACUUM to freeze tuples before wraparound risk. The vacuum_failsafe_age parameter (default 1.6B) is the absolute last-resort threshold—when crossed, failsafe mode activates.

Failsafe Behavior Details: When vacuum failsafe triggers, VACUUM: (1) removes cost-based delays entirely, allowing unlimited I/O; (2) skips non-essential work like index cleanup and empty page truncation; (3) disables Buffer Access Strategy to use all shared_buffers; (4) focuses exclusively on freezing old tuples. This aggressive mode will temporarily impact query performance but is necessary to prevent database shutdown.

Multixact Wraparound: PostgreSQL also tracks multixact IDs for row-level locks. The vacuum_multixact_failsafe_age parameter similarly protects against multixact wraparound. If you see both types of failsafe messages, investigate both XID and multixact ages using age(relfrozenxid) and age(relminmxid).

Managed Services: AWS RDS, Azure Database for PostgreSQL, and Google Cloud SQL may have autovacuum pre-configured differently. Check service documentation—some restrict parameter changes or have different defaults. If failsafe triggers on a managed database, contact support immediately, as it signals serious maintenance issues that may require service-level intervention.

Prevention Best Practices: Monitor transaction ID age proactively (aim to keep age below 50% of autovacuum_freeze_max_age). Implement alerts when age exceeds 100M transactions. For high-volume databases, consider partitioning large tables so autovacuum can work on smaller chunks concurrently. Ensure long-running transactions are minimized and connection pooling is used to prevent transaction bloat.

How to fix vacuum failsafe triggered in PostgreSQL

What this error means

Typical symptoms

Common causes

How to fix "vacuum failsafe triggered"

Advanced notes

Related errors

Official resources & further reading