PostgreSQL has activated its vacuum failsafe mechanism to prevent transaction ID wraparound. This emergency response occurs when a table's transaction ID age exceeds the safety threshold and autovacuum hasn't kept up with freezing old transactions, risking database shutdown.
The "vacuum failsafe triggered" message appears when PostgreSQL detects that one or more tables are dangerously close to transaction ID (XID) wraparound—a critical condition where transaction IDs run out and the database can become unstable or fail. The failsafe mechanism is PostgreSQL's emergency response to prevent this catastrophic scenario. When triggered, VACUUM abandons normal resource-friendly behavior and aggressively freezes old tuples to prevent XID wraparound. This typically indicates that normal autovacuum maintenance wasn't able to keep up with transaction load on your database. The failsafe activates when a table's age exceeds the vacuum_failsafe_age parameter (default 1.6 billion transactions), which is well beyond the normal autovacuum_freeze_max_age threshold (default 200 million).
First, identify which tables are approaching transaction ID wraparound and are triggering failsafe:
SELECT
schemaname,
tablename,
age(relfrozenxid) AS age,
2147483647 - age(relfrozenxid) AS xids_remaining
FROM pg_class
JOIN pg_tables ON pg_class.relname = pg_tables.tablename
ORDER BY age(relfrozenxid) DESC
LIMIT 20;Tables showing age over 1.6 billion or remaining XIDs under 500 million need immediate action. If any table shows remaining XIDs under 100 million, the database is in critical danger.
Check that autovacuum is enabled globally and on affected tables:
-- Check global setting
SHOW autovacuum;
-- Check per-table setting and reloptions
SELECT schemaname, tablename, reloptions FROM pg_tables
WHERE tablename IN (SELECT tablename FROM pg_tables);If autovacuum is disabled, enable it immediately:
ALTER SYSTEM SET autovacuum = on;
SELECT pg_reload_conf();Even if autovacuum is enabled, verify it's actually running. A failsafe trigger means normal autovacuum hasn't kept up.
Long-running transactions block VACUUM from freezing old tuples. Find blocking transactions:
SELECT
pid,
usename,
state,
xact_start,
now() - xact_start AS transaction_duration,
query
FROM pg_stat_activity
WHERE state IN ('idle in transaction', 'active')
AND xact_start IS NOT NULL
ORDER BY xact_start;Terminate long-running idle transactions (after confirming they're not critical):
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE pid != pg_backend_pid()
AND state = 'idle in transaction'
AND xact_start < now() - interval '1 hour';Coordinate with application teams before terminating production transactions.
Adjust autovacuum settings to work faster and more frequently:
ALTER SYSTEM SET autovacuum_naptime = '10s'; -- Check for work every 10 seconds
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '1ms'; -- Process faster
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 5000; -- More work per cycle
ALTER SYSTEM SET autovacuum_max_workers = 4; -- More parallel workers
SELECT pg_reload_conf();For specific problem tables with high write volume:
ALTER TABLE your_high_volume_table SET (
autovacuum_vacuum_cost_delay = 0,
autovacuum_vacuum_cost_limit = 10000
);These changes take effect immediately after pg_reload_conf() for new autovacuum workers.
When failsafe is triggered, PostgreSQL automatically runs aggressive VACUUM with high priority. Monitor progress:
SELECT
pid,
usename,
application_name,
state,
query_start,
now() - query_start AS duration,
query
FROM pg_stat_activity
WHERE query ILIKE '%vacuum%'
OR state = 'in vacuum';For PostgreSQL 13+, check detailed progress:
SELECT p.pid, p.datname, p.relid::regclass AS table_name,
p.phase, p.heap_blks_total, p.heap_blks_scanned,
round(100.0 * p.heap_blks_scanned / NULLIF(p.heap_blks_total, 0), 1) AS percent_complete
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a ON p.pid = a.pid;Do NOT cancel failsafe VACUUM - it will restart from the beginning. The aggressive mode is necessary to prevent wraparound.
Watch PostgreSQL logs for vacuum failsafe progress:
tail -f /var/log/postgresql/postgresql.log | grep -E 'vacuum|failsafe|freeze'For systemd-based systems:
journalctl -u postgresql -f | grep -E 'vacuum|failsafe'Look for messages indicating:
- "vacuum failsafe triggered" when failsafe activated
- Completion messages showing tables frozen
- Recovery of transaction ID space
Wait for all VACUUM operations to complete before considering the issue resolved. Once remaining XID space is above 500 million, you have breathing room.
Once the failsafe event is resolved and vacuum completes, consider adjusting autovacuum_freeze_max_age:
-- Increase from default 200M to a value your system can handle
-- E.g., if you have high transaction rate, try 300-500M
ALTER SYSTEM SET autovacuum_freeze_max_age = 300000000;
SELECT pg_reload_conf();Important: Only increase this if:
1. Your monitoring shows steady transaction rates can handle it
2. You have adequate disk space (larger values = more pg_xact usage)
3. Your autovacuum workers are tuned to keep pace
The absolute maximum value is 2 billion transactions, but safety margins matter.
Transaction ID Wraparound Mechanics: PostgreSQL uses 32-bit transaction IDs (XID), providing approximately 4 billion unique IDs. The system divides this into windows: 2 billion past, present, and 2 billion future. When XIDs approach wraparound, tuple visibility checks could incorrectly mark live rows as dead, causing data corruption. The autovacuum_freeze_max_age parameter (default 200M) triggers preventative VACUUM to freeze tuples before wraparound risk. The vacuum_failsafe_age parameter (default 1.6B) is the absolute last-resort threshold—when crossed, failsafe mode activates.
Failsafe Behavior Details: When vacuum failsafe triggers, VACUUM: (1) removes cost-based delays entirely, allowing unlimited I/O; (2) skips non-essential work like index cleanup and empty page truncation; (3) disables Buffer Access Strategy to use all shared_buffers; (4) focuses exclusively on freezing old tuples. This aggressive mode will temporarily impact query performance but is necessary to prevent database shutdown.
Multixact Wraparound: PostgreSQL also tracks multixact IDs for row-level locks. The vacuum_multixact_failsafe_age parameter similarly protects against multixact wraparound. If you see both types of failsafe messages, investigate both XID and multixact ages using age(relfrozenxid) and age(relminmxid).
Managed Services: AWS RDS, Azure Database for PostgreSQL, and Google Cloud SQL may have autovacuum pre-configured differently. Check service documentation—some restrict parameter changes or have different defaults. If failsafe triggers on a managed database, contact support immediately, as it signals serious maintenance issues that may require service-level intervention.
Prevention Best Practices: Monitor transaction ID age proactively (aim to keep age below 50% of autovacuum_freeze_max_age). Implement alerts when age exceeds 100M transactions. For high-volume databases, consider partitioning large tables so autovacuum can work on smaller chunks concurrently. Ensure long-running transactions are minimized and connection pooling is used to prevent transaction bloat.
PANIC: could not write to file
How to fix PANIC: could not write to file in PostgreSQL
Assert failure
How to fix "Assert failure" in PostgreSQL
insufficient columns in unique constraint for partition key
How to fix "insufficient columns in unique constraint for partition key" in PostgreSQL
ERROR 42501: must be owner of table
How to fix "must be owner of table" in PostgreSQL
trigger cannot change partition destination
How to fix "Trigger cannot change partition destination" in PostgreSQL