This error occurs when you try to create a unique index on a table that contains duplicate values in the indexed column(s). PostgreSQL refuses to create the index because the existing data would violate the uniqueness constraint. Fixing it requires finding and removing the duplicates before creating the index.
PostgreSQL cannot create a unique index when the table already contains duplicate values in the column(s) being indexed. A unique index enforces that all values in the indexed column(s) are distinct. When you run `CREATE UNIQUE INDEX`, PostgreSQL scans the table and validates uniqueness; if duplicates are found, it rejects the operation with an error like "ERROR: could not create unique index "index_name"" followed by "DETAIL: Key (column)=(value) is duplicated." This error commonly occurs during database migrations, schema alterations, or after importing data from external sources. Unlike the `duplicate key value violates unique constraint` error (which occurs during INSERT/UPDATE against an existing constraint), this error prevents you from establishing the constraint in the first place: you must clean the data before creating the index.
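For context, here is a minimal reproduction (the users table and email column are hypothetical):
-- A table with duplicate values in the column to be indexed:
CREATE TABLE users (id serial PRIMARY KEY, email text);
INSERT INTO users (email) VALUES ('a@example.com'), ('a@example.com');
-- The index build scans the table, finds the duplicate, and fails:
CREATE UNIQUE INDEX users_email_idx ON users (email);
-- ERROR:  could not create unique index "users_email_idx"
-- DETAIL:  Key (email)=(a@example.com) is duplicated.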
Use a GROUP BY query to find which values appear more than once.
-- Find duplicates in a single column:
SELECT column_name, COUNT(*) as count
FROM your_table
GROUP BY column_name
HAVING COUNT(*) > 1;
-- For composite indexes (multiple columns):
SELECT column1, column2, COUNT(*) as count
FROM your_table
GROUP BY column1, column2
HAVING COUNT(*) > 1;
-- To see the actual rows that are duplicated:
SELECT * FROM your_table
WHERE column_name IN (
    SELECT column_name FROM your_table
    GROUP BY column_name
    HAVING COUNT(*) > 1
)
ORDER BY column_name;
This shows you exactly which values violate uniqueness and helps you decide which rows to delete.
You have several strategies for choosing which duplicate row to retain:
Strategy 1: Keep the row with the lowest ctid
-- ctid is the row's physical location; it often tracks insertion order but can
-- change after UPDATEs or VACUUM FULL, so treat it as a heuristic:
DELETE FROM your_table a
USING your_table b
WHERE a.ctid > b.ctid
  AND a.column_name = b.column_name;
Strategy 2: Keep the most recently updated row
-- Assumes an updated_at (or created_at) timestamp plus a unique id to break ties;
-- deletes every row that has a newer duplicate, or an equally recent duplicate
-- with a higher id, keeping exactly one row per value:
DELETE FROM your_table a
USING your_table b
WHERE a.column_name = b.column_name
  AND (a.updated_at < b.updated_at
       OR (a.updated_at = b.updated_at AND a.id < b.id));
Strategy 3: Manual review for composite indexes
If multiple columns are involved, examine each duplicate set and decide based on business logic:
-- See all duplicates for a composite key:
SELECT *
FROM your_table
WHERE (column1, column2) IN (
    SELECT column1, column2
    FROM your_table
    GROUP BY column1, column2
    HAVING COUNT(*) > 1
)
ORDER BY column1, column2, id;
Choose your strategy based on which version of the data is most correct or useful.
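If you go the manual route, you can then delete the rejected rows explicitly by primary key (the ids below are hypothetical placeholders):
-- Delete specific rows identified during manual review:
DELETE FROM your_table WHERE id IN (101, 205, 318);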
Execute the DELETE statement for your chosen strategy. Always preview the affected rows first with a SELECT (for example, wrapped in a CTE).
-- SAFE: Preview how many rows will be deleted (a JOIN stands in for DELETE's
-- USING clause, which is not valid in a plain SELECT; DISTINCT avoids
-- double-counting values that appear more than twice):
WITH to_delete AS (
    SELECT DISTINCT a.ctid
    FROM your_table a
    JOIN your_table b
      ON a.column_name = b.column_name
     AND a.ctid > b.ctid
)
SELECT COUNT(*) FROM to_delete;
-- If the count looks correct, execute the DELETE:
DELETE FROM your_table a
USING your_table b
WHERE a.ctid > b.ctid
  AND a.column_name = b.column_name;
-- Verify duplicates are gone:
SELECT column_name, COUNT(*) FROM your_table
GROUP BY column_name
HAVING COUNT(*) > 1;
-- Should return no rows
If using timestamps instead of ctid, adjust the comparison accordingly. Always back up your data before running DELETE.
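A lightweight way to take that backup, assuming you have disk space for a full copy of the table, is to snapshot it first:
-- Copies the data only; indexes and constraints are not copied:
CREATE TABLE your_table_backup AS TABLE your_table;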
If the duplicates are in a SERIAL or BIGSERIAL column, the sequence may be out of sync. Reset it after deleting duplicates.
-- Find the sequence name:
SELECT pg_get_serial_sequence('your_table', 'id');
-- Result example: public.your_table_id_seq
-- Reset the sequence to the current maximum id:
SELECT setval('your_table_id_seq', (SELECT MAX(id) FROM your_table));
-- Verify the sequence is correct (note: nextval consumes the value it returns):
SELECT nextval('your_table_id_seq');
-- Should return MAX(id) + 1
This ensures that new auto-generated IDs will not conflict with existing data.
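If you would rather verify without consuming a value, read the sequence's state directly:
-- Inspect the sequence without advancing it:
SELECT last_value, is_called FROM your_table_id_seq;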
Now that all duplicates are eliminated, the unique index can be created successfully.
-- Single column unique index:
CREATE UNIQUE INDEX index_name ON your_table (column_name);
-- Composite unique index (multiple columns):
CREATE UNIQUE INDEX index_name ON your_table (column1, column2);
-- With the optional CONCURRENTLY clause (for production tables; does not block
-- concurrent writes, but takes longer and cannot run inside a transaction block):
CREATE UNIQUE INDEX CONCURRENTLY index_name ON your_table (column_name);
The index creation should now succeed without errors. If it fails again, duplicate values were not fully removed: rerun the duplicate detection query to find any remaining conflicts.
If you want PostgreSQL to enforce the uniqueness through a constraint rather than just an index, you can add an explicit unique constraint. However, a unique index alone is sufficient.
-- Add a unique constraint (creates an underlying unique index):
ALTER TABLE your_table
ADD CONSTRAINT constraint_name UNIQUE (column_name);
-- Or promote an existing unique index to a constraint (note: USING INDEX takes
-- the index name and no column list):
ALTER TABLE your_table
ADD CONSTRAINT constraint_name UNIQUE USING INDEX index_name;
This makes the intent explicit in your schema and allows you to drop the constraint (which also drops the index) if needed later.
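For completeness, dropping the constraint later looks like this:
-- Dropping the constraint also drops its backing unique index:
ALTER TABLE your_table DROP CONSTRAINT constraint_name;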
For large tables, use CONCURRENTLY when creating unique indexes to avoid long write-locks: CREATE UNIQUE INDEX CONCURRENTLY .... This allows concurrent INSERT/UPDATE operations while the index is being built, though it takes longer and requires more disk space.
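One operational caveat, sketched below: if a CONCURRENTLY build fails partway (for example, because a duplicate row was inserted mid-build), PostgreSQL leaves behind an INVALID index that must be dropped before retrying:
-- Find invalid indexes left by failed concurrent builds:
SELECT indexrelid::regclass FROM pg_index WHERE NOT indisvalid;
-- Drop the broken index without blocking writes, then retry the build:
DROP INDEX CONCURRENTLY index_name;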
For tables with many duplicates, consider using a materialized view or a staging table to process deduplication more efficiently:
-- Keep one row per value (the lowest id), then swap the data back in;
-- run all three statements inside a single transaction:
SELECT DISTINCT ON (column_name) * INTO temp_table
FROM your_table
ORDER BY column_name, id;
TRUNCATE your_table;
INSERT INTO your_table SELECT * FROM temp_table;
However, this approach requires careful attention to foreign key constraints and data ordering.
If the table is very large and you cannot afford downtime, consider using CONCURRENTLY and gradually deleting duplicates in batches rather than in a single transaction. This reduces lock contention and allows the application to continue operating.
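As a sketch of what that batching might look like, reusing the ctid-based strategy from above (the batch size of 1000 is an arbitrary placeholder; rerun until zero rows are deleted):
-- ctids can shift between runs on a busy table, so recompute the batch each time:
WITH batch AS (
    SELECT DISTINCT a.ctid
    FROM your_table a
    JOIN your_table b
      ON a.column_name = b.column_name
     AND a.ctid > b.ctid
    LIMIT 1000
)
DELETE FROM your_table
WHERE ctid IN (SELECT ctid FROM batch);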
By default, PostgreSQL treats NULL values as distinct from each other in unique indexes, so a unique column can contain multiple NULLs. If you need to prevent duplicate NULLs, create the index with the NULLS NOT DISTINCT clause, which goes after the column list (PostgreSQL 15+): CREATE UNIQUE INDEX index_name ON table_name (column_name) NULLS NOT DISTINCT;
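A quick demonstration of the difference, using a hypothetical one-column table:
-- Default behavior: each NULL counts as distinct, so both inserts succeed:
CREATE TABLE t (x int);
CREATE UNIQUE INDEX t_x_default ON t (x);
INSERT INTO t VALUES (NULL), (NULL);
-- With NULLS NOT DISTINCT, a second NULL is rejected:
DROP INDEX t_x_default;
DELETE FROM t;
CREATE UNIQUE INDEX t_x_strict ON t (x) NULLS NOT DISTINCT;
INSERT INTO t VALUES (NULL);
INSERT INTO t VALUES (NULL);
-- ERROR: duplicate key value violates unique constraint "t_x_strict"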