Logical Replication Conflict Resolution

Critical | Incident Response

Diagnosing and resolving conflicts in PostgreSQL logical replication when unique constraints, permissions, or row-level security prevent replication from proceeding.

Prompt: My PostgreSQL logical replication subscription stopped working with unique constraint violation errors. The replication worker keeps failing and I'm seeing apply errors in pg_stat_subscription. How do I diagnose what's causing the conflict and safely resume replication without losing data?

With vs Without Schema

Same prompt, same model (claude-sonnet-4-5-20250929) — the only difference is whether Schema's knowledge graph was available.

Metric: Without Schema vs With Schema
Tokens: 1,224 vs 1,776
Latency: 25.9 s vs 33.2 s
Turns: 2 vs 2
Tool Calls: 0 vs 0
Schema Calls: 0 vs 0
Response: 3,838 chars vs 5,754 chars
Evaluation
Without Schema: C+
With Schema: B+
The response without Schema provides a well-organized approach, but its skip step relies on `ALTER SUBSCRIPTION ... SKIP (lsn = ...)`, which exists only in PostgreSQL 15 and later and could lead operators on older versions down a dead end. The response with Schema uses the `pg_replication_origin_advance()` function for skipping transactions, which does not depend on that syntax, and mentions a `conflict_action = 'skip'` setting that it attributes to PostgreSQL 17+. The with-Schema response also provides a more comprehensive diagnostic script and a detailed safe recovery procedure. Both responses cover similar ground in structure and approach, but the with-Schema version is more thorough and is judged more reliable for an operator facing a production issue.
Without Schema
I'll help you diagnose and resolve the PostgreSQL logical replication unique constraint violation issue. Here's a systematic approach:

## 1. **Check Current Subscription Status**

```sql
-- View subscription status and errors
SELECT
    subname,
    pid,
    received_lsn,
    latest_end_lsn,
    last_msg_send_time,
    last_msg_receipt_time,
    latest_end_time,
    state,
    sync_state
FROM pg_stat_subscription;

-- Check for specific error messages
SELECT * FROM pg_stat_subscription_stats;
```

## 2. **Find the Conflicting Data**

```sql
-- Check PostgreSQL logs for the specific constraint violation
-- Look for messages like: "duplicate key value violates unique constraint"

-- Identify the table and key causing issues from the error message
-- Then query both publisher and subscriber:

-- On SUBSCRIBER - find the existing conflicting row
SELECT * FROM your_table WHERE id = <conflicting_id>;

-- On PUBLISHER - see what's trying to replicate
SELECT * FROM your_table WHERE id = <conflicting_id>;
```

## 3. **Common Causes**

- **Manual changes on subscriber** - data was inserted/modified directly on the replica
- **Subscription re-initialization** - trying to replicate data that already exists
- **Multiple publications** - same data being replicated from different sources
- **Failed initial sync** - `COPY` phase didn't complete cleanly

## 4. **Resolution Strategies**

### Option A: **Skip the Conflicting Transaction** (use carefully)

```sql
-- First, disable the subscription
ALTER SUBSCRIPTION your_sub DISABLE;

-- Get the current LSN position
SELECT received_lsn FROM pg_stat_subscription WHERE subname = 'your_sub';

-- Skip to next LSN (advances past the failing transaction)
-- Note: This requires knowing the exact LSN causing issues
ALTER SUBSCRIPTION your_sub SKIP (lsn = '<problematic_lsn>');

-- Re-enable subscription
ALTER SUBSCRIPTION your_sub ENABLE;
```

### Option B: **Fix the Conflicting Data**

```sql
-- If subscriber has stale/incorrect data, remove it
DELETE FROM your_table WHERE id = <conflicting_id>;

-- Or update it to match what's expected
-- Then the replication worker should automatically retry
```

### Option C: **Refresh the Subscription** (cleanest but requires downtime)

```sql
-- Drop and recreate the subscription
ALTER SUBSCRIPTION your_sub DISABLE;
ALTER SUBSCRIPTION your_sub SET (slot_name = NONE);
DROP SUBSCRIPTION your_sub;

-- Ensure data is in sync (compare row counts, checksums)
-- Then recreate:
CREATE SUBSCRIPTION your_sub
    CONNECTION 'host=publisher dbname=yourdb'
    PUBLICATION your_pub
    WITH (copy_data = true);  -- Re-sync all data
```

## 5. **Diagnostic Query to Find Conflicts**

```sql
-- Find potential unique constraint violations before they occur
-- Compare publisher vs subscriber data

-- On PUBLISHER:
SELECT id, COUNT(*)
FROM your_table
GROUP BY id
HAVING COUNT(*) > 1;

-- Check if IDs on subscriber would conflict with incoming data
-- This helps identify the scope of the problem
```

## 6. **Safe Resume Checklist**

1. **Identify root cause** - prevent recurrence
2. **Check data consistency** - compare publisher/subscriber counts
3. **Backup before changes** - `pg_dump` on subscriber
4. **Test in staging** - if available
5. **Monitor after resume**:

```sql
-- Watch for successful replication
SELECT NOW() - last_msg_receipt_time AS lag
FROM pg_stat_subscription;
```

## 7. **Prevent Future Issues**

- **Never modify data directly on subscriber** - it's read-only in logical replication
- **Monitor replication lag** regularly
- **Set up alerts** for subscription state changes
- **Use conflict detection**: Consider triggers that log conflicts instead of failing

Would you like me to help you create specific diagnostic queries based on your table structure, or walk through any of these resolution strategies in more detail?
With Schema
I'll help you diagnose and resolve this PostgreSQL logical replication issue. Here's a systematic approach:

## **1. Identify the Specific Error**

First, check what's happening:

```sql
-- Check subscription status and error details
SELECT
    subname,
    pid,
    received_lsn,
    latest_end_lsn,
    last_msg_send_time,
    last_msg_receipt_time,
    latest_end_time,
    srsubstate
FROM pg_stat_subscription;

-- Get the specific error message
SELECT * FROM pg_stat_subscription_stats;

-- Check server logs for detailed error messages
-- Look in your PostgreSQL log file for messages like:
-- "ERROR: duplicate key value violates unique constraint"
```

## **2. Find the Conflicting Row**

The error message should tell you which table and constraint failed:

```sql
-- Check the replication origin status
SELECT * FROM pg_replication_origin_status;

-- Look at worker logs for the exact conflicting data
-- The log will show something like:
-- ERROR: duplicate key value violates unique constraint "table_pkey"
-- DETAIL: Key (id)=(123) already exists.
```

## **3. Diagnose the Root Cause**

Common causes:

- **Direct writes to subscriber** - Data was inserted/modified on the subscriber outside of replication
- **Subscription reinitialized** - Data was already present when subscription started
- **Multiple publications** - Overlapping publications causing duplicate applies
- **Failed transaction replay** - Previous partial application left data

Check for direct writes:

```sql
-- On subscriber, check if writes are happening locally
SELECT schemaname, tablename, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema');
```

## **4. Resolution Strategies**

### **Option A: Skip the Conflicting Transaction (Quick Fix)**

⚠️ **Use only if you're certain the conflicting data is identical**

```sql
-- Disable the subscription temporarily
ALTER SUBSCRIPTION your_subscription DISABLE;

-- Advance the replication origin past the problematic transaction
-- First, find the current LSN from pg_stat_subscription
SELECT pg_replication_origin_advance('pg_<subscription_oid>', '<failing_lsn>');

-- Re-enable subscription
ALTER SUBSCRIPTION your_subscription ENABLE;
```

### **Option B: Remove Conflicting Data on Subscriber**

If the publisher's data is authoritative:

```sql
-- Disable subscription
ALTER SUBSCRIPTION your_subscription DISABLE;

-- Delete the conflicting row(s) on subscriber
-- Example:
DELETE FROM your_table WHERE id = 123;

-- Re-enable subscription
ALTER SUBSCRIPTION your_subscription ENABLE;
```

### **Option C: Refresh the Subscription**

For systematic conflicts across many rows:

```sql
-- This will resync all tables (causes downtime)
ALTER SUBSCRIPTION your_subscription REFRESH PUBLICATION WITH (copy_data = true);
```

### **Option D: Use conflict_action (PostgreSQL 17+)**

If you're on PostgreSQL 17 or later:

```sql
-- Set subscription to skip conflicts automatically
ALTER SUBSCRIPTION your_subscription SET (conflict_action = 'skip');
```

## **5. Detailed Diagnosis Steps**

Create a diagnostic script:

```sql
-- Save this as diagnose_replication.sql

-- 1. Check subscription state
\echo '=== Subscription Status ==='
SELECT subname, subenabled, subslotname, subpublications, subsynccommit
FROM pg_subscription;

-- 2. Check worker status
\echo '=== Replication Worker Status ==='
SELECT subname, pid, relid::regclass as table_name, srsubstate, srrelid
FROM pg_stat_subscription s
LEFT JOIN pg_subscription_rel r ON s.subid = r.srsubid;

-- 3. Check for lag
\echo '=== Replication Lag ==='
SELECT subname,
       pg_size_pretty(pg_wal_lsn_diff(received_lsn, latest_end_lsn)) as lag
FROM pg_stat_subscription;

-- 4. Check error statistics
\echo '=== Error Statistics ==='
SELECT * FROM pg_stat_subscription_stats;
```

## **6. Safe Recovery Procedure**

Here's a comprehensive recovery process:

```sql
-- Step 1: Disable subscription
ALTER SUBSCRIPTION your_subscription DISABLE;

-- Step 2: Identify the conflicting data
-- Check PostgreSQL logs for the exact error

-- Step 3: Compare data between publisher and subscriber
-- On publisher:
SELECT * FROM your_table WHERE id = <conflicting_id>;
-- On subscriber:
SELECT * FROM your_table WHERE id = <conflicting_id>;

-- Step 4: Decide on resolution
-- If data is identical -> skip transaction
-- If subscriber data is stale -> delete and re-replicate
-- If subscriber has newer data -> manual merge needed

-- Step 5: Apply fix (example: removing duplicate)
DELETE FROM your_table WHERE id = <conflicting_id>;

-- Step 6: Re-enable subscription
ALTER SUBSCRIPTION your_subscription ENABLE;

-- Step 7: Monitor
SELECT * FROM pg_stat_subscription;
```

## **7. Prevention**

To avoid future issues:

```sql
-- 1. Make subscriber tables read-only (if appropriate)
REVOKE INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public FROM your_app_user;

-- 2. Monitor replication lag
CREATE VIEW replication_health AS
SELECT
    subname,
    CASE
        WHEN pid IS NULL THEN 'STOPPED'
        WHEN srsubstate = 'r' THEN 'RUNNING'
        ELSE srsubstate::text
    END as state,
    pg_wal_lsn_diff(received_lsn, latest_end_lsn) as lag_bytes,
    latest_end_time
FROM pg_stat_subscription;

-- 3. Set up alerting
-- Monitor: state != 'RUNNING' or lag_bytes > threshold
```

## **What information do you need?**

To give you more specific guidance, could you share:

1. The exact error message from the PostgreSQL logs?
2. What version of PostgreSQL you're running?
3. Is the subscriber supposed to be read-only, or do you have bidirectional replication?
4. How did the subscription initially get set up?

This will help me provide a more targeted solution for your specific situation.

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When logical replication stops with constraint violations, start by confirming the replication state and error count in pg_stat_subscription, then extract the specific conflict details from subscriber server logs including the LSN and transaction contents. Identify whether it's a unique constraint conflict, permissions issue, or row-level security problem before deciding whether to fix the conflicting data or skip the transaction—always understanding that skipping can introduce data inconsistencies.

1. Verify subscription state and error accumulation
Check `apply_error_count` and `subscription_state` to confirm the worker has stopped. In native PostgreSQL (15 and later), `apply_error_count` is a column of the pg_stat_subscription_stats view, while pg_stat_subscription reports the apply worker's `pid` and LSN positions. A non-zero `apply_error_count` combined with a state other than 'streaming' confirms replication is halted. Compare `received_lsn` with the last successfully applied LSN to quantify how far behind the subscriber has fallen; this tells you how much data is queued once you resolve the conflict.
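A minimal sketch of this check using the native views, run on the subscriber. It assumes PostgreSQL 15 or later (pg_stat_subscription_stats does not exist before that), and `your_sub` is a placeholder subscription name.

```sql
-- Run on the SUBSCRIBER. 'your_sub' is a placeholder name.
SELECT
    s.subname,
    s.subenabled,                 -- false if the subscription was disabled
    st.apply_error_count,         -- errors raised while applying changes
    st.sync_error_count,          -- errors raised during initial table sync
    w.pid,                        -- NULL when no apply worker is running right now
    w.received_lsn,
    w.latest_end_lsn,
    w.latest_end_time,
    pg_wal_lsn_diff(w.received_lsn, w.latest_end_lsn) AS lsn_gap_bytes
FROM pg_subscription AS s
LEFT JOIN pg_stat_subscription_stats AS st
       ON st.subid = s.oid
LEFT JOIN pg_stat_subscription AS w
       ON w.subid = s.oid
      AND w.relid IS NULL         -- skip per-table sync workers
WHERE s.subname = 'your_sub';
```

Because a failing apply worker is restarted repeatedly, `pid` may alternate between a value and NULL; running the query a few times and watching `apply_error_count` climb is usually more telling than a single snapshot.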
2. Extract conflict details from subscriber server logs
Look in the subscriber's PostgreSQL server log for ERROR messages containing 'CONTEXT: processing remote data for replication origin'. The log entry will show the exact constraint violation (e.g., 'duplicate key value violates unique constraint'), the conflicting table and column values, the finish LSN of the transaction, and the replication origin name. This LSN is critical—you'll need it whether you choose to fix the data or skip the transaction.
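To tie the replication origin name in the log back to a subscription, the catalogs can be joined as in the sketch below. It relies on the convention that a subscription's origin is named `pg_` followed by the subscription OID; reading pg_replication_origin_status may require elevated privileges depending on your version and grants.

```sql
-- Run on the SUBSCRIBER.
-- Maps each subscription to its replication origin and shows the last
-- remote LSN that was successfully applied for that origin.
SELECT
    s.subname,
    o.roname       AS origin_name,      -- e.g. the name shown in the CONTEXT log line
    os.remote_lsn  AS last_applied_remote_lsn
FROM pg_subscription AS s
JOIN pg_replication_origin AS o
  ON o.roname = 'pg_' || s.oid::text
LEFT JOIN pg_replication_origin_status AS os
  ON os.local_id = o.roident;
```

The finish LSN reported in the error log should be ahead of `last_applied_remote_lsn`; if it is not, the log entry may belong to an earlier, already resolved failure.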
3. Query the conflicting row on the subscriber
Using the table name and key values from the error log, SELECT the conflicting row on the subscriber to understand what data already exists. Compare this with the same row on the publisher to determine if the subscriber has stale data, duplicate data from a previous sync, or legitimate local modifications. For insert_exists or update_exists conflicts, the row on the subscriber may be identical to what's being replicated (harmless) or different (data loss risk if skipped).
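One way to make the comparison less error-prone is to hash the whole row on each side, as sketched here. `your_table`, `id`, and the key value `123` are placeholders for the table and key reported in the log; the hashes are only comparable if both nodes have the same columns in the same order.

```sql
-- Run the same query on the PUBLISHER and on the SUBSCRIBER, then compare.
SELECT
    t.*,
    md5(t::text) AS row_hash      -- identical hash on both nodes means identical row text
FROM your_table AS t
WHERE t.id = 123;                 -- placeholder key taken from the error DETAIL line
```

Matching hashes point to the harmless case (the row already exists with the same contents); differing hashes mean the subscriber row has to be reconciled before any skip is considered.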
4. Rule out permissions and row-level security as the root cause
Verify the subscription owner role has INSERT, UPDATE, DELETE, and TRUNCATE grants on the target table—run \dp in psql or query pg_class and pg_roles. Check if row-level security is enabled on the subscriber table with \d+ tablename; if RLS policies exist, they'll cause replication conflicts even for operations that would normally succeed. These permission and RLS issues manifest as different error messages than constraint violations, so if your logs clearly show unique constraint errors, you can deprioritize this step.
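The same checks can be expressed as catalog queries, sketched below with placeholder names (`your_sub`, `public.your_table`). Note one assumption: this inspects the subscription owner, as described above; on PostgreSQL 16 and later changes are applied as the table owner by default, so the table owner's privileges may be the relevant ones there.

```sql
-- Run on the SUBSCRIBER.
-- 1. Privileges of the subscription owner on the target table.
SELECT
    r.rolname AS subscription_owner,
    r.rolsuper,
    r.rolbypassrls,
    has_table_privilege(r.oid, 'public.your_table', 'INSERT')   AS can_insert,
    has_table_privilege(r.oid, 'public.your_table', 'UPDATE')   AS can_update,
    has_table_privilege(r.oid, 'public.your_table', 'DELETE')   AS can_delete,
    has_table_privilege(r.oid, 'public.your_table', 'TRUNCATE') AS can_truncate
FROM pg_subscription AS s
JOIN pg_roles AS r ON r.oid = s.subowner
WHERE s.subname = 'your_sub';

-- 2. Is row-level security enabled (or forced) on the table?
SELECT relname, relrowsecurity, relforcerowsecurity
FROM pg_class
WHERE oid = 'public.your_table'::regclass;

-- 3. Which policies exist on it?
SELECT policyname, cmd, roles
FROM pg_policies
WHERE schemaname = 'public'
  AND tablename  = 'your_table';
```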
5. Decide whether to fix data or skip the conflicting transaction
If the subscriber row is stale or incorrect, DELETE or UPDATE it on the subscriber to allow the replicated change to succeed, then let replication retry automatically. If the subscriber data is correct and matches the publisher, you can skip the transaction using `ALTER SUBSCRIPTION yoursubname SKIP (lsn = 'finish_lsn_from_logs')`, available in PostgreSQL 15 and later. Be aware that skipping discards the entire transaction, not just the conflicting statement: if the transaction contained other changes, those are lost too, creating subscriber inconsistency.
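Both paths might look like the sketch below. It assumes PostgreSQL 15 or later for the skip path and uses placeholders throughout: `your_sub`, `your_table`, the key `123`, and the LSN `0/14C0378`, which stands in for the "finished at" value from your own log. The skip option follows the documented `SKIP ( skip_option = value )` form.

```sql
-- Path 1: the subscriber row is stale, so remove it and let the worker retry.
BEGIN;
DELETE FROM your_table WHERE id = 123;   -- placeholder key from the error log
COMMIT;

-- Path 2: the subscriber row is already correct, so skip the whole remote transaction.
-- The LSN must be the finish LSN reported in the subscriber's log for the failing transaction.
ALTER SUBSCRIPTION your_sub SKIP (lsn = '0/14C0378');

-- Confirm the pending skip; subskiplsn is cleared once the worker has skipped
-- (or successfully finished) a transaction.
SELECT subname, subskiplsn
FROM pg_subscription
WHERE subname = 'your_sub';

-- To cancel a skip that was set by mistake:
-- ALTER SUBSCRIPTION your_sub SKIP (lsn = NONE);
```

Use only one of the two paths for a given conflict; deleting the row and also skipping the transaction would drop the replicated change entirely.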
6. Monitor for recurring conflicts and silent data divergence
After replication resumes, watch `apply_error_count` and `sync_error_count` for new conflicts; recurring errors suggest an ongoing issue such as a misconfigured application writing to the subscriber. Also be aware that UPDATE or DELETE operations on rows that don't exist on the subscriber are silently skipped without errors, so implement periodic data validation queries comparing row counts and checksums between publisher and subscriber to catch silent divergence.
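A simple validation query of the kind described above is sketched here. `your_table` and its key column `id` are placeholders; the checksum is order-sensitive, so the rows are aggregated in key order, and the result is only comparable if both nodes share the same column layout. On large tables this scans every row, so it is better run off-peak or on a key range.

```sql
-- Run on the PUBLISHER and on the SUBSCRIBER at a quiet moment, then compare.
SELECT
    count(*) AS row_count,
    md5(string_agg(md5(t::text), '' ORDER BY t.id)) AS table_hash
FROM your_table AS t;
```

A matching `row_count` with a different `table_hash` indicates rows whose contents diverged, which is the pattern left behind by silently skipped UPDATEs.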

Monitoring Interfaces

PostgreSQL Datadog
PostgreSQL Native
PostgreSQL Prometheus Exporter (pg_exporter)