Hi!

I cannot figure out the proper way to implement a safe HA upsert. I would be
very grateful if someone could help me.

Imagine we have a primary server after a failover. It is network-partitioned.
We run an INSERT ... ON CONFLICT DO NOTHING that eventually times out:

az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
    INSERT INTO t(
        pk,
        v,
        dt
    )
    VALUES
    (
        5,
        'text',
        now()
    )
    ON CONFLICT (pk) DO NOTHING
    RETURNING pk,
              v,
              dt)
   SELECT new_doc.pk from new_doc;
^CCancel request sent
WARNING:  01000: canceling wait for synchronous replication due to user request
DETAIL:  The transaction has already committed locally, but might not have been 
replicated to the standby.
LOCATION:  SyncRepWaitForLSN, syncrep.c:264
Time: 2173.770 ms (00:02.174)

At this point our driver decides that something went wrong, and we retry the
query.

az1-grx88oegoy6mrv2i/db1 M > WITH new_doc AS (
    INSERT INTO t(
        pk,
        v,
        dt
    )
    VALUES
    (
        5,
        'text',
        now()
    )
    ON CONFLICT (pk) DO NOTHING
    RETURNING pk,
              v,
              dt)
   SELECT new_doc.pk from new_doc;
 pk
----
(0 rows)

Time: 4.785 ms
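
The zero-row result is consistent with the warning above: the first attempt
committed locally, so the retry hits a conflict on pk and does nothing. As a
rough check (only a sketch, assuming the session is still connected to the same
partitioned primary), a plain read there should show the row, even though it
may not exist on the standby:

```sql
-- Run on the partitioned primary; the table t and pk = 5 come from the
-- example above. Nothing here proves the row reached the standby.
SELECT pk, v, dt
FROM t
WHERE pk = 5;
```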

Now we have a split-brain: the retry returned without an error, so we
acknowledged that row to the client, although it may never have been
replicated to the standby.
How can I fix this?

There must be some obvious trick, but I cannot see it... Or maybe canceling
the wait for synchronous replication should be disallowed, and termination
should be treated as a system failure?

Best regards, Andrey Borodin.
