On 19/12/2019 12:25, Andrey Borodin wrote:
Hi Fabio!

Thanks for looking into this.

On 19 Dec 2019, at 17:14, Fabio Ugo Venchiarutti <f.venchiaru...@ocado.com> wrote:


You're hitting the CAP theorem (https://en.wikipedia.org/wiki/CAP_theorem).


You cannot do it with fewer than 3 nodes, as the moment you set your standby to 
synchronous to achieve consistency, both your nodes become single points of 
failure.
We have 3 nodes, and the problem is reproducible with all standbys being 
synchronous.



With 3 or more nodes you can perform what is called a quorum write against 
floor(<total_nodes> / 2) + 1 nodes (e.g. 2 out of 3).
The problem seems to be reproducible in quorum commit too.


With 3+ nodes, the "easy" strategy is to require <quorum - 1> synchronous standbys 
via synchronous_standby_names ( 
https://www.postgresql.org/docs/current/runtime-config-replication.html#GUC-SYNCHRONOUS-STANDBY-NAMES
 ).
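
For illustration only (the standby names are made up, and this assumes a 3-node 
cluster, so quorum is 2 and quorum - 1 = 1), the master's postgresql.conf could 
carry something like:

    # ack commits only once at least 1 of the named standbys has flushed the WAL
    synchronous_commit = on
    synchronous_standby_names = 'ANY 1 (standby_a, standby_b)'

(the ANY quorum syntax needs PostgreSQL 10 or later).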


This however makes it tricky to pick the correct standby for promotions during 
auto-failovers, as you need to freeze all the standbys listed in the above 
setting in order to correctly determine which one has the highest WAL location 
without running into race conditions (as the operation is non-atomic, stateful 
and sticky).
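
(As a rough illustration of that check, and assuming writes to the old master 
have already been fenced, one could run something like the following on each 
candidate standby and promote the one reporting the highest position:

    -- compare across candidate standbys; the one with the highest
    -- received/replayed LSN is the best promotion target
    SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn();

The function names are the PostgreSQL 10+ spellings.)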
After promotion of any standby, we can still commit to the old primary with a 
combination of cancel and retry.


AFAICT this pseudo-idempotency issue can only be solved if every query is validated against the quorum.

A quick-and-dirty solution would be to wrap the whole thing in a CTE which also returns a count from pg_stat_replication (a stray/partitioned master would see fewer than quorum - 1 synchronous standbys). It may be possible to do it directly in the RETURNING clause, but I don't have a backend handy to test that.


You can either inspect the result at the client or force an error via some bad cast/zero division in the query.
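
Putting the two together, a rough sketch of what I mean (the table, the column 
and the expected standby count of 2 are made-up placeholders):

    -- hypothetical write wrapped together with a replication sanity check;
    -- the whole statement (and its transaction) errors out if this server
    -- can no longer see at least 2 synchronous/quorum standbys
    WITH my_write AS (
        UPDATE accounts
           SET balance = balance - 100
         WHERE id = 42
     RETURNING id
    ),
    sync_count AS (
        SELECT count(*) AS n
          FROM pg_stat_replication
         WHERE sync_state IN ('sync', 'quorum')
    )
    SELECT w.id,
           1 / (CASE WHEN s.n >= 2 THEN 1 ELSE 0 END) AS quorum_ok  -- division by zero otherwise
      FROM my_write w, sync_count s;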

All the above is however still subject to (admittedly tight) race conditions.


This problem is precisely why I don't use any of the off-the-shelf solutions: last time I checked, none of them had a connection proxy/router to direct clients to the real master rather than to a node that merely thinks it is.




I personally prefer to designate a fixed synchronous set at setup time and 
automatically set a static synchronous_standby_names on the master whenever a 
failover occurs. That allows for a simpler failover mechanism as you know they 
got the latest WAL location.
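
(For illustration, with made-up standby names: after promoting standby_a, the 
failover tooling could pin the new master's setting with something along the 
lines of

    ALTER SYSTEM SET synchronous_standby_names = 'FIRST 1 (standby_b, standby_c)';
    SELECT pg_reload_conf();

so the surviving standbys become the new fixed synchronous set.)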
No, a synchronous standby does not necessarily hold the latest WAL. It has a WAL 
position no earlier than that of any commit acknowledged to a client.


You're right. I should have said "latest WAL holding an acknowledged transaction".


Thanks!

Best regards, Andrey Borodin.


--
Regards

Fabio Ugo Venchiarutti
OSPCFC Network Engineering Dpt.
Ocado Technology


