On Friday, May 12, 2023 7:58 PM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote: > > On Wed, May 10, 2023 at 3:23 PM Drouvot, Bertrand > <bertranddrouvot...@gmail.com> wrote: > > > > >> My current guess is that mis-using a condition variable is the best > > >> bet. I think it should work to use > > >> ConditionVariablePrepareToSleep() before a WalSndWait(), and then > > >> ConditionVariableCancelSleep(). I.e. to never use > > >> ConditionVariableSleep(). The latch set from > ConditionVariableBroadcast() would still cause the necessary wakeup. > > > > > > How about something like the attached? Recovery and subscription > > > tests don't complain with the patch. > > > > I launched a full Cirrus CI test with it but it failed on one > > environment (did not look in details, just sharing this here): > > https://cirrus-ci.com/task/6570140767092736 > > Yeah, v1 had ConditionVariableInit() such that the CV was getting initialized > for > every backend as opposed to just once after the WAL sender shmem was > created. > > > Also I have a few comments: > > Indeed, v1 was a WIP patch. Please have a look at the attached v2 patch, which > has comments and passing CI runs on all platforms - > https://github.com/BRupireddy/postgres/tree/optimize_walsender_wakeup_ > logic_v2. > > On Wed, May 10, 2023 at 3:41 PM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> > wrote: > > > > if (AllowCascadeReplication()) > > - WalSndWakeup(switchedTLI, true); > > + ConditionVariableBroadcast(&WalSndCtl->cv); > > > > After the change, we wakeup physical walsender regardless of switchedTLI > flag. > > Is this intentional ? if so, I think It would be better to update the > > comments > above this. > > That's not the case with the attached v2 patch. Please have a look.
Thanks for updating the patch. I did some simple primary->standby replication test for the patch and can see the degradation doesn't happen in the replication after applying it[1]. One nitpick in the comment: + * walsenders. It makes WalSndWakeup() callers life easy. callers life easy => callers' life easy. [1] max_wal_senders = 100 before regression(ms) after regression(ms) v2 patch(ms) 13394.4013 14141.2615 13455.2543 Compared with before regression 5.58% 0.45% max_wal_senders = 200 before regression(ms) after regression(ms) v2 patch(ms) 13280.8507 14597.1173 13632.0606 Compared with before regression 9.91% 1.64% max_wal_senders = 300 before regression(ms) after regression(ms) v2 patch(ms) 13535.0232 16735.7379 13705.7135 Compared with before regression 23.65% 1.26% Best Regards, Hou zj