Hi, Thanks for working on the patch!
On 2023-05-15 20:09:00 +0530, Bharath Rupireddy wrote: > > [1] > > max_wal_senders = 100 > > before regression(ms) after regression(ms) v2 patch(ms) > > 13394.4013 14141.2615 13455.2543 > > Compared with before regression 5.58% 0.45% > > > > max_wal_senders = 200 > > before regression(ms) after regression(ms) v2 patch(ms) > > 13280.8507 14597.1173 13632.0606 > > Compared with before regression 9.91% 1.64% > > > > max_wal_senders = 300 > > before regression(ms) after regression(ms) v2 patch(ms) > > 13535.0232 16735.7379 13705.7135 > > Compared with before regression 23.65% 1.26% > > Yes, the numbers with v2 patch look close to where we were before. > Thanks for confirming. Just wondering, where does this extra > 0.45%/1.64%/1.26% coming from? We still do more work for each WAL record than before, so I'd expect something small. I'd say right now the main overhead with the patch comes from the spinlock acquisitions in ConditionVariableBroadcast(), which happen even when nobody is waiting. I'll try to come up with a benchmark without the issues I pointed out in https://postgr.es/m/20230517194331.ficfy5brpfq5lrmz%40awork3.anarazel.de > + ConditionVariableInit(&WalSndCtl->physicalWALSndCV); > + ConditionVariableInit(&WalSndCtl->logicalWALSndCV); It's not obvious to me that it's worth having two CVs, because it's more expensive to find no waiters in two CVs than to find no waiters in one CV. > + /* > + * We use condition variable (CV) to efficiently wake up walsenders in > + * WalSndWakeup(). > + * > + * Every walsender prepares to sleep on a shared memory CV. Note that it > + * just prepares to sleep on the CV (i.e., adds itself to the CV's > + * waitlist), but not actually waits on the CV (IOW, it never calls > + * ConditionVariableSleep()). It still uses WaitEventSetWait() for > waiting, > + * because CV infrastructure doesn't handle FeBe socket events > currently. > + * The processes (startup process, walreceiver etc.) wanting to wake up > + * walsenders use ConditionVariableBroadcast(), which in turn calls > + * SetLatch(), helping walsenders come out of WaitEventSetWait(). > + * > + * This approach is simple and efficient because, one doesn't have to > loop > + * through all the walsenders slots, with a spinlock acquisition and > + * release for every iteration, just to wake up only the waiting > + * walsenders. It makes WalSndWakeup() callers' life easy. > + * > + * XXX: When available, WaitEventSetWait() can be replaced with its CV's > + * counterpart. I don't really understand that XXX - the potential bright future would be to add support for CVs into wait event sets, not to replace WES with a CV? Greetings, Andres Freund