Hi,

On 2018-04-11 11:57:20 +1200, Thomas Munro wrote:
> Rebased, but I don't actually like this patch any more. Over in
> another thread[1] I proposed that we should just make exit(1) the
> default behaviour built into latch.c for those cases that don't want
> to do something special (eg SyncRepWaitForLSN()), so we don't finish
> up duplicating the ugly exit(1) code everywhere (or worse, forgetting
> to). Tom Lane seemed to think that was an idea worth pursuing.
>
> I think what we need for PG12 is a patch that does that, and also
> introduces a reused WaitEventSet object in several of these places.
> Then eg SyncRepWaitForLSN() won't be interacting with contended kernel
> objects every time (assuming an implementation like epoll or
> eventually kqueue, completion ports etc is available).
>
> Then if pgarch_ArchiverCopyLoop() and HandleStartupProcInterrupts()
> (ie loops without waiting) adopt a prctl(PR_SET_PDEATHSIG)-based
> approach where available as suggested by Andres[2] or fall back to
> polling a reusable WaitEventSet (timeout = 0), then there'd be no more
> calls to PostmasterIsAlive() outside latch.c.
I'm still unconvinced by this. There are good reasons why code might
busy-loop without checking the latch, and we shouldn't force code to
add more latch checks where they're unnecessary. Resetting the latch
frequently can actually increase the number of context switches
considerably.

- Andres