On Fri, Feb 25, 2022 at 08:31:37PM +0530, Bharath Rupireddy wrote:
> Thanks Satya and others for the inputs. Here's the v1 patch that
> basically allows async wal senders to wait until the sync standbys
> report their flush lsn back to the primary. Please let me know your
> thoughts.

I haven't had a chance to look too closely yet, but IIUC this adds a new
function that waits for synchronous replication. This new function
essentially spins until the synchronous LSN has advanced.

I don't think it's a good idea to block sending any WAL like this.
AFAICT it is possible that there will be a lot of synchronously
replicated WAL that we can send, and it might just be the last several
bytes that cannot yet be replicated to the asynchronous standbys. I
believe this patch will cause the server to avoid sending _any_ WAL
until the synchronous LSN advances.

Perhaps we should instead just choose the SendRqstPtr based on the
current synchronous LSN. Presumably there are other things we'd need to
consider, but in general, I think we ought to send as much WAL as
possible for a given call to XLogSendPhysical().

> I've done pgbench testing to see if the patch causes any problems. I
> ran tests two times, there isn't much difference in the txns per
> seconds (tps), although there's a delay in the async standby receiving
> the WAL, after all, that's the feature we are pursuing.

I'm curious what a longer pgbench run looks like when the synchronous
replicas are in the same region. That is probably a more realistic
use-case.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com