On Tue, Dec 27, 2022 at 2:50 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Tue, Dec 27, 2022 at 1:44 PM Hayato Kuroda (Fujitsu) > <kuroda.hay...@fujitsu.com> wrote: > > > > Thanks for checking my proposal! > > > > > - * Note that if we determine that there's still more data to send, this > > > - * function will return control to the caller. > > > + * Note that if we determine that there's still more data to send or we > > > are in > > > + * the physical replication more, this function will return control to > > > the > > > + * caller. > > > > > > I think in this comment you meant to say > > > > > > 1. "or we are in physical replication mode and all WALs are not yet > > > replicated" > > > 2. Typo /replication more/replication mode > > > > Firstly I considered 2, but I thought 1 seemed to be better. > > PSA the updated patch. > > > > I think even for logical replication we should check whether there is > any pending WAL (via pq_is_send_pending()) to be sent. Otherwise, what > is the point to send the done message? Also, the caller of > WalSndDone() already has that check which is another reason why I > can't see why you didn't have the same check in function WalSndDone(). > > BTW, even after fixing this, I think logical replication will behave > differently when due to some reason (like time-delayed replication) > send buffer gets full and walsender is not able to send data. I think > this will be less of an issue with physical replication because there > is a separate walreceiver process to flush the WAL which doesn't wait > but the same is not true for logical replication. Do you have any > thoughts on this matter? >
In logical replication, it can happen today as well without time-delayed replication. Basically, say apply worker is waiting to acquire some lock that is already acquired by some backend then it will have the same behavior. I have not verified this, so you may want to check it once. -- With Regards, Amit Kapila.