On Tue, Jan 31, 2023 at 5:03 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com> wrote:
>
> On Tue, Jan 31, 2023 at 4:58 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > Thanks, the patch looks good to me. I have slightly adjusted one of
> > the comments and ran pgindent. See attached. As mentioned in the
> > commit message, we shouldn't backpatch this as this requires a new
> > callback and moreover, users can increase the wal_sender_timeout and
> > wal_receiver_timeout to avoid this problem. What do you think?
>
> The callback and the implementation is all in core. What's the risk
> you see in backpatching it?
>

Because we are changing an exposed structure, which can break
existing extensions that use it.

> Customers can adjust the timeouts, but only after the receiver has
> timed out a few times. Replication remains broken till they notice it
> and adjust timeouts. By that time WAL has piled up. It also takes a
> few attempts to increase timeouts since the time taken by a
> transaction to decode can not be estimated beforehand. All that makes
> it worth back-patching if it's possible. We had a customer who piled
> up GBs of WAL before realising that this is the problem. Their system
> almost came to a halt due to that.
>

Which version are they using? If they are at >=14, using "streaming =
on" for a subscription should also avoid this problem.

--
With Regards,
Amit Kapila.