On Mon, Jun 7, 2021 at 3:13 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Mon, Jun 7, 2021 at 12:54 PM Kyotaro Horiguchi
> <horikyota....@gmail.com> wrote:
> >
> > At Sat, 5 Jun 2021 16:08:00 +0500, Abbas Butt
> > <abbas.b...@enterprisedb.com> wrote in
> > > Hi,
> > > I have observed the following behavior with PostgreSQL 13.3.
> > >
> > > The WAL sender process sends approximately 500 keepalive messages
> > > per second to pg_recvlogical.
> > > These keepalive messages are totally unnecessary.
> > > Keepalives should be sent only if there is no network traffic and
> > > a certain time (half of wal_sender_timeout) passes.
> > > These keepalive messages not only choke the network but also
> > > impact the performance of the receiver, because the receiver has
> > > to process the received message and then decide whether to reply
> > > to it or not.
> > > The receiver remains busy doing this activity 500 times a second.
> >
> > I can reproduce the problem.
> >
> > > On investigation it is revealed that the following code fragment
> > > in function WalSndWaitForWal in file walsender.c is responsible
> > > for sending these frequent keepalives:
> > >
> > > if (MyWalSnd->flush < sentPtr &&
> > >     MyWalSnd->write < sentPtr &&
> > >     !waiting_for_ping_response)
> > >         WalSndKeepalive(false);
> >
> > The immediate cause is that pg_recvlogical doesn't send a reply
> > before sleeping. Currently it sends replies at 10-second intervals.
> >
>
> Yeah, but one can use the -s option to send it at shorter intervals.
>

That option only affects pg_recvlogical; it does not stop the server
from sending keepalives too frequently.
The default status interval is 10 seconds, yet we are still getting
500 keepalives a second from the server.

> > So the attached first patch stops the flood.
> >
>
> I am not sure sending feedback every time before sleep is a good idea;
> this might lead to unnecessarily sending more messages. Can we try
> using a one-second interval with the -s option to see how it behaves?
> As a matter of comparison, the similar logic in workers.c uses
> wal_receiver_timeout to send such an update message rather than
> sending it every time before sleep.
>
> > That said, I don't think it is intended that logical walsender
> > sends keep-alive packets with such a high frequency. It happens
> > because walsender actually doesn't wait at all: it waits on
> > WL_SOCKET_WRITEABLE, and the keep-alive packet inserted just before
> > is always pending.
> >
> > So, as in the attached second patch, we should try to flush out the
> > keep-alive packets if possible before checking pq_is_send_pending().
> >
>
>         /* Send keepalive if the time has come */
>         WalSndKeepaliveIfNecessary();
>
> +       /* We may have queued a keep alive packet. flush it before sleeping. */
> +       pq_flush_if_writable();
>
> We already call pq_flush_if_writable() from WalSndKeepaliveIfNecessary
> after sending the keep-alive message, so not sure how this helps?
>
> --
> With Regards,
> Amit Kapila.
>

--
Abbas
Senior Architect
edbpostgres.com