On Thu, Sep 30, 2021 at 8:49 AM Hou, Zhijie/侯 志杰 <houzj.f...@fujitsu.com> wrote: > > I noticed another patch that Horiguchi-San posted earlier[1]. > > The approach in that patch is to splits the sleep into two > pieces. If the first sleep reaches the timeout then send a keepalive > then sleep for the remaining time. > > I tested that patch and can see the keepalive message is reduced and > the patch won't cause the current regression test fail. > > Since I didn't find some comments about that patch, > I wonder did we find some problems about that patch ? >
I am not able to understand some parts of that patch. + If the sleep is shorter + * than KEEPALIVE_TIMEOUT milliseconds, we skip sending a keepalive to + * prevent it from getting too-frequent. + */ + if (MyWalSnd->flush < sentPtr && + MyWalSnd->write < sentPtr && + !waiting_for_ping_response) + { + if (sleeptime > KEEPALIVE_TIMEOUT) + { + int r; + + r = WalSndWait(wakeEvents, KEEPALIVE_TIMEOUT, + WAIT_EVENT_WAL_SENDER_WAIT_WAL); + + if (r != 0) + continue; + + sleeptime -= KEEPALIVE_TIMEOUT; + } + + WalSndKeepalive(false); It claims to skip sending keepalive if the sleep time is shorter than KEEPALIVE_TIMEOUT (a new threshold) but the above code seems to suggest it sends in both cases. What am I missing? Also, more to the point this special keep_alive seems to be sent for synchronous replication and walsender shutdown as they can expect updated locations. You haven't given any reason (theory) why those two won't be impacted due to this change? I am aware that for synchronous replication, we wake waiters while ProcessStandbyReplyMessage but I am not sure how it helps with wal sender shutdown? I think we need to know the reasons for this message and then need to see if the change has any impact on the same. -- With Regards, Amit Kapila.