On Tuesday, July 27, 2021 3:59 PM Ajin Cherian <itsa...@gmail.com> wrote:
> On Thu, Jul 8, 2021 at 4:55 PM osumi.takami...@fujitsu.com
> <osumi.takami...@fujitsu.com> wrote:
> >
> > Attached file is the POC patch for this.
> > Current design is to save failed stats data in the ReplicationSlot struct.
> > This is because after the error, I'm not able to access the ReorderBuffer
> > object.
> > Thus, I chose the object where I can interact with at the
> > ReplicationSlotRelease timing.
>
> I think this is a good idea to capture the failed replication stats.
> But I'm wondering how you are deciding if the replication failed or not? Not
> all cases of ReplicationSlotRelease are due to a failure. It could also be
> due to a planned dropping of subscription or disable of subscription. I have
> not tested this but won't the failed stats be updated in this case as well?
> Is that correct?

Yes, what you said is true.

Currently, when I run DROP SUBSCRIPTION or ALTER SUBSCRIPTION DISABLE,
the failed stats values are unintentionally added to pg_stat_replication_slots
if any values are left over. This is because all of those commands, just like
a subscriber apply failure caused by a duplication error, make the publisher
receive an 'X' message in ProcessRepliesIfAny() and take the path that calls
ReplicationSlotRelease().
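
To make the problem concrete, here is a minimal, self-contained sketch of one
possible direction. It is not what the POC patch does and it uses no
PostgreSQL code: ToySlot, ToyReleaseReason and toy_slot_release() are
hypothetical names, and discarding the pending values on a planned release is
only an assumption made for illustration. The idea is simply that the release
path is told why it was reached and reports the saved values only on an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reason codes; not part of PostgreSQL or of the POC patch. */
typedef enum ToyReleaseReason
{
    TOY_RELEASE_ERROR,      /* walsender stopped because of a failure */
    TOY_RELEASE_PLANNED     /* DROP/DISABLE of a subscription, shutdown, ... */
} ToyReleaseReason;

/* Toy stand-in for the slot; field names are illustrative only. */
typedef struct ToySlot
{
    int64_t pending_failed_txns;    /* saved in the slot, not yet reported */
    int64_t reported_failed_txns;   /* what the stats view would show */
} ToySlot;

/*
 * Report the saved values only when the release was caused by an error.
 * In every case the pending values are cleared so they cannot leak into
 * a later release of the same slot.
 */
void
toy_slot_release(ToySlot *slot, ToyReleaseReason reason)
{
    assert(slot != NULL);

    if (reason == TOY_RELEASE_ERROR)
        slot->reported_failed_txns += slot->pending_failed_txns;

    slot->pending_failed_txns = 0;
}
```

With this toy model, a release tagged TOY_RELEASE_PLANNED leaves
reported_failed_txns unchanged, while TOY_RELEASE_ERROR adds the pending
values to it. How the real code would learn the reason is exactly the open
question raised above.
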
Also, other events such as a server stop end up calling the same function,
so after the server restarts, the failed stats values catch up with the
existing (successful) stats values.
Accordingly, I need to change the patch to handle those situations.

Thank you.

Best Regards,
	Takamichi Osumi