On Thu, Aug 26, 2021 at 1:54 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 9:21 AM Ajin Cherian <itsa...@gmail.com> wrote:
> >
> > On Thu, Aug 26, 2021 at 1:06 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > You have a point but if we see the below logs, it seems the second
> > > walsender (#step6) seemed to exited before the first walsender
> > > (#step4).
> > >
> > > 2021-08-15 18:44:38.041 CEST [16475:10] tap_sub LOG: disconnection:
> > > session time: 0:00:00.036 user=nm database=postgres host=[local]
> > > 2021-08-15 18:44:38.043 CEST [16336:14] tap_sub LOG: disconnection:
> > > session time: 0:00:06.367 user=nm database=postgres host=[local]
> > >
> > > Isn't it possible that pid is cleared in the other order due to which
> > > we are seeing this problem?
> >
> > If the pid is cleared in the other order, wouldn't the query [1] return a
> > false?
> >
> > [1] - " SELECT pid != 16336 FROM pg_stat_replication WHERE
> > application_name = 'tap_sub';"
> >
>
> I think it should return true because pid for 16336 is cleared first
> and the remaining one will be 16475.

Yes, that was what I explained as well. 16336 is PID 'a' (first
walsender) in my explanation. The first walsender should be cleared
first for this theory to work.

regards,
Ajin Cherian
Fujitsu Australia
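
P.S. To make the two orderings concrete, here is a toy Python model of what query [1] would yield depending on which walsender's pid has been cleared. This is only a sketch and not the actual TAP test: the function pid_check and the row lists are made up for illustration, and pg_stat_replication is modelled as a plain list of (pid, application_name) pairs for walsenders whose pid is still set.

```python
# Toy model (assumption): pg_stat_replication is a list of
# (pid, application_name) rows, one per walsender whose pid is still set.

def pid_check(rows, old_pid, application_name="tap_sub"):
    """Mimic: SELECT pid != old_pid FROM pg_stat_replication
              WHERE application_name = '<application_name>';
    Returns one boolean per matching row."""
    return [pid != old_pid for (pid, app) in rows if app == application_name]

FIRST = 16336   # first walsender (PID 'a')
SECOND = 16475  # second walsender

# Both walsenders still visible: one row is false, one is true.
both_visible = [(FIRST, "tap_sub"), (SECOND, "tap_sub")]

# First walsender's pid cleared first: only 16475 remains.
first_cleared = [(SECOND, "tap_sub")]

# Opposite order: second walsender's pid cleared first, 16336 remains.
second_cleared = [(FIRST, "tap_sub")]

for label, rows in [("both visible", both_visible),
                    ("first cleared", first_cleared),
                    ("second cleared", second_cleared)]:
    print(label, pid_check(rows, FIRST))
```

In this model the query yields a single true only when 16336 is the one cleared first, which is the ordering the theory depends on; with the opposite order the single remaining row compares false.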