On Thu, Aug 26, 2021 at 7:38 AM Ajin Cherian <itsa...@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 11:02 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
>
> Luckily these logs have the disconnection times of the tap test client
> sessions as well. (not sure why I don't see these when I run these
> tests).
>
> Step 5 could have happened anywhere between 18:44:38.016 and 18:44:38.063
> 18:44:38.016 CEST [16474:3] 001_rep_changes.pl LOG:  statement: SELECT
> pid != 16336 FROM pg_stat_replication WHERE application_name =
> 'tap_sub';
> :
> :
> 18:44:38.063 CEST [16474:4] 001_rep_changes.pl LOG:  disconnection:
> session time: 0:00:00.063 user=nm database=postgres host=[local]
>
> When the query starts, both walsenders are present, but when the query
> completes, both walsenders are gone. The actual query evaluation could
> have happened at any time in between. This is the rare timing window that
> causes this problem.
>

You have a point, but if we look at the logs below, it seems the second
walsender (#step6) exited before the first walsender (#step4).

2021-08-15 18:44:38.041 CEST [16475:10] tap_sub LOG:  disconnection:
session time: 0:00:00.036 user=nm database=postgres host=[local]
2021-08-15 18:44:38.043 CEST [16336:14] tap_sub LOG:  disconnection:
session time: 0:00:06.367 user=nm database=postgres host=[local]

Isn't it possible that the pids were cleared in the opposite order, and
that this is why we are seeing this problem?
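
To make the ordering question concrete, here is a rough Python model (not
PostgreSQL code) of what the test's query would return in each intermediate
state. It assumes that each walsender's row disappears from
pg_stat_replication in one step when it exits, and that the query sees a
single consistent state; neither is confirmed by the logs. The function
names query_result and visible_states are made up for illustration.

```python
OLD_PID = 16336  # first walsender (#step4), pid taken from the logs
NEW_PID = 16475  # second walsender (#step6), pid taken from the logs


def query_result(walsender_pids):
    """Model of:
        SELECT pid != 16336 FROM pg_stat_replication
        WHERE application_name = 'tap_sub';
    Returns one boolean per walsender that is still visible.
    """
    return [pid != OLD_PID for pid in walsender_pids]


def visible_states(exit_order):
    """List the visible walsenders before any exit and after each exit."""
    present = [OLD_PID, NEW_PID]
    states = [list(present)]
    for pid in exit_order:
        present.remove(pid)
        states.append(list(present))
    return states


if __name__ == "__main__":
    # Compare both possible exit orders (assumption: exits are atomic).
    for order in ([OLD_PID, NEW_PID], [NEW_PID, OLD_PID]):
        print("exit order:", order)
        for state in visible_states(order):
            print("  visible:", state, "-> query rows:", query_result(state))
```

In this model, if 16475 goes away first (as the disconnection times above
suggest), there is an intermediate state in which only 16336 is visible and
the query returns a single false row. In the other order, the intermediate
state returns a single true row.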

-- 
With Regards,
Amit Kapila.

