Andres Freund <and...@anarazel.de> writes: > I think the test is telling us that something may be broken. We shouldn't > silence that without at least some understanding what it is.
I looked at the recent failure on komodoensis [1], and I think what is happening is just that the walsender for the basebackup run (launched at 019_replslot_limit.pl line 325) hasn't exited yet at the point where we do a blind "SELECT pid FROM pg_stat_activity WHERE backend_type = 'walsender'" and expect that we're only going to see the walsender launched for the standby at line 331. The two PIDs reported in the failure correspond to this postmaster log trace: 2022-02-16 23:06:29.596 CET [620d7565.38dd62:1] LOG: connection received: host=[local] 2022-02-16 23:06:29.596 CET [620d7565.38dd62:2] LOG: replication connection authorized: user=bf application_name=019_replslot_limit.pl 2022-02-16 23:06:29.596 CET [620d7565.38dd62:3] LOG: received replication command: SHOW data_directory_mode 2022-02-16 23:06:29.596 CET [620d7565.38dd62:4] STATEMENT: SHOW data_directory_mode 2022-02-16 23:06:29.596 CET [620d7565.38dd62:5] LOG: received replication command: CREATE_REPLICATION_SLOT "pg_basebackup_3726690" TEMPORARY PHYSICAL ( RESERVE_WAL) 2022-02-16 23:06:29.596 CET [620d7565.38dd62:6] STATEMENT: CREATE_REPLICATION_SLOT "pg_basebackup_3726690" TEMPORARY PHYSICAL ( RESERVE_WAL) 2022-02-16 23:06:29.597 CET [620d7565.38dd62:7] LOG: received replication command: IDENTIFY_SYSTEM 2022-02-16 23:06:29.597 CET [620d7565.38dd62:8] STATEMENT: IDENTIFY_SYSTEM 2022-02-16 23:06:29.597 CET [620d7565.38dd62:9] LOG: received replication command: START_REPLICATION SLOT "pg_basebackup_3726690" 0/600000 TIMELINE 1 2022-02-16 23:06:29.597 CET [620d7565.38dd62:10] STATEMENT: START_REPLICATION SLOT "pg_basebackup_3726690" 0/600000 TIMELINE 1 and this one: 2022-02-16 23:06:29.687 CET [620d7565.38dd6f:1] LOG: connection received: host=[local] 2022-02-16 23:06:29.687 CET [620d7565.38dd6f:2] LOG: replication connection authorized: user=bf application_name=standby_3 2022-02-16 23:06:29.687 CET [620d7565.38dd6f:3] LOG: received replication command: IDENTIFY_SYSTEM 2022-02-16 23:06:29.687 CET [620d7565.38dd6f:4] STATEMENT: IDENTIFY_SYSTEM 2022-02-16 23:06:29.687 CET [620d7565.38dd6f:5] LOG: received replication command: START_REPLICATION SLOT "rep3" 0/700000 TIMELINE 1 2022-02-16 23:06:29.687 CET [620d7565.38dd6f:6] STATEMENT: START_REPLICATION SLOT "rep3" 0/700000 TIMELINE 1 There's no disconnection log entry for either, which I suppose means that somebody didn't bother logging disconnection for walsenders ... shouldn't we fix that? But in any case, I don't see anything interesting here, just a query that needs to be more selective. Perhaps we can look for application_name=standby_3? regards, tom lane [1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2022-02-16%2021%3A16%3A04