I wrote:
> Hmm ... desmoxytes has failed this test once, out of four runs since
> it went in:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2021-06-19%2003%3A06%3A04

I studied this failure a bit more, and I think the test itself has
a race condition.  It's doing

    # freeze walsender and walreceiver. Slot will still be active, but walreceiver
    # won't get anything anymore.
    kill 'STOP', $senderpid, $receiverpid;
    $logstart = get_log_size($node_primary3);
    advance_wal($node_primary3, 4);
    ok(find_in_log($node_primary3, "to release replication slot", $logstart),
        "walreceiver termination logged");

The string it's looking for does show up in node_primary3's log, but
not for another second or so; we can see instances of the following
poll_query_until query before that happens.  So the problem is that
there is no interlock to ensure that the walreceiver terminates
before this find_in_log check looks for it.  You should be able to
fix this by adding a retry loop around the find_in_log check (which
would likely mean that you don't need to do multiple advance_wal
iterations here).

However, I agree with reverting the test for now and then trying
again after beta2.

            regards, tom lane
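
P.S. For illustration, a retry loop of the kind described above could
look roughly like the following.  This is only a sketch under stated
assumptions: wait_for_log is a hypothetical helper, not something that
exists in the test suite, it reads the log file directly rather than
going through find_in_log, and the 100ms polling interval and the
timeout argument are arbitrary choices.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

# Hypothetical helper (not part of the existing test): poll $logfile,
# starting at byte $offset, until $pattern shows up or $timeout_secs
# have elapsed.  $pattern is treated as a regular expression.
# Returns 1 if the pattern was found, 0 if we gave up.
sub wait_for_log
{
	my ($logfile, $pattern, $offset, $timeout_secs) = @_;

	# One attempt every 100ms (assumed interval).
	my $attempts = $timeout_secs * 10;

	for (my $i = 0; $i < $attempts; $i++)
	{
		if (open(my $fh, '<', $logfile))
		{
			# Skip everything that was logged before $offset.
			seek($fh, $offset, 0);
			local $/;    # slurp the remainder in one read
			my $contents = <$fh>;
			close($fh);
			return 1 if defined $contents && $contents =~ m/$pattern/;
		}

		# Not there yet; give the server a moment and look again.
		usleep(100_000);
	}
	return 0;
}
```

In the test, the find_in_log check would then be replaced by something
along the lines of
ok(wait_for_log($node_primary3->logfile, "to release replication slot", $logstart, 180), ...),
assuming the log path is obtained the same way find_in_log obtains it.
The point is just that the check keeps looking until the message
arrives, instead of depending on the walreceiver having already been
terminated by the time we look.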