Hi, On 2022-04-04 00:50:27 -0400, Tom Lane wrote: > My pet dinosaur gaur just failed [1] in > src/test/recovery/t/022_crash_temp_files.pl, which does this: > > ----- > my $ret = PostgreSQL::Test::Utils::system_log('pg_ctl', 'kill', 'KILL', $pid); > is($ret, 0, 'killed process with KILL'); > > # Close psql session > $killme->finish; > $killme2->finish; > > # Wait till server restarts > $node->poll_query_until('postgres', undef, ''); > ----- > > It's hard to be totally sure, but I think what happened is that > gaur hit the in-hindsight-obvious race condition in this code: > we managed to execute a successful iteration of poll_query_until > before the postmaster had noticed its dead child and commenced > the restart. The test lines after these are not prepared to see > failure-to-connect. > > It's not obvious to me how to remove this race condition. > Thoughts?
Maybe we can use pump_until() with the psql that's not getting killed? With a non-matching regex? That'd only return once the backend was killed by postmaster, afaics? Greetings, Andres Freund