On Fri, Mar 19, 2021, at 12:23 AM, Tom Lane wrote:
> [ reads code... ]
> ... no, I think the problem is the test is still full of race conditions.
>
> In the first place, waiting till you see the output of a SELECT that's
> before the useful query is not enough to guarantee that the useful query
> is done, or even started.  That's broken on both sessions.

That's an ugly and fragile mechanism to work around the fact that pump_until
reacts only after the query has returned.

> In the second place, even if the second INSERT has started, you don't know
> that it's reached the point of blocking on the tuple conflict yet.
> Which in turn means that it might not've filled its tuplestore yet.
>
> In short, this script is designed to ensure that the test query can't
> finish too soon, but that proves nothing about whether the test query
> has even started.  And since you also haven't really guaranteed that the
> intended-to-be-blocking query is done, I don't think that the first
> condition really holds either.

In order to avoid the race condition between filling the tuplestore and
killing the backend, we could use poll_query_until() before SIGKILL to wait
for the temporary file to be created. Do you think this modification will
make this test more stable?

--
Euler Taveira
EDB   https://www.enterprisedb.com/

diff --git a/src/test/recovery/t/022_crash_temp_files.pl b/src/test/recovery/t/022_crash_temp_files.pl
index 8044849b73..41a91ebd06 100644
--- a/src/test/recovery/t/022_crash_temp_files.pl
+++ b/src/test/recovery/t/022_crash_temp_files.pl
@@ -100,6 +100,11 @@ ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
 $killme_stdout = '';
 $killme_stderr = '';
 
+# Wait till a temporary file is created
+$node->poll_query_until(
+	'postgres',
+	'SELECT COUNT(1) FROM pg_ls_dir($$base/pgsql_tmp$$)', '1');
+
 # Kill with SIGKILL
 my $ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');
@@ -168,6 +173,11 @@ ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
 $killme_stdout = '';
 $killme_stderr = '';
 
+# Wait till a temporary file is created
+$node->poll_query_until(
+	'postgres',
+	'SELECT COUNT(1) FROM pg_ls_dir($$base/pgsql_tmp$$)', '1');
+
 # Kill with SIGKILL
 $ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');