On Fri, Mar 19, 2021, at 12:23 AM, Tom Lane wrote:
> [ reads code... ]
> ... no, I think the problem is the test is still full of race conditions.
> 
> In the first place, waiting till you see the output of a SELECT that's
> before the useful query is not enough to guarantee that the useful query
> is done, or even started.  That's broken on both sessions.
That's an ugly and fragile mechanism to work around the fact that pump_until
reacts only after the query has returned.

> In the second place, even if the second INSERT has started, you don't know
> that it's reached the point of blocking on the tuple conflict yet.
> Which in turn means that it might not've filled its tuplestore yet.
> 
> In short, this script is designed to ensure that the test query can't
> finish too soon, but that proves nothing about whether the test query
> has even started.  And since you also haven't really guaranteed that the
> intended-to-be-blocking query is done, I don't think that the first
> condition really holds either.
In order to avoid the race condition between filling the tuplestore and killing
the backend, we could use poll_query_until() before SIGKILL to wait for the
temporary file to be created. Do you think this modification will make this
test more stable?


--
Euler Taveira
EDB   https://www.enterprisedb.com/
diff --git a/src/test/recovery/t/022_crash_temp_files.pl b/src/test/recovery/t/022_crash_temp_files.pl
index 8044849b73..41a91ebd06 100644
--- a/src/test/recovery/t/022_crash_temp_files.pl
+++ b/src/test/recovery/t/022_crash_temp_files.pl
@@ -100,6 +100,11 @@ ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
 $killme_stdout = '';
 $killme_stderr = '';
 
+# Wait till a temporary file is created
+$node->poll_query_until(
+	'postgres',
+	'SELECT COUNT(1) FROM pg_ls_dir($$base/pgsql_tmp$$)', '1');
+
 # Kill with SIGKILL
 my $ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');
@@ -168,6 +173,11 @@ ok(pump_until($killme, \$killme_stdout, qr/in-progress-before-sigkill/m),
 $killme_stdout = '';
 $killme_stderr = '';
 
+# Wait till a temporary file is created
+$node->poll_query_until(
+	'postgres',
+	'SELECT COUNT(1) FROM pg_ls_dir($$base/pgsql_tmp$$)', '1');
+
 # Kill with SIGKILL
 $ret = TestLib::system_log('pg_ctl', 'kill', 'KILL', $pid);
 is($ret, 0, 'killed process with KILL');
