On Wed, Apr 16, 2025 at 09:18:01AM -0700, Michael Paquier wrote:
> On Wed, Apr 16, 2025 at 09:31:59PM +0700, a.kozhemyakin wrote:
>> After commit 2cce0fe on master
>> 
>> ERROR:  unexpected message type 0x50 during COPY from stdin
>> CONTEXT:  COPY psql_pipeline, line 1
>> Pipeline aborted, command did not run
>> psql: common.c:1510: discardAbortedPipelineResults: Assertion `res == ((void
>> *)0) || result_status == PGRES_PIPELINE_ABORTED' failed.
>> Aborted (core dumped)
> 
> Reproduced here, thanks for the report.  I'll look at that at the
> beginning of next week, adding an open item for now.

The failure is not related to 2cce0fe.  The following sequence fails
as well, as long as there is one SELECT queued after the COPY to mess
up the \getresults: it causes a PGRES_FATAL_ERROR combined with a
"terminating connection because protocol synchronization was lost" on
the backend side, because the server expects COPY data that the client
never sends, and psql is not able to cope with this state:
\startpipeline
COPY psql_pipeline FROM STDIN \bind \sendpipeline
SELECT $1 \bind 'val1' \sendpipeline
\syncpipeline
\getresults
\endpipeline

It's actually nice that we are able to emulate such query patterns in
psql with all these meta-commands; I don't think we have any coverage
yet for a backend synchronization loss like this one.

2cce0fe makes that easier to reach by allowing more command patterns,
but it is the mix of a COPY followed by a SELECT that confuses psql.
None of the existing tests check this kind of scenario for COPY
TO/FROM: they always use a flush or a sync followed quickly by
\getresults, and we have no tests that mix things.

Anyway, I don't think there is much we can do under a
PGRES_FATAL_ERROR in this code path when discarding the pipeline
results.  As far as I can tell, the server has failed the query
abruptly and the whole pipeline flow is borked.  The best thing I can
think of is to discard all the results while decrementing the
counters, then let psql complain about it, as in the attached.  I've
added two TAP tests for coverage; since these trigger a FATAL in the
backend, the normal SQL regression route cannot be used.

@Anthonin: Any thoughts or comments, perhaps?  A second opinion would
be welcome here.
--
Michael
From cec3613b43f536ec1f91a75a034264ed304cd628 Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Mon, 21 Apr 2025 15:17:43 +0900
Subject: [PATCH] psql: Fix assertion failure with pipeline mode

---
 src/bin/psql/common.c       | 20 ++++++++++++++++++++
 src/bin/psql/t/001_basic.pl | 35 +++++++++++++++++++++++++++++++----
 2 files changed, 51 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 21d660a8961a..6eb56d7bb66e 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -1478,6 +1478,26 @@ discardAbortedPipelineResults(void)
 			 */
 			return res;
 		}
+		else if (result_status == PGRES_FATAL_ERROR)
+		{
+			/*
+			 * Fatal error found from a query execution failure, there is not
+			 * much that can be done, so give up and clear all the results
+			 * available.  There may be more than one result stacked.
+			 */
+			PQclear(res);
+
+			if (pset.available_results > 0)
+				pset.available_results--;
+			if (pset.requested_results > 0)
+				pset.requested_results--;
+
+			/* leave if no more results */
+			if (pset.requested_results == 0)
+				return NULL;
+			else
+				continue;
+		}
 		else if (res == NULL)
 		{
 			/* A query was processed, decrement the counters */
diff --git a/src/bin/psql/t/001_basic.pl b/src/bin/psql/t/001_basic.pl
index 739cb4397087..91c56cd7c08c 100644
--- a/src/bin/psql/t/001_basic.pl
+++ b/src/bin/psql/t/001_basic.pl
@@ -34,11 +34,13 @@ sub psql_fails_like
 {
 	local $Test::Builder::Level = $Test::Builder::Level + 1;
 
-	my ($node, $sql, $expected_stderr, $test_name) = @_;
+	my ($node, $sql, $expected_stderr, $test_name, $replication) = @_;
+
+	# Use the context of a WAL sender, if requested by the caller.
+	$replication = '' unless defined($replication);
 
-	# Use the context of a WAL sender, some of the tests rely on that.
 	my ($ret, $stdout, $stderr) =
-	  $node->psql('postgres', $sql, replication => 'database');
+	  $node->psql('postgres', $sql, replication => $replication);
 
 	isnt($ret, 0, "$test_name: exit code not 0");
 	like($stderr, $expected_stderr, "$test_name: matches");
@@ -79,7 +81,8 @@ psql_fails_like(
 	$node,
 	'START_REPLICATION 0/0',
 	qr/unexpected PQresultStatus: 8$/,
-	'handling of unexpected PQresultStatus');
+	'handling of unexpected PQresultStatus',
+	'database');
 
 # test \timing
 psql_like(
@@ -481,4 +484,28 @@ psql_like($node, "copy (values ('foo'),('bar')) to stdout \\g | $pipe_cmd",
 my $c4 = slurp_file($g_file);
 like($c4, qr/foo.*bar/s);
 
+# Tests with pipelines.  These trigger FATAL failures in the backend,
+# so they cannot be tested through the SQL regression tests.
+$node->safe_psql('postgres', 'CREATE TABLE psql_pipeline()');
+psql_fails_like(
+	$node,
+	qq{\\startpipeline
+COPY psql_pipeline FROM STDIN;
+SELECT 'val1';
+\\syncpipeline
+\\getresults
+\\endpipeline},
+	qr/unexpected message type 0x50 during COPY from stdin/,
+	'handling of protocol synchronization loss with pipelines');
+psql_fails_like(
+	$node,
+	qq{\\startpipeline
+COPY psql_pipeline FROM STDIN \\bind \\sendpipeline
+SELECT 'val1' \\bind \\sendpipeline
+\\syncpipeline
+\\getresults
+\\endpipeline},
+	qr/unexpected message type 0x50 during COPY from stdin/,
+	'handling of protocol synchronization loss with pipelines');
+
 done_testing();
-- 
2.49.0
