On Thu, 27 Mar 2025 at 16:19, Andres Freund <[email protected]> wrote:
And we don't even just add this message when the connection was actually closed unexpectedly, we often do it even when we *did* get a FATAL, as in this example:psql -c 'select pg_terminate_backend(pg_backend_pid())' FATAL: 57P01: terminating connection due to administrator command LOCATION: ProcessInterrupts, postgres.c:3351 server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. connection to server was lost I think this one is mostly a weakness in how libpq tracks connection state, but it kind of shows the silliness of claiming postgres probably crashed.
I ran into this for the nth time (this time while trying to have psql handle certain FATAL errors differently). Turns out fixing this is actually really simple. All that's needed is to mark a connection as CONNECTION_BAD whenever a FATAL or PANIC error is received by the client. (this change is intended for PG20)
From a299e7c7779331dbc09d2c8bfc5923153e15763a Mon Sep 17 00:00:00 2001 From: Jelte Fennema-Nio <[email protected]> Date: Tue, 26 May 2026 09:05:56 +0200 Subject: [PATCH v1] libpq: Consider a connection with a FATAL error to be closed This starts marking a connection as closed (i.e. CONNECTION_BAD) when the client receives a FATAL/PANIC error. Previously any FATAL error would get the the "server closed the connection unexpectedly" string appended like such: FATAL: 57P01: terminating connection due to administrator command LOCATION: ProcessInterrupts, postgres.c:3431 server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. This addition to the error is just plain incorrect, the server told the client that it was closing the connection. So it's not unexpected, nor did the server terminate abnormally. It also makes the error harder to parse by a client, because it would lose the ability to use PQresultErrorField on the final PGresult. --- src/interfaces/libpq/fe-protocol3.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c index 840e018cd18..504e8592196 100644 --- a/src/interfaces/libpq/fe-protocol3.c +++ b/src/interfaces/libpq/fe-protocol3.c @@ -955,6 +955,25 @@ pqGetErrorNotice3(PGconn *conn, bool isError) sizeof(conn->last_sqlstate)); else if (id == PG_DIAG_STATEMENT_POSITION) have_position = true; + else if (isError && id == PG_DIAG_SEVERITY_NONLOCALIZED && + (strcmp(workBuf.data, "FATAL") == 0 || + strcmp(workBuf.data, "PANIC") == 0)) + { + /* + * A FATAL or PANIC from the server means the backend is going to + * tear the connection down right after delivering this message. + * Mark the connection bad immediately so callers that drain + * results (PQexecFinish, PQexecStart's discard loop, etc.) stop + * reading from the socket after receiving this result. Further + * reads from the socket will receive an EOF, which would cause us + * to incorrectly report this as an unexpected connection closure + * by appending "server closed the connection unexpectedly ..." to + * the server's own error message. We read SEVERITY_NONLOCALIZED + * rather than SEVERITY so the check is independent of the + * server's lc_messages setting. + */ + conn->status = CONNECTION_BAD; + } } /* base-commit: 2c4bd2bf5700db98be0602854a8b7fa2c16b5f4a -- 2.54.0
