Disconnect if socket cannot be put into non-blocking mode

Heikki Linnakangas Mon, 11 Mar 2024 07:44:57 -0700

While self-reviewing my "Refactoring backend fork+exec code" patches, Inoticed this in pq_init():

        /*
         * In backends (as soon as forked) we operate the underlying socket in
         * nonblocking mode and use latches to implement blocking semantics if
         * needed. That allows us to provide safely interruptible reads and
         * writes.
         *
         * Use COMMERROR on failure, because ERROR would try to send the error 
to
         * the client, which might require changing the mode again, leading to
         * infinite recursion.
         */
#ifndef WIN32
        if (!pg_set_noblock(MyProcPort->sock))
                ereport(COMMERROR,
                                (errmsg("could not set socket to nonblocking mode: 
%m")));
#endif


#ifndef WIN32

        /* Don't give the socket to any subprograms we execute. */
        if (fcntl(MyProcPort->sock, F_SETFD, FD_CLOEXEC) < 0)
                elog(FATAL, "fcntl(F_SETFD) failed on socket: %m");
#endif

Using COMMERROR here seems bogus. Firstly, if there was a problem withrecursion, surely the elog(FATAL) that follows would also be wrong. Butmore seriously, it's not cool to continue using the connection as ifeverything is OK, if we fail to put it into non-blocking mode. We shoulddisconnect. (COMMERROR merely logs the error, it does not bail out likeERROR does)


Barring objections, I'll commit and backpatch the attached to fix that.

--
Heikki Linnakangas
Neon (https://neon.tech)

From fc737586783a05672ff8eb96e245dfdeeadcd506 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
Date: Mon, 11 Mar 2024 16:36:37 +0200
Subject: [PATCH 1/1] Disconnect if socket cannot be put into non-blocking mode

Commit 387da18874 moved the code to put socket into non-blocking mode
from socket_set_nonblocking() into the one-time initialization
function, pg_init(). In socket_set_nonblocking(), there indeed was a
risk of recursion on failure like the comment said, but in pg_init(),
ERROR or FATAL is fine. There's even another elog(FATAL) just after
this, if setting FD_CLOEXEC fails.

Note that COMMERROR merely logged the error, it did not close the
connection, so if putting the socket to non-blocking mode failed we
would use the connection anyway. You might not immediately notice,
because most socket operations in a regular backend wait for the
socket to become readable/writable anyway. But e.g. replication will
be quite broken.

Backpatch to all supported versions.

Discussion: xx
---
 src/backend/libpq/pqcomm.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index c606bf34473..d9e49ca7028 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -189,14 +189,10 @@ pq_init(void)
 	 * nonblocking mode and use latches to implement blocking semantics if
 	 * needed. That allows us to provide safely interruptible reads and
 	 * writes.
-	 *
-	 * Use COMMERROR on failure, because ERROR would try to send the error to
-	 * the client, which might require changing the mode again, leading to
-	 * infinite recursion.
 	 */
 #ifndef WIN32
 	if (!pg_set_noblock(MyProcPort->sock))
-		ereport(COMMERROR,
+		ereport(FATAL,
 				(errmsg("could not set socket to nonblocking mode: %m")));
 #endif

-- 
2.39.2

Disconnect if socket cannot be put into non-blocking mode

Reply via email to