Hi, On 2023-12-09 12:41:30 -0500, Tom Lane wrote: > I wrote: > > Andres Freund <and...@anarazel.de> writes: > >> I was wondering about that too. But if we do so, why not also do it for > >> writes? > > > Writes don't act that way, do they? EOF on a pipe gives you an error, > > not silently reporting that zero bytes were written and leaving you > > to retry indefinitely. > > On further reflection I realized that you're right so far as the SSL > code path goes, because SSL_write() can involve physical reads as well > as writes, so at least in principle it's possible that we'd see EOF > reported this way from that function.
Heh. I'll just claim that's what I was thinking about. > Also, the libpq side does need work of the same sort, leading to the > v2-0001 patch attached. I'd perhaps add a comment explaining why it's plausible that we'd see that that in the write case. > I also realized that we have more or less the same problem at the > caller level, allowing a similar failure for non-SSL connections. > So I'm also proposing 0002 attached. Your results from aggregated > logs didn't show "could not receive data from client: Success" as a > common case, but since we weren't bothering to zero errno beforehand, > it's likely that such failures would show up with very random errnos. I did only look at the top ~100 internal errors, after trying to filter out extensions, i.e. the list wasn't exhaustive. There's also very few non-ssl connections. But I just checked, and for that error message, I do see some XX000, but only in older versions. There's ENETUNREACH, ECONNRESET, ETIMEDOUT, EHOSTUNREACH which these days are all handled as non XX000 by errcode_for_socket_access(). > diff --git a/src/interfaces/libpq/fe-secure.c > b/src/interfaces/libpq/fe-secure.c > index bd72a87bbb..76b647ce1c 100644 > --- a/src/interfaces/libpq/fe-secure.c > +++ b/src/interfaces/libpq/fe-secure.c > @@ -211,6 +211,8 @@ pqsecure_raw_read(PGconn *conn, void *ptr, size_t len) > int result_errno = 0; > char sebuf[PG_STRERROR_R_BUFLEN]; > > + SOCK_ERRNO_SET(0); > + > n = recv(conn->sock, ptr, len, 0); > > if (n < 0) > @@ -232,6 +234,7 @@ pqsecure_raw_read(PGconn *conn, void *ptr, size_t len) > > case EPIPE: > case ECONNRESET: > + case 0: /* treat as EOF */ > libpq_append_conn_error(conn, "server closed > the connection unexpectedly\n" > > "\tThis probably means the server terminated abnormally\n" > > "\tbefore or while processing the request."); If we were treating it as EOF, we'd not "queue" an error message, no? Normally recv() returns 0 in that case, so we'd just return, right? Greetings, Andres Freund