Hi, On 2021-12-11 17:41:34 +1300, Thomas Munro wrote: > --- a/src/backend/libpq/pqcomm.c > +++ b/src/backend/libpq/pqcomm.c > @@ -1959,22 +1959,36 @@ pq_settcpusertimeout(int timeout, Port *port) > bool > pq_check_connection(void) > { > -#if defined(POLLRDHUP) > - /* > - * POLLRDHUP is a Linux extension to poll(2) to detect sockets closed by > - * the other end. We don't have a portable way to do that without > - * actually trying to read or write data on other systems. We don't > want > - * to read because that would be confused by pipelined queries and COPY > - * data. Perhaps in future we'll try to write a heartbeat message > instead. > - */ > - struct pollfd pollfd; > + WaitEvent event; > int rc; > > - pollfd.fd = MyProcPort->sock; > - pollfd.events = POLLOUT | POLLIN | POLLRDHUP; > - pollfd.revents = 0; > + /* > + * Temporarily ignore the latch, so that we can poll for just the one > + * event we care about. > + */ > + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetLatchPos, WL_LATCH_SET, NULL); > > - rc = poll(&pollfd, 1, 0); > + PG_TRY(); > + { > + /* > + * It's OK to clobber the socket event to report only the event > we're > + * interested in without restoring the previous state > afterwards, > + * because every FeBeWaitSet wait site does the same. > + */ > + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetSocketPos, > WL_SOCKET_CLOSED, > + NULL); > + rc = WaitEventSetWait(FeBeWaitSet, 0, &event, 1, 0); > + } > + PG_FINALLY(); > + { > + /* > + * Restore the latch, so we can't leave FeBeWaitSet in a broken > state > + * that ignores latches. > + */ > + ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetLatchPos, WL_LATCH_SET, > + MyLatch); > + } > + PG_END_TRY();
Yuck. Is there really no better way to deal with this? What kind of errors is this trying to handle transparently? Afaics this still changes when we'd e.g. detect postmaster death. Am I misunderstanding code or comment, or is the comment saying that it's ok to clobber the wes, but then we actually unclobber it? Greetings, Andres Freund