On Sat, Jul 6, 2019 at 12:27 AM Stas Kelvich <s.kelv...@postgrespro.ru> wrote: > Well, indeed in case of cable disconnect only way to detect it with > proposed approach is to have tcp keepalive. However if disconnection > happens due to client application shutdown then client OS should itself > properly close than connection and therefore this patch will detect > such situation without keepalives configured.
Yeah. +1 for this patch, with a few adjustments including making the test use pg_sleep() as mentioned. It does something useful, namely cancelling very long running queries sooner if the client has gone away instead of discovering that potentially much later when sending a response. It does so with a portable kernel interface (though we haven't heard from a Windows tester), and I think it's using it in a safe way (we're not doing the various bad things you can do with MSG_PEEK, and the fd is expected to be valid for the process's lifetime, and the socket is always in non-blocking mode*, so I don't think there is any bad time for pg_check_client_connection() to run). It reuses the existing timer infrastructure so there isn't really any new overhead. One syscall every 10 seconds or whatever at the next available CFI is basically nothing. On its own, this patch will reliably detect clients that closed abruptly or exited/crashed (so they client kernel sends a FIN packet). In combination with TCP keepalive, it'll also detect clients that went away because the network or client kernel ceased to exist. *There are apparently no callers of pg_set_block(), so if you survived pq_init() you have a non-blocking socket. If you're on Windows, the code always sets the magic pgwin32_noblock global flag before trying to peek. I wondered if it's OK that the CFI would effectively clobber that with 0 on its way out, but that seems to be OK because every place in the code that sets that flag does so immediately before an IO operationg without a CFI in between. As the comment in pgstat.c says "This is extremely broken and should be fixed someday.". I wonder if we even need that flag at all now that all socket IO is non-blocking. -- Thomas Munro https://enterprisedb.com