Hi, On 2022-12-08 18:08:15 -0800, Andres Freund wrote: > I just re-discovered this issue, via > https://www.postgresql.org/message-id/20221209003607.bz2zdznvfnkq4zz3%40awork3.anarazel.de > > > On 2022-09-25 16:22:37 -0700, Andres Freund wrote: > > Maybe I am missing something, but I don't think it's OK for > > connect_pg_server() to connect in a blocking manner, without accepting > > interrupts? > > It's definitely not. This basically means network issues or such can lead to > connections being unkillable... > > We know how to do better, c.f. libpqrcv_connect(). I hacked that up for > postgres_fdw, and got through quite a few runs without related issues ([1]). > > The same problem is present in two places in dblink.c. Obviously we can copy > and paste the code to dblink.c as well. But that means we have the same not > quite trivial code in three different c files. There's already a fair bit of > duplicated code around AcquireExternalFD(). > > It seems we should find a place to put backend specific libpq wrapper code > somewhere. Unless we want to relax the rule about not linking libpq into the > backend we can't just put it in the backend directly, though. > > The only alternative way to provide a wrapper that I can think of are to > a) introduce a new static library that can be linked to by libpqwalreceiver, > postgres_fdw, dblink > b) add a header with static inline functions implementing interrupt-processing > connection establishment for libpq > > Neither really has precedent. > > The attached quick patch just adds and uses libpq_connect_interruptible() in > postgres_fdw. If we wanted to move this somewhere generic, at least part of > the external FD handling should also be moved into it. > > > dblink.c uses a lot of other blocking libpq functions, which obviously also > isn't ok. > > > Perhaps we could somehow make this easier from within libpq? My first thought > was a connection parameter that'd provide a different implementation of > pqSocketCheck() or such - but I don't think that'd work, because we might > throw errors when processing interrupts, and that'd not be ok from deep within > libpq.
Any opinions? Due to the simplicity I'm currently leaning to a header-only helper, but I don't feel confident about it. The rate of cfbot failures is high enough that it'd be good to do something about this. Greetings, Andres Freund