On 05/10/2024 01:03, Thomas Munro wrote:
On Sat, Oct 5, 2024 at 7:41 AM Heikki Linnakangas <hlinn...@iki.fi> wrote:
My test for dead-end backends opens 20 TCP (or unix domain) connections
to the server, in quick succession. That works fine my system, and it
passed cirrus CI on other platforms, but on FreeBSD it failed
repeatedly. The behavior in that scenario is apparently
platform-dependent: it depends on the accept queue size, but what
happens when you reach the queue size also seems to depend on the
platform. On my Linux system, the connect() calls in the client are
blocked, if the server is doesn't call accept() fast enough, but
apparently you get an error on *BSD systems.

Right, we've analysed that difference in AF_UNIX implementation
before[1], which shows up in the real world, where client sockets ie
libpq's are usually non-blocking, as EAGAIN on Linux (which is not
valid per POSIX) vs ECONNREFUSED on other OSes.  All fail to connect,
but the error message is different.

Thanks for the pointer!

For blocking AF_UNIX client sockets like in your test, Linux
effectively has an infinite queue made from two layers.  The listen
queue (a queue of connecting sockets) does respect the requested
backlog size, but when it's full it has an extra trick: the connect()
call waits (in a queue of threads) for space to become free in the
listen queue, so it's effectively unlimited (but only for blocking
sockets), while FreeBSD and I suspect any other implementation
deriving from or reimplementing the BSD socket code gives you
ECONNREFUSED.  macOS behaves just the same as FreeBSD AFAICT, so I
don't know why you didn't see the same thing... I guess it's just
racing against accept() draining the queue.

In fact I misremembered: the failure happened on macOS, *not* FreeBSD. It could be just luck I didn't see it on FreeBSD though.

It's possible that Windows copied the Linux behaviour for AF_UNIX,
given that it probably has something to do with the WSL project for
emulating Linux, but IDK.

Sadly Windows' IO::Socket::UNIX hasn't been implemented on Windows (or at least on this perl distribution we're using in Cirrus CI):

Socket::pack_sockaddr_un not implemented on this architecture at C:/strawberry/5.26.3.1/perl/lib/Socket.pm line 872.

so I'll have to disable this test on Windows anyway.

--
Heikki Linnakangas
Neon (https://neon.tech)



Reply via email to