Control: reopen -1
Control: severity -1 important

On Sun, Apr 17, 2016 at 12:41:15AM +0200, gregor herrmann wrote:
> Control: tag -1 + unreproducible
>
> On Sat, 16 Apr 2016 23:23:50 +0100, Chris Lamb wrote:
>
> > perlbal fails to build from source in unstable/amd64:
>
> >   Didn't get 200 OK: GET /reqdecr,status HTTP/1.0
> >   Unable to start socket: Address already in use
> >   # Looks like your test exited with 29 before it could output anything.
> >   t/32-selector.t ..........
> >   1..38
> >   Dubious, test returned 29 (wstat 7424, 0x1d00)
> >   Failed 38/38 subtests
>
> The tests pass for me, both during build and in autopkgtest.

It has just failed on ci.debian.net in the same way for the first
time in a year or so.

I can (eventually) reproduce it locally by running t/32-selector.t in
a loop. I caught it with strace, and it looks like

- the main process checks that the local port (in my case 60070) is
  free in Perlbal::Test::test_port() by making a new socket there and
  closing it immediately
- later, in Perlbal::Test::WebServer::start_webserver(), a server child
  process is forked off and then the main process tries to connect to
  it. It gets a connection to something which doesn't respond properly
  but rather echoes the request back
- in parallel but a bit later, the child process starts a server but
  gets EADDRINUSE from bind(2).

I'm not aware of having any other process running that's greedy for
local ports, and it seems really improbable that something would hit
by chance in the window between the first check and the actual use.

I wonder if the kernel is just slow to actually release the port after
the first check. But surely at least the connect(2) call should have
failed when the port was already closed. And why does the peer look
like an echo service?

I'm reopening but downgrading this. Perhaps we should just have
start_webserver() retry a couple of times somehow if it fails the
first time.
-- 
Niko Tyni   [email protected]
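
For illustration only, the retry idea could look roughly like the sketch below. It is written in Python rather than Perl and is not Perlbal code: bind_with_retry() and its parameters are hypothetical names. It assumes the server socket can be bound and kept open, instead of probing the port, closing the probe, and binding again later, and that only EADDRINUSE is worth retrying.

```python
import errno
import socket


def bind_with_retry(candidate_ports, host="127.0.0.1"):
    """Return a listening socket bound to the first usable candidate port.

    Hypothetical helper: the socket that passes the "is it free?" check is
    the one that is kept, so there is no window between check and use.
    """
    for port in candidate_ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(5)
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # anything other than "address in use" is a real error
            # otherwise fall through and try the next candidate port
    raise RuntimeError("no usable port among the candidates")
```

Passing 0 as the last candidate lets the kernel choose a free port; the caller can read the chosen port back with getsockname().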

