Hi, On 24/12/13 12:58, Arno Töll wrote: > On 01.12.2013 18:03, Michael Gilbert wrote: >> I did some testing with a real kfreebsd machine, and this test always >> passes (I uploaded those binaries), so it seems the problem is >> specific to the buildds. > > could anyone with sufficient buildd-fu please provide some insight here? > In my local sbuild installation lighttpd builds fine on kfreebsd, as it > seems to do for Michael's.
I could reproduce it 8 times out of 100 here on kfreebsd-amd64. The race happens after getting Connection Refused trying to contact the fcgi-responder which has just exit (intentionally). The parent does a wait4() syscall, and if this returns the pid of the process that just exit, a new listen is set up on 10000/tcp and lighttpd forks a new fcgi-responder. The parent waits with select() for it to be ready, then sends the FCGI request which is successful: > 73263 102368 lighttpd 1390849444.054371 RET connect -1 errno 36 Operation > now in progress > 73263 102368 lighttpd 1390849444.054389 CALL > accept(0x3,0x7fffffffd010,0x7fffffffd00c) > 73263 102368 lighttpd 1390849444.054405 RET accept -1 errno 35 Resource > temporarily unavailable > 73263 102368 lighttpd 1390849444.054421 CALL gettimeofday(0x7fffffffd0f0,0) > 73263 102368 lighttpd 1390849444.054437 RET gettimeofday 0 > 73263 102368 lighttpd 1390849444.054454 CALL poll(0x6578c0,0x2,0x3e8) > 73263 102368 lighttpd 1390849444.054471 RET poll 1 > 73263 102368 lighttpd 1390849444.054485 CALL > getsockopt(0x7,SOL_SOCKET,<invalid=4103>,0x7fffffffceac,0x7fffffffceb0) > 73263 102368 lighttpd 1390849444.054500 RET getsockopt 0 > 73263 102368 lighttpd 1390849444.054518 CALL > stat(0x80181d3b6,0x7fffffffcac0) > 73263 102368 lighttpd 1390849444.054532 NAMI "/etc/localtime" > 73263 102368 lighttpd 1390849444.054560 STRU invalid record > 73263 102368 lighttpd 1390849444.054574 RET stat 0 > 73263 102368 lighttpd 1390849444.054596 CALL write(0x4,0x6b17b0,0xa8) > 73263 102368 lighttpd 1390849444.054648 GIO fd 4 wrote 168 bytes > "2014-01-27 19:04:04: (mod_fastcgi.c.2884) backend is overloaded; > we'll disable it for 1 seconds and send the request to another backend > instead: reconnects: 0 load: 1 > " > 73263 102368 lighttpd 1390849444.054664 RET write 168/0xa8 > 73263 102368 lighttpd 1390849444.054679 CALL > wait4(0x11e30,0x7fffffffcdcc,0x1<WNOHANG>,0) > 73263 102368 lighttpd 1390849444.054702 RET wait4 73264/0x11e30 > 73263 102368 lighttpd 1390849444.054718 CALL > stat(0x80181debb,0x7fffffffca60) > 73263 102368 lighttpd 1390849444.054732 NAMI "/etc/resolv.conf" > 73263 102368 lighttpd 1390849444.054752 STRU invalid record > 73263 102368 lighttpd 1390849444.054766 RET stat 0 > 73263 102368 lighttpd 1390849444.054782 CALL > socket(PF_INET,<invalid=1>,IPPROTO_IP) > 73263 102368 lighttpd 1390849444.054800 RET socket 8 > 73263 102368 lighttpd 1390849444.054814 CALL > connect(0x8,0x7fffffffccc0,0x10) > 73263 102368 lighttpd 1390849444.054828 STRU struct sockaddr { AF_INET, > 127.0.0.1:10000 } > 73263 102368 lighttpd 1390849444.054945 RET connect -1 errno 61 Connection > refused > 73263 102368 lighttpd 1390849444.054964 CALL close(0x8) > 73263 102368 lighttpd 1390849444.054983 RET close 0 > 73263 102368 lighttpd 1390849444.054997 CALL > socket(PF_INET,<invalid=1>,IPPROTO_IP) > 73263 102368 lighttpd 1390849444.055014 RET socket 8 > 73263 102368 lighttpd 1390849444.055029 CALL > setsockopt(0x8,SOL_SOCKET,<invalid=4>,0x7fffffffcc8c,0x4) > 73263 102368 lighttpd 1390849444.055044 RET setsockopt 0 > 73263 102368 lighttpd 1390849444.055058 CALL bind(0x8,0x7fffffffccc0,0x10) > 73263 102368 lighttpd 1390849444.055072 STRU struct sockaddr { AF_INET, > 127.0.0.1:10000 } > 73263 102368 lighttpd 1390849444.055096 RET bind 0 > 73263 102368 lighttpd 1390849444.055134 CALL listen(0x8,0x400) > 73263 102368 lighttpd 1390849444.055149 RET listen 0 But sometimes the wait4() above returns zero. This seems to be an unrecoverable error, and an HTTP 503 response generated: > 73244 107398 lighttpd 1390849422.864927 CALL > connect(0x7,0x7fffffffcae0,0x10) > 73244 107398 lighttpd 1390849422.864943 STRU struct sockaddr { AF_INET, > 127.0.0.1:10000 } > 73244 107398 lighttpd 1390849422.864985 RET connect -1 errno 36 Operation > now in progress > 73244 107398 lighttpd 1390849422.865005 CALL gettimeofday(0x7fffffffd0f0,0) > 73244 107398 lighttpd 1390849422.865022 RET gettimeofday 0 > 73244 107398 lighttpd 1390849422.865036 CALL poll(0x6578c0,0x3,0x3e8) > 73244 107398 lighttpd 1390849422.865052 RET poll 1 > 73244 107398 lighttpd 1390849422.865068 CALL > getsockopt(0x7,SOL_SOCKET,<invalid=4103>,0x7fffffffceac,0x7fffffffceb0) > 73244 107398 lighttpd 1390849422.865084 RET getsockopt 0 > 73244 107398 lighttpd 1390849422.865128 CALL write(0x4,0x6b17b0,0xa8) > 73244 107398 lighttpd 1390849422.865186 GIO fd 4 wrote 168 bytes > "2014-01-27 19:03:42: (mod_fastcgi.c.2884) backend is overloaded; > we'll disable it for 1 seconds and send the request to another backend > instead: reconnects: 0 load: 1 > " > 73244 107398 lighttpd 1390849422.865202 RET write 168/0xa8 > 73244 107398 lighttpd 1390849422.865218 CALL > wait4(0x11e1d,0x7fffffffcdcc,0x1<WNOHANG>,0) > 73244 107398 lighttpd 1390849422.865235 RET wait4 0 > 73244 107398 lighttpd 1390849422.865251 CALL close(0x7) > 73244 107398 lighttpd 1390849422.865274 RET close 0 > 73244 107398 lighttpd 1390849422.865291 CALL > wait4(0x11e1d,0x7fffffffc9bc,0x1<WNOHANG>,0) > 73244 107398 lighttpd 1390849422.865305 RET wait4 0 > 73244 107398 lighttpd 1390849422.865321 CALL > wait4(0x11e1d,0x7fffffffc9bc,0x1<WNOHANG>,0) > 73244 107398 lighttpd 1390849422.865335 RET wait4 0 > 73244 107398 lighttpd 1390849422.865351 CALL > wait4(0x11e1d,0x7fffffffc9bc,0x1<WNOHANG>,0) > 73244 107398 lighttpd 1390849422.865365 RET wait4 0 > 73244 107398 lighttpd 1390849422.865380 CALL > wait4(0x11e1d,0x7fffffffc9bc,0x1<WNOHANG>,0) > 73244 107398 lighttpd 1390849422.865394 RET wait4 0 > 73244 107398 lighttpd 1390849422.865409 CALL > wait4(0x11e1d,0x7fffffffc9bc,0x1<WNOHANG>,0) > 73244 107398 lighttpd 1390849422.865423 RET wait4 0 > 73244 107398 lighttpd 1390849422.865451 CALL writev(0x6,0x6b0ba0,0x2) > 73244 107398 lighttpd 1390849422.865491 GIO fd 6 wrote 532 bytes > "HTTP/1.0 503 Service Not Available\r Regards, -- Steven Chamberlain ste...@pyro.eu.org
signature.asc
Description: OpenPGP digital signature