On 2013-11-14 17:04, Mark Andrews wrote:
In message <fd9b2cb2b33e394fae3b7466954760571d666...@dfwx10hmptc01.amer.dell.co
M>, vinny_abe...@dell.com writes:
Hi Everyone,

I recently had a recursive server running BIND 9.9.4 on FreeBSD 9.2
appear to wedge and stop responding to clients. I had a flurry of these
errors on the console:

sonewconn: pcb 0xfffffe007211d930: Listen queue overflow: 16 already in
queue awaiting acceptance

I couldn't trace that directly back to the named process by the time I
looked at it, but I suspect that's what it was since it's really the only
thing this machine is used for and it stopped working. It seems to have
oddly become unstuck when I logged into the machine and started looking
around. I never restarted named. Everything else on the server was
running normally from what I could tell and no other errors existed that
I could find. Unfortunately my logs rolled over too fast to check if
named had logged anything else interesting.

From what I've found in googling, this is an OS level error stating the
process isn't accepting new TCP connections and it's an application
fault. I've only ever seen this on this particular machine, and just this
once. My other recursive servers are running older versions of FreeBSD.

Or it's just a plain DoS attack.  For any service it is possible to
send TCP connection requests faster than the service can handle them.

Has anyone come across this before and know how to prevent or correct
this properly?

You can tune tcp-listen-queue in named.conf.  The current default is 10.
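
For illustration, a named.conf fragment along these lines would raise the queue depth. This is only a sketch: the value 128 is an assumed example, not a recommendation made in this thread, and on FreeBSD the kernel also caps the effective backlog at the kern.ipc.somaxconn sysctl, so a larger value here has no effect unless that limit is raised as well.

```
options {
    // Depth of the TCP listen queue; 128 is an illustrative value only.
    tcp-listen-queue 128;
};
```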

Thanks!

-Vinny


My logs have been filling up with

sonewconn: pcb 0xfffffe02bb7187a8: Listen queue overflow: 10 already in queue awaiting acceptance

These messages seem to have started after upgrading to FreeBSD 9.2. There have been other changes too, but on the email front, so looking at BIND hadn't crossed my mind until I spotted this thread. It's only happening on one server, so I had been hunting around trying to figure out where it's coming from.

The hex number doesn't correspond to any socket that shows up with lsof, though the socket addresses that lsof shows bear some resemblance to it.

Running "lsof -i -T fqs" and looking at QLIM=, I had thought sendmail was the culprit, since its default listen queue is 10. But bumping it to 128 didn't stop the messages, and I couldn't find any other sockets with QLIM=10 this way.

As for the sockets associated with named: the TCP domain sockets have QLIM=3 and the rndc socket has QLIM=128. These systems are all running the system BIND (9.8.4-P2).

named 1276 bind 20u IPv4 0xfffffe00a73697a0 0t0 TCP zen:domain (LISTEN QR=0 QS=0 SO=ACCEPTCONN,NOSIGPIPE,PQLEN=0,QLEN=0,QLIM=3,RCVBUF=524288,REUSEADDR,SNDBUF=524288 SS=NBIO TF=MSS=536,REQ_SCALE,REQ_TSTMP,SACK_PERMIT)
named 1276 bind 21u IPv4 0xfffffe00a73693d0 0t0 TCP zen2:domain (LISTEN QR=0 QS=0 SO=ACCEPTCONN,NOSIGPIPE,PQLEN=0,QLEN=0,QLIM=3,RCVBUF=524288,REUSEADDR,SNDBUF=524288 SS=NBIO TF=MSS=536,REQ_SCALE,REQ_TSTMP,SACK_PERMIT)
named 1276 bind 22u IPv4 0xfffffe00a738b3d0 0t0 TCP localhost:domain (LISTEN QR=0 QS=0 SO=ACCEPTCONN,NOSIGPIPE,PQLEN=0,QLEN=0,QLIM=3,RCVBUF=524288,REUSEADDR,SNDBUF=524288 SS=NBIO TF=MSS=536,REQ_SCALE,REQ_TSTMP,SACK_PERMIT)
named 1276 bind 23u IPv4 0xfffffe00a75223d0 0t0 TCP localhost:rndc (LISTEN QR=0 QS=0 SO=ACCEPTCONN,NOSIGPIPE,PQLEN=0,QLEN=0,QLIM=128,RCVBUF=524288,REUSEADDR,SNDBUF=524288 SS=NBIO TF=MSS=536,REQ_SCALE,REQ_TSTMP,SACK_PERMIT)
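
Hunting for a particular QLIM by eye gets tedious, so here is a minimal Python sketch that pulls the listen queue limit out of "lsof -i -T fqs" output. It assumes the column layout shown in the output above (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME); other lsof versions may print things differently. The names find_listen_queues and SAMPLE are hypothetical and exist only for this example.

```python
import re

# Matches one listening TCP socket in `lsof -i -T fqs` output and captures
# the command, PID, socket name, and QLIM (the listen queue limit).
# Assumes the column layout seen in this thread's output.
LISTEN_RE = re.compile(
    r"(\S+)\s+(\d+)\s+\S+\s+\S+\s+IPv[46]\s+\S+\s+\S+\s+TCP\s+(\S+)\s+"
    r"\(LISTEN[^)]*?QLIM=(\d+)"
)

# Two of the entries quoted in this thread, used as sample input.
SAMPLE = """\
named 1276 bind 20u IPv4 0xfffffe00a73697a0 0t0 TCP zen:domain (LISTEN QR=0 QS=0 SO=ACCEPTCONN,NOSIGPIPE,PQLEN=0,QLEN=0,QLIM=3,RCVBUF=524288,REUSEADDR,SNDBUF=524288 SS=NBIO TF=MSS=536,REQ_SCALE,REQ_TSTMP,SACK_PERMIT)
named 1276 bind 23u IPv4 0xfffffe00a75223d0 0t0 TCP localhost:rndc (LISTEN QR=0 QS=0 SO=ACCEPTCONN,NOSIGPIPE,PQLEN=0,QLEN=0,QLIM=128,RCVBUF=524288,REUSEADDR,SNDBUF=524288 SS=NBIO TF=MSS=536,REQ_SCALE,REQ_TSTMP,SACK_PERMIT)
"""


def find_listen_queues(text, qlim=None):
    """Return (command, pid, name, qlim) for each listening TCP socket.

    If qlim is given, only sockets with exactly that queue limit are kept.
    """
    rows = []
    for match in LISTEN_RE.finditer(text):
        command, pid, name, limit = match.groups()
        row = (command, int(pid), name, int(limit))
        if qlim is None or row[3] == qlim:
            rows.append(row)
    return rows
```

Feeding it the captured lsof output and passing qlim=10 or qlim=16 would list only the sockets whose limit matches the number in the sonewconn message.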

FWIW, the only socket with QLIM=16 on my system is upsd (nut).


--
Who: Lawrence K. Chen, P.Eng. - W0LKC - Sr. Unix Systems Administrator
For: Enterprise Server Technologies (EST) -- & SafeZone Ally
