On Thu, 3 Aug 2006, Helge Oldach wrote:

Well... I've spotted a regression not with the ports tree but with 6-STABLE. On several boxes with this change applied I see lots of sendmails stacking up over time, for example:

 713  ??  Ss     0:01.05 sendmail: accepting connections (sendmail)
 717  ??  Is     0:00.02 sendmail: Queue [EMAIL PROTECTED]:30:00 for /var/spool/client
31747  ??  I      0:00.00 sendmail: startup with 71.119.31.81 (sendmail)
32834  ??  I      0:00.00 sendmail: startup with 83.36.190.38 (sendmail)
33569  ??  I      0:00.00 sendmail: startup with 221.206.76.60 (sendmail)
34023  ??  I      0:00.00 sendmail: startup with 49.195.192.61.tokyo.flets.alph
34459  ??  I      0:00.00 sendmail: startup with 221.165.35.46 (sendmail)
36517  ??  I      0:00.00 sendmail: startup with 61.192.180.137 (sendmail)
38722  ??  I      0:00.00 sendmail: startup with 203.177.238.78 (sendmail)
39126  ??  I      0:00.00 sendmail: startup with 222.90.251.185 (sendmail)
39203  ??  I      0:00.00 sendmail: startup with 221.9.214.183 (sendmail)
39859  ??  I      0:00.00 sendmail: startup with 59.20.101.111 (sendmail)
41090  ??  I      0:00.00 sendmail: startup with 61.192.166.235 (sendmail)
41766  ??  I      0:00.00 sendmail: startup with 68.118.52.132 (sendmail)
42482  ??  I      0:00.00 sendmail: startup with 219.249.201.36 (sendmail)
42483  ??  I      0:00.00 sendmail: startup with 219.249.201.36 (sendmail)
43467  ??  I      0:00.00 sendmail: startup with 210.213.191.70 (sendmail)
43757  ??  I      0:00.00 sendmail: startup with 220.189.144.7 (sendmail)
44176  ??  I      0:00.00 sendmail: startup with 71.205.226.98 (sendmail)
44850  ??  I      0:00.00 sendmail: startup with 72.89.135.133 (sendmail)
44943  ??  I      0:00.00 sendmail: startup with 220.167.134.212 (sendmail)
48031  ??  I      0:00.00 sendmail: startup with 60.22.198.23 (sendmail)

On one busy sendmail box I've seen literally thousands of such processes. Note that these processes don't disappear, so it is not related to sendmail.cf's timeouts.

Browsing through the recent STABLE commits, I first thought it was related to the recent socket code changes, but it's not. It is definitely the introduction of BIND9's resolver: if I back out this change, all is fine again.

As noted, this is a very recent 6-STABLE; I'm tracking CTM, not cvs.

I would seriously suggest testing this more thoroughly. I'm not asking for it to be backed out right now, but this is definitely breakage in 6-STABLE that should be fixed before 6.2.

I've had a similar report from Bjoern Zeeb; at first we thought the reason he had TCP connections stacking up was a bug I introduced in 7.x, but it turns out it's because his sshd is wedging in name resolution and not closing the TCP sockets (which are now visible in netstat in a way they weren't before). We only concluded that it was not a kernel socket bug a day or so ago, so I'm not sure he's had a chance to generate a resolver bug report.

He reported that the application appeared to have two connected UDP sockets for name resolution, and one bad name server entry, but that the resolver appeared to be blocked in a read on the UDP socket that didn't have data queued, rather than the one that did. This was all from looking at netstat, and as far as I know, he's not dug into the resolver yet to see what might be happening.

I've CC'd Bjoern in case he has further insight or can offer some more suggestions on what might be going on.

Robert N M Watson
Computer Laboratory
University of Cambridge