On 2008-04-12 14:57:52 +0000, R wrote:
> On Sat, 12 Apr 2008, Peter J. Holzer wrote:
> > On 2008-04-11 15:20:21 +0000, R wrote:
> > > On Fri, 11 Apr 2008, Charlie Brady wrote:
> > > > I suspect that the only reason you hit the limit was because some plugin
> > > > was not returning. I suspect that qpsmtpd wasn't actually hanging, it
> > > > just wasn't accepting more connections.
> > >
> > > Tracing a non-returning plugin could be very difficult :-(
> >
> > Shouldn't be that bad. If you run at log level LOGINFO, all calls to
> > plugins are logged (in vanilla 0.32, if you are using my RPMs, you need
> > LOGDEBUG). The "Too many connections: ..." message is also at priority
> > LOGINFO, so if you are running at a higher log level, that explains why
> > you don't see that message.
>
> I set up from scratch. I do get that log message for too many connections
> from the same IP.

That's also LOGINFO.

> I'm set for LOGNOTICE, though all of my "comment" logging is LOGCRIT -
> I think all plugins used are highly customized for this particular
> site.

Probably. You wouldn't see the message above with the unmodified plugin
at that setting. (Yep, LOGINFO is way too low for such an important
message, but log levels in qpsmtpd are generally strange.)

Anyway, if you modified your code you'll have to check for yourself
whether the messages you expect to see have the correct level - we can't
help you there.

> Agreed. It seems unusual to me that it would happen with ALL 31
> connections, which would all stay hung for several hours before I
> discovered it and restarted forkserver.
>
> It sounds more, to me, as if the 32nd connection triggered something that
> hung all of the connections, permanently.

They are different processes, so they cannot cause each other to hang.
I can only think of two scenarios which would cause all of them to hang:

1) The parent process hangs. In that case the child processes do not
   hang, but run to completion. But they won't be reaped by the parent,
   so their corpses hang around forever as zombies (state Z).

2) One of the processes sends (or causes the OS to send) a SIGSTOP to
   the process group. That would really cause all processes to hang,
   but you would see that they are all in state T.

Anyway, you can test that pretty easily. Just open 32 parallel
connections to the server yourself. Something like

    for i in `seq 1 32`
    do
        xterm -e telnet yourserver 25 &
    done

should do nicely. Does the server really hang at that point? Or are you
able to speak SMTP in the open sessions and quit them? Does the server
accept a new connection for each one you quit?

> > As for the latter, I don't think that is happening. It is more probable
> > that the number of hanging processes is slowly increasing.
> > Unless you are
> > monitoring for long-running qpsmtpd processes you won't notice this at
> > first (if you have 10 hanging processes, you can still accept 20 mails
> > in parallel). Only when the number of hanging processes approaches your
> > limit, you will notice that less and less mail gets through until when
> > you hit the limit, no new connections are accepted at all (which is
> > probably the point where users start to complain).
>
> As I indicated, I monitor connections a lot - just grep for the forkserver
> processes. They usually range from 5-20 at any one time, and they
> constantly change. I think it's too much of a coincidence to suggest that
> 31 connections gradually, or even quickly, got filled and hung at the same
> point in a plugin.

If you don't see the same processes hanging around for some time before
the hang, that shoots down the "gradually filling up" theory. It's still
possible that it fills up quickly. All you need is one client opening a
lot of connections at once and then keeping them open (either by hitting
a bug which hangs the process or simply by keeping qpsmtpd busy for long
enough for you to notice - something like sending "RSET" every few
minutes would suffice).

> When I tried a manual connection, there was neither logging nor any
> kind of response from qpsmtpd to my telnet to port 25.

That's to be expected. qpsmtpd doesn't accept any new connections when
the limit is reached. All it does is wait for children to die and log a
message once per second (but you probably won't see that at your
settings).

        hp

-- 
   _  | Peter J. Holzer    | It took a genius to create [TeX],
|_|_) | Sysadmin WSR       | and it takes a genius to maintain it.
| |   | [EMAIL PROTECTED]  | That's not engineering, that's art.
__/   | http://www.hjp.at/ |    -- David Kastrup in comp.text.tex
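
A minimal sketch of the state check behind scenarios 1) and 2) above,
assuming Linux procps `ps` and that the processes can be found by
grepping for "forkserver". The grep pattern and the function names
`classify_state` and `report_states` are made up for illustration and
are not part of qpsmtpd. Z in the STAT column points at a parent that is
not reaping its children, T at a stop signal, and anything else at
processes that are merely busy or waiting.

```shell
#!/bin/sh
# Hypothetical helper: map a ps STAT value (e.g. "Z", "Zs", "T", "Ss")
# to the scenario it would point at. Only the first letter matters here.
classify_state() {
    case "$1" in
        Z*) echo "zombie (parent not reaping, scenario 1)" ;;
        T*) echo "stopped (stop signal delivered, scenario 2)" ;;
        *)  echo "other (running or sleeping)" ;;
    esac
}

# Hypothetical helper: read "PID STAT [ARGS...]" lines on stdin and
# print one "PID classification" line per process.
report_states() {
    while read -r pid stat rest; do
        [ -n "$pid" ] || continue   # skip empty lines
        printf '%s %s\n' "$pid" "$(classify_state "$stat")"
    done
}

# Possible usage (assumption: the command line contains "forkserver";
# the [f] keeps grep from matching its own process):
#   ps -eo pid=,stat=,args= | grep '[f]orkserver' | report_states
```

If every line comes back as "other", neither scenario applies and the
processes are more likely stuck or kept busy individually.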