Thanks for the help, guys. I am posting this information to the list for
general review:
Mark Delany wrote:
> Sounds particularly nasty. It may well be a bug associated with establishing
> tcp sessions.
>
> 700 outstanding sessions sounds awfully high.
>
> It may also be a router issue. Is it possible that the tcp sessions are
> starting and not completing because certain packet types are being rejected?
>
> Btw. How are you getting 700 sessions? Do you have the concurrency on
> tcpserver set *that* high?
>
> Regards.
John White wrote:
> Uhhh... You definitely want to check out that tcpserver manpage again.
> Especially the difference between -c and -b.
>
> Also, the listen manpage.
>
> It would greatly help if you would give me your current tcpserver
> invocation for qmail-smtpd
>
> --
> John White
> [EMAIL PROTECTED]
>
Yes, 700 connections seems high, but after a period of downtime it seems to me
to be within the realm of possibility. Our Alteon ACEdirector (load-balancing
switch) indicated approximately 600 connections, and I had originally thought
that this number was artificially inflated by some time-out value. The Alteon
showed as many as 1200 connections on a single server (when the other 2 were
down). Here is the invocation of tcpserver:
tcpserver -l$hostvalue -q -b100 -H -R -D 0 pop-3 \
    /var/qmail/bin/qmail-popup $hostvalue /var/qmail/bin/CheckPasswd \
    /var/qmail/bin/qmail-pop3d Maildir &
tcpserver -l$hostvalue -q -b50 -H -R -D 0 2001 \
    /var/qmail/bin/qmail-popup $hostvalue /var/qmail/bin/CheckPasswdVirtual cyb \
    /var/qmail/bin/qmail-pop3d Maildir &
tcpserver -l$hostvalue -t8 -q -b5000 -D -u502 -g2108 \
    -x/var/qmail/control/tcprules.dat 0 smtp /var/qmail/bin/qmail-smtpd &
tcpserver -l$hostvalue -t8 -q -b50 -D -u502 -g2108 \
    -x/var/qmail/control/tcprules.dat 0 2002 /var/qmail/bin/qmail-smtpd &
In case you are wondering, CheckPasswdVirtual (on port 2001) and SMTP (on port
2002) are actually accepting connections to a different switched address on the
Alteon. CheckPasswdVirtual attaches "cyb-" to the username so that those users
did not have to make any changes to accommodate a VirtualDomain-type login.
Oh, and $hostvalue = mailX.desupernet.net, where X is one of [124].
After reading "man listen" I am reminded of the help this list gave us when we
had this problem before. Our connections were being artificially limited by
Linux to 5 at a time! We solved that by adding -b20 (and later raising it to
30, then 50, then 100). Last night, I grew frustrated and set it to 5000 on
the port that was unresponsive (25). This had no effect. 'man listen' seems to
indicate that 128 may be the maximum value this can be set to. Is that my
actual limit?
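
If I am reading the two manpages correctly, -b and -c control different things,
and a rough model of that is sketched below. It assumes that the kernel quietly
clamps the listen() backlog to its own maximum (128 is only the figure 'man
listen' hints at, so treat it as an assumption for this kernel), that -b sizes
the queue of connections not yet accepted, and that -c caps how many
connections tcpserver handles at once (40 by default according to the
tcpserver documentation). The function names are made up for illustration, and
the numbers it prints are a model, not a measurement.

```python
# Sketch only: models the two limits; it does not query a real kernel or tcpserver.
KERNEL_BACKLOG_MAX = 128   # assumption: the cap hinted at by 'man listen'
TCPSERVER_DEFAULT_C = 40   # tcpserver's documented default for -c


def effective_backlog(requested, kernel_max=KERNEL_BACKLOG_MAX):
    """Backlog the kernel would actually honour for listen(fd, requested)."""
    return min(requested, kernel_max)


def served_and_waiting(incoming, concurrency=TCPSERVER_DEFAULT_C, backlog=5000):
    """Split incoming connections into (being served, queued, left over)."""
    served = min(incoming, concurrency)
    queued = min(incoming - served, effective_backlog(backlog))
    return served, queued, incoming - served - queued


print(effective_backlog(5000))    # 128
print(served_and_waiting(700))    # (40, 128, 532)
```

Under those assumptions, raising -b from 100 to 5000 changes very little, while
the number of connections actually being served stays at whatever -c allows.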
If one server reaches this limit, it is overloaded and, if we are lucky, the
Alteon drops it out of the rotation. The other servers then take on its load,
overload in turn, and reach the same state.
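
A rough way to watch a server drift toward this state is to tally its port-25
connections by TCP state, much like the netstat check in my original message
quoted below. This is a minimal sketch that assumes the usual Linux
"netstat -n" column layout (protocol, two queue columns, local address, foreign
address, state); the function name and the sample lines are made up, not real
output from our servers.

```python
from collections import Counter


def tally_states(netstat_output, port=25):
    """Count TCP states for connections whose local address ends in :port."""
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local-address foreign-address state
        if (len(fields) >= 6 and fields[0].startswith("tcp")
                and fields[3].endswith(suffix)):
            counts[fields[5]] += 1
    return counts


# Made-up sample lines for illustration only.
sample = """\
tcp        0      0 10.0.0.1:25      192.0.2.10:1042    CLOSE_WAIT
tcp        0      0 10.0.0.1:25      192.0.2.11:1043    CLOSE_WAIT
tcp        0      0 10.0.0.1:25      192.0.2.12:1044    ESTABLISHED
tcp        0      0 10.0.0.1:110     192.0.2.13:1045    ESTABLISHED
"""
print(tally_states(sample))
```

Feeding it the real output of "netstat -n" every minute or so should show
whether CLOSE_WAIT entries pile up before the port stops answering.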
SMTP connections will eventually get through; it just may take 5, 10, or 15+
minutes for the
220 mail1.desupernet.net ESMTP
line to come up.
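
To put a number on that delay instead of watching a telnet session, something
like the following could time how long the greeting takes to arrive. It is only
a sketch: the function name is made up, it reads just the first line the server
sends, and the timeout would have to be raised well past the default used here
to catch the multi-minute waits described above.

```python
import socket
import time


def time_to_banner(host, port=25, timeout=30.0):
    """Return (seconds waited, first line received) for an SMTP-style greeting."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        with conn.makefile("rb") as reader:
            # Blocks until the "220 ..." line arrives or the timeout expires.
            line = reader.readline()
    elapsed = time.monotonic() - start
    return elapsed, line.decode("ascii", "replace").rstrip("\r\n")
```

Calling time_to_banner("mail1.desupernet.net", 25, timeout=1200) from a cron job
on another machine would give a log of how bad the wait gets and when.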
Thanks again for the help (I have been sleeping under my desk awaiting helpful
replies :) )
If you need any other info let me know.
> At 01:16 18/02/99 -0500, Jere Cassidy wrote:
> >
> >What is wrong with the following setup:
> >
> >less than 30K customers:
> >
> >qmail 1.03 running on 3 high speed alpha's (each with 128MB ram)
> >Running 4 TCPSERVER daemon processes.
> >
> >
> >1 SMTP (port 25)
> >1 POP3 (port 110)
> >1 POP3 (port 2001)
> >1 SMTP (port 2002)
> >
> >These 3 servers running these 4 daemons share a Netapp filer for backend
> >storage.
> >
> >We have done major tuning to these servers time after time. Here is the
> >current situation:
> >3 of the daemons run fine. The SMTP daemon (on regular port 25) does not
> >respond.
> >
> >I have set the -b option for tcpserver (this helped us immensely before)
> >to 5000 (supposedly allowing tcpserver to respond to 5000 connections).
> >
> >Is there some default limit somewhere that would only allow tcpserver to
> >pass so many connections to qmail-smtpd? The downtime on the servers is
> >getting ridiculous because of this problem.
> >
> >If I run /var/qmail/bin/qmail-smtpd it comes right up.
> >If I telnet localhost 2002 (simply another instance of tcpserver) it
> >comes right up.
> >Both POP3 connections come right up
> >
> >If I do a "netstat -n | grep ':25 '" I get almost 700 connections, although
> >most of these are in the "CLOSE_WAIT" state or something similar.
> >
> >On one of the servers, when this happens and qmail is totally
> >unresponsive on port 25, the load drops to 0.00 and the server just sits
> >there.
> >
> >restarting qmail seems to help for about 5 minutes... then the imaginary
> >limit is hit and everything goes to hell.
> >
> >Anyone have any suggestions for the current situation?
> >
> >--
> >------------------------------------------------------------------------
> >
> >// Jere Cassidy - System Administration - D&E SuperNet
> > email: [EMAIL PROTECTED] phone: (717)738-7054
> > web: http://www.desupernet.net/jere
> > pager/pcs: [EMAIL PROTECTED] - (717)203-0042
> >~~~ "While sowing the seeds of Utopia,
> > you invoked a convenient amnesia" -BR ~~~
> >------------------------------------------------------------------------
> >
--
------------------------------------------------------------------------
// Jere Cassidy - System Administration - D&E SuperNet
email: [EMAIL PROTECTED] phone: (717)738-7054
web: http://www.desupernet.net/jere
pager/pcs: [EMAIL PROTECTED] - (717)203-0042
~~~ "While sowing the seeds of Utopia,
you invoked a convenient amnesia" -BR ~~~
------------------------------------------------------------------------