Charles, thanks for the response!
That scenario sounds feasible enough, but it looks like that is not it.
The output from tcpserver is all very much like this (and annoyingly has no
human-readable date/time):
@400000003b433ad92bbc7304 tcpserver: status: 1/20
@400000003b433ad92bc0b8c4 tcpserver: pid 12691 from 131.193.178.181
@400000003b433ad92eddf29c tcpserver: ok 12691 :10.0.0.5:25 muncher.math.uic.edu:131.193.178.181::26023
@400000003b433ada0ffedb34 tcpserver: end 12691 status 0
@400000003b433ada0fff5834 tcpserver: status: 0/20
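
For what it is worth, the leading @... field on each line is a TAI64N timestamp, which daemontools' tai64nlocal can translate. A minimal Python sketch of the same decoding follows. The function names are made up for illustration, and it assumes the label was produced the libtai way, that is 2^62 plus 10 plus the Unix time, so it does not apply a leap-second table.

```python
from datetime import datetime, timezone

TAI64_BASE = 2 ** 62   # TAI64 labels carry a 2^62 offset
TAI_UTC_OFFSET = 10    # assumption: fixed 10 s offset, as libtai-style clocks add

def decode_tai64n(label):
    """Return (unix_seconds, nanoseconds) for a label such as '@4000...'."""
    hexdigits = label.lstrip("@")
    if len(hexdigits) != 24:
        raise ValueError("expected 24 hex digits after '@'")
    # First 16 hex digits: seconds. Last 8 hex digits: nanoseconds.
    seconds = int(hexdigits[:16], 16) - TAI64_BASE - TAI_UTC_OFFSET
    nanos = int(hexdigits[16:], 16)
    return seconds, nanos

def to_utc(label):
    """Convert a TAI64N label to a timezone-aware UTC datetime."""
    seconds, nanos = decode_tai64n(label)
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.replace(microsecond=nanos // 1000)

if __name__ == "__main__":
    # First timestamp from the log excerpt above.
    print(to_utc("@400000003b433ad92bbc7304").isoformat())
```

Piping the log through tai64nlocal is the simpler route when daemontools is installed; the sketch only shows what the field contains.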
Am I correct to think that my concurrency is the 1/20, 0/20, etc? If so it
does not seem to be piling up dead connections.
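
To check this over a whole log rather than by eye, the status lines can be pulled out and the peak reported. This is only a rough sketch: it assumes every status line looks like the two in the excerpt above ("tcpserver: status: N/M"), and the function names are invented for the example.

```python
import re

# Matches lines such as "... tcpserver: status: 1/20"
STATUS_RE = re.compile(r"tcpserver: status: (\d+)/(\d+)")

def concurrency_samples(lines):
    """Yield (active, limit) for each tcpserver status line."""
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            yield int(match.group(1)), int(match.group(2))

def peak(lines):
    """Return the (active, limit) sample with the highest active count, or None."""
    samples = list(concurrency_samples(lines))
    if not samples:
        return None
    return max(samples, key=lambda sample: sample[0])

SAMPLE = [
    "@400000003b433ad92bbc7304 tcpserver: status: 1/20",
    "@400000003b433ad92bc0b8c4 tcpserver: pid 12691 from 131.193.178.181",
    "@400000003b433ada0ffedb34 tcpserver: end 12691 status 0",
    "@400000003b433ada0fff5834 tcpserver: status: 0/20",
]

if __name__ == "__main__":
    print(peak(SAMPLE))
```

If the active count climbs toward the limit and stays there, connections are piling up; if it keeps returning to 0, as here, they are not.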
And, yes, rebooting the machine also does not fix the problem.
Like you say, tcprules does leave a clean .cdb file, but it seems as though
some other process comes along and steps on it later. I guess filesystem or
memory corruption could be the culprit. I guess that is off-topic for the
qmail list, but any ideas how I might test out that theory?
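
One way to test the theory might be to record a checksum of the .cdb right after rebuilding it and compare it again later, for example from cron. The sketch below is only an illustration; the function names are hypothetical, and the path you pass in is whatever your setup uses.

```python
import hashlib

def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def has_changed(path, baseline):
    """True if the file's current digest differs from the recorded baseline."""
    return file_digest(path) != baseline
```

If the digest changes without tcprules having been run, something else is modifying the file on disk. If the digest never changes but the problem still shows up, the file itself is probably not the culprit.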
Thanks again, j
-----Original Message-----
From: Charles Cazabon [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 05, 2001 3:15 PM
To: [EMAIL PROTECTED]
Subject: Re: Strange tcp.smtp issue
Josiah Hobson <[EMAIL PROTECTED]> wrote:
> Now the smtp socket seems to die or hang or something after 12-24 hours.
> netstat shows it as still listening, but a telnet to port 25 will connect
> but not initiate a session or respond to any commands. Applications
> attempting to send mail will of course timeout waiting for a response.
If tcpserver hits the concurrency limit you've specified (default 40),
additional connections will go into this limbo state until one of the
existing connections exits. Perhaps something is tying up connections?
Make sure you use the -v option of tcpserver, and capture/log tcpserver's
output. That will log status lines, showing you what the concurrency is at
every connection. If it goes steadily up, this is your problem.
> Restarting qmail with `qmail restart` does NOT solve the problem, nor does
> restarting the OS.
Restarting the OS? You mean rebooting? That clears the TCP connection
backlog in every OS I know :).
> Simply rebuilding the cdb with `qmail cdb` DOES HOWEVER fix the problem
> immediately. So it does seem that the tcp.smtp.cdb file is being
> corrupted.
It appears so. Are you logging tcpserver's output and using the -v flag?
> I don't see anything anywhere in the logs to indicate that something bad
> is happening. I am no qmail expert so I'm about at the end of my
> troubleshooting capability. Anybody who can shed some light on the inner
> workings of this cdb so I can figure out why it is being corrupted?
"tcprules" can't leave a corrupt .cdb file -- it writes to a temp file, and
only moves it into place of the original if successful.
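
The write-to-a-temp-file-then-rename pattern described above can be sketched as follows. This is not tcprules' actual source, just a Python illustration of the general technique, and atomic_write is a made-up name.

```python
import os
import tempfile

def atomic_write(path, data):
    """Write data to path so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on one
    # filesystem, which is what makes the final step atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # Only reached if the write succeeded; replaces the original in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # On failure, remove the temp file and leave the original untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a failed or interrupted write leaves the previous file in place, so a damaged file has to come from something that touches it afterwards.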
> I mean, I could always cron a `qmail cdb` script and probably avoid the
> issue, but I hate not getting down to the root cause.
I would guess filesystem or memory corruption.
Charles
--
-----------------------------------------------------------------------
Charles Cazabon <[EMAIL PROTECTED]>
GPL'ed software available at: http://www.qcc.sk.ca/~charlesc/software/
-----------------------------------------------------------------------