Lance Albertson wrote:
I recently updated SA on our machines from 3.1.1 to 3.1.8 and started
noticing a new issue crop up. I also noticed that someone else had a
similar problem and reported it on this list back in January [1], but
never got an answer. I've looked elsewhere online and have yet to find
a solution.
Here is a log excerpt of what I see:
Mar 23 11:50:48 spamfilter5 spamd[28398]: Use of uninitialized value in
subroutine entry at
/usr/lib/perl5/5.8.5/i386-linux-thread-multi/Socket.pm line 370.
Mar 23 11:50:48 spamfilter5 spamd[28398]: Bad arg length for
Socket::unpack_sockaddr_in, length is 0, should be 16 at
/usr/lib/perl5/5.8.5/i386-linux-thread-multi/Socket.pm line 370.
Mar 23 11:50:48 spamfilter5 spamd[28398]: spamd: error: Bad arg length
for Socket::unpack_sockaddr_in, length is 0, should be 16 at
/usr/lib/perl5/5.8.5/i386-linux-thread-multi/Socket.pm
line 370.
Mar 23 11:50:48 spamfilter5 spamd[28398]: , continuing at
/usr/bin/spamd line 924.
Mar 23 11:50:48 spamfilter5 spamd[25791]: prefork: child states:
BBBBBBKKKKKKKKBBBBBBBBBBBBBBBBBB
Mar 23 11:50:48 spamfilter5 spamd[25791]: prefork: server reached
--max-children setting, consider raising it
During the time I get these errors, I seem to have emails go through the
system without getting tagged with any X-Spam* tags. Yet, I can find in
the log that the email was tagged and was done under the timeout setting
we have for spamc. These errors seem to be related to the load on
the machine at the time (i.e. higher loads tend to bring
these errors out more). They also seem to be transient in that after a
few minutes they seem to go away and things are back to normal (probably
when the load goes down).
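
One way to check the load correlation described above is to tally the errors per minute next to the prefork child states logged in the same minute. The following is only a minimal Python sketch under several assumptions: each syslog entry sits on a single line (the excerpt above is wrapped by the mail client), the timestamp is the usual 15-character syslog prefix, and the state letters mean B = busy, K = killed, I = idle, which the log itself does not spell out. The function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Matches the prefork status line and captures the run of state letters.
STATE_RE = re.compile(r"prefork: child states:\s*([A-Z]+)")

# Marker for the first form of the error line. The "spamd: error:" repeat of
# the same message is deliberately not matched, so each incident counts once.
ERROR_MARK = "]: Bad arg length for Socket::unpack_sockaddr_in"


def summarize(lines):
    """Return (errors per minute, busiest child-state snapshot per minute)."""
    errors_per_minute = Counter()
    busiest = {}
    for line in lines:
        # Assumed syslog prefix "Mar 23 11:50:48"; keep "Mar 23 11:50".
        minute = line[:12]
        if ERROR_MARK in line:
            errors_per_minute[minute] += 1
        match = STATE_RE.search(line)
        if match:
            states = Counter(match.group(1))
            # Keep the snapshot with the most busy ("B") children per minute.
            if states["B"] >= busiest.get(minute, Counter())["B"]:
                busiest[minute] = states
    return errors_per_minute, busiest
```

Feeding it the lines of the mail log (for example `summarize(open(path))`, where `path` is wherever syslog writes spamd output on the machine) gives two mappings keyed by minute. If the minutes with errors are also the minutes where nearly every child is busy or killed, that would support the idea that the problem only shows up near the --max-children limit.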
I'm no programmer, but from my point of view it seems as though the
algorithm spamd uses to manage its children and clean up connections is
getting confused when the children are close to their max setting.
Now, some background on our setup. We have a pool of seven servers that
are behind a BigIP running spamassassin (running mostly RHAS4, but we
also have two Solaris 10 amd64 machines). We have a pool of mail
delivery servers running sendmail and invoking procmail which then
invokes spamc to connect to the virtual IP. I do not see any timeout
errors in the logs from spamc during these periods of errors.
About a month ago, we were running into a resource limit on our Oracle
database server (where all the user prefs are stored). I found the
persistent DB plugin on the wiki site [2] and added it to all our
servers. It fixed the resource issue and no other issue came up at that
time. However, I did notice after adding the plug-in that a lot of spamd
children weren't dying and were staying active. So I suspect this
plug-in might be a source of the problem.
Now since I've upgraded to the latest version, I'm seeing this problem
of non-tagged email. Now, my actual questions:
* Does anyone have any idea what might be causing this problem?
* Do I need to upgrade perl (currently running 5.8.5 on RHAS4)?
* Is the persistent DB plug-in causing the issue?
I just updated one of the Solaris 10 machines and haven't noticed the
error yet. It does have a newer version of perl on it (5.8.8).
Anyways, any help would be appreciated! Thanks!
[1] http://article.gmane.org/gmane.mail.spam.spamassassin.general/94500
[2] http://wiki.apache.org/spamassassin/DBIPlugin
I would see if you could maybe get a fresher version of IO::Socket. The
latest on CPAN is 1.2301
(http://search.cpan.org/CPAN/authors/id/G/GB/GBARR/IO-1.2301.tar.gz)
I would *not* try to upgrade Perl. In doing so, you could cause your
machine to lapse into an error-log extravaganza.
-=Aubrey=-