Over the last several days, mail has been making it through
SpamAssassin unfiltered.  Looking through the log files, we appeared to
be maxing out the number of spamd child processes.  Further log review
suggested we might be having some issues with iXHash, so I removed that
network test.  I also increased the maximum number of child processes
allowed.
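
For reference, the child limit is spamd's -m/--max-children switch, and
disabling iXHash just meant commenting out its loadplugin line.
Roughly like this -- the number and the plugin path here are examples,
not necessarily what your install uses:

        spamd -d -m 25          # -m / --max-children raises the child cap

        # in the .pre/.cf file that loads the plugin:
        #loadplugin Mail::SpamAssassin::Plugin::iXhash iXhash.pm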

We are still getting some messages through unfiltered and I'm seeing an
error like this in the log file:

        child processing timeout at /usr/bin/spamd line 1085
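
That looks like spamd's per-child timeout firing.  If it comes down to
tuning it, the relevant switch appears to be --timeout-child (I'm
assuming a 3.1.x spamd here):

        spamd --timeout-child=300   # seconds a child gets per message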

After Googling for other references, I came across this thread dealing
with the Bayes database:

http://www.nabble.com/SA-gone-mad,-times-out-and-stucks-t2324602.html


I have run "sa-learn --sync" and "sa-learn --force-expire" a couple of
times this morning, and I don't seem to be expiring as much data as I
expected.
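
For anyone who wants to see the expiry decisions in detail, sa-learn's
debug switch shows them (assuming the 3.1-style -D facility syntax):

        sa-learn --sync
        sa-learn -D bayes --force-expire 2>&1 | grep -i expir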

Here is the output from "sa-learn --dump magic":

0.000          0          3          0  non-token data: bayes db version
0.000          0      73538          0  non-token data: nspam
0.000          0      29705          0  non-token data: nham
0.000          0     294798          0  non-token data: ntokens
0.000          0 1172380678          0  non-token data: oldest atime
0.000          0 1172424244          0  non-token data: newest atime
0.000          0 1172424219          0  non-token data: last journal sync atime
0.000          0 1172424063          0  non-token data: last expiry atime
0.000          0      43200          0  non-token data: last expire atime delta
0.000          0       3918          0  non-token data: last expire reduction count
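
If I'm reading the atimes as Unix timestamps, a quick check with GNU
date:

        date -d @1172380678   # oldest atime -> Sun Feb 25, ~05:18 UTC 2007
        date -d @1172424244   # newest atime -> Sun Feb 25, ~17:24 UTC 2007

so the entire token window spans only about 12 hours -- the same as the
43200-second last expire atime delta.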

I have not overridden bayes_expiry_max_db_size, so it should be at the
default value of 150000 tokens -- I currently have almost 295000 tokens
in the Bayes database.

From the EXPIRATION section of the sa-learn man page, I got these
calculations:

keep = 150000 * 0.75 = 112500
goal = current - keep = 295000 - 112500 = 182500
new atime delta = old atime delta * old reduction count / goal
                = 43200 * 3918 / 182500 =~ 927
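
The same arithmetic as a quick shell check:

        echo $(( 150000 * 75 / 100 ))      # keep = 112500
        echo $(( 295000 - 112500 ))        # goal = 182500
        echo $(( 43200 * 3918 / 182500 ))  # new atime delta = 927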

Now, since I've already run a manual --force-expire this morning, I
appear to fall into the 'weird' definition.  I haven't quite figured
out the 'estimation pass logic' yet.

Is there some way, other than completely nuking my current Bayes
database, that I can reduce its size significantly, back down to around
100K tokens?
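
If overriding that limit is the right lever, I assume the local.cf
lines would look something like this -- untested on my end, and the
value is picked because expiry aims at 75% of the max (75% of 133000 is
roughly the 100K I'm after):

        bayes_auto_expire 1
        bayes_expiry_max_db_size 133000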

Thanks,
David Goldsmith