Hi,

The bayes database files on one of my servers are becoming large:
bayes_seen is 160 MB and bayes_toks is 8 MB. This mail server processes
around 30000 mails a day, as a relay.

I did not configure bayes_expiry_max_db_size, so it should be at its
default (150000). The only bayes-related configuration directives in
my local.cf are:

bayes_auto_learn                        1
bayes_auto_learn_threshold_nonspam      0.1
bayes_auto_learn_threshold_spam         12.0

Is it normal to have such large files? The fine manual says that with
such settings the file size should stay around 8 MB, but does that
figure refer to the "normal" size of the bayes_toks file, or to that of
the bayes_seen one?

Today, spamd stopped working with the following error:

Dec 15 04:25:15 server spamc[18803]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#1 of 3): Connection refused

I do not understand why it died. Manually restarting spamd solved the
problem, but I think it could happen again. Could it be related to a
lack of resources caused by the size of the bayes files?

Some more info:
su spam -s /bin/sh -c "sa-learn --dump magic -D"
(...)
debug: bayes: 6765 tie-ing to DB file R/O /home/spam/.spamassassin/bayes_toks
debug: bayes: 6765 tie-ing to DB file R/O /home/spam/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: Score set 2 chosen.
0.000          0          3          0  non-token data: bayes db version
0.000          0     405891          0  non-token data: nspam
0.000          0     948334          0  non-token data: nham
0.000          0     287829          0  non-token data: ntokens
0.000          0 1103037764          0  non-token data: oldest atime
0.000          0 1103107296          0  non-token data: newest atime
0.000          0 1103107219          0  non-token data: last journal sync atime
0.000          0 1103105595          0  non-token data: last expiry atime
0.000          0      43200          0  non-token data: last expire atime delta
0.000          0     161098          0  non-token data: last expire reduction count
debug: bayes: 6765 untie-ing
debug: bayes: 6765 untie-ing db_toks
debug: bayes: 6765 untie-ing db_seen
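
To make the dump above easier to read, here is a minimal Python sketch
that parses the "non-token data" lines and converts the atimes to dates.
It rests on a few assumptions: the DUMP string is pasted by hand from the
output above (with wrapped lines rejoined), the atime values are treated
as Unix epoch seconds, and parse_magic and as_utc are names made up for
this illustration, not part of SpamAssassin.

```python
from datetime import datetime, timezone

# Pasted by hand from the "sa-learn --dump magic" output in this message.
DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0     405891          0  non-token data: nspam
0.000          0     948334          0  non-token data: nham
0.000          0     287829          0  non-token data: ntokens
0.000          0 1103037764          0  non-token data: oldest atime
0.000          0 1103107296          0  non-token data: newest atime
0.000          0 1103107219          0  non-token data: last journal sync atime
0.000          0 1103105595          0  non-token data: last expiry atime
0.000          0      43200          0  non-token data: last expire atime delta
0.000          0     161098          0  non-token data: last expire reduction count
"""

# Default bayes_expiry_max_db_size as quoted in this message.
DEFAULT_MAX_DB_SIZE = 150000


def parse_magic(text):
    """Return {label: integer value} for every 'non-token data' line."""
    values = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue  # skip debug lines and anything else
        fields, label = line.split("non-token data:", 1)
        # The third whitespace-separated column holds the value.
        values[label.strip()] = int(fields.split()[2])
    return values


def as_utc(epoch):
    """Interpret an atime as Unix epoch seconds (assumption) in UTC."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


magic = parse_magic(DUMP)
oldest = as_utc(magic["oldest atime"])
newest = as_utc(magic["newest atime"])
span_seconds = magic["newest atime"] - magic["oldest atime"]

print("oldest token atime:", oldest.isoformat())
print("newest token atime:", newest.isoformat())
print("atime span (hours): %.1f" % (span_seconds / 3600))
print("ntokens / default max: %.2f" % (magic["ntokens"] / DEFAULT_MAX_DB_SIZE))
```

Read this way, the token atimes span well under a day, and ntokens is
above the 150000 default even though an expiry ran recently.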

I am using postfix 1.1.12, SA 3.0.1, MIME-Base64-3.05, DB_File-1.809,
and db4-4.0.14-20 (RedHat 9) on a postfix+SA relay. The bayes database
is shared by all users, and is located in the "spam" user's home
directory.

SA is invoked with "spamd -d -c -u spam" and "/usr/bin/spamc -t 180 -s
500000 -e /usr/sbin/sendmail -i -f ${sender} -- ${recipient}"



Many thanks to whoever has any clue on how I could shrink the bayes
files without losing them, if they need shrinking at all
("--force-expire" does not reduce their sizes). I would be particularly
interested in the right bayes_expiry_max_db_size setting to use for a
server handling around 30000 mails daily.
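
As a rough sanity check on the figures quoted in this message, the
Python sketch below relates the bayes_seen size to the number of learned
messages and to the daily mail volume. It assumes that "160 MB" means
160 * 10**6 bytes and that bayes_seen grows with each learned message.
Both are assumptions for the sake of the arithmetic, not something the
dump itself confirms.

```python
# Figures quoted earlier in this message.
BAYES_SEEN_BYTES = 160 * 10**6  # "160 MB", assumed to be 10**6-byte megabytes
NSPAM = 405891                  # from "sa-learn --dump magic"
NHAM = 948334                   # from "sa-learn --dump magic"
MAILS_PER_DAY = 30000           # approximate relay volume

# Total messages the bayes database reports as learned.
learned = NSPAM + NHAM

# Average bayes_seen bytes per learned message, under the assumption
# that the file grows with each learned message.
bytes_per_learned = BAYES_SEEN_BYTES / learned

# How many days of traffic the learned count corresponds to, if every
# message passing through the relay were learned.
days_of_traffic = learned / MAILS_PER_DAY

print("learned messages:", learned)
print("bytes per learned message: %.0f" % bytes_per_learned)
print("equivalent days of traffic: %.1f" % days_of_traffic)
```

On these numbers the learned count corresponds to about a month and a
half of traffic, at a little over a hundred bytes of bayes_seen per
learned message.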


