Hi,

The file size of the bayes database on a server is becoming large: bayes_seen is 160 MB and bayes_toks is 8 MB. This mail server processes around 30000 mails a day, as a relay.
I did not configure any bayes_expiry_max_db_size, so it should be set to the default (150000), and the only bayes-related configuration directives in my local.cf are:

    bayes_auto_learn 1
    bayes_auto_learn_threshold_nonspam 0.1
    bayes_auto_learn_threshold_spam 12.0

Is it normal to have such large file sizes? The fine manual says that with such settings, the file size should stay around 8 MB, but do these 8 MB represent the "normal" size of the bayes_toks file, or the normal size of the bayes_seen one?

Today, spamd stopped working with the following error:

    Dec 15 04:25:15 server spamc[18803]: connect(AF_INET) to spamd at 127.0.0.1 failed, retrying (#1 of 3): Connection refused

I did not understand why it died. Manually restarting spamd solved the problem, but I think it could happen again, and it might be related to some lack of resources due to the bayes file size?

Some more info:

    su spam -s /bin/sh -c "sa-learn --dump magic -D"
    (...)
    debug: bayes: 6765 tie-ing to DB file R/O /home/spam/.spamassassin/bayes_toks
    debug: bayes: 6765 tie-ing to DB file R/O /home/spam/.spamassassin/bayes_seen
    debug: bayes: found bayes db version 3
    debug: Score set 2 chosen.
    0.000          0          3          0  non-token data: bayes db version
    0.000          0     405891          0  non-token data: nspam
    0.000          0     948334          0  non-token data: nham
    0.000          0     287829          0  non-token data: ntokens
    0.000          0 1103037764          0  non-token data: oldest atime
    0.000          0 1103107296          0  non-token data: newest atime
    0.000          0 1103107219          0  non-token data: last journal sync atime
    0.000          0 1103105595          0  non-token data: last expiry atime
    0.000          0      43200          0  non-token data: last expire atime delta
    0.000          0     161098          0  non-token data: last expire reduction count
    debug: bayes: 6765 untie-ing
    debug: bayes: 6765 untie-ing db_toks
    debug: bayes: 6765 untie-ing db_seen

I am using postfix 1.1.12, SA 3.0.1, MIME-Base64-3.05, DB_File-1.809, and db4-4.0.14-20 (RedHat 9) on a postfix+SA relay. The bayes database is common to all users, and located in the "spam" user's home directory.
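To make the dump figures easier to read, here is a minimal Python sketch that parses a few of the "non-token data" lines quoted above and reports the token count next to the default limit, plus the time span between the oldest and newest token atimes. The helper name parse_magic and the constant DEFAULT_MAX_DB_SIZE are my own illustrative names, not part of SpamAssassin, and the sketch assumes the four-column layout shown in the dump.

```python
import re

# Sample lines copied from the `sa-learn --dump magic` output above.
DUMP = """\
0.000 0 405891 0 non-token data: nspam
0.000 0 948334 0 non-token data: nham
0.000 0 287829 0 non-token data: ntokens
0.000 0 1103037764 0 non-token data: oldest atime
0.000 0 1103107296 0 non-token data: newest atime
"""

# Default bayes_expiry_max_db_size (in tokens), as stated in the post.
DEFAULT_MAX_DB_SIZE = 150000


def parse_magic(text):
    """Return {label: integer value} for every 'non-token data' line.

    Hypothetical helper: the value is taken from the third column.
    """
    values = {}
    for line in text.splitlines():
        m = re.match(r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (.+)", line)
        if m:
            values[m.group(2).strip()] = int(m.group(1))
    return values


magic = parse_magic(DUMP)

# atimes are Unix timestamps, so their difference is in seconds.
span_hours = (magic["newest atime"] - magic["oldest atime"]) / 3600

print("tokens:", magic["ntokens"], "default limit:", DEFAULT_MAX_DB_SIZE)
print("token atime span: %.1f hours" % span_hours)
```

Feeding it the full dump instead of the five sample lines should work the same way, since lines that do not match the pattern (the debug lines) are simply skipped.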
SA is invoked with

    spamd -d -c -u spam

and

    /usr/bin/spamc -t 180 -s 500000 -e /usr/sbin/sendmail -i -f ${sender} -- ${recipient}

Many thanks to whoever has any clue on how I could shrink the bayes files without losing them, if they need shrinking ("--force-expire" does not reduce their sizes). I would be particularly interested in the right bayes_expiry_max_db_size setting for a server handling around 30000 mails daily.