Thanks for the responses.

A few questions: will running 'check_whitelist' affect our server's
performance?  Do I risk creating other problems if I leave things as they
are until our sys admin returns?  :)




On 7/18/07, Matt Kettler <[EMAIL PROTECTED]> wrote:

Tammy George wrote:
> Hello.
>
> Our Linux server is running SpamAssassin version 3.1.5.
>
> Backups started dying with 'inactivity timeout'.  Dug around & found
> the following:
>
> drwx------   3 vscan  vscan            512 Jul 18 16:28 .
> -rw-------   1 vscan  vscan  1099983372288 Jul 18 16:28 auto-whitelist
> -rw-------   1 vscan  vscan     1205862400 Jul 18 16:28 bayes_seen
> -rw-------   1 vscan  vscan       10846208 Jul 18 16:28 bayes_toks
> -rw-------   1 vscan  vscan          18240 Jul 18 16:28 bayes_journal
> drwxr-x---  12 vscan  vscan           1024 Jul 18 12:12 ..
> -rw-------   1 vscan  vscan        2654208 Jan 26  2005 bayes_toks.expire42066
> -rw-------   1 vscan  vscan         606208 Mar 30  2004 bayes_toks.expire93303
> drwxr-xr-x   2 vscan  vscan            512 Jan 28  2004 old
> -rw-r--r--   1 vscan  vscan           1165 Jan 27  2004 user_prefs
>
> A du -k shows auto-whitelist as using 1747968 KB.
>
> Surprisingly, we aren't experiencing any problems other than the
> backups.  Our site handles A LOT of email.
>
> After I send this email, I'm going to look into check_whitelist and
> trim_whitelist (and probably sa-learn re: the bayes files), however,
> any suggestions would be most appreciated!  Our sys admin is on
> vacation and he's our expert.

For the auto-whitelist file, you need to run this command:

   check_whitelist --clean /path/to/auto-whitelist

That said, IMHO, the AWL isn't really ready for production use on large
systems unless you're going to run it on SQL and use your own scripts to
do expiry.
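
On the size of auto-whitelist: the listing shows 1099983372288 bytes in
ls, while du -k reports 1747968 KB. That gap suggests the file may be
sparse, which could be why a backup tool reading the full apparent size
times out, though that is an inference from the numbers and not something
confirmed here. A minimal sketch for comparing the two figures on any
file follows; sparse_report is a hypothetical helper name, not a
SpamAssassin tool.

```shell
# Print a file's apparent size (bytes, as listed by ls) next to the
# disk space it actually occupies (KB, as reported by du -k).
# sparse_report is a hypothetical helper name, not a SpamAssassin tool.
sparse_report() {
    file=$1
    # ls -n prints numeric uid/gid, so the size is reliably field 5.
    apparent_bytes=$(ls -ln -- "$file" | awk '{print $5}')
    # du -k reports allocated space in 1024-byte units.
    allocated_kb=$(du -k -- "$file" | awk '{print $1}')
    echo "apparent_bytes=$apparent_bytes allocated_kb=$allocated_kb"
}
```

Calling sparse_report /path/to/auto-whitelist before and after the
cleanup shows whether either figure changed. If the allocated size is far
below the apparent size, the file is most likely sparse.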

The bayes_toks and bayes_journal files auto-expire, so you don't need to
do anything to them.

The bayes_seen file doesn't have any kind of date information, so it
can't auto-expire. However, you can remove the file reasonably safely.
This file is just a list of all the messages that have already been run
through sa-learn. The only drawback to deleting it is that it will allow
you to re-train a message that you've already learned. So if you
maintain a massive directory of messages to be "relearned" but don't
clean it out, you might have a minor amount of over-learning (no big
deal).
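
If deleting bayes_seen outright feels risky while the sys admin is away,
one cautious option is to rename it so it can be restored later. This is
only a sketch: set_aside is a hypothetical helper name, and it assumes
nothing is writing to the file at the moment it is moved.

```shell
# Move a file aside under a timestamped name instead of deleting it.
# set_aside is a hypothetical helper name, not part of SpamAssassin.
# Assumption: no process is writing to the file while it is moved.
set_aside() {
    file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    backup="$file.$(date +%Y%m%d%H%M%S).old"
    # Print the new name on success so the caller can record it.
    mv -- "$file" "$backup" && echo "$backup"
}
```

For example, set_aside /path/to/bayes_seen prints the new file name.
The renamed copy still occupies disk space until it is deleted.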



>
> Thanks in advance for any advice.
>

