Derek Catanzaro wrote:
> Matt Kettler wrote:
>> Derek Catanzaro wrote:
>>  
>>> I have a ton of bayes_toks.expire files listed in
>>> /root/.spamassassin.  Is it safe to delete these files?      
>> Yes, provided no expire process is currently running and using one.
>>   
> I did wind up deleting all of the bayes_toks.expire files, there were
> hundreds.
>> 1) run sa-learn --force-expire to fix the immediate problem.
>>   
> After deleting the bayes_toks.expire files I ran sa-learn
> --force-expire and received the result below and it just stayed there
> for at least 20 minutes so I forced it to stop.  Is this normal
> behavior?  Was I too impatient with the process?  My bayes_toks file
> is 321MB, not sure if that is part of the issue.
That's *enormous* for bayes_toks. It should be something on the order of
10-20 megs with the default bayes_expiry_max_db_size settings.
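
A quick way to check the file size against that range is sketched below. The helper name check_toks_size and the 20 MB default limit are assumptions for illustration (the limit is just the upper end of the range mentioned above), not anything SpamAssassin ships.

```shell
# Hypothetical helper: report a file's size in whole megabytes and flag it
# when it exceeds a limit (default 20, an assumed threshold taken from the
# 10-20 meg range mentioned above).
check_toks_size() {
    file=$1
    limit_mb=${2:-20}
    # wc -c prints the byte count; tr strips the padding some platforms add.
    bytes=$(wc -c < "$file" | tr -d ' ')
    mb=$((bytes / 1048576))
    if [ "$mb" -gt "$limit_mb" ]; then
        echo "$file: ${mb}MB, over ${limit_mb}MB"
        return 1
    fi
    echo "$file: ${mb}MB, ok"
}

# Example (not run here; the path is the one from this thread):
#   check_toks_size /root/.spamassassin/bayes_toks
# For token counts rather than file size, "sa-learn --dump magic" is the
# usual place to look.
```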

Since your bayes DB has so many excess tokens, it may take sa-learn
--force-expire a VERY long time to actually do the expire. Bayes Expiry
performance doesn't seem to scale very well to large databases when
you're using db files.

Try kicking it off again. If you're concerned it's hung, do two things:

1) Add -D to the command line to turn on the debug output. Use this to
make sure it's not getting hung up trying to get a lock. If it can't get
a file lock, you'll see it retrying constantly. Based on the journal sync
message below, it probably didn't get hung here, but the debug output is
still useful to see in this case.

2) Use another terminal to check the .expire file. *After* the debug
output tells you SA has figured out a good atime, it should start writing
to this file. It should keep slowly growing in size, so you can use this
to check if it's still working away.
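
The two checks above could be sketched roughly as follows. The helper names expire_size and expire_is_growing are made up for illustration, the 30 second interval is an arbitrary assumption, and the exact name of the .expire file may carry a suffix, so list the directory first to find it.

```shell
# Hypothetical helper: print the size of a file in bytes (0 if it does not
# exist yet, e.g. before SA has started writing it).
expire_size() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo 0
    fi
}

# Hypothetical helper: sample the size twice, "interval" seconds apart
# (default 30, an assumed value), and report whether the file grew.
expire_is_growing() {
    file=$1
    interval=${2:-30}
    before=$(expire_size "$file")
    sleep "$interval"
    after=$(expire_size "$file")
    if [ "$after" -gt "$before" ]; then
        echo "growing: $before -> $after bytes"
    else
        echo "no growth: $before -> $after bytes"
        return 1
    fi
}

# Terminal 1 (not run here):
#   sa-learn -D --force-expire
# Terminal 2 (not run here; adjust the file name to what ls shows):
#   expire_is_growing /root/.spamassassin/bayes_toks.expire 30
```

A single "no growth" result does not prove a hang, since the file only starts growing once the atime has been chosen; repeat the check a few times before giving up.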

>
> .spamassassin]# sa-learn --force-expire
> bayes: synced databases from journal in 0 seconds: 1611 unique entries
> (2099 total entries)
>> 2) prevent future problems by fixing your spamassassin timeout value in
>> MailScanner.cf
>>
>> Anything under 600 seconds is bad news if you use bayes. In fact, I'd
>> set it to 3000 seconds. I use MailScanner myself, and have since the SA
>> 2.31 days, and I've NEVER had MailScanner time out a SA process for any
>> valid reason. I've only had it time out because the timeout value was
>> too short.
>>   
> Matt, After posting this to the list I did some more research online
> and found the following thread which you responded to.  I have applied
> the settings listed in this thread to my MS/SA setup.  Do these
> settings still apply in your opinion?  The thread recommends a minimum
> of 60 seconds for the spamassassin timeout value, mine is set to 75. 
> Based on what you are saying above I believe I need to increase the
> spamassassin timeout dramatically, can you confirm?  Since I deleted
> the bayes_toks.expire files there has been 1 .expire file generated
> already, so I'm assuming that should tell me my timeout is still too
> low?
>
> http://mail-archives.apache.org/mod_mbox/spamassassin-users/200410.mbox/[EMAIL PROTECTED]
>

I've since found that 120 seconds wasn't adequate for situations where
there were a lot of tokens to expire. While 120 made this problem less
frequent, it didn't go away. Hence my current recommendations vary from
5 to 10 minutes.
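
To see where an existing setup stands against that range, something like the sketch below could be used. It assumes the setting appears in the MailScanner config file as a line of the form "SpamAssassin Timeout = N" with N in seconds; the helper name check_sa_timeout and the 300 second default minimum (5 minutes) are illustrative assumptions.

```shell
# Hypothetical helper: pull the "SpamAssassin Timeout" value out of a
# MailScanner config file and compare it with a minimum (default 300 s,
# i.e. the 5 minute lower end mentioned above).
check_sa_timeout() {
    conf=$1
    min=${2:-300}
    # Assumed line format: "SpamAssassin Timeout = 75". If the setting
    # appears more than once, the last occurrence is used.
    value=$(sed -n 's/^[[:space:]]*SpamAssassin Timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    if [ -z "$value" ]; then
        echo "no SpamAssassin Timeout setting found"
        return 2
    fi
    if [ "$value" -lt "$min" ]; then
        echo "timeout ${value}s is below ${min}s"
        return 1
    fi
    echo "timeout ${value}s is ok"
}

# Example (not run here; the config path depends on the install):
#   check_sa_timeout /path/to/MailScanner/config/file 300
```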


That said, your bayes_toks is currently so large you probably won't be
able to increase the timeout enough to fix the problem. Given that your
manual run took over 20 mins, an opportunistic run is going to take
just as long.

>
> Thanks for all of the information you provided.  I really appreciate
> the assistance.
> Thanks,
> Derek
>
>
