On Fri, Sep 19, 2003 at 04:38:40PM -0400, Pete O'Hara wrote:
Thanks Theo,
Yes, I figured that if for some reason the 50k was too low I should end up with 100k, but here I have 165k, and this is what is confusing me.
0.000 0 165010 0 non-token data: ntokens
Just remember, it's all "best effort", so you may end up with > max_db_size, or < 100k, depending on how the calculations go, but the code does its best (surprise!) not to do that. ;)
what would cause an old lock file and a bayes_toks.new that is
static (not being written to and just hanging around)? - I have seen
users with memory problems that cause this but they seem to have
mail problems and database access issues that I don't have - the logs
show that BAYES_XX tests are being utilized
something dying during an expire or import.
-- I believe auto_expiry but how do I know for sure (bayes_auto_expire 1 in
-- /etc/mail/spamassassin/local.cf - which is being read - see below) -- but it's not expiring AFAIK. I have bayes_expiry_max_db_size 50000. I know
-- that with such a small size the result should be 100,000 tokens
irrelevant. auto_expire occurs because it appears in the -D output. whether or not it does anything is a different issue.
-- have "bayes_expiry_max_db_size 50000" to try to force an auto expire
debug: bayes: expiry check keep size, 75% of max: 37500
debug: bayes: expiry keep size too small, resetting to 100,000 tokens
debug: bayes: token count: 165454, final goal reduction size: 65454
debug: bayes: First pass? Current: 1063914795, Last: 1063725493, atime: 1382400, count: 66791, newdelta: 1410637, ratio: 1.02042655911021
-- why were 155616 tokens kept? should have been 100,000 I thought
debug: expired old Bayes database entries in 84 seconds: 155616 entries kept, 9838 deleted
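For reference, the arithmetic behind the keep-size and goal-reduction debug lines above can be sketched roughly as follows. This is only an illustration reconstructed from the log output, not SpamAssassin's actual code: the function name `expiry_goal`, the two constants, and the clamp to zero are assumptions made for the example.

```python
# Rough sketch of the keep-size arithmetic seen in the -D output.
# All names here are hypothetical; the numbers come from the log lines.
MIN_KEEP_TOKENS = 100_000  # floor named in "resetting to 100,000 tokens"
KEEP_FRACTION = 0.75       # "75% of max"


def expiry_goal(max_db_size, token_count):
    """Return (keep_size, goal_reduction) for one expiry attempt."""
    keep = int(max_db_size * KEEP_FRACTION)
    if keep < MIN_KEEP_TOKENS:
        # Keep size too small, so fall back to the floor.
        keep = MIN_KEEP_TOKENS
    # Assumption: never ask for a negative reduction.
    goal_reduction = max(token_count - keep, 0)
    return keep, goal_reduction


keep, goal = expiry_goal(50_000, 165_454)
print(keep, goal)

# The final log line is consistent with the token count as well:
print(165_454 - 9_838)  # entries kept after 9838 were deleted
```

Note that the goal reduction is only a target; the number of tokens actually deleted depends on the atime delta chosen for the run.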
This is explained in the sa-learn docs, but in short... 2.6x's expiry code tries to be efficient (and time-saving) by estimating time deltas based on the previous expire run, on the assumption that your mail flow will be semi-constant and therefore expires will be on roughly the same number of tokens with roughly the same time delta.
The "First pass?" line shows the values used to figure out if an estimation is likely to work or not. I'm not going to go into specifics (see the EXPIRATION section of sa-learn's POD documentation), but with the values listed above, SA decided that it can estimate based on the previous expiry. (note: I just added a debug statement to the expire code to say that it will use estimation and not run a first pass.) By doing so, however, it was only able to remove 9838 tokens, leaving 155616. i.e., you learned far fewer tokens in the last 2 weeks than in the 2 weeks previous to that.
The estimation, btw, calculated that atime*count/goal_reduction (1382400*66791/65454) gave a new atime delta estimate of 1410637, or approximately 16 days.
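As a quick check of that formula, here is a minimal sketch that plugs in the values from the "First pass?" debug line. The function name `estimate_atime_delta` is made up for the example, and truncating to an integer is an assumption that happens to match the logged `newdelta`; the real expiry code may differ in detail.

```python
SECONDS_PER_DAY = 86_400


def estimate_atime_delta(last_atime_delta, last_count, goal_reduction):
    """Scale the previous atime delta by last_count / goal_reduction."""
    return last_atime_delta * last_count / goal_reduction


# Values taken from the "First pass?" line in the debug output.
last_atime_delta = 1_382_400  # "atime": 16 days in seconds
last_count = 66_791           # "count": tokens removed by the previous expire
goal_reduction = 65_454       # "final goal reduction size"

ratio = last_count / goal_reduction
new_delta = estimate_atime_delta(last_atime_delta, last_count, goal_reduction)

print(ratio)                        # compare with "ratio" in the log
print(int(new_delta))               # compare with "newdelta" in the log
print(new_delta / SECONDS_PER_DAY)  # the new delta expressed in days
```

Because the ratio is so close to 1, the new delta barely differs from the previous one, which is why the run reused it rather than doing a first pass.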
Hope this helps. :)
This helps a LOT :). I did read the docs on the EXPIRY conditions but couldn't make sense of some of it which you just cleared up for me.
Thanks again,
Pete
_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk