At 04:56 PM 2/15/2005, Chris Santerre wrote:
Yes, it does tolerate a deviation well. But I remember DQ saying something
like this.

Here's one reference to the post I was talking about. In the thread I'd been suggesting "optimal" would be best if the training ratio matched your "real world" spam:ham ratio (which historically was somewhere around 75/25 here, but recently it's closer to 60/40).


Dan corrected me and said 50/50 was the goal to shoot for:

http://readlist.com/lists/incubator.apache.org/spamassassin-users/0/2046.html

Of course, my all-of-history ratio is about 96:4, and my recent training ratio is 90:10 (past day).
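
For anyone who wants to check their own numbers, here is a rough sketch of the arithmetic. It assumes you already have the learned spam and ham counts on hand (the nspam and nham values that `sa-learn --dump magic` reports); the function names are made up for illustration and are not part of SA.

```python
from typing import Tuple


def training_ratio(nspam: int, nham: int) -> Tuple[float, float]:
    """Return (spam %, ham %) of all messages learned so far."""
    total = nspam + nham
    if total == 0:
        raise ValueError("no messages learned yet")
    return (100.0 * nspam / total, 100.0 * nham / total)


def ham_needed_for_balance(nspam: int, nham: int) -> int:
    """How many more ham messages would bring training to 50/50."""
    return max(0, nspam - nham)


if __name__ == "__main__":
    # Hypothetical counts chosen to give a 96:4 split.
    spam_pct, ham_pct = training_ratio(960, 40)
    print(f"spam:ham = {spam_pct:.0f}:{ham_pct:.0f}")
    print("ham needed for 50/50:", ham_needed_for_balance(960, 40))
```

With a heavily spam-skewed corpus like mine, the second number shows how far off a strict 50/50 target really is.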



I agree on a personal scale it works wonders if you *continue* to feed it a
proper diet.

Really, I think ratios are helpful, but a fresh feed of both seems more important. I totally agree with the above. Between autolearn and forced training scripts, SA learns quite a bit of mail.


But when you get to a more general server side solution, I
don't think the results are worth the effort, when one can write a simple
rule faster than training.

I don't think that's true. The autolearner is a big help here. Although I force feed, SA autolearns more mail than my scripts feed it.


(64% of spam and 12% of ham get autolearned the way I'm set up, and I've not seen any learning errors so far. However, I do use a setup tweaked to avoid false ham learning, something I consider a major issue with the default autolearn threshold.)
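
As a sketch of the kind of tweak I mean: the autolearn thresholds are set in local.cf. The nonspam value below is only an illustration, not my actual setting. The stock defaults are 0.1 for nonspam and 12.0 for spam, and lowering the nonspam threshold makes SA pickier about what it autolearns as ham.

```
# local.cf (illustrative values only)
bayes_auto_learn 1
# Only autolearn as ham when the score is below this (default 0.1)
bayes_auto_learn_threshold_nonspam -1.0
# Only autolearn as spam when the score is at or above this (default 12.0)
bayes_auto_learn_threshold_spam 12.0
```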



