Yes, it does tolerate a deviation well. But I remember DQ saying something like this.
Here's one reference to the post I was talking about. In the thread I'd been suggesting that the "optimal" training ratio would be one that matched your "real world" spam:ham ratio (which historically was somewhere around 75/25 here, but recently it's closer to 60/40).
Dan corrected me and said 50/50 was the goal to shoot for:
http://readlist.com/lists/incubator.apache.org/spamassassin-users/0/2046.html
Of course, my all-of-history ratio is about 96:4, and my recent training ratio is 90:10 (past day).
I agree on a personal scale it works wonders if you *continue* to feed it a proper diet.
Really, I think ratios are helpful, but a fresh feed of both seems more important. I totally agree with the above. Between autolearn and forced training scripts, SA learns quite a bit of mail.
But when you get to a more general server-side solution, I don't think the results are worth the effort, when one can write a simple rule faster than training.
I don't think that's true. The autolearner is a big help here. Although I force feed, SA autolearns more mail than my scripts feed it.
(64% of spam and 12% of ham get autolearned the way I'm set up, and I've not seen any learning errors so far. However, I do use a setup tweaked to avoid false ham learning, something I consider a major issue with the default autolearn threshold.)
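For what it's worth, here is a rough back-of-the-envelope sketch of what those autolearn rates do to the training ratio. It assumes the recent 60/40 spam:ham mail mix mentioned earlier and counts only autolearned mail, ignoring anything fed by scripts. The function name `training_ratio` and its parameters are made up for illustration and are not part of SpamAssassin.

```python
def training_ratio(spam_share, ham_share, spam_learn_rate, ham_learn_rate):
    """Return (spam_pct, ham_pct) of the mail that actually gets learned.

    spam_share / ham_share: fraction of incoming mail that is spam / ham.
    spam_learn_rate / ham_learn_rate: fraction of each class that is autolearned.
    """
    learned_spam = spam_share * spam_learn_rate
    learned_ham = ham_share * ham_learn_rate
    total = learned_spam + learned_ham
    if total == 0:
        raise ValueError("nothing is learned with these rates")
    return 100 * learned_spam / total, 100 * learned_ham / total


# Assumed inputs: 60/40 mail mix, 64% of spam and 12% of ham autolearned.
spam_pct, ham_pct = training_ratio(0.60, 0.40, 0.64, 0.12)
print(f"autolearned spam:ham is roughly {spam_pct:.0f}:{ham_pct:.0f}")
```

Under those assumptions the autolearned mix comes out near 89:11, which is in the same neighborhood as the 90:10 recent training ratio quoted above and a long way from 50/50.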