On Fri, 12 Sep 2014 16:21:48 +0200
Axb wrote:

> On 09/12/2014 03:48 PM, RW wrote:
> > There's a qualitative difference between a threshold of 0.1 and
> > -1.0. At 0.1 ham can be learned just by not hitting any spam tests,
>
> Which means that FNs get easily learnt as ham, which is what we're
> trying to avoid.

You're assuming that broad and balanced learning with a little
mistraining is necessarily worse than any kind of learning without
mistraining. My point is not that 0.1 is better, just that it's better
understood.

At -1.0 the training will be sensitive to custom rules and to some of
the most variable-quality stock rules there are - rules that are often
zeroed or made positive. As an extreme example, someone might try it
and think it's working well, while it's learning nothing more than
autogenerated mail from a single website, due to one custom rule with a
negative score.

If I relied on auto-training I wouldn't drop the threshold to -1.0
without first determining what extra custom rules are needed to keep
the training broad.
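
To make the difference concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not SpamAssassin's actual auto-learn code: it assumes a message is learned as ham when the sum of its rule scores is below the ham threshold, and it ignores details such as which rules are excluded from the auto-learn score. The message names and the -1.5 custom-rule score are made up for illustration.

```python
# Toy model of ham auto-learning (an assumption, not SpamAssassin's real logic):
# a message is learned as ham when its summed rule score is below the threshold.

# Hypothetical messages, each mapped to the scores of the rules it hit.
MESSAGES = {
    "clean_ham_no_hits": [],        # ordinary ham that hits no rules
    "missed_spam_no_hits": [],      # a false negative that hits no rules
    "site_notification": [-1.5],    # hits one hypothetical negative custom rule
    "ham_with_minor_hit": [0.5],    # ham that trips one weak spam rule
}


def ham_learned(messages, threshold):
    """Return the names of messages whose total score is below the threshold."""
    return [name for name, scores in messages.items() if sum(scores) < threshold]


if __name__ == "__main__":
    for threshold in (0.1, -1.0):
        print(threshold, ham_learned(MESSAGES, threshold))
```

In this model a threshold of 0.1 learns everything that hits no rules, the false negative included, while -1.0 learns only the mail matched by the single negative custom rule: no mistraining, but very narrow training.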