On 30 Sep 2004 at 9:00, Chip Paswater wrote:

> Does a human review the scores generated by the statistics engine?
>
> Doesn't it make sense to have more of a bell curve on the 2nd set of bayes
> scores?
>
> If not, why not?
>
> The teeth seem to be taken out of BAYES_99 with its low 1.9 score,
> and most of my spam is triggering .99 to 1. That to me seems like an
> obvious oversight, and I'm just wondering what the thinking was to leave it
> at 1.9 for the 3.0 release.
I can't speak for the developers, but the discussion of how the GA-evolved
scores work has come up on previous releases as well.

Basically what you're seeing is that the network tests are so effective
(most likely due to DNSURI tests) that high bayes scores become much less
important when classifying the spammiest spam *during the GA run.*

That last bit is important: the GA run processes a known corpus rather
than a live mail feed. In my opinion, it therefore gives more weight to
the network tests than is directly applicable to a real-world SA setup,
because the GA run has no way to account for reporting delay. It takes
some (admittedly quite short) period of time before specific spam is
reported to Razor, SpamCop, etc. and makes its way onto the various DNSBL
and hash-based servers. If you happen to be one of the unlucky few at the
front of a particular spam wave, many of the net-based tests can miss.

There are ways of alleviating this. Many who have implemented MTA-based
greylisting, for example, report that delivery delays as short as five to
fifteen minutes can provide enough time for spam reports to propagate and
significantly increase net-test efficacy.

With SA alone, however, I agree that the distributed bayes scores should
be bumped up. I'm setting the bayes scores in ruleset 4 to be the same as
ruleset 3 (bayes but no network tests) and I may even revert to the 2.6
bayes values.

If you do override scores for individual tests, make sure you do it in
local.cf rather than 50_scores.cf. That way, a future point release
upgrade won't clobber your careful massaging. :-)

----
Nels Lindquist <*>
Information Systems Manager
Morningstar Air Express Inc.
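
For anyone wanting to try the local.cf override described above, a minimal sketch follows. The score value 3.5 is a placeholder chosen purely for illustration; it is not a shipped default or a recommendation. The sketch assumes the standard `score` directive, where a single value applies to every score set and four values set each one separately. What the message calls rulesets 3 and 4 appear to correspond to the third and fourth values.

```
# local.cf -- site-local overrides, kept out of 50_scores.cf so that
# upgrades do not overwrite them.
# NOTE: 3.5 is a placeholder value for illustration only.

# One value: used for all four score sets.
score BAYES_99 3.5

# Four values, in order:
#   no bayes / no net, no bayes / net, bayes / no net, bayes / net
# Here the bayes+net value is set equal to the bayes-only value.
# score BAYES_99 0 0 3.5 3.5
```

Running `spamassassin --lint` after editing should report any syntax problems in the file.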