Gabriel Wachman wrote:
> Yes. I know it may sound strange from some people's perspective, but
> there are good reasons we need to do it this way. We are comparing
> several spam filters; in order to make claims about the performance of
> any of the filters we need to evaluate a _fixed_ classifier on a test
> set. If the classifier is not fixed, then our confidence intervals go
> out the window. It actually helps SpamAssassin if we can do this because
> if we can't, we need to mention in the paper that any results from
> SpamAssassin are not statistically robust since it changes its
> classifier during training. Since SpamAssassin is so widely used and in
> my experience performs very well, we would really like to include
> results from it without any such caveats.
>
> I hope that helps explain the situation. Regardless, our testing
> methodology is really not up for discussion, I just want to know if
> there is an easy way to do what we want.
All adaptive filters change as they learn; consider auto_learn an additional Bayesian feature. SA uses non-textual attributes collected by its rules. If you remove the ability to use these attributes, you are removing a possibly helpful feature (whether this feature works well or not, whether it is theoretically sound, ... are other questions).

Or do you want to compare SA's "heuristic" rules only (disabling both Bayes and auto_learn) to the other filters? If so, just disable these. Of course, I assume you'll note this in your report :)
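If that's what you want, something like this in your local.cf should do it (a minimal sketch; check the docs for your SA version, and where local.cf lives on your install):

    # Turn off the Bayes subsystem entirely, so only the static
    # rule scores contribute to the verdict
    use_bayes 0

    # Redundant once use_bayes is off, but explicit: never train
    # the Bayes database from the filter's own verdicts
    bayes_auto_learn 0

With those two set, the classifier stays fixed across your whole test set, which is what your confidence intervals need.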