On Sat, Mar 04, 2006 at 09:56:14PM -0500, Gabriel Wachman wrote:
> During training I run:
> sa-learn --dbpath $WORKDIR --ham $DATADIR/$message_dir
> (likewise for spam)
>
> During testing I run:
> spamassassin -t -p $PREFSPATH $DATADIR/$message_dir
You may want to look into mass-check. It's much
Gabriel Wachman wrote:
>
> Yes. I know it may sound strange from some people's perspective, but
> there are good reasons we need to do it this way. We are comparing
> several spam filters; in order to make claims about the performance of
> any of the filters we need to evaluate a _fixed_ classi
On Sat, Mar 04, 2006 at 10:50:19PM -0500, Daryl C. W. O'Shea wrote:
> Even with bayes_auto_learn disabled, the tokens' atimes are still
> updated. That's the way SpamAssassin works, and it helps keep
> SpamAssassin's Bayes implementation effective.
Well, sort of. The atime updates ar
Daryl C. W. O'Shea wrote:
On 04/03/06 09:56 PM, Gabriel Wachman wrote:
A colleague and I are writing a paper about a spam filter he developed.
We'd like to compare it against various open source filters, including
SpamAssassin. The methodology we are using is to train the filter on a
set of mes
From: "mouss" <[EMAIL PROTECTED]>
Gabriel Wachman wrote:
> A colleague and I are writing a paper about a spam filter he developed.
> We'd like to compare it against various open source filters, including
> SpamAssassin. The methodology we are using is to train the filter on a
> set of messages, and
Gabriel Wachman wrote:
> A colleague and I are writing a paper about a spam filter he developed.
> We'd like to compare it against various open source filters, including
> SpamAssassin. The methodology we are using is to train the filter on a
> set of messages, and then test it on an independent
On 04/03/06 09:56 PM, Gabriel Wachman wrote:
> A colleague and I are writing a paper about a spam filter he developed.
> We'd like to compare it against various open source filters, including
> SpamAssassin. The methodology we are using is to train the filter on a
> set of messages, and then test it on a
> During testing, I can see spamassassin create a "bayes_journal" file and
> write to it continuously. I understand this is spamassassin's way of
If the journal is only growing, it isn't being learned from.
Typically, at some point, if auto-learn were enabled, one of the spam mail runs
would take som
A colleague and I are writing a paper about a spam filter he developed.
We'd like to compare it against various open source filters, including
SpamAssassin. The methodology we are using is to train the filter on a
set of messages, and then test it on an independent set of messages. The
key is that