Re: [SAtalk] Bayes classification: how do I know it's working?

Matthew Cline Sat, 23 Nov 2002 00:15:40 -0800

On Friday 22 November 2002 11:08 pm, [EMAIL PROTECTED] wrote:
> Hi,
>
> I've setup another server to try out version 2.5.0. I already trained
> it to identify spam but not yet non-spam messages. How do I know it's
> working?


You can't, not until you train it with a non-spam corpus as well: Bayes
needs examples of both classes before it can score anything.  Here is
how to train and test:

- Use tools/split-corpora to split your spam corpora and your non-spam
  corpora into two (giving you spam.1, spam.2, non-spam.1, non-spam.2).

- Remove the current Bayes databases, which are probably in
  ~/.spamassassin

- Train Bayes on spam.1 and non-spam.1

- Copy the Bayes dbs into masses/spamassassin/, or make softlinks from
  that directory to the dbs; if you don't, mass-check won't do Bayes.

- Run masses/mass-check on spam.2 and non-spam.2

- Run "masses/hit-frequencies -p -m BAYES" to see how the Bayes
  rules are working.
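
If it helps to see what the first step amounts to, here is a rough
sketch in Python of splitting a corpus into a training half and a test
half. This is not the tools/split-corpora script and makes no claim
about how that script works; the function names and the
one-message-per-file layout are assumptions made for illustration.

```python
import random
from pathlib import Path


def split_corpus(messages, seed=0):
    """Shuffle message paths reproducibly and split them into two halves.

    Hypothetical helper: the first half would be used for training
    (like spam.1) and the second half for testing (like spam.2).
    """
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(messages)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


def list_messages(corpus_dir):
    """Return every regular file in corpus_dir (assumes one message per file)."""
    return [p for p in Path(corpus_dir).iterdir() if p.is_file()]
```

The point of the split is that the messages used for testing are never
seen during training, so the hit frequencies reflect how Bayes does on
mail it has not learned from.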

To make mass-check run more quickly, make a second copy of the
spamassassin tree, then remove all of the ".cf" files from the rules/
directory, except for 10_misc.cf, 23_bayes.cf and 50_scores.cf, so that
*only* the Bayes tests will be run by mass-check.
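
The pruning step can be sketched in Python roughly as follows. The
function name prune_rules is made up for this example, and the only
thing taken from the steps above is the list of three files to keep.

```python
from pathlib import Path

# The three rule files kept in the second copy of the tree.
KEEP = {"10_misc.cf", "23_bayes.cf", "50_scores.cf"}


def prune_rules(rules_dir, keep=KEEP):
    """Delete every .cf file in rules_dir whose name is not in keep.

    Hypothetical helper; returns the names of the files it removed.
    """
    removed = []
    for cf in sorted(Path(rules_dir).glob("*.cf")):
        if cf.name not in keep:
            cf.unlink()
            removed.append(cf.name)
    return removed
```

Point it at the rules/ directory of the second copy only, never at the
tree you actually filter mail with, since the deletion is permanent.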

-- 
Give a man a match, and he'll be warm for a minute, but set him on
fire, and he'll be warm for the rest of his life.

Advanced SPAM filtering software: http://spamassassin.org


