Re: [SAtalk] chi2 combining

Justin Mason Tue, 30 Sep 2003 15:06:11 -0700

Matt Tolton writes:
> 1.  I've been trying to gather information about chi2 combining which is provided as 
> an option in spam assassin.  I've searched the list archives, the web, and 
> newsgroups and haven't come up with much.  Could someone please fill me in on what 
> the difference is here, and what advantages/disadvantages it gives?


It's a nifty combining scheme suggested by some folks on the spambayes
project, which has some very nice properties: (a) it puts messages where
the classifier is "mostly sure" at or very near 0.0 or 1.0, (b) it avoids
"cancellation disease", and (c) it still puts mails where it really is
"unsure" around 0.5.
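
For illustration, here is a rough Python sketch of chi-squared combining as
it is generally described in the spambayes discussions. It is not taken from
the SpamAssassin source; the function names (chi2q, chi2_combine) and the
clamping constant are assumptions made for this example. The input is a list
of per-token spam probabilities.

```python
import math

def chi2q(x2, dof):
    """Probability that a chi-squared variable with `dof` degrees of
    freedom is >= x2. Closed-form series; only valid for even dof."""
    assert dof % 2 == 0
    m = x2 / 2.0
    term = math.exp(-m)   # may underflow to 0.0 for very large x2
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def chi2_combine(probs, eps=1e-9):
    """Combine per-token spam probabilities into one score in [0, 1].
    `eps` is a hypothetical clamp that keeps log() away from 0."""
    n = len(probs)
    if n == 0:
        return 0.5        # no evidence at all: unsure
    probs = [min(max(p, eps), 1.0 - eps) for p in probs]
    # S is the "looks like spam" indicator, H the "looks like ham" one.
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0

spammy = chi2_combine([0.99] * 10)               # very close to 1.0
hammy = chi2_combine([0.01] * 10)                # very close to 0.0
mixed = chi2_combine([0.99] * 5 + [0.01] * 5)    # about 0.5
```

In this sketch, strong evidence on both sides drives S and H towards 1
together, so the score lands near 0.5 instead of swinging to one extreme,
which is the "unsure" behaviour in (c).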

I can't find the exact discussion now, but it's somewhere around:
http://mail.python.org/pipermail/spambayes/2002-September/
http://mail.python.org/pipermail/spambayes/2002-October/

Cancellation disease is covered in
http://mail.python.org/pipermail/spambayes/2002-October/001236.html .

> 2.  Will the ***SPAM*** tag that I have put in the subject line affect
> bayesian learning when I use sa-learn?  Do I need to take that out?
> 3.  Does sa-learn take out the spamassassin header information
> automatically (I thought I read in the docs that it does, but I can't
> seem to find it.), or do I need to specifically filter out those
> headers?

FAQ:
http://spamassassin.taint.org/faq/index.cgi?req=show&file=faq05.002.htp

--j.


