On Tue, Sep 30, 2003 at 02:27:51PM -0700, Matt Tolton wrote:
> Should the Bayesian classifier use chi-squared combining, instead of 
> Robinson/Graham-style naive Bayesian combining? Chi-squared produces more 'extreme' 
> output results, but may be more resistant to changes in corpus size etc. 
>  
> This isn't very clear to me...can anyone give a more in-depth explanation?

This is related to the way Bayesian analysis predicts things.  In the
field of decision theory, a popular way to analyze problems is to
consider how risky the choices are.

The text means that the chi-squared method is riskier - its results are
less conservative.  On one hand, that can be considered a bad thing
(since score variance probably increases as a result).  But the text
says it has other benefits - namely, making the analysis more resistant
to outside influences (like corpus size).
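For the curious, here's a rough sketch of the two combining schemes - in
Python rather than SA's actual Perl, and simplified from the public
Graham/Robinson write-ups (SpamBayes uses the same chi-squared series), so
treat it as an illustration, not SA's real code.  Both take a list of
per-token spam probabilities and combine them into one score in [0, 1]:

```python
import math

def graham_combine(probs):
    """Graham-style 'naive Bayes' combining: product of per-token
    spam probabilities vs. product of their complements."""
    s = math.prod(probs)
    h = math.prod(1.0 - p for p in probs)
    return s / (s + h)

def chi2q(x2, v):
    """Survival function of the chi-square distribution for even
    degrees of freedom v (the series form used in the Robinson and
    SpamBayes descriptions)."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def chi2_combine(probs):
    """Fisher/Robinson chi-squared combining: treat the token
    probabilities as p-values, combine each tail via -2*sum(ln p),
    and average the 'spammy' and 'hammy' indicators."""
    n = len(probs)
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0
```

Both agree on the obvious cases (all-spammy tokens score near 1, a
balanced mix scores 0.5), but they get there by different routes, which
is where the risk/conservatism trade-off above comes from.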

It has its good and bad points, but is probably only interesting to
statisticians and developers :-)

DISCLAIMER: I am not an SA developer and haven't read much of the code -
I'm simply translating the text above into more common language.

-- 
Ross Vandegrift
[EMAIL PROTECTED]

A Pope has a Water Cannon.                               It is a Water Cannon.
He fires Holy-Water from it.                        It is a Holy-Water Cannon.
He Blesses it.                                 It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it.          It is a Wholly Holy Holy-Water Cannon.
He has it pierced.                It is a Holey Wholly Holy Holy-Water Cannon.
He makes it official.       It is a Canon Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive.                                       He shoots them.


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk