On Tue, Sep 30, 2003 at 02:27:51PM -0700, Matt Tolton wrote:
> Should the Bayesian classifier use chi-squared combining, instead of
> Robinson/Graham-style naive Bayesian combining? Chi-squared produces
> more 'extreme' output results, but may be more resistant to changes in
> corpus size etc.
>
> This isn't very clear to me...can anyone give a more in-depth explanation?
This is related to the way Bayesian analysis predicts things. In the field
of decision theory, a popular way to analyze problems is to consider how
risky the choices are. The text means that the chi-squared method is
riskier - its results are less conservative. On one hand, this can be
considered a bad thing, since score variance probably increases as a
result. On the other hand, the text says it has benefits - namely, making
the analysis more resistant to outside influences (like corpus size). It
has its good and bad points, but is probably only interesting to
statisticians and developers :-)

DISCLAIMER: I am not an SA developer and haven't read much of the code -
I'm simply translating the text above into more common language.

--
Ross Vandegrift
[EMAIL PROTECTED]

A Pope has a Water Cannon. It is a Water Cannon.
He fires Holy-Water from it. It is a Holy-Water Cannon.
He Blesses it. It is a Holy Holy-Water Cannon.
He Blesses the Hell out of it. It is a Wholly Holy Holy-Water Cannon.
He has it pierced. It is a Holey Wholly Holy Holy-Water Cannon.
He makes it official. It is a Canon Holey Wholly Holy Holy-Water Cannon.
Batman and Robin arrive. He shoots them.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
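For anyone who wants more than the verbal summary, the two combining
schemes the docs contrast can be sketched roughly as below. This is an
illustrative reimplementation in Python, not SpamAssassin's actual code,
and the function names are mine: Graham-style naive Bayes multiplies the
per-token spam probabilities directly, while the chi-squared scheme
(Robinson's application of Fisher's method) converts the log-products into
chi-squared tail probabilities, which is what pushes the output toward the
extremes.

```python
import math

def chi2q(x2, v):
    """Survival function of the chi-squared distribution for an even
    number of degrees of freedom v (closed form: exp(-m) * sum m^i/i!)."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def naive_bayes_combine(probs):
    """Graham-style combining: P = prod(p) / (prod(p) + prod(1-p)).
    probs are per-token spam probabilities, strictly inside (0, 1)."""
    p = math.prod(probs)
    q = math.prod(1.0 - x for x in probs)
    return p / (p + q)

def chi_squared_combine(probs):
    """Robinson/Fisher chi-squared combining: s measures the evidence
    for spam, h the evidence for ham; the score is their midpoint, so
    conflicting evidence lands near 0.5 ("unsure")."""
    n = len(probs)
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s + 1.0 - h) / 2.0
```

On consistently spammy or hammy tokens both schemes agree, but the
chi-squared score saturates toward 0 or 1 faster - the "extreme" results
the docs warn about - while genuinely mixed evidence keeps it near 0.5.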