Bart Schaefer writes:
>The point is that -- aside from the rule "do not teach spam as ham, nor
>teach ham as spam" -- YOU DON'T REALLY KNOW what data will increase or
>decrease the classifier's accuracy.  As a human, you're good at making the
>gestalt (and subjective) judgement "this is spam" (or ham).  You're not
>good at instantly recognizing every fragment of the message that the
>classifier considers to be a token and then determining whether each such
>token occurs more frequently (or uniquely) in spam or ham.

Ah, that's the key point about Bayes right there.

The reason Bayesish probabilistic classification systems work well is
precisely *because* you don't have to second-guess them -- just feed them
correctly labelled spam and ham, and they will "do the right thing", even
if you think you might be confusing them.
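
To make that concrete, here is a rough sketch of the idea in Python. It is
a toy, not SpamAssassin's actual Bayes code: the TinyBayes class, its
tokenizer, the add-one smoothing, and the sample messages are all made up
for illustration. It only shows that per-token statistics fall out of the
labelled mail on their own, so nobody has to judge individual tokens.

```python
import math
import re
from collections import Counter


class TinyBayes:
    """Toy token-frequency classifier (hypothetical, for illustration only)."""

    def __init__(self):
        # Number of messages of each class that contained each token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        # Number of messages learned for each class.
        self.messages = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase word-like runs, one entry per message.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, label):
        self.messages[label] += 1
        self.counts[label].update(self.tokenize(text))

    def token_spamminess(self, token):
        # Smoothed share of spam (resp. ham) messages containing the token.
        p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
        p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
        return p_spam / (p_spam + p_ham)

    def score(self, text):
        # Sum of per-token log-odds; above zero leans spam, below leans ham.
        total = 0.0
        for token in self.tokenize(text):
            p = self.token_spamminess(token)
            total += math.log(p / (1 - p))
        return total


clf = TinyBayes()
clf.learn("free viagra click now", "spam")
clf.learn("click here to win free money", "spam")
clf.learn("meeting notes attached, click the link", "ham")
clf.learn("lunch on friday?", "ham")

spam_score = clf.score("win free money")
ham_score = clf.score("lunch meeting on friday")
```

Here spam_score comes out positive and ham_score negative, while a token
the classifier has never seen gets a neutral 0.5 and contributes nothing.
The only thing the human supplied was the label on each whole message.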

--j.


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk