> Unless you do a LOT of training, I don't see how a purely Bayesian
> classifier can perform as well as something as multifaceted as
> SpamAssassin. (I wonder how well a purely Bayesian classifier
> does on the HTML image-only spams, for example.)

Unless I am mistaken, death2spam runs a service that one points a domain's MX record at, or forwards mail to, similar to Postini. They have probably already primed the pump with a very large corpus of messages. From my ignorant perspective, that would be analogous to auto_learn.

I know a guy from another list who builds appliances. He provides SpamBayes filtering for clients. He starts them with a site-wide database that includes a corpus of 3000 messages. That "system" database is then trained on a few mailboxes. He also provides the option to move individual users to a "personal" database.

Now one can always say this is a "pure" Bayesian system. However, as soon as one starts white-listing, is it really still pure Bayesian? The filtration engine may be pure Bayesian, but the system is not. Semantics is a woven web.

Now if the white-listing, black-listing, and virus scanning are performed before the filtration engine, the performance numbers increase dramatically. This is why I use Procmail. I perform checks in the following order: attachment, white-list, black-list, SA, header, and finally body (currently disabled). With very little effort I catch at least 99% of the spam. But far more messages are delivered by the white-list than ever reach SA or the other filters.

So the point is that everyone who uses SA implements it differently. SA is the system for some implementations, but for others it is only a component of the system. So we cannot look at a service like death2spam and judge it on the fact that it uses Bayesian classification as its sole filtration engine. We have no idea what the system implementation is.

--Larry

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
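
To make the "engine versus system" point concrete, here is a minimal Python sketch of the check ordering described in the post (attachment, white-list, black-list, then the scoring engine). It is not the actual Procmail setup. The addresses, the blocked extensions, and the toy `score_check` rule are all hypothetical stand-ins, and the header and body checks are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Message:
    sender: str
    body: str = ""
    attachments: List[str] = field(default_factory=list)


# Hypothetical example data; a real site would load these from its own lists.
WHITELIST = {"friend@example.com"}
BLACKLIST = {"spammer@example.net"}
BLOCKED_EXTENSIONS = (".exe", ".scr", ".pif")


# Each check returns "ham", "spam", or None (no opinion, fall through).
def attachment_check(msg: Message) -> Optional[str]:
    names = (name.lower() for name in msg.attachments)
    return "spam" if any(n.endswith(BLOCKED_EXTENSIONS) for n in names) else None


def whitelist_check(msg: Message) -> Optional[str]:
    return "ham" if msg.sender.lower() in WHITELIST else None


def blacklist_check(msg: Message) -> Optional[str]:
    return "spam" if msg.sender.lower() in BLACKLIST else None


def score_check(msg: Message) -> Optional[str]:
    # Toy stand-in for the statistical engine (SA or a Bayesian classifier).
    return "spam" if "viagra" in msg.body.lower() else None


# Order matters: cheap, decisive checks run before the scoring engine.
PIPELINE: List[Tuple[str, Callable[[Message], Optional[str]]]] = [
    ("attachment", attachment_check),
    ("white-list", whitelist_check),
    ("black-list", blacklist_check),
    ("scorer", score_check),
]


def classify(msg: Message) -> Tuple[str, str]:
    """Return (verdict, name of the check that decided)."""
    for name, check in PIPELINE:
        verdict = check(msg)
        if verdict is not None:
            return verdict, name
    return "ham", "default"


if __name__ == "__main__":
    print(classify(Message("friend@example.com", body="cheap viagra")))
    print(classify(Message("stranger@example.org", body="cheap viagra")))
```

In this sketch, a message from a white-listed sender is delivered without the scorer ever seeing it, so accuracy figures for the whole pipeline say little about the scoring engine on its own.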