> Unless you do a LOT of training, I don't see how a purely Bayesian 
> classifier can perform as well as something as multifaceted as 
> SpamAssassin. (I wonder how well a purely Bayesian classifier 
> does on the HTML image only spams for example)

Unless I am mistaken, death2spam runs a service that one points a domain's
MX record at, or forwards mail to, similar to Postini.  They have probably
already primed the pump with a very large corpus of messages.  From my
ignorant perspective, that would be analogous to auto_learn.

I know a guy from another list who builds appliances.  He provides
SpamBayes filtering for clients.  He starts them with a site-wide database
that includes a corpus of 3000 messages.  The "system" database is then
trained on a few mailboxes.  He provides the option to transfer individual
users to a "personal" database.  Now, one can always say this is a "pure"
Bayesian system.  However, as soon as one starts white-listing, is it really
still pure Bayesian?  The filtration engine may be pure Bayesian, but the
system is not.  Semantics is a woven web.  Now, if the white-listing,
black-listing, and virus scanning are performed before the filtration engine,
the performance numbers increase dramatically.  This is why I use Procmail.
I perform checks in the following order:  attachment, white-list,
black-list, SA, header, and finally body (currently disabled).  With very
little effort I catch at least 99% of the spam.  But far more messages are
white-listed than ever pass through SA or the other filters.
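
To make the ordering concrete, here is a rough Python sketch of that kind of
short-circuit chain.  It is not my actual Procmail setup: the addresses, the
blocked extensions, the header phrase, the threshold of 5.0, and the sa_score
field (standing in for whatever score SA would assign) are all made up for
illustration.  The only point is that the first check to reach a verdict
wins, so white-listed mail never gets as far as SA.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Message:
    sender: str
    subject: str = ""
    attachments: List[str] = field(default_factory=list)
    sa_score: float = 0.0  # assumed stand-in for a score SA would assign


# Hypothetical site policy, for illustration only.
WHITELIST = {"friend@example.org"}
BLACKLIST = {"bulk@example.net"}
BLOCKED_EXTENSIONS = (".exe", ".pif", ".scr")
SA_THRESHOLD = 5.0


def attachment_check(msg: Message) -> Optional[str]:
    # Reject outright if any attachment has a blocked extension.
    if any(name.lower().endswith(BLOCKED_EXTENSIONS) for name in msg.attachments):
        return "reject"
    return None


def whitelist_check(msg: Message) -> Optional[str]:
    return "deliver" if msg.sender.lower() in WHITELIST else None


def blacklist_check(msg: Message) -> Optional[str]:
    return "reject" if msg.sender.lower() in BLACKLIST else None


def sa_check(msg: Message) -> Optional[str]:
    return "spam" if msg.sa_score >= SA_THRESHOLD else None


def header_check(msg: Message) -> Optional[str]:
    # Made-up header rule: flag one phrase in the subject line.
    return "spam" if "free money" in msg.subject.lower() else None


# Order matters: attachment, white-list, black-list, SA, header.
# The body check is left out because it is currently disabled.
CHECKS: List[Tuple[str, Callable[[Message], Optional[str]]]] = [
    ("attachment", attachment_check),
    ("whitelist", whitelist_check),
    ("blacklist", blacklist_check),
    ("spamassassin", sa_check),
    ("header", header_check),
]


def classify(msg: Message) -> Tuple[str, str]:
    """Return (verdict, name of the check that decided it)."""
    for name, check in CHECKS:
        verdict = check(msg)
        if verdict is not None:
            return verdict, name
    return "deliver", "default"
```

With this arrangement a message from a white-listed sender is delivered no
matter what score the later stages would have given it, which is why the
numbers for the "system" look so much better than those for the engine alone.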

So the point is that everyone who uses SA implements it differently.  SA is
the system for some implementations, but for others it is a component of the
system.  So we cannot look at a service like death2spam and judge it based
on the fact that it uses Bayesian classification as its sole filtration
engine.  We have no idea what the system implementation is.

--Larry



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
