Hi,

On Tue, 2003-09-09 at 03:41, Kenneth Porter wrote:
> > - The results of the AI alone are as good as Spamassassin's results.
> >   Combined it is therefore better.
>
> What would make the combined result better?

My experience from running SpamAssassin together with Fitz in practice
is the following:

- SpamAssassin catches 90% or more of the spam, namely the spam that is
  NOT optimized to get past SpamAssassin.
- The rest is caught by Fitz. That rest is optimized spam plus things
  the user simply doesn't want.

One interesting experiment used the mail from an account for a
role-playing-game weekend. Subscriptions and questions were ham. Spam
was the usual, PLUS a role-playing-game newsletter with a lot of
announcements. Normally the newsletter is non-spam, and it looks like
ham: it talks about RPGs and even mentions the convention and where to
subscribe. But after learning two instances of it, it was classified as
spam.

> What does Fitz do different from SA?

The big new thing is a special tokenization. Many naive Bayes solutions
dissect the mail word by word, even the header. Fitz dissects every
field of the header in its own way. It doesn't learn

  -0700

but

  Time-zone = -0700

That way it gets more information out of a mail. The Date header alone
gives us:

  Mon,         => When does the user normally get mail? Job accounts
                  get less ham on weekends.
  08 Sep 2003  => Not really relevant.
  18:41:33     => A lot of spam is written between midnight and about
                  5 o'clock.
  -0700        => Time zone. Interesting for firms that only have
                  local partners.

And this is only the Date header.

I also tried not to use Paul Graham's naive Bayes but as much of the
AI-book-standard naive Bayes as possible. I had to alter it a bit for
my special tokenization, but not much.

Thorsten Sick

-- 
Thorsten Sick [EMAIL PROTECTED]
www.hort-des-wissens.de
Winter is coming
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d-- s++:- a-- C++ UL+++ P+++ L+++ E W++ N o K w--- O-- M- V- PS+
PE- Y+ PGP++ t 5+++ X+ R+ !tv b++++ DI- D G e+ h-- r++ y?
------END GEEK CODE BLOCK------

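As an illustration of the field-aware tokenization of the Date header
described above, here is a minimal sketch in Python. It is not Fitz's
actual code: the function names and the token labels ("Date.weekday",
"Date.hour", "Date.timezone") are assumptions made up for this example,
and the date itself is left out because the message calls it not really
relevant.

```python
from email.utils import parsedate_to_datetime

# Fixed English abbreviations so the tokens do not depend on the locale.
WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def tokenize_words(value):
    """Plain word-by-word split, shown only for contrast."""
    return value.split()


def tokenize_date_header(value):
    """Turn a Date header value into field-tagged tokens.

    The labels are hypothetical; the point is that each token carries
    the name of the field it came from.
    """
    dt = parsedate_to_datetime(value)
    return [
        "Date.weekday=" + WEEKDAYS[dt.weekday()],
        "Date.hour=%02d" % dt.hour,
        "Date.timezone=" + dt.strftime("%z"),
    ]


header = "Mon, 08 Sep 2003 18:41:33 -0700"
print(tokenize_words(header))
print(tokenize_date_header(header))
```

With the word split, the classifier only sees a bare "-0700" that could
come from anywhere in the mail. With the tagged tokens it sees
"Date.timezone=-0700", so the same string in a different field counts
as a different feature.
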
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk