Mark Martinec said:

> Although Bayesian works best when it is trained for a particular
> user, it is still _very_ useful with a single site-wide database.
> Given a larger set of ham/spam messages to train, the lack
> of specialization can be compensated to some degree.
>
> AWL on the other hand (as I understand it) is only useful
> as a per-recipient information. (but I may be wrong here)

Yep, pretty much correct. The AWL should work OK across all recipients too, though; Bayes will just work better.

> | (I assume that md5-hash has something to do with this) then
>
> Well, not directly, it solves the opposite problem.
> The cache of body digests is used to save time on calling
> SpamAssassin and virus checkers when the same message content
> comes-in as separate messages, close one after the other,
> such as with some poor-man's mailing lists or in spam bursts.

BTW, quite a lot of big list hosts do this too, as a kind of bounce-handling technique. Here's why: some very large "big name" ISPs do not give decent "address no longer exists" bounce reports when they receive a single SMTP message addressed to multiple recipients at the site. As a result, some list-sending sites now use per-recipient Message-Ids, From addresses, Errors-To addresses, etc., and send one mail per recipient, in order to figure out which recipient is bouncing.

There's also something called VERP, which I think is related, but I can't remember what that stands for ;)

--j.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
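
For anyone wondering what the body-digest cache quoted above might look like in practice, here is a rough Python sketch. It is not the actual amavisd or SpamAssassin code: the names `BodyDigestCache` and `scan_with_cache`, the 10-minute lifetime, and the `scanner` callback are all made up for illustration. The only idea taken from the thread is "hash the body with MD5 and reuse the earlier verdict if the same body shows up again shortly afterwards".

```python
import hashlib
import time


class BodyDigestCache:
    """Hypothetical cache of scan verdicts keyed by the MD5 digest of a message body."""

    def __init__(self, ttl_seconds=600.0, clock=time.monotonic):
        # ttl_seconds is an assumed lifetime; the thread only says "close one after the other".
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = {}  # digest -> (expiry time, cached verdict)

    @staticmethod
    def digest(body: bytes) -> str:
        # Only the body is hashed, so per-recipient header differences do not matter.
        return hashlib.md5(body).hexdigest()

    def lookup(self, body: bytes):
        """Return the cached verdict for this body, or None if absent or expired."""
        key = self.digest(body)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, verdict = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return None
        return verdict

    def store(self, body: bytes, verdict) -> None:
        """Remember the verdict for this body until the lifetime runs out."""
        self._entries[self.digest(body)] = (self._clock() + self.ttl_seconds, verdict)


def scan_with_cache(cache, body, scanner):
    """Return (verdict, was_cached); call the expensive scanner only on a cache miss."""
    cached = cache.lookup(body)
    if cached is not None:
        return cached, True
    verdict = scanner(body)
    cache.store(body, verdict)
    return verdict, False
```

Because only the body is hashed, a list run that sends one copy per recipient with different Message-Ids and envelope addresses would still hit the cache, provided the body bytes are identical; any per-recipient text in the body (an unsubscribe link, say) would defeat it.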