on Thu, Aug 28, 2003 at 03:09:48PM +0200, Andreas Metzler ([EMAIL PROTECTED]) wrote:
> Karsten M. Self <kmself@ix.netcom.com> wrote:
> [...]
> > SpamAssassin achieves a false-positive rate (non-spam reported as spam)
> > of 5% with a default threshold of 5. This can be dramatically improved
> > using a whitelist, to ~98% in my experience. This is not the best
> > performance of all filters, so makes a somewhat generous threshold.
> >
> > http://www.spamassassin.org/dist/rules/STATISTICS.txt
> > http://freshmeat.net/articles/view/964/
> >
> > So a spam-reduction system user would at worst see a typical rate of 2%
> > of spam to be manually disposed of.
> [...]
>
> You are mixing up percentages. "5% non-spam reported as spam" ... can
> be ... improved to ~98% ...

Correct. And yes, I was thinking "false-negative": spam not flagged as
spam.

What I meant to say was this:

  - Currently feasible content-based filters plus whitelists can achieve
    a false-negative rate of 2% (spam passing to the inbox), according
    to independent tests.

  - A C-R system should then target having no more than 2% of challenges
    sent be misdirected (based on spoofed headers, etc.).

At this rate, it's still transferring burden inappropriately, but at a
level that matches a reasonable-case technological alternative.

This also achieves a secondary goal in the interests of C-R proponents:
keeping the incidence of false challenges low enough that recipients
would be likely to respond to the challenge.

> When I last checked my personal rate with spamassassin 2.55 with
> default rules and no DNS lists or razor (but including a rather well
> trained bayesian filter) and a default threshold of 5, I came up with
> these numbers[1]:
> * 0% false positives, i.e. ham sorted into the spam folder
> * 10% of the spam was not recognized as such and I had to filter it
> out by hand.

I use a whitelisting system. It's based on Lars Wirzenius's spamfilter
package, my local addition being a shell script that scans messages for
senders to add to white, black, grey, or spam lists. Mail from
previously unknown senders ends up in a "grey" box. The principle is
the same as C-R, except that assessment is done by me, rather than a
third party.

Peace.

-- 
Karsten M. Self <kmself@ix.netcom.com>        http://kmself.home.netcom.com/
 What Part of "Gestalt" don't you understand?
    Verio webhosting?  Guaranteed downtime:
    http://www.wired.com/news/politics/0,1283,57011,00.html
    http://www.dowethics.com/r/environment/freedom.html
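
As a rough illustration of the sender-based triage described in the message above: this is a minimal Python sketch, not the shell script mentioned there, whose contents are not shown. The list contents, the folder names, and the `triage` function are all hypothetical, and the four lists from the message are simplified to a whitelist and a blacklist, with everything else held in the "grey" box for manual review.

```python
from email.utils import parseaddr

# Hypothetical in-memory lists; the real storage format is not described
# in the message.
WHITELIST = {"friend@example.org"}
BLACKLIST = {"spammer@example.net"}


def triage(from_header: str) -> str:
    """Pick a folder for a message using only its From: header."""
    # parseaddr splits 'Name <addr>' into (name, addr).
    _, addr = parseaddr(from_header)
    addr = addr.lower()
    if addr in WHITELIST:
        return "inbox"
    if addr in BLACKLIST:
        return "spam"
    # Previously unknown sender: hold for manual assessment.
    return "grey"
```

Because the decision rests on the From: header alone, a spoofed header defeats it in the same way it misdirects a C-R challenge; the difference is that here the cost falls on the list owner rather than on a third party.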