Jason Haar writes:
> Justin Mason wrote:
> > However: it's important for SpamAssassin developers and mass-checkers to
> > get a "representative" feed of spam -- with all kinds of spam included --
> > so that the rules are measured against something close to reality.  
> On a related note, we actually *stopped* using front-line RBLs as with
> them in place, we were no longer able to get true stats as to the actual
> flow of Spam/Ham into our sites. Which meant that we really couldn't
> tell how effective our antispam systems were being. The "broad axe" that
> is RBL meant that a single mail message coming from servers may be
> blocked dozens of times (as it retries), meaning that our stats would
> over-represent the effectiveness of front-line RBL methods. Now we just
> let it all hit SpamAssassin, and have simply upped the score on those
> RBLs we used to trust to reject directly, so that the Spam doesn't get
> any further. End result: no delivery changes - but better quality stats.

Yes, that's a closely related problem. Using front-line RBLs (or other
SMTP-time discard tactics like an early-talker test) distorts your view of
your incoming spam.  Worse than that, you effectively have no way to
accurately estimate false-positive (FP) rates, since the rejected mail is
never kept for inspection -- you have to guess based on rejection figures
added to the more accurate SpamAssassin-tagged corpora.
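
To make the retry inflation concrete, here's a rough sketch. It assumes a
hypothetical log with one entry per SMTP-time rejection, keyed by (client IP,
envelope sender, recipient); the field choice and the sample data are made up
for illustration, and the key is only a proxy for "same message", since mail
rejected before DATA has no Message-ID to match on.

```python
from collections import Counter

# Hypothetical rejection log: one tuple per SMTP-time RBL rejection.
# The (client IP, envelope sender, recipient) fields are an assumption
# made for this illustration, not taken from any real log format.
rejections = [
    ("192.0.2.10", "a@example.com", "user@example.org"),
    ("192.0.2.10", "a@example.com", "user@example.org"),  # retry
    ("192.0.2.10", "a@example.com", "user@example.org"),  # retry
    ("198.51.100.7", "b@example.net", "user@example.org"),
]


def rejection_stats(events):
    """Return (raw attempts, approximate distinct messages)."""
    per_message = Counter(events)  # identical tuples collapse into one key
    return len(events), len(per_message)


attempts, distinct = rejection_stats(rejections)
print(f"{attempts} rejected attempts, about {distinct} distinct messages")
```

Counting raw attempts credits the RBL with every retry; collapsing them
gives a figure closer to what a content scanner would report for the same
traffic.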

> Obviously you have to have over-specced your mail servers to be able to
> do this - something poor old Justin can't manage I think :-)

Yeah.  If I could persuade someone to donate a server just for *my*
personal mail, that'd solve it, but in the meantime, not so much ;)

(Actually, we recently upgraded the RAM, so it looks like it can probably
cope with the volume again.)

> (FYI: picking a random user of ours and looking at all Internet email
> they received in Aug 2006 showed SA had >99% success rate at tagging
> Spam. 85% was quarantined (scores >10/5) and the rest tagged for the
> users to filter on. Also, ZERO ham misclassification - which is
> something certain commercial competitors to SpamAssassin are actually
> pretty bad at...)

wow, that's really good!

> Now if only it could deal with this storm of "VIiiagra"/"VIragra" spam
> that has been sneaking in... :-)

yep, working on those ;)

--j.
