On 9/9/10 7:46 AM, "RW" <rwmailli...@googlemail.com> wrote:
> On Wed, 8 Sep 2010 16:02:10 -0700 (PDT)
> John Hardin <jhar...@impsec.org> wrote:
>
>> On Wed, 8 Sep 2010, RW wrote:
>
>>> What's the reason for the age limit?
>>
>> The nature of spam (and, to a lesser degree, ham, barring major
>> changes like the widespread adoption of HTML email) changes over
>> time. A rule that hit lots of spam and had a good S/O three years ago
>> (e.g. the multilayer obfuscated image pharma spams that were all the
>> rage a few years back) might hit nearly nothing today.
>
>
> Would it not be sensible to keep ham for as long as necessary, and
> supplement the spam corpus with spamtraps?

No. One maxim of the corpus is that it must be hand inspected.

Ham is plentiful - I get 20-50 hams a day in my personal mailbox, and
around a thousand a day in my business mailbox. It just takes a little
discipline on a few people to sort out and keep the ham, then run the
nightly mass-checks.

The current rules are 39 months before the ham ages out. I should be
able to eventually build and keep a 30-40 thousand ham library just by
tossing my read mail into a different bucket than the deleted items
folder.

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281