On 9/9/10 7:46 AM, "RW" <rwmailli...@googlemail.com> wrote:

> On Wed, 8 Sep 2010 16:02:10 -0700 (PDT)
> John Hardin <jhar...@impsec.org> wrote:
> 
>> On Wed, 8 Sep 2010, RW wrote:
> 
>>> What's the reason for the age limit?
>> 
>> The nature of spam (and, to a lesser degree, ham, barring major
>> changes like the widespread adoption of HTML email) changes over
>> time. A rule that hit lots of spam and had a good S/O three years ago
>> (e.g. the multilayer obfuscated image pharma spams that were all the
>> rage a few years back) might hit nearly nothing today.
> 
> 
> Would it not be sensible to keep ham for as long as necessary, and
> supplement the spam corpus with spamtraps?

No.  One maxim of the corpus is that every message in it must be
hand-inspected.

Ham is plentiful: I get 20-50 hams a day in my personal mailbox, and around
a thousand a day in my business mailbox.  It just takes a little discipline
from a few people to sort out and keep the ham, then run the nightly
mass-checks.  Under the current rules, ham ages out after 39 months.  I
should eventually be able to build and keep a library of 30-40 thousand
hams just by tossing my read mail into a different bucket than the deleted
items folder.
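
As a rough sanity check on that library size, the arithmetic can be sketched
as below.  This is only a back-of-the-envelope estimate, not part of the
mass-check tooling: the function name and the 30.44-day average month are
assumptions made for illustration, and it assumes a constant daily rate of
kept ham.

```python
# Back-of-the-envelope estimate of a ham corpus at steady state, i.e. once
# the oldest messages start ageing out as fast as new ones are added.

RETENTION_MONTHS = 39    # age limit for ham quoted above
DAYS_PER_MONTH = 30.44   # average month length (assumption)


def steady_state_corpus(hams_per_day: float,
                        months: int = RETENTION_MONTHS) -> int:
    """Messages held once the retention window is full."""
    return round(hams_per_day * months * DAYS_PER_MONTH)


if __name__ == "__main__":
    # The personal-mailbox range quoted above: 20-50 hams a day.
    for rate in (20, 50):
        print(f"{rate} hams/day -> about {steady_state_corpus(rate)} messages")
```

At 20-50 kept hams a day this gives very roughly 24 to 59 thousand messages,
so a library of 30-40 thousand corresponds to keeping about 25-34 hams a day
over the full 39 months.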

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281


