On Wed, Sep 16, 2009 at 8:16 PM, Daryl C. W. O'Shea
<spamassas...@dostech.ca> wrote:
[snip]
> now hope to do this Thursday/Friday.  I should be able to scan my
> million or so messages in a day on my cluster.

Wow, that makes me feel inadequate :)  I'm struggling to clean up my
little ham sample of 3600 messages, and looking at another couple
thousand that I'll do if I've got time...

Also, I need some advice, if someone can provide it.  I'm looking at a
message (I have several like this in my corpus at present) which
generates the following log line:

.  1 /home/gems/ham//cur/n8500ejj019591:2,S
MISSING_DATE,MISSING_HEADERS,MISSING_MID,T_FSL_HELO_NON_FQDN_2,__DKIM_DEPENDABLE,__DNS_FROM_RFC_ABUSE,__DOS_DIRECT_TO_MX,__DOS_HAS_ANY_URI,__DOS_RCVD_FRI,__DOS_SINGLE_EXT_RELAY,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_RCVD,__HAS_SUBJECT,__HAVE_BOUNCE_RELAYS,__LAST_EXTERNAL_RELAY_NO_AUTH,__LAST_UNTRUSTED_RELAY_NO_AUTH,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__NONEMPTY_BODY,__NUMBERS_IN_SUBJ,__RCVD_IN_2WEEKS,__RFC_IGNORANT_ENVFROM,__TO_NO_ARROWS_R,__TVD_BODY
learn=ham,time=1252108840,scantime=1,format=f,reuse=no,set=1

It's clearly a poorly constructed message, but it's also clearly ham
(it originated from an application that someone somewhere in my
organization runs).  It has exactly one header, Subject, followed by a
body.  Should I leave messages like this in the ham corpus?  It is ham,
but...
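
To get a list of every message like this for review, something along these lines might help. It is only a rough sketch: it assumes each log entry is a single line of five whitespace-separated fields (flag, score, path, comma-separated rule names, comma-separated key=value metadata), which is inferred from the sample above and not from any documentation. The names MALFORMED_RULES, parse_log_line and malformed_hits are made up for illustration, and the sample string is the entry above with its rule list shortened.

```python
# Rules that suggest a structurally broken message; this set is an
# assumption chosen from the sample entry, adjust to taste.
MALFORMED_RULES = {"MISSING_DATE", "MISSING_HEADERS", "MISSING_MID"}


def parse_log_line(line):
    """Split one log entry into fields (layout assumed from the sample)."""
    flag, score, path, rules, meta = line.split(None, 4)
    return {
        "flag": flag,
        "score": float(score),
        "path": path,
        "rules": set(rules.split(",")),
        # Metadata looks like key=value pairs joined by commas.
        "meta": dict(item.split("=", 1) for item in meta.strip().split(",")),
    }


def malformed_hits(entry):
    """Return the malformed-message rules this entry hit, sorted by name."""
    return sorted(entry["rules"] & MALFORMED_RULES)


# Shortened version of the entry above, joined onto one line.
sample = (
    ". 1 /home/gems/ham//cur/n8500ejj019591:2,S "
    "MISSING_DATE,MISSING_HEADERS,MISSING_MID,__HAS_SUBJECT "
    "learn=ham,time=1252108840,scantime=1,format=f,reuse=no,set=1"
)

entry = parse_log_line(sample)
print(entry["path"], malformed_hits(entry))
```

Running parse_log_line over every line of the log and keeping the entries where malformed_hits is non-empty would give the set of messages to decide on in one pass.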

thanks in advance for any guidance,
Austin.
