alan premselaar <[EMAIL PROTECTED]> writes:

> I pre-sorted (by hand) the ham and spam into 2 different mbox files
> using an IMAP client.  when I run mass-check ham:mbox:/path/to/ham it
> runs fine (i use the --progress option) and then i run the sort
> command as in the CORPUS_SUBMIT documentation and it shows a bunch of
> SA tests and stuff in the file.

Egads!  I hope you didn't hand-sort the entire corpus.  It's much easier
to sort it using a SpamAssassin mass-check first, then pick out the
miscategorizations.  Speaking of which, I find that I have to scan the
entirety of the corpus by Subject/From/whatever and not just the top N
messages as suggested in the documentation (especially to find all the
false negatives in the ham file).
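
For the scanning step, a rough sketch along these lines can help.  It
is not part of SpamAssassin; it just uses Python's standard mailbox
module to list the From and Subject of every message in an mbox file so
the whole corpus can be eyeballed in one pass.  The function name
summarize_mbox is made up for illustration.

```python
import mailbox


def summarize_mbox(path):
    """Yield (index, sender, subject) for every message in the mbox at path.

    Hypothetical helper, not a SpamAssassin tool.
    """
    box = mailbox.mbox(path, create=False)
    try:
        for i, msg in enumerate(box, start=1):
            # str() also copes with headers that come back as Header objects.
            sender = str(msg.get("From", "")).strip()
            subject = str(msg.get("Subject", "")).strip()
            yield i, sender, subject
    finally:
        box.close()


def print_summary(path):
    """Print one line per message, truncated so it fits a terminal."""
    for i, sender, subject in summarize_mbox(path):
        print(f"{i:6d}  {sender[:30]:<30}  {subject[:60]}")
```

Calling print_summary("/path/to/ham") and paging through the output is
one way to spot spam that has ended up in the ham file.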

Although, I can see the utility of using a drag and drop GUI (like a
mail client) to do the hand-filtering.  It's a pain in the neck to move
between two windows and type "mv" commands all afternoon long.

> when i run mass-check spam:mbox:/path/to/spam it also runs fine, and i
> run the sort command and it appears to be working ok, however, after
> doing that my ham.log file contains only the following:
>
> # mass-check results from [EMAIL PROTECTED], on Thu Apr  3
> 08:24:06 UTC 2003
> # M:SA version 2.60-cvs
> # CVS tag: $Name: CURRENT_CORPORA_SUBMIT_VERSION $
>
> is this normal?

Nope.  Does "mass-check --mbox /path/to/spam" produce any output?

My guess would be that the file your IMAP client saved is not actually
in mbox format.  If that's not it, maybe try using tools/mboxsplit to
split the mbox files into separate message files and use "dir" instead
of "mbox"?
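
To check the mbox theory independently of mass-check, something like
the following sketch might help.  It is an assumption-laden stand-in,
not tools/mboxsplit itself: it uses Python's standard mailbox module to
see whether the file starts with a "From " separator line, count the
messages the parser finds, and optionally write each message to its own
file.  All three function names are hypothetical.

```python
import mailbox
import os


def looks_like_mbox(path):
    """True if the first non-blank line starts with the mbox 'From ' separator."""
    with open(path, "rb") as fh:
        for line in fh:
            if line.strip():
                return line.startswith(b"From ")
    return False  # empty file


def count_mbox_messages(path):
    """Return how many messages Python's mbox parser finds in path."""
    box = mailbox.mbox(path, create=False)
    try:
        return len(box)
    finally:
        box.close()


def split_mbox(path, outdir):
    """Write each message to a numbered file in outdir; return the count."""
    os.makedirs(outdir, exist_ok=True)
    box = mailbox.mbox(path, create=False)
    count = 0
    try:
        for count, msg in enumerate(box, start=1):
            with open(os.path.join(outdir, f"{count:06d}"), "wb") as out:
                out.write(msg.as_bytes())
    finally:
        box.close()
    return count
```

If looks_like_mbox() returns False or the count is 0, the file is
probably not mbox at all, which would explain a log with only the
header comments in it.  If the count looks right, the directory written
by split_mbox() can be fed to mass-check with "dir" in place of "mbox".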

Daniel

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux, and open
http://www.pathname.com/~quinlan/   source consulting (looking for new work)

