Don't worry about the fact that the spam messages are attachments of
the report (and that the attachment has no headers). sa-learn will
realise this and strip the report out; at least that's what I've been
told, and Bayes is working a treat for me, with a spam corpus of 3300
messages extracted using libpst and PST Export in Outlook, together
with mostly auto-learned ham.
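
If you would rather unwrap the attached spam yourself before testing
rules, something along these lines might do it. This is only a rough
sketch, not a tested tool: it assumes the Outlook folder has already
been converted to an mbox file, and that each report carries the
original spam as a message/rfc822 attachment (attachments stored as
plain text would need different handling). The function names and the
file names are made up for illustration.

```python
import mailbox


def attached_messages(msg):
    """Yield the messages attached to msg as message/rfc822 parts.

    Does not descend into an attached message, so anything nested
    inside the spam itself is left alone.
    """
    if msg.get_content_type() == "message/rfc822":
        payload = msg.get_payload()
        # A parsed message/rfc822 part holds a one-element list
        # containing the attached message.
        if isinstance(payload, list) and payload:
            yield payload[0]
        return
    if msg.is_multipart():
        for part in msg.get_payload():
            yield from attached_messages(part)


def unwrap_mbox(src_path, dst_path):
    """Copy every attached message from src_path into dst_path.

    Returns the number of messages written.
    """
    src = mailbox.mbox(src_path)
    dst = mailbox.mbox(dst_path)
    count = 0
    try:
        for report in src:
            for original in attached_messages(report):
                dst.add(original)
                count += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return count
```

Calling unwrap_mbox("reports.mbox", "spam.mbox") (both names
hypothetical) should leave an mbox of bare spam messages, which could
then be handed to sa-learn with --spam --mbox or used for rule testing.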


-----Original Message-----
From: Colin A. Bartlett [mailto:[EMAIL PROTECTED]]
Sent: 30 October 2003 12:17
To: [EMAIL PROTECTED]
Subject: [SAtalk] outlook corpus?


All,

I have amassed a pretty large corpus of spam (over 16,000 messages).
Unfortunately they are just in a folder in my Outlook. (Sorry. I like
it.) And, they are all attached to the SA-generated report message.
I've mined Google and the archives. I've found some references to
converting Outlook files to mbox format. That would work I believe
except that the messages I want are really attached to those in my
Outlook. Anyone know of a tool or method of generating an mbox or
other file from my Outlook messages for use in checking rules and
such?

I'd hate to have to throw out my 16,000 spams as they are pretty varied,
hand verified, and would be great for testing.

cheers,
Colin

Colin A. Bartlett
Kinetic Web Solutions
www.kineticweb.biz



-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive?  Does it
help you create better code?   SHARE THE LOVE, and help us help
YOU!  Click Here: http://sourceforge.net/donate/
_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk


