> I have a script that can take spam/ham messages forwarded as attachments
> from Outlook and turn them into individual rfc822 files.  It allows
> external users to send me Outlook spam/ham for review.  I will in turn
> feed sa-learn with those messages once vetted.  That part of the process
> is getting me the messages intact, as far as I can tell, as the user
> received them.

As long as you are aware: once Outlook touches a message, nothing
original is left. It adds, removes, and reorders headers and modifies
MIME parts (even HTML).

> I could pipe those messages to sa-learn directly; that's what the script
> is designed to do.  But I don't trust the user's submissions, and prefer
> to review first.  FYI, the script that handles the separation of the
> attachments is from here:

For reviewing, this sounds OK. But I am unsure what all the Outlook
mangling does to the effectiveness of sa-learn. I guess it's better than
nothing, as most of the body tokens are probably still the same...

Any 'Outlook' tokens learned from these messages are probably balanced by
training on ham that arrives from people using Outlook, so I guess that
should cause no problem... right?

> http://www.localside.net/sal-wrapper/
>
> I would like to turn around and put those individual messages back into
> mbox format, again, without changing their original headers.  Anyone
> have a script or a method which will accomplish that?  I tried to figure
> out how to do it but was unsuccessful.
>
>
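
For the mbox part, here is a minimal sketch in Python, not something
taken from sal-wrapper. It assumes each input file holds exactly one raw
rfc822 message and does not already start with a "From " envelope line.
The function name append_to_mbox and the MAILER-DAEMON placeholder sender
are my own choices. It never parses the headers; it only writes a
separator line, escapes "From " lines (mboxrd style), and makes sure each
message ends with a blank line.

```python
import re
import sys
import time
from pathlib import Path

# mboxrd quoting: a line starting with zero or more ">" followed by
# "From " gets one extra ">" so it is not read as a message separator.
# Real headers are untouched because "From:" has a colon, not a space.
_FROM_LINE = re.compile(rb"^(>*From )", re.MULTILINE)


def append_to_mbox(mbox_path, message_paths, sender="MAILER-DAEMON"):
    """Append each rfc822 file to mbox_path without rewriting its headers."""
    with open(mbox_path, "ab") as mbox:
        for path in message_paths:
            # Normalise CRLF to LF; otherwise the bytes are copied as-is.
            raw = Path(path).read_bytes().replace(b"\r\n", b"\n")
            # Separator line: "From <sender> <asctime date>".
            stamp = time.asctime(time.gmtime()).encode("ascii")
            mbox.write(b"From " + sender.encode("ascii") + b" " + stamp + b"\n")
            mbox.write(_FROM_LINE.sub(rb">\1", raw))
            # Every message in an mbox must be followed by a blank line.
            mbox.write(b"\n" if raw.endswith(b"\n") else b"\n\n")


if __name__ == "__main__" and len(sys.argv) > 2:
    # Hypothetical usage: python files2mbox.py out.mbox msg1 msg2 ...
    append_to_mbox(sys.argv[1], sys.argv[2:])
```

The resulting file can then be given to sa-learn with its --mbox option.
If the only goal is training, sa-learn should also accept a directory of
individual message files, so the mbox step may not be needed at all.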
