Peter Marshall wrote:
Kevin Sullivan wrote:

--On 02/03/05 01:59:21 +0100 Sander Holthaus - Orange XL wrote:

I'm interested in letting customers manually train the
SpamAssassin Bayes filter on ham and spam (to reduce false positives and
negatives). However, I can only find documentation on this for local
mailboxes and IMAP. Most users, however, retrieve their mail through POP
and use Outlook (Express) as their mail client. Is there a way to train
SpamAssassin with such a setup (e.g. forwarding mail from Outlook
(Express) using SMTP)?



If you want to do a lot of programming, you could save all incoming messages for a few days in a database somewhere. When a user forwards a message to a special "ham" or "spam" mailbox, you pull the Message-ID from the forwarded message and use it to recover the original message from your database.
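A rough sketch of that lookup, using only core Perl, with an in-memory hash standing in for the database. The names %store, message_ids, remember_message and recover_original are made up for illustration. It assumes the forwarded copy still contains the original Message-ID header line somewhere in its text (likely for forward-as-attachment; inline forwards may not keep it), and it only handles unfolded Message-ID headers.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the database: Message-ID => raw message text.
my %store;

# Return every Message-ID value found at the start of a line in the text.
sub message_ids {
    my ($text) = @_;
    return $text =~ /^Message-ID:[ \t]*(<[^>\s]+>)/gim;
}

# Save an incoming message under its own Message-ID (the first one seen).
sub remember_message {
    my ($raw) = @_;
    my ($id) = message_ids($raw);
    $store{$id} = $raw if defined $id;
    return $id;
}

# Given a forwarded message, return the stored original it refers to.
# The forward has its own Message-ID too, so try every ID found in it.
sub recover_original {
    my ($forwarded) = @_;
    for my $id (message_ids($forwarded)) {
        return $store{$id} if exists $store{$id};
    }
    return;
}
```

In a real setup, remember_message would run at delivery time and write to a table with an expiry of a few days, and recover_original would run when mail arrives at the "ham" or "spam" mailbox.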


-Kevin


My question is the same as Henrik's: I have a bunch of email that is spam (either tagged by SpamAssassin or not tagged at all). I forwarded it as an attachment to a "spam" mailbox. What do I have to do now before I can get Bayes to learn the messages? I read that you have to remove the headers. Could anyone give me a little more detail?

I use a modified version of the DMZS-sa-learn.pl from: http://www.dmzs.com/tools/files/spam.phtml


When someone forwards a spam to me, I move the message to a special IMAP folder that gets processed by the script. My additions look something like:

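use Email::MIME;
...
# Parse the forwarding message; $raw_message_body holds its full raw text.
my $msg = Email::MIME->new($raw_message_body);

# Learn from each top-level part that is an attached message.
foreach my $part ($msg->parts) {
  my $type = $part->content_type || '';
  if ($type =~ m|^message/rfc822|i) {
    sa_learn($part->body_raw);
  }
}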


I've tested this with messages forwarded as attachments from Outlook and Thunderbird. I'm not sure how effective it is, though; I'm sure it still loses something in translation. Going all-IMAP is really the way to go if you can.
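The sa_learn routine itself isn't shown above. One way it could look, as a guess rather than what the DMZS script actually does: write the extracted message to a temporary file and pass it to the sa-learn command with --spam or --ham. The helper sa_learn_command and the optional $class argument are assumptions added for this sketch.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Build the sa-learn argument list for one message file.
# $class must be 'spam' or 'ham'.
sub sa_learn_command {
    my ($class, $file) = @_;
    die "class must be 'spam' or 'ham'\n" unless $class =~ /\A(?:spam|ham)\z/;
    return ('sa-learn', "--$class", $file);
}

# Hypothetical helper: write the extracted message to a temp file and
# hand it to sa-learn. Returns true if sa-learn exited with status 0.
sub sa_learn {
    my ($raw_message, $class) = @_;
    $class = 'spam' unless defined $class;
    my ($fh, $file) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} $raw_message;
    close $fh or die "close $file: $!\n";
    return system(sa_learn_command($class, $file)) == 0;
}
```

This has to run as the user whose Bayes database should be updated, otherwise sa-learn trains a different database than the one the scanner reads.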



Stuart Johnston
