Good day to all! I have a Red Hat 8 server that serves mail for my domain.
I have been wrestling with a way to train my filter on spam and ham. I read that it isn't prudent to train the filter with mail that has been forwarded to another mailbox, since the forwarded copy has modified headers and the result is a badly trained filter. That made me paranoid when I noticed that Evolution itself adds header information to the messages it stores, so training directly from the Evolution mbox files could be problematic too. I devised a method that others may find useful, so I thought I would share it.

In my case, my mail folders are in /root/evolution/local. I created a spam folder to which I move my spam. Since I read my mail under the root account, it resides in /root/evolution/local/spam/mbox.

My idea was to add an entry to my procmailrc that also saves all incoming mail in /var/spool/mail/allmail. This preserves the original, unmodified mail until I can use it to train the filter. However, I didn't want to dig through that file with vi to sort the ham from the spam.

So I wrote a program that scans the mail in spam/mbox or Inbox/mbox and builds a linked list of the Message-IDs of the messages it finds there. Next, it opens the allmail file and copies to stdout each raw message whose Message-ID matches one in that list.

To train, I use Evolution to move all the spam from my Inbox folder to my spam folder. Then I run:

    mailcut /root/evolution/local/Inbox/mbox \
        /var/spool/mail/allmail >hambunch

followed by:

    sa-learn --ham --mbox hambunch

followed by:

    mailcut /root/evolution/local/spam/mbox \
        /var/spool/mail/allmail >spambunch

followed by:

    sa-learn --spam --mbox spambunch

The code is published at www.heggood.com/mailcut.html

If anyone notices anything off-track in my thinking, please let me know.

Regards,
-steve-
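
For readers who want to see the matching step spelled out, here is a minimal Python sketch of the idea described in this post. It is not the mailcut program published at the URL given in the post. It assumes Python's standard mailbox module, uses a set where the post describes a linked list, and the function names message_ids and cut are made up for illustration. It also assumes both files are ordinary mbox files and that every message of interest carries a Message-ID header.

```python
import mailbox
import sys


def message_ids(mbox_path):
    """Collect the Message-ID header of every message in an mbox file."""
    ids = set()
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            msg_id = msg.get("Message-ID")
            if msg_id:
                ids.add(str(msg_id).strip())
    finally:
        box.close()
    return ids


def cut(source_path, archive_path, out):
    """Copy to `out` each archive message whose Message-ID is in the source.

    `out` must be a binary stream. Returns the number of messages copied.
    """
    wanted = message_ids(source_path)
    archive = mailbox.mbox(archive_path, create=False)
    copied = 0
    try:
        for key in archive.keys():
            msg_id = archive[key].get("Message-ID")
            if msg_id and str(msg_id).strip() in wanted:
                # get_bytes returns the bytes as stored in the archive,
                # including the "From " separator line, so the headers
                # are not re-serialized on the way out.
                out.write(archive.get_bytes(key, from_=True))
                # Blank line between messages keeps the output valid mbox.
                out.write(b"\n")
                copied += 1
    finally:
        archive.close()
    return copied


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): mailcut.py SOURCE_MBOX ARCHIVE_MBOX
    cut(sys.argv[1], sys.argv[2], sys.stdout.buffer)
```

Saved under a name such as mailcut.py (hypothetical), it could be run with the same two arguments as in the commands in this post, with stdout redirected to hambunch or spambunch. Messages without a Message-ID header are silently skipped in this sketch, so they would never reach sa-learn.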
_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk