Re: [SAtalk] Maintaining Bayes from an MUA

Robert Nicholson Thu, 18 Dec 2003 16:00:36 -0800

Right now I've got qmail putting all incoming mail through a Perl script, where I use Mail::Audit to filter the mail based on certain criteria and then programmatically invoke Mail::SpamAssassin to determine what is spam. I also maintain my own whitelist using this filter. I want to be able to send commands from my MUA to do things like the ones you suggest: forget, learn spam, learn ham.

Currently anything that passes my whitelist criteria automatically gets learnt as ham, and likewise all spam gets learnt as spam, so I only have to deal with false positives and false negatives, which basically means forgetting the message and then learning it as spam or ham. So it really becomes a process of identifying the message by Message-ID in order to recreate the original message so it can be learnt correctly.
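As a rough illustration of the "identify the message by Message-ID" step, here is a minimal Python sketch that scans a Maildir for a matching header and returns the raw message. It uses only the standard `mailbox` module rather than the Perl modules mentioned in this thread; the function name `find_by_message_id` is made up for the example, and a linear scan like this is an assumption, not how Courier IMAP looks anything up.

```python
import mailbox
from typing import Optional


def find_by_message_id(maildir_path: str, message_id: str) -> Optional[bytes]:
    """Return the raw bytes of the first message whose Message-ID matches.

    Hypothetical helper: does a linear scan over new/ and cur/.
    """
    # Compare without angle brackets so "<id>" and "id" both match.
    wanted = message_id.strip().strip("<>")
    box = mailbox.Maildir(maildir_path, create=False)
    for key in box.iterkeys():
        msg = box.get_message(key)
        found = str(msg.get("Message-ID", "")).strip().strip("<>")
        if found == wanted:
            # The full original (headers and body) is what a learner needs.
            return box.get_bytes(key)
    return None
```

The returned bytes could then be handed to whatever does the forgetting and relearning.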

I already use a boring shell script to batch-learn mail, since my IMAP server uses Maildir, which means each message is a separate file. Not the most efficient solution.
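For comparison, a batch learner along these lines can be sketched in Python. It assumes `sa-learn` is on the PATH and hands it the Maildir's `cur` and `new` directories in one invocation instead of one call per file; the function names and the `dry_run` switch are invented for this example.

```python
import os
import subprocess
from typing import List


def build_learn_command(maildir_path: str, kind: str) -> List[str]:
    """Build an sa-learn command line for one Maildir folder.

    `kind` is one of "spam", "ham" or "forget", matching sa-learn's
    --spam, --ham and --forget options.
    """
    if kind not in ("spam", "ham", "forget"):
        raise ValueError("kind must be 'spam', 'ham' or 'forget'")
    # A Maildir keeps delivered messages as separate files in cur/ and new/.
    dirs = [os.path.join(maildir_path, sub) for sub in ("cur", "new")]
    return ["sa-learn", "--" + kind] + dirs


def learn(maildir_path: str, kind: str, dry_run: bool = True) -> List[str]:
    """Run sa-learn unless dry_run is set; return the command either way."""
    cmd = build_learn_command(maildir_path, kind)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

Passing directories lets a single process work through the whole folder, which may avoid some of the per-file startup cost of a shell loop.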

In my case I think I'll create pseudo usernames within my domain and simply learn either spam or ham depending on the user I'm "replying" to.
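The pseudo-username idea boils down to a lookup from recipient address to action. A small Python sketch of that dispatch follows; the local parts `learn-spam`, `learn-ham` and `forget` are placeholders chosen for the example, not names anyone in this thread has settled on.

```python
from email.utils import parseaddr
from typing import Optional

# Hypothetical pseudo usernames mapped to the action they trigger.
ACTIONS = {
    "learn-spam": "spam",
    "learn-ham": "ham",
    "forget": "forget",
}


def action_for_recipient(recipient: str) -> Optional[str]:
    """Map a To: address to a training action, or None for ordinary mail."""
    _, addr = parseaddr(recipient)
    # Only the local part matters; the domain is whatever you host.
    local = addr.partition("@")[0].lower()
    return ACTIONS.get(local)
```

Anything that returns None would simply continue through the normal filter.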

It would be much nicer if my MUA easily allowed me to include the original mail as a MIME attachment, but I have to save each message to disk in order to do that. Way too tedious.

On Dec 18, 2003, at 4:47 PM, David Smith wrote:

On Thu, 2003-12-18 at 16:27, Robert Nicholson wrote:
So I was thinking about how one could maintain Bayes from an MUA for
certain messages.
Specifically, whenever you see false positives in your spam that were perhaps learnt by Bayes, you could have something like what's described below.

Assuming you can fetch mail with Courier IMAP through its Message-ID and nothing else, i.e. you don't have to know the folder of the message, and if you do it's more than likely "spam" in my case... Anyway, the idea is that if you reply to a message, a script intercepts that reply, pulls out the In-Reply-To, forgets the ham and learns the spam for this message. The key is that the MUA doesn't provide all the original email headers when you reply to the message, so you need to be able to reconstruct the original message in order for it to be forgotten and then learnt. I'm guessing you don't need the original just to forget the message, just the Message-ID, but if you want it learnt correctly as ham you'd need all the original headers and body.
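The "pull out the In-Reply-To" step might look roughly like this in Python, using the standard `email` package. Falling back to the last entry in References when In-Reply-To is absent is an assumption added for the sketch, and `replied_to_id` is a made-up name.

```python
import re
from email import message_from_bytes
from typing import Optional

# Matches one <id@host> token inside a header value.
_MSGID = re.compile(r"<[^<>\s]+>")


def replied_to_id(raw_reply: bytes) -> Optional[str]:
    """Return the Message-ID the reply refers to, including angle brackets."""
    msg = message_from_bytes(raw_reply)
    match = _MSGID.search(str(msg.get("In-Reply-To", "")))
    if match:
        return match.group(0)
    # Assumption: with no In-Reply-To, the last References entry is the parent.
    refs = _MSGID.findall(str(msg.get("References", "")))
    return refs[-1] if refs else None
```

The ID only identifies the message; the original headers and body still have to be fetched from the store before relearning.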
I haven't used IMAPClient for a while but this sounds like it's doable
to me.
I believe it to be doable.  I haven't started on this yet, but if you
want to get to it first go ahead.
My plan was to develop a Perl script that used IMAPClient and
SpamAssassin that would be useful for Bayes training.  My thought would
be that it would have arguments similar to those of sa-learn (i.e.
--ham, --spam).  I'd probably always point this script at an IMAP
folder, and the script would run every message in that folder through
the learn process.
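To make the shape of such a tool concrete, here is a Python sketch of the command line and the folder walk. It swaps in Python's `argparse` and `imaplib` for the Perl modules named above, so it is an illustration of the idea rather than the planned script; `parse_args` and `iter_folder` are invented names, and the learning step itself is left out.

```python
import argparse
from typing import Iterator, List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Accept --ham or --spam (as sa-learn does) plus an IMAP folder name."""
    parser = argparse.ArgumentParser(
        description="Train Bayes from an IMAP folder (sketch)."
    )
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--ham", dest="kind", action="store_const", const="ham")
    mode.add_argument("--spam", dest="kind", action="store_const", const="spam")
    parser.add_argument("folder", help="IMAP folder to train from")
    return parser.parse_args(argv)


def iter_folder(conn, folder: str) -> Iterator[bytes]:
    """Yield each raw message in `folder`.

    `conn` is assumed to be an already logged-in imaplib.IMAP4 object.
    """
    typ, _ = conn.select(folder, readonly=True)
    if typ != "OK":
        raise RuntimeError("cannot select folder: " + folder)
    typ, data = conn.search(None, "ALL")
    if typ != "OK" or not data or not data[0]:
        return
    for num in data[0].split():
        typ, parts = conn.fetch(num, "(RFC822)")
        # A successful fetch returns a (envelope, raw message) tuple first.
        if typ == "OK" and parts and isinstance(parts[0], tuple):
            yield parts[0][1]
```

Each yielded message would then go through the learn process according to `args.kind`.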
Right now I'm working on a script that runs SpamAssassin on an IMAP folder, rewriting every message and optionally filing them away in other folders.
--
David Smith
[EMAIL PROTECTED]
Red Hat, Inc.
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


