On 1/31/2013 5:50 PM, RW wrote:
> On Thu, 31 Jan 2013 12:12:15 -0800 (PST)
> John Hardin wrote:
>
>> On Thu, 31 Jan 2013, Ben Johnson wrote:
>>
>>> So, I finally got around to tackling this change.
>>>
>>> With a couple of simple modifications, I was able to achieve the
>>> desired result with the Dovecot Antispam plug-in.
>>>
>>> Basically, I changed the last two directive values from the switches
>>> that are normally passed to the "sa-learn" binary (--spam and
>>> --ham) to destination email addresses that are passed to "sendmail"
>>> in my revised pipe script.
>>
>> Passing the messages through sendmail again isn't optimal as that
>> will make further changes to the headers. This may have effects on
>> the quality of the learning, unless the original message is attached
>> as an RFC-822 attachment to the message being sent to the corpus
>> mailbox, which of course means you then can't just run sa-learn
>> directly against that mailbox - the review process would involve
>> moving the attachment as a standalone message to the spam or ham
>> learning mailbox.
>>
>> Ideally you want to just move the messages between mailboxes without
>> involving another delivery processing. I don't know enough about
>> Dovecot or your topology to say whether that's going to be as easy as
>> using sendmail to mail the message to you.
>
> Actually that's the way that the dovecot plugin works. I think that the
> sendmail option is mainly a way to get training done on a remote
> machine - it's a standard feature of DSPAM for which the plugin was
> originally developed.
>
> When I looked at the plugin it seemed to have quite a serious flaw.
> IIRC it disables IMAP APPENDs on the Spam folder which makes it
> incompatible with synchronisation tools like OfflineImap and probably
> some IMAP clients that implement offline support in the same way.
>
John, thanks for pointing out the problems associated with re-sending
the messages via sendmail. I threw a line out to the Dovecot users group
and learned how to move messages without going through the MTA. Dovecot
has a utility executable, "deliver", which is well-suited to the task.

For those who may have a similar need, here's the Dovecot Antispam pipe
script that I'm using, courtesy of Steffen Kaiser on the Dovecot Users
mailing list:

---------------------------------------
#!/bin/bash

# Pick the training mode from the first --ham or --spam argument.
mode=
for opt; do
    # Compare each argument on its own ("$opt"), not the whole list ("$*"),
    # so the match still works if other arguments are passed as well.
    if test "x$opt" = "x--ham"; then
        mode=HAM
        break
    elif test "x$opt" = "x--spam"; then
        mode=SPAM
        break
    fi
done

if test -n "$mode"; then
    # options from http://wiki1.dovecot.org/LDA
    # deliver reads the message from stdin and files it in Training.$mode
    /usr/lib/dovecot/deliver -d u...@example.com -m Training.$mode
fi

exit 0
---------------------------------------

And here are the Antispam plug-in options:

---------------------------------------
# For Dovecot < 2.0.
antispam_spam_pattern_ignorecase = SPAM;JUNK
antispam_mail_tmpdir = /tmp
antispam_mail_sendmail = /usr/bin/sa-learn-pipe.sh
antispam_mail_spam = --spam
antispam_mail_notspam = --ham
---------------------------------------

RW, thank you for underscoring the issue with IMAP appends. It looks as
though a configuration directive exists to control this behavior:

# Whether to allow APPENDing to SPAM folders or not. Must be set to
# "yes" (case insensitive) to be activated. Before activating, please
# read the discussion below.
# antispam_allow_append_to_spam = no

Unfortunately, I don't fully understand the implications of enabling or
disabling this option. Here's the "discussion below" that is referenced
in the above comment:

---------------------------------------
ALLOWING APPENDS?

You should be careful with allowing APPENDs to SPAM folders. The reason
for possibly allowing it is to allow not-SPAM --> SPAM transitions to
work with offlineimap. However, because with APPEND the plugin cannot
know the source of the message, multiple bad scenarios can happen:

 1. SPAM --> SPAM transitions cannot be recognised and are trained

 2. the same holds for Trash --> SPAM transitions

Additionally, because we cannot recognise SPAM --> not-SPAM transitions,
training good messages will never work with APPEND.
---------------------------------------

In consideration of the first point, what is a "SPAM --> SPAM
transition"? Is that when the mailbox contains more than one "spam
folder", e.g., "JUNK" and "SPAM", and the user drags a message from one
to the other?

Regarding the second point, I'm not sure I understand the problem. If
someone drags a message from Trash to SPAM, shouldn't it be submitted
for learning as spam?

The last sentence sounds like something of a deal-breaker. Doesn't my
whole strategy fall apart if ham cannot be submitted for training?

John and RW, do you recommend enabling or disabling the append option,
given the way I'm reviewing the submissions and sorting them manually?

Sorry for all the questions! And thanks!

-Ben
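
P.S. In case it helps anyone testing a similar setup: the argument
handling in the pipe script can be checked on its own, without Dovecot or
"deliver" in the loop. What follows is only a minimal sketch of that one
step. The function name "parse_mode" and the sample arguments are my own
and purely illustrative; they are not part of Steffen's script or of the
Antispam plug-in.

```bash
#!/bin/bash
# Hypothetical helper (illustration only): print HAM or SPAM for the first
# --ham or --spam argument, or print nothing and return 1 if neither is given.
parse_mode() {
    for opt; do
        case "$opt" in
            --ham)  echo HAM;  return 0 ;;
            --spam) echo SPAM; return 0 ;;
        esac
    done
    return 1
}

# Sample invocations; the extra arguments are made up for the test.
parse_mode --spam
parse_mode -f someone --ham
parse_mode --other || echo "no mode selected"
```

Checking each argument individually means the match does not depend on
the switch being the only argument the plug-in passes to the script.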