On 1/15/2013 5:22 PM, John Hardin wrote:
>>>> Yes, users are allowed to train Bayes, via Dovecot's Antispam plug-in.
>>>> They do so unsupervised. Why this could be a problem is obvious. And no,
>>>> I don't retain their submissions. I probably should. I wonder if I can
>>>> make a few slight modifications to the shell script that Antispam calls,
>>>> such that it simply sends a copy of the message to an administrator
>>>> rather than calling sa-learn on the message.
>>>
>>> That would be a very good idea if the number of users doing training is
>>> small. At the very least, the messages should be captured to a permanent
>>> corpus mailbox.
>>
>> Good idea! I'll see if I can set this up.

So, I finally got around to tackling this change.

With a couple of simple modifications, I was able to achieve the desired
result with the Dovecot Antispam plug-in.

In dovecot.conf:

-----------------------------------------------------
plugin {
  # [...]

  # For Dovecot < 2.0.
  antispam_spam_pattern_ignorecase = SPAM;JUNK
  antispam_mail_tmpdir = /tmp
  antispam_mail_sendmail = /usr/bin/sa-learn-pipe.sh
  antispam_mail_spam = proposed-s...@example.com
  antispam_mail_notspam = proposed-...@example.com
}
-----------------------------------------------------

Basically, I changed the last two directive values from the switches
that are normally passed to the "sa-learn" binary (--spam and --ham) to
destination email addresses that are passed to "sendmail" in my revised
pipe script.

Here is the full pipe script, /usr/bin/sa-learn-pipe.sh (apologies for
any line wrapping; the original commands are commented out with two
pound symbols [##]):

-----------------------------------------------------
#!/bin/sh

# Add "starting now" string to log.
echo "$$-start ($*)" >> /tmp/sa-learn-pipe.log

# Copy the message contents to a temporary text file.
cat > /tmp/sendmail-msg-$$.txt

CURRENT_USER=$(whoami)

##echo "Calling (as user $CURRENT_USER) '/usr/bin/sa-learn $* /tmp/sendmail-msg-$$.txt'" >> /tmp/sa-learn-pipe.log
echo "Calling (as user $CURRENT_USER) 'sendmail $* < /tmp/sendmail-msg-$$.txt'" >> /tmp/sa-learn-pipe.log

# Originally: execute sa-learn with the passed ham/spam argument and the
# temporary message contents.
# Now: send the temporary message to the address passed as the argument.
# Send the output to the log file while redirecting stderr to stdout (so
# we capture debug output).
##/usr/bin/sa-learn $* /tmp/sendmail-msg-$$.txt >> /tmp/sa-learn-pipe.log 2>&1
sendmail "$@" < /tmp/sendmail-msg-$$.txt >> /tmp/sa-learn-pipe.log 2>&1

# Remove the temporary message.
rm -f /tmp/sendmail-msg-$$.txt

# Add "ending now" string to log.
echo "$$-end" >> /tmp/sa-learn-pipe.log

# Exit with "success" status code.
exit 0
-----------------------------------------------------

It seems as though creating a temporary copy of the message is not
strictly necessary, as the message contents could be passed to the
"sendmail" command via standard input (stdin), but creating the copy
could be useful in debugging.
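
For anyone who prefers to skip the temporary file, the stdin-only variant
might look roughly like the sketch below. It is untested against the
Antispam plug-in itself; SENDMAIL, LOG, and forward_message are names
made up for illustration and do not appear in the script above.

```shell
#!/bin/sh

# Hypothetical overridable settings (not part of the original script).
SENDMAIL="${SENDMAIL:-sendmail}"
LOG="${LOG:-/tmp/sa-learn-pipe.log}"

# forward_message <recipient>
# Passes the message arriving on stdin straight to sendmail, with no
# temporary copy, and logs start/end markers like the original script.
forward_message() {
    echo "$$-start ($*)" >> "$LOG"

    # sendmail inherits this function's stdin, so it reads the message
    # directly; stdout and stderr both go to the log file.
    "$SENDMAIL" "$@" >> "$LOG" 2>&1

    echo "$$-end" >> "$LOG"
}
```

Used as the pipe script, the file would end with a call to
forward_message "$@" followed by "exit 0", as in the original. The
trade-off is the one noted above: there is no message file left to
inspect if something goes wrong.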

>>> Do your users also train ham? Are the procedures similar enough that
>>> your users could become easily confused?
>>
>> They do. The procedure is implemented via Dovecot's Antispam plug-in.
>> Basically, moving mail from Inbox to Junk trains it as spam, and moving
>> mail from Junk to Inbox trains it as ham. I really like this setup
>> (Antispam + calling SA through Amavis [i.e. not using spamd]) because
>> the results are effective immediately, which seems to be crucial for
>> combating this snowshoe spam (performance and scalability aside).
>>
>> I don't find that procedure to be confusing, but people are different, I
>> suppose.
> 
> Hm. One thing I would watch out for in that environment is people who
> have intentionally subscribed to some sort of mailing list deciding they
> don't want to receive it any longer and just junking the messages rather
> than unsubscribing.

The steps I've taken above will allow me to review submissions and
educate users who engage in this practice. Thanks again for elucidating
this scenario.
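
Once submissions have been reviewed, they still need to reach Bayes
somehow. One possibility, sketched below under the assumption that the
approved messages are saved into two mbox files, is to feed them to
sa-learn in bulk. The --spam, --ham, and --mbox switches are standard
sa-learn options; SA_LEARN, train_reviewed, and the mbox paths in the
usage line are hypothetical.

```shell
#!/bin/sh

# Hypothetical overridable setting (illustrative only).
SA_LEARN="${SA_LEARN:-sa-learn}"

# train_reviewed <spam-mbox> <ham-mbox>
# Trains Bayes from reviewed mbox files, skipping any file that is
# missing or empty.
train_reviewed() {
    spam_mbox=$1
    ham_mbox=$2

    if [ -s "$spam_mbox" ]; then
        "$SA_LEARN" --spam --mbox "$spam_mbox"
    fi

    if [ -s "$ham_mbox" ]; then
        "$SA_LEARN" --ham --mbox "$ham_mbox"
    fi
}
```

For example: train_reviewed /path/to/reviewed-spam.mbox
/path/to/reviewed-ham.mbox, run as whichever user owns the Bayes
database.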

I hope that this approach to user-based SpamAssassin training is useful
to others.

Best regards,

-Ben
