You have described why it won't work as well. With your method the headers become useless - Bayes will learn only the body/subject content. You would need to tell Bayes to ignore the headers.

With pop3 you only have the option you describe - this may still work ok?
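
One way around the header problem, as a minimal sketch: if customers
forward the spam as an attachment (not inline), the original message
usually travels as a message/rfc822 part with its own headers intact,
and it can be pulled back out before learning. The function name
extract_forwarded_originals is hypothetical, and this assumes the mail
client really does attach the original; inline forwards will return
nothing.

```python
import email


def extract_forwarded_originals(raw_bytes):
    """Return the raw bytes of every message/rfc822 part in a forwarded mail."""
    outer = email.message_from_bytes(raw_bytes)
    originals = []
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the attached message, original headers included.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
    return originals
```

Each returned item could then be written to a temporary file and passed
to sa-learn --spam in place of the forwarding message.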

Pete

Oenus Tech Services wrote:
Thanks, Peter

Yes, but this would not work with our more than 1500 customers that have
only pop3 access and do not have access to any shared or private folder
in the servers. We needed to implement some way for these pop3-only
customers to report spam back to us, and for now we've only thought of
forwarding spam to a spamabuse account where some scripts could check
its inbox and do Bayesian learning. However, does sa-learn take into
account that those emails are being forwarded and are not the
original source of the spam? I guess not. If that is the case, has
anybody come up with a similar idea?
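
The "scripts could check its inbox" step might look roughly like the
sketch below. It assumes conn is an already logged-in poplib.POP3 (or
POP3_SSL) object from the Python standard library; drain_mailbox and
handle_message are hypothetical names, and handle_message stands in
for whatever hands the message to sa-learn.

```python
def drain_mailbox(conn, handle_message):
    """Pass every message in a POP3 mailbox to handle_message, then delete it."""
    count, _size = conn.stat()
    # POP3 message numbers start at 1 and stay fixed for the whole session;
    # deletions only take effect when the session ends with quit().
    for i in range(1, count + 1):
        _resp, lines, _octets = conn.retr(i)
        # retr() returns the message as a list of lines without line endings
        handle_message(b"\r\n".join(lines) + b"\r\n")
        conn.dele(i)
    return count
```

After the call, conn.quit() commits the deletions. Messages that fail
to process raise before dele() is reached, so they stay in the mailbox
for the next run.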

Ignacio



Peter Russell wrote:
See the attached Python script, written by one of the folks on the
MailScanner list.

It's designed for use with Exchange, so I will describe the Exchange
usage and you can modify it as you see fit to work on your POP3 server.

In Exchange 2003 create a public folder called SPAM, give everyone
contributor access, not read or edit. Then any user can simply drag spam
to the public folder, but no user can see in the folder.

Modify the script to suit your environment (Exchange server name and
credentials). Make it executable.

Now run it. It will scan your public folder called SPAM and learn the
contents into bayes, then delete the messages it has learned.

The script doesn't seem to run recursively all that well; it may stop
randomly and need to be re-run. If any Python scripters see this,
would you mind having a go at fixing it and re-posting to the list?

Many thanks
Pete

Oenus Tech Services wrote:
Hi there!

Most of our email is delivered through POP3, so right now Bayes
filtering is off. SpamAssassin is nevertheless doing a good job of
filtering email, but we want to set up a way for our customers to report
undetected spam to us by forwarding it to an
[EMAIL PROTECTED] account on our server. If we then point sa-learn
at that inbox, will it work? My concern is that email arriving at that
account no longer comes from the spammer, but is a message forwarded by
our customer.

TIA,

Ignacio

------------------------------------------------------------------------

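#!/usr/bin/env python3
# Ported to Python 3: the original relied on the Python 2-only "commands"
# module and file() builtin.
import imaplib
import subprocess
import time

# Set required variables
PREFS = "/etc/MailScanner/spam.assassin.prefs.conf"
TMPFILE = "/var/tmp/salearn.tmp"
SALEARN = "/usr/bin/sa-learn"
SERVER = "x.x.x.x"
USER = "someuserwithaccesstopublicfolder"
PASSWORD = "somepassword"
LOGFILE = "/var/log/learn.spam.log"

log = open(LOGFILE, "a")
log.write("\n\nTraining SpamAssassin on %s at %s\n"
          % (time.strftime("%Y-%m-%d"), time.strftime("%H:%M:%S")))

# Connect to the server
server = imaplib.IMAP4(SERVER)

# Log in and open the spam folder. The name contains a space, so it is
# quoted explicitly (Python 3's imaplib does not add the quotes itself).
server.login(USER, PASSWORD)
server.select('"Public Folders/Spam"')

# Get messages
typ, data = server.search(None, "ALL")
for num in data[0].split():
    typ, msg = server.fetch(num, "(RFC822)")
    # The message arrives as bytes, so write it in binary mode
    with open(TMPFILE, "wb") as tmp:
        tmp.write(msg[0][1])
    # Run sa-learn on the saved message and log its output (stdout + stderr)
    result = subprocess.run(
        [SALEARN, "--prefs-file=" + PREFS, "--spam", TMPFILE],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    log.write(result.stdout)
    log.write("\n")
    # Mark learned spam as "Deleted"
    server.store(num, "+FLAGS", "\\Deleted")

# Delete messages marked as "Deleted" from the server. This runs once, after
# the loop: expunging inside the loop renumbers the remaining messages, so
# the message numbers from the search no longer match.
server.expunge()
server.logout()
log.close()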


------------------------------------------------------------------------

#!/usr/bin/env python3
# Ported to Python 3: the original relied on the Python 2-only "commands"
# module and file() builtin.
import imaplib
import subprocess
import time

# Set required variables
PREFS = "/opt/MailScanner/etc/spam.assassin.prefs.conf"
TMPFILE = "/var/tmp/salearn.tmp"
SALEARN = "/usr/bin/sa-learn"
SERVER = "x.x.x.x"
USER = "someuserwithaccesstopublicfolder"
PASSWORD = "somepassword"
LOGFILE = "/var/log/learn.spam.log"

log = open(LOGFILE, "a")
log.write("\n\nTraining SpamAssassin on %s at %s\n"
          % (time.strftime("%Y-%m-%d"), time.strftime("%H:%M:%S")))

# Connect to the server
server = imaplib.IMAP4(SERVER)

# Log in and open the spam folder. The name contains a space, so it is
# quoted explicitly (Python 3's imaplib does not add the quotes itself).
server.login(USER, PASSWORD)
server.select('"Public Folders/Spam"')

# Get messages
typ, data = server.search(None, "ALL")
for num in data[0].split():
    typ, msg = server.fetch(num, "(RFC822)")
    # The message arrives as bytes, so write it in binary mode
    with open(TMPFILE, "wb") as tmp:
        tmp.write(msg[0][1])
    # Run sa-learn on the saved message and log its output (stdout + stderr)
    result = subprocess.run(
        [SALEARN, "--prefs-file=" + PREFS, "--spam", TMPFILE],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    log.write(result.stdout)
    log.write("\n")
    # Mark learned spam as "Deleted"
    server.store(num, "+FLAGS", "\\Deleted")

# Expunge is left disabled in this copy, so messages are only flagged as
# "Deleted" and stay on the server until something else expunges them.
# server.expunge()
server.logout()
log.close()

