Hi,

> To tell you the truth I'm losing ground lately against spammers. Two
> reasons. The Image spam is getting through and because it poisons the
> bayes I've lost much of the effectiveness of bayes filtering. I'm still
> holding on but I've had people who I hosted for for over a year who
> never had a single spam who are now getting a few. I am also having a
> few more false positives than I used to.

I'm having success here detecting image spam using the OSBF-Lua filter.
From the OSBF-Lua website:

"OSBF-Lua (Orthogonal Sparse Bigrams with confidence Factor) is a Lua C
module for text classification. It is a port of the OSBF classifier
implemented in the CRM114 project. This implementation attempts to put
focus on the classification task itself by using Lua as the scripting
language, a powerful yet light-weight and fast language, which makes it
easier to build and test more elaborated filters and training methods.

The OSBF algorithm is a typical Bayesian classifier but enhanced with two
techniques that I originally developed for the CRM114 project: Orthogonal
Sparse Bigrams - OSB, for feature extraction, and the Exponential
Differential Document Count - EDDC (a.k.a Confidence Factor) for automatic
feature selection. Combined, these two techniques produce a highly
accurate classifier. OSBF was developed focused on two classes, SPAM and
NON-SPAM, so the performance for more than two classes may not be the
same."

OSBF-Lua learns very fast. It only requires Lua 5.1 installed on the Exim
server, with dynamic loading enabled. See the install doc:

http://osbf-lua.luaforge.net/#installation

In exim.conf I added these statements.

In the ## ON CONFIGURATION SETTINGS ## section:

    # set OSBF_LUA_DIR to where spamfilter.lua, spamfilter_command.lua etc.
    # were installed
    OSBF_LUA_DIR=/usr/local/osbf-lua

In the ## TRANSPORTS CONFIGURATION ## section, add transport_filter to the
local_delivery transport:

    local_delivery:
      driver = appendfile
      check_string = ""
      create_directory
      delivery_date_add
      directory = ${home}/Maildir/
      directory_mode = 700
      envelope_to_add
      return_path_add
      group = mail
      maildir_format
      maildir_tag = ,S=$message_size
      message_prefix = ""
      message_suffix = ""
      mode = 0600
      quota = ${lookup{$local_part}lsearch*{/etc/mail/quota_usr}{$value} {4M}}
      quota_size_regex = S=(\d+)$
      quota_warn_threshold = 75%
      transport_filter = OSBF_LUA_DIR/spamfilter.lua --udir $home/osbf-lua

That's it!!

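For anyone who has not used transport_filter before: Exim pipes the message to the filter command on standard input and delivers whatever the command writes to standard output. The sketch below only illustrates that contract in Python. It is not how spamfilter.lua works internally, and the names tag_subject and run_filter, as well as the choice to put the tag in the Subject header, are assumptions made for the example.

```python
import sys


def tag_subject(raw: bytes, tag: bytes) -> bytes:
    """Prefix the first Subject header with `tag`; leave the body untouched.

    Hypothetical helper for illustration only.
    """
    lines = raw.split(b"\n")
    for i, line in enumerate(lines):
        if line.rstrip(b"\r") == b"":
            break  # first blank line marks the end of the headers
        if line.lower().startswith(b"subject:"):
            value = line[len(b"subject:"):].lstrip()
            lines[i] = b"Subject: " + tag + b" " + value
            break
    return b"\n".join(lines)


def run_filter(stdin=None, stdout=None, tag: bytes = b"[--]") -> None:
    """Read the whole message from stdin, write the tagged copy to stdout."""
    stdin = stdin or sys.stdin.buffer
    stdout = stdout or sys.stdout.buffer
    stdout.write(tag_subject(stdin.read(), tag))
```

A script that ends by calling run_filter() could stand in for the transport_filter command while testing, because all it does is copy the message through with one header changed.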
Verify your setup by sending a message to yourself with the following in
the subject line:

    help <your password>

You will receive a message with help about the spam filter. To verify that
the databases were created correctly:

    stats <your password>

From now on, all messages that you receive will be classified and tagged
according to the score they get:

    Tag   Meaning
    [--]  almost sure it's spam - score <= -20
    [-]   probably spam (reinforcement zone) - score < 0 and > -20
    [+]   probably not spam (reinforcement zone) - score >= 0 and < 20
    [++]  almost sure it's not spam - score >= 20. This tag is here just
          for symmetry; it's not used. An empty tag is used in place of it
          so as not to pollute the messages.

If the classification is wrong, you must train the filter by sending the
message back to yourself, replacing the subject with the corresponding
training command:

    learn <password> spam

or

    learn <password> nonspam

After training on a few messages, osbf-lua will become more accurate at
spam detection.

If you have a database of pre-classified messages (nonspam / spam) in an
IMAP folder, you can use the script toer.lua to do the training.

Regards,
Marlon
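
As a footnote to the tag table above, the thresholds can be written out as a small function. This is only a sketch in Python to make the boundaries explicit. OSBF-Lua itself is written in Lua and C, and the name tag_for_score is made up for this example.

```python
def tag_for_score(score: float) -> str:
    """Map a classifier score to the tag described in the table (sketch)."""
    if score <= -20:
        return "[--]"  # almost sure it's spam
    if score < 0:
        return "[-]"   # probably spam (reinforcement zone)
    if score < 20:
        return "[+]"   # probably not spam (reinforcement zone)
    return ""          # "[++]" is not used; an empty tag replaces it
```

Messages tagged [-] or [+] fall in the reinforcement zone, so they are the ones most worth checking and correcting with the learn command.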
