On Sun, 20 Sep 2015, AK wrote:

Hi all.

I'm getting hit with lots of JUNK mail that has several lines containing just a '.' [0]. Most of the JUNK email has at least 5 and at most 10 such lines (so far), somewhere in the middle of the message.

I've copied the message source to RegexBuddy [1] and have been able to come up with a regex that matches what I want using the Perl 5.20 engine:

(^\.\n){5,}

However, adding this rule to /etc/spamassassin/local.cf doesn't hit at all when I run it against my test message as follows:

===== Start Rule Block =====
rawbody __MANY_PERIODS_1 ALL =~ /(^\.\n){5,}/
meta MANY_PERIODS __MANY_PERIODS_1
score MANY_PERIODS 2.0
describe MANY_PERIODS JUNK mail with several lines that contain single dot
===== End Rule Block =====

===== Begin Test Command =====
spamassassin -L -t test.msg
===== End Test Command =====


Please help me understand what I'm doing wrong, as this is my first attempt at creating a rule. Previously I've just copied and pasted what I've found here in the forums; this time I'm trying to do it myself, but failing.


Regards,
ak.

SA does some interesting pre-processing on mail messages before applying rules, so you need to understand that.

Try this:

 rawbody T__LOCAL_MANY_PERIODS        /\n(?:\.\n){5}?/
 describe T__LOCAL_MANY_PERIODS       Many lines with just a single "dot"

Notes:
1) Because SA pre-processing collapses the body into one long line, you cannot match on '^' repeatedly (without the /m modifier, '^' only matches at the very start of the string); look for '\n' as the line-break indicator instead.
The pattern finds the start of a line and then the following repeats of ".\n".
2) Use '(?:' as a grouping optimization unless you care about the capture.
3) For the terminal match clause use '{5}', not '{5,}': we're done as soon as we see 5 matches and don't care if there are more.
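4) Use the "non-greedy" match quantifier '}?' to stop at the first hit on that pattern and not try to go for more. (With a fixed count such as '{5}' it changes nothing; it only matters for an open-ended count such as '{5,}'.)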
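
To illustrate note 1, here is a small sketch in Python rather than Perl (the two engines agree on '^', '\n' and '{n}' for these patterns). The sample body is made up, and SA's own splitting of the message into chunks is not modeled; this only shows why a repeated '^' fails without multi-line mode.

```python
import re

# Hypothetical message body: some text, six dot-only lines, more text.
body = "Hello there\n" + ".\n" * 6 + "Buy now\n"

# Pattern from the question: without multi-line mode, '^' only matches
# at the very start of the string, so the group cannot repeat.
original = re.compile(r"(^\.\n){5,}")

# Same pattern with multi-line mode, where '^' matches after every newline.
multiline = re.compile(r"(^\.\n){5,}", re.M)

# Suggested pattern: anchor on the preceding '\n' instead of '^'.
suggested = re.compile(r"\n(?:\.\n){5}")

print(bool(original.search(body)))   # False
print(bool(multiline.search(body)))  # True
print(bool(suggested.search(body)))  # True
```

One consequence of anchoring on '\n': the run of dots must be preceded by a line break, so a run at the very first character of the text being scanned would not match.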

Un-optimised pattern: /\n(\.\n){5}/
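
A quick way to compare the variants from the notes is to look at how much text each one consumes. This is a Python sketch on a made-up sample with eight dot-only lines; the variant names are labels chosen for this illustration only.

```python
import re

# Hypothetical sample: eight lines that contain only a dot.
body = "intro\n" + ".\n" * 8 + "outro\n"

variants = {
    "capturing": r"\n(\.\n){5}",       # un-optimised form
    "non-capturing": r"\n(?:\.\n){5}",
    "lazy": r"\n(?:\.\n){5}?",         # form used in the suggested rule
    "open-ended": r"\n(?:\.\n){5,}",   # keeps consuming dot lines
}

# Length of the matched text for each variant.
lengths = {name: len(re.search(pat, body).group(0))
           for name, pat in variants.items()}
print(lengths)
# {'capturing': 11, 'non-capturing': 11, 'lazy': 11, 'open-ended': 17}
```

All four variants hit on this sample; only the open-ended one reads past the fifth dot line, which is extra work a rule does not need.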

Note the use of the "testing" rule name format, the "T_" prefix. Remove the leading 'T' (leaving '__LOCAL_MANY_PERIODS') to make it a silent rule for combining with metas.

Personal convention: I insert '_LOCAL_' (or '_L_') in locally created rule names to distinguish them for debugging. Then, when things don't work as expected (e.g. FPs), it helps to determine whether the problem is self-inflicted.

Final note: now that we've discussed this spam sign, it will probably become useless, as spammers follow this list and mutate their crap accordingly to dodge our rules. ;(

--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
