On Wed, 6 Feb 2013, Martin Gregorie wrote:
On Wed, 2013-02-06 at 17:45 +0200, Eliezer Croitoru wrote:
Sorry, but I didn't have much time to understand all of the rule syntax.
When developing a meta rule that combines subrules there's little
point in writing descriptions for the subrules. In addition I find it
helpful to do the initial development without the leading underscores
because this way you can see these rules firing. After the combination
is working as I want it to I put the underscores in. So, I'd start your
main rule like this:
describe HBRW_SPAM Trap spam that's < 50% Hebrew from a specific sender
header HSFROM From =~ /spamadmin\@ngtech.co.il/i
mimeheader HSENC Content-type =~ /charset=.{0,3}windows-1251/i
body HSHCH /[\xC0-\xCB\xCD-\xDB\xDF-\xFB]?/
tflags HSHCH multiple
body HSTCH /[\x30-\x39\x41-\x5A\x61-\x7A\x80-\xFF]?/
tflags HSTCH multiple
meta HSPCT ( (HSHCH * 100) / (HSTCH + 1 ) )
meta HBRW_SPAM (HSPCT < 1) && HSFROM && HSENC
score HBRW_SPAM 10.3
Then this gets tested on a set of messages that exercise every subrule as well
as checking that the metas work correctly. In this case I'd manually create simple
message bodies that exercise every test case (I think you'd need at least 10
test messages to fully test HBRW_SPAM and all its subrules). With this technique
you do need to use the lint check but don't need debugging because the
list of rules that fired tells you whether a rule fired or didn't *and* will
show the number of times a 'multiple' rule fired.
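When hand-building those test bodies it can help to know in advance what value the percentage meta should produce. The following is only a rough Python sketch of the arithmetic, not of how SpamAssassin itself scans a message: it treats the body as one raw byte string, drops the trailing `?` from the two patterns so that every hit consumes exactly one byte, and assumes ordinary floating-point division. The function name `hspct` is made up for illustration.

```python
import re

# Byte classes copied from the HSHCH and HSTCH rules above
# (without the trailing '?', which is an assumption of this sketch).
HEBREW = re.compile(rb"[\xC0-\xCB\xCD-\xDB\xDF-\xFB]")
COUNTED = re.compile(rb"[\x30-\x39\x41-\x5A\x61-\x7A\x80-\xFF]")


def hspct(body: bytes) -> float:
    """Mirror the HSPCT meta: (HSHCH * 100) / (HSTCH + 1)."""
    hebrew_hits = len(HEBREW.findall(body))    # stands in for the HSHCH hit count
    counted_hits = len(COUNTED.findall(body))  # stands in for the HSTCH hit count
    # The +1 keeps the division defined when nothing at all is counted.
    return (hebrew_hits * 100) / (counted_hits + 1)


# Three ASCII letters plus two bytes from the first class: 2 hits out of 5 counted.
mixed = b"abc" + bytes([0xE0, 0xE1])
print(hspct(mixed))
print(hspct(b"hello world"))
```

A body with no bytes from the first class gives 0, which is the case the `(HSPCT < 1)` test in HBRW_SPAM is looking for.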
After all is working correctly I put the underscores back:
#
# HBRW_SPAM detects messages from spamad...@ngtech.co.il with a message body or
# part using the Windows 1251 (Hebrew) charset and that contains mostly
# non-Hebrew text.
#
describe HBRW_SPAM Trap spam that's < 50% Hebrew from a specific sender
header __HSFROM From =~ /spamadmin\@ngtech.co.il/i
mimeheader __HSENC Content-type =~ /charset=.{0,3}windows-1251/i
body __HSHCH /[\xC0-\xCB\xCD-\xDB\xDF-\xFB]?/
tflags __HSHCH multiple
body __HSTCH /[\x30-\x39\x41-\x5A\x61-\x7A\x80-\xFF]?/
tflags __HSTCH multiple
meta __HSPCT ( (__HSHCH * 100) / (__HSTCH + 1 ) )
meta HBRW_SPAM (__HSPCT < 1) && __HSFROM && __HSENC
score HBRW_SPAM 10.3
After that I re-lint and try all test cases again. In this case I'd do
the underscore additions in two stages: first add them to HSHCH and
HSTCH so I can see that HSPCT still works and, if so, put the rest back
and re-test.
[snip..]
One caveat: an "indirect rule" (one that starts with '__') receives no
intrinsic score, while a regular rule receives a default score of 1.0.
So all your "HS*" rules in the above example will have a score of 1.0
and contribute to the final score, whereas once they're changed to "__HS*"
they will be scoreless and not show up in the final score.
This may make development more difficult.
An alternate way to handle this is to use "testing rules" (rules that
start with 'T_'). These rules are given a default score of 0.01 and thus
show up in the rule report but do not materially contribute to the final
score. So for your example use:
describe HBRW_SPAM Trap spam that's < 50% Hebrew from a specific sender
header T_HSFROM From =~ /spamadmin\@ngtech.co.il/i
mimeheader T_HSENC Content-type =~ /charset=.{0,3}windows-1251/i
body T_HSHCH /[\xC0-\xCB\xCD-\xDB\xDF-\xFB]?/
tflags T_HSHCH multiple
body T_HSTCH /[\x30-\x39\x41-\x5A\x61-\x7A\x80-\xFF]?/
tflags T_HSTCH multiple
meta T_HSPCT ( (T_HSHCH * 100) / (T_HSTCH + 1 ) )
meta HBRW_SPAM (T_HSPCT < 1) && T_HSFROM && T_HSENC
score HBRW_SPAM 10.3
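To make the difference between the three naming schemes concrete, here is a small Python sketch of the score arithmetic described above. It is an illustration only, not SpamAssassin code: `default_score` and `total_score` are hypothetical helper names, and the sketch assumes each rule that fires contributes its score exactly once.

```python
def default_score(rule_name: str) -> float:
    """Default score by naming convention, as described in this thread."""
    if rule_name.startswith("__"):
        return 0.0   # indirect subrule: scoreless
    if rule_name.startswith("T_"):
        return 0.01  # testing rule: visible in the report, nearly weightless
    return 1.0       # ordinary rule with no explicit 'score' line


def total_score(fired, explicit):
    """Sum every fired rule once; explicit 'score' lines override defaults."""
    return sum(explicit.get(name, default_score(name)) for name in fired)


explicit = {"HBRW_SPAM": 10.3}
subrules = ["HSFROM", "HSENC", "HSHCH", "HSTCH"]

# Same set of hits under each naming scheme (assumes all of them fire).
plain = total_score(subrules + ["HBRW_SPAM"], explicit)
testing = total_score(["T_" + n for n in subrules] + ["HBRW_SPAM"], explicit)
indirect = total_score(["__" + n for n in subrules] + ["HBRW_SPAM"], explicit)
print(plain, testing, indirect)
```

With those assumptions the totals come out at roughly 14.3 for the plain names, 10.34 for the T_ names and 10.3 for the __ names, so the T_ version stays within a few hundredths of the production score while still listing every subrule.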
It's also easier to do an edit s/T_/__/g when you've got things working
to your satisfaction to move from testing to production.
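One thing to watch with a blanket s/T_/__/g is that it also rewrites any "T_" that happens to sit inside a longer identifier. A hedged Python sketch of a slightly more careful version follows; `promote_rules` is a hypothetical helper name, and the word-boundary match assumes T_ never appears at the start of a word inside one of your regex patterns.

```python
import re

# Match T_ only where it begins a name, not in the middle of another identifier.
TESTING_PREFIX = re.compile(r"\bT_")


def promote_rules(config_text: str) -> str:
    """Turn T_NAME testing rules into __NAME indirect rules."""
    return TESTING_PREFIX.sub("__", config_text)


rules = (
    "body T_HSHCH /x/\n"
    "meta HBRW_SPAM ( (T_HSHCH * 100) / (T_HSTCH + 1 ) )"
)
print(promote_rules(rules))
```

Re-running the lint check and the test messages after the rename is still worthwhile, exactly as with the manual edit.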
--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{