On Wed, 8 Jul 2009, Benny Pedersen wrote:
do you have a dual quad core that idles ? :)
I have a dual Pentium-III that idles 99% of the time, yes.
rawbody takes more cpu power than (body)
I wouldn't think that it takes much more as the only difference
is whether HTML is still present....
why missing /i ?
and why exact match on begin of line ?
I use these rules as quick 'poison pill' rules added as needed, then
remove them a few weeks later.
The use of case-sensitive matching and exact line matching are intended to
match the spam as exactly as possible and minimize the possibility of
FP's. Someone could very well have a deceased client of some kind, but
it's not likely that ham will use that exact phrase, with that
capitalization, all alone on a single line (the original regex matches
beginning to END of the line).
Also, anchoring tests to the beginning or end of lines should improve
efficiency, as the only places the regex will be checked are at line breaks.
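
As a rough sketch of what such a 'poison pill' rule could look like (the
rule name, phrase, and score below are placeholders for illustration, not
the actual rule from this thread, and it assumes rawbody hands the regex
one line at a time):

```
# Hypothetical example only: name, phrase and score are made up.
# No /i modifier, so capitalization must match exactly.
# ^ and $ anchor the match to a whole line, start to END.
rawbody   LOC_EXAMPLE_PILL  /^My Deceased Client$/
describe  LOC_EXAMPLE_PILL  Exact spam phrase alone on a line (temporary rule)
score     LOC_EXAMPLE_PILL  5.0
```

A line such as "I am writing about my deceased client's estate." would not
hit this, because neither the capitalization nor the line boundaries match.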
body __A1 /\bassets\b/i
body __A2 /\bof\b/i
body __A3 /\bmy\b/i
body __A4 /\bdeceased\b/i
body __A5 /\bclient\b/i
meta LOC09070702 (__A1 && __A2 && __A3 && __A4 && __A5)
Far too much chance of FP's. Given that 'of' and 'my' occur in many
e-mails, you are really basing this on 'deceased', 'client' and 'assets'.
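
To illustrate the difference, here is a sketch (the rule name and score are
hypothetical) of a single test that requires the words to be adjacent and
in order, rather than anywhere in the message as the meta above allows:

```
# Hypothetical example only: one phrase test instead of independent
# word tests, so the words must appear together and in this order.
body      LOC_EXAMPLE_PHRASE  /\bmy deceased client\b/i
describe  LOC_EXAMPLE_PHRASE  Words must appear as one phrase
score     LOC_EXAMPLE_PHRASE  2.0
```

With separate word tests, a message that mentions a client in one
paragraph and a deceased relative in another can satisfy every subrule;
the phrase test would not fire on that.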
- C