On 26 January 2002, Sidney Markowitz said:
> Here are some excerpts from a spam that scored 4.9 using a cvs pull from
> a few weeks ago and 4.2 with a fresh cvs pull. I think someone has
> introduced a bug into the LINE_OF_YELLING rule:
> 
> [begin excerpts]
> 
> HAPPY NEW YEAR!
> INSTEAD OF GIVING YOU A KISS WHEN THE BALL DROPS,
> WE'VE DECIDED TO GIVE YOU AN ORGASM!

Yeah, I had a similar spam slip through (or at least LINE_OF_YELLING
didn't match).  Here's how this test is defined in 2.0:

# (contrib: WW)
# modified by jm to stop it matching on all-space lines
rawbody LINE_OF_YELLING         /^[A-Z0-9\$\.,\'\!\?\s]{20,}[A-Z\$\.,\'\!\?]{5,}[A-Z0-9\$\.,\'\!\?\s]{20,}$/
describe LINE_OF_YELLING        A WHOLE LINE OF YELLING DETECTED

If you read the regex carefully, you'll realize that there are some odd
restrictions on that line of yelling:
  * it must be at least 45 characters long (20 + 5 + 20)
  * there must be a "word" at least 5 characters long in the middle,
    at least 20 characters from either end, and that "word" can't
    contain digits or whitespace

I haven't counted, but I bet none of the lines-of-yelling in your sample
spam meet those criteria.
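
One way to check that bet is to run the 2.0 pattern over the quoted lines.
This is only a sketch: it uses Python's re module as a stand-in for Perl
(the pattern uses nothing the two engines treat differently, as far as I
know), and the names OLD_YELLING and old_rule_hits are made up for
illustration.  The all-"A" strings are invented probes for the 45-character
minimum, not real spam.

```python
import re

# The LINE_OF_YELLING pattern from 2.0, copied unchanged.
OLD_YELLING = re.compile(
    r"^[A-Z0-9\$\.,\'\!\?\s]{20,}[A-Z\$\.,\'\!\?]{5,}[A-Z0-9\$\.,\'\!\?\s]{20,}$"
)

# Lines taken from the quoted spam excerpt.
excerpt = [
    "HAPPY NEW YEAR!",
    "INSTEAD OF GIVING YOU A KISS WHEN THE BALL DROPS,",
    "WE'VE DECIDED TO GIVE YOU AN ORGASM!",
]


def old_rule_hits(line):
    """Return True if the 2.0 pattern matches the line."""
    return OLD_YELLING.search(line) is not None


excerpt_hits = [old_rule_hits(line) for line in excerpt]

# Invented probes for the length limit: 20 + 5 + 20 = 45 characters.
shortest_match = old_rule_hits("A" * 45)
one_too_short = old_rule_hits("A" * 44)

for line in excerpt:
    print(len(line), old_rule_hits(line), line)
```

None of the three excerpt lines match.  The second one is 49 characters
long, but the only place a 5-character "word" could sit is the stretch
around "A KISS", and the spaces there break it up.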

Here's a candidate replacement test using two regexes:

  /^[A-Z0-9\$\.,\'\!\?\s]+$/ && /[A-Z]/

This will match any line that consists solely of caps, digits, whitespace,
and the punctuation $ . , ' ! ? *as long as it contains at least one
letter*.
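
To make the intended behaviour concrete, here is a rough sketch of the
two-regex test in Python rather than Perl.  The function name is_yelling
and the sample strings are assumptions for illustration; nothing here is
SpamAssassin code.

```python
import re

# First regex: the whole line is caps, digits, whitespace, or $ . , ' ! ?
ONLY_YELLING_CHARS = re.compile(r"^[A-Z0-9\$\.,\'\!\?\s]+$")
# Second regex: at least one capital letter somewhere in the line.
HAS_CAPITAL = re.compile(r"[A-Z]")


def is_yelling(line):
    """Both patterns must match, mirroring the && in the candidate test."""
    return bool(ONLY_YELLING_CHARS.search(line) and HAS_CAPITAL.search(line))


samples = ["HAPPY NEW YEAR!", "Happy New Year!", "$19.99", "     ", ""]
for s in samples:
    print(repr(s), is_yelling(s))
```

The second regex is what keeps all-space lines and bare prices such as
"$19.99" from counting as yelling.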

Can you define tests that way in the .cf files?  Or would that have to
be coded as an "eval"?
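
If a rule can only hold one regex, one possible workaround is to fold the
"contains a letter" condition into a lookahead.  The sketch below (Python
again, names made up) checks that the single pattern agrees with the
two-step test on a few single-line samples.  Whether the .cf parser accepts
this form is an assumption I have not tested.

```python
import re

# Single pattern: the lookahead requires a capital letter somewhere on the
# line, then the character class must cover the whole line.
ONE_PATTERN = re.compile(r"^(?=.*[A-Z])[A-Z0-9\$\.,\'\!\?\s]+$")

# The two separate patterns from the candidate test, for comparison.
ONLY_YELLING_CHARS = re.compile(r"^[A-Z0-9\$\.,\'\!\?\s]+$")
HAS_CAPITAL = re.compile(r"[A-Z]")


def two_step(line):
    """The two-regex version: both patterns must match."""
    return bool(ONLY_YELLING_CHARS.search(line) and HAS_CAPITAL.search(line))


def one_step(line):
    """The lookahead version: a single pattern does both checks."""
    return ONE_PATTERN.search(line) is not None


samples = ["HAPPY NEW YEAR!", "Happy New Year!", "$19.99", "     ", "", "WIN $1000 NOW!"]
agree = all(one_step(s) == two_step(s) for s in samples)
print(agree)
```

The two versions are only equivalent for single lines, because "." in the
lookahead does not cross a newline.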

Here's a slightly subtler variation:

  /^[A-Z0-9\$\.,\'\!\?\s]+$/ && /\b[A-Z]+\b/

Same as above except the line must contain a complete uppercase word.
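
A quick sketch of where the two variations differ, in Python with made-up
names and sample strings.  Since \b treats digits as word characters, a
letter glued to a digit ("A1") is not a complete uppercase word, so this
variation rejects lines that the first one accepts.

```python
import re

# The whole line is caps, digits, whitespace, or $ . , ' ! ?
ONLY_YELLING_CHARS = re.compile(r"^[A-Z0-9\$\.,\'\!\?\s]+$")
# A run of capitals with a word boundary on each side.
WHOLE_CAPS_WORD = re.compile(r"\b[A-Z]+\b")
# The looser "any capital letter" check, kept for comparison.
HAS_CAPITAL = re.compile(r"[A-Z]")


def is_yelling_word(line):
    """Subtler variation: needs a complete uppercase word."""
    return bool(ONLY_YELLING_CHARS.search(line) and WHOLE_CAPS_WORD.search(line))


def is_yelling_letter(line):
    """First variation: any single capital letter is enough."""
    return bool(ONLY_YELLING_CHARS.search(line) and HAS_CAPITAL.search(line))


for s in ["HAPPY NEW YEAR!", "A1 B2 C3", "$$$ A $$$", "WIN 1000 NOW"]:
    print(repr(s), is_yelling_letter(s), is_yelling_word(s))
```

A one-letter word such as the "A" in "$$$ A $$$" still counts, so the extra
strictness only affects letters that touch digits.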

        Greg
-- 
Greg Ward - software developer                [EMAIL PROTECTED]
MEMS Exchange                            http://www.mems-exchange.org

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
