Here's my attempt at improving the LINE_OF_YELLING rule.  First I changed it 
from a rawbody rule to a body rule.  I'm not sure why it was a rawbody rule 
in the first place, since rawbody text still contains HTML markup, undecoded 
text, and such.  Then I changed it from a regular expression to an eval test.  
I strip everything but letters from each line of the body, then grep for 
lines of 20 or more characters that consist solely of uppercase letters.  
This gets rid of false positives, like a line made up entirely of periods, 
and also lets you count the number of yelling lines.  I then added two extra 
rules for messages with at least two lines of yelling and at least three 
lines of yelling.

Following are the new rules, and then the code.  In the rules files, I 
changed the name "LINE_OF_YELLING" to "LINES_OF_YELLING", and changed 
"rawbody" to "body" for the rule.
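
The renamed base rule itself is not included in the listing that follows.  
Assuming it uses the same eval syntax as the two new rules and calls 
check_for_yelling, it would presumably look something like this (the 
description text here is a placeholder, not copied from the actual rules 
file):

```
body     LINES_OF_YELLING       eval:check_for_yelling()
describe LINES_OF_YELLING       A WHOLE LINE OF YELLING DETECTED
```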

---------
body     LINES_OF_YELLING_2     eval:check_for_num_yelling_lines("2")
describe LINES_OF_YELLING_2     2 WHOLE LINES OF YELLING DETECTED

body     LINES_OF_YELLING_3     eval:check_for_num_yelling_lines("3")
describe LINES_OF_YELLING_3     3 WHOLE LINES OF YELLING DETECTED

score    LINES_OF_YELLING_2     1
score    LINES_OF_YELLING_3     1
---------
sub check_for_yelling {
  my ($self, $body) = @_;

  # Make local copy of body.
  my @lines = @{$body};

  # Get rid of everything but upper AND lower case letters
  s/[^A-Za-z]//g for @lines;

  # Only letters remain now, so count the lines that are
  # 1) All upper case
  # 2) 20 or more characters in length
  my $num_lines = scalar grep(/^[A-Z]{20,}$/, @lines);

  $self->{num_yelling_lines} = $num_lines;

  return ($num_lines > 0);
}

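sub check_for_num_yelling_lines {
  my ($self, $body, $threshold) = @_;

  # Do the count here if check_for_yelling has not run yet.
  $self->check_for_yelling($body) unless defined $self->{num_yelling_lines};

  return ($self->{num_yelling_lines} >= $threshold);
}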

---------
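
If you want to try the counting logic outside of SpamAssassin, here is a 
minimal standalone sketch.  The helper name count_yelling_lines and the 
sample lines are made up for illustration; only the strip-then-grep step is 
taken from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SpamAssassin: count the lines that,
# once stripped down to letters, are 20+ characters of pure upper case.
sub count_yelling_lines {
  my @lines = @_;               # copy, so the caller's data is untouched
  s/[^A-Za-z]//g for @lines;    # keep letters only
  return scalar grep { /^[A-Z]{20,}$/ } @lines;
}

# Made-up sample body lines.
my @sample = (
  "BUY NOW!!! LIMITED TIME OFFER!!! ACT TODAY!!!",
  "........................................",
  "This line is mostly lower case, so it does not count.",
  "CLICK HERE",
);

print count_yelling_lines(@sample), "\n";
```

Only the first sample line counts.  The line of periods is empty after 
stripping, the third line has lower case letters, and "CLICK HERE" is 
shorter than 20 letters.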


-- 
Visit http://dmoz.org, the world's   | Give a man a match, and he'll be warm
largest human edited web directory.  | for a minute, but set him on fire, and
                                     | he'll be warm for the rest of his life.
[EMAIL PROTECTED]  ICQ: 132152059 |
