> -----Original Message-----
> From: Matt Kettler [mailto:[EMAIL PROTECTED]
> Sent: Monday, December 26, 2005 11:54 PM
> To: Mark R. London; users@spamassassin.apache.org
> Subject: Re: Testing for short message?
>
> At 08:47 AM 12/25/2005, Mark R. London wrote:
> >Has anyone come up with a way to test for short messages? I.e., I want
> >to test for short messages that only include a single URL. Thanks for
> >any help. - Mark
>
> You can test for message body-text length by doing a body
> rule such as this:
>
> body __SHORT_100 /.{100}/
>
> which will match any message of at least 100 characters of
> body text. Use it in negated form in a meta rule. Something like this:
>
> body __SHORT_100 /.{100}/
> uri __HAS_URI /./
>
> meta URI_SHORT_MSG (__HAS_URI && ! __SHORT_100)
> score URI_SHORT_MSG 0.01
> describe URI_SHORT_MSG Highly experimental untested rule
>
> Note: please test this rule before assigning it any significant score.
>
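
To make the logic of the quoted rule concrete, here is a rough standalone Perl sketch that can be run outside SpamAssassin. The helper name looks_like_short_uri_msg and the http(s) pattern are my own assumptions, standing in for the uri rule and the meta. One thing it shows: `.` does not match a newline, so /.{100}/ really asks for a run of 100 characters on one line, not 100 characters in total.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SpamAssassin: mimics the meta
# (__HAS_URI && !__SHORT_100) on a plain string.
sub looks_like_short_uri_msg {
    my ($text, $limit) = @_;
    $limit = 100 unless defined $limit;

    # Crude stand-in for a uri rule (assumption: http/https links only).
    my $has_uri = ($text =~ m{https?://\S+}i) ? 1 : 0;

    # True when some single line holds at least $limit characters,
    # because '.' never matches "\n" without the /s modifier.
    my $long_run = ($text =~ /.{$limit}/) ? 1 : 0;

    return ($has_uri && !$long_run) ? 1 : 0;
}

# Short message with one link: flagged.
print looks_like_short_uri_msg("Check this out: http://example.com/"), "\n";

# 120 characters on one line plus a link: not flagged.
print looks_like_short_uri_msg(("x" x 120) . " http://example.com/"), "\n";

# 220 characters in total, but split into 10-character lines: still flagged.
print looks_like_short_uri_msg(("short line\n" x 20) . "http://example.com/"), "\n";
```

The third case is why the rule deserves testing on real mail before it gets a meaningful score: a long message made of short lines can still look "short" to this kind of test.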
FWIW, I wrote a function to check content sizes of URIs for uribl.com. I'm
sure you can do the same thing with email. The problem is that people often
write short emails. When it comes to spam URIs, webpage size is important
in many cases.

I just throw this into EvalTests.pm (I should probably pluginize it and
ifdef the rules):

  sub check_body_size {
    my ($self, $content, $min, $max, $type) = @_;

    return 0 if (!defined $min);
    $max = $min + 1 if (!defined $max);
    $type ||= 'body';

    # the total is cached per message and per type ('body' or 'rawbody')
    my $setting = $type . "_size";
    my $bytes;

    if (defined $self->{$setting}) {
      $bytes = $self->{$setting};
      # dbg("check_body_size: return cached $bytes $type bytes in $setting");
    } else {
      # start at 0 so that an empty body counts as 0 bytes
      $bytes = 0;
      foreach (@$content) { $bytes += length($_); }
      # dbg("check_body_size: return new $bytes $type bytes into $setting");
      $self->{$setting} = $bytes;
    }

    # match when $min <= size < $max
    return 1 if ($bytes >= $min && $bytes < $max);
    return 0;
  }

Then I define some rules, which I use in metas against some other, more
important tests.

  body    __TEXT_BYTES_10    eval:check_body_size(0,10,'body')
  body    __TEXT_BYTES_50    eval:check_body_size(10,50,'body')
  body    __TEXT_BYTES_100   eval:check_body_size(50,100,'body')
  body    __TEXT_BYTES_250   eval:check_body_size(100,250,'body')
  body    __TEXT_BYTES_500   eval:check_body_size(250,500,'body')

  rawbody __RAW_BYTES_10     eval:check_body_size(0,10,'rawbody')
  rawbody __RAW_BYTES_50     eval:check_body_size(10,50,'rawbody')
  rawbody __RAW_BYTES_100    eval:check_body_size(50,100,'rawbody')
  rawbody __RAW_BYTES_250    eval:check_body_size(100,250,'rawbody')
  rawbody __RAW_BYTES_500    eval:check_body_size(250,500,'rawbody')
  rawbody __RAW_BYTES_1000   eval:check_body_size(500,1000,'rawbody')
  rawbody __RAW_BYTES_1000P  eval:check_body_size(1000,256000,'rawbody')

Dallas
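
P.S. To check the bucket boundaries without a SpamAssassin install, something like the following standalone sketch should do. size_in_range is a hypothetical helper of mine, not SpamAssassin API; it applies the same "min <= size < max" test to an array of lines and leaves out the per-message cache.

```perl
use strict;
use warnings;

# Hypothetical standalone helper (not SpamAssassin API): the same
# half-open range test as check_body_size, minus the caching.
sub size_in_range {
    my ($content, $min, $max) = @_;
    return 0 unless defined $min;
    $max = $min + 1 unless defined $max;

    # length() counts characters; for plain ASCII that equals bytes.
    my $size = 0;
    $size += length($_) for @$content;

    return ($size >= $min && $size < $max) ? 1 : 0;
}

# 4 + 24 = 28 characters in total.
my @body = ("Hi,\n", "see http://example.com/\n");

# Walk the same buckets as the __TEXT_BYTES_* rules.
for my $bucket ([0, 10], [10, 50], [50, 100], [100, 250], [250, 500]) {
    my ($min, $max) = @$bucket;
    printf "%3d..%-3d %s\n", $min, $max,
        size_in_range(\@body, $min, $max) ? "match" : "-";
}
```

Because the upper bound is exclusive, a body of exactly 50 characters lands in the 50..100 bucket and not in 10..50, so each message hits at most one rule per type.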