> -----Original Message-----
> From: Matt Kettler [mailto:[EMAIL PROTECTED] 
> Sent: Monday, December 26, 2005 11:54 PM
> To: Mark R. London; users@spamassassin.apache.org
> Subject: Re: Testing for short message?
> 
> At 08:47 AM 12/25/2005, Mark R. London wrote:
> >Has anyone come up with a way to test for short messages?  
> I.e., I want 
> >to test for short messages that only include a single URL. 
> Thanks for 
> >any help. - Mark
> 
> 
> You can test for message body-text length by doing a body 
> rule such as this:
> 
> body    __SHORT_100     /.{100}/
> 
> which will match any message of at least 100 characters of 
> body text. Use it in negated form in a meta rule. Something like this:
> 
> body    __SHORT_100     /.{100}/
> uri     __HAS_URI       /./
> 
> meta URI_SHORT_MSG      (__HAS_URI && ! __SHORT_100)
> score URI_SHORT_MSG     0.01
> describe URI_SHORT_MSG  Highly experimental untested rule
> 
> Note: please test this rule before assigning it any significant score.
> 
> 

FWIW, I wrote a function to check content sizes of URIs for uribl.com.
I'm sure you can do the same thing with email. The problem is that people
often write short emails. When it comes to spam URIs, though, webpage size
is important in many cases. I just throw this into EvalTests.pm (I should
probably pluginize it and ifdef the rules):

sub check_body_size {
  my ($self,$content,$min,$max,$type) = @_;
  return 0 if (!defined $min);
  $max=$min+1 if (!defined $max);
  $type ||= 'body';
  my $setting = $type . "_size";
  my ($bytes);
  if (defined $self->{$setting}) {
    $bytes = $self->{$setting};
    # dbg("check_body_size: return cached $bytes $type bytes in $setting");
  }
  else {
    foreach (@$content) {  $bytes += length($_); }
    # dbg("check_body_size: return new $bytes $type bytes into $setting");
  }
  $self->{$setting} = $bytes;
  return 0 if (!defined $bytes || $bytes eq "");
  $bytes=~s/[^0-9]//g;
  return 1 if ($bytes >= $min && $bytes < $max);
  return 0;
}

Then I define some size-bucket rules, which I combine in meta rules with
other, more important tests.

body            __TEXT_BYTES_10   eval:check_body_size(0,10,'body')
body            __TEXT_BYTES_50   eval:check_body_size(10,50,'body')
body            __TEXT_BYTES_100  eval:check_body_size(50,100,'body')
body            __TEXT_BYTES_250  eval:check_body_size(100,250,'body')
body            __TEXT_BYTES_500  eval:check_body_size(250,500,'body')

rawbody         __RAW_BYTES_10    eval:check_body_size(0,10,'rawbody')
rawbody         __RAW_BYTES_50    eval:check_body_size(10,50,'rawbody')
rawbody         __RAW_BYTES_100   eval:check_body_size(50,100,'rawbody')
rawbody         __RAW_BYTES_250   eval:check_body_size(100,250,'rawbody')
rawbody         __RAW_BYTES_500   eval:check_body_size(250,500,'rawbody')
rawbody         __RAW_BYTES_1000  eval:check_body_size(500,1000,'rawbody')
rawbody         __RAW_BYTES_1000P eval:check_body_size(1000,256000,'rawbody')
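
As a rough, untested sketch of the kind of meta I mean, assuming the
__TEXT_BYTES_* rules above are loaded: the rule name URI_TINY_BODY and its
description are made up for illustration, and __HAS_URI is the same
catch-all uri rule Matt suggested. It fires when a message contains a URI
and less than 50 bytes of rendered body text. Note that check_body_size
returns 0 when no byte count is available, so a message with no body
content at all will not hit any bucket.

# Hypothetical example; URI_TINY_BODY is an illustrative name only.
uri             __HAS_URI         /./
meta            URI_TINY_BODY     (__HAS_URI && (__TEXT_BYTES_10 || __TEXT_BYTES_50))
score           URI_TINY_BODY     0.01
describe        URI_TINY_BODY     URI plus less than 50 bytes of body text

Keep the score nominal until you have run it against your own ham and
spam, since plenty of legitimate mail is short too.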


Dallas

