I've gotten lots of spam that consists of nothing but an attachment. To detect this, I've written two rawbody eval subroutines. The first checks whether the first part of a multipart mail has any non-blank lines, and returns true if it has none; this is meant to detect messages that are solely attachments with no actual message text. The second detects multipart mails containing content types other than "text/*" or "application/octet-stream". A mail with an attachment of some other MIME type (like "application/msword") is probably not spam, so this rule can be used to compensate for legitimate mails that are only a spreadsheet or whatnot.
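To make the second check concrete, here is a rough standalone sketch of the same filtering idea, runnable outside SpamAssassin. The helper name non_spamish_types and the sample body lines are made up for illustration; it only assumes the body arrives as a list of lines, as in a rawbody eval.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SpamAssassin: returns the Content-Type
# lines that are neither text/* nor application/octet-stream.
sub non_spamish_types {
    my (@lines) = @_;
    my @ctypes = grep { /^Content-Type:/i } @lines;
    return grep { !m{^Content-Type:\s*(?:text/|application/octet-stream)}i } @ctypes;
}

# Made-up sample body lines from a two-part message
my @body = (
    "--XYZ\n",
    "Content-Type: text/plain; charset=us-ascii\n",
    "\n",
    "--XYZ\n",
    "Content-Type: application/msword; name=\"report.doc\"\n",
    "Content-Transfer-Encoding: base64\n",
);

my @good = non_spamish_types(@body);
print scalar(@good), " non-spammish part(s)\n";
print "  $_" for @good;
```

Only the application/msword line survives the filter here, which is the case the compensating rule is meant to catch.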
Any critiques would be appreciated.

------------------------------
rawbody   ONLY_ATTACHMENTS  eval:check_for_only_attachments()
describe  ONLY_ATTACHMENTS  Only attachments, no text

rawbody   GOOD_ATTACHMENTS  eval:check_for_non_spamish_attachments()
describe  GOOD_ATTACHMENTS  Compensate for ONLY_ATTACHMENTS

score     ONLY_ATTACHMENTS  5.0
score     GOOD_ATTACHMENTS  -5.0
--------------------------

###########################################################################
# RAWBODY TESTS:
###########################################################################

sub check_for_non_spamish_attachments {
  my ($self, $body) = @_;

  my $ctype = $self->{msg}->get_header ('Content-Type');
  $ctype ||= '';

  if ($ctype !~ /boundary=".*"/) {
    # Not multipart, so no attachments
    return 0;
  }

  my @ctypes = grep(/^Content-Type:/i, @$body);

  # Are there any content types other than "text/*" or
  # "application/octet-stream"?
  @ctypes = grep(!/^Content-Type:\s*(text\/|application\/octet-stream)/i,
                 @ctypes);

  return (@ctypes > 0);
}

sub check_for_only_attachments {
  my ($self, $body) = @_;

  my $ctype = $self->{msg}->get_header ('Content-Type');
  $ctype ||= '';

  my $boundary;
  if ($ctype =~ /boundary="([^"]*)"/) {
    $boundary = $1;
  } else {
    # Not a multipart, no attachments
    return 0;
  }

  my $part_num = 0;

  for (my $i = 0; $i < @$body; $i++) {
    my $line = $body->[$i];

    next if ($line =~ /^This is a multipart MIME message/);
    next if ($line =~ /^\s*$/);

    # \Q...\E keeps regex metacharacters in the boundary from being
    # interpreted as part of the pattern.
    if ($line =~ /^--\Q$boundary\E\s*$/) {
      # First part is the non-attachment message.  If we reach the
      # second part without finding a non-blank line, then the
      # first part of the mail was blank.
      $part_num++;
      return 1 if ($part_num > 1);

      # Skip this part's MIME headers (and any continuation lines of
      # Content-Type), taking care not to run off the end of the body.
      if ($i + 1 < @$body && $body->[$i + 1] =~ /^Content-Type:/i) {
        $i++;
        $i++ while ($i + 1 < @$body && $body->[$i + 1] =~ /^\s/);
      }
      $i++ while ($i + 1 < @$body &&
                  $body->[$i + 1] =~
                    /^Content-(Transfer-Encoding|Disposition):/i);
      next;
    }

    # Found a non-blank line before the second multipart boundary
    return 0;
  }

  # Should we ever even get here?
  return 0;
}

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk