I've come up with a fix for the problem of headers not being found when
their names are in uppercase (i.e., TO: instead of To: and FROM: instead of
From:).

This problem affected mail from passport.com, as well as from other places.  

For example, here's scoring on the mail from passport.com before this fix:

SPAM: Content analysis details:   (6.46 hits, 5 required)
SPAM: Hit! (1 point)     Missing From: header
SPAM: Hit! (2.36 points) SUBJECT: header found
SPAM: Hit! (1 point)     Missing Date: header
SPAM: Hit! (0.7 points)  BODY: Contains a line >=199 characters long
SPAM: Hit! (0.7 points)  BODY: A WHOLE LINE OF YELLING DETECTED
SPAM: Hit! (0.7 points)  Missing To: header

And after:

SPAM: Content analysis details:   (-94.63 hits, 5 required)
SPAM: Hit! (0.5 points)  Subject has an exclamation mark
SPAM: Hit! (0.7 points)  BODY: Contains a line >=199 characters long
SPAM: Hit! (0.7 points)  BODY: A WHOLE LINE OF YELLING DETECTED
SPAM: Hit! (1.56 points) Contains phrases frequently found in spam
SPAM:                    [score:  10, hits: credit card, for more, mail]
SPAM:                    [address, more information, more than, reply]
SPAM:                    [this, thank you, the net, this mail, this]
SPAM:                    [message, web site, you for, you need, you not,]
SPAM:                    [you want, your credit, your mail, your]
SPAM:                    [privacy]
SPAM: Hit! (1.91 points) Date: is in the future or unparseable
SPAM: Hit! (-100 points) From: address is in the user's white-list

As you can see, it now finds the From:, To: and Subject: headers, as they're
now stored internally in proper case, and they get rewritten in mixed case.
Also, the whitelisting tests now work.

A few minor problems:

1) Is modifying headers in this way kosher per the RFCs?
2) This invalidates tests such as the "SUBJECT: all in caps" test.
3) Although simple, this reformatting would turn "REPLY-TO:" into
"Reply-to:" instead of "Reply-To:".  Is this a problem?
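
For 3), one option might be to capitalize each hyphen-separated part of the
name rather than only the first letter.  The following is only a rough
sketch: normalize_header_name is a hypothetical helper, not something that
exists in SpamAssassin, and it keeps the same guard as the patch below.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not part of SpamAssassin):
# capitalize each hyphen-separated part of a header name,
# e.g. "REPLY-TO" becomes "Reply-To".
sub normalize_header_name {
    my ($hdr) = @_;

    # Same guard as the patch: only touch names whose second character
    # is an uppercase letter, so "Message-ID" is left alone.
    return $hdr unless $hdr =~ /^.[A-Z]/;

    $hdr = lc $hdr;
    # Uppercase the first letter of the name and of each part after a "-".
    $hdr =~ s/(^|-)([a-z])/$1\u$2/g;
    return $hdr;
}

print normalize_header_name($_), "\n" for qw(REPLY-TO FROM Reply-To);
```

Note that names containing acronyms would still come out differently from
their usual spelling, e.g. "MIME-VERSION" would become "Mime-Version".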

A possible solution to 1) and 2) would be to keep a separate array of the
raw headers and use them when the message is rewritten, as well as for the
tests that check case on headers.
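
Roughly, the idea of keeping the raw headers could look like the sketch
below.  The @headers array and the add_header, get_header and
headers_as_string subs are made up for illustration; they are not
NoMailAudit's actual internals, and the address is a placeholder.

```perl
use strict;
use warnings;

# Hypothetical storage (an assumption, not NoMailAudit's real structure):
# each entry keeps the name as received next to a normalized lookup key.
my @headers;

sub add_header {
    my ($raw, $val) = @_;
    # Same normalization as the patch: "FROM" is looked up as "From".
    my $key = ($raw =~ /^.[A-Z]/) ? ucfirst(lc($raw)) : $raw;
    push @headers, { raw => $raw, key => $key, value => $val };
}

# Lookups (e.g. the whitelist tests) go through the normalized key.
sub get_header {
    my ($key) = @_;
    return map { $_->{value} } grep { $_->{key} eq $key } @headers;
}

# Rewriting the message, and tests that check header case, use raw names.
sub headers_as_string {
    return join '', map { "$_->{raw}: $_->{value}\n" } @headers;
}

add_header('FROM', 'someone@example.com');
add_header('SUBJECT', 'HELLO');
```

With this layout the message would go back out with "FROM:" and "SUBJECT:"
untouched, while get_header('From') still finds the address.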

Anyway, I've included the small patch.  Any comments?

Dan.

--- lib/Mail/SpamAssassin/NoMailAudit.pm~       Wed Jan 23 18:39:15 2002
+++ lib/Mail/SpamAssassin/NoMailAudit.pm        Fri Feb 15 10:26:04 2002
@@ -117,6 +117,9 @@
 
     } elsif (/^([^\x00-\x1f\x7f-\xff :]+): (.*)$/) {
       $hdr = $1; $val = $2;
+      if ($hdr =~ /^.[A-Z]/) {
+        $hdr = ucfirst lc $hdr;
+      }
       $val =~ s/\r+//gs;          # trim CRs, we don't want them
       $entry = $self->_get_or_create_header_object ($hdr);
       $entry->{original} = 1;

