On Thu, Mar 21, 2002 at 10:58:35AM -0500, Greg Ward wrote:
> I think "message id with no dot after the @" is worth detecting, but
> with a low positive score -- that sort of thing occurs depressingly
> often in real email too.

What I'm arguing is: don't replace the regexp of INVALID_MSGID with one that isn't actually checking for invalid message-ids. Adding a new test that looks for ".+@.+\..+" would be fine by me, since it's a new test that looks for something different.

Just for some stats, BTW: I ran a quick script against my current monthly spam archive:

  $ perl -nle 'BEGIN{$/="\n\nFrom "} next unless /^Message-Id/mi; /Message-Id:.*?<(.+)>/i;$_=$1; s/.+?@//; print; if (/\./){$with++}elsif (/\S/){$without++}else{$blank++}; $total++; END{print "$with/$without/$blank/$total/",($without/$total),"%/",($blank/$total),"%\n"}' spammers

This month so far comes back with:

  743/104/1/848/0.122641509433962%/0.00117924528301887%

(with a dot/without a dot/blank RHS/total/pct without/pct blank; the last two fields are really fractions of the total, since the script never multiplies by 100, so 0.1226 means 12.3%)

Now, my spam archive doesn't have many blanks (I'm surprised I have any; I think the one is the message with random 8-bit chars in it), since I filter things that aren't ".+@.+" at the SMTP level. So I did a quick grep on the mail logs for the past month:

  $ grep CheckMessageId maillog* | wc -l
  135

Assuming these mean "blanks" (some are completely invalid message-ids that look like dates, etc.), adding them to the one blank above makes the new stats:

  743/104/136/983/0.105798575788403%/0.138351983723296%

So of the 983 known and very likely spam messages, about 10.6% don't have a dot on the right-hand side, and 13.8% are invalid via ".+@.+".
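
For anyone who would rather not decode the one-liner, here is a rough Python sketch of the same bucketing and percentage arithmetic. The function names are made up for illustration, and it makes one assumption the Perl script does not: a message-id with no "@" at all is counted as "blank" (i.e. it fails ".+@.+"), whereas the one-liner would test the whole string for a dot.

```python
from collections import Counter


def classify_msgid(msgid):
    """Bucket a message-id by what follows the first '@'.

    Assumption (not in the original one-liner): an id with no '@'
    at all is counted as 'blank', i.e. it fails ".+@.+".
    """
    _, sep, rhs = msgid.partition("@")
    if not sep or not rhs.strip():
        return "blank"
    return "with_dot" if "." in rhs else "without_dot"


def tally(msgids):
    """Count how many message-ids fall into each bucket."""
    return Counter(classify_msgid(m) for m in msgids)


def summarize(counts):
    """Return the total and the real percentages (fraction * 100)."""
    total = sum(counts.values())
    return {
        "total": total,
        "without_dot_pct": 100.0 * counts["without_dot"] / total,
        "blank_pct": 100.0 * counts["blank"] / total,
    }


if __name__ == "__main__":
    # Counts taken from the combined stats above: 743/104/136.
    stats = summarize(Counter(with_dot=743, without_dot=104, blank=136))
    print(stats["total"],
          round(stats["without_dot_pct"], 1),
          round(stats["blank_pct"], 1))
```

Feeding it the combined counts reproduces the figures quoted above: 104 of 983 and 136 of 983.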