On Fri, 12 Dec 2003 01:38:16 -0500, Bryan Hoover <[EMAIL PROTECTED]>
posted to spamassassin-talk:
> [EMAIL PROTECTED] wrote:
>> For many of these, one can observe that the "user name" in the From:
>> header often also occurs in the Subject line. This could be a useful
>> rule pattern, although there are bound to be false positives, so the
>> score should be rather low.
>>
>> I don't know off-hand if there is a way to do this in SA currently.
>> I'd guess it would take a specialized eval: rule. Maybe it's not worth
>> the effort.
>
> Well, I decided it was worthwhile to play around with Perl. I managed
> to learn a little, and conjure up a bit -- it's supposed to find words
> that appear in both from:, and subject: fields:

Allow me to paraphrase the C code in Perl you wrote as actual Perl code ;^)

# I take it you are defining this as just a test case,
# and it would normally simply be the contents of the From: header?
my $from = "from:word1\*\&[EMAIL PROTECTED]";
$from =~ s/^From:\s*//i;
my @from_words = split (/\W+/, $from);

# same with Subject
my $subject = "subject:word1\*\&[EMAIL PROTECTED]";
$subject =~ s/^Subject:\s*//i;
my @subj_words = split (/\W+/, $subject);

# find words which are in both subject and from.
# As a simple optimization, copy @from_words into hash keys,
# and find words in @subj_words which are also defined in %from_words
######## TODO: canonicalize to lower case?
my %from_words = map { $_ => 1 } @from_words;
my @best_of_both = grep { defined $from_words{$_} } @subj_words;

print "Found in both From and Subject: \"",
    join ('", "', @best_of_both), "\"\n";

/* era */

-- 
The email address era at iki dot fi is heavily spam filtered. If you
want to reach me, see the contact information link on my home page at
<http://www.iki.fi/era/> instead.
Just for kicks, imagine what it's like to get 500 pieces of spam for
each wanted message.
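
One possible way to act on the "canonicalize to lower case" TODO is sketched below. This is only an illustration under a few assumptions: the helper name common_words is made up for this example, it is a standalone sub and not a SpamAssassin eval: rule, and the sample headers are invented. It lower-cases both values before comparing, drops the empty field that split can return, and reports each shared word once.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): return the words that
# occur in both header values, compared case-insensitively.
sub common_words {
    my ($from, $subject) = @_;

    # Lower-case first, then strip an optional leading header name.
    (my $f = lc $from)    =~ s/^from:\s*//;
    (my $s = lc $subject) =~ s/^subject:\s*//;

    # Index the From: words. grep { length } drops the empty leading
    # field split returns when the text starts with a non-word character.
    my %in_from = map { $_ => 1 } grep { length } split /\W+/, $f;

    # Keep each Subject: word that is also in From:, once, in order.
    my %seen;
    return grep { $in_from{$_} && !$seen{$_}++ } split /\W+/, $s;
}

# Invented sample headers, for illustration only.
my @common = common_words('From: "Viagra Deals" <deals@example.com>',
                          'Subject: DEALS on viagra, great deals');
print join(', ', @common), "\n";    # prints: deals, viagra
```

Note that the domain parts of the address ("example", "com") also end up in the From: word list, which is one more source of the false positives mentioned earlier in the thread.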