On Fri, 12 Dec 2003 01:38:16 -0500, Bryan Hoover <[EMAIL PROTECTED]>
posted to spamassassin-talk:
 > [EMAIL PROTECTED] wrote:
 >> For many of these, one can observe that the "user name" in the From:
 >> header often also occurs in the Subject line. This could be a useful
 >> rule pattern, although there are bound to be false positives, so the
 >> score should be rather low.
 >> I don't know off-hand if there is a way to do this in SA currently.
 >> I'd guess it would take a specialized eval: rule. Maybe it's not worth
 >> the effort.
 > Well, I decided it was worthwhile to play around with Perl.  I managed
 > to learn a little, and conjure up a bit -- it's supposed to find words
 > that appear in both from:, and subject: fields:

Allow me to paraphrase the C code in Perl you wrote as actual Perl code ;^)

# I take it you are defining this as just a test case,
# and it would normally simply be the contents of the From: header?

my $from = "from:word1\*\&[EMAIL PROTECTED]";
$from =~ s/^From:\s*//i;
my @from_words = split (/\W+/, $from);

# same with Subject

my $subject = "subject:word1\*\&[EMAIL PROTECTED]";
$subject =~ s/^Subject:\s*//i;
my @subj_words = split (/\W+/, $subject);

# find words which are in both subject and from.
# As a simple optimization, copy @from_words into hash keys,
# and find words in @subj_words which are also defined in %from_words
######## TODO: canonicalize to lower case?

my %from_words = map { $_ => 1 } @from_words;
my @best_of_both = grep { exists $from_words{$_} } @subj_words;

print "Found in both From and Subject: \"",
        join ('", "', @best_of_both), "\"\n";
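
If you wanted to act on the TODO above, a minimal sketch might look like
the following. It assumes case-insensitive matching is what you want, it
drops the empty field that split produces when a header value starts with
a non-word character, and it reports each shared word only once.
common_words is a name made up here, not anything in SpamAssassin, and
the sample headers are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): return the words that
# occur in both header values, compared case-insensitively.
sub common_words {
    my ($from, $subject) = @_;

    # Strip the header names, if present.
    $from    =~ s/^From:\s*//i;
    $subject =~ s/^Subject:\s*//i;

    # Lower-case, split on non-word characters, and drop empty fields.
    my %in_from = map { $_ => 1 } grep { length } split /\W+/, lc $from;

    # Keep each shared word once, in the order it appears in the Subject.
    my %seen;
    return grep { $in_from{$_} && !$seen{$_}++ }
           grep { length } split /\W+/, lc $subject;
}

# Made-up sample headers, for illustration only.
my @both = common_words(
    'From: "Word1 Sender" <word1@example.com>',
    'Subject: WORD1 has a message for word1',
);
print join(', ', @both), "\n";
```

Whether lower-casing helps or hurts the false positive rate is something
only a test run against real mail would tell you.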

/* era */

-- 
The email address era     the contact information   Just for kicks, imagine
at iki dot fi is heavily  link on my home page at   what it's like to get
spam filtered.  If you    <http://www.iki.fi/era/>  500 pieces of spam for
want to reach me, see     instead.                  each wanted message.


