Hi again Sam
[snip and paste...reordering your original post]
> So to restate the second part of my original request,
> Is there a method to modify the score as a function
> of the number of hits of the same rule?

Easier to answer this way.  Sorry, I wasn't feeling wordy yesterday and
thought the site would explain this.

There may be some other way to accomplish what you are asking, but the
rules I pointed you to do what you describe, in a roundabout way.  I
explained this in a much earlier post, but I'll do so again.

The set is written to catch a pattern of obfuscation, you're right.
When spammers include meaning<blahbitty blah blah>less tags in a spam
(either to disguise a spammy word or for some other goal), they
generally do so throughout the spam.  That gives you something to look
for other than a spammy word.  You can now look for many spammy
patterns, and since every rule that hits adds its own score to the
total, the set is, in essence, additive (though maybe not in the common
meaning of the word "additive" in the world of programming. I'm not a
programmer, so I could be talking out of my bum here).
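
To make the "additive" idea concrete, here is a toy Python sketch.  It
is not SpamAssassin itself and not the actual rules in the set: the rule
names, patterns, and scores are all made up for illustration, borrowing
the nonsense words from Sam's message.  The only thing it is meant to
show is that each rule that hits contributes its own score to the total.

```python
import re

# Hypothetical rules: (name, pattern, score). None of these come from the
# real rule set; they only mimic "word fragment, junk tag, word fragment".
RULES = [
    ("OBFU_FROBNOZ", re.compile(r"frobnoz<[a-z]+>digibbet", re.I), 1.5),
    ("OBFU_MUMBLE", re.compile(r"mumble<[a-z]+>nuts", re.I), 1.5),
]


def total_score(body):
    """Sum the scores of every rule whose pattern appears in the body."""
    return sum(
        (score for _name, pattern, score in RULES if pattern.search(body)),
        0.0,
    )


def hit_names(body):
    """Return the names of the rules that matched, in rule order."""
    return [name for name, pattern, _score in RULES if pattern.search(body)]


sample = "buy frobnoz<flibber>digibbet and mumble<frap>nuts today"
print(hit_names(sample), total_score(sample))
```

A message that trips more of these small rules ends up with a higher
total, which is the "edge it up towards the threshold" effect.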

More below...

> 
> From: "Jennifer Wheeler" <[EMAIL PROTECTED]>
> Date: Mon, 22 Dec 2003 15:01:25 -0500
> 
> >http://www.emtinc.net/spamhammers.htm
> 
> Indeed, yours was one of the places I *had* looked.
> Forgive me if I'm confused, but it seems that your
> rules are looking for a variety of tag patterns.
> E.g. frobnoz<flibber>digibbet and mumble<frap>nuts
> are two separate matches.
> 
> Did you find that a more general pattern missed too
> much spam or hit too much ham?

No, I never made a more general rule.  I saw a spam come through that
looked like an extremely blatant, in-your-face use of spammy lingo.  I
was all, "wtf..", and I looked in the source, and saw th<oariegh>at
t<wiouebhv>hey had broken it a<aoeribh>ll up with meaningless tags.
A temporary defeatist attitude took me to the couch to watch TV.  I
thought about how to catch those, and realized that writing rules to
catch the pattern would amount to the same thing as looking for a big
number of spammy words.  Just the occurrence of that tag bracketed by
word fragments is a spam flag.  When new spammy terms show up, you just
have to tell the computer how to read the new words.

If you don't like the set, write a general rule that looks for the
embedded tag with a random number of letters to the right and left of
the tag, bracketed by some sort of stopper to keep it from matching too
much, and give it a whopping score.  I just think it's better to edge
emails up towards spam thresholds with more rules to try and reduce
false positives.
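
A rough sketch of what such a general rule might look like, written as
a Python regex so it can be tried outside SpamAssassin.  The pattern,
the length limits, and the function name are my assumptions, not
anything from the set: a run of letters, a tag made only of lowercase
letters, another run of letters, with word boundaries as the "stopper".

```python
import re

# Assumed pattern: letters, a junk tag of 4-12 lowercase letters, letters.
# The \b anchors and the length limits act as the "stopper".
EMBEDDED_TAG = re.compile(r"\b[A-Za-z]{1,15}<[a-z]{4,12}>[A-Za-z]{1,15}\b")


def count_embedded_tags(text):
    """Count places where a letters-only tag sits inside a word."""
    return len(EMBEDDED_TAG.findall(text))


spam = "saw th<oariegh>at t<wiouebhv>hey had broken it a<aoeribh>ll up"
ham = "<p>Hello <b>world</b></p>"

print(count_embedded_tags(spam))
print(count_embedded_tags(ham))
```

Note that a legit tag jammed directly between two words (say
text<strong>bold) would also match this sketch, which is one more reason
to give a rule like this a modest score rather than a whopping one.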

> 
> I had originally considered / .+\<.*\>.* / ,but was
> concerned about inadvertently catching everything by
> accident.

Looking at that rule, I believe the second "." (the one between the
brackets) would also match a closing bracket, ">", so you might end up
hitting something that starts at a legit tag and then keeps going
through the rest of the text until it reaches the last ">" that still
lets the regex match.  Sorry I can't give a real-world example; that is
just a suspicion and I'm no regex pro.  Try it, give it a score of 0.1,
and see what it does hit.
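
For what it's worth, the suspicion can be checked in a few lines of
Python.  This is only a sketch outside SpamAssassin: the sample text and
the tighter <[^>]*> alternative are my own assumptions, and Python's
regex engine stands in for Perl's here.

```python
import re

# Sam's pattern, transcribed for Python (the brackets need no escaping).
SAM_PATTERN = re.compile(r" .+<.*>.* ")

# Just the bracket part, greedy as written, next to an assumed tighter
# alternative that refuses to cross a closing bracket.
GREEDY_TAG = re.compile(r"<.*>")
TIGHT_TAG = re.compile(r"<[^>]*>")

# Made-up ham: ordinary text with two ordinary tags.
ham = "Please read <b>this</b> note and <i>reply</i> soon. Thanks"


def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(first_match(GREEDY_TAG, ham))  # <b>this</b> note and <i>reply</i>
print(first_match(TIGHT_TAG, ham))   # <b>
print(SAM_PATTERN.search(ham) is not None)  # True: plain ham is a hit
```

So the greedy version runs from the first "<" to the last ">", and the
full pattern hits perfectly ordinary HTML mail.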

Hope I answered what you are asking.  It's early, so if not, I'll give
it another stab after a few Cokes.

Jennifer
> 
> I'm hoping that doing this without explicit text
> strings combined with additive scoring will be
> enough to get these auto-learned.

[snipped to above]
> 
> 
> Cheers!
> -sam
> 
> 
> 
> 
> -------------------------------------------------------
> This SF.net email is sponsored by: IBM Linux Tutorials.
> Become an expert in LINUX or just sharpen your skills.  Sign up for IBM's
> Free Linux Tutorials.  Learn everything from the bash shell to sys admin.
> Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk


