Hi again Sam

[snip and paste...reordering your original post]

> So to restate the second part of my original request,
> Is there a method to modify the score as a function
> of the number of hits of the same rule?
Easier to answer this way. Sorry, I wasn't feeling wordy yesterday and
thought the site would explain this. There may be some other way to
accomplish what you are asking, but the rules I pointed you to do what
you describe in a roundabout way. I explained this in a much earlier
post, but I'll do so again.

The set is written to catch a pattern of obfuscation, you're right. When
spammers include meaning<blahbitty blah blah>less tags in a spam (to
disguise a spammy word, or for some other goal), they generally do so
throughout the spam. That gives you something to look for other than a
spammy word. You can now look for many spammy patterns, and every
pattern that hits adds its own score, making the set, in essence,
additive. (Though maybe not in the common meaning of the word "additive"
in the world of programming... I'm not a programmer, so I could be
talking out of my bum here.)

More below...

>
> From: "Jennifer Wheeler" <[EMAIL PROTECTED]>
> Date: Mon, 22 Dec 2003 15:01:25 -0500
>
> >http://www.emtinc.net/spamhammers.htm
>
> Indeed, yours was one of the places I *had* looked.
> Forgive me if I'm confused, but it seems that your
> rules are looking for a variety of tag patterns.
> E.g. frobnoz<flibber>digibbet and mumble<frap>nuts
> are two separate matches.
>
> Did you find that a more general pattern missed too
> much spam or hit too much ham?

No, I never made a more general rule. I saw a spam come through that
looked like an extremely blatant, in-your-face use of spammy lingo. I
was all, "wtf..", and I looked in the source, and saw
th<oariegh>at t<wiouebhv>hey had broken it a<aoeribh>ll up with
meaningless tags. Temporary defeatist attitude took me to the couch to
watch TV. I thought about how to catch those, and realized that writing
rules to catch the pattern would do the same job as looking for a big
number of spammy words. Just the occurrence of that tag bracketed by
word fragments is a spam flag. When new spammy terms show up, you just
have to tell the computer how to read the new words.
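
For what it's worth, here is a rough sketch of the "score as a function
of the number of hits" idea, written as standalone Python rather than as
SpamAssassin rule syntax. The pattern, the `obfuscation_score` name, the
per-hit weight and the cap are all assumptions made up for illustration;
none of them come from the rule set linked above.

```python
import re

# Hypothetical pattern: word fragment, a tag made only of letters
# (4 or more, so short real tags like <b> or <br> are skipped),
# then another word fragment with no space in between.
EMBEDDED_TAG = re.compile(r"[A-Za-z]+<[A-Za-z]{4,}>[A-Za-z]+")


def obfuscation_score(body: str, per_hit: float = 0.5, cap: float = 4.0) -> float:
    """Return a score that grows with the number of embedded-tag hits.

    per_hit and cap are illustrative values, not tuned numbers.
    """
    hits = len(EMBEDDED_TAG.findall(body))
    # Cap the total so one long message cannot run the score up forever.
    return min(hits * per_hit, cap)


sample = "saw th<oariegh>at t<wiouebhv>hey had broken it a<aoeribh>ll up"
print(obfuscation_score(sample))
```

A real tag such as <font> or <span> jammed between two word fragments
would still hit this pattern, so treat it as something to test against
ham at a low weight, not as a finished rule.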
If you don't like the set, write a general rule that looks for the
embedded tag with a random number of letters to the right and left of
the tag, bracketed by some sort of stopper to keep it from matching too
much, and give it a whopping score. I just think it's better to edge
emails up towards spam thresholds with several rules rather than one
big one, to try to reduce false positives.

>
> I had originally considered / .+\<.*\>.* / ,but was
> concerned about inadvertently catching everything by
> accident.

Looking at that rule, I believe the second "." (the one between the
brackets) would also match a closing bracket ">", so you might end up
hitting something that matches a legit tag and then keeps matching
through the rest of the email until it reaches the end of the regex.
Sorry I can't give an example; that is just a suspicion and I'm no
regex pro. Try it, give it a score of .1, and see what it does hit.

Hope I answered what you are asking. It's early, so if not, I'll give
it another stab after a few cokes.

Jennifer

>
> I'm hoping that doing this without explicit text
> strings combined with additive scoring will be
> enough to get these auto-learned.

[snipped to above]

>
>
> Cheers!
> -sam
>
>
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: IBM Linux Tutorials.
> Become an expert in LINUX or just sharpen your skills. Sign up for IBM's
> Free Linux Tutorials. Learn everything from the bash shell to sys admin.
> Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
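
P.S. On the / .+\<.*\>.* / rule: the suspicion that it would hit
ordinary tags can be checked with a few lines of Python, whose `re`
module treats `\<` and `\>` as plain `<` and `>`. This is only a
sketch. The `TIGHT` pattern, the `hits` helper and both sample strings
are made up for illustration, and `TIGHT` is just one guess at a
"stopper", not a tested rule.

```python
import re

# The rule as quoted, including its leading and trailing spaces.
GREEDY = re.compile(r" .+\<.*\>.* ")

# Hypothetical tighter variant: letters must touch both sides of the
# tag, and the tag body may not contain another '<' or '>'.
TIGHT = re.compile(r"[A-Za-z]+<[^<>]*>[A-Za-z]+")


def hits(pattern: re.Pattern, text: str) -> bool:
    """True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


legit = "Please see <a href='x'>our site</a> for details today "
spam = "buy che<xyzzy>ap pi<qwert>lls now "

for name, text in (("legit", legit), ("spam", spam)):
    print(name, "greedy:", hits(GREEDY, text), "tight:", hits(TIGHT, text))
```

With these two samples the quoted rule matches both the legit link and
the obfuscated text, while the tighter variant matches only the
obfuscated one. Markup like un<b>believable</b> would still trip the
tighter variant, so starting at a score of .1 remains good advice.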