Hi Jennifer,

> -----Original Message-----
> From: Jennifer Wheeler
> Sent: Tuesday, October 14, 2003 12:27 PM
> To: 'Larry Gilson'
> Subject: RE: [SAtalk] Popcorn, Backhair, and Weeds
>
>
> Hi Larry!
>
> I read your message late yesterday, and had to digest it a
> little before responding.  My problem is this.  I was going
> to reply to it from work today, but remembered I started
> pulling mail from the server at home. :)  So I am replying
> using an earlier message you sent, and with no reference.  Apologies!

No apologies needed.  Thanks for the reply!

> The rules you're working on look good to me.  I have a couple
> questions though, I'm a little confused.  What score will you
> be giving the rules?  And are you just trying to reduce the
> set to one rule?  Or are these suggestions for additional
> rules to supplement the others?  I just would like a frame of
> reference when I think about them.

I am starting by using 2 points per test.  My original goal was to shorten
the tests into fewer tests, but I think I found a way to shorten the tests
into one test - bonus. :)  I have changed the test since my message.  I had

  / \w{1,7}<\/?[\w\W]{0,150}>\w{1,7}/

This created some false positives because [\w\W] matches every character,
'<' and '>' included, so it would literally catch anything between the
first word and the last.  This means it would skip over other legitimate
tags until the test matched '>word'.  This was not good.  So I changed it to:

  / \w{1,7}<\/?[^<>]{0,150}>\w{1,7}/

This one seems to be working well so far.  It will catch any normal and
funky stuff within the tags but makes sure it will not run over any
subsequent tags.

The second rule:

  /<!?-?-? ?\w{7,} ?-?-?>/

is just pattern matching and really reinforces the above test in a subset
of the spam messages that the above will match.

The last test with the consonants is being developed to get the random
keys (or whatever they are).  An example is shown below.

> I was confused because the rules actually cover some of the
> things you said you were working on. (obfu tags in words
> with line breaks etc...) , they just don't get the junk in
> the tag.  Those rules match the pattern created by obfu tags
> in words, and add up based on the patterns (since they have
> removed all the hits on spammy words and spammy terms.) So if
> you come up with some great little obfu rules... coo :)  I
> am using a combo of fred's and chris' now and they hit nicely
> with these sets.  The one you're working on looked promising
> to me.  Can you shorten the consonant set to [^aeiou]  = all
> but aeiou ?
> or would that hit numbers too?

I do want the digits, but I think [^aeiou] would open it up to non-word
characters as well.  I really only want the (4+ consonants and digits) then
(one to two vowels) then (3+ consonants and digits).  The test is working
great on the random keys.  For example:

  47E20DDF-4BD4E3B8-6B3492E2-3CC8345F-1F74D6DD
  qdg22k2vn5
  1l43lf3z7ehr
  vckwoh3h9zm643

The downside is that it also catches on some URLs, fake Message-Ids, and
MTA transaction (?) IDs.  I can get rid of the last two by doing a rawbody
test instead of full, which I will do.

> I do have a rule that I'm testing that covers the smaller
> <obfu> tags.  I'll letcha know, but it seems to be working.
> I'll just give that a hairy score to supplement the others.

Please do let me know!  I will look forward to it.

> Operating on less than 1 Coke so that may have made little to
> no sense!

You did that on one coke?  You're good! ;)

Thanks Jennifer!

--Larry
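
For anyone following along, here is a rough sketch of the two points made in this message: why swapping [\w\W] for [^<>] stops the first test from running across legitimate tags, and why [^aeiou] would let non-word characters into the consonant test. It uses Python's re module as a stand-in for the Perl regex engine, which is an assumption of convenience, not how SpamAssassin runs rules. The sample strings are made up, and the explicit consonant-and-digit class (with 'y' counted as a consonant) is only one guess at a suitable class, not the actual rule under development.

```python
import re

# The two versions of the first test, copied from the message. Python's re
# module is used here only as a stand-in for the Perl regex engine.
OLD_TEST = re.compile(r" \w{1,7}<\/?[\w\W]{0,150}>\w{1,7}")
NEW_TEST = re.compile(r" \w{1,7}<\/?[^<>]{0,150}>\w{1,7}")

# The second rule from the message.
SECOND_RULE = re.compile(r"<!?-?-? ?\w{7,} ?-?-?>")

# Hypothetical samples, not taken from real mail.
legit = 'Please click<br> <a href="x">here</a> now'
obfuscated = "Buy Vi<zqxjkwv>agra today"

# [\w\W] matches every character, '<' and '>' included, so the old test
# runs across "<br>" and the opening "<a ...>" until it reaches ">here".
old_hit = OLD_TEST.search(legit)
print("old test on legit HTML:", old_hit.group(0) if old_hit else None)

# [^<>] cannot cross a tag boundary, so the same text no longer matches.
print("new test on legit HTML:", NEW_TEST.search(legit))

# Both versions still catch a junk tag dropped into the middle of a word.
print("old test on obfuscated:", bool(OLD_TEST.search(obfuscated)))
print("new test on obfuscated:", bool(NEW_TEST.search(obfuscated)))
print("second rule on obfuscated:", bool(SECOND_RULE.search(obfuscated)))

# Character classes for the consonant test. The explicit class below is an
# assumption: it treats 'y' as a consonant and adds the digits.
NOT_VOWEL = re.compile(r"[^aeiou]", re.IGNORECASE)
CONSONANT_OR_DIGIT = re.compile(r"[b-df-hj-np-tv-z0-9]", re.IGNORECASE)

# Columns: character, matched by [^aeiou], matched by the explicit class.
for ch in ["k", "7", "-", " ", "/"]:
    print(repr(ch),
          bool(NOT_VOWEL.fullmatch(ch)),
          bool(CONSONANT_OR_DIGIT.fullmatch(ch)))
```

The negated class accepts '-', ' ' and '/' along with consonants and digits, while the explicit class accepts only letters and digits, which is the behaviour described above as the goal.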