Hi Larry, (I added RD since this has turned into rule discussion. Hope that is ok)
Well, we might have inadvertently merged popcorn and backhair.

You gave two suggestions, and because I've been so busy, I didn't get around to testing right away. Yesterday I started testing your consolidation, but I used the wrong one. You corrected your second rule when you were showing me an easier way to type the expression I had written (when you moved the "!" outside the set):

/[>\s]\w{1}<![\w\s\$&!-]{0,150}>\w{1}\W/

But when I was going through emails to find the rule to test, I accidentally tested your first edit, which made the "!" optional:

/[>\s]\w{1}<[\w\s\$&!-]{0,150}>\w{1}\W/

(I gave it a score of zero and kept running the other two sets, popcorn & backhair, alongside it.)

In every email I've checked today, that rule is matching rule for rule with popcorn, and the same with backhair. I haven't seen any false positives, so it could very well be that this will work.

I think the thing that is keeping it safe is that it looks for an html tag bracketed by characters, rather than tags bracketing words, which is what normal html should look like. The ends of the expression are basically safeguard stoppers. Can anyone think of a case where this would not be so? Offhand I couldn't. Even if they didn't use a closing tag, as in " <li>hey ", that wouldn't match, because the rule needs a single word character directly on each side of the "<...>", and here there is a space before it and a whole word after it.

I can't believe, though, that there isn't a case where this would hit wrong. If it did, maybe it would only hit once, and not be dangerous with a score of just one. (I set up the same set as popcorn, 11-57.)

The best thing would be to run some known ham and spam just to be sure, but if nobody wants to do that (as I don't know how!! Noob!) I'll just keep watching this and let you know how it goes. Have you done any testing? I'm not going to make a change on the distribution page unless and until I'm more certain of the results.

> Notwithstanding, the '\w{1}' can be changed to '\w'. Correct?

Yeah, it could be, but I like to see tidiness and 'sameness'.
Call me OCD. :)

Thanks!
Jennifer

> -----Original Message-----
> From: Keith C. Ivey
> Sent: Wednesday, October 15, 2003 8:21 PM
> To: Larry Gilson
> Subject: RE: [SAtalk] Popcorn, Backhair, and Weeds
>
> Larry Gilson <[EMAIL PROTECTED]> wrote:
>
> > /[>\s]\w{1}<\!-?-?[\w\s\$&]{0,150}\!?-?-?>\w{1}\W/
> >
> > Keep adding characters as needed. Additionally, since the
> > script tags characters inside '< >' are optional, you could
> > reduce the complexity of the rule to:
> >
> > # I don't think '!' needs escaping - true/false?
> > /[>\s]\w{1}<[\w\s\$&!-]{0,150}>\w{1}\W/
>
> That's a very significant change, because you're no longer
> requiring a '!' after the '<'. The earlier pattern isn't going
> to match any tags, just comments and comment-like things (many
> of them broken HTML). Without the '!', the chance of false
> positives will be much higher (though I'm sure it will also
> match more spam).

Oops, my mistake. I thought the original pattern was '<\!?'. So it could be revised to:

/[>\s]\w{1}<![\w\s\$&!-]{0,150}>\w{1}\W/

Notwithstanding, the '\w{1}' can be changed to '\w'. Correct?

--Larry

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
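
For anyone who wants to poke at the two expressions from this thread without a full ham/spam corpus, here is a rough Python sketch. The two patterns are copied from the messages above. Everything else is an assumption made for illustration: the names STRICT, LOOSE, SAMPLES and check are made up, and the sample strings are invented, not taken from real mail. It only runs the regexes against plain strings and says nothing about how SpamAssassin itself feeds message text to a rule, so treat it as a quick sanity check rather than a substitute for a real corpus run.

```python
import re

# Patterns copied from the thread. Python's re module accepts this syntax as-is.
STRICT = re.compile(r'[>\s]\w{1}<![\w\s\$&!-]{0,150}>\w{1}\W')  # "!" required after "<"
LOOSE = re.compile(r'[>\s]\w{1}<[\w\s\$&!-]{0,150}>\w{1}\W')    # "!" optional (in the set)

# Hypothetical samples, invented for illustration only.
SAMPLES = {
    "comment-split word": "Get v<!-- xyz -->i agra today",
    "subscript in ham": "Water is H<sub>2</sub>O.",
    "unclosed list item": "list: <li>hey there",
    "ordinary markup": "<p>Hello <b>world</b></p>",
}


def check(text):
    """Return (strict_hit, loose_hit) for one piece of text."""
    return bool(STRICT.search(text)), bool(LOOSE.search(text))


if __name__ == "__main__":
    for label, text in SAMPLES.items():
        strict_hit, loose_hit = check(text)
        print(f"{label:20} strict={strict_hit!s:5} loose={loose_hit!s:5} {text!r}")
```

On these made-up strings, both patterns hit the comment-split word, neither hits the " <li>hey " style case or ordinary markup, and only the version with the optional "!" hits the subscript sample (a lone letter, a short tag, then a lone digit). That last one is the kind of thing Keith's warning about higher false positives would cover, though one invented string is no measure of how often it happens in real ham.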