On Thu, 8 Mar 2007 10:23:05 +0100, [EMAIL PROTECTED] wrote:

>On Thu, 8 Mar 2007, [EMAIL PROTECTED] wrote:
>
>> I searched the list and found this rule to catch URL with single space
>> (www.ledrx .com). Please help me in modifying this rule to catch URL
>> with double space (www.superveils . com).
>>
>> body URL_WITH_SPACE m/\bhttp:\/\/[a-z0-9\-.]+[!*%&, -]+\.?com\b/
>
>Personally I would make it something like this:
>
># Handles www. a.com, www.a .com, www. a .com, www . a.com, ...
>body __URL_WITH_SPACE1 /www[\ ]+?\.([a-z0-9\-]?\ [a-z0-9\-]?)+\.[ ]+?(com|net|org)/
># Handles www .xxx.com
>body __URL_WITH_SPACE2 /www[\ ]+\.([a-z0-9\-\ ]?)+\.[\ ]+?(com|net|org)/
># Handles www.xxx. com
>body __URL_WITH_SPACE3 /www[\ ]+?\.([a-z0-9\-\ ]?)+\.[\ ]+(com|net|org)/
>
>meta URL_WITH_SPACE ( __URL_WITH_SPACE1 || __URL_WITH_SPACE2 || __URL_WITH_SPACE3 )
>describe Body contains an URL with a space
>score URL_WITH_SPACE xx
>
>I did a few quick tests against some URL's, though it's untested against
>my ham & spam boxes :-)
>
>K.

That looks like a much better solution. It will be interesting to see how it runs. Though perhaps including the obfuscation characters that have been floating about would make it more useful?

Nigel
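
For anyone who wants to try the obfuscation-character idea before touching a live ruleset, here is a rough Python sketch, not a SpamAssassin rule. It assumes the separator set is a space plus the characters from the original poster's rule (`!*%&,`), with `-` left out because it is legal inside a hostname. The names `SEP_CHARS`, `LOOSE_URL` and `has_obfuscated_url` are made up for illustration, and it only handles the simple `www.label.tld` shape.

```python
import re

# Assumed separator set: space plus the characters from the original
# poster's rule, minus "-" (which is valid inside a hostname label).
SEP_CHARS = " !*%&,"
SEP = "[" + re.escape(SEP_CHARS) + "]"

# Loose match: "www", a dot, one label, a dot, then com/net/org,
# with any number of separator characters allowed around each dot.
LOOSE_URL = re.compile(
    rf"\bwww{SEP}*\.{SEP}*[a-z0-9-]+{SEP}*\.{SEP}*(?:com|net|org)\b",
    re.IGNORECASE,
)

def has_obfuscated_url(text):
    """Return True if text holds a www URL containing a separator character."""
    for match in LOOSE_URL.finditer(text):
        # A clean URL also matches the loose pattern, so only flag
        # matches that actually contain one of the separator characters.
        if any(ch in SEP_CHARS for ch in match.group(0)):
            return True
    return False

if __name__ == "__main__":
    for sample in ("www.ledrx .com", "www.superveils . com", "www.example.com"):
        print(sample, "->", has_obfuscated_url(sample))
```

The two-step check (loose match, then look for a separator inside the match) is only there so that ordinary URLs are not flagged. Whether the same character set behaves well as a real body rule would still need a run against ham and spam corpora.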