You should be able to check against the img src content, right?
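One way to read this suggestion, as a minimal sketch in Python rather than a tested SpamAssassin rule: capture the host from the href, then use the same "(?!\1)" look-ahead against the src of an image that opens the link body. The pattern, the name image_link_host_mismatch, and the sample hosts are all hypothetical and appear in no existing rule set.

```python
import re

# Hypothetical pattern (an assumption, not an existing SpamAssassin rule):
# capture the href host, then require an <img> as the first element inside
# the link whose src host does NOT start with that captured host.
IMG_HOST_MISMATCH = re.compile(
    r'<a\s[^>]*\bhref="https?://([^/">]+)[^>]*>\s*'
    r'<img\s[^>]*\bsrc="https?://(?!\1)',
    re.IGNORECASE,
)


def image_link_host_mismatch(html: str) -> bool:
    """Return True when an image link's src host does not start with the href host."""
    return IMG_HOST_MISMATCH.search(html) is not None


# Made-up sample markup for illustration only.
spoofed = ('<a href="http://phish.example.net/login">'
           '<img src="https://www.paypal.com/btn.gif"></a>')
genuine = ('<a href="https://www.paypal.com/pay">'
           '<img src="https://www.paypal.com/btn.gif"></a>')

print(image_link_host_mismatch(spoofed))
print(image_link_host_mismatch(genuine))
```

This is only a prefix comparison, and legitimate mail often serves images from a different host than the link target, so a check like this would probably need to be a low-scoring or meta sub-rule rather than a rule on its own.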
2011/10/14 Christian Grunfeld <christian.grunf...@gmail.com>:
> and what about when there is no anchor text in the link ? eg. paypal
> image button
>
>
> 2011/10/14 <dar...@chaosreigns.com>:
>> Existing rule:
>>
>> rawbody __SPOOFED_URL m/<a\s[^>]{0,2048}\bhref=(?:3D)?.?(https?:[^>"'\#
>> ]{8,29}[^>"'\#
>> :\/?&=])[^>]{0,2048}>(?:[^<]{0,1024}<(?!\/a)[^>]{1,1024}>){0,99}\s{0,10}(?!\1)https?[^\w<]{1,3}[^<]{5}/i
>>
>>
>> How about this, to only check for a changed domain part instead?
>>
>> rawbody SPOOFED_URL_DOMAIN
>> /<a\s[^>]{0,2048}\bhref=(?:3D)?.?(https?:\/\/?[^\/>"'\#
>> ]{8,29})[^>]{0,2048}>(?:[^<]{0,1024}<(?!\/a)[^>]{1,1024}>){0,99}\s{0,10}(?!\1)https?[^\w<]{1,3}[^<]{5}/i
>>
>> It matches this:
>>
>> <a href="http://www.chaosreigns.com/">http://www.example.com</a>
>>
>> But does not match this (example from actual non-spam):
>>
>> <a
>> href="http://www.jr.com/tracking?ord_q_num=105725494&ord_q_zip=03076">http://www.jr.com/tracking</a>
>>
>>
>> A very simplified form of this new one:
>>
>> rawbody SPOOFED_URL_DOMAIN /<a href="(https?:\/\/[^\/">]+)[^>]*>(?!\1)http/i
>>
>> That "(?!\1)" bit is nice and fancy. It means "not what was in the first
>> set of parentheses). In the perlre man page: "A zero-width negative
>> look-ahead assertion."
>>
>> --
>> "Every normal man must be tempted at times to spit upon his hands,
>> hoist the black flag, and begin slitting throats."
>> - Henry Louis Mencken (1880-1956)
>> http://www.ChaosReigns.com
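For anyone who wants to try the "(?!\1)" behaviour outside SpamAssassin, here is a minimal sketch that runs the very simplified rule quoted above against the two sample links from that message. It uses Python's re module as a stand-in for the Perl engine, which is an assumption of this sketch. The "\/" escapes are dropped because Python has no regex delimiters, and the name looks_spoofed is hypothetical.

```python
import re

# The simplified SPOOFED_URL_DOMAIN pattern from the quoted message,
# transcribed for Python's re module (no /.../ delimiters, so "/" is unescaped).
SIMPLIFIED = re.compile(
    r'<a href="(https?://[^/">]+)[^>]*>(?!\1)http',
    re.IGNORECASE,
)


def looks_spoofed(html: str) -> bool:
    """Return True when the visible URL does not start with the href's scheme and host."""
    return SIMPLIFIED.search(html) is not None


# The two examples given in the quoted message.
changed_domain = '<a href="http://www.chaosreigns.com/">http://www.example.com</a>'
same_domain = ('<a href="http://www.jr.com/tracking?ord_q_num=105725494'
               '&ord_q_zip=03076">http://www.jr.com/tracking</a>')

print(looks_spoofed(changed_domain))  # True
print(looks_spoofed(same_domain))     # False
```

The second link is rejected even after backtracking: every shorter capture for group 1 is still a prefix of the visible URL, so the look-ahead fails at each attempt.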