Existing rule:
rawbody __SPOOFED_URL m/<a\s[^>]{0,2048}\bhref=(?:3D)?.?(https?:[^>"'\# ]{8,29}[^>"'\# :\/?&=])[^>]{0,2048}>(?:[^<]{0,1024}<(?!\/a)[^>]{1,1024}>){0,99}\s{0,10}(?!\1)https?[^\w<]{1,3}[^<]{5}/i
How about this, to only check for a changed domain part instead?
rawbody SPOOFED_URL_DOMAIN /<a\s[^>]{0,2048}\bhref=(?:3D)?.?(https?:\/\/?[^\/>"'\# ]{8,29})[^>]{0,2048}>(?:[^<]{0,1024}<(?!\/a)[^>]{1,1024}>){0,99}\s{0,10}(?!\1)https?[^\w<]{1,3}[^<]{5}/i
It matches this:
<a href="http://www.chaosreigns.com/">http://www.example.com</a>
But does not match this (example from actual non-spam):
<a href="http://www.jr.com/tracking?ord_q_num=105725494&ord_q_zip=03076">http://www.jr.com/tracking</a>
A very simplified form of this new one:
rawbody SPOOFED_URL_DOMAIN /<a href="(https?:\/\/[^\/">]+)[^>]*>(?!\1)http/i
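That "(?!\1)" bit is nice and fancy. It means "not followed by what was
captured in the first set of parentheses", so the rule only fires when
the visible link text does not start with the scheme and host from the
href. In the perlre man page: "A zero-width negative look-ahead
assertion." Here it wraps the \1 backreference.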
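
To see the simplified rule behave as described, here is a minimal sketch
that runs it against the two sample links above. It uses Python's re
module rather than SpamAssassin or Perl, on the assumption that the
two engines treat this particular pattern the same way. The names
SIMPLE_SPOOF and looks_spoofed are made up for the illustration.

```python
import re

# Python translation of the simplified rule (an assumption: not run
# through SpamAssassin itself). re.IGNORECASE stands in for Perl's /i.
# Group 1 captures the scheme and host from the href; (?!\1) then
# refuses to match if the link text starts with that same string.
SIMPLE_SPOOF = re.compile(
    r'<a href="(https?://[^/">]+)[^>]*>(?!\1)http',
    re.IGNORECASE,
)


def looks_spoofed(html: str) -> bool:
    """Return True if the link text shows a different host than the href."""
    return SIMPLE_SPOOF.search(html) is not None


# The two sample links from this message.
spoofed = '<a href="http://www.chaosreigns.com/">http://www.example.com</a>'
legit = (
    '<a href="http://www.jr.com/tracking?ord_q_num=105725494&ord_q_zip=03076">'
    'http://www.jr.com/tracking</a>'
)

print(looks_spoofed(spoofed))  # True
print(looks_spoofed(legit))    # False
```

Note that the capture can backtrack to a shorter prefix of the host, but
every such prefix is still a prefix of the link text in the second
sample, so the look-ahead keeps rejecting it.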
--
"Every normal man must be tempted at times to spit upon his hands,
hoist the black flag, and begin slitting throats."
- Henry Louis Mencken (1880-1956)
http://www.ChaosReigns.com