On 1/30/06, Theo Van Dinter <[EMAIL PROTECTED]> wrote:
> On Mon, Jan 30, 2006 at 11:48:17AM -0500, Dan wrote:
> > <a
> > href="http://123.123.123.123/fraud_uri">http://amazon.com/official_looking_path</a>
> > I can write a regexp that looks for an address in the <a> tag's body
> > that differs from the one in its href, but I figured someone else had
> > already written one. Can someone point me in the right direction?
>
> You can't do this in a regexp; you need to write some code. There's already
> the check_https_ip_mismatch() function, which looks for something similar to
> this. It turns out that href != anchor text is a pretty poor spam sign, since
> it happens in ham all the time.
I was thinking of a regexp along the lines of:
/href=\"https?:\/\/[0-9]{1,3}(\.[0-9]{1,3}){3}[^>]+>http:\/\/\w/i
It's not perfect, but it would detect the above scenario.
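
As a quick sanity check, here is a minimal sketch that runs the pattern
against the sample link. Python is used only for illustration, on the
assumption that the unescaped raw-string form behaves the same as the
/.../i version above. The names IP_HREF_WITH_URL_TEXT and
looks_like_ip_mismatch, and the "ham" sample, are made up for the example.

```python
import re

# Same pattern as above, written as a Python raw string (assumed equivalent
# to the Perl-style /.../i form; slashes and quotes need no escaping here).
IP_HREF_WITH_URL_TEXT = re.compile(
    r'href="https?://[0-9]{1,3}(\.[0-9]{1,3}){3}[^>]+>http://\w',
    re.IGNORECASE,
)

# The sample link from the original message.
phish = ('<a href="http://123.123.123.123/fraud_uri">'
         'http://amazon.com/official_looking_path</a>')

# A hypothetical legitimate link: the href uses a hostname, not a raw IP.
ham = '<a href="http://amazon.com/path">http://amazon.com/path</a>'


def looks_like_ip_mismatch(html: str) -> bool:
    """Return True if an href pointing at a dotted-quad IP is followed
    directly by anchor text that starts with http://."""
    return IP_HREF_WITH_URL_TEXT.search(html) is not None


print(looks_like_ip_mismatch(phish))
print(looks_like_ip_mismatch(ham))
```

This only covers double-quoted hrefs and anchor text that begins with
http://, so single-quoted attributes, https:// anchor text, or markup
between the tag and the visible URL would slip past it.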
What does check_https_ip_mismatch() do?
-Dan