On 1/30/06, Theo Van Dinter <[EMAIL PROTECTED]> wrote:
> On Mon, Jan 30, 2006 at 11:48:17AM -0500, Dan wrote:
> > <a href="http://123.123.123.123/fraud_uri">http://amazon.com/official_looking_path</a>
> > I can write a regexp that looks for an address in the <a> tag's body
> > that is different from the one in its href, but I figured someone else
> > had already written one.  Can someone point me in the right direction?
>
> You can't do this in a regexp; you need to write some code.  There's already
> the check_https_ip_mismatch() function which looks for something similar to
> this.  It turns out that href != anchor text is a pretty bad spam sign since
> it happens in ham all the time.

I was thinking of a regexp along the lines of:
/href=\"https?:\/\/[0-9]{1,3}(\.[0-9]{1,3}){3}[^>]+>http:\/\/\w/i

It's not perfect, but it would detect the above scenario.
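
To make the "not perfect" part concrete, here is a rough sketch that runs the pattern above against a few anchors. It is a Python transcription for quick testing only, not a SpamAssassin rule; the function name and the sample strings other than the original example are made up for illustration.

```python
import re

# Python transcription of the pattern above ("/" needs no escaping here).
IP_HREF_WITH_URL_TEXT = re.compile(
    r'href="https?://[0-9]{1,3}(\.[0-9]{1,3}){3}[^>]+>http://\w',
    re.IGNORECASE,
)


def looks_like_ip_href_mismatch(html: str) -> bool:
    """True if an href to a dotted-quad IP is followed directly by
    anchor text that starts with http://. Hypothetical helper name."""
    return IP_HREF_WITH_URL_TEXT.search(html) is not None


# The phishing-style anchor from the original question.
phish = ('<a href="http://123.123.123.123/fraud_uri">'
         'http://amazon.com/official_looking_path</a>')

# href and anchor text agree and use a hostname: should not fire.
honest = '<a href="http://amazon.com/x">http://amazon.com/x</a>'

# href and anchor text are the same IP URL: fires anyway, because the
# pattern never compares the two addresses (a false positive).
same_ip = '<a href="http://1.2.3.4/x">http://1.2.3.4/x</a>'

# Anchor text uses https://: missed, because the pattern only looks
# for http:// after the tag (a false negative).
https_text = '<a href="http://1.2.3.4/x">https://amazon.com/</a>'

for label, html in [("phish", phish), ("honest", honest),
                    ("same_ip", same_ip), ("https_text", https_text)]:
    print(label, looks_like_ip_href_mismatch(html))
```

So it catches the example, but it only tests "IP in href, URL-looking text right after the tag"; it never actually checks that the two addresses differ.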

What does check_https_ip_mismatch() do?

-Dan
