Re: uri rules

Matt Kettler Wed, 28 May 2008 06:31:27 -0700

Joseph Brennan wrote:


I was surprised that this rule...

 uri CU_CN_LINK      /http:..\w+\.cn\b/

matches not only this...

 <a href="http://foobar.cn">

but also this...

<a href="http://www.columbia.edu/foo.html">KooXoo Buys Kuxun.cn Domain</a>



First, I did not realize that SpamAssassin's idea of "uri" includes not
only the uri, but the start tag, end tag, and all in between.  That's
useful but not real clear in Mail::SpamAssassin::Conf.

Actually, it doesn't. Your second example has two URIs as far as SpamAssassin is concerned: "http://www.columbia.edu/foo.html" and "http://Kuxun.cn". Two separate URIs.

Since many email clients "auto-link" domains in text portions, like www.google.com, SpamAssassin tries to find text strings that clients will treat as URIs and uses them in the URI tests as well.


Second, I can't figure out how \w+ matches the punctuation and spaces!

It doesn't. :) The rule is matching the second URI, "http://Kuxun.cn", by itself.
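
A rough way to check this outside SpamAssassin is a small Python sketch. It is not SpamAssassin code: the `uris` list is written by hand to stand in for the two URIs described above, and `matching_uris` is a hypothetical helper. The pattern itself is the one from the CU_CN_LINK rule, which behaves the same way under Python's `re` module.

```python
import re

# The pattern from the CU_CN_LINK rule, unchanged.
CU_CN_LINK = re.compile(r"http:..\w+\.cn\b")

# Assumption: a hand-written stand-in for the two URIs described above.
# This list is NOT produced by SpamAssassin.
uris = [
    "http://www.columbia.edu/foo.html",
    "http://Kuxun.cn",
]


def matching_uris(pattern, candidates):
    """Return the candidates the pattern matches, testing each one on its own."""
    return [uri for uri in candidates if pattern.search(uri)]


hits = matching_uris(CU_CN_LINK, uris)
print(hits)

# For contrast: the same pattern against the whole tag as a single string.
# \w+ cannot cross the dots, slash, quote, or spaces, so nothing matches.
raw_html = '<a href="http://www.columbia.edu/foo.html">KooXoo Buys Kuxun.cn Domain</a>'
print(CU_CN_LINK.search(raw_html))
```

Only the second entry is a hit, and the pattern finds nothing in the full tag, so the match has to come from "http://Kuxun.cn" being tested as its own URI.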
