Re: FP: URI_NOVOWEL

mouss Wed, 20 Sep 2006 14:29:39 -0700

Theo Van Dinter wrote:

On Tue, Sep 19, 2006 at 10:58:46PM +0200, mouss wrote:

URI_NOVOWEL fires with things like href="#id" where id is a string that starts with 7 "no-vowel" chars.
uri URI_NOVOWEL             m%^https?://[^/?]*[bcdfghjklmnpqrstvwxz]{7}%i
uri URI_NOVOWEL             m%^https?://[^/?\#]*[bcdfghjklmnpqrstvwxz]{7}%i

is this correct?


That depends on your definition of "correct".  The RE looks ok, but the
hitrate could change dramatically.  It's hard to say without testing.

My understanding is that the rule looks for "dummy" hostnames in the server part. Unfortunately, because of the way URIs are "exposed" by SA, this rule also applies to anything that resembles a URI. This is a problem with relative URIs (e.g. href="foo.html" if foo matches the rule). [In the past, I have reported problems with things like LDAP strings, ... that were interpreted as URIs by SA and caught by some rules].
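To make the difference between the two patterns quoted earlier concrete, here is a minimal Python sketch. It is only an illustration: the sample URI strings and the helper names (fires_original, fires_proposed) are assumptions, and the strings are not a claim about how SA actually rewrites href="#id" internally. Python's re syntax is close enough to Perl's for these two expressions; the "#" needs no escaping inside a character class.

```python
import re

# Rule as currently shipped (first pattern quoted earlier in this message).
ORIGINAL = re.compile(r"^https?://[^/?]*[bcdfghjklmnpqrstvwxz]{7}", re.IGNORECASE)

# Proposed variant: also stop scanning at '#', so a fragment
# cannot be mistaken for part of the hostname.
PROPOSED = re.compile(r"^https?://[^/?#]*[bcdfghjklmnpqrstvwxz]{7}", re.IGNORECASE)


def fires_original(uri: str) -> bool:
    """Return True if the original pattern matches the URI string."""
    return ORIGINAL.search(uri) is not None


def fires_proposed(uri: str) -> bool:
    """Return True if the proposed pattern matches the URI string."""
    return PROPOSED.search(uri) is not None


if __name__ == "__main__":
    # Hypothetical sample strings, chosen only to exercise the patterns.
    samples = [
        "http://example.com#bcdfghjk",      # fragment directly after the host
        "http://www.xkzptrwq.com/",         # vowel-free label in the hostname
        "http://example.com/page#bcdfghjk"  # fragment after a path
    ]
    for uri in samples:
        print(uri, fires_original(uri), fires_proposed(uri))
```

If this sketch is right, the two patterns disagree only when a "#" appears before the first "/" or "?", which is the fragment case described above. How often that happens in real mail, and what it does to the hit rate, still needs a corpus test.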

In the present case, the FP occurred for a "silly" newsletter that I whitelisted (they trigger other rules, but I am not the recipient; otherwise I'd block 'em at SMTP time). So whether this is a real FP or not is debatable.

However, my understanding of the rule is that it was not designed to catch such relative URIs. If so, then it should be fixed; thus my question.

In other words, should we "fix" the rule because it catches things it was not designed to catch, or should we be happy that it detects spam it was not supposed to catch? This is a general question of course.

I personally tend to believe that when Bayes is used, "logical" rules should only catch what they were supposed to catch. And I do use Bayes (I disabled Bayes for two months to see the results, and while it was done on a single installation, the results were that Bayes is very helpful).
