On Thursday 04 May 2006 16:00, Magnus Holmgren wrote:

> uri URI_NO_WWW_INFO_CGI /^(?:https?:\/\/)?[^\/]+(?<!\/www)\.[^.]{7,}\.info\/(?=\S{15,})\S*\?/i
> 
> Let's see if I can get this straight...
> 
> (?:https?:\/\/)?   (optionally) "http://" or "https://", followed by
> [^\/]+             one or more of any characters except forward slash /
> (?<!\/www)         of which the last part is not "/www", followed by
> \.[^.]{7,}         a dot and at least 7 characters that are not dots, and
> \.info\/           ".info/"
> (?=\S{15,})        followed by at least 15 non-space characters (a lookahead,
>                    so nothing is consumed yet), and
> \S*\?              those same non-space characters matched again, up to a
>                    question mark, which has to be present.
> 
> So it should match e.g. "foo.hellothere.info/forum/viewtopic.php?p=1 " as
> well as "www.hellothere.info/forum/viewtopic.php?p=1 " and
> "http://www.foo.hellothere.info/forum/viewtopic.php?p=1 "
> 
> but not "http://www.hellothere.info/forum/viewtopic.php?p=1 ",
> "foo.hello.info/forum/viewtopic.php?p=1 ",
> "hellothere.info/forum/viewtopic.php?p=1 ",
> or "foo.hellothere.info/bar.php?p=1 ".

(The following is also in reply to Bowie Bailey's message. BTW, Bowie, your
mail client doesn't set a message reference, so threading is messed up.)

The real URLs are (why I didn't post them before, I don't know...):

http://studentwebzone.tc-online.info/forum2/viewtopic.php?p=47#47

http://studentwebzone.tc-online.info/forum2/viewtopic.php?t=17&unwatch=topic

If all the rule does is check for URIs of a certain form, then I would say
that this specific rule can backfire on completely legitimate mail.
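
To check this, here is a minimal sketch that runs the pattern from Magnus's
message against the two real URLs. It uses Python's re module rather than
SpamAssassin itself, so it assumes that Python's lookbehind and lookahead
behave like Perl's for this pattern and that the rule sees each URI exactly
as written. The names RULE, REAL_URLS and rule_hits are mine, not
SpamAssassin's.

```python
import re

# Pattern copied from the URI_NO_WWW_INFO_CGI rule quoted above.
# The trailing /i flag becomes re.IGNORECASE. This is only the regexp,
# not SpamAssassin's own URI extraction or scoring.
RULE = re.compile(
    r"^(?:https?:\/\/)?[^\/]+(?<!\/www)\.[^.]{7,}\.info\/(?=\S{15,})\S*\?",
    re.IGNORECASE,
)

# The two legitimate forum notification URLs from this message.
REAL_URLS = [
    "http://studentwebzone.tc-online.info/forum2/viewtopic.php?p=47#47",
    "http://studentwebzone.tc-online.info/forum2/viewtopic.php?t=17&unwatch=topic",
]


def rule_hits(uri: str) -> bool:
    """Return True if the rule's regexp matches the given URI string."""
    return RULE.search(uri) is not None


if __name__ == "__main__":
    for url in REAL_URLS:
        print(rule_hits(url), url)
```

Under those assumptions both URLs come back as hits: "studentwebzone" is
not "www", "tc-online" is more than 7 non-dot characters, and the path up to
the question mark is longer than 15 characters.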

Also, I know I can look up the rules (in /usr/share/spamassassin) myself, but I
got very confused by all the regexps. I also didn't know what to do with the
regexp result, but I now understand that the rule simply fires if it matches.

> 
>> I get false positives on mail that has URIs in the .info TLD in it. Like:
>>
>>         http://foo.hello.info/forum/viewtopic.php?p=1
>>
>> Does this rule mean that the webpage accessed by this URI is different than
>> the one accessed by:
>>
>>         http://far.hello.info/forum/viewtopic.php?p=1
> 
> It just means that someone has seen a lot of spam containing URIs of the
> previous form and that the mass-checks confirmed it.

I can't reconcile that with the rule's description, 'URI: CGI in .info TLD
other than third-level "www"'.

> You can always lower the score of any rule you feel misfires. 

I'm trying to help the forum owner keep his reply notifications from being
marked as spam, so what I do to my own config is irrelevant.
