On Fri, 2013-11-29 at 13:30 +1000, Nick Edwards wrote:
> Hi, have a problem with our internal uribl
> 
> urirhsbl        INT_URI uri.int.lan. A
> body            INT_URI eval:check_uridnsbl('INT_URI')
> describe        INT_URI Contains a URI listed in internal URIBL
> tflags          INT_URI net
> score           INT_URI 3

That's correct.

> this rule performs lookups if in normal text of body, however, if we
> have it inside html it does not lookup. eg
> 
> "hi see example.org"  looks up example.org
> but
> "hi see <a href="http://example.org">example.net</a>"
> it will lookup example.net, not example.org

How can you tell that SA does not look up the domain in the HTML anchor href?

The general SA method of verifying which domains are queried is to have
a look at the debug output. In your case, you can also check your local
DNSBL's logs.

  spamassassin -D uridnsbl  < msg

will limit the debug output to the URIDNSBL plugin, which would look
like this:

  dbg: uridnsbl: domain example.net in skip list
  dbg: uridnsbl: domain example.com in skip list
  dbg: uridnsbl: domains to query: anchor-text.net anchor-href.net
  dbg: uridnsbl: domain "anchor-text.net" listed (INT_URI): 127.0.0.2
  dbg: uridnsbl: domain "anchor-href.net" listed (INT_URI): 127.0.0.2

Note the placeholder domains as found in the HTML anchor href and parsed
from the text. The example.(net|com) domains you used are perfect for
the HTML sample snippet, but won't work for actual debugging, since they
are in the default skip list.
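
For a test message that avoids the skip list, something along these
lines could be used. It is only a sketch: anchor-href.net and
anchor-text.net are the same placeholders as in the debug output above,
the addresses and the file name msg-test are made up, and you would
substitute domains that are actually listed in your local DNSBL.

```shell
# Write a minimal HTML test message to ./msg-test (hypothetical file name).
# The two anchor-* domains are placeholders; replace them with domains
# that are listed in your local DNSBL.
cat > msg-test <<'EOF'
From: Test <test@int.lan>
To: Test <test@int.lan>
Subject: URIDNSBL anchor test
MIME-Version: 1.0
Content-Type: text/html; charset=us-ascii

<html><body>
<p>hi see <a href="http://anchor-href.net/">anchor-text.net</a></p>
</body></html>
EOF
```

Feeding that file to "spamassassin -D uridnsbl < msg-test" should then
show whether both the href domain and the anchor text domain are
queried.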

To see more of the URIDNSBL plugin activity, including which DNSBLs are
queried and what domains are looked up, you can use e.g.

  spamassassin -D  < msg  2>&1 | grep URI-DNSBL

To limit that to your local DNSBL, grep for DNSBL:uri.int.lan.
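
If you only want the list of domains SA decided to query, a small
filter over the debug output might do. This is a rough sketch: the
function name queried_domains is made up, and it assumes the
"domains to query:" debug line looks exactly like the sample shown
earlier.

```shell
# Hypothetical helper: read SA debug output on stdin and print every
# domain from the "uridnsbl: domains to query:" line(s), one per line.
queried_domains() {
  sed -n 's/^.*uridnsbl: domains to query: //p' |
    tr -s ' ' '\n' |   # one domain per line
    sed '/^$/d' |      # drop empty lines left by stray spaces
    sort -u
}
```

Usage would be "spamassassin -D uridnsbl < msg 2>&1 | queried_domains".
If the href domain shows up in that list, SA did query it.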


Note: The absence of a rule match for the second domain in the Report
header is NOT an indicator of a missing query. If more than one domain
is listed in the DNSBL, the urirhsbl rule will still be triggered only
once, showing one domain rather than all listed domains:

  X-Spam-Report:
    *  3.0 INT_URI Contains a URI listed in internal URIBL
    *      [URIs: example.net]

Despite the plural in the automatically added detail, it lists only one
domain. This is probably a bug in the URIDNSBL plugin, though it might
also be intended.

Since the DNSBL lookups are asynchronous, it is likely undefined which
listed domain will trigger the rule and be reported. That depends on
lookup time and the order in which domains are parsed from the message.
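
Because the report names only one domain, checking each domain against
the DNSBL by hand can confirm what is really listed. A rough sketch,
assuming the zone from the rule quoted at the top and that dig is
installed; the helper name rhsbl_name is made up:

```shell
# Hypothetical helper: build the DNS name for an RHSBL-style lookup,
# i.e. the domain prepended to the zone. A single trailing dot on
# either argument is stripped first, and the result is fully qualified.
rhsbl_name() {
  domain=${1%.}
  zone=${2%.}
  printf '%s.%s.\n' "$domain" "$zone"
}
```

Something like dig +short A "$(rhsbl_name anchor-href.net uri.int.lan.)"
should then print an address such as 127.0.0.2 for a listed domain and
nothing for an unlisted one.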


-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
