On Fri, 2013-11-29 at 13:30 +1000, Nick Edwards wrote:
> Hi, have a problem with our internal uribl
>
> urirhsbl INT_URI uri.int.lan. A
> body INT_URI eval:check_uridnsbl('INT_URI')
> describe INT_URI Contains a URI listed in internal URIBL
> tflags INT_URI net
> score INT_URI 3

That's correct.

> this rule performs lookups if in normal text of body, however, i we
> have inside html if does not lookup. eg
>
> "hi see example.org" looks up example.org
> but
> "hi see <a href="http://example.org">example.net</a>"
> it will lookup example.net, not example.org

How can you tell that SA does not look up the domain in the HTML anchor href?

The general SA method of verifying which domains are queried is to look at the debug output. In your case, you can also check your local DNSBL's logs.

  spamassassin -D uridnsbl < msg

will limit the debug output to the URIDNSBL plugin, which would look like this:

  dbg: uridnsbl: domain example.net in skip list
  dbg: uridnsbl: domain example.com in skip list
  dbg: uridnsbl: domains to query: anchor-text.net anchor-href.net
  dbg: uridnsbl: domain "anchor-text.net" listed (INT_URI): 127.0.0.2
  dbg: uridnsbl: domain "anchor-href.net" listed (INT_URI): 127.0.0.2

Note the placeholder domains as found in the HTML anchor href and parsed from the text. The example.(net|com) domains you used are perfect for the HTML sample snippet, but won't work for actual debugging, since they are in the default skip list.

To see more of the URIDNSBL plugin activity, including which DNSBLs are queried and what domains are looked up, you can use e.g.

  spamassassin -D < msg 2>&1 | grep URI-DNSBL

To limit that to your local DNSBL, grep for DNSBL:uri.int.lan.

Note: The absence of a rule match for the second domain in the Report header is NOT an indicator of a missing query. If more than one domain is listed in the DNSBL, the urirhsbl rule will still be triggered only once, showing one domain, not all listed domains:

  X-Spam-Report:
    * 3.0 INT_URI Contains a URI listed in internal URIBL
    *      [URIs: example.net]

Despite the plural in the automatically added detail, it lists only one domain. Probably a bug in the URIDNSBL plugin, though it might also be intended.

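If you need to compare the queried domains against the single domain shown in the report more than once, the uridnsbl debug lines shown earlier can be condensed with a small script. What follows is only a rough sketch: the function name summarize_uridnsbl is made up for illustration and is not part of SpamAssassin, and the patterns assume the debug lines look like the sample output in this mail. Other SA versions may word them differently, so adjust the patterns to whatever your own debug output shows.

```python
import re

# Line shapes are assumed from the sample debug output quoted in this mail.
# Real output may carry a timestamp/PID prefix, so we search within each
# line instead of matching from its start.
SKIP_RE = re.compile(r'uridnsbl: domain (\S+) in skip list')
QUERY_RE = re.compile(r'uridnsbl: domains to query: (.*)$')
LISTED_RE = re.compile(r'uridnsbl: domain "([^"]+)" listed \(([^)]+)\): (\S+)')


def summarize_uridnsbl(debug_text):
    """Collect skipped, queried and listed domains from uridnsbl debug lines.

    Hypothetical helper, not a SpamAssassin API.
    """
    summary = {"skipped": [], "queried": [], "listed": {}}
    for line in debug_text.splitlines():
        line = line.strip()
        m = SKIP_RE.search(line)
        if m:
            summary["skipped"].append(m.group(1))
            continue
        m = QUERY_RE.search(line)
        if m:
            summary["queried"].extend(m.group(1).split())
            continue
        m = LISTED_RE.search(line)
        if m:
            domain, rule, answer = m.groups()
            # Group hits by rule name, keeping every listed domain.
            summary["listed"].setdefault(rule, []).append((domain, answer))
    return summary


if __name__ == "__main__":
    sample = (
        'dbg: uridnsbl: domain example.net in skip list\n'
        'dbg: uridnsbl: domains to query: anchor-text.net anchor-href.net\n'
        'dbg: uridnsbl: domain "anchor-text.net" listed (INT_URI): 127.0.0.2\n'
        'dbg: uridnsbl: domain "anchor-href.net" listed (INT_URI): 127.0.0.2\n'
    )
    result = summarize_uridnsbl(sample)
    print("skipped:", result["skipped"])
    print("queried:", result["queried"])
    for rule, hits in result["listed"].items():
        print(rule, "->", hits)
```

To use it on a real message, capture the output of spamassassin -D uridnsbl < msg 2>&1 into a file and pass its contents to the function. A domain that shows up under "listed" but not in the X-Spam-Report detail was still looked up; it just was not the one reported.
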
Since the DNSBL lookups are asynchronous, it is likely undefined which listed domain will trigger the rule to hit and be reported, influenced by lookup time and the order they are parsed from the message.

-- 
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}