Bill Cole wrote:
On 2023-04-26 at 11:06:56 UTC-0400 (Wed, 26 Apr 2023 11:06:56 -0400)
Kris Deugau <kdeu...@vianet.ca>
is rumored to have said:
Am I missing some configuration option that can do this, or am I left
with doing one of:
- just suppressing lookups of the canonicalized URI
- removing the canonicalized URI from the DNSBL, even if the listing
might be justified where the *NON*-canonical version absolutely isn't
- applying the welcomelist_* sledgehammer
It's extremely hard to say, given that you've not provided an actual
example of what you're talking about.
When I come up against these odd issues, I try not to include too much
case-specific information, because everyone jumps in with highly
case-specific solutions, most of which don't generalize to solve the
actual problem.
Yes, I do mean an actual message. Evidence that your analysis of what
is happening is not entirely wrong.
I took a closer look and it was easier than expected to redact/replace
customer details with filler or my address.
http://deepnet.cx/~kdeugau/spamtools/cornell-birds.eml
You may be able to nail down what is actually happening by scanning a
problematic message with "-D all" and determining *exactly* what SA is
parsing as a URI that it should not.
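For a long "-D all" run, it can help to group the uri debug lines so that each canonicalized value sits next to the variants derived from it. The following is a minimal sketch, not part of SpamAssassin: the function name group_uri_debug is hypothetical, the line format is assumed from the "dbg: uri:" entries quoted in this thread, and it assumes each entry is on a single, unwrapped line.

```python
import re

# Assumption: entry format taken from the debug excerpt in this thread,
# e.g. "... dbg: uri: canonicalizing html uri: none".
DBG_URI = re.compile(
    r"dbg: uri: (canonicalizing \w+ uri|cleaned uri|added host): ?(.*)$"
)


def group_uri_debug(lines):
    """Group 'cleaned uri' / 'added host' entries under the preceding
    'canonicalizing ... uri' entry. Hypothetical helper, not SA code."""
    groups = []
    for line in lines:
        match = DBG_URI.search(line)
        if not match:
            continue  # not a uri debug line we recognise
        kind, value = match.group(1), match.group(2).strip()
        if kind.startswith("canonicalizing"):
            # "canonicalizing html uri" -> source is "html"
            groups.append(
                {"source": kind.split()[1], "raw": value,
                 "cleaned": [], "hosts": []}
            )
        elif groups:
            key = "cleaned" if kind == "cleaned uri" else "hosts"
            groups[-1][key].append(value)
    return groups
```

Filtering the result for entries whose "raw" value contains no dot or scheme would surface candidates such as a bare word that was expanded into a hostname.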
As far as I've ever seen, the URI extraction debug output from -D
doesn't include any of the surrounding text to show where each URI
came from. So I have no way to tell which message element SA found
the literal text "none" in, in a place that usually contains a "real"
URI. Here is an excerpt from a run on this message, around the key
lines for the problem non-URI:
Apr 26 14:57:31.646 [16796] dbg: uri: canonicalizing parsed uri: https://www.macaulaylibrary.org/
Apr 26 14:57:31.646 [16796] dbg: uri: cleaned uri: https://www.macaulaylibrary.org/
Apr 26 14:57:31.646 [16796] dbg: uri: added host: www.macaulaylibrary.org domain: macaulaylibrary.org
Apr 26 14:57:31.646 [16796] dbg: uri: canonicalizing html uri: none
Apr 26 14:57:31.646 [16796] dbg: uri: cleaned uri: http://none
Apr 26 14:57:31.646 [16796] dbg: uri: cleaned uri: none
Apr 26 14:57:31.646 [16796] dbg: uri: cleaned uri: http://www.none.com
Apr 26 14:57:31.646 [16796] dbg: uri: added host: www.none.com domain: none.com
Apr 26 14:57:31.646 [16796] dbg: uri: canonicalizing html uri: https://secure.birds.cornell.edu/sso-static/img/lab-logo-short.png
Apr 26 14:57:31.646 [16796] dbg: uri: cleaned uri: https://secure.birds.cornell.edu/sso-static/img/lab-logo-short.png
Apr 26 14:57:31.646 [16796] dbg: uri: added host: secure.birds.cornell.edu domain: cornell.edu
Apr 26 14:57:32.133 [16796] dbg: uri: canonicalizing domainkeys uri: domainkeys:birds.cornell.edu
which is... not helpful in locating whatever SA grabbed "none" from.
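Since the debug output labels the value as an "html uri", one way to narrow it down outside SA is to walk the HTML parts of the message and report every URI-type attribute whose value is exactly "none". This is a rough sketch using only the Python standard library. The function name find_attr_values and the URI_ATTRS list are my own assumptions; the list is a guess at attributes that commonly carry URIs, not the set SA actually inspects.

```python
from email import message_from_string, policy
from html.parser import HTMLParser

# Assumption: attributes that commonly hold a URI. This is NOT taken
# from SpamAssassin; adjust it to taste.
URI_ATTRS = {"href", "src", "background", "action", "cite", "longdesc"}


class AttrFinder(HTMLParser):
    """Collect (line, tag, attribute, value) for matching attributes."""

    def __init__(self, needle):
        super().__init__(convert_charrefs=True)
        self.needle = needle
        self.hits = []

    def handle_starttag(self, tag, attrs):
        line, _col = self.getpos()  # position of the tag in the HTML part
        for name, value in attrs:
            if name not in URI_ATTRS or value is None:
                continue
            if value.strip().lower() == self.needle:
                self.hits.append((line, tag, name, value))


def find_attr_values(raw_message, needle="none"):
    """Return hits for every text/html part of a raw RFC 5322 message."""
    msg = message_from_string(raw_message, policy=policy.default)
    hits = []
    for part in msg.walk():
        if part.get_content_type() != "text/html":
            continue
        finder = AttrFinder(needle)
        finder.feed(part.get_content())  # decoded HTML text of this part
        finder.close()
        hits.extend(finder.hits)
    return hits
```

Running find_attr_values(open("cornell-birds.eml").read()) on the sample would list candidate tags, if the value does come from one of the assumed attributes. If it returns nothing, the "none" is coming from somewhere this sketch does not look, such as inline CSS.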
-kgd