On Thu, May 13, 2021 at 09:24:34AM -0400, Wietse Venema wrote:

> > > ; EDNS: version: 0, flags: do; udp: 1232
> >
> > Which "dig" uses, but the C library likely sets the historical default
> > of "4096" bytes, expecting that to work.  I am not aware of any way to
> > configure the EDNS buffer size in the C library stub resolver, short of
> > recompiling the C library.
>
> Another data point: by default, Postfix uses a 4096-byte buffer
> when it calls the C library stub resolver, but it will repeat the
> call with a larger buffer if the response has the 'truncated' flag
> raised, and leaving it up to the library to switch to TCP as needed.

It is largely a mistake to confuse the application's result buffer size
passed to the DNS resolver (to allow the stub resolver to return larger
result packets after TCP fallback) with any notion of the requested EDNS
buffer size.

The choice of EDNS (UDP!) buffer size is baked into the usual Unix stub
resolver library, and is not configurable.  When, e.g., Postfix passes a
32K buffer to res_query() or res_search(), it is not indicating an EDNS
UDP buffer size.  All that's happening is that results of up to that
size can be returned to Postfix by the stub resolver, after doing
whatever it does to perform the query.

> This has been sufficient at least with 'main stream' libc implementations
> for the past 21+ years.
>
> However, I recall that some stub resolvers (libc-musl?) don't support
> queries over TCP. Could that be the problem?

Yes, fair point, but that does not appear to be the issue in this
specific case.  In any case, the EDNS buffer size in musl is short, and
musl does not do DNSSEC (it just supports the AD bit), so negative
replies would carry just the SOA record.

On Thu, May 13, 2021 at 12:04:00PM -0400, Wietse Venema wrote:

> > However, I recall that some stub resolvers (libc-musl?) don't support
> > queries over TCP. Could that be the problem?
>
> Indeed, libc-musl does not support DNS queries over TCP.
> https://www.linkedin.com/pulse/musl-libc-alpines-greatest-weakness-rogan-lynch/?trackingId=FsMR%2BhJfQqyOH9e1MIN0jw%3D%3D,
> which also has a link for the resolver author's rationale.

Rich can be a bit of an iconoclast, and while the world derives
important benefits from the presence of iconoclasts, who challenge
long-standing beliefs and practices that have outlived their usefulness,
success as an iconoclast can put one in situations where one is fighting
lost causes, or challenging practices that are still firmly grounded in
reality.

In this specific case, refusal to support TCP fallback is questionable,
unless there's a realistic choice of widely available more feature-rich
DNS library that portable applications can expect to use when expecting
larger response sizes (e.g. for the new HTTPS and SVCB records, for TLS
encrypted-client-hello (ECH) public keys, ...).

Unfortunately, none of "libunbound", "ldns", "getdns", ..., are
ubiquitous, and for portability the rational choice is still generally
the libc stub resolver.  But in the case of MTAs there's a simple
correct solution: deploying a local iterative resolver (see below).

On Thu, May 13, 2021 at 06:53:08PM +0200, Bjoern Franke wrote:

> > However, I recall that some stub resolvers (libc-musl?) don't support
> > queries over TCP. Could that be the problem?
>
> Postfix is running here on Arch Linux, so usual glibc and no musl is used.

Your issue appears to be that UDP fragmentation is not working between
your stub resolver and the upstream iterator, which appears to honour
the requested 4K EDNS buffer size.  While the DNS operator community is
presently in the process of updating specifications to deal with this
issue, not all operators have made preemptive changes to resolve the
issues in their servers.

    https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-avoid-fragmentation-04

But this thread suggests that the OP is not using a local resolver (e.g.
unbound, ...) on his MTA, and that his Postfix server's stub resolver is
directly querying some distant DNS nameserver.  That's not a good idea.

MTAs should have a local resolver (BIND, unbound, ...) and *that*
resolver can be configured with an EDNS buffer size cap.  For example,
my unbound resolver has:

    unbound.conf:
        server:
            ...
            edns-buffer-size: 1400
            ...

With "127.0.0.1" in /etc/resolv.conf, the problem of lost UDP fragments
goes away, and the local resolver handles packet size limits, TCP, ...

-- 
    Viktor.
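
Appended illustration (not part of the original message): the
distinction drawn earlier between the application's answer buffer and
the EDNS UDP buffer size may be easier to see on the wire.  The sketch
below, in Python using only the standard library, builds a query by hand
and shows that the advertised UDP size is simply the CLASS field of the
OPT pseudo-record (type 41, per RFC 6891) in the query packet itself.
The function names build_query() and advertised_udp_size() are
hypothetical helpers written for this example; they are not any
resolver library's API, and nothing here reflects how glibc or Postfix
construct their queries internally.

```python
import struct

OPT_TYPE = 41   # EDNS(0) OPT pseudo-record type (RFC 6891)
OPT_LEN = 11    # root name(1) + type(2) + class(2) + ttl(4) + rdlen(2)


def build_query(qname, qtype=15, edns_udp_size=1232, dnssec_ok=True,
                query_id=0x1234):
    """Build a DNS query carrying one empty OPT record (hypothetical helper)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    # Question: length-prefixed labels, root label, QTYPE, QCLASS=IN
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)
    # OPT record: the CLASS field is reused as the requestor's UDP payload
    # size; the TTL field holds ext-rcode, version and flags (DO = 0x8000).
    ttl = 0x8000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHIH", OPT_TYPE, edns_udp_size, ttl, 0)
    return header + question + opt


def advertised_udp_size(packet):
    """Read the EDNS UDP size back from a packet made by build_query()."""
    # Valid only because build_query() puts an OPT with empty RDATA last.
    opt = packet[-OPT_LEN:]
    rtype, udp_size = struct.unpack("!HH", opt[1:5])
    if opt[0] != 0 or rtype != OPT_TYPE:
        raise ValueError("no trailing OPT record found")
    return udp_size


if __name__ == "__main__":
    for size in (1232, 4096):
        query = build_query("example.com", edns_udp_size=size)
        print(size, len(query), advertised_udp_size(query))
```

The size of the buffer an application later hands to the resolver for
the answer never appears in this packet; only the value placed in the
OPT record is seen by the upstream server, which is why enlarging the
application's buffer does not change what is requested over UDP.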