On Thu, May 13, 2021 at 09:24:34AM -0400, Wietse Venema wrote:

> > > ; EDNS: version: 0, flags: do; udp: 1232
> > 
> > Which "dig" uses, but the C library likely sets the historical default
> > of "4096" bytes, expecting that to work.  I am not aware of any way to
> > configure the EDNS buffer size in the C library stub resolver, short of
> > recompiling the C library.
> 
> Another data point: by default, Postfix uses a 4096-byte buffer
> when it calls the C library stub resolver, but it will repeat the
> call with a larger buffer if the response has the 'truncated' flag
> raised, and leaving it up to the library to switch to TCP as needed.

It is largely a mistake to confuse the application's result buffer size
passed to the DNS resolver (to allow the stub resolver to return larger
result packets after TCP fallback) with any notion of the requested EDNS
buffer size.  The choice of EDNS (UDP!) buffer size is baked into the
usual Unix stub resolver library, and is not configurable.

When, e.g., Postfix passes a 32K buffer to res_query() or res_search(),
it is not indicating an EDNS UDP buffer size.  All that's happening is
that results of up to that size can be returned to Postfix by the
stub resolver, after doing whatever it does to perform the query.
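
To make the distinction concrete, here is a minimal sketch of where the
EDNS UDP size actually lives: it is a field in the OPT pseudo-record of
the query packet on the wire (RFC 6891), not the size of the caller's
answer buffer.  The helper names (encode_name, build_query,
advertised_udp_size) and the fixed query ID are made up for
illustration; this is not how any particular libc builds its queries.

```python
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"

def build_query(name: str, qtype: int, edns_udp_size: int,
                do_bit: bool = False) -> bytes:
    """Build a DNS query carrying one EDNS0 OPT record (hypothetical helper)."""
    # Header: ID (arbitrary), flags with RD set, QDCOUNT=1, ARCOUNT=1.
    header = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    # OPT pseudo-RR: root owner name, TYPE=41, CLASS=requested UDP payload
    # size, TTL=ext-rcode/version/flags (DO bit is 0x8000), RDLENGTH=0.
    ttl = 0x8000 if do_bit else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, edns_udp_size, ttl, 0)
    return header + question + opt

def advertised_udp_size(packet: bytes) -> int:
    """Read the UDP size back from a packet produced by build_query()."""
    opt = packet[-11:]  # 1 (root name) + 2 + 2 + 4 + 2 bytes
    rtype, udp_size = struct.unpack("!HH", opt[1:5])
    if rtype != 41:
        raise ValueError("last record is not an OPT record")
    return udp_size

# The application's answer buffer is a separate, purely local quantity.
answer_buffer = bytearray(32 * 1024)
query = build_query("example.com", 15, 1232, do_bit=True)  # 15 = MX
print(advertised_udp_size(query), len(answer_buffer))
```

Changing the size of answer_buffer changes nothing in the query bytes;
only the edns_udp_size argument ends up on the wire.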

> This has been sufficient at least with 'main stream' libc implementations
> for the past 21+ years.
> 
> However, I recall that some stub resolvers (libc-musl?) don't support
> queries over TCP. Could that be the problem?

Yes, fair point, but it seems that this is not the issue in this
specific case.  In any case, the EDNS buffer size in musl is short, and
musl does not do DNSSEC (it just supports the AD bit), so negative
replies would carry just the SOA record.

On Thu, May 13, 2021 at 12:04:00PM -0400, Wietse Venema wrote:

> > However, I recall that some stub resolvers (libc-musl?) don't support
> > queries over TCP. Could that be the problem?
> 
> Indeed, libc-musl does not support DNS queries over TCP.
> https://www.linkedin.com/pulse/musl-libc-alpines-greatest-weakness-rogan-lynch/?trackingId=FsMR%2BhJfQqyOH9e1MIN0jw%3D%3D,
> which also has a link for the resolver author's rationale.

Rich can be a bit of an iconoclast, and while the world derives important
benefits from the presence of iconoclasts, who challenge long-standing
beliefs and practices that have outlived their usefulness, success as an
iconoclast can put one in situations where one is fighting lost causes,
or challenging practices that are still firmly grounded in reality.

In this specific case, refusal to support TCP fallback is questionable,
unless there's a realistic choice of widely available more feature-rich
DNS library that portable applications can expect to use when expecting
larger response sizes (e.g. for the new HTTPS and SVCB records, for
TLS encrypted-client-hello (ECH) public keys, ...).

Unfortunately, none of "libunbound", "ldns", "getdns", ..., are
ubiquitous, and for portability the rational choice is still generally
the libc stub resolver.

But in the case of MTAs there's a simple correct solution: deploying
a local iterative resolver (see below).

On Thu, May 13, 2021 at 06:53:08PM +0200, Bjoern Franke wrote:

> > However, I recall that some stub resolvers (libc-musl?) don't support
> > queries over TCP. Could that be the problem?
> 
> Postfix is running here on Arch Linux, so usual glibc and no musl is used.

Your issue appears to be that UDP fragmentation is not working between
your stub resolver and the upstream iterator, which appears to honour
the requested 4K EDNS buffer size.  While the DNS operator community
is presently in the process of updating specifications to deal with
this issue, not all operators have made preemptive changes to resolve
the issue in their servers.

    https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-avoid-fragmentation-04
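
As a rough illustration of why a 4K EDNS buffer invites fragmentation,
the arithmetic can be sketched as below.  The header sizes are the
standard minimums (IPv4 without options, IPv6 without extension
headers); the function names are hypothetical, and real paths may have
a smaller MTU than the local link suggests.

```python
IPV4_HEADER = 20  # bytes, no IP options
IPV6_HEADER = 40  # bytes, no extension headers
UDP_HEADER = 8    # bytes

def max_unfragmented_payload(mtu: int, ipv6: bool = False) -> int:
    """Largest DNS message that fits in one UDP datagram at this MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER

def will_fragment(dns_message_size: int, mtu: int, ipv6: bool = False) -> bool:
    """True if a UDP response of this size cannot fit in a single packet."""
    return dns_message_size > max_unfragmented_payload(mtu, ipv6)

print(max_unfragmented_payload(1500))             # 1472 (Ethernet, IPv4)
print(max_unfragmented_payload(1280, ipv6=True))  # 1232 (IPv6 minimum MTU)
print(will_fragment(4096, 1500))                  # True
```

The 1232 that "dig" advertises in the quoted output above matches the
IPv6 minimum MTU case: 1280 - 40 - 8.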

But this thread suggests that the OP is not using a local resolver (e.g.
unbound, ...) on his MTA, and that his Postfix server's stub resolver is
directly querying some distant DNS nameserver.  That's not a good idea.

MTAs should have a local resolver (BIND, unbound, ...) and *that*
resolver can be configured with an EDNS buffer size cap.  For example,
my unbound resolver has:

    unbound.conf:
        server:
            ...
            edns-buffer-size: 1400
            ...

With "127.0.0.1" in /etc/resolv.conf, the problem of lost UDP fragments
goes away, and the local resolver handles packet size limits, TCP, ...
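
A quick way to check whether a host follows this advice is to look at
which nameservers the stub resolver is pointed at.  The sketch below
takes the text of a resolv.conf file and reports whether every listed
nameserver is a loopback address.  The function names are hypothetical,
and the parsing is deliberately simplified (it only looks at
"nameserver" lines and skips comment lines).

```python
import ipaddress

def nameservers(resolv_conf_text: str) -> list:
    """Return the addresses on 'nameserver' lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers

def uses_only_local_resolver(resolv_conf_text: str) -> bool:
    """True if at least one nameserver is listed and all are loopback."""
    servers = nameservers(resolv_conf_text)
    if not servers:
        return False
    # Drop any IPv6 zone suffix ("%eth0") before parsing the address.
    return all(ipaddress.ip_address(s.split("%")[0]).is_loopback
               for s in servers)

print(uses_only_local_resolver("nameserver 127.0.0.1\n"))   # True
print(uses_only_local_resolver("nameserver 192.0.2.53\n"))  # False
```

On a real system one would pass in the contents of /etc/resolv.conf,
for example via open("/etc/resolv.conf").read().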

-- 
    Viktor.
