RFC: mount_nfs failure due to dns not running yet

Rick Macklem Wed, 19 Feb 2025 14:41:14 -0800

Hi,

The subject line basically describes the problem glebius@
ran into.  When doing an NFS mount in /etc/fstab, it failed
since the DNS service was not yet working and, as such,
the DNS lookup of the server fqdn failed, causing the mount
to fail. Note that this behaviour has existed for decades.


He feels this is a bug and that mount_nfs(8) should retry
getaddrinfo(3) calls until success, instead of failing the
mount when the first attempt fails.

The problem with just retrying getaddrinfo(3) is that it
could retry forever for simple failures like a typo in the
server fqdn.

I can see several ways this can be handled and would
like feedback from others w.r.t. these alternatives.

1) Simply document this case and encourage use of
    host names in /etc/hosts for NFS servers along with
    specifying "files" before "dns" in nsswitch.conf.
    Doing this results in the mounts working whether or
    not DNS is working.

2) Call it a bug and patch mount_nfs(8) to retry getaddrinfo(3)
    until it succeeds. (I feel this would be a POLA violation,
    given that the current behaviour has existed for decades.
    Also, for simple cases where the fqdn will never resolve,
    the mount attempt would hang during boot unless "bg" is
    specified for the /etc/fstab entry.)

3) Add a new NFS mount option "retrydns=<N>", which would enable
    retries of getaddrinfo(3). This would avoid any POLA violation and
    would allow for a convenient way to document the behaviour in
    "man mount_nfs".

4) ???
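
To make option 3 a bit more concrete, here is a minimal sketch of a
bounded retry around getaddrinfo(3). It is not taken from mount_nfs(8).
The helper name resolve_with_retry(), the fixed delay between attempts,
and the reading of <N> as "number of extra attempts after the first"
are all assumptions made for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the actual mount_nfs(8) code: call
 * getaddrinfo(3) up to 1 + retries times, sleeping delay_sec seconds
 * between attempts.  Returns 0 on success (caller frees *res with
 * freeaddrinfo(3)) or the error code of the last failed attempt.
 */
static int
resolve_with_retry(const char *host, const char *serv, int retries,
    unsigned int delay_sec, struct addrinfo **res)
{
	struct addrinfo hints;
	int attempt, error;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;		/* IPv4 or IPv6 */
	hints.ai_socktype = SOCK_STREAM;

	/* Returned unchanged only if retries is negative (no attempts). */
	error = EAI_FAIL;
	for (attempt = 0; attempt <= retries; attempt++) {
		error = getaddrinfo(host, serv, &hints, res);
		if (error == 0)
			return (0);
		fprintf(stderr, "getaddrinfo(%s): %s (attempt %d of %d)\n",
		    host, gai_strerror(error), attempt + 1, retries + 1);
		/* Do not sleep after the final attempt. */
		if (attempt < retries)
			sleep(delay_sec);
	}
	return (error);
}
```

With retries set to 0 this behaves like the single getaddrinfo(3) call
done today, so the default would stay unchanged. Since the loop is
bounded, a typo in the server fqdn still fails the mount after
1 + retries attempts instead of hanging forever.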

So, what do you think is the preferred change?

rick
ps: I looked and the return value from getaddrinfo(3) does not
      appear to be useful to discern the case of "DNS service not
      running yet". (I think it replies EAI_FAIL for this case.)
