Excuse the top posting; I am replying to a top post.

Correct, this is not a bug. It has been NFS behaviour ever since I switched careers from IBM mainframe to UNIX about 30 years ago. Sun Solaris behaved this way, as did Tru64, DG/UX and HP-UX. Typically a sysadmin needed to, and still needs to today, put NFS server IPs in hosts(5).
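For illustration, the hosts(5) approach amounts to something like the fragment below. The address and host name are placeholders (a documentation-range IP and an example.org name), not real servers, and the nsswitch.conf line simply makes the hosts file be consulted before DNS.

```
# /etc/hosts (placeholder address and name, for illustration only)
192.0.2.10	nfs1.example.org nfs1

# /etc/nsswitch.conf: consult the hosts file before DNS
hosts: files dns
```

With this in place the server name resolves at mount time whether or not DNS is reachable yet.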
One person suggested using late for NFS shares. That too is what Red Hat does. NFS mounts are flagged by systemd (and prior to that upstart [rc.d]) with _netdev. _netdev causes the NFS mount to take place after the network is fully up, including DNS resolution. Solaris didn't do this when I last worked on it and AFAIK it still doesn't.

I think our choices are either to document that sysadmins must use hosts(5) or ensure NFS shares are mounted late, or to mount NFS shares after the network is fully up.

Retrying forever, until DNS finally provides a good answer, can potentially hang boot. This would be especially troublesome for remote unattended reboots, where remediation would require calling remote eyes-and-hands support to "fix" the situation on the console.

BTW, with NFSv3 and v2, uninterruptible mounts, i.e. those without the intr option, did behave this way. NFSv4 doesn't support intr.

I think the easiest solution would be some documentation. Next would be mounting NFS shares later, at about the same time late mounts are processed (actually, immediately prior), like Red Hat Linux does. A _netdev fstab(5) option could serve the same purpose it does in Linux, handled immediately prior to late option handling.

Altering the kernel to wait forever is undesirable. It would result in boot hangs requiring console access to work around the problem. This would be a PITA and a POLA violation for unattended remote sites.

-- 
Cheers,
Cy Schubert <cy.schub...@cschubert.com>
FreeBSD UNIX: <c...@freebsd.org> Web: https://FreeBSD.org
NTP: <c...@nwtime.org> Web: https://nwtime.org

e^(i*pi)+1=0

On Thu, 20 Feb 2025 01:25:20 +0100
Lars Tunkrans <drsn...@gmail.com> wrote:

> This situation has existed these past 40 years. You have to put your
> IP address : hostname pairs into /etc/hosts if you don't have access to a
> working DNS. This is not a bug. It's the way name resolution works.
>
> On Wed, 19 Feb 2025 at 23:40, Rick Macklem <rick.mack...@gmail.com> wrote:
>
> > Hi,
> >
> > The subject line basically describes the problem glebius@
> > ran into. When doing an NFS mount in /etc/fstab, it failed
> > since the DNS service was not yet working and, as such,
> > the DNS lookup of the server fqdn failed, causing the mount
> > to fail. Note that this behaviour has existed for decades.
> >
> > He feels this is a bug and that mount_nfs(8) should retry
> > getaddrinfo(3) calls until success, instead of failing the
> > mount when the first attempt fails.
> > The problem with just retrying getaddrinfo(3) is that it
> > could retry forever for simple failures like a typo in the
> > server fqdn.
> > I can see several ways this can be handled and would
> > like feedback from others w.r.t. these alternatives.
> >
> > 1) Simply document this case and encourage use of
> > host names in /etc/hosts for NFS servers along with
> > specifying use of files before dns in nsswitch.conf.
> > Doing this results in the mounts working whether or
> > not DNS is working.
> >
> > 2) Call it a bug and patch mount_nfs(8) to retry getaddrinfo(3)
> > until it succeeds. (I feel this would be a POLA violation,
> > given that the current behaviour has existed for decades
> > and for simple cases where the fqdn will never resolve
> > the behaviour would be to hang at the mount attempt
> > during boot unless "bg" is specified for the /etc/fstab entry.)
> >
> > 3) Add a new NFS mount option "retrydns=<N>", which would enable
> > retries of getaddrinfo(3). This would avoid any POLA violation and
> > would allow for a convenient way to document the behaviour in
> > "man mount_nfs".
> >
> > 4) ???
> >
> > So, what do you think is the preferred change?
> >
> > rick
> > ps: I looked and the return value from getaddrinfo(3) does not
> > appear to be useful to discern the case of "DNS service not
> > running yet". (I think it returns EAI_FAIL for this case.)
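
For what it's worth, a bounded retry in the spirit of option 3 could look roughly like the sketch below. This is only an illustration: the function name resolve_with_retry and its parameters are made up here and are not taken from mount_nfs(8). It retries on any getaddrinfo(3) failure because, as noted in the ps above, the error code does not seem to distinguish "DNS not up yet" from a permanent failure, and it gives up after a fixed number of attempts so a typo in the server name cannot hang boot forever.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <sys/types.h>
#include <sys/socket.h>

#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of mount_nfs(8)): call getaddrinfo(3)
 * up to max_tries times, sleeping delay_sec seconds between attempts.
 * Returns 0 and sets *out on success; otherwise returns the EAI_* error
 * from the last attempt and leaves *out set to NULL.
 */
int
resolve_with_retry(const char *host, const char *port, int max_tries,
    unsigned int delay_sec, struct addrinfo **out)
{
	struct addrinfo hints;
	int attempt, error;

	assert(host != NULL && out != NULL);

	*out = NULL;
	error = EAI_FAIL;
	if (max_tries < 1)
		max_tries = 1;		/* always try at least once */

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* IPv4 or IPv6 */
	hints.ai_socktype = SOCK_STREAM;

	for (attempt = 1; attempt <= max_tries; attempt++) {
		error = getaddrinfo(host, port, &hints, out);
		if (error == 0)
			return (0);
		*out = NULL;
		/* Every failure is retried; the code is not inspected. */
		if (attempt < max_tries)
			sleep(delay_sec);
	}
	return (error);
}
```

With max_tries set to 1 this makes a single attempt, which matches the current fail-on-first-error behaviour described above; a caller that gets 0 back is responsible for calling freeaddrinfo(3) on the result.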