On Mon, 27 Jan 2025 19:29:24 +0000
Chris Billington <emu...@disroot.org> wrote:

> On Mon, 27 Jan 2025 14:02:17 +0000
> Chris Billington wrote:
> 
> > I am setting up net-booting of amd64 clients with an amd64 server
> > (HP Z400) running 7.6-release, using the diskless(8) manpage.
> > 
> > I want to set up a shared /usr nfs mount for the clients as described
> > in the manpage, so that a single set of installed packages can be
> > managed centrally.
> > 
> > tftpd, bootparams, dhcpd and nfs are set up as described in the manpage.
> > 
> > Server IP (em0) 192.168.0.254
> > NFS rootfs /export is on a separate partition.
> > 
> > Client IP (also em0) 192.168.0.2 as set statically by mac address in
> > dhcpd.conf
> > 
> > /etc/exports on the server:
> > /export/client2 -maproot=root -alldirs 192.168.0.2
> > /usr -ro -network=192.168.0.0 -netmask=255.255.255.0
> > /var/db/pkg -ro -network=192.168.0.0 -netmask=255.255.255.0
> > 
> > showmount -e:
> > /var/db/pkg         192.168.0.0
> > /usr                                192.168.0.0
> > /export/client2     192.168.0.2
> > 
> >  /etc/fstab on client2:
> > 192.168.0.254:/export/client2 / nfs rw 0 0
> > 192.168.0.254:/usr /usr nfs ro 0 0
> > swap /tmp mfs rw,-s=512M 0 0
> > 192.168.0.254:/var/db/pkg /var/db/pkg nfs 0 0
> > 
> > When booting multi-user, all goes normally until the boot hangs at the
> > point of mounting /usr in the client's /etc/rc (line 489):
> > 
> > mount -s /usr >/dev/null 2>&1 # if NFS, fstab must use IP address
> > 
> > By removing the redirection temporarily, I can see the following error
> > on the client:
> > 
> > mount_nfs: bad MNT RPC: RPC: Unable to send; errno = Permission denied
> > 
> > This is repeated at intervals of 30 seconds or so.
> > 
> > However, showmount -a on the server thinks /usr is mounted:
> > 
> > showmount -a:
> > client2:/export/client2
> > client2:/usr
> > client2:/var/db/pkg
> > 
> > At this point if I interrupt the processing of /etc/rc the boot
> > continues, but fails miserably because /usr is not mounted. (verified
> > with mountd -d on the server)
> > 
> > If I do 'boot -s', after going to a shell it is possible to
> > mount /usr, /tmp and /var/db/pkg without issue. 
> > 
> > If I add the bg (background the mount task) option to the client's
> > fstab for /usr (ro,bg) then boot proceeds but /usr never gets mounted.
> > 
> > As a check, I tried booting with a non-shared /usr in
> > the /export/client2 directory. Booting then works without problems. But
> > that defeats the object of net booting, to have a shared set of
> > installed packages. 
> > 
> > One strange thing that may be relevant: 
> > If I listen with 'tcpdump -nvi em0' on the server, I can see the rpc
> > request going to the server port 111 over udp each time the client
> > attempts to mount /usr :
> > 
> > 192.168.0.2.xxx > 192.168.0.254.111: [udp sum ok] udp 56 (ttl 64, id
> > xxxxx, len 84)
> > 
> > But the reply back to the client from the server from the same port has:
> > 
> > 192.168.0.254.111 > 192.168.0.2.xxx: [bad udp csum 8682! -> zzzz]] udp
> > 28 (ttl 64, id xxxxx, len 56)
> > 
> > (xxxx, yyyy, zzzz are the random values chosen by the networking stack)
> > 
> > Is it still possible to boot diskless clients with a shared /usr? What
> > could be the cause of the 'bad UDP csum' errors, and the 'mount_nfs bad
> > MNT RPC' error?
> > It's particularly odd because single-user boot allows /usr to be
> > mounted read-only without issue. 
> > 
> > I'm running out of things to try! All assistance in resolving this
> > gratefully accepted....
> > 
> > Possibly unrelated: before the boot process of /bsd starts, I see the
> > following PXEboot error flash by: 
> >  pxe_netif_open : PXENV_UDP_OPEN failed: 0x60
> >  net_open: netif_open() failed
> > However, after this the booting of /bsd continues as normal until the
> > nfs mount hang described above.
> > 
> > -- 
> > Chris Billington
> 
> 
> Further information:
> 
> - the bad checksum errors are a 'red herring'; they occur because the
> network card supports checksum offloading and tcpdump sees the packets
> before the checksum is added.
> 
> - changing the mount options for /usr to 'ro,tcp' results in slightly
> different error messages on the client when booting multiuser and
> trying to mount /usr readonly:
> 
> first error: 
> Cannot MNT RPC: RPC: Remote system error: Permission denied
> 
> subsequent errors: 
> mount_nfs: Bad MNT RPC: RPC: Unable to send; errno = Bad file descriptor
> 
> --
> Chris Billington
> 
> 

I think I understand what was going on with read-only /usr over nfs
now. 

The temporary pf ruleset loaded in /etc/rc contains "don't kill NFS"
rules that allow traffic out to only the portmap/sunrpc and nfs ports
on the server, 111 and 2049.

However, to mount a separate /usr the client also needs to talk to
mountd, at the reserved port number it gets from portmap, and that
traffic is blocked by pf. The mountd port varies from boot to boot, so
as far as I know it can't easily be included in a rule.

I confirmed this was the issue by hard-coding the currently running
mountd port number in the ruleset.
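
For reference, a minimal sketch of how the currently registered mountd
port can be looked up. The helper name mountd_ports is my own invention,
and it assumes the usual five-column 'rpcinfo -p' output (program, vers,
proto, port, service); mountd is RPC program number 100005.

```shell
#!/bin/sh
# Hypothetical helper: read `rpcinfo -p` output on stdin and print one
# "port/proto" line for each distinct mountd registration.
# Assumes columns: program vers proto port service.
mountd_ports() {
    # Match on the RPC program number (100005 = mountd) rather than the
    # service name, then de-duplicate across protocol versions.
    awk '$1 == 100005 { print $4 "/" $3 }' | sort -u
}
```

Usage would be 'rpcinfo -p 192.168.0.254 | mountd_ports' on a machine
that can reach the server's portmapper. The result is only valid until
mountd is next restarted, which is why a fixed rule is not a real fix.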

My workaround was to move the mount command for /usr to just before the
temporary pf ruleset is loaded.

Single-user boot does not load the temporary pf ruleset, which is why
the manual mounts worked there. And if /usr is an integral part of the
root filesystem, it does not get remounted by the mount -s command in
/etc/rc, so the non-shared setup never hit the problem.

I will make a report to bugs@ to see whether this small change could be
accepted for future releases.
-- 
Chris Billington
