On Mar 15 21:18:15, Thomas Ribbrock wrote:
> Hi all,
> 
> (2nd try - the first message didn't make it to the list, apparently)
> 
> I'm currently trying to get one of my SGI Indys to run as a diskless
> music player. I'm using Debian Linux (Lenny) for that and the Indy is
> supposed to be booting diskless off my OpenBSD file server (i.e. NFS
> root file system).
> 
> After having figured out all the necessary parts (dhcpd set-up, tftpd
> set-up, nfs set-up) I've got the Indy to boot - it gets its IP address,
> finds its kernel via tftp and starts booting. However, as soon as it
> starts doing something on its NFS root file system, I get tons of
> "server not responding, still trying" messages on the Indy. After a long
> time, the Indy will finally succeed in booting, but any further activity
> on the root filesystem generates more errors.
> 
> On the file server, I cannot find any errors in the log files. However,
> when the Indy is doing something, I can see the load going up without
> any program in particular using CPU (what's that - interrupts?). At one
> point it got so bad that the server hung completely and I was forced
> to reboot. That server is running OpenBSD 4.5.
> 
> I've also noticed that things get better when I remount the nfsroot
> on the Indy with an explicit "vers=3" (I think I read somewhere that the
> default is NFSv2 when booting it that way) - after the remount, the
> errors stop.
> 
> To investigate this further, I've set up a test server for the tftp and
> nfsroot (dhcp is still done by the main server) running OpenBSD 4.6. I
> did a basic install and all I configured was tftp and nfsd (no pf or any
> other extras). Same result: Any activity on the Indy results in loads of
> "server not responding" NFS errors and everything is very, very slow.
>
> tcpdump on the connection reveals loads of this:
> 00:23:51.589956 00:04:75:98:2b:9d 08:00:69:09:88:d3 0800 1514: 192.168.1.2.2049 > 192.168.1.82.940: xid 0x0 reply ERR 1448 (DF) (ttl 64, id 33443, len 1500)
> (.2 being the server and .82 the Indy) - but that doesn't tell me
> much...
> 
> As I remembered having NFS trouble with 4.5 before (after I upgraded
> the main server from 4.2), I installed OpenBSD 4.2 on the test server.
> Same configuration (just tftp and nfsd) - and presto, the Indy boots
> absolutely fine - no problems at all.
> 
> Apparently, something in NFS has changed between 4.2 and 4.5 (and
> higher) - and I just cannot figure out what... Hence, I have no idea
> what I would need to change nor what to investigate further. I've been
> over the release notes and the only NFS related change that I noticed
> was the addition of rpc.statd in 4.4 - could this have anything to do
> with the problems I'm seeing?

Are you sure that there is no pf standing between the NFS client
(Indy) and the NFS server (OpenBSD)? In 4.6, the default is to run pf,
and the 4.6 version of pf (older versions too, probably) recognizes
a 'no-df' option to 'scrub': http://www.openbsd.org/faq/pf/scrub.html

        no-df
            Clears the don't fragment bit from the IP packet header.
            Some operating systems are known to generate fragmented
            packets with the don't fragment bit set. This is
            particularly true with NFS. Scrub will drop such packets
            unless the no-df option is specified.

Could this be related to the "ERR 1448 (DF)" message above?

Also, there might be differences in exactly which packets the NFSv2
client and the NFSv3 client generate; have a look at a full tcpdump of
the boot.
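
If it helps when comparing an NFSv2 capture against an NFSv3 one, here is
a rough Python sketch that tallies packets by direction and by whether
the DF flag is set. It is only an illustration: it assumes one record per
line in the same layout as the tcpdump output quoted above, and the
names RECORD, LEN and summarize are made up for this example.

```python
import re
from collections import Counter

# Matches "src.port > dst.port:" and keeps the remainder of the record.
# Assumption: one record per line, laid out like the tcpdump line quoted above.
RECORD = re.compile(
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):"
    r"(?P<rest>.*)"
)
LEN = re.compile(r"\blen (\d+)")


def summarize(lines):
    """Count packets per (src, dst, DF set?) and track the largest IP length."""
    counts = Counter()
    max_len = 0
    for line in lines:
        m = RECORD.search(line)
        if m is None:
            continue  # not an IPv4 UDP/TCP record in the assumed layout
        rest = m.group("rest")
        has_df = "(DF)" in rest
        counts[(m.group("src"), m.group("dst"), has_df)] += 1
        length = LEN.search(rest)
        if length:
            max_len = max(max_len, int(length.group(1)))
    return counts, max_len


# The record from the original message, joined onto one line.
sample = (
    "00:23:51.589956 00:04:75:98:2b:9d 08:00:69:09:88:d3 0800 1514: "
    "192.168.1.2.2049 > 192.168.1.82.940: xid 0x0 reply ERR 1448 (DF) "
    "(ttl 64, id 33443, len 1500)"
)
counts, max_len = summarize([sample])
for (src, dst, has_df), n in counts.items():
    print(src, "->", dst, "DF" if has_df else "no DF", n)
print("largest IP length seen:", max_len)
```

Running it once over the capture taken with the default mount and once
over the capture taken after the "vers=3" remount should show whether
the full-size, DF-flagged replies from the server appear in only one of
the two cases.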
