On Tue, May 11, 2010 at 12:50 AM, Stefan Unterweger <stefan+open...@aleturo.com> wrote:
> Hello!
>
> I'm trying to set up my server for diskless boots, as described
> in the diskless(8) manpage (at the moment, more or less mostly as
> an academic exercise, but I was planning to take my oldish
> laptops to some use this way).
>
> I went along the instructions from the manpage, setting up the
> various pieces as I was instructed; since I was already running
> a limited PXE boot environment so that I can do installs more
> rapidly, many of the steps were already done, having to setup
> only rarpd and nfs.
>
> However, when I now try to get the client actually to boot from
> this setup, it fails quite miserably when trying to mount the
> root filesystem via NFS. The kernel just hangs forever, printing
> "RPC timeout for server 172.23.255.255 (0xac17ffff) prog 100000".
>
> After some research, I came up with an old posting from misc
> (http://archives.neohapsis.com/archives/openbsd/2004-01/0603.html),
> but without any solution. The problem described there is quite
> similar to the one I'm experiencing here, but without all the
> peculiarities that were used there (i.e., I'm using a stock
> 4.6-release, stock-dhcpd, stock-everything). Especially, my
> client does the same thing as the Soekris in that old posting,
> i.e. trying to connect to the NFS server at the broadcast address
> 172.23.255.255, instead of 172.23.12.2, which would be the "real"
> public address of the server. It _does_ connect to 172.23.12.2 on
> the original PXE bootstrap, but that might as well be because
> dhcpd tells it to do so, as far as I understood the process.
>
> Since the server also runs some other services, pf is running,
> which I first guessed might be the culprit. However, even with
> "pass quick" for everything coming from the particular client,
> nothing changes. tcpdump on the pflog-interface shows the sunrpc
> packets to be allowed, so I don't think that it is a PF issue.
> Disabling PF didn't change anything, for that matter.
>
> rpcinfo(8) shows everything up and running:
> | % rpcinfo -p
> |    program vers proto   port
> |     100000    2   tcp    111  portmapper
> |     100000    2   udp    111  portmapper
> |     100003    2   udp   2049  nfs
> |     100003    3   udp   2049  nfs
> |     100003    2   tcp   2049  nfs
> |     100003    3   tcp   2049  nfs
> |     100021    0   udp    759  nlockmgr
> |     100021    1   udp    759  nlockmgr
> |     100021    3   udp    759  nlockmgr
> |     100021    4   udp    759  nlockmgr
> |     100021    1   tcp    776  nlockmgr
> |     100021    3   tcp    776  nlockmgr
> |     100021    4   tcp    776  nlockmgr
> |     100024    1   udp    992  status
> |     100024    1   tcp    726  status
> |     100005    1   udp    994  mountd
> |     100005    3   udp    994  mountd
> |     100005    1   tcp   1011  mountd
> |     100005    3   tcp   1011  mountd
>
> Especially the portmapper itself, as this one seems to be the
> service that the client seems unable to find. Or at least, that's
> how I interpret the "prog 100000" which scrolls continuously on
> the client's error message.
>
> I have already tried to have tcpdump have a look at what's going
> on, but unfortunately, I don't see very much in its output:
> | $ tcpdump -n -s 140 -i em0 host 172.23.13.138
> | tcpdump: listening on em0, link-type EN10MB
> | 01:29:31.853178 172.23.13.138.718 > 172.23.255.255.111: udp 96
> | 01:29:36.853392 172.23.13.138.718 > 172.23.255.255.111: udp 96
> | 01:29:41.853479 172.23.13.138.718 > 172.23.255.255.111: udp 96
> (ad infinitum)
>
> As far as I see it, the client sends some UDP packet to the
> portmapper, but does not get any response.
>
> Since it looks like an RPC/NFS issue, I tried to see if "normal"
> NFS access would yield similar issues, so I had the same client
> try to connect from some Linux livecd thingie. This succeeded on
> the first try---hence, NFS seems to work, at least in general.
> However, the straightforward nfs mount did connect using
> 172.23.13.2 (i.e., the "real" address of the server), not the
> broadcast address.
> Trying to do a mount to
> 172.23.255.255:/export/client resulted in an error message,
> namely "Network is unreachable", but no blip comes up at the
> tcpdump above which was still running at this time, so it might
> as well have been Linux that won't allow an NFS mount on
> the broadcast address.
>
> The previously mentioned old mailing list posting mentioned that
> rpc.bootparamd would be needed, but starting it or not does not make
> any difference (and http://www.netbsd.org/docs/network/netboot/intro.i386.html
> kind of implies that rpc.bootparamd is not needed on i386, and
> the manpage actively discourages it).
>
>
> I'm now quite at a loss, and don't know where to look
> anymore. I'm sure it's just some small thing that I'm still
> overlooking, or some interoperability issue with some parts of
> that setup, but I don't know where to look anymore.
>
> Thanks in advance for any hints, or for just having the patience
> to read through to the end. :o)
>
> s//un
Hi,

What does your dhcpd.conf look like on your server?

It might also be worth adding -vv and -X to your tcpdump invocation;
the more verbose output may show more about the problem.

hth
Fred
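
A rough sketch of that capture, reusing the interface (em0) and client address from the quoted output. The helper names `hex_to_ip` and `capture`, the snap length of 1500, and the `udp port 111` filter are assumptions for illustration, not something from the original setup. The first helper only confirms that the hex value in the kernel message is the same broadcast address shown in dotted form.

```shell
#!/bin/sh
# Hypothetical helper: decode a hex IPv4 address such as the
# 0xac17ffff printed in the kernel's RPC timeout message.
hex_to_ip() {
    n=$(($1))
    printf '%d.%d.%d.%d\n' \
        $(( (n >> 24) & 255 )) $(( (n >> 16) & 255 )) \
        $(( (n >> 8) & 255 ))  $(( n & 255 ))
}

# Hypothetical helper: verbose capture of the client's portmapper
# traffic. Run as root on the server; it is defined here but not
# called automatically.
#   -vv  more detailed protocol decoding
#   -X   dump each packet in hex and ASCII
#   -s   snap length (1500 is an assumed value, larger than the
#        140 bytes used in the quoted capture)
capture() {
    tcpdump -n -vv -X -s 1500 -i "${1:-em0}" \
        host "${2:-172.23.13.138}" and udp port 111
}

hex_to_ip 0xac17ffff
```

Calling `capture` with no arguments uses em0 and 172.23.13.138; a different interface and client address can be passed as the first and second argument.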