Re: Trying to set diskless(8) -- hanging in "RPC timeout for server"

Fred Crowson Tue, 11 May 2010 14:48:55 -0700

On Tue, May 11, 2010 at 12:50 AM, Stefan Unterweger
<stefan+open...@aleturo.com> wrote:
> Hello!
>
> I'm trying to set up my server for diskless boots, as described
> in the diskless(8) manpage (at the moment mostly as an academic
> exercise, but I am planning to put my oldish laptops to some use
> this way).
>
> I followed the instructions from the manpage, setting up the
> various pieces as instructed; since I was already running
> a limited PXE boot environment so that I can do installs more
> rapidly, many of the steps were already done, and I only had to
> set up rarpd and NFS.
>
> However, when I now try to get the client actually to boot from
> this setup, it fails quite miserably when trying to mount the
> root filesystem via NFS. The kernel just hangs forever, printing
> "RPC timeout for server 172.23.255.255 (0xac17ffff) prog 100000".
>
> After some research, I came up with an old posting from misc
> (http://archives.neohapsis.com/archives/openbsd/2004-01/0603.html),
> but without any solution. The problem described there is quite
> similar to the one I'm experiencing here, but without all the
> peculiarities that were used there (i.e., I'm using a stock
> 4.6-release, stock-dhcpd, stock-everything). Especially, my
> client does the same thing as the Soekris in that old posting,
> i.e. trying to connect to the NFS server at the broadcast address
> 172.23.255.255, instead of 172.23.12.2, which would be the "real"
> public address of the server. It _does_ connect to 172.23.12.2 on
> the original PXE bootstrap, but that might as well be because
> dhcpd tells it to do so, as far as I understood the process.
>
> Since the server also runs some other services, pf is running,
> which I first guessed might be the culprit. However, even with
> "pass quick" for everything coming from the particular client,
> nothing changes. tcpdump on the pflog-interface shows the sunrpc
> packets to be allowed, so I don't think that it is a PF issue.
> Disabling PF didn't change anything, for that matter.
>
> rpcinfo(8) shows everything up and running:
> | % rpcinfo -p
> |    program vers proto   port
> |     100000    2   tcp    111  portmapper
> |     100000    2   udp    111  portmapper
> |     100003    2   udp   2049  nfs
> |     100003    3   udp   2049  nfs
> |     100003    2   tcp   2049  nfs
> |     100003    3   tcp   2049  nfs
> |     100021    0   udp    759  nlockmgr
> |     100021    1   udp    759  nlockmgr
> |     100021    3   udp    759  nlockmgr
> |     100021    4   udp    759  nlockmgr
> |     100021    1   tcp    776  nlockmgr
> |     100021    3   tcp    776  nlockmgr
> |     100021    4   tcp    776  nlockmgr
> |     100024    1   udp    992  status
> |     100024    1   tcp    726  status
> |     100005    1   udp    994  mountd
> |     100005    3   udp    994  mountd
> |     100005    1   tcp   1011  mountd
> |     100005    3   tcp   1011  mountd
>
> This includes the portmapper itself, which seems to be the
> service that the client is unable to find. Or at least, that's
> how I interpret the "prog 100000" in the error message that
> scrolls continuously on the client.
>
> I have already tried to have tcpdump have a look at what's going
> on, but unfortunately, I don't see very much in its output:
> | $ tcpdump -n -s 140 -i em0 host 172.23.13.138
> | tcpdump: listening on em0, link-type EN10MB
> | 01:29:31.853178 172.23.13.138.718 > 172.23.255.255.111: udp 96
> | 01:29:36.853392 172.23.13.138.718 > 172.23.255.255.111: udp 96
> | 01:29:41.853479 172.23.13.138.718 > 172.23.255.255.111: udp 96
> (ad infinitum)
>
> As far as I see it, the client sends some UDP packet to the
> portmapper, but does not get any response.
>
> Since it looks like an RPC/NFS issue, I tried to see if "normal"
> NFS access would yield similar issues, so I had the same client
> try to connect from some Linux livecd thingie. This succeeded on
> the first try---hence, NFS seems to work, at least in general.
> However, the straightforward NFS mount connected using
> 172.23.13.2 (i.e., the "real" address of the server), not the
> broadcast address. Trying to mount
> 172.23.255.255:/export/client resulted in the error message
> "Network is unreachable", but nothing showed up in the
> tcpdump above, which was still running at the time, so it might
> just as well have been Linux refusing to do an NFS mount against
> the broadcast address.
>
> The previously mentioned old mailing list posting said that
> rpc.bootparamd would be needed, but starting it or not does not
> make any difference (and
> http://www.netbsd.org/docs/network/netboot/intro.i386.html
> kind of implies that rpc.bootparamd is not needed on i386, and
> the manpage actively discourages it).
>
>
> I'm quite at a loss now, and don't know where to look
> anymore. I'm sure it's just some small thing that I'm still
> overlooking, or some interoperability issue between some parts of
> that setup.
>
> Thanks in advance for any hints, or for just having the patience
> to read through to the end. :o)
>
> s//un


Hi,

What does your dhcpd.conf look like on your server?

It might be worth adding -vv and -X to your tcpdump; it might provide
more info about the problem.
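
A minimal sketch of such a capture, assuming the interface (em0) and
client address (172.23.13.138) from the original post. The larger
snaplen (-s 1500) is an assumption added here so that -X can dump whole
packets; it is not something the original command used. The script only
builds and prints the command so it can be reviewed before being run as
root.

```shell
#!/bin/sh
# Values taken from the original post; adjust them for your network.
iface=em0
client=172.23.13.138

# -n   : do not resolve addresses to names
# -vv  : more verbose protocol decoding
# -X   : dump each packet in hex and ASCII
# -s   : snaplen; 1500 is an assumed value meant to cover a full Ethernet frame
cmd="tcpdump -n -vv -X -s 1500 -i $iface host $client"

# Print the command for review instead of running it directly.
echo "$cmd"
```

Run the printed command as root on the server while the client retries
its portmapper request; the hex dump should show what the 96-byte UDP
payload actually contains.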

hth

Fred
