Hello, debianics,

this sounds like an age-old problem that has been solved many times, but I still can't get it to work.
I am trying to set up a diskless cluster, Beowulf style, with nfsroot. The base is Debian wheezy.

Short story: when I boot into an NFSv4 root, I find weird file ownership:

  drwxr-xr-x   2 4294967294 4294967294 4096 Feb 15 13:00 bin
  drwxr-xr-x   3 4294967294 4294967294 4096 Feb 15 15:15 boot
  drwxr-xr-x  17 root       root       3240 Feb 15 15:38 dev
  drwxr-xr-x 115 4294967294 4294967294 4096 Feb 15 14:27 etc
  drwxr-xr-x   2 4294967294 4294967294 4096 Dec 24 13:41 home

This indicates that the idmap daemon is not working as it should. The only workaround I found is to mount the nfsroot as NFSv3 instead of NFSv4, see here:
http://www.linuxquestions.org/questions/linux-networking-3/does-pxe-booting-nfs-root-supports-nfsv4-925154/#post4583418
and here:
http://serverfault.com/questions/379486/netboot-debian-wheezy-from-nfs-v4

The symptoms are well known and have been reported repeatedly. However, I checked the following reported causes, and none of them matched my situation:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=724514
http://layer-acht.org/fai-irc/fai.log.20120613
https://kernel.googlesource.com/pub/scm/boot/dracut/dracut/+/3eca0cc846e89675949abb11e9606f3222a2e266%5E%5E!/
https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=537969
https://bugzilla.redhat.com/show_bug.cgi?id=922031#c5

I updated dracut to 040-207-g7252cde from testing: same picture. I have to use dracut instead of the standard initramfs, since my cluster uses bonded Ethernet links, which I could not get working with the standard initramfs.

The plan is to layer a common read-only installation and node-wise read-write directories using aufs, exporting them individually per node. The server runs dnsmasq (providing DHCP, DNS and TFTP) and nfs-kernel-server.

My first setup started from a HD install. After switching from read-only to writable aufs, everything stalled, and I blamed aufs, as dmesg recommended...
http://sourceforge.net/p/aufs/mailman/message/33392409/
I got rid of this problem by switching the server back from the testing 3.16 kernel to the standard 3.2 kernel from wheezy.
Don't ask me why...

This way I messed up the HD-based installation, so I decided to try a new, clean one based on the debootstrap tool, following this pointer:
https://help.ubuntu.com/community/Installation/OnNFSDrive
Same picture.

Following https://dracut.wiki.kernel.org/index.php/Main_Page I tried both the deprecated command syntax

  root=/dev/nfs nfsroot=

and the recommended one:

  root=nfs4:[<server-ip>:]<root-dir>[:<nfs-options>] rd.nfs.domain=<NFSv4 domain name>

Basically no difference.

I tried to start rpc.idmapd and statd manually from the console, which is possible either from an NFSv3 root or by copying the nobody-owned system into a ramdisk and chowning the files to root. Then I can see error messages like

  rpc.statd: Failed to create /var/lib/nfs/state.new: Read-only file system

I configured dracut manually to include this path: no help.

When I get everything right to fire up idmapd manually, I can mount my NFSv4 export with proper UIDs. So name resolution should be OK. I even managed a manual switch_root (over an ssh login...), but this did not yield a decently running system (I'm afraid there is more stuff done during init, like /proc, /dev, /sys...). But it tells me that the server and the network are OK. It is an initramfs problem.

I can capture long logs of the init process. As far as I understand, dracut tries to kill both statd and idmapd at 99-nfsroot-cleanup. I think it should not do so anyway, but it even looks like it cannot get a PID for idmapd, so I suppose idmapd was never up and running properly in the first place.

Basically, I think the problem is located somewhere in the integration of dracut, NFSv4/idmapd and the Debian packaging scheme.
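For anyone trying to reproduce this, here is a minimal Python sketch of two details from above. First, the odd owner 4294967294 is 2^32 - 2, i.e. -2 read as an unsigned 32-bit number, which as far as I can tell is what shows up when the client cannot map an owner name to a local UID. Second, a small helper that assembles the kernel arguments in the recommended dracut form quoted above. The function names, the export path, the server address and the domain are made up for illustration; only the argument syntax is taken from this post.

```python
import struct


def as_signed32(uid: int) -> int:
    """Reinterpret an unsigned 32-bit ID as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", uid))[0]


def nfs4_root_args(root_dir, server_ip=None, nfs_options=None, domain=None):
    """Build kernel arguments in the form quoted in the post:
    root=nfs4:[<server-ip>:]<root-dir>[:<nfs-options>] rd.nfs.domain=<domain>
    """
    root = "root=nfs4:"
    if server_ip:
        root += server_ip + ":"
    root += root_dir
    if nfs_options:
        root += ":" + nfs_options
    args = [root]
    if domain:
        args.append("rd.nfs.domain=" + domain)
    return " ".join(args)


if __name__ == "__main__":
    # The owner shown in the directory listing, read as a signed value.
    print(as_signed32(4294967294))
    # Hypothetical server address, export path and NFSv4 domain.
    print(nfs4_root_args("/srv/nfsroot", server_ip="192.168.0.1",
                         domain="cluster.example"))
```

The helper only formats a string; whether the initramfs then actually starts idmapd with that domain is exactly the open question here.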
See also:
http://marc.info/?l=linux-nfs&m=121621383812750&w=2
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=737554

I hoped that systemd might help out, but I could not get it properly installed into the debootstrap base, see here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=668001

Am I just missing some silly detail? Or is Debian & dracut & NFSv4 root simply not a valid setup yet?

I can provide megabytes of logs and screenshots, and with wireshark we could easily multiply this figure...

Anybody out there to collaborate on a solution?

yours
Wolfgang Rosner