Bug#756593: busybox's switch_root makes read-only NFS root read/write
Package: busybox
Version: 1:1.22.0-6
Severity: important

Dear Maintainer,

We have a PXE environment in our lab, where we boot both physical boxes
and XEN machines via NFS from one centralized Debian SID image. While
the kernel/initramfs correctly mounts the image read-only (I set a
breakpoint just before switch_root gets invoked, see [1]), switch_root
makes the NFS root read/write (see [2]).

Alex

[1]
192.168.0.1:/usr/local/muclab/image/debian-sid on /root type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.1)

[2]
192.168.0.1:/usr/local/muclab/image/debian-sid on / type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=3,sec=sys,local_lock=all,addr=192.168.0.1)

-- System Information:
Debian Release: jessie/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 3.14.0-netem.fas3270-aufs+ (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages busybox depends on:
ii  libc6  2.19-7

busybox recommends no packages.

busybox suggests no packages.

-- no debconf information
Bug#756593: busybox's switch_root makes read-only NFS root read/write
Hi Michael,

On 31.07.2014 at 20:53, Michael Tokarev wrote:

> Control: tag -1 + moreinfo
>
> 31.07.2014 11:56, Zimmermann, Alexander wrote:
>> Package: busybox
>> Version: 1:1.22.0-6
>> Severity: important
>>
>> Dear Maintainer,
>>
>> we have a PXE environments in our lab, where we boot both physical boxes
>> and XEN machines via NFS from one centralized Debian SID image. While
>> the kernel/initramfs mounts the image correctly read only (I set a
>> breakpoint just before switch_root get invoked) (see [1]), makes
>> switch_root the NFS root read/write (see [2]).
>
> Very interesting.
>
> I can't reproduce this behavior here. I use remote root a lot,
> also with PXE booting, and never saw a read-write root after
> switch_root ran.
>
> Looking at the source, it only does one mount(2) syscall:
>
>     // Overmount / with newdir and chroot into it
>     if (mount(".", "/", NULL, MS_MOVE, NULL)) {
>         // For example, fails when newroot is not a mountpoint
>         bb_perror_msg_and_die("error moving root");
>     }
>
> and that's about it.

We also quickly scanned the source code yesterday. At first glance
we saw nothing special here either.

> So unless the kernel is broken,

Good point. We use a patched vanilla kernel (see below). Maybe the
patch is broken.

> it
> should not result in changing the mount flags in any way.
>
> And it definitely doesn't change flags when switch_root'ing to
> a regular ext4 or other local filesystem (in a regular initramfs
> which is used by almost all debian systems).
>
> Maybe you can describe your environment a bit more?

Sure. PXE, DHCP and NFS are provided by a FreeBSD 10.0-Stable box.
The PXE config is:

SERIAL 0 9600
DEFAULT linux
LABEL linux
  KERNEL ../kernel/vmlinuz-3.14.0.fas3270-aufs+
  APPEND tsc=reliable acpi=off quiet root=/dev/nfs nfsroot=192.168.0.1:/usr/local/muclab/image/debian-sid ro boot=nfs root-ro=aufs ip=:eth4:dhcp console=ttyS0 initrd=../initrd/initrd.img-3.14.0.fas3270-aufs+

As you can see, we use a vanilla 3.14 kernel, patched w/ the official
AUFS patch (see http://aufs.sourceforge.net).

To enable/disable AUFS we use a patched version of the root-ro script
(see https://help.ubuntu.com/community/aufsRootFileSystemOnUsbFlash) in
our initramfs. The script is located under
/etc/initramfs-tools/scripts/init-bottom/.

> Where do you set breakpoints?

To ensure that the root-ro script isn't the culprit, I disabled it (and
therefore AUFS too) via the cmdline parameter root-ro=false and put a
breakpoint right after it (break=init). At the breakpoint, the NFS mount
was still ro.

I put another "breakpoint" in the /etc/rc3.d/S01* start script to verify
the mount right after switch_root. Here, the mount was already rw.

Let me double check that AUFS is not broken. I will try to boot a
vanilla kernel. I will come back to you w/ the results.

Alex

--
As a side note, if we boot w/ AUFS, the mount points are right.
alexandz@two:/etc/initramfs-tools$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051429,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=1641472k,mode=755)
192.168.0.1:/usr/local/muclab/image/debian-sid on /mnt/root-ro type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.1)
tmpfs-root on /mnt/root-rw type tmpfs (rw,relatime)
aufs-root on / type aufs (rw,relatime,si=b2127ecf3bdae6c7)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=3282940k)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,relatime,size=3282940k)
192.168.0.1:/usr/local/muclab/boot on /mnt/boot type nfs (rw,nosuid,nodev,noatime,vers=3,rsize=8192,wsize=8192,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.1,mountvers=3,mountport=945,mountproto=udp,fsc,local_lock=all,addr=192.168.0.1)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
/etc/auto.home on /home type autofs (rw,relatime,fd=6,pgrp=1894,timeout=300,minproto=5,maxproto=5,indirect)
192.168.0.1:/usr/home/puneeth on /home/puneeth type nfs (rw,noatime,vers=3,rsize=8192,wsize=8192,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.1,mountvers=3,mountport=945,mountproto=tcp,local_lock=none,addr=192.168.0.1)
192.168.0.1:/usr/home/alexandz on /home/alexandz type nfs (rw,noatime,vers=3,rsize=8192,wsize=8192,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.1,mountvers=3,mountport=945,mountproto=tcp,local_lock=none,addr=192.168.0.1)

>
> Thanks,
>
> /mjt
Bug#756593: busybox's switch_root makes read-only NFS root read/write
On 01.08.2014 at 13:46, Michael Tokarev wrote:

> 01.08.2014 15:37, Zimmermann, Alexander wrote:
>
>> As you can see, we use a vanilla 3.14 Kernel, patched w/ official AUFS patch
>> (see http://aufs.sourceforge.net)
>
> I too use aufs here, for a very long time. But I never tried
> nfs-root together with aufs, I used it in slightly different
> scenarios.
>
>> To enable/disable AUFS we use a patched version of the root-ro script (see
>> https://help.ubuntu.com/community/aufsRootFileSystemOnUsbFlash) in our
>> initramfs.
>> The script is located under /etc/initramfs-tools/scripts/init-bottom/.
>>
>>> Where do you set breakpoints?
>>
>> To ensure that the root-ro script isn't the culprit, I disabled it (and
>> therefore AUFS too) via cmdline parameter root-ro=false and put a
>> breakpoint right after (break=init). At the breakpoint, the NFS mount
>> was still ro.
>>
>> I put another "breakpoint" in /etc/rc3.d/S01* start script to verify the
>> mount right after switch_root. Here, the mount was already rw.
>
> You can also write a small script - a wrapper for /sbin/init which will
> show you the mount info and exec /sbin/init, and run it with init=yourscript.

Although I was unable to write a proper wrapper :-) (the kernel
crashes), I now know that neither busybox nor AUFS is the culprit.
See below:

(initramfs) mount
rootfs on / type rootfs (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /root/dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051439,mode=755)
devpts on /root/dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
tmpfs on /root/run type tmpfs (rw,nosuid,relatime,size=3282972k,mode=755)
192.168.0.10:/muclab/image/debian-sid on /root type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.10)
(initramfs) exit
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051439,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
tmpfs on /run type tmpfs (rw,nosuid,relatime,size=3282972k,mode=755)
192.168.0.10:/muclab/image/debian-sid on / type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.10)
Usage: init {-e VAR[=VAL] | [-t SECONDS] {0|1|2|3|4|5|6|S|s|Q|q|A|a|B|b|C|c|U|u}}
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0100
CPU: 3 PID: 1 Comm: myinit Tainted: G I 3.16.0.vanilla+ #1
Hardware name: NetApp, Inc. FAS3270/FAS3270, BIOS 5.2.1 03/07/2013
 88043f8ebed8 814b1e74 815b4088 814b152b
 88040010 88043f8ebee8 88043f8ebe80 0001
 0100 88043f8e0358 81659c00 000141c0
Call Trace:
 [] ? dump_stack+0x41/0x51
 [] ? panic+0xc1/0x1eb
 [] ? do_exit+0xa01/0xa10
 [] ? recalc_sigpending+0xe/0x30
 [] ? do_group_exit+0x3a/0x110
 [] ? SyS_exit_group+0xb/0x10
 [] ? system_call_fastpath+0x1a/0x1f
Kernel Offset: 0x0 from 0x8100 (relocation range: 0x8000-0x9fff)
---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0100

>
>> Let me double check that AUFS is not broken. I try to boot a vanilla kernel.
>> I will come back to you w/ the results.
>>
>> Alex
>>
>> --
>> As a side note, if we boot w/ AUFS, the mount points are right.
>
> That's even more interesting :)
>
> Thanks,
>
> /mjt
Bug#756593: busybox's switch_root makes read-only NFS root read/write
On 06.08.2014 at 07:42, Michael Tokarev wrote:

> 05.08.2014 17:36, Zimmermann, Alexander wrote:
>
>> Despite the fact that I was unable to write a proper wrapper :-) - the
>> kernel crashes -
>> I know now that neither busybox nor AUFS is the culprit. See below:
>
> Um. The wrapper should be something like:
>
> #! /bin/sh
> echo mounts before-init:
> mount
> exec /sbin/init "$@"
>
> The key point is, I think, the `exec' keyword. Init should be started as
> pid=1.

I see. I missed the exec command.

>
>> (initramfs) mount
>> rootfs on / type rootfs (rw)
>> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
>> proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
>> udev on /root/dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051439,mode=755)
>> devpts on /root/dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
>> tmpfs on /root/run type tmpfs (rw,nosuid,relatime,size=3282972k,mode=755)
>> 192.168.0.10:/muclab/image/debian-sid on /root type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.10)
>> (initramfs) exit
>> sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
>> proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
>> udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051439,mode=755)
>> devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
>> tmpfs on /run type tmpfs (rw,nosuid,relatime,size=3282972k,mode=755)
>> 192.168.0.10:/muclab/image/debian-sid on / type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.10)
>> Usage: init {-e VAR[=VAL] | [-t SECONDS] {0|1|2|3|4|5|6|S|s|Q|q|A|a|B|b|C|c|U|u}}
>> Kernel panic - not syncing: Attempted to kill init! exitcode=0x0100

Actually, I was too fast. Since the node didn't come up in this test, I
cannot be sure whether the NFS mount will be RO or RW after booting has
completed. Let me rerun the test. I will come back w/ the results.

>
> But I see. Busybox switch_root worked, ran your myinit, and the mount
> in question - nfs mount - is still readonly like it should be...
>
> Now this is really interesting.
>
> Do you have /etc/fstab entry for this mount?

No. fstab is empty. I will create

>
> Thanks,
>
> /mjt
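The wrapper discussed above prints the whole mount table; when only the root entry matters, a small helper can reduce it to the ro/rw flag. What follows is a minimal sketch, not something used in this thread: the function name root_mount_mode is hypothetical, and it assumes that awk is available in the real root and that the input has the /proc/mounts layout (device, mount point, type, options). Because a mount table may list more than one entry on / (rootfs plus the real root), the last matching entry wins.

```shell
#!/bin/sh
# root_mount_mode [MOUNTS_FILE]   (hypothetical helper, not from the thread)
# Prints "ro" or "rw" for the last entry mounted on "/", or "unknown"
# if no such entry exists. MOUNTS_FILE defaults to /proc/mounts, whose
# fields are: device, mount point, fs type, options, dump, pass.
root_mount_mode() {
    awk '
        $2 == "/" {
            mode = "unknown"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw")
                    mode = opts[i]
        }
        END { print (mode == "" ? "unknown" : mode) }
    ' "${1:-/proc/mounts}"
}
```

In a wrapper like the one quoted above, a line such as `echo "root is $(root_mount_mode)"` placed before `exec /sbin/init "$@"` would log the state at the moment control is handed to init.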
Bug#756593: busybox's switch_root makes read-only NFS root read/write
On 07.08.2014 at 15:02, Alexander Zimmermann wrote:

>
> On 06.08.2014 at 07:42, Michael Tokarev wrote:
>
>> 05.08.2014 17:36, Zimmermann, Alexander wrote:
>>
>>> Despite the fact that I was unable to write a proper wrapper :-) - the
>>> kernel crashes -
>>> I know now that neither busybox nor AUFS is the culprit. See below:
>>
>> Um. The wrapper should be something like:
>>
>> #! /bin/sh
>> echo mounts before-init:
>> mount
>> exec /sbin/init "$@"
>>
>> The key point is, I think, the `exec' keyword. Init should be started as
>> pid=1.
>

So here is the full output. Vanilla Linux 3.16. No patches. The root is
still ro when the wrapper runs and rw after login, so there is definitely
something broken in userland. I will set up a new image via debootstrap
next week.

(initramfs) mount
rootfs on / type rootfs (rw)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /root/dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051439,mode=755)
devpts on /root/dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
tmpfs on /root/run type tmpfs (rw,nosuid,relatime,size=3282972k,mode=755)
192.168.0.10:/muclab/image/debian-sid on /root type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.10)
(initramfs) exit
mounts before-init:
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051439,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
tmpfs on /run type tmpfs (rw,nosuid,relatime,size=3282972k,mode=755)
192.168.0.10:/muclab/image/debian-sid on / type nfs (ro,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.10)
INIT: version 2.88 booting
[info] Using makefile-style concurrent boot in runlevel S.
[ ok ] Starting the hotplug events dispatcher: udevd.
[ ok ] Synthesizing the initial hotplug events...done.
[ ok ] Waiting for /dev to be fully populated...done.
[ ok ] Activating swap...done.
[ ok ] Activating lvm and md swap...done.
[] Checking file systems...fsck from util-linux 2.20.1 done.
[ ok ] Cleaning up temporary files
[ ok ] Mounting local filesystems...done.
[ ok ] Activating swapfile swap...done.
[ ok ] Cleaning up temporary files
[ ok ] Setting kernel variables ...done.
[ ok ] Configuring network interfaces...done.
[ ok ] Starting rpcbind daemon
[ ok ] Starting NFS common utilities: statd idmapd.
[ ok ] Cleaning up temporary files
[info] Setting console screen modes and fonts.
INIT: Entering runlevel: 2wersave m
[info] Using makefile-style concurrent boot in runlevel 2.
[ ok ] Starting NFS common utilities: statd idmapd.
[ ok ] Starting enhanced syslogd: rsyslogd.
[warn] Not running within Xen or no compatible utils ... (warning).
[ ok ] Starting NTP server: ntpd.
[ ok ] Starting OpenBSD Secure Shell server: sshd.
[ ok ] Starting automount
[ ok ] Starting periodic command scheduler: cron.
Inserting openvswitch module.
Starting ovsdb-server.
Configuring Open vSwitch system IDs.
Starting ovs-vswitchd.
Enabling remote OVSDB managers.

Debian GNU/Linux jessie/sid UNKNOWN ttyS0

UNKNOWN login: alexandz
Password:
Last login: Fri Aug 8 12:21:21 CEST 2014 from vpn2ntap-54538.vpn.netapp.com on pts/0
Linux UNKNOWN 3.16.0.vanilla+ #1 SMP Tue Aug 5 14:07:48 CEST 2014 x86_64

Please see http://wikid.netapp.com/w/MUClab for more information about the lab equipment.

alexandz@UNKNOWN:~$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=2051439,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=1641488k,mode=755)
192.168.0.10:/muclab/image/debian-sid on / type nfs (rw,relatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=10,sec=sys,local_lock=all,addr=192.168.0.10)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=3282960k)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,relatime,size=3282960k)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
/etc/auto.home on /home type autofs (rw,relatime,fd=6,pgrp=1629,timeout=300,minproto=5,maxproto=5,indirect)
192.168.0.10:/home/alexandz on /home/alexandz type nfs (rw,noatime,vers=3,rsize=65536,wsize=65536,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.10,mountvers=3,mountport=635,mountproto=tcp,local_lock=none,addr=192.168.0.10)
alexandz@UNKNOWN:~$
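The output above places the change somewhere between the wrapper handing over to init and the login prompt. One possible next step, which was not taken in this thread, is to list the boot scripts that mention a remount at all. A minimal sketch follows; the function name list_remount_scripts is hypothetical, /etc/init.d as the default directory is an assumption based on the sysvinit boot messages, and matching on the literal string "remount" is only a heuristic that will miss scripts which build the option indirectly.

```shell
#!/bin/sh
# list_remount_scripts [DIR]   (hypothetical helper, not from the thread)
# Prints the base names of regular files directly inside DIR
# (default: /etc/init.d) whose text contains the string "remount".
list_remount_scripts() {
    dir=${1:-/etc/init.d}
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty DIR.
        [ -f "$f" ] || continue
        if grep -q -e remount -- "$f"; then
            printf '%s\n' "${f##*/}"
        fi
    done
}
```

Each script it names would still have to be read by hand to see whether it touches / and under which conditions.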
Bug#756593: busybox's switch_root makes read-only NFS root read/write
Hi Michael,

On 29.09.2014 at 08:37, Michael Tokarev wrote:

> [Rehashing a somewhat old thread...]
> 08.08.2014 14:55, Zimmermann, Alexander wrote:
>
>> So here is the full output. Vanilla Linux 3.16. No patches. There is
>> definitely something
>> broken in the userland. I will set up a new image via debootstrap next week.
>
> So, Alexander, did you succeed in finding what turns your root
> read-write?

Actually, I was not able to find the culprit. I set up a new sid image
from scratch (the old one was an upgrade from wheezy), and this time it
works w/o any problems.

> Since you confirm the bug is not in busybox, I'm
> about to close this bug report,

Yes. I agree.

> or maybe we should reassign it
> to some other package instead, because it contains quite some
> useful debugging information and it'd be sad if this info will
> be lost...

But if we close the ticket, the information is not lost. Right?

BR
Alex

>
> I still can't reproduce the problem you describe.
>
> Thanks,
>
> /mjt