Public bug reported:

When nbd-client is started from the initramfs during a diskless boot, in order to provide the root filesystem, its -persist option does not work. nbd-client is started before run-init, and once run-init has run it can no longer find /dev/nbd0 and /sys/block/nbd0/pid, because run-init deleted them.
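The diagnosis above can be checked from userspace. What follows is a minimal Python sketch, assuming Linux with /proc mounted: it compares the device and inode numbers of two directories, which is a common way to tell whether a process still has its root in a different tree (here, the emptied initramfs) than the one the caller sees as /. The helper names are illustrative and are not part of nbd-client, and reading another process's /proc/<pid>/root normally requires root privileges.

```python
import os


def same_directory(path_a, path_b):
    """Return True if both paths resolve to the same directory (same device and inode)."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def proc_root(pid):
    """Path of the /proc symlink that points at a process's root directory."""
    return "/proc/%d/root" % pid


def shares_our_root(pid):
    """True if process `pid` has the same root directory as the caller."""
    return same_directory(proc_root(pid), "/")
```

If the diagnosis is right, one would expect shares_our_root() to return False for an nbd-client started in the initramfs and True for one started later from a normal shell.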
The nbd scripts for initramfs start nbd-client like this:

    @sbin/nbd-client NBD_SERVER_IP -N root /dev/nbd0 -swap -persist -systemd-mark

The -persist option should allow nbd-client to reconnect if the TCP session is lost.

Here is a sequence of steps to observe the behaviour:

1. The system has booted OK. nbd-client is active:

    root 359 0.2 0.2 4372 2212 ? SL 11:16 0:00 @sbin/nbd-client 10.4.104.4 -N root /dev/nbd0 -swap -persist -systemd-mark
    root 362 0.0 0.0 0 0 ? S< 11:16 0:00 [nbd0]

/dev/nbd0 exists:

    brw-rw---- 1 root disk 43, 0 Nov 24 09:19 /dev/nbd0

and the nbd-client process uses it:

    root@host:~# ls -l /proc/359/fd/
    total 0
    lr-x------ 1 root root 64 Nov 24 09:20 0 -> /dev/null
    lrwx------ 1 root root 64 Nov 24 09:20 1 -> /dev/console (deleted)
    lrwx------ 1 root root 64 Nov 24 09:20 2 -> /dev/console (deleted)
    lrwx------ 1 root root 64 Nov 24 09:20 3 -> socket:[9447]
    lrwx------ 1 root root 64 Nov 24 09:20 4 -> /dev/nbd0

2. If I restart nbd-server, nbd-client exits. The only way to see what is happening is to strace nbd-client. This has a side effect: when strace attaches, the ioctl returns, and nbd-client tries to reconnect and dies. So just by stracing nbd-client I force a disconnect/reconnect, without any need to restart nbd-server. My guess is that the same thing happens when I restart nbd-server (but I just cannot see it).
Please observe the behavior:

    root@GTSRO-S-123456:~# strace -p 359
    strace: Process 359 attached
    getpid() = 359
    write(2, "nbd,359: Kernel call returned: 1"..., 34) = 34
    close(3) = 0
    close(4) = 0
    write(2, " Reconnecting\n", 14) = 14
    socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE) = 3
    bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
    getsockname(3, {sa_family=AF_NETLINK, pid=359, groups=00000000}, [12]) = 0
    sendto(3, "\24\0\0\0\26\0\1\3Y\2606X\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
    recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"L\0\0\0\24\0\2\0Y\2606Xg\1\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 256
    recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"H\0\0\0\24\0\2\0Y\2606Xg\1\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 144
    recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\3\0\2\0Y\2606Xg\1\0\0\0\0\0\0", 4096}], msg_controllen=0, msg_flags=0}, 0) = 20
    close(3) = 0
    socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(10809), sin_addr=inet_addr("10.4.104.4")}, 16) = 0
    setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
    !!!!! this is the problem:
    open("/dev/nbd0", O_RDWR) = -1 ENOENT (No such file or directory)
    open("/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
    socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
    connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such file or directory)
    close(4) = 0
    socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
    connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such file or directory)
    close(4) = 0
    write(2, "Error: Cannot open NBD: No such "..., 59) = 59
    exit_group(1) = ?
    +++ exited with 1 +++

3. Now, if I have cached the /sbin/nbd-client file before doing the strace (by running cat /sbin/nbd-client > /dev/null), I can start it again:

    root@host:~# /sbin/nbd-client 10.4.104.4 -N root /dev/nbd0 -swap -persist -systemd-mark
    Negotiation: ..size = 32765MB
    bs=1024, sz=34357604352 bytes

4. If I strace the newly launched nbd-client, strace causes a disconnect, as above, but this time nbd-client is able to open /dev/nbd0 again, so it does not die:

    strace: Process 1851 attached
    write(2, "nbd,1851: Kernel call returned: "..., 35) = 35
    close(3) = 0
    close(4) = 0
    write(2, " Reconnecting\n", 14) = 14
    socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE) = 3
    bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
    getsockname(3, {sa_family=AF_NETLINK, pid=1851, groups=00000000}, [12]) = 0
    sendto(3, "\24\0\0\0\26\0\1\3.\2646X\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
    recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"L\0\0\0\24\0\2\0.\2646X;\7\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 256
    recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"H\0\0\0\24\0\2\0.\2646X;\7\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"..., 4096}], msg_controllen=0, msg_flags=0}, 0) = 144
    recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\3\0\2\0.\2646X;\7\0\0\0\0\0\0", 4096}], msg_controllen=0, msg_flags=0}, 0) = 20
    close(3) = 0
    socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(10809), sin_addr=inet_addr("10.4.104.4")}, 16) = 0
    setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
    !!!!! here it works, because the process' VFS is the same as the real /
    open("/dev/nbd0", O_RDWR) = 4
    write(1, "Negotiation: ", 13) = 13
    read(3, "NBDMAGIC", 8) = 8
    write(1, ".", 1) = 1
    read(3, "IHAVEOPT", 8) = 8
    write(1, ".", 1) = 1
    read(3, "\0\3", 2) = 2
    write(3, "\0\0\0\3", 4) = 4
    write(3, "IHAVEOPT", 8) = 8
    write(3, "\0\0\0\1", 4) = 4
    write(3, "\0\0\0\4", 4) = 4
    write(3, "root", 4) = 4
    read(3, "\0\0\0\7\377\337p\0", 8) = 8
    write(1, "size = 32765MB", 14) = 14
    read(3, "\0\3", 2) = 2
    write(1, "\n", 1) = 1
    ioctl(4, NBD_SET_BLKSIZE, 0x1000) = 0
    ioctl(4, NBD_SET_SIZE_BLOCKS, 0x7ffdf7) = 0
    ioctl(4, NBD_SET_BLKSIZE, 0x400) = 0
    write(2, "bs=1024, sz=34357604352 bytes\n", 30) = 30
    ioctl(4, NBD_CLEAR_SOCK, 0x7f7daa224770) = 0
    ioctl(4, NBD_SET_FLAGS, 0x3) = 0
    ioctl(4, BLKROSET, [1]) = 0
    ioctl(4, NBD_SET_SOCK, 0x3) = 0
    mlockall(MCL_CURRENT|MCL_FUTURE) = 0
    rt_sigprocmask(SIG_SETMASK, ~[KILL PIPE TERM RTMIN RT_1], ~[KILL PIPE TERM STOP RTMIN RT_1], 8) = 0
    clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7daa4419d0) = 2160
    ioctl(4, NBD_DO_IT [...]

After discussing this with someone on the nbd-general mailing list, I received the suggestion to remove the close(nbddev) and the subsequent open(nbddev..) from the code. The process no longer died, but the forked child, which tries to do some initialization magic, was also unable to open the /sys filesystem, so this workaround was not good enough.

There might be a nice, clean way to make nbd-client reattach to the correct VFS tree, but I don't know it. What came to my mind was to make nbd-client mount devtmpfs and sysfs again in its own VFS tree, so that it can continue. I have created this small patch, which makes nbd-client persistent when providing the root filesystem from initramfs. If there is a cleaner way to do it, I would gladly hear about it. If not, please include this patch in future nbd-client releases.
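The remount idea described above can be sketched as follows. This is not the attached patch, which changes nbd-client itself (a C program); it is only an illustration, in Python via ctypes, of the two mount(2) calls the idea amounts to. The function names, the target paths /dev and /sys, the /sys/block visibility check, and the decision to recreate missing mount points are all assumptions. It needs root, and it only makes sense inside a process whose root is still the emptied initramfs tree.

```python
import ctypes
import os

# (source, target, fstype) for the two kernel filesystems nbd-client relies on:
# /dev/nbd0 lives on devtmpfs and /sys/block/nbd0/pid on sysfs.
REMOUNT_PLAN = [
    ("devtmpfs", "/dev", "devtmpfs"),
    ("sysfs", "/sys", "sysfs"),
]


def missing_paths(paths, exists=os.path.exists):
    """Return the subset of `paths` that the current process cannot see."""
    return [p for p in paths if not exists(p)]


def remount_kernel_filesystems(plan=REMOUNT_PLAN):
    """Mount each filesystem in `plan` with mount(2). Requires CAP_SYS_ADMIN."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = (ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p)
    libc.mount.restype = ctypes.c_int
    for source, target, fstype in plan:
        os.makedirs(target, exist_ok=True)  # the mount point may have been deleted
        if libc.mount(source.encode(), target.encode(), fstype.encode(), 0, None) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), target)


def ensure_device_visible(required=("/dev/nbd0", "/sys/block")):
    """Remount only when the device node or sysfs is not visible (hypothetical check)."""
    if missing_paths(required):
        remount_kernel_filesystems()
```

In this sketch, ensure_device_visible() would be called just before the reconnect path reopens /dev/nbd0. Where the attached patch performs the equivalent mounts is not shown in this report.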
Not having persistence for the / filesystem on a diskless client is not nice at all. With this patch, my client switches from main to slave NBD server (in a corosync/pacemaker/drbd cluster) without any issue.

** Affects: nbd (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: nbd

** Patch added: "nbd-client persistence for diskless root filesystem"
   https://bugs.launchpad.net/bugs/1645048/+attachment/4783590/+files/nbd-client.patch

https://bugs.launchpad.net/bugs/1645048

Title:
  nbd-client when started from initramfs (for diskless boot) is not persistent