Public bug reported:

When nbd-client is started from initramfs to provide the root filesystem
for a diskless boot, it is not persistent: it cannot reconnect after the
connection drops. This is because nbd-client is started before run-init,
and afterwards it can no longer find /dev/nbd0 and /sys/block/nbd0/pid,
because run-init deleted them.
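
One way to confirm this from the running system is to look at the two paths through the root directory of the nbd-client process. The following is a minimal sketch, not part of nbd-client; the function name is hypothetical, and it assumes that /proc/<pid>/root reflects the tree the process resolves absolute paths against and that the caller (normally root) is allowed to inspect it.

```python
import os

def visible_in_process_root(pid, path, proc="/proc"):
    """Return True if absolute `path` exists under the root directory of `pid`.

    `proc` is the procfs mount point; it is a parameter only so the
    function can be exercised against a fake directory tree.
    """
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    root = os.path.join(proc, str(pid), "root")
    # lexists: report the entry itself, even if it is a dangling symlink
    return os.path.lexists(root + path)
```

For the client in this report one would call visible_in_process_root(359, "/dev/nbd0") and visible_in_process_root(359, "/sys/block/nbd0/pid"); if the explanation above is right, both should return False for the client started from initramfs.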

The nbd scripts for initramfs start nbd-client like this:

@sbin/nbd-client NBD_SERVER_IP -N root /dev/nbd0 -swap -persist -systemd-mark

The -persist option should allow nbd-client to reconnect if the TCP
session is lost.

Here is a sequence of steps to observe the behaviour:

1. The system has booted OK and nbd-client is active:
root       359  0.2  0.2   4372  2212 ?        SL   11:16   0:00 @sbin/nbd-client 10.4.104.4 -N root /dev/nbd0 -swap -persist -systemd-mark
root       362  0.0  0.0      0     0 ?        S<   11:16   0:00 [nbd0]

/dev/nbd0 exists:
brw-rw---- 1 root disk 43, 0 Nov 24 09:19 /dev/nbd0

and the nbd-client process uses it:
root@host:~# ls -l /proc/359/fd/
total 0
lr-x------ 1 root root 64 Nov 24 09:20 0 -> /dev/null
lrwx------ 1 root root 64 Nov 24 09:20 1 -> /dev/console (deleted)
lrwx------ 1 root root 64 Nov 24 09:20 2 -> /dev/console (deleted)
lrwx------ 1 root root 64 Nov 24 09:20 3 -> socket:[9447]
lrwx------ 1 root root 64 Nov 24 09:20 4 -> /dev/nbd0
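
The "(deleted)" suffix in this listing marks descriptors whose original path no longer exists. As a rough sketch (the helper name is hypothetical, and it relies only on the suffix shown in the listing above), the same check can be scripted for any /proc/<pid>/fd directory:

```python
import os

DELETED_SUFFIX = " (deleted)"

def deleted_fd_targets(fd_dir):
    """Map fd number -> original path for descriptors marked as deleted.

    `fd_dir` is a directory of numbered symlinks such as /proc/359/fd.
    """
    result = {}
    for name in sorted(os.listdir(fd_dir), key=int):
        target = os.readlink(os.path.join(fd_dir, name))
        if target.endswith(DELETED_SUFFIX):
            result[int(name)] = target[:-len(DELETED_SUFFIX)]
    return result
```

Applied to the listing above, this would report descriptors 1 and 2 (both /dev/console). Descriptor 4 is not marked, so the already-open /dev/nbd0 is still usable; the trouble only starts once the client closes it and has to look the path up again.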

2. If I restart nbd-server, nbd-client exits. The only way to see what
is happening is to strace nbd-client, but this has a side effect: when
strace attaches, the ioctl returns, and nbd-client tries to reconnect
and dies. So just by stracing nbd-client, I force a
disconnect/reconnect without needing to restart nbd-server. My guess is
that the same thing happens when I restart nbd-server (I just cannot
see it). Please observe the behavior:

root@GTSRO-S-123456:~# strace -p 359
strace: Process 359 attached
getpid()                                = 359
write(2, "nbd,359: Kernel call returned: 1"..., 34) = 34
close(3)                                = 0
close(4)                                = 0
write(2, " Reconnecting\n", 14)         = 14
socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE) = 3
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, pid=359, groups=00000000}, [12]) = 0
sendto(3, "\24\0\0\0\26\0\1\3Y\2606X\0\0\0\0\0\0\0\0", 20, 0, 
{sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"L\0\0\0\24\0\2\0Y\2606Xg\1\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"...,
 4096}], msg_controllen=0, msg_flags=0}, 0) = 256
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"H\0\0\0\24\0\2\0Y\2606Xg\1\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"...,
 4096}], msg_controllen=0, msg_flags=0}, 0) = 144
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"\24\0\0\0\3\0\2\0Y\2606Xg\1\0\0\0\0\0\0", 4096}], 
msg_controllen=0, msg_flags=0}, 0) = 20
close(3)                                = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(10809), 
sin_addr=inet_addr("10.4.104.4")}, 16) = 0
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0

!!!!! this is the problem:
open("/dev/nbd0", O_RDWR)               = -1 ENOENT (No such file or directory)

open("/etc/localtime", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or 
directory)
socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such 
file or directory)
close(4)                                = 0
socket(PF_LOCAL, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 4
connect(4, {sa_family=AF_LOCAL, sun_path="/dev/log"}, 110) = -1 ENOENT (No such 
file or directory)
close(4)                                = 0
write(2, "Error: Cannot open NBD: No such "..., 59) = 59
exit_group(1)                           = ?
+++ exited with 1 +++

3. Now, if I cached the /sbin/nbd-client binary before running strace (with
cat /sbin/nbd-client > /dev/null, so that it can still be read once the root
device is gone), I can start it again:
root@host:~# /sbin/nbd-client 10.4.104.4 -N root /dev/nbd0 -swap -persist -systemd-mark
Negotiation: ..size = 32765MB
bs=1024, sz=34357604352 bytes

4. If I strace the newly launched nbd-client, strace will cause a
disconnect, as above, but this time the nbd-client will be able to open
/dev/nbd0 again so it will not die:

strace: Process 1851 attached
write(2, "nbd,1851: Kernel call returned: "..., 35) = 35
close(3)                                = 0
close(4)                                = 0
write(2, " Reconnecting\n", 14)         = 14
socket(PF_NETLINK, SOCK_RAW, NETLINK_ROUTE) = 3
bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
getsockname(3, {sa_family=AF_NETLINK, pid=1851, groups=00000000}, [12]) = 0
sendto(3, "\24\0\0\0\26\0\1\3.\2646X\0\0\0\0\0\0\0\0", 20, 0, 
{sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 20
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"L\0\0\0\24\0\2\0.\2646X;\7\0\0\2\10\200\376\1\0\0\0\10\0\1\0\177\0\0\1"...,
 4096}], msg_controllen=0, msg_flags=0}, 0) = 256
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"H\0\0\0\24\0\2\0.\2646X;\7\0\0\n\200\200\376\1\0\0\0\24\0\1\0\0\0\0\0"...,
 4096}], msg_controllen=0, msg_flags=0}, 0) = 144
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, 
msg_iov(1)=[{"\24\0\0\0\3\0\2\0.\2646X;\7\0\0\0\0\0\0", 4096}], 
msg_controllen=0, msg_flags=0}, 0) = 20
close(3)                                = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(10809), 
sin_addr=inet_addr("10.4.104.4")}, 16) = 0
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0

!!!!! here it works, because the process' VFS is the same as the real /
open("/dev/nbd0", O_RDWR)               = 4

write(1, "Negotiation: ", 13)           = 13
read(3, "NBDMAGIC", 8)                  = 8
write(1, ".", 1)                        = 1
read(3, "IHAVEOPT", 8)                  = 8
write(1, ".", 1)                        = 1
read(3, "\0\3", 2)                      = 2
write(3, "\0\0\0\3", 4)                 = 4
write(3, "IHAVEOPT", 8)                 = 8
write(3, "\0\0\0\1", 4)                 = 4
write(3, "\0\0\0\4", 4)                 = 4
write(3, "root", 4)                     = 4
read(3, "\0\0\0\7\377\337p\0", 8)       = 8
write(1, "size = 32765MB", 14)          = 14
read(3, "\0\3", 2)                      = 2
write(1, "\n", 1)                       = 1
ioctl(4, NBD_SET_BLKSIZE, 0x1000)       = 0
ioctl(4, NBD_SET_SIZE_BLOCKS, 0x7ffdf7) = 0
ioctl(4, NBD_SET_BLKSIZE, 0x400)        = 0
write(2, "bs=1024, sz=34357604352 bytes\n", 30) = 30
ioctl(4, NBD_CLEAR_SOCK, 0x7f7daa224770) = 0
ioctl(4, NBD_SET_FLAGS, 0x3)            = 0
ioctl(4, BLKROSET, [1])                 = 0
ioctl(4, NBD_SET_SOCK, 0x3)             = 0
mlockall(MCL_CURRENT|MCL_FUTURE)        = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL PIPE TERM RTMIN RT_1], ~[KILL PIPE TERM STOP 
RTMIN RT_1], 8) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7f7daa4419d0) = 2160
ioctl(4, NBD_DO_IT
[...]
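
As a cross-check of the negotiation in this trace, the 8 bytes read after the export name ("\0\0\0\7\377\337p\0") are the export size as a 64-bit big-endian integer, and they agree with the values nbd-client prints and passes to the kernel. A small worked example, with variable names chosen only for illustration:

```python
import struct

# The 8 bytes from the trace, with the octal escapes written as hex.
raw = b"\x00\x00\x00\x07\xff\xdf\x70\x00"

(size_bytes,) = struct.unpack(">Q", raw)  # 64-bit big-endian export size
blocks_4k = size_bytes // 0x1000          # block count for NBD_SET_BLKSIZE 0x1000
size_mb = size_bytes >> 20                # whole MiB, as in "size = 32765MB"

print(size_bytes)      # 34357604352, matches "sz=34357604352 bytes"
print(hex(blocks_4k))  # 0x7ffdf7, matches NBD_SET_SIZE_BLOCKS
print(size_mb)         # 32765
```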

After I discussed this on the nbd-general mailing list, someone
suggested removing the close(nbddev) and the subsequent open(nbddev..)
from the code. With that change the process no longer died, but the
forked child, which tries to do some initialization, was also unable to
open the /sys filesystem, so this workaround was not good enough.

There might be a clean way to make nbd-client reattach to the correct
VFS tree, but I don't know of one. What came to mind was to have
nbd-client mount devtmpfs and sysfs again in its own VFS tree, so that
it can continue. I have created this small patch, which makes
nbd-client persistent when it provides the root filesystem from
initramfs. If there is a cleaner way to do it, I would gladly hear
about it. If not, please include this patch in future nbd-client
releases. Not having persistence for the / filesystem on a diskless
client is not nice at all. With this patch, my client switches from the
main to the slave NBD server (in a corosync/pacemaker/drbd cluster)
without any issue.
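
The idea of remounting devtmpfs and sysfs can be sketched as follows. This is not the attached patch (which modifies nbd-client itself) and only illustrates the logic under stated assumptions: the function names, the probe paths and the choice to mount on /dev and /sys are all illustrative, and the real mount(2) call needs root privileges on Linux.

```python
import ctypes
import os

def libc_mount(source, target, fstype):
    """Call mount(2) through libc; Linux only, needs CAP_SYS_ADMIN."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_ulong, ctypes.c_void_p]
    libc.mount.restype = ctypes.c_int
    if libc.mount(source.encode(), target.encode(), fstype.encode(), 0, None) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)

# (probe path, filesystem type, mount point); illustrative values only
DEFAULT_CHECKS = [
    ("/dev/nbd0", "devtmpfs", "/dev"),
    ("/sys/block/nbd0", "sysfs", "/sys"),
]

def remount_if_missing(checks=DEFAULT_CHECKS, mount=libc_mount,
                       exists=os.path.exists, makedirs=os.makedirs):
    """Mount each filesystem whose probe path is not visible to this process.

    Returns the list of mount points that were mounted.
    """
    mounted = []
    for probe, fstype, mountpoint in checks:
        if exists(probe):
            continue  # already visible, nothing to do
        makedirs(mountpoint, exist_ok=True)
        mount(fstype, mountpoint, fstype)
        mounted.append(mountpoint)
    return mounted
```

The mount, exists and makedirs arguments are injectable so that the decision logic can be tried without root and without touching the real /dev or /sys.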

** Affects: nbd (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: nbd

** Patch added: "nbd-client persistence for diskless root filesystem"
   https://bugs.launchpad.net/bugs/1645048/+attachment/4783590/+files/nbd-client.patch

https://bugs.launchpad.net/bugs/1645048

Title:
  nbd-client when started from initramfs (for diskless boot) is not
  persistent
