Hello,

The last big issue I'm running into when running systemd in an
unprivileged LXC container is that it's crashing on an assert in the
shutdown/reboot path right after unmounting all devices.

That's because mknod isn't allowed inside a user namespace, so we have
to bind-mount all the required device nodes from the host's /dev on
top of empty files in the container's /dev.

This all works great until systemd unmounts everything, at which point
all of those are 0-byte files. Systemd then opens /dev/urandom and
attempts to read some bytes from it, gets 0 bytes back and trips an
assertion.
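
The failure mode can be reproduced outside systemd. The following is a
minimal sketch, not systemd code: read_some() and make_empty_placeholder()
are hypothetical helpers, and the temporary file merely stands in for the
empty file left behind in /dev once the bind mount is gone.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open path and try to read up to len bytes.
 * Returns whatever read() reports, or -1 if the file cannot be opened. */
static ssize_t read_some(const char *path, void *buf, size_t len) {
        ssize_t n;
        int fd;

        assert(path);
        assert(buf);

        fd = open(path, O_RDONLY);
        if (fd < 0)
                return -1;

        n = read(fd, buf, len);
        close(fd);
        return n;
}

/* Hypothetical helper: create an empty regular file from a mkstemp()
 * template (modified in place). Returns 0 on success, -1 on failure. */
static int make_empty_placeholder(char *template) {
        int fd = mkstemp(template);

        if (fd < 0)
                return -1;

        close(fd);
        return 0;
}
```

Reading from the placeholder succeeds but hits end-of-file immediately,
so read_some() returns 0 rather than an error. A caller that asserts it
received the number of bytes it asked for will abort at that point.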


To fix that, I've got two different approaches, both with an associated
patch attached to this e-mail:
 - 0001-Add-dev-urandom-to-ignore_paths.patch:
   This very simply adds /dev/urandom to the ignore_paths list alongside
   /dev/console. That way all the other mount entries are unmounted but
   /dev/urandom isn't, fixing the issue we're currently seeing.

 - 0001-Ignore-devices-bind-mounts.patch:
   This one is a more generic take on the problem and should be more
   future-proof. Rather than hardcoding /dev/urandom, it extends the
   existing mount_point_ignore function to ignore any mountpoint which is a
   character or block device.
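
The device check in the second approach can be sketched on its own as
follows. This is an illustration under assumptions, not the patch
itself: path_is_device_node() is a hypothetical name, and the sketch
uses the POSIX S_ISCHR()/S_ISBLK() macros to test the file type.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of systemd: returns true if path itself
 * (symlinks are not followed) is a character or block device node. */
static bool path_is_device_node(const char *path) {
        struct stat st;

        assert(path);

        if (lstat(path, &st) < 0)
                return false;

        /* S_ISCHR()/S_ISBLK() compare the whole file-type field of
         * st_mode. A plain (st_mode & S_IFBLK) test is not equivalent:
         * the type values share bits, so on Linux a directory would
         * also pass it. */
        return S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode);
}
```

For a device node bind-mounted on top of an empty file, lstat() on the
mountpoint reports the device node, so the helper returns true for as
long as the bind mount is in place.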


I tend to prefer the latter because it's future-proof and avoids
hardcoding paths. However, it is certainly more likely to have
side effects than the first (though I can't think of any obvious ones).
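
For context on the first approach, ignore_paths is a list of
NUL-separated strings that mount_point_ignore walks entry by entry.
A rough standalone sketch of that lookup is shown here; it assumes
plain strcmp() in place of systemd's path_equal() and an open-coded
loop in place of NULSTR_FOREACH, and example_ignore_paths and
path_in_nulstr() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative list in the same layout as the patched ignore_paths:
 * entries are separated by NUL, and the string literal's implicit
 * terminator supplies the empty entry that ends the list. */
static const char example_ignore_paths[] =
        "/proc/sys\0"
        "/dev/console\0"
        "/dev/urandom\0"
        "/proc/kmsg\0";

/* Hypothetical helper: true if path exactly matches one list entry.
 * Unlike path_equal(), strcmp() does no path normalization. */
static bool path_in_nulstr(const char *path, const char *list) {
        const char *i;

        assert(path);
        assert(list);

        /* Step over one entry at a time; an empty entry ends the list. */
        for (i = list; *i; i += strlen(i) + 1)
                if (strcmp(path, i) == 0)
                        return true;

        return false;
}
```

With /dev/urandom in the list, the lookup matches it and the mountpoint
is skipped, while every other bind-mounted device is still unmounted.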

-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
From ef87932352d505b477c27c40f33af6015ea2b2b5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?St=C3=A9phane=20Graber?= <[email protected]>
Date: Thu, 15 Jan 2015 12:04:54 -0500
Subject: [PATCH] Add /dev/urandom to ignore_paths
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

systemd opens /dev/urandom during the shutdown sequence after it's done
unmounting everything. If /dev/urandom is a bind mount on top of an empty
file, the unmount pass removes it, and systemd then crashes on an assert
when it attempts to read some random bytes and gets 0 back.

Signed-off-by: Stéphane Graber <[email protected]>
---
 src/core/mount-setup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/core/mount-setup.c b/src/core/mount-setup.c
index bd3a035..b8ada98 100644
--- a/src/core/mount-setup.c
+++ b/src/core/mount-setup.c
@@ -130,6 +130,7 @@ static const char ignore_paths[] =
         /* Container bind mounts */
         "/proc/sys\0"
         "/dev/console\0"
+        "/dev/urandom\0"
         "/proc/kmsg\0";
 
 bool mount_point_is_api(const char *path) {
-- 
1.9.1

From 251cd89d77befb77238e4f6dd7903adcc8bbf035 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?St=C3=A9phane=20Graber?= <[email protected]>
Date: Thu, 15 Jan 2015 10:45:28 -0500
Subject: [PATCH] Ignore devices bind-mounts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Devices bind-mounts are typically used in unprivileged containers where
mknod isn't possible.

On those systems, all entries under /dev are empty files with the device
node bind-mounted on top.

Current systemd code will unmount all of those during the shutdown
sequence even though systemd may still need them, ultimately leading to
crashes in the shutdown/reboot sequence.

Signed-off-by: Stéphane Graber <[email protected]>
---
 src/core/mount-setup.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/core/mount-setup.c b/src/core/mount-setup.c
index bd3a035..faa7ab2 100644
--- a/src/core/mount-setup.c
+++ b/src/core/mount-setup.c
@@ -147,11 +147,21 @@ bool mount_point_is_api(const char *path) {
 
 bool mount_point_ignore(const char *path) {
         const char *i;
+        struct stat st;
 
+        /* Skip any mountpoint that's listed in ignore_paths */
         NULSTR_FOREACH(i, ignore_paths)
                 if (path_equal(path, i))
                         return true;
 
+        /* Additionally skip any block/character device bind-mount as
+           those can't possibly be blocking anything and are very likely
+           to be required by systemd itself. */
+        if (lstat(path, &st) == 0 &&
+            (S_ISBLK(st.st_mode) || S_ISCHR(st.st_mode))) {
+                return true;
+        }
+
         return false;
 }
 
-- 
1.9.1

_______________________________________________
systemd-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/systemd-devel