This series adds an `mknod` feature flag for unprivileged containers. It is implemented by setting `lxc.seccomp.notify.proxy` to the socket on which pve-lxc-syscalld is listening (and `….notify.cookie` to the vmid, for possible future use).
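
For illustration, the resulting raw LXC config entries might look roughly like the sketch below. The key names follow the LXC configuration documentation; the socket path and the vmid `100` are assumptions made for this example and are not taken from the patches themselves.

```
# Sketch only: socket path and vmid are assumed values.
# Forward seccomp notifications to the syscall daemon's unix socket.
lxc.seccomp.notify.proxy = unix:/run/pve/lxc-syscalld.sock
# Opaque string sent along with each proxied request (here: the vmid).
lxc.seccomp.notify.cookie = 100
```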

Currently the daemon handles `mknod()` with the following whitelist:

  c:0:0 - whiteout
  c:1:7 - /dev/full
  c:1:3 - /dev/null
  c:1:5 - /dev/zero
  c:1:8 - /dev/random
  c:1:9 - /dev/urandom
  c:5:0 - /dev/tty
  c:5:1 - /dev/console

The seccomp interface currently requires us to either handle a syscall completely or not catch it at all. (IOW, you can't handle just some cases of a syscall manually and tell the kernel to "do its thing" for the cases you don't want to handle.) For `mknod` this is not an issue, since the kernel won't allow it at all in user namespaces.

Later, when we get a kernel >= 5.5, we could also start partially handling some syscalls (e.g. mount, but that'll only be feasible with the old mount API) and send the cases we don't want to handle "back to the kernel".

Wolfgang Bumiller (4):
  add mknod feature flag
  add a check_kernel_release helper
  mask 'mknod' feature by kernel version
  set lxc.seccomp.notify.cookie to the vmid

 src/Makefile          |   1 -
 src/PVE/LXC.pm        | 116 ++++++++++++++++++++++++++++++++++++------
 src/PVE/LXC/Config.pm |   8 +++
 3 files changed, 109 insertions(+), 16 deletions(-)

-- 
2.20.1

_______________________________________________
pve-devel mailing list
pve-devel@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel