From: Konstantin Khorenko <khore...@virtuozzo.com>

A simple NFS mount inside a Container brings us to vfs_submount(), so if we
want to enable NFS inside a Container (read: in the CT root userns), we have
to soften the check for init userns.

  SyS_mount
  do_mount
  vfs_kern_mount
  mount_fs
  nfs_fs_mount
  nfs4_try_mount
  nfs_follow_remote_path
  mount_subtree
  vfs_path_lookup
  do_path_lookup
  filename_lookup
  path_lookupat
  lookup_slow
  follow_managed
  nfs_d_automount
  nfs4_submount
  nfs_do_submount
  vfs_submount

https://jira.sw.ru/browse/PSBM-86277
Signed-off-by: Konstantin Khorenko <khore...@virtuozzo.com>

https://jira.sw.ru/browse/PSBM-127234
(cherry picked from vz7 commit bc060d46276144f91a139b7d0acf384dcd0a4dde)

vz7->vz8 port note:
  in vz7 the check was dropped entirely;
  in vz8 we keep the check, but allow submounts only for the root CT userns.

Signed-off-by: Konstantin Khorenko <khore...@virtuozzo.com>
Reviewed-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>

+++
ve/fs/namespace: fix allowing submounts in non-init userns

When mounting an nfs4 mount inside a container with something like:

  mount -t nfs4 $NODEIP:/root/build/criu /mnt

we can see that, because the source "root" path is several directories long,
we create several submounts. Adding perf probes to list
mountpoint->d_sb->s_user_ns and mountpoint->d_iname from vfs_submount we see:

  crash> p &init_user_ns
  $2 = (struct user_namespace *) 0xffffffff9644efc0

1) The first submount created has mountpoint dentry "root" and the ve userns:

  mount.nfs4 ...: probe:vfs_submount: (ffffffff95a970e0) user_ns=0xffff8b6d6e86a000 dentry="root"

2) The second submount created has mountpoint dentry "build" from the first
   submount and the init userns of the host:

  mount.nfs4 ...: probe:vfs_submount: (ffffffff95a970e0) user_ns=0xffffffff9644efc0 dentry="build"

So on the first step we have the ve userns and on the second the init userns.
Comparing against either the init userns or the ve userns alone would not
work, because we can see both of them. So the easy solution here is to
disable the check completely, as we do in vz7.

Note: this patch allows nfs4 mounts in containers, thus we overcome the nfs3
rpcbind non-dumpable socket migration problems, as nfs now mounts in v4 mode
by default.
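
The two-step submount sequence described above can be modelled in user space
to show why a single pointer comparison cannot pass both steps. This is only
a sketch, not kernel code: the struct definitions are hypothetical stand-ins
that model just the fields the check reads, ve_user_ns stands for the CT root
userns seen in the first probe hit, and check_allows(), sequence_allowed()
and submount_steps are illustrative names that do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel structures; only the
 * fields read by the check in vfs_submount() are modelled.
 */
struct user_namespace { const char *label; };
struct super_block { struct user_namespace *s_user_ns; };
struct dentry { struct super_block *d_sb; const char *d_iname; };

static struct user_namespace init_user_ns = { "init" };
static struct user_namespace ve_user_ns = { "ve" };	/* assumed CT root userns */

/* Superblocks of the two mountpoints seen in the probe output. */
static struct super_block ct_sb = { &ve_user_ns };
static struct super_block nfs_sb = { &init_user_ns };

/* Step 1: dentry "root" with the ve userns.
 * Step 2: dentry "build" with the init userns.
 */
static const struct dentry submount_steps[] = {
	{ &ct_sb, "root" },
	{ &nfs_sb, "build" },
};

/* Returns 1 if a check of the form "s_user_ns must be @allowed" would let
 * the submount on @mountpoint through, 0 where the kernel would return
 * -EPERM.
 */
static int check_allows(const struct dentry *mountpoint,
			const struct user_namespace *allowed)
{
	assert(mountpoint && mountpoint->d_sb);
	return mountpoint->d_sb->s_user_ns == allowed;
}

/* Returns 1 only if every step of the sequence passes the same check. */
static int sequence_allowed(const struct dentry *steps, size_t n,
			    const struct user_namespace *allowed)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!check_allows(&steps[i], allowed))
			return 0;
	return 1;
}
```

With these definitions, sequence_allowed(submount_steps, 2, &ve_user_ns) and
sequence_allowed(submount_steps, 2, &init_user_ns) both return 0: each
comparison passes one step and rejects the other, which matches the
observation that neither userns alone can be used in the check.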

https://jira.sw.ru/browse/PSBM-102629
Fixes: 81a2b734416d ("ve/fs/namespace: allow submounts in non-init userns")
Signed-off-by: Pavel Tikhomirov <ptikhomi...@virtuozzo.com>
Signed-off-by: Kirill Tkhai <ktk...@virtuozzo.com>
---
 fs/namespace.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index c10614908e7e..85a451861e14 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1051,12 +1051,37 @@ struct vfsmount *
 vfs_submount(const struct dentry *mountpoint, struct file_system_type *type,
 	     const char *name, void *data)
 {
+#if 0
 	/* Until it is worked out how to pass the user namespace
 	 * through from the parent mount to the submount don't support
 	 * unprivileged mounts with submounts.
 	 */
+	/* Simple NFS mount inside a Container brings us here, so if we want to
+	 * enable NFS inside a Container (read - in non-init userns), we have
+	 * to omit the check. Below is how is was in VZ8:
+	 *
+	 * SyS_mount
+	 * do_mount
+	 * vfs_kern_mount
+	 * mount_fs
+	 * nfs_fs_mount
+	 * nfs4_try_mount
+	 * nfs_follow_remote_path
+	 * mount_subtree
+	 * vfs_path_lookup
+	 * do_path_lookup
+	 * filename_lookup
+	 * path_lookupat
+	 * lookup_slow
+	 * follow_managed
+	 * nfs_d_automount
+	 * nfs4_submount
+	 * nfs_do_submount
+	 * vfs_submount
+	 */
 	if (mountpoint->d_sb->s_user_ns != &init_user_ns)
 		return ERR_PTR(-EPERM);
+#endif
 
 	return vfs_kern_mount(type, SB_SUBMOUNT, name, data);
 }
_______________________________________________
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel