From: Srivatsa S. Bhat <sriva...@csail.mit.edu>

The existing patch which disallows unprivileged CLONE_NEWUSER applies
the check for CAP_SYS_ADMIN capability on the 'init_user_ns'
namespace, which is not entirely correct. Consider the following sequence:

1. A process with root privileges calls
   clone(child_fn, ..., CLONE_NEWUSER, ...) to create a new user namespace.

2. child_fn, now running in the newly created user namespace enjoys the
   full set of capabilities in the new user namespace, but has lost
   its capabilities in the old user namespace (init_user_ns in this
   case).

3. child_fn now calls
   clone(child_fn2, ..., CLONE_NEWUSER, ...) to create a new (nested)
   user namespace.

Step 3 should have succeeded because child_fn has full privileges
(including CAP_SYS_ADMIN) in its user namespace, but this step fails
because the CAP_SYS_ADMIN capability is checked against init_user_ns,
as opposed to child_fn's user namespace.

So fix this by checking for CAP_SYS_ADMIN using ns_capable() on the
current task's user namespace.

This also helps the userns07 testcase from LTP
(testcases/kernel/containers/userns/userns07.c) to pass when running
with root privileges.

Signed-off-by: Srivatsa S. Bhat <sriva...@csail.mit.edu>
---
Applies on debian kernel's master branch.

 ...low-unprivileged-CLONE_NEWUSER-by-default.patch |   11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git 
a/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
 
b/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
index 55edbc7..fc978ea 100644
--- 
a/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
+++ 
b/debian/patches/debian/add-sysctl-to-disallow-unprivileged-CLONE_NEWUSER-by-default.patch
@@ -12,6 +12,9 @@ issues are found, we have a fail-safe.
 
 Signed-off-by: Serge Hallyn <serge.hal...@ubuntu.com>
 [bwh: Remove unneeded binary sysctl bits]
+[Srivatsa: Fix capability checks when running nested user namespaces
+by using ns_capable() on the current task's user namespace.]
+Signed-off-by: Srivatsa S. Bhat <sriva...@csail.mit.edu>
 ---
 --- a/kernel/fork.c
 +++ b/kernel/fork.c
@@ -27,24 +30,24 @@ Signed-off-by: Serge Hallyn <serge.hal...@ubuntu.com>
  
  /*
   * Minimum number of threads to boot the kernel
-@@ -1550,6 +1555,10 @@ static __latent_entropy struct task_stru
+@@ -1550,6 +1555,10 @@ static __latent_entropy struct task_struct 
*copy_process(
        if ((clone_flags & (CLONE_NEWUSER|CLONE_FS)) == 
(CLONE_NEWUSER|CLONE_FS))
                return ERR_PTR(-EINVAL);
  
 +      if ((clone_flags & CLONE_NEWUSER) && !unprivileged_userns_clone)
-+              if (!capable(CAP_SYS_ADMIN))
++              if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
 +                      return ERR_PTR(-EPERM);
 +
        /*
         * Thread groups must share signals as well, and detached threads
         * can only be started up within the thread group.
-@@ -2343,6 +2352,12 @@ SYSCALL_DEFINE1(unshare, unsigned long,
+@@ -2343,6 +2352,12 @@ SYSCALL_DEFINE1(unshare, unsigned long, unshare_flags)
        if (unshare_flags & CLONE_NEWNS)
                unshare_flags |= CLONE_FS;
  
 +      if ((unshare_flags & CLONE_NEWUSER) && !unprivileged_userns_clone) {
 +              err = -EPERM;
-+              if (!capable(CAP_SYS_ADMIN))
++              if (!ns_capable(current_user_ns(), CAP_SYS_ADMIN))
 +                      goto bad_unshare_out;
 +      }
 +

Reply via email to