Restrict the ability to inspect kernel stacks of arbitrary tasks to root
(CAP_SYS_ADMIN in the initial user namespace) to prevent a local attacker
from exploiting racy stack unwinding to leak kernel task stack contents.
See the added comment for a longer rationale.

There don't seem to be any users of this userspace API that can't
gracefully bail out if reading from the file fails. Therefore, I believe
that this change is unlikely to break things.
In the case that this patch does end up needing a revert, the next-best
solution might be to fake a single-entry stack based on wchan.

Fixes: 2ec220e27f50 ("proc: add /proc/*/stack")
Cc: sta...@vger.kernel.org
Signed-off-by: Jann Horn <ja...@google.com>
---
 fs/proc/base.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
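
The commit message argues that userspace readers of /proc/*/stack can bail
out gracefully when the file becomes unreadable. The sketch below shows what
such a reader might look like. It is an illustration only, not taken from any
existing tool: the helper names read_stack_file() and read_task_stack() are
hypothetical. Because the new check sits in proc_pid_stack(), the failure
would most likely surface from read() rather than open(), so both are handled.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a stack dump file into buf and NUL-terminate it.
 * Returns the number of bytes read, or a negative errno value on failure
 * (for example -EACCES when the caller lacks the required privilege).
 */
ssize_t read_stack_file(const char *path, char *buf, size_t len)
{
	size_t total = 0;
	ssize_t n;
	int fd;

	if (len == 0)
		return -EINVAL;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* Leave one byte free for the terminating NUL. */
	while (total < len - 1) {
		n = read(fd, buf + total, len - 1 - total);
		if (n < 0) {
			int err = errno;

			if (err == EINTR)
				continue;
			/* The permission check may only fail here, at read time. */
			close(fd);
			return -err;
		}
		if (n == 0)
			break;
		total += (size_t)n;
	}

	close(fd);
	buf[total] = '\0';
	return (ssize_t)total;
}

/* Hypothetical wrapper: build the /proc/<pid>/stack path for one task. */
ssize_t read_task_stack(pid_t pid, char *buf, size_t len)
{
	char path[64];

	snprintf(path, sizeof(path), "/proc/%ld/stack", (long)pid);
	return read_stack_file(path, buf, len);
}
```

A caller would treat any negative return value as "no stack available",
skip the stack section of its output, and carry on, instead of treating the
error as fatal.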

diff --git a/fs/proc/base.c b/fs/proc/base.c
index ccf86f16d9f0..7e9f07bf260d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -407,6 +407,20 @@ static int proc_pid_stack(struct seq_file *m, struct pid_namespace *ns,
        unsigned long *entries;
        int err;
 
+       /*
+        * The ability to racily run the kernel stack unwinder on a running task
+        * and then observe the unwinder output is scary; while it is useful for
+        * debugging kernel issues, it can also allow an attacker to leak kernel
+        * stack contents.
+        * Doing this in a manner that is at least safe from races would require
+        * some work to ensure that the remote task can not be scheduled; and
+        * even then, this would still expose the unwinder as local attack
+        * surface.
+        * Therefore, this interface is restricted to root.
+        */
+       if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN))
+               return -EACCES;
+
        entries = kmalloc_array(MAX_STACK_TRACE_DEPTH, sizeof(*entries),
                                GFP_KERNEL);
        if (!entries)
-- 
2.19.0.rc2.392.g5ba43deb5a-goog
