[PATCH v6 1/9] Get rid of __get_task_comm()

Yafang Shao Sun, 11 Aug 2024 19:30:41 -0700

We want to eliminate the use of __get_task_comm() for the following
reasons:


- The task_lock() is unnecessary
  Quoted from Linus [0]:
  : Since user space can randomly change their names anyway, using locking
  : was always wrong for readers (for writers it probably does make sense
  : to have some lock - although practically speaking nobody cares there
  : either, but at least for a writer some kind of race could have
  : long-term mixed results

- The BUILD_BUG_ON() doesn't add any value
  The only requirement is that the destination buffer is a real array,
  which ARRAY_SIZE() already checks at build time.

- Zeroing is not necessary in current use cases
  To avoid confusion, we should remove it. Moreover, not zeroing could
  potentially make it easier to uncover bugs. If the caller needs a
  zero-padded task name, it should be explicitly handled at the call site.

Suggested-by: Linus Torvalds <torva...@linux-foundation.org>
Link: https://lore.kernel.org/all/CAHk-=wivfrF0_zvf+oj6==Sh=-npjoop8chlpefafv0onyt...@mail.gmail.com [0]
Link: https://lore.kernel.org/all/CAHk-=whwtuc-ajmgjveaetkomemfstwkwu99v7+b6ayhmma...@mail.gmail.com/
Suggested-by: Alejandro Colomar <a...@kernel.org>
Link: https://lore.kernel.org/all/2jxak5v6dfxlpbxhpm3ey7oup4g2lnr3ueurfbosf5wdo65dk4@srb3hsk72zwq
Signed-off-by: Yafang Shao <laoar.s...@gmail.com>
Cc: Alexander Viro <v...@zeniv.linux.org.uk>
Cc: Christian Brauner <brau...@kernel.org>
Cc: Jan Kara <j...@suse.cz>
Cc: Eric Biederman <ebied...@xmission.com>
Cc: Kees Cook <keesc...@chromium.org>
Cc: Alexei Starovoitov <alexei.starovoi...@gmail.com>
Cc: Matus Jokay <matus.jo...@stuba.sk>
Cc: Alejandro Colomar <a...@kernel.org>
Cc: "Serge E. Hallyn" <se...@hallyn.com>
---
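Illustration only, not part of the patch: a minimal userspace sketch of
what "handle zero-padding at the call site" could look like. The names
sketch_strscpy(), sketch_get_task_comm() and copy_comm_zero_padded() are
hypothetical stand-ins invented for this note. sketch_strscpy() is a
simplified byte-wise imitation of the kernel's strscpy() and returns -1
on truncation where the kernel returns -E2BIG.

```c
#include <stddef.h>
#include <string.h>

#define TASK_COMM_LEN 16
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical stand-in for strscpy(): copies at most size - 1 bytes and
 * always NUL-terminates. This byte-wise version leaves the bytes after
 * the terminator untouched; like strscpy(), it promises no zero padding.
 * Returns the copied length, or -1 if the source was truncated.
 */
long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -1;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] != '\0' ? -1 : (long)i;
}

/*
 * Hypothetical analogue of the new get_task_comm(): @buf must be a real
 * array so that ARRAY_SIZE() yields its length. Evaluates to @buf.
 */
#define sketch_get_task_comm(buf, comm) \
	(sketch_strscpy((buf), (comm), ARRAY_SIZE(buf)), (buf))

/*
 * Hypothetical call-site helper for a caller that needs a zero-padded
 * name, e.g. to use it as a hash key: clear the buffer, then copy.
 */
void copy_comm_zero_padded(char *dst, size_t size, const char *comm)
{
	memset(dst, 0, size);
	sketch_strscpy(dst, comm, size);
}
```

A caller that only prints the name would use the plain copy. Only a
caller that compares or hashes the whole buffer would need the padded
variant.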
 fs/exec.c             | 10 ----------
 fs/proc/array.c       |  2 +-
 include/linux/sched.h | 31 +++++++++++++++++++++++++------
 kernel/kthread.c      |  2 +-
 4 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index a47d0e4c54f6..2e468ddd203a 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1264,16 +1264,6 @@ static int unshare_sighand(struct task_struct *me)
        return 0;
 }
 
-char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
-{
-       task_lock(tsk);
-       /* Always NUL terminated and zero-padded */
-       strscpy_pad(buf, tsk->comm, buf_size);
-       task_unlock(tsk);
-       return buf;
-}
-EXPORT_SYMBOL_GPL(__get_task_comm);
-
 /*
  * These functions flushes out all traces of the currently running executable
  * so that a new one can be started
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 34a47fb0c57f..55ed3510d2bb 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -109,7 +109,7 @@ void proc_task_name(struct seq_file *m, struct task_struct *p, bool escape)
        else if (p->flags & PF_KTHREAD)
                get_kthread_comm(tcomm, sizeof(tcomm), p);
        else
-               __get_task_comm(tcomm, sizeof(tcomm), p);
+               get_task_comm(tcomm, p);
 
        if (escape)
                seq_escape_str(m, tcomm, ESCAPE_SPACE | ESCAPE_SPECIAL, "\n\\");
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 33dd8d9d2b85..e0e26edbda61 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1096,9 +1096,11 @@ struct task_struct {
        /*
         * executable name, excluding path.
         *
-        * - normally initialized setup_new_exec()
-        * - access it with [gs]et_task_comm()
-        * - lock it with task_lock()
+        * - normally initialized begin_new_exec()
+        * - set it with set_task_comm()
+        *   - strscpy_pad() to ensure it is always NUL-terminated
+        *   - task_lock() to ensure the operation is atomic and the name is
+        *     fully updated.
         */
        char                            comm[TASK_COMM_LEN];
 
@@ -1912,10 +1914,27 @@ static inline void set_task_comm(struct task_struct *tsk, const char *from)
        __set_task_comm(tsk, from, false);
 }
 
-extern char *__get_task_comm(char *to, size_t len, struct task_struct *tsk);
+/*
+ * - Why not use task_lock()?
+ *   User space can randomly change their names anyway, so locking for readers
+ *   doesn't make sense. For writers, locking is probably necessary, as a race
+ *   condition could lead to long-term mixed results.
+ *   The strscpy_pad() in __set_task_comm() can ensure that the task comm is
+ *   always NUL-terminated. Therefore the race condition between reader and
+ *   writer is not an issue.
+ *
+ * - Why not use strscpy_pad()?
+ *   While strscpy_pad() prevents writing garbage past the NUL terminator, which
+ *   is useful when using the task name as a key in a hash map, most use cases
+ *   don't require this. Zero-padding might confuse users if it's unnecessary,
+ *   and not zeroing might even make it easier to expose bugs. If you need a
+ *   zero-padded task name, please handle that explicitly at the call site.
+ *
+ * - ARRAY_SIZE() can help ensure that @buf is indeed an array.
+ */
 #define get_task_comm(buf, tsk) ({                     \
-       BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);     \
-       __get_task_comm(buf, sizeof(buf), tsk);         \
+       strscpy(buf, (tsk)->comm, ARRAY_SIZE(buf));     \
+       buf;                                            \
 })
 
 #ifdef CONFIG_SMP
diff --git a/kernel/kthread.c b/kernel/kthread.c
index f7be976ff88a..7d001d033cf9 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -101,7 +101,7 @@ void get_kthread_comm(char *buf, size_t buf_size, struct task_struct *tsk)
        struct kthread *kthread = to_kthread(tsk);
 
        if (!kthread || !kthread->full_name) {
-               __get_task_comm(buf, buf_size, tsk);
+               strscpy(buf, tsk->comm, buf_size);
                return;
        }
 
-- 
2.43.5
