On 2025/9/10 9:11 AM, Jinyang He wrote:
On 2025-09-09 19:31, Tiezhu Yang wrote:

When testing kernel live patching with "modprobe livepatch-sample",
there is a delay of more than 15 seconds between "starting patching
transition" and "patching complete", and dmesg shows "unreliable stack"
for user tasks in debug mode. A similar issue occurs when executing
"rmmod livepatch-sample".

...

@@ -57,9 +62,14 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
      }
      regs->regs[1] = 0;
      regs->regs[22] = 0;
+    regs->csr_prmd = task->thread.csr_prmd;
      for (unwind_start(&state, task, regs);
           !unwind_done(&state) && !unwind_error(&state); unwind_next_frame(&state)) {
+        /* Success path for user tasks */
+        if (user_mode(regs))
+            return 0;
+
          addr = unwind_get_return_address(&state);
          /*

Hi, Tiezhu,

We update the stack info via get_stack_info() when we meet ORC_TYPE_REGS
in unwind_next_frame(). And in arch_stack_walk(_reliable), we always
call unwind_done() before unwind_next_frame(). So is there an error
in get_stack_info() that causes regs to be in user_mode while the
stack type is not STACK_TYPE_UNKNOWN?

When testing the kernel live patching, the error code path in
unwind_next_frame() is:

        switch (orc->fp_reg) {
        case ORC_REG_PREV_SP:
                p = (unsigned long *)(state->sp + orc->fp_offset);
                if (!stack_access_ok(state, (unsigned long)p, sizeof(unsigned long)))
                        goto err;

In this case, get_stack_info() does not return 0 because in_task_stack()
is false, so the unwinder takes the error path, with
state->stack_info.type = STACK_TYPE_UNKNOWN and state->error = true.
In arch_stack_walk_reliable(), the loop then terminates and the function
returns -EINVAL, which is what causes the unreliable stack.
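
To make the control flow above easier to follow, here is a rough,
self-contained model of the loop in plain C. It is only a sketch and not
kernel code: every model_* name, the frames_left counter and the with_patch
flag are hypothetical stand-ins, and the model assumes the walk always ends
on the failing stack_access_ok() step described above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified model; these are not the kernel's real types. */
enum model_stack_type { MODEL_STACK_UNKNOWN, MODEL_STACK_TASK };

struct model_unwind_state {
	enum model_stack_type type; /* stands in for state->stack_info.type */
	bool error;                 /* stands in for state->error */
	int frames_left;            /* steps that can still be taken safely */
};

static bool model_unwind_done(const struct model_unwind_state *s)
{
	return s->type == MODEL_STACK_UNKNOWN;
}

/* Models one unwind step; the last step is the failing stack access. */
static void model_unwind_next_frame(struct model_unwind_state *s)
{
	if (s->frames_left > 0) {
		s->frames_left--;
		return;
	}
	/* Models the "goto err" path taken when the access check fails. */
	s->type = MODEL_STACK_UNKNOWN;
	s->error = true;
}

/*
 * Models the reliable-walk loop. task_in_user_mode stands in for
 * user_mode(regs); with_patch enables the early return from the diff.
 */
static int model_walk_reliable(int frames, bool task_in_user_mode,
			       bool with_patch)
{
	struct model_unwind_state s = { MODEL_STACK_TASK, false, frames };

	for (; !model_unwind_done(&s) && !s.error;
	     model_unwind_next_frame(&s)) {
		/* Success path for user tasks, as in the proposed patch. */
		if (with_patch && task_in_user_mode)
			return 0;
	}

	return s.error ? -EINVAL : 0;
}
```

In this model, model_walk_reliable(3, true, false) ends with -EINVAL,
while the same call with with_patch set to true returns 0 before the
failing step is reached.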

Maybe get_stack_info() could check whether the task is in userspace and
set state->stack_info.type = STACK_TYPE_UNKNOWN there, but I think there
is no need to do that because it would have a similar effect to this
patch.

Thanks,
Tiezhu

