Hi Yao,
On 7/2/25 12:50, Yao Zi wrote:
On Tue, Jul 01, 2025 at 02:27:32PM +0200, Alexandre Ghiti wrote:
Hi Yao,
On 7/1/25 08:41, Yao Zi wrote:
Linux v6.16 built with dynamic ftrace randomly oops or triggers
ftrace_bug() on Sophgo SG2042 when booting systemd-based userspace,
...
Not sure either reverting the commits or fixing them up is a better
idea, but anyway the fatal first issue shouidn't go into the stable
release.
Let's fix this, we were expecting issues with dynamic ftrace :)
So the following diff fixes all the issues you mentioned (not the first
crash though, I'll let you test and see if it works better, I don't have
this board):
Thanks for the fix! I've tested it with both QEMU and SG2042, it does
fix the lockdep failures as well as the boot time crash on SG2042. The
boot-time crash is caused by the race so will disappear as long as we
fix the race.
diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c
index 4c6c24380cfd9..97ced537aa1e0 100644
--- a/arch/riscv/kernel/ftrace.c
+++ b/arch/riscv/kernel/ftrace.c
@@ -14,6 +14,16 @@
#include <asm/text-patching.h>
#ifdef CONFIG_DYNAMIC_FTRACE
+void ftrace_arch_code_modify_prepare(void)
+{
+ mutex_lock(&text_mutex);
+}
+
+void ftrace_arch_code_modify_post_process(void)
+{
+ mutex_unlock(&text_mutex);
+}
+
unsigned long ftrace_call_adjust(unsigned long addr)
{
if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
@@ -29,10 +39,8 @@ unsigned long arch_ftrace_get_symaddr(unsigned long
fentry_ip)
void arch_ftrace_update_code(int command)
{
- mutex_lock(&text_mutex);
command |= FTRACE_MAY_SLEEP;
ftrace_modify_all_code(command);
- mutex_unlock(&text_mutex);
flush_icache_all();
}
@@ -149,16 +157,17 @@ int ftrace_init_nop(struct module *mod, struct
dyn_ftrace *rec)
unsigned int nops[2], offset;
int ret;
+ mutex_lock(&text_mutex);
Besides using the guard API, could we swap the order between
ftrace_rec_set_nop_ops() and calculation of the nops array? This shrinks
the critical region a little.
If you don't mind, I won't, I don't like initializing stuff which could
never be used in case of error.
With or without the change, here's my tag,
Tested-by: Yao Zi <[email protected]>
and also
Reported-by: Han Gao <[email protected]>
Reported-by: Vivian Wang <[email protected]>
for their first-hand report of boot-time crash and analysis for the
first lock issue.
I'll add all those tags in the patch I'll send today (or tomorrow if the
CI is slow).
Thanks again for the great bug report, really appreciated.
Alex
Regards,
Yao Zi
ret = ftrace_rec_set_nop_ops(rec);
if (ret)
- return ret;
+ goto end;
offset = (unsigned long) &ftrace_caller - pc;
nops[0] = to_auipc_t0(offset);
nops[1] = RISCV_INSN_NOP4;
- mutex_lock(&text_mutex);
ret = patch_insn_write((void *)pc, nops, 2 * MCOUNT_INSN_SIZE);
+end:
mutex_unlock(&text_mutex);
return ret;
Andy is also taking a look, I'll let him confirm the above fix is correct.
Thanks for the thorough report!
Alex
Thanks for your suggestions on the problems.
Regards,
Yao Zi
[1]: https://lore.kernel.org/all/[email protected]/
#regzbot introduced: 881dadf0792c
_______________________________________________
linux-riscv mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/linux-riscv
_______________________________________________
linux-riscv mailing list
[email protected]
http://lists.infradead.org/mailman/listinfo/linux-riscv