On Thu, May 23, 2024 at 01:51:34PM +0200, Alexandre Ghiti wrote: > Commit c97bf629963e ("riscv: Fix text patching when IPI are used") > converted ftrace_make_nop() to use patch_insn_write() which does not > emit any icache flush relying entirely on __ftrace_modify_code() to do > that. > > But we missed that ftrace_make_nop() was called very early directly when > converting mcount calls into nops (actually on riscv it converts 2B nops > emitted by the compiler into 4B nops). > > This caused crashes on multiple HW as reported by Conor and Björn since > the booting core could have half-patched instructions in its icache > which would trigger an illegal instruction trap: fix this by emitting a > local flush icache when early patching nops. > > Fixes: c97bf629963e ("riscv: Fix text patching when IPI are used") > Signed-off-by: Alexandre Ghiti <alexgh...@rivosinc.com> > --- > arch/riscv/include/asm/cacheflush.h | 6 ++++++ > arch/riscv/kernel/ftrace.c | 3 +++ > 2 files changed, 9 insertions(+) > > diff --git a/arch/riscv/include/asm/cacheflush.h > b/arch/riscv/include/asm/cacheflush.h > index dd8d07146116..ce79c558a4c8 100644 > --- a/arch/riscv/include/asm/cacheflush.h > +++ b/arch/riscv/include/asm/cacheflush.h > @@ -13,6 +13,12 @@ static inline void local_flush_icache_all(void) > asm volatile ("fence.i" ::: "memory"); > } > > +static inline void local_flush_icache_range(unsigned long start, > + unsigned long end) > +{ > + local_flush_icache_all(); > +} > + > #define PG_dcache_clean PG_arch_1 > > static inline void flush_dcache_folio(struct folio *folio) > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > index 4f4987a6d83d..32e7c401dfb4 100644 > --- a/arch/riscv/kernel/ftrace.c > +++ b/arch/riscv/kernel/ftrace.c > @@ -120,6 +120,9 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace > *rec) > out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); > mutex_unlock(&text_mutex);
So, turns out that this patch is not sufficient. I've seen some more crashes, seemingly due to initialising nops on this mutex_unlock(). Palmer suggested moving the if (!mod) ... into the lock, which fixed the problem for me. > > + if (!mod) > + local_flush_icache_range(rec->ip, rec->ip + MCOUNT_INSN_SIZE); > + > return out; > } > > -- > 2.39.2 >
signature.asc
Description: PGP signature