Hi! On 07/03/2021 18:26, Qais Yousef wrote: > I tried on 5.12-rc2 and 5.11 but couldn't reproduce the problem using your > instructions on the other email. But most likely because I'm hitting another > problem that could be masking it. I'm not sure it is related or just randomly > happened to hit it. > > Did you see something similar?
[...] > [ 0.000000] [<c1b01a38>] (ftrace_bug) from [<c046316c>] > (ftrace_process_locs+0x2b0/0x518) > [ 0.000000] r7:c3817ac4 r6:c38040c0 r5:00000a3c r4:000134e4 > [ 0.000000] [<c0462ebc>] (ftrace_process_locs) from [<c2b25240>] > (ftrace_init+0xc8/0x174) > [ 0.000000] r10:c2ffa000 r9:c2be8a78 r8:c2c5d1fc r7:c2c0c208 > r6:00000001 r5:c2d0908c > [ 0.000000] r4:c362f518 > [ 0.000000] [<c2b25178>] (ftrace_init) from [<c2b00e14>] > (start_kernel+0x2f4/0x5b8) > [ 0.000000] r9:c2be8a78 r8:dbfffec0 r7:00000000 r6:c36385cc > r5:c2d08f00 r4:c2ffa000 > [ 0.000000] [<c2b00b20>] (start_kernel) from [<00000000>] (0x0) This means, FTRACE has more problems with your kernel/compiler/platform, I've addressed similar issue in the past, but my patch should be long merged: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1817963.html Could it be the same problem as here: https://www.spinics.net/lists/arm-kernel/msg854022.html Seems that the size check deserves something line BUILD_BUG_ON() with FTRACE... >> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c >> index 9a79ef6..fa867a5 100644 >> --- a/arch/arm/kernel/ftrace.c >> +++ b/arch/arm/kernel/ftrace.c >> @@ -70,6 +70,19 @@ int ftrace_arch_code_modify_post_process(void) >> >> static unsigned long ftrace_call_replace(unsigned long pc, unsigned long >> addr) >> { >> + s32 offset = addr - pc; >> + s32 blim = 0xfe000008; >> + s32 flim = 0x02000004; > > This look like magic numbers to me.. These magic numbers are most probably the reason for your FTRACE to resign... Those are backward- and forward-branch limits. I didn't find the matching DEFINEs in the kernel, but I would be happy to learn them. I can also put some comments, but I actually thought the purpose would be obvious from the code... >> + >> + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { >> + blim = 0xff000004; >> + flim = 0x01000002; > > .. ditto .. > >> + } >> + >> + if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && >> + (offset < blim || offset > flim)) >> + return 0; > > .. I could have missed something, but wouldn't something like below be > clearer? > Only compile tested. I think abs() will do the right thing here given the > passed types. I admit I don't understand why you have the '4' and '8' at the > lowest nibble.. Yes, the limits are not symmetrical. These "magic numbers" have been checked many times by me, but I admit I'm not expert in ARM assembly. I'm however still quite sure about them. > diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c > index fa867a57100f..b44aee87c53a 100644 > --- a/arch/arm/kernel/ftrace.c > +++ b/arch/arm/kernel/ftrace.c > @@ -70,17 +70,13 @@ int ftrace_arch_code_modify_post_process(void) > > static unsigned long ftrace_call_replace(unsigned long pc, unsigned > long addr) > { > - s32 offset = addr - pc; > - s32 blim = 0xfe000008; > - s32 flim = 0x02000004; > + u32 offset = abs(addr - pc); > + u32 range = 0x02000000; /* +-32MiB */ > > - if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) { > - blim = 0xff000004; > - flim = 0x01000002; > - } > + if (IS_ENABLED(CONFIG_THUMB2_KERNEL)) > + range = 0x01000000; /* +-16MiB */ > > - if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && > - (offset < blim || offset > flim)) > + if (IS_ENABLED(CONFIG_ARM_MODULE_PLTS) && offset > range) > return 0; See above, the limits are not symmetrical. > return arm_gen_branch_link(pc, addr); > > In case CONFIG_ARM_MODULE_PLTS is not enabled what would happen? Is it > impossible to hit this corner case or we could fail one way or another? IOW, > should this check be always compiled in? I didn't want to modify the original behavior and the limits are again checked in either ARM or THUMB implementations of __arm_gen_branch() (there you will again find a nice set of "magic numbers". >> + >> return arm_gen_branch_link(pc, addr); >> } >> >> @@ -124,10 +137,22 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned >> long addr) >> { >> unsigned long new, old; >> unsigned long ip = rec->ip; >> + unsigned long aaddr = adjust_address(rec, addr); >> >> old = ftrace_nop_replace(rec); >> >> - new = ftrace_call_replace(ip, adjust_address(rec, addr)); >> + new = ftrace_call_replace(ip, aaddr); >> + >> +#ifdef CONFIG_ARM_MODULE_PLTS >> + if (!new) { >> + struct module *mod = rec->arch.mod; >> + >> + if (mod) { > > What would happen if !new and !mod? I believe, that's exactly what happens in the dump you experience with your kernel. This is not covered by this patch, this patch covers the issue with modules in vmalloc area. >> + aaddr = get_module_plt(mod, ip, aaddr); >> + new = ftrace_call_replace(ip, aaddr); > > I assume we're guaranteed to have a sensible value returned in 'new' here? Otherwise you'd see the dump you see :) It relies on the already existing error handling. >> + } >> + } >> +#endif -- Best regards, Alexander Sverdlin.