On Fri, 21 Dec 2018 12:11:35 +1100 Benjamin Herrenschmidt <b...@kernel.crashing.org> wrote:
> Hi Steven !
>
> I'm trying to untangle something, and I need your help :-)
>
> In commit 3cb5f1a3e58c0bd70d47d9907cc5c65192281dee, you added a dummy
> stack frame around the assembly calls to trace_hardirqs_on/off on the
> ground that when using the latency tracer (irqsoff), you might poke at
> CALLER_ADDR1 and that could blow up if there's only one frame at hand.
>
> However, I can't see where it would be doing that. lockdep.c only uses
> CALLER_ADDR0 and irqsoff uses the values passed by it. In fact, that
> was already the case when the above commit was merged.
>
> I tried on a 32-bit kernel to remove the dummy stack frame with no
> issue so far .... (though I do get stupid values reported with or
> without a stack frame, but I think that's normal, looking into it).

BTW, I only had a 64 bit PPC working, so I would have been testing that.

>
> The reason I'm asking is that we have other code paths, on return
> from interrupts for example, at least on 32-bits where we call the
> tracing without the extra stack frame, and I have yet to see it crash.

Have you tried enabling the irqsoff tracer and running it for a while?

  echo irqsoff > /sys/kernel/debug/tracing/current_tracer

The problem is that when we come from user space, and we disable
interrupts in the entry code, it calls into the irqsoff tracer:

   [ in userspace ]
  <interrupt>
   [ in kernel ]
   bl .trace_hardirqs_off

     kernel/trace/trace_preemptirq.c:

	trace_hardirqs_off(CALLER_ADDR0, CALLER_ADDR1)

IIRC, without the stack frame, that CALLER_ADDR1 can end up having the
kernel read garbage.

-- Steve

>
> I wonder if the commit and bug fix above relates to some older code
> that no longer existed even at the point where the commit was merged...
>