"Naveen N. Rao" <naveen.n....@linux.vnet.ibm.com> writes:
> Michael Ellerman wrote:
>> Balbir Singh <bsinghar...@gmail.com> writes:
>>
>>> On Thu, Nov 23, 2017 at 4:32 AM, Mahesh J Salgaonkar
>>> <mah...@linux.vnet.ibm.com> wrote:
>>>> From: Mahesh Salgaonkar <mah...@linux.vnet.ibm.com>
>>>>
>>>> Rebooting into a new kernel with kexec fails in trace_tlbie() which is
>>>> called from native_hpte_clear(). This happens if the running kernel has
>>>> CONFIG_LOCKDEP enabled. With lockdep enabled, the tracepoints always
>>>> execute few RCU checks regardless of whether tracing is on or off.
>>>> We are already in the last phase of kexec sequence in real mode with
>>>> HILE_BE set. At this point the RCU check ends up in RCU_LOCKDEP_WARN and
>>>> causes kexec to fail.
>>>>
>>>
>>> Effectively we can't enter the trace point code after we've set
>>> HILE_BE. Do we need a fixes tag? Or is this a side-effect of a new
>>> generic change?
>>
>> Yes I added:
>>
>> Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions")
>> Cc: sta...@vger.kernel.org # v4.13+
>>
>>> I think the right thing in the longer run might be to do a
>>> TRACE_EVENT_CONDITION and have the condition do the right thing,
>>> but what you have for now is good.
>>
>> No I think the right thing is to not call trace points from kexec code,
>> it's too fragile. TRACE_EVENT_CONDITION wouldn't have saved us from this
>> RCU breakage.
>
> I agree on the fragile part, though it appears to me that a
> TRACE_EVENT_CONDITION() with a check for is_kexec (that needs to be
> added) will prevent breakage since both the LOCKDEP block as well as the
> tracepoint itself are guarded by the condition. So, none of the rcu code
> should be executed as long as we set is_kexec at the right time.

Yes you're right, I misread that.

So maybe that is an option. But it still makes me nervous :)

cheers
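
For illustration only, the guard being discussed can be modelled in plain user-space C. This is a simplified sketch, not kernel code: DEFINE_TRACE_MODEL, tracing_enabled and the two counters are made-up stand-ins, and is_kexec is the hypothetical flag that the thread says would still need to be added. It assumes, as described above, that the same condition guards both the tracepoint body and the lockdep-only RCU check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag; the thread only says such a check "needs to be added". */
static bool is_kexec;
/* Stand-in for the tracepoint's enable switch (assumption of this model). */
static bool tracing_enabled;
/* Counters replacing the real probe calls and the lockdep RCU checks. */
static int probes_called;
static int rcu_checks_run;

/*
 * Model of a tracepoint stub: the probe runs only when tracing is on,
 * while the lockdep-style RCU check runs whether tracing is on or off.
 * Both are guarded by the same condition.
 */
#define DEFINE_TRACE_MODEL(name, cond)                          \
static void trace_##name(void)                                  \
{                                                               \
	if (tracing_enabled && (cond))                          \
		probes_called++;   /* tracepoint body */        \
	if (cond)                                               \
		rcu_checks_run++;  /* lockdep RCU check */      \
}

/* Unconditional variant: the RCU check always runs. */
DEFINE_TRACE_MODEL(tlbie_plain, true)
/* Conditional variant: everything is skipped once is_kexec is set. */
DEFINE_TRACE_MODEL(tlbie_cond, !is_kexec)

/* Walks through the late-kexec case with tracing switched off. */
static void self_check(void)
{
	is_kexec = true;
	tracing_enabled = false;
	probes_called = 0;
	rcu_checks_run = 0;

	trace_tlbie_plain();
	assert(rcu_checks_run == 1);   /* check ran despite tracing being off */

	trace_tlbie_cond();
	assert(rcu_checks_run == 1);   /* guarded: no further RCU check */
	assert(probes_called == 0);
}
```

In this model the conditional variant touches no RCU-related code once is_kexec is set, which is the property being argued for. It says nothing about whether the flag can be set early enough in the real kexec path, which is the fragile part.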