Michael Ellerman's on June 19, 2019 3:14 pm:
> Hi Naveen,
>
> Sorry I meant to reply to this earlier .. :/
>
> "Naveen N. Rao" <naveen.n....@linux.vnet.ibm.com> writes:
>> With -mprofile-kernel, gcc emits 'mflr r0', followed by 'bl _mcount' to
>> enable function tracing and profiling. So far, with dynamic ftrace, we
>> used to only patch out the branch to _mcount(). However, mflr is
>> executed by the branch unit, which can only execute one instruction per
>> cycle on POWER9 and is shared with branches, so it would be nice to
>> avoid it where possible.
>>
>> We cannot simply nop out the mflr either. When enabling function
>> tracing, there can be a race if tracing is enabled when some thread was
>> interrupted after executing a nop'ed out mflr. In this case, the thread
>> would execute the now-patched-in branch to _mcount() without having
>> executed the preceding mflr.
>>
>> To solve this, we now enable function tracing in 2 steps: patch in the
>> mflr instruction, use synchronize_rcu_tasks() to ensure all existing
>> threads make progress, and then patch in the branch to _mcount(). We
>> override ftrace_replace_code() with a powerpc64 variant for this
>> purpose.
>
> According to the ISA we're not allowed to patch mflr at runtime. See the
> section on "CMODX".
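To make the sequence concrete, my reading of the proposal is roughly the
following (just a sketch -- ftrace_enable_site, ip and the hard-coded
mflr encoding are mine for illustration, not taken from the actual
patch):

#include <linux/rcupdate.h>
#include <asm/code-patching.h>

/*
 * With -mprofile-kernel the function entry is:
 *
 *   func:
 *           mflr    r0          <- nop while tracing is disabled
 *           bl      _mcount     <- nop while tracing is disabled
 *
 * Re-enabling is done in two steps, per the commit message:
 */
static void ftrace_enable_site(unsigned int *ip, unsigned long mcount_addr)
{
        /* Step 1: restore mflr r0 (0x7c0802a6 is its encoding). */
        patch_instruction(ip, 0x7c0802a6);

        /*
         * Wait until no task can still be stopped between the two
         * slots having executed the old nop instead of the mflr.
         */
        synchronize_rcu_tasks();

        /* Step 2: patch the next word back to 'bl _mcount'. */
        patch_branch(ip + 1, mcount_addr, BRANCH_SET_LINK);
}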
According to "quasi patch class" engineering note, we can patch anything with a preferred nop. But that's written as an optional facility, which we don't have a feature to test for. > > I'm also not convinced the ordering between the two patches is > guaranteed by the ISA, given that there's possibly no isync on the other > CPU. Will they go through a context synchronizing event? synchronize_rcu_tasks() should ensure a thread is scheduled away, but I'm not actually sure it guarantees CSI if it's kernel->kernel. Could do a smp_call_function to do the isync on each CPU to be sure. Thanks, Nick