Naveen N. Rao's on June 19, 2019 7:53 pm: > Nicholas Piggin wrote: >> Michael Ellerman's on June 19, 2019 3:14 pm: >>> Hi Naveen, >>> >>> Sorry I meant to reply to this earlier .. :/ > > No problem. Thanks for the questions. > >>> >>> "Naveen N. Rao" <naveen.n....@linux.vnet.ibm.com> writes: >>>> With -mprofile-kernel, gcc emits 'mflr r0', followed by 'bl _mcount' to >>>> enable function tracing and profiling. So far, with dynamic ftrace, we >>>> used to only patch out the branch to _mcount(). However, mflr is >>>> executed by the branch unit that can only execute one per cycle on >>>> POWER9 and shared with branches, so it would be nice to avoid it where >>>> possible. >>>> >>>> We cannot simply nop out the mflr either. When enabling function >>>> tracing, there can be a race if tracing is enabled when some thread was >>>> interrupted after executing a nop'ed out mflr. In this case, the thread >>>> would execute the now-patched-in branch to _mcount() without having >>>> executed the preceding mflr. >>>> >>>> To solve this, we now enable function tracing in 2 steps: patch in the >>>> mflr instruction, use synchronize_rcu_tasks() to ensure all existing >>>> threads make progress, and then patch in the branch to _mcount(). We >>>> override ftrace_replace_code() with a powerpc64 variant for this >>>> purpose. >>> >>> According to the ISA we're not allowed to patch mflr at runtime. See the >>> section on "CMODX". >> >> According to "quasi patch class" engineering note, we can patch >> anything with a preferred nop. But that's written as an optional >> facility, which we don't have a feature to test for. >> > > Hmm... I wonder what the implications are. We've been patching in a > 'trap' for kprobes for a long time now, along with having to patch back > the original instruction (which can be anything), when the probe is > removed.
Will have to check what implementations support "quasi patch class" instructions. IIRC recent POWER processors are okay. May have to add a feature test though. >>> >>> I'm also not convinced the ordering between the two patches is >>> guaranteed by the ISA, given that there's possibly no isync on the other >>> CPU. >> >> Will they go through a context synchronizing event? >> >> synchronize_rcu_tasks() should ensure a thread is scheduled away, but >> I'm not actually sure it guarantees CSI if it's kernel->kernel. Could >> do a smp_call_function to do the isync on each CPU to be sure. > > Good point. Per > Documentation/RCU/Design/Requirements/Requirements.html#Tasks RCU: > "The solution, in the form of Tasks RCU, is to have implicit read-side > critical sections that are delimited by voluntary context switches, that > is, calls to schedule(), cond_resched(), and synchronize_rcu_tasks(). In > addition, transitions to and from userspace execution also delimit > tasks-RCU read-side critical sections." > > I suppose transitions to/from userspace, as well as calls to schedule() > result in context synchronizing instruction being executed. But, if some > tasks call cond_resched() and synchronize_rcu_tasks(), we probably won't > have a CSI executed. > > Also: > "In CONFIG_PREEMPT=n kernels, trampolines cannot be preempted, so these > APIs map to call_rcu(), synchronize_rcu(), and rcu_barrier(), > respectively." > > In this scenario as well, I think we won't have a CSI executed in case > of cond_resched(). > > Should we enhance patch_instruction() to handle that? Well, not sure. Do we have many post-boot callers of it? Should they take care of their own synchronization requirements? Thanks, Nick