"Naveen N. Rao" <naveen.n....@linux.ibm.com> writes: > Michael Ellerman wrote: >> Nicholas Piggin <npig...@gmail.com> writes: >>> The new mprofile-kernel mcount sequence is >>> >>> mflr r0 >>> bl _mcount >>> >>> Dynamic ftrace patches the branch instruction with a noop, but leaves >>> the mflr. mflr is executed by the branch unit that can only execute one >>> per cycle on POWER9 and shared with branches, so it would be nice to >>> avoid it where possible. >>> >>> This patch is a hacky proof of concept to nop out the mflr. Can we do >>> this or are there races or other issues with it? >> >> There's a race, isn't there? >> >> We have a function foo which currently has tracing disabled, so the mflr >> and bl are nop'ed out. >> >> CPU 0 CPU 1 >> ================================== >> bl foo >> nop (ie. not mflr) >> -> interrupt >> something else enable tracing for foo >> ... patch mflr and branch >> <- rfi >> bl _mcount >> >> So we end up in _mcount() but with r0 not populated. > > Good catch! Looks like we need to patch the mflr with a "b +8" similar > to what we do in __ftrace_make_nop().
Would that actually make it any faster though? Nick?

cheers