Michael Ellerman wrote:
> "Naveen N. Rao" <naveen.n....@linux.ibm.com> writes:
>> Michael Ellerman wrote:
>>> Nicholas Piggin <npig...@gmail.com> writes:
>>>> The new mprofile-kernel mcount sequence is
>>>>
>>>>   mflr  r0
>>>>   bl    _mcount
>>>>
>>>> Dynamic ftrace patches the branch instruction with a nop, but leaves
>>>> the mflr. The mflr is executed by the branch unit, which can only
>>>> execute one instruction per cycle on POWER9 and is shared with
>>>> branches, so it would be nice to avoid it where possible.
>>>>
>>>> This patch is a hacky proof of concept to nop out the mflr. Can we do
>>>> this, or are there races or other issues with it?

>>> There's a race, isn't there?
>>>
>>> We have a function foo which currently has tracing disabled, so the mflr
>>> and bl are nop'ed out.
>>>
>>>   CPU 0                 CPU 1
>>>   ==================================
>>>   bl foo
>>>   nop (ie. not mflr)
>>>   -> interrupt
>>>   something else        enable tracing for foo
>>>   ...                   patch mflr and branch
>>>   <- rfi
>>>   bl _mcount
>>>
>>> So we end up in _mcount() but with r0 not populated.

>> Good catch! Looks like we need to patch the mflr with a 'b +8', similar
>> to what we do in __ftrace_make_nop().

> Would that actually make it any faster though? Nick?

Ok, how about doing this as a 2-step process?
1. patch 'mflr r0' with a 'b +8'
  synchronize_rcu_tasks()
2. convert 'b +8' to a 'nop'

- Naveen

