https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980
--- Comment #15 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Michael Matz from comment #14)
> Hmm? But this is not how the global-to-local hand-off is implemented (and
> expected by tooling): a fall-through. The global entry sets up the GOT
> register, there simply is no '[b localentry]'.
>
> If you mean to imply that also the '[b localentry]' should be patched in at
> live-patch application time (and hence the GOT setup would need to be moved
> to still somewhere else), then you have the problem that (in the
> not-yet-patched case) as long as the L1-nops sit between global and local
> entry they will always be executed when the global entry is called.

Sorry for the confusion, I meant a sequence like:

  global entry:
    [TOC base setup]   // always here
    [b localentry]     // which is added when patching
  L1:
    [patched code]     // from patching
  localentry:
    [b L1]             // from patching

> That's wasteful.

I agree, nops are not zero cost on Power8/Power9.

> Additionally tooling will be surprised if the address difference between
> global and local entry isn't exactly 8 (i.e. two instructions). The psABI
> allows for different values, of course. But I'm willing to bet that there
> are bugs in the wild when different values would be actually used.

It's possible that some tooling doesn't conform to the ABI doc well, but I
think the tooling should fix itself if that is the case. :)

> So, the nops-between-gep-and-lep could probably be somehow made to work with
> userspace live patching, but your most recent patch here makes this all mood.
> It generates exactly the sequence we want: a single nop at the LEP, and
> a configurable patching area outside of, but near to, the function (here: in
> front of the GEP).

I agree, thanks for the comments!

BTW, I'm not fighting for the current implementation; I just want to know in
more detail why users are unable to make use of it. Is it just due to its
inefficiency (like the above sequence), or because it is unusable (cannot be
used at all)? From your comments, I think it's the former (inefficiency)?