On Thu, Apr 28, 2016 at 11:58:25AM +0100, Szabolcs Nagy wrote:
> On 28/04/16 09:47, Maxim Kuvyrkov wrote:
> >> On Apr 27, 2016, at 7:26 PM, Szabolcs Nagy <szabolcs.n...@arm.com> wrote:
> >>
> >> with -mfentry, by default the user only has to
> >> implement the fentry call (linux wants nops there, but
> >> e.g. glibc could use -pg -mfentry for profiling on
> >> aarch64 and the target specific details are easier to
> >> document for an -m option than for something general).
> >
> > I don't understand your point here, could you elaborate, please?
> >
>
> if we only provide -mfentry then
>
> - the kernel can use it (they have tools to nop patch the binary),
>
> - others who don't want to fiddle with nops, just have the call,
>   can also use it (e.g. user-space profiling cannot really use
>   something that needs binary patching in case the user prefers
>   -pg -mfentry over the current -pg behaviour).

Any examples of users not satisfied with the current -pg ;-) ?

> - it's target specific, so the magic abi of the fentry call can
>   be documented by the target according to the specific instruction

There's a downside to this: you will have to reimplement it in gcc
 * for every architecture
 * for every ABI variant
while the generic approach is -- well -- somewhat generic :-]

>   sequence that is used. (with nop-padding there are psabi and
>   compiler optimization interactions that may be hard to document
>   in a generic way and letting the user figure it out may cause
>   problems later in compiler development.. but i'm just speculating
>   based on the powerpc toc handling and ipa-ra findings.)

ipa-ra is from hell ;) At least from a function-patcher's standpoint.
You may argue that OTOH function binary patching is from hell :)

> >> the nop-padding is more general, but the size and
> >> layout of nops and the call abi will be target
> >> specific and the user will most likely need to modify
> >> the binary (to get the right sequence) which needs
> >> additional tooling. i don't know who might use it
> >> other than linux (which already has tools to deal with
> >> -mfentry).

On exactly 1 (one!) architecture. s390x uses NOP padding, hint, hint...

> i'm trying to find where this happens in the kernel, but
> i only see scripts/recordmcount.{c,pl} which are based on
> nop patching the fentry/mcount call sites.
>
> without such call sites the tools have to be implemented
> differently and the way the kernel records the call site
> positions might not match the prolog-pad recording.

AFAICS Maxim has provided a nice mechanism to find the NOP pads.
Let's see how far we can get, then discuss this further, I suggest.

	Torsten