Hi Uros,

> On 21 May 2019, at 19:36, Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Tue, May 21, 2019 at 6:15 PM Iain Sandoe <i...@sandoe.co.uk> wrote:
>>
>> It seems to me that (even if it was working “properly”, which it isn't)
>> ‘-mfentry’ would break ABI on Darwin for both 32 and 64b - which require
>> 16byte stack alignment at call sites.
>>
>> For Darwin, the dynamic loader enforces the requirement when it can and will
>> abort a program that tries to make a DSO linkage with the stack in an
>> incorrect alignment. We previously had a bug against profiling caused by
>> exactly this issue (but when the mcount call was in the post-prologue
>> position).
>>
>> Actually, I’m not sure why it’s not an issue for other 64b platforms that
>> use the psABI (AFAIR, it’s only the 32b case that’s Darwin-specific).
>
> The __fentry__ in glibc is written as a wrapper around the call to
> __mcount_internal, and is written in such a way that it compensates
> stack misalignment in a call to __mcount_internal. __fentry__ survives
> stack misalignment, since no xmm regs are saved to the stack in the
> function.

Well, we can’t change Darwin’s libc to do something similar (and anyway
the dynamic loader would also need to know that this was a special case
as well, to avoid aborting the exe).

... however we could do a dodge where some shim code was inserted into
any TU that used mfentry to redirect the call to an ABI-compliant
launchpad… etc. etc.

It seems we can’t instrument “exactly at the entry” .. only “pretty close
to it”.

>> Anyway, my current plan is to disable mfentry (for Darwin) - the alternative
>> might be some kind of “almost at the start of the function, but needing some
>> stack alignment change”,
>>
>> I’m interested in if you know of any compelling use-cases that would make it
>> worth finding some work-around instead of disabling.
>
> Unfortunately, not from the top of my head…

Well, I can’t either, so for now I’m going to make a patch to disable it
(since it’s not fully working in any case) .. if anyone screams that
there’s a major reduction in functionality - we can investigate some
scheme as mentioned above.

thanks
Iain