* Alexander Monakov:

> On Mon, 16 Dec 2024, Florian Weimer via Gcc wrote:
>
>> I would like to provide a facility to create wrapper functions without
>> lots of argument shuffling.  To achieve that, the wrapping function and
>> the wrapped function should have the same prototype.  There will be a
>> trampoline that puts additional data somewhere (possibly including the
>> address of the wrapped function, but that interpretation is up to the
>> wrapping function) and then transfers control to the wrapper function
>> with an indirect jump (tail call).
>> 
>> For signal safety, I think the hidden argument needs to be in a register
>> (instead of, say, thread-local storage).  Most System V ABI variants
>> seem to reserve a register for use by the dynamic linker, or for the
>> static chain pointer of nested functions.
>> 
>> Is there a way to reuse either register for this purpose and assign it
>> to a local variable reliably at the start of the wrapper function
>> implementation?
>
> Not in a way that will work with LLVM, I'm afraid, and with GCC
> you'll have to shield wrappers from LTO:
>
> register void *r10 asm("r10");
> void f(int, int);
> void f_wrap(int a, int b)
> {
>     r10 = f;
>     f(a, b);
> }

Does this work on all primary GCC targets?

> This is the only approach I'm aware of, apart of generating wrappers
> in asm (speaking of, is there some reason that wouldn't work for you?).

You mean wrappers that inject the extra argument?  That doesn't work for
variadic functions.  It's also likely to break with unexpected calling
conventions.  Variadic functions are always problematic because you
can't directly forward to the original function.  But at least you can
write a wrapper around fprintf that forwards to vfprintf in C,
without re-implementing fprintf argument parsing (for example).
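
For illustration, a minimal sketch of such a forwarding wrapper.  The
names logged_fprintf and logged_fprintf_calls are made up for this
example; only the va_list hand-off to vfprintf is the point here.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical bookkeeping done by the wrapper: count the calls. */
static unsigned long logged_fprintf_calls;

/* Same prototype as fprintf.  The variadic arguments are not parsed
   here; they are captured in a va_list and passed on to vfprintf. */
int logged_fprintf(FILE *stream, const char *format, ...)
{
    va_list ap;
    int ret;

    ++logged_fprintf_calls;

    va_start(ap, format);
    ret = vfprintf(stream, format, ap);
    va_end(ap);
    return ret;
}
```

This only works because the C library happens to provide a va_list
counterpart for fprintf.  An arbitrary variadic function has no such
counterpart, which is why generic argument-injecting wrappers are a
problem.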

If the assembler trampoline only has to store a configurable value in a
fixed register and do an indirect jump, the trampolines are very
regular.  This way, it is possible to create new trampolines at run time
without run-time code generation.  All you need is a pre-built page (or
a couple of pages) of trampoline code that loads the parameter and the
target address using PC-relative loads.  These trampoline pages can be
mapped multiple times next to different data areas.
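
A rough sketch of the address arithmetic behind this layout, under
assumptions that are mine and not fixed by anything above: 4096-byte
pages, 16-byte trampoline stubs, and a data page mapped directly after
each code page.  All names (tramp_slot, tramp_stub, tramp_slot_for) are
hypothetical.  The stubs themselves would be written in assembler and
are not shown.

```c
#include <stddef.h>

/* Assumed layout parameters, for illustration only. */
enum {
    TRAMP_PAGE_SIZE = 4096,
    TRAMP_STUB_SIZE = 16
};

/* Per-trampoline data that the stub would load PC-relatively. */
struct tramp_slot {
    void *param;   /* value placed in the fixed register */
    void *target;  /* address of the wrapper function to jump to */
};

/* Each slot must fit into the stride used for the stubs. */
_Static_assert(sizeof(struct tramp_slot) <= TRAMP_STUB_SIZE,
               "slot does not fit the stub stride");

/* Address of stub number INDEX within a mapped code page. */
static inline void *tramp_stub(void *code_page, size_t index)
{
    return (char *)code_page + index * TRAMP_STUB_SIZE;
}

/* The slot for a stub sits exactly one page after the stub, so the
   PC-relative displacement is the same for every stub and for every
   mapping of the code page. */
static inline struct tramp_slot *tramp_slot_for(void *stub)
{
    return (struct tramp_slot *)((char *)stub + TRAMP_PAGE_SIZE);
}
```

Because the displacement is constant, the same read-only code page can
be mapped again in front of a fresh data page whenever more trampolines
are needed, and creating a trampoline amounts to filling in its slot.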

Here's the background:

I'm looking for a possible replacement for the pltenter/pltexit wrappers
in glibc's audit functionality.  The current approach breaks if
procedure call standards evolve:

<https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/aarch64/bits/link.h>
<https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/powerpc/bits/link.h>
<https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/bits/link.h>

As you can see, we haven't been doing a good job at maintaining them.  I
think most targets nowadays have vector calling conventions that are
only imperfectly modelled.  The POWER version still does not support the
Linux system call/vDSO PCS.

It doesn't help that under the present model, we have to extend the
struct, and there isn't a good way to communicate this to audit modules
(short of bumping LAV_CURRENT).  For some ABIs, we would have to expose
a partial register file like this and also perform a full context switch
in the dynamic loader around the callbacks in case the callback code
clobbers the argument state, which is rather inefficient.  This is
currently not implemented, but x86-64 needs it, and I suspect it's not
the only architecture.

Thanks,
Florian
