I'm experimenting with ways to optimize Wine (x86 target only), and I
believe I can shrink Wine's total text size by around 7% by outlining
the lengthy prologues and epilogues required for ms_abi functions that
make sysv_abi calls. In theory, fewer instruction cache misses will
offset the extra 4 instructions per function and result in a net
performance gain. However, I'm new to the gcc project and a novice x86
assembly programmer as well (I have been wanting to work on gcc for a
while now!). In short, I want to:
1. Replace the prologue that pushes rdi, rsi and xmm6-15 with a single
call to a global "ms_abi_push_regs" routine
2. Replace the epilogue that pops these regs with a jmp to a global
"ms_abi_pop_regs" routine
3. Add the two routines somewhere so that they are linked into the output.
I have this working in a small-scale experiment (writing the ms_abi
function in assembly), but I'm not certain how I would add these
routines. Should I make them built-ins?
I have found the code that adds the clobber RTL instructions in
ix86_expand_call() (gcc/config/i386/i386.c:25832), and I see that
thread_prologue_and_epilogue_insns() (gcc/function.c) is where these
clobbers are expanded into the prologue and epilogue, but I'm not sure
what the cleanest way to convert this is. My thought was to replace the
clobber_reg() calls with one that emits a call insn. Or would it be
better to do this in thread_prologue_and_epilogue_insns(), where prologue
and epilogue generation belongs? That function is shared by all targets,
though.
Any pointers greatly appreciated!
For reference, this is my 64-bit test case:

outline_test.h:

extern void my_sysv_func(void);
extern int __attribute__((ms_abi)) my_ms_abi_func(void);

outline_test_asm.s:

    .text
    .global ms_abi_push_regs
    .global ms_abi_pop_regs
    .global my_ms_abi_func

# Save the registers that ms_abi preserves but sysv_abi may clobber.
ms_abi_push_regs:
    pop %rax                    # return address into the calling function
    push %rdi
    push %rsi
    sub $0xa8,%rsp              # 10 * 16 bytes for xmm6-15, plus 8 of padding
    movaps %xmm6,(%rsp)
    movaps %xmm7,0x10(%rsp)
    movaps %xmm8,0x20(%rsp)
    movaps %xmm9,0x30(%rsp)
    movaps %xmm10,0x40(%rsp)
    movaps %xmm11,0x50(%rsp)
    movaps %xmm12,0x60(%rsp)
    movaps %xmm13,0x70(%rsp)
    movaps %xmm14,0x80(%rsp)
    movaps %xmm15,0x90(%rsp)
    jmp *%rax                   # jump back to the saved return address

# Restore the saved registers, then return on behalf of the function
# that jumped here.
ms_abi_pop_regs:
    movaps (%rsp),%xmm6
    movaps 0x10(%rsp),%xmm7
    movaps 0x20(%rsp),%xmm8
    movaps 0x30(%rsp),%xmm9
    movaps 0x40(%rsp),%xmm10
    movaps 0x50(%rsp),%xmm11
    movaps 0x60(%rsp),%xmm12
    movaps 0x70(%rsp),%xmm13
    movaps 0x80(%rsp),%xmm14
    movaps 0x90(%rsp),%xmm15
    add $0xa8,%rsp
    pop %rsi
    pop %rdi
    retq

my_ms_abi_func:
    callq ms_abi_push_regs      # outlined prologue
    callq my_sysv_func
    xor %eax, %eax              # return 0
    jmp ms_abi_pop_regs         # outlined epilogue (tail jump)
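
As a sanity check on the 0xa8 stack adjustment and the xmm save offsets
in the listing above, here is a minimal C sketch of the frame arithmetic.
It assumes 8-byte slots for rdi/rsi, 16-byte slots for xmm6-15, and that
rsp is 8 bytes off 16-byte alignment on entry to my_ms_abi_func (just
after the caller's call instruction). The function and constant names
are made up for illustration and are not part of gcc.

```c
#include <stddef.h>

/* Assumed layout constants for the 64-bit test case (illustrative only). */
enum {
    GPR_SLOT       = 8,   /* bytes pushed per general-purpose register */
    XMM_SLOT       = 16,  /* bytes stored per movaps                   */
    NUM_GPRS       = 2,   /* rdi, rsi                                  */
    NUM_XMMS       = 10,  /* xmm6 .. xmm15                             */
    ENTRY_MISALIGN = 8    /* rsp % 16 on function entry (assumption)   */
};

/* Padding needed so that rsp is 16-byte aligned once everything is saved. */
size_t frame_padding(void)
{
    size_t used = ENTRY_MISALIGN + NUM_GPRS * GPR_SLOT + NUM_XMMS * XMM_SLOT;
    return (16 - used % 16) % 16;
}

/* Immediate used by the sub/add on rsp around the xmm save area. */
size_t xmm_area_size(void)
{
    return NUM_XMMS * XMM_SLOT + frame_padding();
}

/* Offset from rsp at which xmm<regno> (6..15) is stored. */
size_t xmm_offset(unsigned regno)
{
    return (size_t)(regno - 6) * XMM_SLOT;
}
```

Under these assumptions frame_padding() is 8 and xmm_area_size() is 0xa8,
which matches the sub/add immediates, and xmm_offset(15) is 0x90, which
matches the last movaps. The padding is what keeps the movaps stores and
the call to my_sysv_func 16-byte aligned.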
Thanks!
Daniel