On Thu, Dec 18, 2014 at 3:09 PM, H.J. Lu <hjl.to...@gmail.com> wrote:

>>>>> The Linux kernel never passes floating point arguments around, vararg
>>>>> functions or not. Hence no vector registers are ever used when calling a
>>>>> vararg function.  But gcc still dutifully emits an "xor %eax,%eax" before
>>>>> each and every call of a vararg function.  Since no callee uses that for
>>>>> anything, these instructions are redundant.
>>>>>
>>>>> This patch adds the -mskip-rax-setup option to skip setting up RAX
>>>>> register when SSE is disabled and there are no variable arguments passed
>>>>> in vector registers.  Since the RAX register is used to avoid
>>>>> unnecessarily saving vector registers on the stack when passing variable
>>>>> arguments, the impact of this option is that callees may waste some stack
>>>>> space, misbehave, or jump to a random location.  Callees compiled with
>>>>> GCC 4.4 or newer don't have those issues, since they don't check the RAX
>>>>> register value when SSE is disabled:
>>>>>
>>>>> https://gcc.gnu.org/ml/gcc-patches/2008-09/msg00127.html
>>>>>
>>>>> I used it on kernel 3.17.7:
>>>>>
>>>>>    text    data     bss       dec     hex    filename
>>>>> 11493571  2271232  5926912  19691715 12c78c3 vmlinux.skip-rax
>>>>> 11517879  2271232  5926912  19716023 12cd7b7 vmlinux.orig
>>>>>
>>>>> It removed 14309 redundant "xor %eax,%eax" instructions and saved about
>>>>> 27KB.  I am currently running the new kernel without any problem.  OK
>>>>> for trunk?
>>>>
>>>> How about skipping RAX setup unconditionally for !TARGET_SSE? Please
>>>> see ix86_conditional_register_usage, where SSE registers are squashed
>>>> for !TARGET_SSE, so it is not possible to use them even in the inline
>>>> asm.
>>>
>>> ... when -ffreestanding is in effect, of course.
>>
>> Oops, this is not the unconditional default kernel compile flag. It is
>> defined only for 32-bit builds, where:
>>
>> # temporary until string.h is fixed
>> KBUILD_CFLAGS += -ffreestanding
>>
>> Yes, it looks to me that new option is the way to go.
>
> Is this OK?

In principle, I'm OK with the patch approach, but let's wait for
any comments from the Linux people.

> Some really old gcc versions used an indirect jump based on the eax input
> and they didn't zero extend first.  So with those compilers, you could
> actually jump to a random location.  You can enable it only when you can
> compile everything with a newer GCC.
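
For reference, a minimal C sketch of the kind of call being discussed: a
variadic function that only ever receives integer arguments. The names
sum_ints and call_site are hypothetical and do not come from the patch or
the kernel. Under the x86-64 System V ABI, the caller of a variadic
function puts an upper bound on the number of vector registers used into
%al, so for a call like the one in call_site a compiler will typically
emit "xor %eax,%eax" first. That is the instruction the proposed option
would omit when SSE is disabled.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums `count` int arguments.  No floating point
 * values are ever passed, so no vector registers carry arguments. */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Integer-only variadic call site.  On x86-64 the caller normally zeroes
 * %eax before this call to signal "zero vector registers used". */
long call_site(void)
{
    return sum_ints(3, 1, 2, 3);
}
```

Compiling this with "gcc -O2 -S" and inspecting call_site is one way to
see the register setup, though the exact output depends on the compiler
version and on whether the call gets inlined.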

Uros.
