On Thu, Dec 18, 2014 at 2:49 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>>> The Linux kernel never passes floating point arguments around, vararg
>>> functions or not. Hence no vector registers are ever used when calling a
>>> vararg function. But gcc still dutifully emits an "xor %eax,%eax" before
>>> each and every call of a vararg function. Since no callee uses that for
>>> anything, these instructions are redundant.
>>>
>>> This patch adds the -mskip-rax-setup option to skip setting up the RAX
>>> register when SSE is disabled and there are no variable arguments passed
>>> in vector registers. Since the RAX register is used to avoid unnecessarily
>>> saving vector registers on the stack when passing variable arguments, the
>>> impact of this option is that callees may waste some stack space, misbehave
>>> or jump to a random location. GCC 4.4 or newer don't have those issues,
>>> since they don't check the RAX register value when SSE is disabled,
>>> regardless of the RAX register value:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2008-09/msg00127.html
>>>
>>> I used it on kernel 3.17.7:
>>>
>>>     text    data     bss      dec     hex filename
>>> 11493571 2271232 5926912 19691715 12c78c3 vmlinux.skip-rax
>>> 11517879 2271232 5926912 19716023 12cd7b7 vmlinux.orig
>>>
>>> It removed 14309 redundant "xor %eax,%eax" instructions and saved about
>>> 27KB. I am currently running the new kernel without any problem. OK
>>> for trunk?
>>
>> How about skipping RAX setup unconditionally for !TARGET_SSE? Please
>> see ix86_conditional_register_usage, where SSE registers are squashed
>> for !TARGET_SSE, so it is not possible to use them even in the inline
>> asm.
>
> ... when -ffreestanding is in effect, of course.

Oops, this is not the unconditional default kernel compile flag. It is
defined only for 32-bit builds, where:

# temporary until string.h is fixed
KBUILD_CFLAGS += -ffreestanding

Yes, it looks to me that the new option is the way to go.

Uros.
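
For readers following the thread, here is a minimal sketch of the kind of
call site under discussion. Under the x86-64 System V ABI, the caller of a
variadic function sets %al to an upper bound on the number of vector
registers used for the arguments, which is where the "xor %eax,%eax" comes
from when only integers are passed. The function name sum_ints is
hypothetical and does not appear in the patch or in the kernel.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper, not from the patch: sums `count` int arguments.
 * Every variadic argument is an integer, so on x86-64 none of them is
 * passed in a vector register and the caller would set %al to 0. */
long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* read the next int argument */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) is the sort of place where gcc emits
the RAX setup before the call. Going by the patch description above,
building with SSE disabled and -mskip-rax-setup should drop that
instruction; comparing the output of "gcc -S" with and without the option
would be one way to check, assuming a compiler that carries the patch.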