Hi,
currently we have a somewhat nonsensical setting for accumulate-outgoing-args. It is disabled for Intel chips because recent chips have stack engines that make push/pop instructions cheap; it is, however, still enabled for AMD chips and for generic.
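For anyone who wants to see what the flag changes, here is a small test case. It is only a sketch and not part of the patch; the function names sum4 and caller are made up for illustration. Compiling it to assembly with -maccumulate-outgoing-args and with -mno-accumulate-outgoing-args should show the difference: with accumulation the outgoing argument area is typically reserved once in the prologue and filled with mov, while without it each call pushes its arguments.

```c
/* Hypothetical test case, not part of the patch.  Compare the output of
     gcc -m32 -O2 -S -maccumulate-outgoing-args args.c
     gcc -m32 -O2 -S -mno-accumulate-outgoing-args args.c
   The exact code depends on the GCC version and the -mtune setting.  */

/* noinline keeps the calls (and their argument setup) visible.  */
__attribute__ ((noinline)) int
sum4 (int a, int b, int c, int d)
{
  return a + b + c + d;
}

/* Two calls with four stack arguments each in 32-bit mode.  */
__attribute__ ((noinline)) int
caller (int x)
{
  return sum4 (x, x + 1, x + 2, x + 3) + sum4 (x, 1, 2, 3);
}
```

Running size on the two object files, built with and without -fomit-frame-pointer, gives a rough view of the text versus unwind info trade-off discussed here.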
Originally accumulation was enabled since push/pop instructions were expensive on PentiumPro-Pentium4 and K6-K8 CPUs, which did not have useful stack engines. This reason is now gone.

There are still pros and cons of arg accumulation. I did quite extensive testing on AMD chips and found it performance neutral. On 32bit code, disabling accumulation saves about 4% of code, but with the frame pointer disabled it expands unwind info quite a lot, so the resulting binary is about 8% bigger. (This is also the current default for -Os.)

I think we generally prefer code segment size reduction over EH frame size, so we should flip the default (or disable it for cores if we decide otherwise).

This patch disables accumulation by default. I intend to commit it once the bootstrap PR on unwind info is resolved, if there is no significant opposition to doing so. I will also update the release notes to explain the code size effect.

It would be great to get a heuristic enabling the frame pointer for functions where doing so reduces code size without performance regressions. I think it is quite commonly the case.

Bootstrapped/regtested x86_64-linux

Honza

	* config/i386/x86-tune.def: Disable X86_TUNE_ACCUMULATE_OUTGOING_ARGS
	for generic and recent AMD chips.

Index: config/i386/x86-tune.def
===================================================================
--- config/i386/x86-tune.def	(revision 206233)
+++ config/i386/x86-tune.def	(working copy)
@@ -143,7 +143,7 @@ DEF_TUNE (X86_TUNE_REASSOC_FP_TO_PARALLE
    regression on mgrid due to IRA limitation leading to unecessary use of the
    frame pointer in 32bit mode.  */
 DEF_TUNE (X86_TUNE_ACCUMULATE_OUTGOING_ARGS, "accumulate_outgoing_args",
-	  m_PPRO | m_P4_NOCONA | m_BONNELL | m_SILVERMONT | m_AMD_MULTIPLE | m_GENERIC)
+	  m_PPRO | m_P4_NOCONA | m_BONNELL | m_SILVERMONT | m_ATHLON_K8)
 
 /* X86_TUNE_PROLOGUE_USING_MOVE: Do not use push/pop in prologues that are
    considered on critical path.  */