On Fri, Jan 05, 2018 at 12:22:44PM +0000, Wilco Dijkstra wrote:
> The shrinkwrap optimization added late in GCC 7 allows each callee-save to
> be delayed and done only across blocks which need a particular callee-save.
> Although this reduces unnecessary memory traffic on code paths that need
> few callee-saves, it typically uses LDR/STR rather than LDP/STP. The
> number of LDP/STP instructions is reduced by ~7%. This means more memory
> accesses and increased codesize, ~1.0% on average.
>
> To improve this, if a particular callee-save must be saved/restored, also
> add the adjacent callee-save to allow use of LDP/STP. This significantly
> reduces codesize (for example gcc_r, povray_r, parest_r, xalancbmk_r are
> 1% smaller). This is a simple fix which can be backported. A more advanced
> approach would scan blocks for pairs of callee-saves, but that requires a
> rewrite of all the callee-save code which is too late at this stage.
>
> An example epilog in a shrinkwrapped function before:
>
>     ldp    x21, x22, [sp,#16]
>     ldr    x23, [sp,#32]
>     ldr    x24, [sp,#40]
>     ldp    x25, x26, [sp,#48]
>     ldr    x27, [sp,#64]
>     ldr    x28, [sp,#72]
>     ldr    x30, [sp,#80]
>     ldr    d8, [sp,#88]
>     ldp    x19, x20, [sp],#96
>     ret
>
> And after this patch:
>
>     ldr    d8, [sp,#88]
>     ldp    x21, x22, [sp,#16]
>     ldp    x23, x24, [sp,#32]
>     ldp    x25, x26, [sp,#48]
>     ldp    x27, x28, [sp,#64]
>     ldr    x30, [sp,#80]
>     ldp    x19, x20, [sp],#96
>     ret
>
> Passes bootstrap, OK for commit (and backport to GCC7)?

OK.

Thanks,
James

>
> ChangeLog:
> 2018-01-05  Wilco Dijkstra  <wdijk...@arm.com>
>
>     * config/aarch64/aarch64.c (aarch64_components_for_bb):
>     Increase LDP/STP opportunities by adding adjacent callee-saves.
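
For readers following the thread, the pairing idea in the patch description ("if a particular callee-save must be saved/restored, also add the adjacent callee-save") can be illustrated with a small standalone sketch. This is not the code in aarch64_components_for_bb. The slot table below is copied from the offsets in the example epilog, and the names save_slot, pair_partner and add_pair_partners are hypothetical. The sketch also assumes that two slots can share an LDP/STP only when they sit in the same 16-byte-aligned pair and hold registers of the same class.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout, taken from the example epilog: one 8-byte
   slot per saved register.  These are not GCC's real data structures.  */
enum reg_class { GP, FP };

struct save_slot {
    const char *name;
    enum reg_class cls;
    unsigned offset;            /* byte offset from sp */
};

static const struct save_slot slots[] = {
    {"x19", GP, 0},  {"x20", GP, 8},
    {"x21", GP, 16}, {"x22", GP, 24},
    {"x23", GP, 32}, {"x24", GP, 40},
    {"x25", GP, 48}, {"x26", GP, 56},
    {"x27", GP, 64}, {"x28", GP, 72},
    {"x30", GP, 80}, {"d8",  FP, 88},
};
#define NUM_SLOTS (sizeof slots / sizeof slots[0])

/* Return the index of the slot that could share an LDP/STP with slot i,
   or -1 if there is none.  Assumption of this sketch: the partner is the
   other half of the same 16-byte-aligned pair (offset with bit 3
   flipped) and must be in the same register class.  */
static int pair_partner(size_t i)
{
    unsigned want = slots[i].offset ^ 8;
    for (size_t j = 0; j < NUM_SLOTS; j++)
        if (j != i && slots[j].offset == want && slots[j].cls == slots[i].cls)
            return (int)j;
    return -1;
}

/* Given a bitmask of slots that a block needs saved/restored, also
   request the partner of each one so the pair can be handled together.  */
static uint32_t add_pair_partners(uint32_t needed)
{
    uint32_t result = needed;
    for (size_t i = 0; i < NUM_SLOTS; i++)
        if (needed & (UINT32_C(1) << i)) {
            int p = pair_partner(i);
            if (p >= 0)
                result |= UINT32_C(1) << p;
        }
    return result;
}
```

With this layout, a block that only needs x23 would also get x24, matching the "ldp x23, x24" in the after-patch epilog. x30 and d8 occupy adjacent slots but are in different register classes, so in this sketch they stay as single loads, as they do in the example.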