Re: [PATCH][AArch64] Use LDP/STP in shrinkwrapping

Segher Boessenkool Mon, 08 Jan 2018 10:24:06 -0800

On Mon, Jan 08, 2018 at 01:27:24PM +0000, Wilco Dijkstra wrote:
> Segher Boessenkool wrote:
> > On Fri, Jan 05, 2018 at 12:22:44PM +0000, Wilco Dijkstra wrote:
> >> An example epilog in a shrinkwrapped function before:
> >> 
> >> ldp    x21, x22, [sp,#16]
> >> ldr    x23, [sp,#32]
> >> ldr    x24, [sp,#40]
> >> ldp    x25, x26, [sp,#48]
> >> ldr    x27, [sp,#64]
> >> ldr    x28, [sp,#72]
> >> ldr    x30, [sp,#80]
> >> ldr    d8, [sp,#88]
> >> ldp    x19, x20, [sp],#96
> >> ret
> >
> > In this example, the compiler already can make a ldp for both x23/x24 and
> > x27/x28 just fine (if not in emit_epilogue_components, then simply in a
> > peephole); why did that not work?  Or is this not the actual generated
> > machine code (and there are labels between the insns, for example)?
> 
> This block originally had a label in it, 2 blocks emitted identical
> restores and then branched to the final epilog. The final epilogue was then
> duplicated so we end up with 2 almost identical epilogs of 10 instructions
> (almost since there were 1-2 unrelated instructions in both blocks).
> 
> Peepholing is very conservative about instructions using SP and won't touch
> anything frame related. If this was working better then the backend could just
> emit single loads/stores and let peepholing generate LDP/STP.


How unfortunate; that should definitely be improved then.
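
For illustration only (written by hand from the offsets in the example
quoted above, not output from any compiler build): if the adjacent
SP-relative loads of x23/x24 and x27/x28 were paired, the epilogue could
look something like this:

    ldp    x21, x22, [sp,#16]
    ldp    x23, x24, [sp,#32]
    ldp    x25, x26, [sp,#48]
    ldp    x27, x28, [sp,#64]
    ldr    x30, [sp,#80]
    ldr    d8, [sp,#88]
    ldp    x19, x20, [sp],#96
    ret

That is 8 instructions instead of 10.  x30 and d8 stay as single loads
in this sketch, since one is a general register and the other an FP
register and LDP needs both registers to be of the same kind.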

Always pairing two registers together *also* degrades code quality.

> Another issue is that after pro_and_epilogue pass I see multiple restores
> of the same registers and then a branch to the same block. We should try
> to avoid the unnecessary duplication.

It already does that if *all* predecessors of that block do that.  If you
want to do it in other cases, you end up with more jumps.  That may be
beneficial in some cases, of course, but it is not an obvious win (and in
the general case it is, hrm let's use nice words, "terrible").


Segher
