Re: [PATCH] rs6000: Make the ctr* patterns allow ints in vector regs (PR71763)

Segher Boessenkool Fri, 08 Jul 2016 01:59:49 -0700

On Fri, Jul 08, 2016 at 12:37:55PM +0930, Alan Modra wrote:
> The regression tests passed.  I've been looking at differences in
> gcc/*.o and find many cases like the following.
> 
> orig/combine.o
>     1508:     01 00 3f 2c     cmpdi   r31,1
>     150c:     ff ff ff 3b     addi    r31,r31,-1
>     1510:     dc fe 82 41     beq     13ec
> patched/combine.o
>     1508:     ff ff ff 37     addic.  r31,r31,-1
>     150c:     e0 fe 82 41     beq     13ec
> 
> Combine transforms the first sequence to the second, then further
> transforms that to a bdz (ctr<mode>).  When that fails to get ctr
> allocated, the splitter takes us all the way back to the three insn
> sequence.


The splitter used to produce the addic. insn.  Once I exposed the carry
bit to GCC, it was no longer always possible to split to addic. (CA might
already be live there).  Since the splitter should seldom be used at all,
it now never splits to addic. (which is also slower on some machines: it
is cracked, so it has longer latency than the compare with 1).
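
To make the CA point concrete, here is a rough Python model of the two
sequences.  It is only a sketch, not GCC code: the function names and the
64-bit register model are assumptions made for illustration.  Both
sequences leave the same value in r31 and take the branch in the same
cases, but only the addic. form overwrites CA.

```python
MASK = (1 << 64) - 1  # assumed model: 64-bit GPR, values in [0, 2**64)


def split_sequence(r31, ca):
    """Model of: cmpdi r31,1 ; addi r31,r31,-1 ; beq.  CA is untouched."""
    eq = (r31 == 1)             # cmpdi tests the old value against 1
    r31 = (r31 - 1) & MASK      # addi does not write CA
    return r31, eq, ca


def addic_dot_sequence(r31, ca):
    """Model of: addic. r31,r31,-1 ; beq.  CA is overwritten."""
    total = r31 + MASK          # adding -1 is adding 0xFFFF...FFFF
    ca = total >> 64            # carry out replaces whatever CA held
    r31 = total & MASK
    eq = (r31 == 0)             # the record form tests the new value against 0
    return r31, eq, ca


if __name__ == "__main__":
    for value in (0, 1, 2, MASK):
        print(value, split_sequence(value, 0), addic_dot_sequence(value, 0))
```

In this model the carry out of the addition is 1 for every nonzero input,
so a CA value that was live before the addic. would be lost.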

> With the patch we use ctr for the inner loop.  With unpatched gcc
> combine generates ctr<mode> for the outer loop, which of course uses
> ctr and isn't profitable with an inner loop using ctr.  Vagaries of
> the register allocator result in the outer loop using ctr with the
> inner one losing.  Oops, we generally want inner loops to be more
> highly optimized.

Lovely :-)


Segher
