Re: One more patch for PR93564

Jeff Law Mon, 02 Mar 2020 08:12:27 -0800

On Mon, 2020-03-02 at 08:40 -0700, Jeff Law wrote:
> On Mon, 2020-03-02 at 08:17 -0700, Jeff Law wrote:
> > On Mon, 2020-03-02 at 15:37 +0100, Christophe Lyon wrote:
> > > On Fri, 28 Feb 2020 at 17:39, Vladimir Makarov <vmaka...@redhat.com>
> > > wrote:
> > > >   The following patch is dealing with arm failures after submitting
> > > > original patch for PR93564.
> > > > 
> > > >    Changing heuristics in the original patch resulted in different
> > > > order
> > > > of allocation and creating gaps in hard reg file which were not enough
> > > > for pseudos requiring double regs.  So RA started to use caller-saved
> > > > regs and additional store/load insns in function prologue. That is the
> > > > reason for some arm failures.
> > > > 
> > > > >    The patch was successfully bootstrapped and benchmarked on x86-64.
> > > > > On x86-64 SPEC2000 the patch generates slightly smaller and faster
> > > > > code on average.
> > > > 
> > > 
> > > Hi,
> > > 
> > > This is causing another set of regressions on arm.
> > > For instance on arm-linux-gnueabihf --with-cpu cortex-a9
> > > --with-fpu neon-fp16:
> > > FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-not vmov\\.f16
> > > FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
> > > FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
> > > FAIL: gcc.target/arm/fuse-caller-save.c scan-assembler-times mov\tr3, r0
> > > 1
> > > FAIL: gcc.target/arm/unaligned-argument-2.c scan-assembler-times stm 1
> > I suspect at least some of these are likely just register assignments
> > changing.
> In fact, I'm certain that's the case for fuse-caller-save.c.  I'll be looking
> at armv8_2-fp16-move-1.c too, since my tester tripped over that as
> well.  If you could evaluate the others it'd be appreciated.
And I'm now certain armv8_2-fp16-move-1.c is of a similar nature.


In that test we get a slightly different packing of registers after Vlad's IRA
changes.  The different packing into registers ultimately results in one hard
register cprop not happening after Vlad's changes.  As a result we end up with
an extra reg->reg copy and the test fails.

This may be one we just have to live with.  As we come into cprop_hardreg we
have this after Vlad's changes:

(set (reg 0) (reg 18))
(set (reg 18) (float_extend ...))

[ ... ]
(set (reg 17) (reg 0))


Obviously the set to reg 18 in the middle insn blocks the ability to propagate
the source of the first set into the source of the last set.  Prior to Vlad's
change that middle set used a different hard register and thus didn't block the
hard register cprop.  But that was more of an accident than anything -- Vlad's
work results in, IMHO, a better hard register allocation -- which in turn
inhibits hard register cprop.
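
To make the blocking effect concrete, here is a toy model of forward copy
propagation over hard registers.  It is only a sketch and is not how
cprop_hardreg is implemented in GCC: the insn tuples, the cprop() helper and
the use of reg 19 for the "before" case are all made up for illustration, and
the float_extend is modelled as an opaque write with no tracked sources.

```python
# Toy forward copy propagation over hard registers (illustrative only,
# not GCC's cprop_hardreg).  An insn is (dest, srcs, is_copy).

def cprop(insns):
    copy_of = {}  # reg -> older reg known to hold the same value
    out = []
    for dest, srcs, is_copy in insns:
        # Replace each source with the older equivalent register, if any.
        srcs = tuple(copy_of.get(s, s) for s in srcs)
        # A write to dest kills every equivalence that mentions dest.
        copy_of = {r: s for r, s in copy_of.items() if dest not in (r, s)}
        if is_copy and srcs[0] != dest:
            copy_of[dest] = srcs[0]
        out.append((dest, srcs, is_copy))
    return out

# After the IRA change: the middle insn overwrites reg 18.
after = [(0, (18,), True), (18, (), False), (17, (0,), True)]
# Before the change: the middle insn wrote some other register
# (19 is a hypothetical stand-in; the original mail does not name it).
before = [(0, (18,), True), (19, (), False), (17, (0,), True)]

print(cprop(after)[-1])   # last copy still reads reg 0
print(cprop(before)[-1])  # last copy now reads reg 18
```

In the "before" sequence the last copy is rewritten to read reg 18 directly,
which is what would let the first copy go away if reg 0 has no other uses.  In
the "after" sequence the write to reg 18 kills the reg 0 / reg 18 equivalence,
so nothing can be propagated.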

I think the thing to do is either expect the single copy or xfail the test.
I'm going to leave it to the ARM maintainers to decide how they want to handle
that.   I don't think this very minor code quality regression is significant
enough to warrant backing out Vlad's change.

Another approach would be to see if register renaming helps here, but that's a
can of worms I don't think we want to open at this point.


Jeff
