Re: One more patch for PR93564

Vladimir Makarov Mon, 02 Mar 2020 07:47:29 -0800


On 2020-03-02 10:17 a.m., Jeff Law wrote:

On Mon, 2020-03-02 at 15:37 +0100, Christophe Lyon wrote:

On Fri, 28 Feb 2020 at 17:39, Vladimir Makarov <vmaka...@redhat.com> wrote:

   The following patch deals with the arm failures that appeared after
the original patch for PR93564 was submitted.

    Changing heuristics in the original patch resulted in a different
order of allocation and created gaps in the hard reg file which were not
big enough for pseudos requiring double regs.  So RA started to use
callee-saved regs and additional store/load insns in the function
prologue/epilogue.  That is the reason for some arm failures.

    The patch was successfully bootstrapped and benchmarked on x86-64.
On x86-64 SPEC2000 the patch generates slightly smaller and, on average,
faster code.

Hi,

This is causing another set of regressions on arm.
For instance on arm-linux-gnueabihf --with-cpu cortex-a9
--with-fpu neon-fp16:
FAIL: gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-not vmov\\.f16
FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fuse-caller-save.c scan-assembler-times mov\tr3, r0 1
FAIL: gcc.target/arm/unaligned-argument-2.c scan-assembler-times stm 1

I suspect at least some of these are likely just register assignments changing.

The generated code is unexpected but still correct. Changing heuristics
can create small gaps in the hard reg file which are not big enough to
fit multi-reg pseudos, so callee-saved regs are more likely to be used,
which means loads/stores in the prologue/epilogue.
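
To make the gap effect concrete, here is a toy sketch in Python. It is not
IRA code and not how GCC is implemented; the register counts, the
even-alignment rule for pairs, the per-pseudo preferred register, and the
names allocate and callee_saved_used are all assumptions made up for
illustration. It only shows that the same set of conflicting pseudos can
need callee-saved regs or not, depending on assignment order.

```python
# Toy model (assumption): r0-r3 are caller-saved, r4-r7 are callee-saved.
CALLER_SAVED = {0, 1, 2, 3}
NUM_HARD_REGS = 8


def allocate(pseudos):
    """Greedily assign hard regs in the given order.

    Every pseudo is assumed to conflict with every other one.
    Each pseudo is (name, nregs, preferred_start_or_None).
    Returns {name: first hard reg assigned}.
    """
    free = [True] * NUM_HARD_REGS
    assignment = {}
    for name, nregs, preferred in pseudos:
        # Try the preferred reg first (if any), then scan from r0 upward.
        candidates = ([] if preferred is None else [preferred])
        candidates += list(range(NUM_HARD_REGS))
        for start in candidates:
            if start + nregs > NUM_HARD_REGS:
                continue
            if nregs == 2 and start % 2 != 0:
                continue  # assumption: double-reg pseudos need an even start
            if all(free[start:start + nregs]):
                for reg in range(start, start + nregs):
                    free[reg] = False
                assignment[name] = start
                break
        else:
            raise RuntimeError(f"no hard reg for {name}; a real RA would spill")
    return assignment


def callee_saved_used(assignment, pseudos):
    """Return the sorted list of callee-saved regs the assignment touches."""
    sizes = {name: nregs for name, nregs, _ in pseudos}
    used = set()
    for name, start in assignment.items():
        used.update(range(start, start + sizes[name]))
    return sorted(used - CALLER_SAVED)


# Two single-reg pseudos with (hypothetical) preferences, one double-reg pseudo.
singles_first = [("a", 1, 1), ("b", 1, 2), ("d", 2, None)]
double_first = [("d", 2, None), ("a", 1, 1), ("b", 1, 2)]

gaps = callee_saved_used(allocate(singles_first), singles_first)
packed = callee_saved_used(allocate(double_first), double_first)
print(gaps)    # [4, 5]: r0 and r3 are free but do not form a usable pair
print(packed)  # []: everything fits in r0-r3
```

In the first order the singles land on r1 and r2, leaving r0 and r3 free
but not adjacent, so the pair goes to r4-r5 and the function would need
save/restore insns. In the second order everything fits in r0-r3.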

As assigning to multi-reg pseudos first was never the highest priority
in the assignment (execution frequency has a higher priority), we were
simply lucky to generate the expected code before. In general, these
kinds of failures occur in very small functions without loops, where
even the stack is not used. The more important cases are RA for big
functions (as we have aggressive inlining) with loops, and for these
cases the latest patch decreases SPEC2000 code size and visibly improves
performance, at least on x86-64.

In any case, I'll look at these tests, but fixing all RA performance
issues and the tests checking them might be just chasing a rainbow.
