[Bug target/114591] [12/13/14 Regression] register allocators introduce an extra load operation since gcc-12

ubizjak at gmail dot com via Gcc-bugs Wed, 10 Apr 2024 23:54:36 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114591


--- Comment #13 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Hongtao Liu from comment #12)
> short a;
> short c;
> short d;
> void
> foo (short b, short f)
> {
>   c = b + a;
>   d = f + a;
> }
> 
> foo(short, short):
>         addw    a(%rip), %di
>         addw    a(%rip), %si
>         movw    %di, c(%rip)
>         movw    %si, d(%rip)
>         ret
> 
> this one is bad since gcc10.1 and there's no subreg, The problem is if the
> operand is used by more than 1 insn, and they all support separate m
> constraint, mem_cost is quite small(just 1, reg move cost is 2), and this
> makes RA more inclined to propagate memory across insns. I guess RA assumes
> the separate m means the insn only support memory_operand?

I don't see this as problematic. IIRC, there was a discussion in the past that
a couple of (two?) memory accesses from the same location close to each other
can be faster (when optimizing for speed at -O2, not for size at -Os) than
preloading the value into a register first.
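
For illustration only, here is a rough sketch of the two shapes being compared.
foo is the function from the quoted comment; foo_preload is a hypothetical
variant that copies a into a local first. Spelling it this way in C does not
force the register allocator's choice, and the assembly in the comment is
hand-written to show the preloading form, not actual compiler output.

```c
short a;
short c;
short d;

/* As in the quoted comment: the compiler may use a(%rip) directly as a
   memory operand of both additions. */
void
foo (short b, short f)
{
  c = b + a;
  d = f + a;
}

/* Hypothetical variant: a is read once into a local. A preloading
   sequence could look roughly like this (hand-written, assumed):

        movzwl  a(%rip), %eax
        addl    %eax, %edi
        addl    %eax, %esi
        movw    %di, c(%rip)
        movw    %si, d(%rip)
        ret

   It trades one of the two memory operands for an extra load insn and
   an extra register. */
void
foo_preload (short b, short f)
{
  short t = a;
  c = b + t;
  d = f + t;
}
```

Both functions store the same values to c and d; only the number of memory
accesses to a and the register usage can differ.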

In contrast, the example from Comment #11 already has the correct value in
%eax, so there is no need to load it again from memory, even in a narrower
mode.

[Bug target/114591] [12/13/14 Regression] register allocators introduce an extra load operation since gcc-12
