Ping. The kernel is still broken on AArch64.

On 30/09/2020 11:39, Alex Coplan via Gcc-patches wrote:
> Currently, make_extraction() identifies where we can emit an ASHIFT of
> an extend in place of an extraction, but fails to make the corresponding
> canonicalization/simplification when presented with a MULT by a power of
> two. Such a representation is canonical when representing a left-shifted
> address inside a MEM.
> 
> This patch remedies this situation: after the patch, make_extraction()
> now also identifies RTXs such as:
> 
> (mult:DI (subreg:DI (reg:SI r)) (const_int 2^n))
> 
> and rewrites this as:
> 
> (mult:DI (sign_extend:DI (reg:SI r)) (const_int 2^n))
> 
> instead of using a sign_extract.
> 
> (This patch also fixes up a comment in expand_compound_operation() which
> appears to have suffered from bitrot.)
> 
> This fixes PR96998: an ICE on AArch64 due to an unrecognised
> sign_extract insn which was exposed by
> r11-2903-g6b3034eaba83935d9f6dfb20d2efbdb34b5b00bf. That change
> introduced a canonicalisation in LRA to rewrite mult to shift in address
> reloads.
> 
> Prior to this patch, the flow was as follows. We start with the
> following insn going into combine:
> 
> (insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (reg:DI 98 [ g ])
>                     (const_int 4 [0x4]))
>                 (reg/f:DI 96)) [3 *i_5+0 S4 A32])
>         (asm_operands:SI ("") ("=Q") 0 []
>              []
>              [] test.c:11)) "test.c":11:5 -1
>      (expr_list:REG_DEAD (reg:DI 98 [ g ])
>         (nil)))
> 
> Then combine turns this into a sign_extract:
> 
> (insn 9 8 10 3 (set (mem:SI (plus:DI (sign_extract:DI (mult:DI (subreg:DI 
> (reg/v:SI 92 [ g ]) 0)
>                         (const_int 4 [0x4]))
>                     (const_int 34 [0x22])
>                     (const_int 0 [0]))
>                 (reg/f:DI 96)) [3 *i_5+0 S4 A32])
>         (asm_operands:SI ("") ("=Q") 0 []
>              []
>              [] test.c:11)) "test.c":11:5 -1
>      (expr_list:REG_DEAD (reg/v:SI 92 [ g ])
>         (nil)))
> 
> Then LRA reloads the address and (prior to the LRA change) we get:
> 
> (insn 32 8 9 3 (set (reg:DI 0 x0 [100])
>         (plus:DI (sign_extract:DI (mult:DI (reg:DI 0 x0 [orig:92 g ] [92])
>                     (const_int 4 [0x4]))
>                 (const_int 34 [0x22])
>                 (const_int 0 [0]))
>             (reg/f:DI 19 x19 [96]))) "test.c":11:5 283 {*add_extvdi_multp2}
>      (nil))
> (insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32])
>         (asm_operands:SI ("") ("=Q") 0 []
>              []
>              [] test.c:11)) "test.c":11:5 -1
>      (nil))
> 
> Now observe that insn 32 here is not canonical: firstly, we should be
> using an ASHIFT by 2 instead of a MULT by 4, since we're outside of a
> MEM. Indeed, the LRA change remedies this, and support for such insns in
> the AArch64 backend was dropped in
> r11-3033-g2f8ae301f6a125f50b0a758047fcddae7b68daa8.
> 
> Now the reason we ICE after the LRA change here is that AArch64 has
> never supported the ASHIFT variant of this sign_extract insn. Inspecting
> the unrecognised reloaded insn confirms this:
> 
> (gdb) p debug(insn)
> (insn 33 8 34 3 (set (reg:DI 100)
>         (sign_extract:DI (ashift:DI (subreg:DI (reg/v:SI 92 [ g ]) 0)
>                 (const_int 2 [0x2]))
>             (const_int 34 [0x22])
>             (const_int 0 [0]))) "test.c":11:5 -1
>      (nil))
> 
> The thesis of this patch is that combine should _never_ be producing
> such an insn. Clearly this should be canonicalised as an extend
> operation instead (as combine already does in make_extraction() for the
> ASHIFT form). After this change to combine, we get:
> 
> (insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (sign_extend:DI (reg/v:SI 92 [ 
> g ]))
>                     (const_int 4 [0x4]))
>                 (reg/f:DI 96)) [3 *i_5+0 S4 A32])
>         (asm_operands:SI ("") ("=Q") 0 []
>              []
>              [] test.c:11)) "test.c":11:5 -1
>      (expr_list:REG_DEAD (reg/v:SI 92 [ g ])
>         (nil)))
> 
> coming out of combine, and LRA can happily reload the address:
> 
> (insn 32 8 9 3 (set (reg:DI 0 x0 [100])
>         (plus:DI (ashift:DI (sign_extend:DI (reg/v:SI 0 x0 [orig:92 g ] [92]))
>                 (const_int 2 [0x2]))
>             (reg/f:DI 19 x19 [96]))) "test.c":11:5 245 {*add_extendsi_shft_di}
>      (nil))
> (insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32])
>         (asm_operands:SI ("") ("=Q") 0 []
>              []
>              [] test.c:11)) "test.c":11:5 -1
>      (nil))
> 
> and all is well, with nice simple and canonical RTL being used
> throughout.
> 
> Testing:
>  * Bootstrap and regtest on aarch64-linux-gnu, arm-linux-gnueabihf, and
>    x86-linux-gnu in progress.
> 
> OK for trunk (with AArch64 changes discussed here [0] as a follow-on
> patch) provided it passes testing?
> 
> Thanks,
> Alex
> 
> [0] : https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554257.html
> 
> ---
> 
> gcc/ChangeLog:
> 
>       PR target/96998
>       * combine.c (expand_compound_operation): Tweak variable name in
>       comment to match source.
>       (make_extraction): Handle mult by power of two in addition to
>       ashift.
> 
> gcc/testsuite/ChangeLog:
> 
>       PR target/96998
>       * gcc.c-torture/compile/pr96998.c: New test.

Reply via email to