On 30/11/16 16:47, Kyrill Tkachov wrote: > Hi all, > > In this awkward ICE we have a *load_multiple pattern that is being > transformed in reload from: > (insn 55 67 151 3 (parallel [ > (set (reg:SI 0 r0) > (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32])) > (set (reg:SI 158 [ c+4 ]) > (mem/u/c:SI (plus:SI (reg/f:SI 147) > (const_int 4 [0x4])) [2 c+4 S4 A32])) > ]) arm-crash.c:25 393 {*load_multiple} > (expr_list:REG_UNUSED (reg:SI 0 r0) > (nil))) > > > into the invalid: > (insn 55 67 70 3 (parallel [ > (set (reg:SI 0 r0) > (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32])) > (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp) > (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12 > S4 A32]) > (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147]) > (const_int 4 [0x4])) [2 c+4 S4 A32])) > ]) arm-crash.c:25 393 {*load_multiple} > (nil)) > > The operands of *load_multiple are not validated through constraints > like LRA is used to, but rather through > a match_parallel predicate which ends up calling ldm_stm_operation_p to > validate the multiple sets. > But this means that LRA cannot reason about the constraints properly. > This two-regiseter load should not have used *load_multiple anyway, it > should have used *ldm2_ from ldmstm.md > and indeed it did until the loop2_invariant pass which copied the ldm2_ > pattern: > (insn 27 23 28 4 (parallel [ > (set (reg:SI 0 r0) > (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32])) > (set (reg:SI 1 r1) > (mem/u/c:SI (plus:SI (reg/f:SI 147) > (const_int 4 [0x4])) [2 c+4 S4 A32])) > ]) "ldm.c":25 385 {*ldm2_} > (nil)) > > into: > (insn 55 19 67 3 (parallel [ > (set (reg:SI 0 r0) > (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32])) > (set (reg:SI 158) > (mem/u/c:SI (plus:SI (reg/f:SI 147) > (const_int 4 [0x4])) [2 c+4 S4 A32])) > ]) "ldm.c":25 404 {*load_multiple} > (expr_list:REG_UNUSED (reg:SI 0 r0) > (nil))) > > Note that it now got recognised as load_multiple because the second > register is not a hard register but the pseudo 158. > In any case, the solution suggested in the PR (and I agree with it) is > to restrict *load_multiple to after reload. > The similar pattern *load_multiple_with_writeback also has a similar > condition and the comment above *load_multiple says that > it's used to generate epilogues, which is done after reload anyway. For > pre-reload load-multiples the patterns in ldmstm.md > should do just fine. > > Bootstrapped and tested on arm-none-linux-gnueabihf. > > Ok for trunk? >
I don't think this is right. Firstly, these patterns look to me like the ones used for memcpy expansion, so not recognizing them could lead to compiler aborts. Secondly, the bug is when we generate (insn 55 67 70 3 (parallel [ (set (reg:SI 0 r0) (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32])) (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp) (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12 S4 A32]) (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147]) (const_int 4 [0x4])) [2 c+4 S4 A32])) ]) arm-crash.c:25 393 {*load_multiple} (nil)) These patterns are supposed to enforce that the load (store) target register is a hard register that is higher numbered than the hard register in the memory slot that precedes it (thus satisfying the constraints for a normal ldm/stm instruction. The real question is why did ldm_stm_operation_p permit this modified pattern through, or was the pattern validation bypassed incorrectly by the loop2 invariant code when it copied the insn and made changes? R. > Thanks, > Kyrill > > 2016-11-30 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > > PR target/71436 > * config/arm/arm.md (*load_multiple): Add reload_completed to > matching condition. > > 2016-11-30 Kyrylo Tkachov <kyrylo.tkac...@arm.com> > > PR target/71436 > * gcc.c-torture/compile/pr71436.c: New test. > > arm-ldm.patch > > > commit 996d28e2353badd1b29ef000f94d40c7dab9010f > Author: Kyrylo Tkachov <kyrylo.tkac...@arm.com> > Date: Tue Nov 29 15:07:30 2016 +0000 > > [ARM] Restrict *load_multiple pattern till after LRA > > diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md > index 74c44f3..22d2a84 100644 > --- a/gcc/config/arm/arm.md > +++ b/gcc/config/arm/arm.md > @@ -11807,12 +11807,15 @@ (define_insn "<crc_variant>" > > ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern > covers > ;; large lists without explicit writeback generated for APCS_FRAME epilogue. > +;; The operands are validated through the load_multiple_operation > +;; match_parallel predicate rather than through constraints so enable it only > +;; after reload. > (define_insn "*load_multiple" > [(match_parallel 0 "load_multiple_operation" > [(set (match_operand:SI 2 "s_register_operand" "=rk") > (mem:SI (match_operand:SI 1 "s_register_operand" "rk"))) > ])] > - "TARGET_32BIT" > + "TARGET_32BIT && reload_completed" > "* > { > arm_output_multireg_pop (operands, /*return_pc=*/false, > diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c > b/gcc/testsuite/gcc.c-torture/compile/pr71436.c > new file mode 100644 > index 0000000..ab08d5d > --- /dev/null > +++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c > @@ -0,0 +1,35 @@ > +/* PR target/71436. */ > + > +#pragma pack(1) > +struct S0 > +{ > + volatile int f0; > + short f2; > +}; > + > +void foo (struct S0 *); > +int a, d; > +static struct S0 b[5]; > +static struct S0 c; > +void fn1 (); > +void > +main () > +{ > + { > + struct S0 e; > + for (; d; fn1 ()) > + { > + { > + a = 3; > + for (; a >= 0; a -= 1) > + { > + { > + e = c; > + } > + b[a] = e; > + } > + } > + } > + } > + foo (b); > +} >