On Thu, Dec 22, 2016 at 12:00:37PM +0100, Georg-Johann Lay wrote: > On 21.12.2016 18:46, Segher Boessenkool wrote: > >On Wed, Dec 21, 2016 at 01:58:18PM +0100, Georg-Johann Lay wrote: > >>$ avr-gcc > >>/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c -S > >>-O1 -mmcu=avr4 -S -v > >> > >>/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c: In > >>function 'yasm_lc3b__parse_insn': > >>/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1: > >>error: insn does not satisfy its constraints: > >> } > >> ^ > >>(jump_insn 58 98 59 9 (set (pc) > >> (if_then_else (eq (and:HI (reg:HI 31 r31) > >> (const_int 1 [0x1])) > >> (const_int 0 [0])) > >> (label_ref 70) > >> (pc))) > >>"/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11 > >>415 {*sbrx_and_branchhi} > >> (int_list:REG_BR_PROB 375 (nil)) > >> -> 70) > > > > > >>Combine comes up with the following insn: > >>(jump_insn 58 57 59 7 (set (pc) > >> (if_then_else (eq (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [ > >>operands ]) [1 *operands_17(D)+0 S1 A8]) 0) > >> (const_int 1 [0x1])) > >> (const_int 0 [0])) > >> (label_ref 70) > >> (pc))) > >>"/home/georg/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11 > >>415 {*sbrx_and_branchhi} > >> (int_list:REG_BR_PROB 375 (nil)) > >> -> 70) > >> > >>which cannot be correct because avr.md::*sbrx_and_branchhi reads: > >> > >>(define_insn "*sbrx_and_branch<mode>" > >> [(set (pc) > >> (if_then_else > >> (match_operator 0 "eqne_operator" > >> [(and:QISI > >> (match_operand:QISI 1 "register_operand" "r") > >> (match_operand:QISI 2 "single_one_operand" "n")) > >> (const_int 0)]) > >> (label_ref (match_operand 3 "" "")) > >> (pc)))] > >> "" { ... } ...) > >> > >>Hence we have a memory operand (subreg of mem)) where only a register is > >>allowed. Reg alloc then reloads the mem:QI into R31, but R31 is the > >>last hard reg, i.e. R31 cannot hold HImode. > > > >If you don't have instruction scheduling subregs of mem are allowed (and > >are counted as registers). Combine asks recog, and it think this is fine. > > > >Why does reload use r31 here? Why does it think that is valid for HImode? > > I can't tell you, all I'm seeing bunch of new FAILs after r243578. > For a simpler test case like the following > > unsigned u; > int volatile v; > > void btest1 (void) > { > if (u & 1) > v = 0; > v = 0; > } > > the combine pattern is also used but the register pressure is smaller. > After quite some spills, reload then comes up with > > (insn 18 7 8 2 (set (reg:QI 24 r24) > (mem/c:QI (symbol_ref:HI ("u") <var_decl 0x7f73ef303900 u>) > [1 u+0 S1 A8])) "testbit.c":6 56 {movqi_insn} > (nil)) > (jump_insn 8 18 9 2 (set (pc) > (if_then_else (eq (and:HI (reg:HI 24 r24) > (const_int 1 [0x1])) > (const_int 0 [0])) > (label_ref 11) > (pc))) "testbit.c":6 415 {*sbrx_and_branchhi} > (int_list:REG_BR_PROB 4600 (nil)) > -> 11) > > i.e. the paradoxical subreg could be resolved somehow and R25 is > uninitialized. It this the purpose of the combine change to come > up with uninitialized regs because it is known that just the > initialized parts are used by the code?
If a (zero_extract (foo) (const_int) (const_int) does not match in combine, it calls change_zero_ext to replace it with (and (lshiftrt (foo) (const_int)) (const_int) (The zero_extract expression is generated by combine for insn 55+56->57.) Prior to the patch, this was done only if the zero_extract has the same mode as "foo". With the patch, it also deals with mode expanding zero_extracts, e.g. (zero_extract:DI (foo:SI) (const_int) (const_int)) -> (and:DI (subreg:DI (lshiftrt:SI (const_int) 0) (const_int)) This is what combine has done: Trying 55, 56 -> 57: ... Failed to match this instruction: (set (reg:HI 78) (zero_extract:HI (mem:QI (reg/v/f:HI 75 [ operands ]) [1 *operands_17(D)+0 S1 A8]) (const_int 1 [0x1]) (const_int 0 [0]))) Successfully matched this instruction: (set (reg:HI 78) (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [ operands ]) [1 *operands_17(D)+0 S1 A8]) 0) (const_int 1 [0x1]))) Successfully matched this instruction: (set (cc0) (compare (reg:HI 78) (const_int 0 [0]))) allowing combination of insns 55, 56 and 57 Later it combinded insn 57 -> 58 to the expression that has triggered the bad register allocation. This looks all correct to me. > Even if subregs are allowed, paradoxical subregs look really odd here. > > The ICE'ing test case, reload comes up with > > (insn 97 57 98 9 (set (reg:HI 30 r30) > (reg/v/f:HI 20 r20 [orig:75 operands ] [75])) "pr26833.c":11 > 68 {*movhi} > (nil)) > (insn 98 97 58 9 (set (reg:QI 31 r31) > (mem:QI (reg:HI 30 r30) [1 *operands_17(D)+0 S1 A8])) > "pr26833.c":11 56 {movqi_insn} > (nil)) > (jump_insn 58 98 59 9 (set (pc) > (if_then_else (eq (and:HI (reg:HI 31 r31) > (const_int 1 [0x1])) > (const_int 0 [0])) > (label_ref 70) > (pc))) "pr26833.c":11 415 {*sbrx_and_branchhi} > (int_list:REG_BR_PROB 375 (nil)) > -> 70) > > insn 98 is valid but overrides parts of HI:30 set by insn 97. > > As it appears, the QI memory is loaded to R31 which is valid, and > then reload drops in that register as replacement for the paradoxical > subreg and comes up with HI:31 in insn 58. > > Actually a zero-extend would be needed, does it? > > As far as I know, reload generates only a very limited subset of insns, > namely additions to compute frame addresses and mov insns. But > reload is not supposed to generate extends or other arithmetic. change_zero_ext rewrites any zero_extend/zero_extract. The resulting expression doesn't need any extension because it's coded into the AND constant: (and:HI (...) (const_int 1 [0x1])) > If the purpose of the change to combine.c was to optimize code, > then it would be preferred to avoid HI altogether and just use QImode. > avr.md has a similar insn for QI, so that would really give a > reasonable match. The purpose of the change_zero_ext function is: If combine cannot match an expression that contains a zero_{extend,extract}, then call the function to rephrase such subexpression in a more generic form using logical operations and try to find a match for them. The patch adds a rule to rewrite mode expanding zero_extracts, which helps with some s390 patterns. Before the patch, combine doesn't find a match for 55+56->57. With the patch it combines 55+56->57->58, eventually leading to the fault. Please correct me if I'm wrong, but I see no bugs in the combined expressions. I think the patch only triggers something that is already broken, but well hidden. Somehow the register allocator seems to miss some important information about (subreg:HI (mem:QI (reg/v/f:HI 75 ... and looks for a register that is suitable for QImode while it actually needs an HImode register. Maybe there's some place in common code or the avr backend) where HARD_REGNO_MODE_OK should be used but isn't? Ciao Dominik ^_^ ^_^ -- Dominik Vogt IBM Germany