Re: [PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.

Georg-Johann Lay Thu, 22 Dec 2016 07:19:18 -0800

On 22.12.2016 15:27, Dominik Vogt wrote:

On Thu, Dec 22, 2016 at 12:00:37PM +0100, Georg-Johann Lay wrote:

On 21.12.2016 18:46, Segher Boessenkool wrote:

On Wed, Dec 21, 2016 at 01:58:18PM +0100, Georg-Johann Lay wrote:

$ avr-gcc
/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c -S
-O1 -mmcu=avr4 -S -v


/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c: In
function 'yasm_lc3b__parse_insn':
/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c:19:1:
error: insn does not satisfy its constraints:
}
^
(jump_insn 58 98 59 9 (set (pc)
       (if_then_else (eq (and:HI (reg:HI 31 r31)
                   (const_int 1 [0x1]))
               (const_int 0 [0]))
           (label_ref 70)
           (pc)))
"/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
415 {*sbrx_and_branchhi}
    (int_list:REG_BR_PROB 375 (nil))
-> 70)

Combine comes up with the following insn:
(jump_insn 58 57 59 7 (set (pc)
       (if_then_else (eq (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [
operands ]) [1 *operands_17(D)+0 S1 A8]) 0)
                   (const_int 1 [0x1]))
               (const_int 0 [0]))
           (label_ref 70)
           (pc)))
"/home/georg/gnu/gcc.gnu.org/trunk/gcc/testsuite/gcc.c-torture/compile/pr26833.c":11
415 {*sbrx_and_branchhi}
    (int_list:REG_BR_PROB 375 (nil))
-> 70)

which cannot be correct because avr.md::*sbrx_and_branchhi reads:

(define_insn "*sbrx_and_branch<mode>"
 [(set (pc)
       (if_then_else
        (match_operator 0 "eqne_operator"
                        [(and:QISI
                          (match_operand:QISI 1 "register_operand" "r")
                          (match_operand:QISI 2 "single_one_operand" "n"))
                         (const_int 0)])
        (label_ref (match_operand 3 "" ""))
        (pc)))]
 "" { ... } ...)

Hence we have a memory operand (subreg of mem)) where only a register is
allowed.  Reg alloc then reloads the mem:QI into R31, but R31 is the
last hard reg, i.e. R31 cannot hold HImode.


If you don't have instruction scheduling subregs of mem are allowed (and
are counted as registers).  Combine asks recog, and it think this is fine.

Why does reload use r31 here?  Why does it think that is valid for HImode?


I can't tell you, all I'm seeing bunch of new FAILs after r243578.
For a simpler test case like the following

unsigned u;
int volatile v;

void btest1 (void)
{
    if (u & 1)
        v = 0;
    v = 0;
}

the combine pattern is also used but the register pressure is smaller.
After quite some spills, reload then comes up with

(insn 18 7 8 2 (set (reg:QI 24 r24)
        (mem/c:QI (symbol_ref:HI ("u")  <var_decl 0x7f73ef303900 u>)
[1 u+0 S1 A8])) "testbit.c":6 56 {movqi_insn}
     (nil))
(jump_insn 8 18 9 2 (set (pc)
        (if_then_else (eq (and:HI (reg:HI 24 r24)
                    (const_int 1 [0x1]))
                (const_int 0 [0]))
            (label_ref 11)
            (pc))) "testbit.c":6 415 {*sbrx_and_branchhi}
     (int_list:REG_BR_PROB 4600 (nil))
 -> 11)

i.e. the paradoxical subreg could be resolved somehow and R25 is
uninitialized.  It this the purpose of the combine change to come
up with uninitialized regs because it is known that just the
initialized parts are used by the code?


If a

  (zero_extract (foo) (const_int) (const_int)

does not match in combine, it calls change_zero_ext to replace it
with

  (and (lshiftrt (foo) (const_int)) (const_int)

(The zero_extract expression is generated by combine for insn
55+56->57.)  Prior to the patch, this was done only if the
zero_extract has the same mode as "foo".  With the patch, it also
deals with mode expanding zero_extracts, e.g.

  (zero_extract:DI (foo:SI) (const_int) (const_int))

->

  (and:DI (subreg:DI (lshiftrt:SI (const_int) 0) (const_int))

This is what combine has done:

Trying 55, 56 -> 57:
...
Failed to match this instruction:
(set (reg:HI 78)
    (zero_extract:HI (mem:QI (reg/v/f:HI 75 [ operands ]) [1
*operands_17(D)+0 S1 A8])
        (const_int 1 [0x1])
        (const_int 0 [0])))
Successfully matched this instruction:
(set (reg:HI 78)
    (and:HI (subreg:HI (mem:QI (reg/v/f:HI 75 [ operands ]) [1
*operands_17(D)+0 S1 A8]) 0)
        (const_int 1 [0x1])))
Successfully matched this instruction:
(set (cc0)
    (compare (reg:HI 78)
        (const_int 0 [0])))
allowing combination of insns 55, 56 and 57

Later it combinded insn 57 -> 58 to the expression that has
triggered the bad register allocation.  This looks all correct to
me.

Even if subregs are allowed, paradoxical subregs look really odd here.

The ICE'ing test case, reload comes up with

(insn 97 57 98 9 (set (reg:HI 30 r30)
        (reg/v/f:HI 20 r20 [orig:75 operands ] [75])) "pr26833.c":11
68 {*movhi}
     (nil))
(insn 98 97 58 9 (set (reg:QI 31 r31)
        (mem:QI (reg:HI 30 r30) [1 *operands_17(D)+0 S1 A8]))
"pr26833.c":11 56 {movqi_insn}
     (nil))
(jump_insn 58 98 59 9 (set (pc)
        (if_then_else (eq (and:HI (reg:HI 31 r31)
                    (const_int 1 [0x1]))
                (const_int 0 [0]))
            (label_ref 70)
            (pc))) "pr26833.c":11 415 {*sbrx_and_branchhi}
     (int_list:REG_BR_PROB 375 (nil))
 -> 70)

insn 98 is valid but overrides parts of HI:30 set by insn 97.

As it appears, the QI memory is loaded to R31 which is valid, and
then reload drops in that register as replacement for the paradoxical
subreg and comes up with HI:31 in insn 58.

Actually a zero-extend would be needed, does it?

As far as I know, reload generates only a very limited subset of insns,
namely additions to compute frame addresses and mov insns.  But
reload is not supposed to generate extends or other arithmetic.


change_zero_ext rewrites any zero_extend/zero_extract.  The
resulting expression doesn't need any extension because it's coded
into the AND constant:

  (and:HI (...) (const_int 1 [0x1]))

If the purpose of the change to combine.c was to optimize code,
then it would be preferred to avoid HI altogether and just use QImode.
avr.md has a similar insn for QI, so that would really give a
reasonable match.


The purpose of the change_zero_ext function is:  If combine cannot
match an expression that contains a zero_{extend,extract}, then
call the function to rephrase such subexpression in a more generic
form using logical operations and try to find a match for them.

The patch adds a rule to rewrite mode expanding zero_extracts,
which helps with some s390 patterns.  Before the patch, combine
doesn't find a match for 55+56->57.  With the patch it combines
55+56->57->58, eventually leading to the fault.

Please correct me if I'm wrong, but I see no bugs in the combined
expressions.


Algebraically it looks correct, but I am unsure if reload is supposed
to handle such situations.  And I must admit I never saw "paradoxical"
extractions before; I would expected an extension in that case, not an
extraction.

I think the patch only triggers something that is already broken,
but well hidden.  Somehow the register allocator seems to miss
some important information about

  (subreg:HI (mem:QI (reg/v/f:HI 75 ...


This is effectively a zero-extend or a sign-extend, smells like
a secondary reload.

and looks for a register that is suitable for QImode while it
actually needs an HImode register.  Maybe there's some place in
common code or the avr backend) where HARD_REGNO_MODE_OK should be
used but isn't?


We need a reload expert, IMO reload is not supposed to break out
and handle such extensions because it needs a QI reg and match
its constraints and also the QI->HI extension and match its
constraints (which have a "r,0" combination for zero-extend.

Johann


Ciao

Dominik ^_^  ^_^

Re: [PATCH v2] combine: Improve change_zero_ext, call simplify_set afterwards.

Reply via email to