http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57281

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
             Target|                            |i?86-*-*
             Status|ASSIGNED                    |NEW
   Target Milestone|---                         |4.9.0
            Summary|x86_64-linux loop fails to  |[4.9 Regression]
                   |terminate at -O3 -m32       |x86_64-linux loop fails to
                   |                            |terminate at -O3 -m32,
                   |                            |bogus extendsidi2_1
                   |                            |splitter

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
But the miscompile happens during RTL opts:

  <bb 5>:
  f.0_13 ={v} f;
  *pretmp_4 = 0;
  b.5_1 = b;
  _23 = (long long int) b.5_1;
  *pretmp_18 = _23;
  *pretmp_4 = 0;
  b.2_27 = b;
  b.3_28 = b.2_27 + -1;
  b = b.3_28;
  if (b.3_28 != -20)
    goto <bb 6>;
  else
    goto <bb 10>;

  <bb 6>:
  goto <bb 5>;

is ok, while

.L4:
        movl    f, %eax
        movl    f+4, %edx
        movl    $0, (%ecx)
        movl    b, %eax
        movl    %eax, (%ebx)
        sarl    $31, %eax
        movl    %eax, 4(%ebx)
        subl    $1, %eax
        cmpl    $-20, %eax
        movl    %eax, b
        jne     .L4

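messes up the IV value: to store the upper half of the long long extended
b into *pretmp_18, the code overwrites %eax (which holds b) with its sign
bits via sarl $31, but %eax is then still used as the IV, so the new b is
computed from the sign mask rather than from the old b.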
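
To make the effect concrete, here is a minimal C sketch that mimics the register usage of the .L4 loop next to the GIMPLE semantics. The function names, the two-word destination buffer, the starting value b = 0 and the iteration cap are assumptions for illustration only; none of them come from the testcase.

```c
#include <stdint.h>

/* One iteration as the GIMPLE describes it: store the sign-extended
   b (low word, high word), then decrement b. */
static int32_t step_expected(int32_t b, uint32_t dst[2])
{
    dst[0] = (uint32_t)b;
    dst[1] = (b < 0) ? 0xffffffffu : 0u;
    return b - 1;
}

/* One iteration following the emitted assembly: %eax is reused for
   the sign bits and then decremented as if it still held b. */
static int32_t step_miscompiled(int32_t b, uint32_t dst[2])
{
    int32_t eax = b;              /* movl b, %eax        */
    dst[0] = (uint32_t)eax;       /* movl %eax, (%ebx)   */
    eax = (eax < 0) ? -1 : 0;     /* sarl $31, %eax      */
    dst[1] = (uint32_t)eax;       /* movl %eax, 4(%ebx)  */
    eax -= 1;                     /* subl $1, %eax       */
    return eax;                   /* movl %eax, b        */
}

/* Number of iterations until b == -20, or -1 if the (hypothetical)
   cap is hit first. */
static int iterations_until_exit(int32_t (*step)(int32_t, uint32_t *),
                                 int32_t b, int limit)
{
    uint32_t sink[2];
    for (int i = 1; i <= limit; i++) {
        b = step(b, sink);
        if (b == -20)
            return i;
    }
    return -1;
}
```

In this model the miscompiled step can only ever produce -1 or -2 for the new b, so the exit test against -20 never succeeds, which matches the reported failure to terminate.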

The RTL is still ok before IRA/LRA:

(insn 19 18 20 4 (parallel [
            (set (mem:DI (reg/f:SI 68 [ D.1736 ]) [4 *_18+0 S8 A64])
                (sign_extend:DI (reg:SI 75 [ b ])))
            (clobber (reg:CC 17 flags))
            (clobber (scratch:SI))
        ]) t.c:9 137 {extendsidi2_1}
     (expr_list:REG_DEAD (reg:SI 75 [ b ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
...
(insn 21 20 22 4 (set (reg:SI 76 [ b ])
        (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])) t.c:17 85 {*movsi_internal}
     (nil))
(insn 22 21 23 4 (parallel [
            (set (reg:SI 72 [ b.3 ])
                (plus:SI (reg:SI 76 [ b ])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (reg:CC 17 flags))

and after:

(insn 19 78 20 4 (parallel [
            (set (mem:DI (reg/f:SI 3 bx [orig:68 D.1736 ] [68]) [4 *_18+0 S8 A64])
                (sign_extend:DI (reg:SI 0 ax [orig:75 b ] [75])))
            (clobber (reg:CC 17 flags))
            (clobber (reg:SI 1 dx [80]))
        ]) t.c:9 137 {extendsidi2_1}
     (expr_list:REG_UNUSED (reg:SI 1 dx [80])
        (expr_list:REG_DEAD (reg:SI 0 ax [orig:75 b ] [75])
            (nil))))
(insn 20 19 21 4 (set (mem:SI (reg/f:SI 2 cx [orig:61 D.1734 ] [61]) [2 *_4+0 S4 A32])
        (const_int 0 [0])) t.c:21 85 {*movsi_internal}
     (nil))
(note 21 20 80 4 NOTE_INSN_DELETED)
(insn 80 21 22 4 (set (reg:SI 0 ax [82])
        (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])) t.c:17 85 {*movsi_internal}
     (nil))

and postreload optimizes it to

(insn 78 18 19 4 (set (reg:SI 0 ax [orig:75 b ] [75])
        (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])) t.c:9 85 {*movsi_internal}
     (expr_list:REG_EQUIV (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])
        (nil)))
(insn 19 78 21 4 (parallel [
            (set (mem:DI (reg/f:SI 3 bx [orig:68 D.1736 ] [68]) [4 *_18+0 S8 A64])
                (sign_extend:DI (reg:SI 0 ax [orig:75 b ] [75])))
            (clobber (reg:CC 17 flags))
            (clobber (reg:SI 1 dx [80]))
        ]) t.c:9 137 {extendsidi2_1}
     (expr_list:REG_UNUSED (reg:SI 1 dx [80])
        (expr_list:REG_DEAD (reg:SI 0 ax [orig:75 b ] [75])
            (nil))))
(note 21 19 80 4 NOTE_INSN_DELETED)
(insn 80 21 22 4 (set (reg:SI 0 ax [82])
        (reg:SI 0 ax [orig:75 b ] [75])) t.c:17 85 {*movsi_internal}
     (nil))
(insn 22 80 23 4 (parallel [
            (set (reg:SI 0 ax [orig:72 b.3 ] [72])
                (plus:SI (reg:SI 0 ax [82])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (reg:CC 17 flags))

but this is then broken by split, which clobbers ax (insn 90 overwrites it
with the sign bits) while insn 22 still reads it:

(insn 78 18 89 4 (set (reg:SI 0 ax [orig:75 b ] [75])
        (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])) t.c:9 85 {*movsi_internal}
     (expr_list:REG_EQUIV (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])
        (nil)))
(insn 89 78 90 4 (set (mem:SI (reg/f:SI 3 bx [orig:68 D.1736 ] [68]) [4 *_18+0 S4 A64])
        (reg:SI 0 ax [orig:75 b ] [75])) t.c:9 85 {*movsi_internal}
     (nil))
(insn 90 89 91 4 (parallel [
            (set (reg:SI 0 ax [orig:75 b ] [75])
                (ashiftrt:SI (reg:SI 0 ax [orig:75 b ] [75])
                    (const_int 31 [0x1f])))
            (clobber (reg:CC 17 flags))
        ]) t.c:9 529 {ashrsi3_cvt}
     (nil))
(insn 91 90 21 4 (set (mem:SI (plus:SI (reg/f:SI 3 bx [orig:68 D.1736 ] [68])
                (const_int 4 [0x4])) [4 *_18+4 S4 A32])
        (reg:SI 0 ax [orig:75 b ] [75])) t.c:9 85 {*movsi_internal}
     (nil))
(note 21 91 22 4 NOTE_INSN_DELETED)
(insn 22 21 23 4 (parallel [
            (set (reg:SI 0 ax [orig:72 b.3 ] [72])
                (plus:SI (reg:SI 0 ax [82])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (reg:CC 17 flags))
        ]) t.c:17 260 {*addsi_1}
     (expr_list:REG_DEAD (reg:SI 0 ax [82])
        (expr_list:REG_EQUAL (plus:SI (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])
                (const_int -1 [0xffffffffffffffff]))
            (nil))))
(insn 23 22 24 4 (set (mem/c:SI (symbol_ref:SI ("b") [flags 0x2]  <var_decl 0x7fccdf73b098 b>) [2 b+0 S4 A32])
        (reg:SI 0 ax [orig:72 b.3 ] [72])) t.c:17 85 {*movsi_internal}
     (nil))
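
Comparing the dumps above, insn 19 still carries a REG_DEAD note for ax after postreload, even though insn 80 has been rewritten to read ax. The following C sketch recomputes liveness of ax after insn 19 for both insn sequences with a simple forward scan. It is only a toy model: the struct, the use/def masks (transcribed by hand from the dumps, with the parallel's clobbers treated as defs) and the function name are assumptions, not GCC internals.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bit per hard register, numbered as in the dumps (ax=0, dx=1, cx=2, bx=3). */
enum { AX = 1u << 0, DX = 1u << 1, CX = 1u << 2, BX = 1u << 3 };

/* Hypothetical, simplified insn: only register uses and defs are tracked. */
struct insn {
    int uid;
    unsigned uses;
    unsigned defs;
};

/* After IRA/LRA: insn 80 reloads ax from memory. */
static const struct insn after_reload[] = {
    { 19, AX | BX, DX },  /* extendsidi2_1, scratch dx */
    { 20, CX,      0  },  /* store 0 through cx */
    { 80, 0,       AX },  /* ax = mem[b] */
    { 22, AX,      AX },  /* ax = ax - 1 */
};

/* After postreload: insn 80 became ax = ax, so it now reads ax. */
static const struct insn after_postreload[] = {
    { 19, AX | BX, DX },
    { 80, AX,      AX },
    { 22, AX,      AX },
};

/* True if reg is read after insns[idx] before being overwritten.
   The register is assumed dead at the end of the modeled sequence. */
static bool live_after(const struct insn *insns, size_t n,
                       size_t idx, unsigned reg)
{
    for (size_t i = idx + 1; i < n; i++) {
        if (insns[i].uses & reg)
            return true;
        if (insns[i].defs & reg)
            return false;
    }
    return false;
}
```

In this model ax is dead after insn 19 in the first sequence but live in the second, so a split of insn 19 that overwrites ax in place is only harmless for the first one.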
