https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89634
Bug ID: 89634
Summary: gmp-ecm miscompilation on s390x with -march=zEC12 -m64 -O2
Product: gcc
Version: 8.3.0
Status: UNCONFIRMED
Keywords: ice-on-valid-code
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: jakub at gcc dot gnu.org
CC: darkkirb at darkkirb dot de, dimhen at gmail dot com, jakub at gcc dot gnu.org, marxin at gcc dot gnu.org, nheghathivhistha at gmail dot com, rguenth at gcc dot gnu.org
Depends on: 89497, 89551
Target Milestone: ---

+++ This bug was initially created as a clone of Bug #89497 +++

/* PR middle-end/89497 */

static unsigned long *
foo (unsigned long *x)
{
  return x + (1 + *x);
}

__attribute__((noipa)) unsigned long
bar (unsigned long *x)
{
  unsigned long c, d = 1, e, *f, g, h = 0, i;
  for (e = *x - 1; e > 0; e--)
    {
      f = foo (x + 1);
      for (i = 1; i < e; i++)
        f = foo (f);
      c = *f;
      if (c == 2)
        d *= 2;
      else
        {
          i = (c - 1) / 2 - 1;
          g = (2 * i + 1) * (d + 1) + (2 * d + 1);
          if (g > h)
            h = g;
          d *= c;
        }
    }
  return h;
}

int
main ()
{
  unsigned long a[18] = { 4, 2, -200, 200, 2, -400, 400, 3, -600, 0, 600, 5, -100, -66, 0, 66, 100, __LONG_MAX__ / 8 + 1 };
  if (bar (a) != 17)
    __builtin_abort ();
  return 0;
}

used to be miscompiled on s390x with -march=zEC12 -m64 -O2 in r269301 and earlier; the bug went latent with the r269302 change. Still, to me this looks like a postreload_jump pass bug. Before that pass, we have (for simplicity using -march=zEC12 -m64 -O2 -fno-reorder-blocks{,-and-partition}):

;; basic block 4, loop depth 0, count 118111600 (estimated locally), maybe hot
;;  prev block 3, next block 5, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       12 [84.2% (guessed)]  count:69378754 (estimated locally) (DFS_BACK)
;;              3 [always]  count:12992275 (estimated locally) (FALLTHRU)
;;              15 [always]  count:35740571 (estimated locally) (DFS_BACK)
...
(code_label 23 6 24 4 3 (nil) [2 uses])
(note 24 23 25 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(note 25 24 26 4 NOTE_INSN_DELETED)
(jump_insn 26 25 87 4 (parallel [
            (set (pc)
                (if_then_else (eq (reg/v:DI 5 %r5 [orig:74 e ] [74])
                        (const_int 1 [0x1]))
                    (label_ref 43)
                    (pc)))
            (clobber (reg:CC 33 %cc))
        ]) "rh1686696.c":16:7 1263 {*cmp_and_br_signed_di}
     (int_list:REG_BR_PROB 118111604 (nil))
 -> 43)
;;  succ:       5 [89.0% (guessed)]  count:105119324 (estimated locally) (FALLTHRU)
;;              9 [11.0% (guessed)]  count:12992276 (estimated locally)
...
;; basic block 5, loop depth 0, count 105119324 (estimated locally), maybe hot
;;  prev block 4, next block 6, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       4 [89.0% (guessed)]  count:105119324 (estimated locally) (FALLTHRU)
...
(jump_insn 68 67 134 12 (parallel [
            (set (pc)
                (if_then_else (ne (reg/v:DI 5 %r5 [orig:74 e ] [74])
                        (const_int 1 [0x1]))
                    (label_ref:DI 23)
                    (pc)))
            (set (reg/v:DI 5 %r5 [orig:74 e ] [74])
                (plus:DI (reg/v:DI 5 %r5 [orig:74 e ] [74])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:DI))
            (clobber (reg:CC 33 %cc))
        ]) "rh1686696.c":13:3 1922 {doloop_di}
     (int_list:REG_BR_PROB 904381916 (nil))
 -> 23)
;;  succ:       4 [84.2% (guessed)]  count:69378754 (estimated locally) (DFS_BACK)
;;              13 [15.8% (guessed)]  count:12992276 (estimated locally) (FALLTHRU)

but postreload_jump makes:

;; basic block 4, loop depth 0, count 48732846 (estimated locally), maybe hot
;;  prev block 3, next block 5, flags: (RTL)
;;  pred:       15 [always]  count:35740571 (estimated locally) (DFS_BACK)
;;              3 [always]  count:12992275 (estimated locally) (FALLTHRU)
...
(code_label 23 6 24 4 3 (nil) [1 uses])
(note 24 23 25 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(note 25 24 26 4 NOTE_INSN_DELETED)
(jump_insn 26 25 146 4 (parallel [
            (set (pc)
                (if_then_else (eq (reg/v:DI 5 %r5 [orig:74 e ] [74])
                        (const_int 1 [0x1]))
                    (label_ref 43)
                    (pc)))
            (clobber (reg:CC 33 %cc))
        ]) "rh1686696.c":16:7 1263 {*cmp_and_br_signed_di}
     (int_list:REG_BR_PROB 355222868 (nil))
 -> 43)
;;  succ:       5 [66.9% (guessed)]  count:32610701 (estimated locally) (FALLTHRU)
;;              9 [33.1% (guessed)]  count:16122145 (estimated locally)
;; basic block 5, loop depth 0, count 105119324 (estimated locally), maybe hot
;; Invalid sum of incoming counts 101989455 (estimated locally), should be 105119324 (estimated locally)
;;  prev block 4, next block 6, flags: (RTL)
;;  pred:       4 [66.9% (guessed)]  count:32610701 (estimated locally) (FALLTHRU)
;;              12 [84.2% (guessed)]  count:69378754 (estimated locally) (DFS_BACK)
...
(jump_insn 68 67 134 12 (parallel [
            (set (pc)
                (if_then_else (ne (reg/v:DI 5 %r5 [orig:74 e ] [74])
                        (const_int 1 [0x1]))
                    (label_ref:DI 146)
                    (pc)))
            (set (reg/v:DI 5 %r5 [orig:74 e ] [74])
                (plus:DI (reg/v:DI 5 %r5 [orig:74 e ] [74])
                    (const_int -1 [0xffffffffffffffff])))
            (clobber (scratch:DI))
            (clobber (reg:CC 33 %cc))
        ]) "rh1686696.c":13:3 1922 {doloop_di}
     (int_list:REG_BR_PROB 904381916 (nil))
 -> 146)
;;  succ:       5 [84.2% (guessed)]  count:69378754 (estimated locally) (DFS_BACK)
;;              13 [15.8% (guessed)]  count:12992276 (estimated locally) (FALLTHRU)

So, effectively the doloop jump_insn 68 has been changed from jumping before the %r5 == 1 comparison to jumping after it. If the jump_insn did not decrement %r5, that would be a perfectly valid optimization, so I guess something fails to verify that the register isn't clobbered in the same instruction.
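To make the problem concrete, here is a small C model of the two jump_insns. It is only a sketch, not GCC code: the names doloop, run, skip_cmp and bad are hypothetical, r5 stands in for %r5 and is assumed to be at least 1 on entry, and the model assumes that both the bb 5 path and the label 43 path eventually reach jump_insn 68.

```c
/* Hypothetical model of the control flow in the dumps; not GCC code.  */

/* Model of jump_insn 68 (doloop_di): the branch condition tests the OLD
   value of the register, and the same insn decrements the register.  */
static int
doloop (unsigned long *r5)
{
  int taken = (*r5 != 1);
  *r5 -= 1;
  return taken;
}

/* Return how many times the bb 5 path is entered with r5 == 1, which
   jump_insn 26 is supposed to prevent.  A nonzero THREADED models the
   back edge being redirected past the comparison (label 146).
   Assumes r5 >= 1 on entry.  */
static unsigned long
run (unsigned long r5, int threaded)
{
  unsigned long bad = 0;
  int skip_cmp = 0;

  for (;;)
    {
      /* jump_insn 26: if (r5 == 1) goto label 43, bypassing bb 5.  */
      if (skip_cmp || r5 != 1)
        {
          /* bb 5 path; it must never see r5 == 1.  */
          if (r5 == 1)
            bad++;
        }

      /* jump_insn 68: back edge of the outer loop.  */
      if (!doloop (&r5))
        break;

      /* With the redirected jump, the comparison is skipped on the
         back edge even though r5 has just been decremented.  */
      skip_cmp = threaded;
    }
  return bad;
}
```

In this model run (3, 0) returns 0 and run (3, 1) returns 1: the branch is taken because the old value 2 is not 1, but the value seen after the jump is 1, so skipping the %r5 == 1 test lets the bb 5 path run with a value it should never see.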

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89497
[Bug 89497] [8 Regression] ICE caused by Segmentation Fault when compiling cups 2.2.10 with LTO flags enabled

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89551
[Bug 89551] [9 regression] Test case gcc.dg/uninit-pred-8_b.c fails after r269302