[Bug ipa/115533] [12/13/14/15 regression] flac miscompiled with -O3 -march=znver2 -fipa-pta -fno-vect-cost-model since r12-3893-g6390c5047adb75

rguenth at gcc dot gnu.org via Gcc-bugs Mon, 24 Jun 2024 00:12:33 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115533


--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
With -fipa-pta we add

+t.c:28:24: optimized: loop with 5 iterations completely unrolled (header execution count 43151276)
+t.c:30:16: optimized: loop turned into non-loop; it never loops
...
-t.c:28:24: optimized: loop with 4 iterations completely unrolled (header execution count 107374186)
+t.c:36:11: optimized: basic block part vectorized using 32 byte vectors
+t.c:36:11: optimized: basic block part vectorized using 8 byte vectors

The testcase still breaks when adding -fdisable-tree-cunroll
-fno-tree-loop-vectorize; the only change then is

+t.c:36:11: optimized: basic block part vectorized using 16 byte vectors
+t.c:36:11: optimized: basic block part vectorized using 8 byte vectors

when failing we have

t.c:36:11: missed:   can't determine dependence between *_65 and *ad_68
t.c:36:11: note:  removing SLP instance operations starting from: *_65 = _66;

Without IPA PTA we have

  # PT = nonlocal escaped null
  _65 = a.6_13 + _64;
  # PT = nonlocal escaped
  ad_68 = ad_205 + 4;

with IPA PTA

  # PT = null { D.4063 D.4066 } (nonlocal, escaped, escaped heap)
  _65 = a.6_13 + _64;
  # PT = { D.4062 D.4065 } (nonlocal, escaped, escaped heap)
  ad_68 = ad_205 + 4;
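
To make the difference between the two dumps concrete, here is a toy model of a points-to disjointness query. It is only a sketch, not GCC's actual implementation. It assumes that the keywords before the braces (nonlocal, escaped) say the pointer itself may point to such memory, and that the parenthesised flags after the braces only describe the listed variables. The struct, the function names and the bit assigned to each D.NNNN variable are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit per variable named in the dumps. */
enum { D_4062 = 1u << 0, D_4063 = 1u << 1, D_4065 = 1u << 2, D_4066 = 1u << 3 };

/* Toy points-to solution, loosely following the dump layout. */
struct pt_model {
    bool nonlocal;               /* "nonlocal" printed before the braces */
    bool escaped;                /* "escaped" printed before the braces */
    uint32_t vars;               /* variables listed inside { ... } */
    bool vars_contains_nonlocal; /* "(nonlocal, ...)" after the braces */
    bool vars_contains_escaped;  /* "(..., escaped, ...)" after the braces */
};

/* Simplified query: true if the two solutions may overlap.
   Assumption: the escaped handling is cruder than in a real compiler. */
bool may_alias(struct pt_model a, struct pt_model b)
{
    if (a.nonlocal && (b.nonlocal || b.vars_contains_nonlocal))
        return true;
    if (b.nonlocal && a.vars_contains_nonlocal)
        return true;
    if (a.escaped && (b.escaped || b.vars_contains_escaped))
        return true;
    if (b.escaped && a.vars_contains_escaped)
        return true;
    return (a.vars & b.vars) != 0;
}

/* PT = nonlocal escaped null  versus  PT = nonlocal escaped */
bool alias_without_ipa_pta(void)
{
    struct pt_model p65 = { true, true, 0, false, false };
    struct pt_model p68 = { true, true, 0, false, false };
    return may_alias(p65, p68);
}

/* PT = null { D.4063 D.4066 } (...)  versus  PT = { D.4062 D.4065 } (...) */
bool alias_with_ipa_pta(void)
{
    struct pt_model p65 = { false, false, D_4063 | D_4066, true, true };
    struct pt_model p68 = { false, false, D_4062 | D_4065, true, true };
    return may_alias(p65, p68);
}
```

In this model the solutions without IPA PTA overlap, because both pointers may point to nonlocal memory. The IPA PTA solutions list disjoint variable sets and so do not overlap, which under these assumptions would allow the two accesses to be treated as independent.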
