https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106073
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=90348

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Options reduced to

  -O2 -funroll-loops -fno-tree-vectorize -fdisable-tree-cunrolli
  -fdbg-cnt=gimple_unroll:3-6:8-8

in particular reducing late unrolling more will no longer reproduce the issue.
Disabling all threading after cunroll still reproduces the issue, thus adding

  -fdisable-tree-thread2 -fdisable-tree-threadfull2 -fdisable-tree-vrp2
  -fdisable-tree-dom3

Disabling IVOPTs hides the issue.

Making all functions but main static also still reproduces the issue (so
there's just one function left for late opts). With that simplification

  -O2 -funroll-loops -fno-tree-vectorize -fdisable-tree-cunrolli
  -fdbg-cnt=gimple_unroll:3-6:8-8 -fdisable-tree-thread2
  -fdisable-tree-threadfull2 -fdisable-tree-dom3 -fno-tree-vrp
  -fdump-tree-all -fdbg-cnt=ivopts_loop:13-13:15-15

reproduces it, less IVOPTs does not.

So one difference triggering the issue is (good vs bad in .optimized):

@@ -328,20 +326,21 @@

   <bb 19> [local count: 60532]:
   l = 0;
+  ivtmp.151_212 = (unsigned long) &bf;
+  _507 = ivtmp.151_212 + 56;

   <bb 20> [local count: 412224]:
-  # ai_lsm.108_510 = PHI <d.13_27(19), _516(21)>
-  # ivtmp_511 = PHI <8(19), ivtmp_512(21)>
-  ivtmp_512 = ivtmp_511 - 1;
-  if (ivtmp_512 != 0)
+  # ivtmp.151_254 = PHI <ivtmp.151_212(19), ivtmp.151_380(21)>
+  if (ivtmp.151_254 != _507)
     goto <bb 21>; [89.00%]
   else
     goto <bb 22>; [11.00%]

   <bb 21> [local count: 366880]:
-  bf[ai_lsm.108_510][0] = 5;
-  bf[ai_lsm.108_510][1] = 5;
-  _516 = ai_lsm.108_510 + 1;
+  _506 = (void *) ivtmp.151_254;
+  MEM[(int *)_506] = 5;
+  MEM[(int *)_506 + 4B] = 5;
+  ivtmp.151_380 = ivtmp.151_254 + 8;
   goto <bb 20>; [100.00%]

   <bb 22> [local count: 45347]:
@@ -353,8 +352,6 @@

   <bb 23> [local count: 40253]:
   bf ={v} {CLOBBER(eol)};
-  ivtmp.132_498 = (unsigned long) &bf;
-  _480 = ivtmp.132_498 + 56;
   goto <bb 25>; [100.00%]

   <bb 24> [local count: 325681]:
@@ -364,8 +361,8 @@
   ivtmp.132_245 = ivtmp.132_383 + 8;

   <bb 25> [local count: 365933]:
-  # ivtmp.132_383 = PHI <ivtmp.132_498(23), ivtmp.132_245(24)>
-  if (ivtmp.132_383 != _480)
+  # ivtmp.132_383 = PHI <ivtmp.151_212(23), ivtmp.132_245(24)>
+  if (ivtmp.151_254 != ivtmp.132_383)
     goto <bb 24>; [89.00%]
   else
     goto <bb 26>; [11.00%]

that already causes quite some assembler changes :/  It's still not clear
what goes wrong here.  Interestingly again, -fstack-reuse=none avoids the
issue.  So maybe the above hints at 'bf' being the issue here:

Partition 3: size 56 align 16
        av      au      bf
Partition 5: size 56 align 16
        av
Partition 1: size 44 align 16
        at
Partition 2: size 8 align 8
        au
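
For readers less familiar with what IVOPTs does to the first loop (bb 20/21),
here is a rough C-level sketch of the two loop shapes.  It is not taken from
the testcase: it assumes 'bf' is an int[7][2] (56 bytes with 4-byte int,
matching the partition size), that the starting index is 0, and the function
names fill_indexed, fill_pointer and loops_agree are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define ROWS 7  /* assumption: 7 rows * 2 ints = 56 bytes when int is 4 bytes */

/* Shape of the loop in the "good" dump: stores through an array index,
   with a separate down-counting trip counter starting at 8.  */
static void fill_indexed(int bf[ROWS][2])
{
    unsigned ai = 0;         /* assumption: the real code starts at d.13_27 */
    unsigned remaining = 8;  /* mirrors ivtmp_511 = PHI <8(19), ...> */
    while (--remaining != 0) {
        bf[ai][0] = 5;
        bf[ai][1] = 5;
        ai++;
    }
}

/* Shape of the loop in the "bad" dump: a single address-valued induction
   variable that is compared against the end address of the array.  */
static void fill_pointer(int bf[ROWS][2])
{
    uintptr_t iv = (uintptr_t)bf;                    /* (unsigned long) &bf */
    uintptr_t end = iv + ROWS * 2 * sizeof(int);     /* &bf + 56 */
    while (iv != end) {
        int *p = (int *)iv;
        p[0] = 5;                                    /* MEM[(int *)_506] */
        p[1] = 5;                                    /* MEM[(int *)_506 + 4B] */
        iv += 2 * sizeof(int);                       /* ivtmp + 8 */
    }
}

/* Returns 1 when both loop shapes leave identical array contents.  */
int loops_agree(void)
{
    int a[ROWS][2], b[ROWS][2];
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);
    fill_indexed(a);
    fill_pointer(b);
    return memcmp(a, b, sizeof a) == 0;
}
```

Both shapes store the same values, so the rewrite itself is fine in isolation.
What stands out in the bad dump is that the address of 'bf' computed before
the first loop is also used by the loop in bb 24/25, after 'bf' has been
clobbered in bb 23.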