https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116588
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com,
                   |                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Can't be bisected due to the --param=vrp-block-limit=0 dependency, and that
has been only introduced in r15-1622, which already fails.

Seems like vrp2 issue.
Before vectorization, we have:
  <bb 3> [local count: 1073741824]:
  # _33 = PHI <2(12), _34(14)>
  _35 = _33 - _26;
  _36 = _35 - _27;
  _37 = VIEW_CONVERT_EXPR<unsigned long[3]>(e)[_35];
  _38 = _37 << _25;
  _39 = VIEW_CONVERT_EXPR<unsigned long[3]>(e)[_36];
  _40 = _39 >> _28;
  _41 = _38 | _40;
  bitint.2[_33] = _41;
  _34 = _33 + 18446744073709551615;
  _42 = (ssizetype) _32;
  _43 = (ssizetype) _34;
  if (_42 <= _43)
    goto <bb 14>; [0.05%]
  else
    goto <bb 6>; [99.95%]

  <bb 14> [local count: 536864]:
  goto <bb 3>; [100.00%]
loop which in this testcase should iterate exactly once because it is shift
left by 128 bits, so _35 and _36 are 0 and _33 is 2.
Then vectorizer vectorizes this loop (for some strange reason with a single
scalar iteration first and then 2 iterations at a time).
Before vrp2 we have:
  <bb 2> [local count: 1073741824]:
  c.0_1 = c;
  _2 = (unsigned int) c.0_1;
  _3 = 128 - _2;
  # RANGE [irange] unsigned int [0, +INF] MASK 0x3f VALUE 0x0
  _25 = _3 & 63;
  # RANGE [irange] unsigned int [0, +INF] MASK 0x3ffffff VALUE 0x0
  _29 = _3 >> 6;
  # RANGE [irange] sizetype [0, +INF] MASK 0x3ffffff VALUE 0x0
  _26 = (sizetype) _29;
  _31 = _25 != 0;
  # RANGE [irange] sizetype [0, 1] MASK 0x1 VALUE 0x0
  _27 = (sizetype) _31;
  # RANGE [irange] sizetype [0, +INF] MASK 0x7ffffff VALUE 0x0
  _32 = _27 + _26;
  if (_32 <= 2)
    goto <bb 3>; [80.00%]
  else
    goto <bb 7>; [20.00%]

  <bb 3> [local count: 858993464]:
  _30 = -_25;
  # RANGE [irange] unsigned int [0, +INF] MASK 0x3f VALUE 0x0
  _28 = _30 & 63;
  _63 = -_32;
  _105 = 2 - _32;
  if (_105 <= 1)
    goto <bb 5>; [10.00%]
  else
    goto <bb 4>; [90.00%]
where _2 is 0 (but compiler doesn't know that), so _26 is 2 and _27 is 0 and
_32 is 2.
bb 4 contains the single scalar iteration + vectorized loop, bb 5 scalar loop.
And as _105 should be 2 - 2 = 0, _105 <= 1 should be true at runtime.
But somehow vrp2 determines that it is always false:
Global Exported: _105 = [irange] unsigned long [0, 2]
Folding statement: _105 = 2 - _32;
Not folded
Folding statement: if (_105 <= 1)
Visiting conditional with predicate: if (_105 <= 1)

With known ranges
        _105: [irange] unsigned long [0, 2]
Predicate evaluates to: DON'T KNOW
Simplified relational if (_105 <= 1)
 into if (_105 != 2)
Folded into: if (0 != 0)
The [0, 2] range for _105 is reasonable, at runtime it should be 0,
simplifying _105 <= 1 into _105 != 2 is reasonable too, but how it determined
from that that it is 0 != 0 is unclear.