https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116588
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com,
                   |                            |jakub at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
This can't be bisected because the testcase depends on
--param=vrp-block-limit=0, which was only introduced in r15-1622, and that
revision already fails.
This looks like a vrp2 issue.
Before vectorization, we have:
<bb 3> [local count: 1073741824]:
# _33 = PHI <2(12), _34(14)>
_35 = _33 - _26;
_36 = _35 - _27;
_37 = VIEW_CONVERT_EXPR<unsigned long[3]>(e)[_35];
_38 = _37 << _25;
_39 = VIEW_CONVERT_EXPR<unsigned long[3]>(e)[_36];
_40 = _39 >> _28;
_41 = _38 | _40;
bitint.2[_33] = _41;
_34 = _33 + 18446744073709551615;
_42 = (ssizetype) _32;
_43 = (ssizetype) _34;
if (_42 <= _43)
goto <bb 14>; [0.05%]
else
goto <bb 6>; [99.95%]
<bb 14> [local count: 536864]:
goto <bb 3>; [100.00%]
This loop should iterate exactly once in this testcase, because the shift is a
left shift by 128 bits, so _35 and _36 are 0 and _33 is 2.
The vectorizer then vectorizes this loop (for some strange reason with a single
scalar iteration first and then 2 iterations at a time).
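The trip count claimed above can be checked with a small scalar model of the
index arithmetic in <bb 3>. This is only a sketch: the function and struct
names are hypothetical, the limb data and the shifts themselves are left out,
and the three parameters stand in for _26, _27 and _32, which in this testcase
are 2, 0 and 2.

```c
#include <stdint.h>

/* Hypothetical record of what the scalar loop does.  */
struct loop_trace
{
  unsigned iterations;    /* number of times the body of <bb 3> runs */
  uint64_t first_dst;     /* _33 in the first iteration */
  uint64_t first_src_hi;  /* _35 in the first iteration */
  uint64_t first_src_lo;  /* _36 in the first iteration */
};

/* Model of the index arithmetic only.  limb_shift, has_bit_shift and
   lower_bound stand in for _26, _27 and _32.  */
static struct loop_trace
trace_loop (uint64_t limb_shift, uint64_t has_bit_shift, uint64_t lower_bound)
{
  struct loop_trace t = { 0, 0, 0, 0 };
  uint64_t idx = 2;                            /* _33 = PHI <2, _34> */
  for (;;)
    {
      uint64_t src_hi = idx - limb_shift;      /* _35 = _33 - _26 */
      uint64_t src_lo = src_hi - has_bit_shift; /* _36 = _35 - _27 */
      if (t.iterations == 0)
        {
          t.first_dst = idx;
          t.first_src_hi = src_hi;
          t.first_src_lo = src_lo;
        }
      t.iterations++;
      uint64_t next = idx - 1;                 /* _34 = _33 + (-1) */
      /* if (_42 <= _43) goto <bb 14>; else leave the loop.  */
      if (!((int64_t) lower_bound <= (int64_t) next))
        break;
      idx = next;
    }
  return t;
}
```

Calling trace_loop (2, 0, 2) models this testcase: the body runs once, storing
to index 2 and reading index 0 for both source limbs.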
Before vrp2 we have:
<bb 2> [local count: 1073741824]:
c.0_1 = c;
_2 = (unsigned int) c.0_1;
_3 = 128 - _2;
# RANGE [irange] unsigned int [0, +INF] MASK 0x3f VALUE 0x0
_25 = _3 & 63;
# RANGE [irange] unsigned int [0, +INF] MASK 0x3ffffff VALUE 0x0
_29 = _3 >> 6;
# RANGE [irange] sizetype [0, +INF] MASK 0x3ffffff VALUE 0x0
_26 = (sizetype) _29;
_31 = _25 != 0;
# RANGE [irange] sizetype [0, 1] MASK 0x1 VALUE 0x0
_27 = (sizetype) _31;
# RANGE [irange] sizetype [0, +INF] MASK 0x7ffffff VALUE 0x0
_32 = _27 + _26;
if (_32 <= 2)
goto <bb 3>; [80.00%]
else
goto <bb 7>; [20.00%]
<bb 3> [local count: 858993464]:
_30 = -_25;
# RANGE [irange] unsigned int [0, +INF] MASK 0x3f VALUE 0x0
_28 = _30 & 63;
_63 = -_32;
_105 = 2 - _32;
if (_105 <= 1)
goto <bb 5>; [10.00%]
else
goto <bb 4>; [90.00%]
Here _2 is 0 at runtime (but the compiler doesn't know that), so _26 is 2,
_27 is 0 and _32 is 2.
bb 4 contains the single scalar iteration plus the vectorized loop; bb 5
contains the scalar loop.
Since _105 should be 2 - 2 = 0, _105 <= 1 should be true at runtime.
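To make the arithmetic above easy to re-check, here is a sketch that replays
the statements of bb 2 and bb 3 in plain C. The struct and function names are
hypothetical, the fixed-width types are chosen to match the unsigned int and
sizetype ranges in the dump (assuming a 64-bit sizetype), and the input is _2
directly rather than c, since the declared type of c is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the SSA names of interest.  */
struct preheader_values
{
  uint32_t bit_shift;     /* _25 = _3 & 63 */
  uint64_t limb_shift;    /* _26 = (sizetype) (_3 >> 6) */
  uint64_t has_bit_shift; /* _27 = (sizetype) (_25 != 0) */
  uint64_t lower_bound;   /* _32 = _27 + _26 */
  bool reaches_bb3;       /* _32 <= 2 */
  uint64_t diff;          /* _105 = 2 - _32 */
  bool takes_bb5;         /* _105 <= 1 */
};

static struct preheader_values
model_preheader (uint32_t shift_input /* _2 */)
{
  struct preheader_values v;
  uint32_t count = 128u - shift_input;      /* _3 = 128 - _2 */
  v.bit_shift = count & 63u;
  v.limb_shift = (uint64_t) (count >> 6);
  v.has_bit_shift = (uint64_t) (v.bit_shift != 0);
  v.lower_bound = v.has_bit_shift + v.limb_shift;
  v.reaches_bb3 = v.lower_bound <= 2;
  v.diff = 2 - v.lower_bound;               /* wraps if _32 > 2 */
  v.takes_bb5 = v.diff <= 1;
  return v;
}
```

For model_preheader (0) this gives _26 = 2, _27 = 0, _32 = 2 and _105 = 0, so
the branch to bb 5 is the one that has to be taken at runtime.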
But somehow vrp2 determines that it is always false:
Global Exported: _105 = [irange] unsigned long [0, 2]
Folding statement: _105 = 2 - _32;
Not folded
Folding statement: if (_105 <= 1)
Visiting conditional with predicate: if (_105 <= 1)
With known ranges
_105: [irange] unsigned long [0, 2]
Predicate evaluates to: DON'T KNOW
Simplified relational if (_105 <= 1)
into if (_105 != 2)
Folded into: if (0 != 0)
The [0, 2] range for _105 is reasonable (at runtime it should be 0), and
simplifying _105 <= 1 into _105 != 2 is reasonable too, but it is unclear how
vrp2 got from that to 0 != 0.
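Both "reasonable" steps can be confirmed by brute force over the range, which
also shows why the final fold is suspicious. The sketch below is not how
ranger works internally; the helper names are hypothetical and it simply
enumerates every value in a small closed range.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if (x <= 1) and (x != 2) agree for every x in [lo, hi].
   Requires lo <= hi; meant for small ranges only.  */
static bool
rewrite_is_equivalent (uint64_t lo, uint64_t hi)
{
  for (uint64_t x = lo; ; x++)
    {
      if ((x <= 1) != (x != 2))
        return false;
      if (x == hi)
        break;
    }
  return true;
}

/* True if some x in [lo, hi] satisfies x != 2, i.e. the predicate
   cannot be folded to constant false for that range.  */
static bool
predicate_can_be_true (uint64_t lo, uint64_t hi)
{
  for (uint64_t x = lo; ; x++)
    {
      if (x != 2)
        return true;
      if (x == hi)
        break;
    }
  return false;
}
```

With the exported range [0, 2], rewrite_is_equivalent (0, 2) holds, so the
relational simplification is fine, but predicate_can_be_true (0, 2) holds as
well. Folding the condition to false would only be justified by a range of
exactly [2, 2], which is not what the dump reports for _105.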