https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111770
--- Comment #2 from Alex Coplan <acoplan at gcc dot gnu.org> ---
I think that to make progress on this and related cases we need .MASK_LOAD to
be defined to yield zero when the predicate is false (either unconditionally
for all targets, if possible, or otherwise only for targets where that is safe).
Here is a related case:
int bar(int n, char *a, char *b, char *c) {
  int sum = 0;
  for (int i = 0; i < n; ++i)
    if (c[i] == 0)
      sum += a[i] * b[i];
  return sum;
}
In this case we get the missed optimization even before vectorization, during
ifcvt (in some ways it is a simpler case to consider, as only scalars are
involved). Here, with -O3 -march=armv9-a, from ifcvt we get:
<bb 3> [local count: 955630224]:
# sum_23 = PHI <_ifc__41(8), 0(18)>
# i_25 = PHI <i_20(8), 0(18)>
_1 = (sizetype) i_25;
_2 = c_16(D) + _1;
_3 = *_2;
_29 = _3 == 0;
_43 = _42 + _1;
_4 = (char *) _43;
_5 = .MASK_LOAD (_4, 8B, _29);
_6 = (int) _5;
_45 = _44 + _1;
_7 = (char *) _45;
_8 = .MASK_LOAD (_7, 8B, _29);
_9 = (int) _8;
_46 = (unsigned int) _6;
_47 = (unsigned int) _9;
_48 = _46 * _47;
_10 = (int) _48;
_ifc__41 = .COND_ADD (_29, sum_23, _10, sum_23);
For this case it should be possible to use an unpredicated add instead of a
.COND_ADD. We essentially need to show that this transformation is valid:
_29 ? sum_23 + _10 : sum_23 --> sum_23 + _10
and this boils down to showing that:
_29 = false => _10 = 0
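
To make that implication concrete, here is a minimal scalar sketch in C. It is
only a model of the GIMPLE above, not GCC code: mask_load, cond_add and
forms_agree are hypothetical names, and the else_val argument of mask_load is a
modelling device for "whatever the load yields when the predicate is false".
The arithmetic is done in unsigned to sidestep signed overflow in the model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scalar model of .MASK_LOAD: when the predicate is false the
   memory is not read and else_val is returned instead.  The proposal in
   this comment corresponds to else_val == 0.  */
int mask_load(const char *p, bool pred, int else_val)
{
    return pred ? (int) *p : else_val;
}

/* Scalar model of .COND_ADD (pred, a, b, else_val).  */
unsigned cond_add(bool pred, unsigned a, unsigned b, unsigned else_val)
{
    return pred ? a + b : else_val;
}

/* Returns true if the predicated form (_29 ? sum_23 + _10 : sum_23) and the
   unpredicated form (sum_23 + _10) compute the same value.  */
bool forms_agree(unsigned sum, char a, char b, bool pred, int load_else)
{
    int x = mask_load(&a, pred, load_else);       /* models _5/_6 */
    int y = mask_load(&b, pred, load_else);       /* models _8/_9 */
    unsigned prod = (unsigned) x * (unsigned) y;  /* models _48/_10 */
    return cond_add(pred, sum, prod, sum) == sum + prod;
}
```

With load_else == 0 the two forms agree whether the predicate is true or false;
with a nonzero load_else and a false predicate they differ, which is why the
zeroing definition is what makes the transformation valid.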
Now, I'm not sure whether there is a way of match-and-simplifying a GIMPLE
expression under the assumption that a given SSA name takes a particular value.
If there were, and we defined .MASK_LOAD to yield zero given a false predicate,
then we could evaluate _10 under the assumption that _29 = false. With a simple
match.pd rule for .MASK_LOAD with a false predicate, that evaluation would fold
to zero, establishing _10 = 0 and proving the transformation correct. If such
an approach is possible then I guess ifcvt could use it to avoid
conditionalizing statements unnecessarily.
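
As a rough illustration of what "evaluating under an assumption" could look
like, here is a toy sketch in C over a made-up expression tree. None of this is
GCC's API: struct expr, the K_* kinds and zero_under_assumption are all
hypothetical, and the K_MASK_LOAD case bakes in the assumed zero-on-false
semantics. The check is conservative, so a false result only means "unknown".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression IR; every name here is hypothetical, not GCC's.  */
enum kind { K_CONST, K_NAME, K_MASK_LOAD, K_CAST, K_MULT };

struct expr {
    enum kind kind;
    long value;              /* K_CONST: the constant value */
    int name;                /* K_NAME: an SSA-like name id, e.g. 29 */
    const struct expr *op0;  /* K_CAST/K_MULT operand; K_MASK_LOAD predicate */
    const struct expr *op1;  /* K_MULT: second operand */
};

/* Returns true if E is provably zero given that the name ASSUMED_FALSE
   is false (zero).  Returns false when nothing can be concluded.  */
bool zero_under_assumption(const struct expr *e, int assumed_false)
{
    switch (e->kind) {
    case K_CONST:
        return e->value == 0;
    case K_NAME:
        /* Only the assumed name is known; any other name is unknown.  */
        return e->name == assumed_false;
    case K_MASK_LOAD:
        /* ASSUMPTION: a masked load yields zero when its predicate is
           false, so it is zero if the predicate is provably false.  */
        return zero_under_assumption(e->op0, assumed_false);
    case K_CAST:
        /* An integer conversion of zero is zero.  */
        return zero_under_assumption(e->op0, assumed_false);
    case K_MULT:
        /* An integer product is zero if either factor is zero.  */
        return zero_under_assumption(e->op0, assumed_false)
               || zero_under_assumption(e->op1, assumed_false);
    }
    return false;
}
```

Building the tree for _10 above (a multiply of two casts of .MASK_LOADs
predicated on _29) and asking whether it is zero under _29 = false gives true;
asking the same under some unrelated name gives false, i.e. unknown.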
Richi: any thoughts on the above or on how we should handle this sort of thing?