https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855
Bug ID: 116855
Summary: Unsafe early-break vectorization
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: fxue at os dot amperecomputing.com
Target Milestone: ---
For the case:
#include <stddef.h>

char string[1020];

char * find(size_t n, char c)
{
  for (size_t i = 0; i < n; i++) {
    if (string[i] == c)
      return &string[i];
  }
  return 0;
}
On aarch64 (compiled without SVE), the loop is vectorized at -O3 as:
...
bnd.5_22 = n_4(D) >> 4;
vect_cst__50 = {c_6(D), c_6(D), ..., c_6(D), c_6(D)};
...
# vectp_string.10_47 = PHI <vectp_string.10_48(8), &string(13)>
# ivtmp_63 = PHI <ivtmp_64(8), 0(13)>
...
vect__1.12_49 = MEM <vector(16) char> [(char *)vectp_string.10_47];
mask_patt_9.13_51 = vect__1.12_49 == vect_cst__50;
if (mask_patt_9.13_51 != { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 })
goto <bb 20>; [5.50%]
else
goto <bb 5>; [94.50%]
...
vectp_string.10_48 = vectp_string.10_47 + 16;
ivtmp_64 = ivtmp_63 + 1;
if (ivtmp_64 < bnd.5_22)
goto <bb 8>; [94.50%]
else
goto <bb 15>; [5.50%]
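Suppose n is 1026, which is larger than the length of "string" (1020), and
only the last element of "string" equals c. The scalar loop returns at
i = 1019 and never reads out of bounds. The vectorized loop, however, runs up
to 1026 >> 4 = 64 iterations, and its last iteration loads string[1008..1023].
That load reads 4 bytes past the end of "string" and may trigger a segfault.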
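The arithmetic behind this scenario can be checked with a small sketch. The
helper names and the constants ARRAY_LEN and VF are illustrative assumptions
(VF is 16 to match the vector(16) char load in the dump); none of this is GCC
code.

```c
#include <stddef.h>

/* Illustrative constants: the array length from the test case and the
   vectorization factor seen in the dump (vector(16) char). */
#define ARRAY_LEN 1020
#define VF 16

/* Number of full vector iterations, mirroring bnd = n >> 4 in the dump. */
static size_t vector_iterations(size_t n)
{
    return n / VF;
}

/* Number of bytes the vector loop would read past the end of the array
   if it ran all of its iterations without an early break; 0 if every
   load stays in bounds. */
static size_t bytes_read_past_end(size_t n)
{
    size_t end = vector_iterations(n) * VF;  /* one past the last byte loaded */
    return end > ARRAY_LEN ? end - ARRAY_LEN : 0;
}
```

For n = 1026 this gives 64 vector iterations and 4 bytes read past the end,
while n = 1020 gives 63 iterations that all stay inside the array.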
One possible fix is to compute the vector iteration count from the smaller of
the known constant bound and the variable scalar iteration count. Another
option is to follow the same kind of assertion as "-fallow-store-data-races",
i.e. assume that a segfault will not happen, so that introducing the new
accesses is acceptable. The vectorization would then be enabled with -Ofast
rather than -O3. This approach could also be extended to cover a data array
accessed through a pointer, with no statically determined bound, for example:
#include <stddef.h>

char * find(char *string, size_t n, char c)
{
  for (size_t i = 0; i < n; i++) {
    if (string[i] == c)
      return &string[i];
  }
  return 0;
}
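The first proposed fix (clamping the vector iteration count to the known array
bound) can be illustrated for the fixed-size array case with a scalar
emulation. This is only a sketch of the idea, not what the vectorizer would
emit: find_clamped, clamped_vector_iterations, ARRAY_LEN and VF are
hypothetical names, and the inner lane loop stands in for one 16-byte vector
load and compare.

```c
#include <stddef.h>

#define ARRAY_LEN 1020   /* known constant bound of the array */
#define VF 16            /* assumed vectorization factor */

static char string[ARRAY_LEN];

/* Hypothetical helper: vector iteration count computed from the smaller
   of the known array bound and the scalar iteration count n. */
static size_t clamped_vector_iterations(size_t n)
{
    size_t limit = n < ARRAY_LEN ? n : ARRAY_LEN;
    return limit / VF;
}

/* Scalar emulation of the vectorized loop with the clamped count. */
static char *find_clamped(size_t n, char c)
{
    size_t bnd = clamped_vector_iterations(n);
    size_t i = 0;

    /* "Vector" part: every block of VF bytes lies entirely inside string. */
    for (size_t blk = 0; blk < bnd; blk++, i += VF) {
        for (size_t lane = 0; lane < VF; lane++) {
            if (string[i + lane] == c)
                return &string[i + lane];
        }
    }

    /* Scalar epilogue: performs the remaining accesses one element at a
       time, in the same order as the original loop. */
    for (; i < n; i++) {
        if (string[i] == c)
            return &string[i];
    }
    return 0;
}
```

With n = 1026 and the only match at string[1019], the clamped count is 63, the
block loads stop at string[1007], and the epilogue finds the match at index
1019 without reading past the array.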