Hi, I'm investigating how to vectorize the following simple case:
int A[1024 * 2]; int foo1 (unsigned offset) { int sum = 0; for (unsigned i = 0; i < 1024; i++) sum += A[i + offset]; return sum; } The loop body and loop vectorizer dumps are: # i_13 = PHI <i_9(5), 0(2)> # ivtmp_14 = PHI <ivtmp_12(5), 1024(2)> _1 = offset_7(D) + i_13; _2 = A[_1]; sum_8 = _2 + sum_11; i_9 = i_13 + 1; ... Creating dr for A[_1] ... (chrec = (sizetype) {offset_7(D), +, 1}_1) (res = (sizetype) {offset_7(D), +, 1}_1) case1.c:7:13: missed: failed: evolution of offset is not affine. As SCEV thinks {offset,+,1} may overflow, it can not propagate the sizetype and fails. The call-stack is: dr_analyze_innermost() -> simple_iv() -> ... -> convert_affine_scev() -> scev_probably_wraps_p() -> scev_var_range_cant_overflow() loop_exits_before_overflow() BTW. If we add a stmt like "if (offset <= 1024)" before the loop, GCC can vectorized it successfully, as scev_var_range_cant_overflow() knows the range of _1. For the original case, I think GCC is able to infer the range from the array length and do the vectorization. There is already functions to infer this info from undefined behavior like array-ref and record nonwrappiong IVs. We can see the dumps of passes like evrp, cunroll, vrp1, etc: Induction variable (unsigned int) offset_8(D) + 1 * iteration does not wrap in statement _2 = A[_1]; in loop 1. Statement _2 = A[_1]; is executed at most 2047 (bounded by 2047) + 1 times in loop 1. The call-stack is: estimate_numbers_of_iterations() -> infer_loop_bounds_from_undefined() -> infer_loop_bounds_from_array() -> ... -> record_nonwrapping_iv() So, I have two questions: 1. is it legal to vectorize such case by inferring the no wrap info from array ref (I'm wondering if there is any corner case that can not do)? 2. If the 1st question is true, then how could we implement this in GCC? For the 2nd question, I think there may be several possible solutions: - Could we re-use the existing nonwrapping IV information found by niter? - Or, could we implement a function called infer_nowrapping_from_array() (like infer_loop_bounds_from_array)? For this way, there are also different possible places to call it: - in the very beginning (i.e., dr_analyze_innermost()), where we knows it is analyzing an array-ref A[_1]. - in scev_probably_wraps_p(), where it doesn't know it's analyzing an array subscript, so we have to find if the user of _1 is array-ref (i.e., A[_1]). As SCEV is the fundamental pass for loop analysis, I think we'd be very careful about the correctness. Do you have any comments? Thanks! -Hao