On Thu, Nov 9, 2023 at 10:51 AM Hao Liu OS via Gcc <gcc@gcc.gnu.org> wrote: > > Hi, > > I'm investigating how to vectorize the following simple case: > > int A[1024 * 2]; > int foo1 (unsigned offset) { > int sum = 0; > for (unsigned i = 0; i < 1024; i++) > sum += A[i + offset]; > return sum; > } > > The loop body and loop vectorizer dumps are: > > # i_13 = PHI <i_9(5), 0(2)> > # ivtmp_14 = PHI <ivtmp_12(5), 1024(2)> > _1 = offset_7(D) + i_13; > _2 = A[_1]; > sum_8 = _2 + sum_11; > i_9 = i_13 + 1; > ... > Creating dr for A[_1] > ... > (chrec = (sizetype) {offset_7(D), +, 1}_1) > (res = (sizetype) {offset_7(D), +, 1}_1) > case1.c:7:13: missed: failed: evolution of offset is not affine. > > As SCEV thinks {offset,+,1} may overflow, it can not propagate the sizetype > and > fails. The call-stack is: > > dr_analyze_innermost() -> > simple_iv() -> ... -> > convert_affine_scev() -> > scev_probably_wraps_p() -> > scev_var_range_cant_overflow() > loop_exits_before_overflow() > > > BTW. If we add a stmt like "if (offset <= 1024)" before the loop, GCC can > vectorized it successfully, as scev_var_range_cant_overflow() knows the range > of _1. > > For the original case, I think GCC is able to infer the range from the array > length and do the vectorization. There is already functions to infer this info > from undefined behavior like array-ref and record nonwrappiong IVs. We can see > the dumps of passes like evrp, cunroll, vrp1, etc: > > Induction variable (unsigned int) offset_8(D) + 1 * iteration does not > wrap in statement _2 = A[_1]; > in loop 1. > Statement _2 = A[_1]; > is executed at most 2047 (bounded by 2047) + 1 times in loop 1. > > The call-stack is: > > estimate_numbers_of_iterations() -> > infer_loop_bounds_from_undefined() -> > infer_loop_bounds_from_array() -> ... -> > record_nonwrapping_iv() > > > So, I have two questions: > > 1. is it legal to vectorize such case by inferring the no wrap info from array > ref (I'm wondering if there is any corner case that can not do)? > 2. If the 1st question is true, then how could we implement this in GCC? > > For the 2nd question, I think there may be several possible solutions: > - Could we re-use the existing nonwrapping IV information found by niter? > - Or, could we implement a function called infer_nowrapping_from_array() (like > infer_loop_bounds_from_array)? For this way, there are also different > possible > places to call it: > - in the very beginning (i.e., dr_analyze_innermost()), where we knows it is > analyzing an array-ref A[_1]. > - in scev_probably_wraps_p(), where it doesn't know it's analyzing an array > subscript, so we have to find if the user of _1 is array-ref (i.e., > A[_1]). > > As SCEV is the fundamental pass for loop analysis, I think we'd be very > careful > about the correctness. Do you have any comments?
Also SCEV and niter analysis are very closely inter-winded ... It was previously suggested that SCEV analysis should get something like niter analysis "assumptions" so that we can record a condition that needs to hold true for the SCEV result to be valid. That condition can then for example checked at runtime by the vectorizer, just like we check the niter analysis 'assumptions'. But it's going to be quite a bit of work to wire in something like 'assumptions' there. In your particular case we could indeed use undefined behavior like niter analysis does and it would be nice to somehow record its analysis and use it from SCEV. Apart from introducing 'assumptions' I don't have any good idea on how to actually do this. It will require experimenting. Richard. > > Thanks! > -Hao