On Fri, 31 Jan 2025 at 14:47, Marc Glisse <marc.gli...@inria.fr> wrote:
>
> On Fri, 31 Jan 2025, Abhishek Kaushik wrote:
>
> > The current while loop in std::reduce and related functions is hard to
> > vectorize because the loop control variable is hard to detect in icx.
> >
> > `while ((__last - __first) >= 4)`
> >
> > Changing the loop header to a for loop following the OpenMP canonical
> > form allows easy vectorization, resulting in improved performance.
> >
> > `for (; __first <= __last - 4; __first += 4)`
>
> Is that always legal? If the sequence has size 1, is __last - 4 well
> defined?

No. I thought of that and for some reason assumed that, since we're already computing (__last - __first), it would be OK, but that's nonsense: for a sequence shorter than 4, __last - 4 can step before the start of the sequence. We need to check the size before the stride-4 loop.
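
To make the size check concrete, here is a rough sketch of a guarded stride-4 loop. It is not the libstdc++ code and not a proposed patch: `reduce_stride4` is a hypothetical name, plain `+` stands in for the binary operation, and the way the four elements are combined is only an assumption for illustration.

```cpp
#include <cassert>

// Hypothetical helper for illustration only (not the libstdc++ implementation).
// Sums a random-access range using a stride-4 main loop and a scalar tail.
template <typename RandomIt, typename T>
T reduce_stride4(RandomIt first, RandomIt last, T init)
{
    // Precondition: [first, last) is a valid range.
    assert(last - first >= 0);

    // Check the size first, so that `last - 4` is only formed when the
    // range holds at least four elements.
    if (last - first >= 4)
    {
        // Loop header in the canonical form from the proposal.
        for (; first <= last - 4; first += 4)
        {
            T v1 = first[0] + first[1];
            T v2 = first[2] + first[3];
            init = init + (v1 + v2);
        }
    }

    // Remaining zero to three elements.
    for (; first != last; ++first)
        init = init + *first;

    return init;
}
```

With the guard in place, a sequence of size 0 to 3 never evaluates `last - 4` and goes straight to the tail loop. Whether icx still vectorizes the guarded form as readily would need to be measured.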