On Fri, 31 Jan 2025 at 14:47, Marc Glisse <marc.gli...@inria.fr> wrote:
>
> On Fri, 31 Jan 2025, Abhishek Kaushik wrote:
>
> > The current while loop in std::reduce and related functions is hard to
> > vectorize because the loop control variable is hard to detect in icx.
> >
> > `while ((__last - __first) >= 4)`
> >
> > Changing the loop header to a for loop following the OpenMP canonical
> > form allows easy vectorization, resulting in improved performance.
> >
> > `for (; __first <= __last - 4; __first += 4)`
>
> Is that always legal? If the sequence has size 1, is __last - 4 well
> defined?

No. I thought of that and for some reason assumed that since we're
already doing (last - first) it's OK ... but that's nonsense.

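We need to check that the range has at least 4 elements before the
stride=4 loop. Otherwise `__last - 4` would point before `__first`,
and forming that iterator is undefined.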
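
Something along these lines might work, as a rough sketch rather than
the actual libstdc++ code. The name `reduce_sketch` and the unrolled
body are assumptions made for illustration; only the guard and the
for-loop header matter here.

```cpp
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helper for illustration only, not the libstdc++ implementation.
// Assumes random access iterators and a valid range [first, last).
template <typename RandomIt, typename T, typename BinaryOp>
T reduce_sketch(RandomIt first, RandomIt last, T init, BinaryOp op)
{
    // Guard: only form last - 4 when the range holds at least 4 elements,
    // so the result still lies within [first, last].
    if (last - first >= 4)
    {
        const RandomIt stop = last - 4;
        // Loop header in the canonical form proposed in the patch.
        for (; first <= stop; first += 4)
        {
            T v1 = op(first[0], first[1]);
            T v2 = op(first[2], first[3]);
            T v3 = op(v1, v2);
            init = op(init, v3);
        }
    }

    // Remainder loop handles the last 0 to 3 elements, and short ranges.
    for (; first != last; ++first)
        init = op(init, *first);

    return init;
}
```

With the guard in place, a range of size 1 skips the stride-4 loop
entirely and is handled by the remainder loop, so `last - 4` is never
evaluated for it.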
