Amit Khandekar <amitdkhan...@gmail.com> writes:
> On Mon, 7 Sep 2020 at 11:23, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> BTW, poking at this further, it seems that the patch only really
>> works for gcc. clang accepts the -ftree-vectorize switch, but
>> looking at the generated asm shows that it does nothing useful.
>> Which is odd, because clang does do loop vectorization.
> Hmm, yeah that's unfortunate. My guess is that the compiler would do
> vectorization only if 'i' is a constant, which is not true for our
> case.

No, they claim to handle variable trip counts, per

https://llvm.org/docs/Vectorizers.html#loops-with-unknown-trip-count

I experimented with a few different ideas such as adding restrict
decoration to the pointers, and eventually found that what works is to
write the loop termination condition as "i2 < limit" rather than
"i2 <= limit". It took me a long time to think of trying that, because
it seemed ridiculously stupid. But it works.

			regards, tom lane
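
For illustration, here is a minimal sketch of the two loop shapes being
compared. The function and parameter names (accum_inclusive,
accum_exclusive, dig, digits, factor) are hypothetical and are not taken
from the patch; only the "i2 <= limit" versus "i2 < limit" difference
comes from the message above. The two functions compute the same result
when the exclusive limit is one more than the inclusive one. The message
does not say why clang treats the two forms differently, and this sketch
does not try to explain it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for a multiply-accumulate inner loop.
 * The names are assumptions made for this sketch only.
 */

/* Inclusive bound: touches indexes 0 .. last. */
void
accum_inclusive(int32_t *dig, const int16_t *digits, int32_t factor, int last)
{
	int			i2;

	for (i2 = 0; i2 <= last; i2++)
		dig[i2] += factor * digits[i2];
}

/* Exclusive bound: touches indexes 0 .. limit - 1. */
void
accum_exclusive(int32_t *dig, const int16_t *digits, int32_t factor, int limit)
{
	int			i2;

	for (i2 = 0; i2 < limit; i2++)
		dig[i2] += factor * digits[i2];
}
```

One way to compare them would be to compile with something like
"clang -O2 -S" and look for SIMD instructions in each function, or to add
-Rpass=loop-vectorize and -Rpass-missed=loop-vectorize to get clang's own
remarks. Results may vary with compiler version and target.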