On Tue, 8 Sep 2020 at 02:19, Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> I wrote:
> > I experimented with a few different ideas such as adding restrict
> > decoration to the pointers, and eventually found that what works
> > is to write the loop termination condition as "i2 < limit"
> > rather than "i2 <= limit". It took me a long time to think of
> > trying that, because it seemed ridiculously stupid. But it works.
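
To make the two loop forms concrete, here is a minimal sketch. The function names, argument types, and loop body are hypothetical stand-ins, not the actual numeric.c code; only the "<=" versus "<" termination condition comes from the discussion above. Whether a particular compiler vectorizes either form has to be checked with that compiler.

```c
#include <assert.h>

/* Hypothetical stand-in for the inner loop; names and types are illustrative. */

/* Inclusive bound: the form that reportedly was not vectorized. */
static void
accumulate_le(int *dig, const short *src, int factor, int limit)
{
    int i2;

    for (i2 = 0; i2 <= limit; i2++)
        dig[i2] += factor * src[i2];
}

/* Exclusive bound: same iterations when called with n = limit + 1. */
static void
accumulate_lt(int *dig, const short *src, int factor, int n)
{
    int i2;

    for (i2 = 0; i2 < n; i2++)
        dig[i2] += factor * src[i2];
}

/* Both forms must produce identical results for equivalent bounds. */
static void
check_same_result(void)
{
    int a[4] = {1, 2, 3, 4};
    int b[4] = {1, 2, 3, 4};
    const short s[4] = {5, 6, 7, 8};
    int i;

    accumulate_le(a, s, 2, 3);  /* 3 is the last valid index */
    accumulate_lt(b, s, 2, 4);  /* 4 is one past the last valid index */
    for (i = 0; i < 4; i++)
        assert(a[i] == b[i]);
}
```

The two functions do the same work; the only difference is how the bound is expressed, which is the point of the rewrite.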

Ah, OK. I checked the "Auto-Vectorization in LLVM" link that you shared. All of the examples use "< n" or "> n"; none of them use "<= n". It looks like an undocumented restriction.

> I've done more testing and confirmed that both gcc and clang can
> vectorize the improved loop on aarch64 as well as x86_64. (clang's
> results can be confusing because -ftree-vectorize doesn't seem to
> have any effect: its vectorizer is on by default. But if you use
> -fno-vectorize it'll go back to the old, slower code.)
>
> The only buildfarm effect I've noticed is that locust and
> prairiedog, which are using nearly the same ancient gcc version,
> complain
>
> c1: warning: -ftree-vectorize enables strict aliasing. -fno-strict-aliasing
> is ignored when Auto Vectorization is used.
>
> which is expected (they say the same for checksum.c), but then
> there are a bunch of
>
> warning: dereferencing type-punned pointer will break strict-aliasing rules
>
> which seems worrisome. (This sort of thing is the reason I'm
> hesitant to apply higher optimization levels across the board.)
> Both animals pass the regression tests anyway, but if any other
> compilers treat -ftree-vectorize as an excuse to apply stricter
> optimization assumptions, we could be in for trouble.
>
> I looked closer and saw that all of those warnings are about
> init_var(), and this change makes them go away:
>
> -#define init_var(v) MemSetAligned(v, 0, sizeof(NumericVar))
> +#define init_var(v) memset(v, 0, sizeof(NumericVar))
>
> I'm a little inclined to commit that as future-proofing. It's
> essentially reversing out a micro-optimization I made in d72f6c750.
> I doubt I had hard evidence that it made any noticeable difference;
> and even if it did back then, modern compilers probably prefer the
> memset approach.

Thanks. I must admit it did not occur to me that I could simply have installed clang on my Linux machine and tried compiling this file, or tested with some older gcc versions. I think I was using gcc 8.
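
For context on the init_var() change quoted above, here is a small sketch of the memset form applied to a hypothetical struct. DemoVar and the demo_* names are made up for illustration and do not match the real NumericVar layout. My understanding, which is an assumption and not something stated in this thread, is that MemSetAligned zeroes small aligned objects by storing through a long pointer, which would explain the type-punning warnings; plain memset has no such aliasing concern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct for illustration; not the real NumericVar. */
typedef struct DemoVar
{
    int     ndigits;
    int     weight;
    int     sign;
    int     dscale;
    short  *buf;
    short  *digits;
} DemoVar;

/* memset-based initializer, mirroring the shape of the proposed init_var(). */
#define demo_init_var(v) memset((v), 0, sizeof(DemoVar))

/* Returns 1 if every byte of *v is zero, else 0. */
static int
demo_is_zeroed(const DemoVar *v)
{
    const unsigned char *p = (const unsigned char *) v;
    size_t  i;

    for (i = 0; i < sizeof(DemoVar); i++)
    {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}

/* Small self-check: a dirtied struct is fully zeroed by the macro. */
static void
demo_check(void)
{
    DemoVar v;

    v.ndigits = 42;
    demo_init_var(&v);
    assert(demo_is_zeroed(&v));
}
```

Inspecting the bytes through unsigned char is permitted by the C aliasing rules, so the check itself does not reintroduce the problem.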

Do you know which gcc version gave these warnings?

--
Thanks,
-Amit Khandekar
Huawei Technologies