On Wed, Sep 14, 2022 at 11:31:34AM -0600, Sandra Loosemore wrote:
> GCC presently enables the loop vectorizer at lower optimization levels for
> OpenMP loops with the "simd" specifier than it does for loops without it.
> The "simd" specifier isn't defined to be purely an optimization hint to the
> compiler; it also has semantic effects like changing the privatization of
> the loop variable.  It seems reasonable to decouple the additional
> vectorization from those semantic effects and apply it also to work-sharing
> loops without the "simd" specifier at the same optimization levels.
>
> I've tested this patch on x86_64-linux-gnu-amdgcn, plain x86_64-linux-gnu,
> and aarch64-linux-gnu.  OK for mainline?

I don't understand this.  Isn't -ftree-loop-optimize on by default at all
optimization levels?  Why would this be a good idea for, say, -O0, -Og, -Os,
or -Oz?  People want the code to be debuggable with -O0 or -Og, and it
doesn't help if it is vectorized (not sure if the vectorizer gate is even
reached in that case, though); with -Os or -Oz they want small code, which
vectorized code typically is not.  And the vectorizer is on by default for
-O2 and higher, so is this just about -O1?

The reason for setting force_vectorize for the simd directive is that the
user asks for it explicitly.  For other constructs we can only guess at the
user's intent.  For the simd directive the user also guarantees that there
aren't inter-iteration dependencies that would prevent vectorization, but
that is expressed in loop->safelen and loop->simdlen.  For other loops we
don't have such guarantees, so the compiler just needs to analyze whether
they are vectorizable.  But doesn't GCC already do that for -O2/-O3 by
default?

As for loop->safelen, I think we might set it in some cases for other OpenMP
constructs, like distribute without dist_schedule or worksharing-loops with
a certain set of clauses (I think without a schedule clause).

	Jakub
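
For reference, the two loop forms being compared can be sketched as below.
This is only an illustration, not code from the patch: the function names
scale_add_for and scale_add_for_simd are hypothetical, and the comments
restate the points made in this thread rather than describing what any
particular GCC version emits.

```c
/* Worksharing loop without "simd": iterations are divided among the
   threads of the team.  Nothing here asks for vectorization, so whether
   each thread's chunk is vectorized depends on the vectorizer's normal
   gating and on the compiler's own dependence analysis.  */
void
scale_add_for (float *restrict y, const float *restrict x, float a, int n)
{
  #pragma omp parallel for
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}

/* Same loop with "simd": the user explicitly asks for vectorization and
   asserts that no inter-iteration dependencies prevent it.  This is the
   case for which, per the discussion above, force_vectorize is set.  */
void
scale_add_for_simd (float *restrict y, const float *restrict x, float a,
		    int n)
{
  #pragma omp parallel for simd
  for (int i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result; the difference is only in what the
user has told the compiler.  One way to compare them, assuming a GCC built
with OpenMP support, is to compile with -fopenmp at the optimization level
of interest and add -fopt-info-vec to see which loops the vectorizer
reports.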