http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49849
Summary: loop optimization prevents vectorization
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: vincenzo.innoce...@cern.ch

In the following example I suspect that some sort of loop merging at -O3 prevents the optimization of the second inner loop in "bar".

Compare

c++ -Wall -O2 -ftree-vectorize -ftree-vectorizer-verbose=7 -c vectHist.cpp -ffast-math
c++ -Wall -O3 -ftree-vectorize -ftree-vectorizer-verbose=7 -c vectHist.cpp -ffast-math

What I do not understand is that if (following the man page) I compare -O2 and -O3 with

gcc -c -Q -O3 --help=optimizers > /tmp/O3-opts
gcc -c -Q -O2 --help=optimizers > /tmp/O2-opts
diff /tmp/O2-opts /tmp/O3-opts | grep enabled
> -fgcse-after-reload [enabled]
> -finline-functions [enabled]
> -fipa-cp-clone [enabled]
> -fpredictive-commoning [enabled]
> -ftree-loop-distribute-patterns [enabled]
> -ftree-vectorize [enabled]
> -funswitch-loops [enabled]

and add these options to -O2 by hand, I still get

c++ -std=gnu++0x -DNDEBUG -Wall -O2 -ftree-vectorize -msse4 -fvisibility-inlines-hidden -ftree-vectorizer-verbose=2 --param vect-max-version-for-alias-checks=30 -funsafe-loop-optimizations -ftree-loop-distribution -ftree-loop-if-convert-stores -fipa-pta -Wunsafe-loop-optimizations -fgcse-sm -fgcse-las -c vectHist.cpp -ffast-math -funswitch-loops -ftree-loop-distribute-patterns -fpredictive-commoning -finline-functions -fipa-cp-clone -fgcse-after-reload

vectHist.cpp:17: note: not vectorized: data ref analysis failed x_5 = co[D.4986_4];
vectHist.cpp:16: note: vectorized 0 loops in function.
vectHist.cpp:35: note: not vectorized: data ref analysis failed D.4977_30 = hist[D.4976_29];
vectHist.cpp:33: note: LOOP VECTORIZED.
vectHist.cpp:31: note: not vectorized: data ref analysis failed D.4957_13 = co[D.4956_12];
vectHist.cpp:25: note: vectorized 1 loops in function.
while just changing -O2 to -O3 (which at this point should have little effect, since I added all the options by hand) does not vectorize:

c++ -std=gnu++0x -DNDEBUG -Wall -O3 -mavx -ftree-vectorize -msse4 -fvisibility-inlines-hidden -ftree-vectorizer-verbose=2 --param vect-max-version-for-alias-checks=30 -funsafe-loop-optimizations -ftree-loop-distribution -ftree-loop-if-convert-stores -fipa-pta -Wunsafe-loop-optimizations -fgcse-sm -fgcse-las -c vectHist.cpp -ffast-math -funswitch-loops -ftree-loop-distribute-patterns -fpredictive-commoning -finline-functions -fipa-cp-clone -fgcse-after-reload

vectHist.cpp:17: note: not vectorized: data ref analysis failed x_5 = co[D.5125_4];
vectHist.cpp:17: note: not vectorized: data ref analysis failed x_5 = co[D.5125_4];
vectHist.cpp:16: note: vectorized 0 loops in function.
vectHist.cpp:30: note: not vectorized: data ref analysis failed D.5096_55 = co[D.5095_54];
vectHist.cpp:30: note: not vectorized: data ref analysis failed D.5096_55 = co[D.5095_54];
vectHist.cpp:25: note: vectorized 0 loops in function.
Note how it does not report anything about the loops at lines 31, 33 and 35.

---------------------------
// a classroom example

#include<cmath>

const int N=1024;

float __attribute__ ((aligned(16))) a[N];
float __attribute__ ((aligned(16))) b[N];
float __attribute__ ((aligned(16))) c[N];
float __attribute__ ((aligned(16))) d[N];
int __attribute__ ((aligned(16))) k[N];
float __attribute__ ((aligned(16))) co[12];
float __attribute__ ((aligned(16))) hist[100];

// do not expect GCC to vectorize (yet)
void foo() {
  for (int i=0; i!=N; ++i) {
    float x = co[k[i]];
    float y = a[i]/std::sqrt(x*b[i]);
    ++hist[int(y)];
  }
}

// let's give it a hand: split the loop so that the "heavy duty one" vectorizes
void bar() {
  const int S=8;
  int loops = N/S;
  float x[S];
  float y[S];
  for (int j=0; j!=loops; ++j) {
    for (int i=0; i!=S; ++i)
      x[i] = co[k[j+i]];
    for (int i=0; i!=S; ++i) // this should vectorize
      y[i] = a[j+i]/std::sqrt(x[i]*b[j+i]);
    for (int i=0; i!=S; ++i)
      ++hist[int(y[i])];
  }
}
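
A side remark on the test case itself, separate from the vectorizer question: bar() indexes with j+i, so successive blocks overlap and only the first N/S+S-1 elements are read, which means bar() does not fill the same histogram as foo(). Assuming the intent was for each iteration of j to handle S consecutive elements, the index would be j*S+i. The sketch below shows that variant next to a scalar reference. The function names, the std::vector parameters (in place of the aligned globals) and the int histogram are assumptions made for illustration, not part of the original test case.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Block size, as in bar(). All names below are hypothetical.
constexpr std::size_t S = 8;

// Scalar reference: same arithmetic as foo().
// Assumes every y lands in [0, hist.size()).
void histogram_scalar(const std::vector<float>& a, const std::vector<float>& b,
                      const std::vector<int>& k, const std::vector<float>& co,
                      std::vector<int>& hist) {
  for (std::size_t i = 0; i != a.size(); ++i) {
    float x = co[k[i]];
    float y = a[i] / std::sqrt(x * b[i]);
    ++hist[static_cast<std::size_t>(y)];
  }
}

// Blocked variant of bar(), with the assumed j*S+i indexing so that
// the blocks tile the input instead of overlapping.
void histogram_blocked(const std::vector<float>& a, const std::vector<float>& b,
                       const std::vector<int>& k, const std::vector<float>& co,
                       std::vector<int>& hist) {
  const std::size_t n = a.size();
  assert(n % S == 0);  // this sketch handles whole blocks only
  float x[S];
  float y[S];
  for (std::size_t j = 0; j != n / S; ++j) {
    const std::size_t base = j * S;
    for (std::size_t i = 0; i != S; ++i)  // gather through k
      x[i] = co[k[base + i]];
    for (std::size_t i = 0; i != S; ++i)  // contiguous arithmetic, the vectorization candidate
      y[i] = a[base + i] / std::sqrt(x[i] * b[base + i]);
    for (std::size_t i = 0; i != S; ++i)  // scatter into the histogram
      ++hist[static_cast<std::size_t>(y[i])];
  }
}
```

Compiled without -ffast-math, both functions perform the same per-element operations, so they should fill identical histograms for the same input. Whether the middle loop of the blocked variant gets vectorized can be checked with the same -ftree-vectorizer-verbose flag used above.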