void Sub(short * __restrict src1row, short * __restrict src2row, int num_in_row)
{
  for(int i=num_in_row; i--;) {
    *src1row -= *src2row;
    ++src1row;
    ++src2row;
  }
}

In the test case above, GCC inserts several explicit conversions soon after
the gimple transformation stage and produces:

  D.2097 = *src1row;
  D.2098 = (short unsigned int) D.2097;
  D.2099 = *src2row;
  D.2100 = (short unsigned int) D.2099;
  D.2101 = D.2098 - D.2100;
  D.2102 = (short int) D.2101;

These conversions break vectorization, and GCC reports:

/* i686-unknown-linux-gnu-gcc -O3 -ftree-vectorize -ftree-vectorizer-verbose=5
   -march=nocona -fno-strict-aliasing -c test.cc */
......
test.cc:2: note: not vectorized: relevant stmt not supported: D.2430_11 = (short unsigned int) D.2429_10
test.cc:1: note: vectorized 0 loops in function.

-- 
           Summary: Unnecessary conversion from short to unsigned short
                    breaks vectorization
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: gangren at google dot com
 GCC build triplet: i686-unknown-linux-gnu-gcc
  GCC host triplet: i686-unknown-linux-gnu-gcc
GCC target triplet: i686-unknown-linux-gnu-gcc

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32309
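
As a rough illustration of why the summary calls the conversion unnecessary, the sketch below compares the subtraction as the C++ source expresses it (operands promoted to int, result converted back to short) with the shape seen in the GIMPLE dump (subtract as short unsigned int, then convert to short int). The helper names sub_via_int, sub_via_ushort, count_mismatches and check_edge_cases are hypothetical and not part of the original report. The sketch assumes a 16-bit short and relies on GCC's modulo behaviour when converting an out-of-range value to short (implementation-defined before C++20).

```cpp
#include <cassert>

static_assert(sizeof(short) == 2, "this sketch assumes a 16-bit short");

// Subtraction as written in the source: both operands are promoted to int,
// subtracted, and the result is converted back to short.
static short sub_via_int(short a, short b) {
    return static_cast<short>(static_cast<int>(a) - static_cast<int>(b));
}

// Subtraction in the shape of the GIMPLE dump: convert both operands to
// unsigned short, subtract modulo 2^16, convert the result back to short.
static short sub_via_ushort(short a, short b) {
    unsigned short ua = static_cast<unsigned short>(a);
    unsigned short ub = static_cast<unsigned short>(b);
    unsigned short diff = static_cast<unsigned short>(ua - ub);
    return static_cast<short>(diff);
}

// Counts disagreements between the two forms over a strided sample of
// (a, b) pairs. The stride keeps the run time short; it is not exhaustive.
static long count_mismatches() {
    long mismatches = 0;
    for (int a = -32768; a <= 32767; a += 251) {
        for (int b = -32768; b <= 32767; b += 251) {
            short sa = static_cast<short>(a);
            short sb = static_cast<short>(b);
            if (sub_via_int(sa, sb) != sub_via_ushort(sa, sb)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}

// Spot-checks pairs whose difference wraps around the short range.
static void check_edge_cases() {
    assert(sub_via_int(-32768, 1) == sub_via_ushort(-32768, 1));
    assert(sub_via_int(32767, -1) == sub_via_ushort(32767, -1));
    assert(sub_via_int(-32768, 32767) == sub_via_ushort(-32768, 32767));
}
```

Calling count_mismatches() and check_edge_cases() from a small main is enough to try this out. If the two forms agree on a given target, the detour through short unsigned int changes only the type of the intermediate value, not the bits stored back, which is consistent with the report's view that a signed short vector subtract would be sufficient.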