https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355
--- Comment #9 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #7)
> The test fails when setToIdentityBAD's index var is unsigned int. It passes
> when using unsigned long long, unsigned long, unsigned short and unsigned
> char. When using unsigned long long/unsigned long, we do no vectorize the

With unsigned {long ,}long the loop fails to vectorize due to cost modeling:

  missed: cost model: the vector iteration cost = 2 divided by the scalar iteration cost = 1 is greater or equal to the vectorization factor = 2.
  missed: not vectorized: vectorization not profitable.

Vectorization can be forced with -fno-vect-cost-model.

> loop. We vectorize the loop when using unsigned int/short/char. The
> vectorized code is a little strange, in that the smaller the integer type we
> use for the index var, the more code we generate.
>
> The vectorized code for unsigned char is truly huge! ...although it does
> seem to work correctly. I'm attaching the "unsigned char i" code gen for
> setToIdentityBAD for people to examine. Even though it gives "correct"
> results, it can't really be the code we want to generate, correct???

It's due to aggressive unrolling: there is one early check that the loop bound
is between 16 and 255, and then cunroll completely unrolls the loop for each
multiple of 16 (15 loops in total). A compact version of the code can be
generated with -fdisable-tree-cunroll.
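
To make the two numbers above concrete, here is a rough sketch in C of the arithmetic involved. It is not GCC's implementation: both helper names are hypothetical, the profitability test only mirrors the wording of the "missed:" message (integer division is an assumption), and the 16-lane figure assumes one byte per element in a 128-bit vector.

```c
#include <stdbool.h>

/* Hypothetical helper, NOT GCC's code: mirrors the comparison quoted in the
   dump message.  The loop is treated as unprofitable when
   vec_iter_cost / scalar_iter_cost >= vf (integer division assumed). */
static bool vect_looks_profitable(unsigned vec_iter_cost,
                                  unsigned scalar_iter_cost,
                                  unsigned vf)
{
    if (scalar_iter_cost == 0 || vf == 0)
        return false;   /* degenerate inputs: report "not profitable" */
    return (vec_iter_cost / scalar_iter_cost) < vf;
}

/* Hypothetical helper: number of full vector-sized chunks that fit in a
   loop bound.  With an unsigned char index the bound is at most 255, so
   with 16 lanes (an assumption) there are at most 15 such chunks. */
static unsigned full_vector_chunks(unsigned bound, unsigned lanes)
{
    return lanes ? bound / lanes : 0;
}
```

With the costs from the dump, vect_looks_profitable(2, 1, 2) is false, which matches "vectorization not profitable". full_vector_chunks(255, 16) is 15, which matches the 15 unrolled copies mentioned for the unsigned char case.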