https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65930
Bug ID: 65930
Summary: Reduction with sign-change not handled
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Blocks: 53947
Target Milestone: ---

Neither

    int foo (unsigned int *x)
    {
      int sum = 0;
      for (int i = 0; i < 4; ++i)
        sum += x[i*4 + 0] + x[i*4 + 1] + x[i*4 + 2] + x[i*4 + 3];
      return sum;
    }

nor

    int bar (unsigned int *x)
    {
      int sum = 0;
      for (int i = 0; i < 16; ++i)
        sum += x[i];
      return sum;
    }

are currently vectorized because

    t.c:4:3: note: reduction: not commutative/associative: sum_27 = (int) _26;

though the sign of 'sum' vs the sign of 'x' doesn't really matter here. It
works for

    unsigned baz (int *x)
    {
      unsigned int sum = 0;
      for (int i = 0; i < 16; ++i)
        sum += x[i];
      return sum;
    }

because C promotes x[i] instead of sum in this case.

For vectorization we might want to change the add(s) to int - of course not
strictly valid because of undefined overflow issues.

This kind of loop appears in a hot area of x264.

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
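As a rough illustration of the remark that the sign of 'sum' versus the sign of 'x' does not really matter, here is a minimal sketch comparing bar() from the report with a variant that accumulates in unsigned int and converts to int once at the end. The names bar_unsigned_acc, results_match and self_check are hypothetical and not part of the report. The sketch assumes a 32-bit int and that out-of-range unsigned-to-int conversion wraps modulo 2^N, which GCC documents as its implementation-defined behavior. It is not a description of what the vectorizer does or should do.

```c
#include <assert.h>
#include <limits.h>

/* bar() as given in the report: signed accumulator, unsigned elements.
   Each iteration effectively computes (int) ((unsigned int) sum + x[i]),
   which is where the "sum_27 = (int) _26" conversion comes from.  */
int
bar (unsigned int *x)
{
  int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += x[i];
  return sum;
}

/* Hypothetical variant (not from the report): keep the accumulator
   unsigned so the loop body is a plain unsigned add, and convert to
   int only once after the loop.  */
int
bar_unsigned_acc (unsigned int *x)
{
  unsigned int sum = 0;
  for (int i = 0; i < 16; ++i)
    sum += x[i];
  return (int) sum;
}

/* Hypothetical helper: returns 1 if both forms agree on the input.  */
int
results_match (unsigned int *x)
{
  return bar (x) == bar_unsigned_acc (x);
}

/* Hypothetical self-check on a small input and on one whose
   unsigned sum wraps around.  */
void
self_check (void)
{
  unsigned int ones[16], big[16];
  for (int i = 0; i < 16; ++i)
    {
      ones[i] = 1u;
      big[i] = UINT_MAX;
    }
  assert (results_match (ones));
  assert (results_match (big));
}
```

Under the stated assumptions the two functions return the same value for every input, because the additions happen in unsigned arithmetic in both cases and only the point of conversion back to int differs. Doing the adds in int instead, as the report suggests, would not have this property in general, since signed overflow is undefined.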