https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94792
Bug ID: 94792
Summary: Missed SLP optimization in pr65930-2.c variation
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: iii at linux dot ibm.com
Target Milestone: ---

gcc commit cf3a909cf455.

Consider the following variation of pr65930-2.c:

$ cat pr65930-2b.c
#include "tree-vect.h"

int __attribute__((noipa))
bar (unsigned int *x, int n)
{
  unsigned int sum = 4;
  x = __builtin_assume_aligned (x, __BIGGEST_ALIGNMENT__);
  for (int i = 0; i < n; ++i)
    sum += x[i*4+0]+ x[i*4 + 1] + x[i*4 + 2] + x[i*4 + 3];
  return sum;
}

int
main ()
{
  static int a[16] __attribute__((aligned(__BIGGEST_ALIGNMENT__)))
    = { 1, 3, 5, 8, 9, 10, 17, 18, 23, 29, 30, 55, 42, 2, 3, 1 };
  check_vect ();
  if (bar (a, 4) != 260)
    abort ();
  return 0;
}

This differs from pr65930-2.c only in that the type of sum is unsigned int,
which should mean one cast less. And yet:

$ gcc pr65930-2b.c -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -fdiagnostics-urls=never -msse2 -ftree-vectorize -fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details -lm -o ./pr65930-2.exe ; grep SLP pr65930-2b.c.161t.vect | wc -l
0

whereas for the original version:

$ gcc pr65930-2.c -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -fdiagnostics-urls=never -msse2 -ftree-vectorize -fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details -lm -o ./pr65930-2.exe ; grep SLP pr65930-2.c.161t.vect | wc -l
33

The assembly generated for the variation is also noticeably larger and uses
regular adds for at least part of the data.
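
For readers who want to compare the two reductions at source level, here is a minimal standalone sketch. It assumes the original pr65930-2.c declares the accumulator as int (the report only says the type differs), and the names sum_signed_acc, sum_unsigned_acc, sample and check_variants_agree are hypothetical, not taken from the testsuite. It drops tree-vect.h and the alignment hint, so it shows only that both variants compute the same value for the test data; it says nothing about what the vectorizer does with either loop.

```c
#include <assert.h>

/* Assumed shape of the original pr65930-2.c reduction: signed accumulator.
   Each unsigned sum of four elements is converted back to int on store.  */
int sum_signed_acc(const unsigned int *x, int n)
{
    int sum = 4;
    for (int i = 0; i < n; ++i)
        sum += x[i*4 + 0] + x[i*4 + 1] + x[i*4 + 2] + x[i*4 + 3];
    return sum;
}

/* Variation from this report: unsigned accumulator, so the loop body
   needs no conversion; only the return value is converted to int.  */
int sum_unsigned_acc(const unsigned int *x, int n)
{
    unsigned int sum = 4;
    for (int i = 0; i < n; ++i)
        sum += x[i*4 + 0] + x[i*4 + 1] + x[i*4 + 2] + x[i*4 + 3];
    return sum;
}

/* Same 16 values as the array a[] in the report's main().  */
const unsigned int sample[16] = {
    1, 3, 5, 8, 9, 10, 17, 18, 23, 29, 30, 55, 42, 2, 3, 1
};

/* Both variants must return the value the report's test expects.  */
void check_variants_agree(void)
{
    assert(sum_signed_acc(sample, 4) == 260);
    assert(sum_unsigned_acc(sample, 4) == 260);
}
```

Calling check_variants_agree () from a small main () and building with the report's command line plus -fdump-tree-vect-details should make it possible to compare the two functions in a single vect dump, although whether the SLP difference reproduces outside the testsuite harness is not something this sketch guarantees.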