https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117112
Bug ID: 117112
Summary: missed vectorization opportunity: "not vectorized: no grouped stores in basic block"
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: 652023330028 at smail dot nju.edu.cn
Target Milestone: ---

Hello, we noticed that there seems to be a missing vectorization for the code below (at line 10).

reduced code:
https://godbolt.org/z/K9WhWrjno

    int data[20];

    void f(int * __restrict arr1, int * __restrict arr2) {
        for(int i = 0; i < 20; i++){
            arr1[i] = 1;
        }
        for(int i = 0; i < 20; i++){
            arr2[i] = 2 * arr1[i];
            int k = arr2[i] % arr1[i];
            data[i] = data[i - k] + 1;  // line 10, can be vectorized
        }
    }

GCC -O3 -fno-vect-cost-model:

    f(int*, int*):
            pcmpeqd xmm0, xmm0
            mov     r8, rsi
            xor     ecx, ecx
            psrld   xmm0, 31
            movups  XMMWORD PTR [rdi], xmm0
            movups  XMMWORD PTR [rdi+16], xmm0
            movups  XMMWORD PTR [rdi+32], xmm0
            movups  XMMWORD PTR [rdi+48], xmm0
            movups  XMMWORD PTR [rdi+64], xmm0
    .L2:
            mov     esi, DWORD PTR [rdi+rcx*4]
            lea     eax, [rsi+rsi]
            cdq
            mov     DWORD PTR [r8+rcx*4], eax
            idiv    esi
            mov     eax, ecx
            sub     eax, edx
            cdqe
            mov     eax, DWORD PTR data[0+rax*4]
            add     eax, 1
            mov     DWORD PTR data[0+rcx*4], eax
            add     rcx, 1
            cmp     rcx, 20
            jne     .L2
            ret

missed:

    <source>:12:1: missed: not consecutive access _7 = *_6;
    <source>:12:1: missed: not consecutive access *_8 = _9;
    <source>:12:1: missed: not consecutive access _11 = data[_10];
    <source>:12:1: missed: not consecutive access data[i_31] = _12;
    <source>:12:1: missed: not vectorized: no grouped stores in basic block.

Thank you very much for your time and effort! We look forward to hearing from you.
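
For reference, a small sketch of why line 10 looks vectorizable: the first loop sets every arr1[i] to 1, so arr2[i] is 2 and k = 2 % 1 = 0 on every iteration, which makes the statement equivalent to data[i] = data[i] + 1. The names f_original, f_simplified and versions_agree are hypothetical helpers introduced only for this illustration, as is the assert on k; they are not part of the reduced test case and say nothing about how GCC would have to prove the equivalence.

```c
#include <assert.h>
#include <string.h>

#define N 20

int data[N];

/* The function from the report, with an assertion added to check
   that k is 0 on every iteration (assumption being illustrated). */
static void f_original(int *__restrict arr1, int *__restrict arr2)
{
    for (int i = 0; i < N; i++) {
        arr1[i] = 1;
    }
    for (int i = 0; i < N; i++) {
        arr2[i] = 2 * arr1[i];
        int k = arr2[i] % arr1[i];   /* 2 % 1 */
        assert(k == 0);
        data[i] = data[i - k] + 1;
    }
}

/* Hand-simplified form, valid only because k == 0 above:
   the last statement becomes a unit-stride update of data. */
static void f_simplified(int *__restrict arr1, int *__restrict arr2)
{
    for (int i = 0; i < N; i++) {
        arr1[i] = 1;
        arr2[i] = 2;
    }
    for (int i = 0; i < N; i++) {
        data[i] = data[i] + 1;
    }
}

/* Returns 1 if both versions leave data, arr1 and arr2 identical
   when started from the same arbitrary contents of data. */
int versions_agree(void)
{
    int a1[N], a2[N], b1[N], b2[N];
    int saved[N], result[N];

    for (int i = 0; i < N; i++) {
        saved[i] = 3 * i - 7;        /* arbitrary initial values */
    }

    memcpy(data, saved, sizeof data);
    f_original(a1, a2);
    memcpy(result, data, sizeof data);

    memcpy(data, saved, sizeof data);
    f_simplified(b1, b2);

    return memcmp(result, data, sizeof data) == 0
        && memcmp(a1, b1, sizeof a1) == 0
        && memcmp(a2, b2, sizeof a2) == 0;
}
```

Calling versions_agree() from a test driver should return 1. The simplified loop has consecutive loads and stores on data, which is the shape the "not consecutive access" notes above say is missing in the original.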