https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100258
Bug ID: 100258
Summary: constant store pulled out of the loop causes an extra
memory load
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-linux-gnu
Take:
void f(float *x, int t)
{
  for(int i = 0; i < t; i++)
    x[i*3] = 1.0;
}
Right now GCC produces the following for it at -O2:
testl %esi, %esi
jle .L5
leal -1(%rsi), %eax
leaq (%rax,%rax,2), %rax
vmovss .LC0(%rip), %xmm0
leaq 12(%rdi,%rax,4), %rax
.p2align 4,,10
.p2align 3
.L3:
vmovss %xmm0, (%rdi)
addq $12, %rdi
cmpq %rax, %rdi
jne .L3
.L5:
ret
----- CUT ----
If we don't have a loop, e.g. just a store to *x, we get:
movl $0x3f800000, (%rdi)
Which is far more efficient: the constant is an immediate in the store itself.
For the loop case we just need a loop around that store, without the load of
.LC0.
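For reference, here is a minimal sketch of the equivalence being asked for. It assumes float is IEEE-754 binary32, as on x86_64, so that 0x3f800000 is the encoding of 1.0f. The names f_imm and stores_match are hypothetical and only for illustration. Whether GCC actually emits an immediate store for f_imm has not been checked here; the sketch only shows that the two forms write the same bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Loop from the report: store 1.0f to every third element. */
void f(float *x, int t)
{
    for (int i = 0; i < t; i++)
        x[i * 3] = 1.0f;
}

/* Hypothetical variant: write the same value as a 32-bit integer
   bit pattern, which is what the desired movl immediate store does. */
void f_imm(float *x, int t)
{
    const uint32_t one_bits = 0x3f800000u; /* binary32 encoding of 1.0f (assumed) */
    assert(sizeof(float) == sizeof one_bits);
    for (int i = 0; i < t; i++)
        memcpy(&x[i * 3], &one_bits, sizeof one_bits);
}

/* Returns 1 if both variants leave identical bytes behind for
   0 <= t <= 8 strided stores, 0 otherwise. */
int stores_match(int t)
{
    float a[3 * 8] = {0}, b[3 * 8] = {0};
    if (t < 0 || t > 8)
        return 0;
    f(a, t);
    f_imm(b, t);
    return memcmp(a, b, sizeof a) == 0;
}
```

Since both functions produce the same memory contents, rewriting the loop body as an immediate store would not change behavior; it would only drop the constant pool load before the loop.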