Testcase (compile at -O2 -maltivec -ftree-vectorize):
int a[16*100];
int e;
void f(void)
{
  int i;
  int e1;
  e1 = e;
  for (i = 0; i < 16*100; i++)
    e1 += a[i];
  e = e1;
}
----------- cut ------
Currently GCC generates the following to store the result to e:
stvewx v1,0,r2
lis r2,ha16(_e)
lwz r0,-20(r1) <---- LHS hazard (load hits the stvewx store)
stw r0,lo16(_e)(r2)
However, the elements of v1 will all be the same, so GCC could store to e directly from the vector register instead:
lis r2,ha16(_e)
addi r2,r2,lo16(_e)
stvewx v1,0,r2
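
To illustrate why the direct store is safe, here is a rough scalar C model of the idea. It is only a sketch: vec4i, reduce_splat and store_element are hypothetical names invented for this example, and it assumes the reduction epilogue leaves the total in every element of the vector, as described above. store_element models the fact that stvewx picks the element to store from the word offset of the address within its 16-byte block.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* assumed: four 32-bit ints per 128-bit AltiVec register */

/* Hypothetical stand-in for a vector register such as v1. */
typedef struct { int lane[LANES]; } vec4i;

/* Sum a[0..n) into per-lane partial sums, then put the total in every lane. */
static vec4i reduce_splat(const int *a, size_t n)
{
    vec4i acc = {{0, 0, 0, 0}};
    vec4i out;
    int total = 0;

    assert(n % LANES == 0);  /* the model handles whole vectors only */

    for (size_t i = 0; i < n; i += LANES)
        for (int l = 0; l < LANES; l++)
            acc.lane[l] += a[i + l];

    for (int l = 0; l < LANES; l++)
        total += acc.lane[l];
    for (int l = 0; l < LANES; l++)
        out.lane[l] = total;
    return out;
}

/* Value a stvewx-style store would write for a given byte address:
   the element is selected by bits 2..3 of the address. */
static int store_element(vec4i v, size_t addr)
{
    return v.lane[(addr >> 2) & (LANES - 1)];
}
```

Since every lane holds the same total, store_element returns that total no matter how the address of e happens to be aligned within a 16-byte block, so no round trip through the stack is needed.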
--
Summary: Reduction into a global variable causes a Load Hit Store
Hazard (for the Cell)
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pinskia at gcc dot gnu dot org
GCC target triplet: powerpc*-*-*
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32826