https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115841
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                       |Added
----------------------------------------------------------------------------
   Last reconfirmed|                              |2024-07-16
             Status|UNCONFIRMED                   |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Reduced testcase, fails with

  -Ofast -mavx512vl -mtune=znver4 --param vect-partial-vector-usage=1 -fcommon

-fcommon is the key, mirroring the Fortran original: as a common symbol,
xl can't be re-aligned.

For an arch with "proper" costs (aligned loads cheaper than unaligned ones)
one would swap which array is static and which is not, but even with
-fno-vect-cost-model, which should make the three loads aligned here, it
doesn't reproduce.

unsigned char xl[192];
static unsigned char A170[192*3];
void jerate (unsigned char *, unsigned char *);
float foo (unsigned n)
{
  jerate (xl, A170);
  unsigned i = 32;
  int kr = 1;
  float sfn11s = 0.f;
  float sfn12s = 0.f;
  do
    {
      int krm1 = kr - 1;
      long j = krm1;
      float a = (*(float(*)[n])A170)[j];
      float b = (*(float(*)[n])xl)[j];
      float c = a * b;
      float d = c * 6.93149983882904052734375e-1f;
      float e = (*(float(*)[n])A170)[j+48];
      float f = (*(float(*)[n])A170)[j+96];
      float g = d * e;
      sfn11s = sfn11s + g;
      float h = f * d;
      sfn12s = sfn12s + h;
      kr++;
    }
  while (--i != 0);
  float tem = sfn11s + sfn12s;
  return tem;
}
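
For reference, the access pattern of the loop can be written out as a plain
scalar function: 32 iterations reading float j from xl and floats j, j+48 and
j+96 from A170. The sketch below is not part of the testcase. The names
load_float and foo_ref are hypothetical, the buffers are passed as parameters
instead of being globals, and the call to jerate is left out. It may be useful
for comparing against the vectorized foo, with a tolerance, since -Ofast is
allowed to reassociate the reductions.

```c
#include <string.h>

/* Hypothetical helper, not in the testcase: read the idx-th float from a
   byte buffer via memcpy instead of the VLA-pointer cast.  */
static float
load_float (const unsigned char *base, long idx)
{
  float v;
  memcpy (&v, base + idx * sizeof (float), sizeof v);
  return v;
}

/* Scalar equivalent of the do-while loop in foo.  The arrays are passed
   in explicitly (an assumption made for this sketch) and jerate is not
   called.  xl must hold at least 32 floats, a170 at least 128.  */
float
foo_ref (const unsigned char *xl, const unsigned char *a170)
{
  float sfn11s = 0.f;
  float sfn12s = 0.f;
  for (long j = 0; j < 32; j++)
    {
      float a = load_float (a170, j);
      float b = load_float (xl, j);
      float d = a * b * 6.93149983882904052734375e-1f;
      sfn11s += d * load_float (a170, j + 48);
      sfn12s += load_float (a170, j + 96) * d;
    }
  return sfn11s + sfn12s;
}
```

The highest index touched is float 127 of A170 (bytes 508..511 of 576) and
float 31 of xl (bytes 124..127 of 192), so all accesses stay in bounds.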