https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99929
Bug ID: 99929
Summary: SVE: Wrong code at -O2 -ftree-vectorize
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: acoplan at gcc dot gnu.org
Target Milestone: ---

AArch64 GCC miscompiles the following testcase:

#include <arm_sve.h>

static void e(unsigned long long *g, int p2) { *g ^= p2; }
static unsigned long long b;
static int f[1][1][1][1];
static long l[23][2];
static short m[23];

int main() {
  for (unsigned i = 0; i < 23; ++i)
    for (unsigned j = 0; j < 2; ++j)
      l[i][j] = m[i] = 4;
  if (svaddv(svptrue_pat_b32(SV_VL1), svdup_u32(1)) != 1)
    __builtin_abort();
  for (unsigned i = 0; i < 3; ++i)
    e(&b, m[i]);
}

with -march=armv8.2-a+sve -O2 -ftree-vectorize.

At -O2 (without -ftree-vectorize), we do the reduction with:

        uaddv   d0, p0, z0.s

where the predicate is generated by:

        ptrue   p0.b, vl1

which gives the expected result. With -ftree-vectorize, we do the reduction with:

        uaddv   d0, p1, z0.s

where the predicate is generated by:

        ptrue   p1.h, all

which does not give the expected result.
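
For illustration, the difference between the two predicates can be modelled in portable C. This is only a sketch of the architectural behaviour, not GCC code and not the SVE intrinsics: the names model_ptrue, model_uaddv_s, pred_t and PAT_ALL are hypothetical, and a 128-bit vector length is assumed (VL_BYTES). An SVE predicate has one bit per vector byte, and a .s operation looks only at the lowest bit of each 4-bit group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128-bit vectors. Real SVE hardware may implement any
   multiple of 128 bits up to 2048. */
#define VL_BYTES 16
/* Hypothetical marker standing in for the ALL pattern in this model. */
#define PAT_ALL ((size_t)-1)

/* An SVE predicate holds one bit per vector byte. */
typedef struct { uint8_t bit[VL_BYTES]; } pred_t;

/* Model of "ptrue pN.<T>, <pattern>". esize is the element size in
   bytes (1 = .b, 2 = .h, 4 = .s, 8 = .d); count is the number of
   leading elements to activate, or PAT_ALL. Only the lowest predicate
   bit of each active element is set. */
static pred_t model_ptrue(size_t esize, size_t count)
{
    assert(esize == 1 || esize == 2 || esize == 4 || esize == 8);
    pred_t p = {{0}};
    size_t nelem = VL_BYTES / esize;
    if (count == PAT_ALL)
        count = nelem;
    else if (count > nelem)
        count = 0;  /* a fixed VLn pattern that does not fit selects nothing */
    for (size_t i = 0; i < count; ++i)
        p.bit[i * esize] = 1;
    return p;
}

/* Model of "uaddv dN, pg, zM.s": 32-bit lane i is active when the
   predicate bit for its lowest byte (bit 4*i) is set. */
static uint64_t model_uaddv_s(const pred_t *pg, const uint32_t *z)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < VL_BYTES / 4; ++i)
        if (pg->bit[4 * i])
            sum += z[i];
    return sum;
}
```

In this model, model_ptrue(1, 1) (ptrue p0.b, vl1) sets only bit 0, so summing a vector of ones over .s lanes gives 1. model_ptrue(2, PAT_ALL) (ptrue p1.h, all) sets every even bit, which includes bits 0, 4, 8 and 12, so every .s lane is active and the sum is the number of 32-bit lanes (4 under the 128-bit assumption) rather than 1.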