https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94308
Bug ID: 94308 Summary: [10 Regression] ICE in final_scan_insn_1 with vzeroupper Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jakub at gcc dot gnu.org Target Milestone: --- /* { dg-do compile } */ /* { dg-options "-O2 -mfpmath=sse -mavx2 -mfma" } */ #include <x86intrin.h> void foo (float *x, const float *y, const float *z, unsigned int w) { unsigned int a; const unsigned int b = w / 8; const float *c = y; const float *d = z; __m256 e = _mm256_setzero_ps (); __m256 f, g; for (a = 0; a < b; a++) { f = _mm256_loadu_ps (c); g = _mm256_loadu_ps (d); c += 8; d += 8; e = _mm256_fmadd_ps (f, g, e); } __attribute__ ((aligned (32))) float h[8]; _mm256_storeu_ps (h, e); _mm256_zeroupper (); float i = h[0] + h[1] + h[2] + h[3] + h[4] + h[5] + h[6] + h[7]; for (a = b * 8; a < w; a++) i += (*c++) * (*d++); *x = i; } ICEs on i686-linux or x86_64-linux with -m32. The problem is that the vzeroupper pass in this case fills in sets for all xmm0..xmm7 regs, but doesn't force rerecognition of the insn, so it is still considered avx_vzeroupper_1, but the splitter doesn't trigger for it.