https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61810
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #6)
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577192.html

On current trunk x86_64 that gets

FAIL: gcc.target/i386/extract-insert-combining.c scan-assembler-times (?:vmovd|movd)[ \\\\t]+[^{\\n]*%xmm[0-9] 3
FAIL: gcc.target/i386/extract-insert-combining.c scan-assembler-times (?:vpinsrd|pinsrd)[ \\\\t]+[^{\\n]*%xmm[0-9] 1
FAIL: gcc.target/i386/pr104441-1b.c execution test
FAIL: gcc.target/i386/pr98335.c scan-assembler movzbl
FAIL: gcc.target/i386/pr98335.c scan-assembler-not movb
FAIL: gnat.dg/sso8.adb execution test
FAIL: libgomp.c/loop-19.c execution test

FAILs can be reproduced in an unpatched tree with specifying
-fdisable-rtl-init-regs

Assembly difference for gcc.target/i386/pr104441-1b.c is (besides RA):

-       vpxor   %xmm1, %xmm1, %xmm1
+       vpinsrd $1, (%rax,%r10), %xmm5, %xmm1
+       vpinsrd $1, (%rdx,%r9), %xmm4, %xmm3
        vmovd   (%rax), %xmm0
-       vpxor   %xmm2, %xmm2, %xmm2
        addl    $4, %ecx
-       vpinsrd $1, (%rax,%r10), %xmm1, %xmm1
-       vpinsrd $1, (%rdx,%r9), %xmm2, %xmm2

adding initialization in compute4x_m_sad_avx2_intrin of reg 109 at in block 4
for insn 33.
adding initialization in compute4x_m_sad_avx2_intrin of reg 99 at in block 4
for insn 48.

where we have for example

-(insn 97 31 98 4 (clobber (reg/v:V2DI 109 [ src23 ]))
"/home/rguenther/obj-gcc4-g/gcc/include/smmintrin.h":408:20 -1
-     (nil))
-(insn 98 97 33 4 (set (reg/v:V2DI 109 [ src23 ])
-        (const_vector:V2DI [
-                (const_int 0 [0]) repeated x2
-            ])) "/home/rguenther/obj-gcc4-g/gcc/include/smmintrin.h":408:20 -1
-     (nil))
-(insn 33 98 36 4 (set (reg:V4SI 138 [ src23 ])
+(insn 33 31 36 4 (set (reg:V4SI 138 [ src23 ])
         (vec_merge:V4SI (vec_duplicate:V4SI (reg:SI 137 [ MEM[(int32_t *)src_62 + _41 * 1] ]))
             (subreg:V4SI (reg/v:V2DI 109 [ src23 ]) 0)
             (const_int 2 [0x2]))) "/home/rguenther/obj-gcc4-g/gcc/include/smmintrin.h":408:20 6925 {sse4_1_pinsrd}

where this produces { undef, MEM, undef, undef } without init-regs

But it looks like the testcase is broken:

__attribute__((always_inline, target("avx2"))) static __m256i
load8bit_4x4_avx2(const uint8_t *const src, const uint32_t stride)
{
  __m128i src01, src23;
  src01 = _mm_cvtsi32_si128(*(int32_t*)(src + 0 * stride));
  src23 = _mm_insert_epi32(src23, *(int32_t *)(src + 3 * stride), 1);
  return _mm256_setr_m128i(src01, src23);
}

it seems to expect that src23 is zero before inserting the data?
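
To make the dependence on the incoming value of src23 concrete, here is a
minimal portable sketch in plain C. It is only a scalar model under stated
assumptions, not the testcase and not the intrinsic: vec4i, insert_lane and
load_row_zeroed are hypothetical names, with insert_lane standing in for
_mm_insert_epi32 and the zero initializer standing in for what init-regs
happened to supply. Only lane 1 is written, so the other three lanes come
from whatever the input vector held.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a 128-bit vector with four 32-bit lanes. */
typedef struct { int32_t lane[4]; } vec4i;

/* Stand-in for _mm_insert_epi32: returns a copy of v with only lane idx
   overwritten; every other lane is passed through from the input. */
static vec4i insert_lane(vec4i v, int32_t value, int idx)
{
    assert(idx >= 0 && idx < 4);
    v.lane[idx] = value;
    return v;
}

/* Variant of the src23 load that zeroes the vector explicitly before the
   insert, instead of relying on an uninitialized variable being zero. */
static vec4i load_row_zeroed(const uint8_t *src, uint32_t stride)
{
    vec4i src23 = { { 0, 0, 0, 0 } };   /* explicit zero initialization */
    int32_t word;

    /* memcpy sidesteps the alignment and aliasing questions of the
       *(int32_t *) cast used in the testcase. */
    memcpy(&word, src + 3 * stride, sizeof word);
    return insert_lane(src23, word, 1);
}
```

In the intrinsic version the analogous change would presumably be to
initialize src23 with _mm_setzero_si128() before the _mm_insert_epi32 call;
whether that matches the intent of the testcase is an assumption.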