https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80481
Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.0

--- Comment #10 from Uroš Bizjak <ubizjak at gmail dot com> ---
Actually, the issue is not fixed. As mentioned in the description, the issue is
best visible without -funroll-loops.

(GCC) 10.0.0 20190821 (experimental) compiles (-Ofast -fopenmp -march=knl) to:

.L31:
        leaq    (%rdx), %rsi
        negq    %rsi
        vpermps (%r9,%rsi), %zmm8, %zmm0
        vmovaps %zmm0, %zmm1
        vmaxps  (%r11,%rdx), %zmm3, %zmm0
        vfnmadd132ps    (%r14,%rdx), %zmm7, %zmm1
        vmaxps  %zmm1, %zmm0, %zmm0
        vmovups %zmm0, 0(%r13,%rdx)
        leaq    64(%rdx), %rdx
        cmpq    %r8, %rdx
        jne     .L31

As seen in the detailed dump,

#(insn:TI 856 852 1743 71 (set (reg:V16SF 20 xmm0 [orig:885 vect__72.36 ] [885])
#        (unspec:V16SF [
#                (mem:V16SF (plus:DI (reg/f:DI 37 r9 [orig:198 vectp.34 ] [198])
#                        (reg:DI 4 si [883])) [3 MEM[base: vectp.34_256, index: _1006, offset: 0B]+0 S64 A32])
#                (reg:V16SI 44 xmm8 [919])
#            ] UNSPEC_VPERMVAR)) "/hdd/uros/git/gcc/gcc/testsuite/g++.dg/pr80481.C":59:28 4754 {avx512f_permvarv16sf}
#     (expr_list:REG_DEAD (reg:DI 4 si [883])
#        (nil)))
        vpermps (%r9,%rsi), %zmm8, %zmm0        # 856   [c=68 l=7]  avx512f_permvarv16sf
#(insn:TI 1743 856 860 71 (set (reg:V16SF 21 xmm1 [orig:885 vect__72.36 ] [885])
#        (reg:V16SF 20 xmm0 [orig:885 vect__72.36 ] [885])) "/hdd/uros/git/gcc/gcc/testsuite/g++.dg/pr80481.C":60:20 1255 {movv16sf_internal}
#     (expr_list:REG_DEAD (reg:V16SF 20 xmm0 [orig:885 vect__72.36 ] [885])
#        (nil)))
        vmovaps %zmm0, %zmm1    # 1743  [c=4 l=6]  movv16sf_internal/2
#(insn 860 1743 857 71 (set (reg:V16SF 20 xmm0 [orig:888 vect__13.45 ] [888])
#        (smax:V16SF (reg:V16SF 23 xmm3 [890])
#            (mem:V16SF (plus:DI (reg/f:DI 39 r11 [orig:187 vectp.43 ] [187])
#                    (reg:DI 1 dx [orig:478 ivtmp.111 ] [478])) [3 MEM[base: vectp.43_238, index: ivtmp.111_997, offset: 0B]+0 S64 A32]))) "/hdd/uros/git/gcc/gcc/testsuite/g++.dg/pr80481.C":61:15 1631 {*smaxv16sf3}
#     (nil))
        vmaxps  (%r11,%rdx), %zmm3, %zmm0       # 860   [c=68 l=6]  *smaxv16sf3/1

%zmm0 is killed in (insn 860) and thus dead in (insn 1743), so the copy in
(insn 1743) is unnecessary. (insn 856) could simply use %zmm1 as its
destination.
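
The liveness argument can be checked mechanically. Below is a minimal sketch in Python, not GCC's register allocator or any of its passes. It models only the vector registers of the loop body, ignores memory operands and address registers, and treats the body as straight-line code. The `Insn` class and the helpers `dead_after` and `removable_moves` are hypothetical names introduced for illustration. Instructions after (insn 860) carry no insn number because the dump excerpt does not show them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Insn:
    name: str                   # label used only for reporting
    dest: Optional[str]         # vector register written, or None for a store
    srcs: Tuple[str, ...]       # vector registers read
    is_move: bool = False       # plain register-to-register copy


def dead_after(insns, idx, reg, live_out=frozenset()):
    """True if reg is overwritten before it is read again after insns[idx]."""
    for insn in insns[idx + 1:]:
        if reg in insn.srcs:
            return False        # value is still needed
        if insn.dest == reg:
            return True         # value is killed before any further use
    return reg not in live_out


def removable_moves(insns, live_out=frozenset()):
    """List (producer, move, new_dest) for copies the producer could absorb."""
    found = []
    for i, mv in enumerate(insns):
        if not mv.is_move:
            continue
        src, dst = mv.srcs[0], mv.dest
        if not dead_after(insns, i, src, live_out):
            continue
        # Nearest earlier instruction that defines the copied register.
        p = next((j for j in range(i - 1, -1, -1) if insns[j].dest == src), None)
        if p is None:
            continue
        # Nothing between producer and copy may touch either register.
        between = insns[p + 1:i]
        if any(src in x.srcs or dst in x.srcs or x.dest == dst for x in between):
            continue
        found.append((insns[p].name, mv.name, dst))
    return found


# Vector-register view of the .L31 loop body from the comment above.
loop = [
    Insn("856 vpermps", "zmm0", ("zmm8",)),
    Insn("1743 vmovaps", "zmm1", ("zmm0",), is_move=True),
    Insn("860 vmaxps", "zmm0", ("zmm3",)),
    Insn("vfnmadd132ps", "zmm1", ("zmm1", "zmm7")),
    Insn("vmaxps", "zmm0", ("zmm1", "zmm0")),
    Insn("vmovups", None, ("zmm0",)),
]

print(removable_moves(loop))
```

Under these assumptions the sketch reports the copy in (insn 1743) as removable, with (insn 856) writing %zmm1 directly. A real allocator also has to account for register classes, instruction constraints and liveness across the loop back edge, none of which is modelled here.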