https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80481

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |10.0

--- Comment #10 from Uroš Bizjak <ubizjak at gmail dot com> ---
Actually, the issue is not fixed. As mentioned in the description, the issue is
best visible without -funroll-loops.

(GCC) 10.0.0 20190821 (experimental) compiles (-Ofast -fopenmp -march=knl) to:

.L31:
        leaq    (%rdx), %rsi
        negq    %rsi
        vpermps (%r9,%rsi), %zmm8, %zmm0
        vmovaps %zmm0, %zmm1
        vmaxps  (%r11,%rdx), %zmm3, %zmm0
        vfnmadd132ps    (%r14,%rdx), %zmm7, %zmm1
        vmaxps  %zmm1, %zmm0, %zmm0
        vmovups %zmm0, 0(%r13,%rdx)
        leaq    64(%rdx), %rdx
        cmpq    %r8, %rdx
        jne     .L31

As seen in the detailed dump,

#(insn:TI 856 852 1743 71 (set (reg:V16SF 20 xmm0 [orig:885 vect__72.36 ] [885])
#        (unspec:V16SF [
#                (mem:V16SF (plus:DI (reg/f:DI 37 r9 [orig:198 vectp.34 ] [198])
#                        (reg:DI 4 si [883])) [3 MEM[base: vectp.34_256, index: _1006, offset: 0B]+0 S64 A32])
#                (reg:V16SI 44 xmm8 [919])
#            ] UNSPEC_VPERMVAR)) "/hdd/uros/git/gcc/gcc/testsuite/g++.dg/pr80481.C":59:28 4754 {avx512f_permvarv16sf}
#     (expr_list:REG_DEAD (reg:DI 4 si [883])
#        (nil)))
        vpermps (%r9,%rsi), %zmm8, %zmm0        # 856   [c=68 l=7]  avx512f_permvarv16sf
#(insn:TI 1743 856 860 71 (set (reg:V16SF 21 xmm1 [orig:885 vect__72.36 ] [885])
#        (reg:V16SF 20 xmm0 [orig:885 vect__72.36 ] [885])) "/hdd/uros/git/gcc/gcc/testsuite/g++.dg/pr80481.C":60:20 1255 {movv16sf_internal}
#     (expr_list:REG_DEAD (reg:V16SF 20 xmm0 [orig:885 vect__72.36 ] [885])
#        (nil)))
        vmovaps %zmm0, %zmm1    # 1743  [c=4 l=6]  movv16sf_internal/2
#(insn 860 1743 857 71 (set (reg:V16SF 20 xmm0 [orig:888 vect__13.45 ] [888])
#        (smax:V16SF (reg:V16SF 23 xmm3 [890])
#            (mem:V16SF (plus:DI (reg/f:DI 39 r11 [orig:187 vectp.43 ] [187])
#                    (reg:DI 1 dx [orig:478 ivtmp.111 ] [478])) [3 MEM[base: vectp.43_238, index: ivtmp.111_997, offset: 0B]+0 S64 A32]))) "/hdd/uros/git/gcc/gcc/testsuite/g++.dg/pr80481.C":61:15 1631 {*smaxv16sf3}
#     (nil))
        vmaxps  (%r11,%rdx), %zmm3, %zmm0       # 860   [c=68 l=6]  *smaxv16sf3/1

%zmm0 is overwritten by (insn 860) and is thus dead after (insn 1743), as the
REG_DEAD note confirms, so the copy is unnecessary. (insn 856) could simply use
%zmm1 as its destination.
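
The liveness argument can be checked with a small model. The following is only
a sketch, not GCC code: it runs a backward liveness scan over the three insns
quoted above and tests whether the copy in (insn 1743) could be folded into
(insn 856). The table INSNS, the helper names and the assumed live-out set are
illustrative assumptions; address registers are left out for brevity.

```python
# Hypothetical model of the three insns from the dump.
# Each entry: (insn uid, vector registers written, vector registers read).
INSNS = [
    (856,  {"zmm0"}, {"zmm8"}),   # vpermps (%r9,%rsi), %zmm8, %zmm0
    (1743, {"zmm1"}, {"zmm0"}),   # vmovaps %zmm0, %zmm1
    (860,  {"zmm0"}, {"zmm3"}),   # vmaxps  (%r11,%rdx), %zmm3, %zmm0
]

# Assumption: after insn 860 the loop still reads %zmm0 and %zmm1
# (vfnmadd132ps and the second vmaxps), plus the loop-invariant registers.
LIVE_OUT = {"zmm0", "zmm1", "zmm3", "zmm7", "zmm8"}


def live_after(insns, live_out):
    """Backward scan; returns {uid: set of registers live just after uid}."""
    result = {}
    live = set(live_out)
    for uid, defs, uses in reversed(insns):
        result[uid] = set(live)
        live = (live - defs) | uses
    return result


def copy_can_be_folded(insns, live, def_uid, copy_uid):
    """True if the insn def_uid could write the copy's destination directly.

    Conditions checked in this simplified model:
      * the copy's source register is dead right after the copy, and
      * the copy's destination holds no live value right after def_uid.
    """
    by_uid = {uid: (defs, uses) for uid, defs, uses in insns}
    (src,) = by_uid[copy_uid][1]   # single register read by the copy
    (dst,) = by_uid[copy_uid][0]   # single register written by the copy
    return src not in live[copy_uid] and dst not in live[def_uid]


LIVE = live_after(INSNS, LIVE_OUT)
FOLDABLE = copy_can_be_folded(INSNS, LIVE, def_uid=856, copy_uid=1743)
```

In this model %zmm0 is not live after (insn 1743) and %zmm1 carries no live
value after (insn 856), so FOLDABLE evaluates to True under the stated
assumptions.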
