https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101311

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Severity|normal                      |enhancement
   Last reconfirmed|                            |2021-07-03
     Ever confirmed|0                           |1

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So for aarch64 it is xor:
(insn 7 4 8 2 (set (reg:SF 101)
        (mult:SF (reg/v:SF 98 [ a ])
            (reg/v:SF 99 [ b ]))) "t66.c":2:19 966 {mulsf3}
     (expr_list:REG_DEAD (reg/v:SF 99 [ b ])
        (expr_list:REG_DEAD (reg/v:SF 98 [ a ])
            (nil))))
(insn 8 7 9 2 (set (reg:SI 102)
        (xor:SI (subreg:SI (reg:SF 101) 0)
            (const_int -2147483648 [0xffffffff80000000]))) "t66.c":3:35 490 {xorsi3}
     (expr_list:REG_DEAD (reg:SF 101)
        (nil)))

But on x86_64 it is plus:
(insn 8 7 9 2 (parallel [
            (set (reg:SI 92)
                (plus:SI (subreg:SI (reg:SF 91) 0)
                    (const_int -2147483648 [0xffffffff80000000])))
            (clobber (reg:CC 17 flags))
        ]) "t87.c":3:35 207 {*addsi_1}
     (expr_list:REG_DEAD (reg:SF 91)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

On x86_64 it is xor until fwprop1, where it changes from:
(insn 8 7 9 2 (parallel [
            (set (reg:SI 92)
                (xor:SI (subreg:SI (reg:SF 91) 0)
                    (const_int -2147483648 [0xffffffff80000000])))
            (clobber (reg:CC 17 flags))
        ]) "t87.c":3:35 529 {*xorsi_1}
     (nil))

into:
(insn 8 7 9 2 (parallel [
            (set (reg:SI 92)
                (plus:SI (subreg:SI (reg:SF 91) 0)
                    (const_int -2147483648 [0xffffffff80000000])))
            (clobber (reg:CC 17 flags))
        ]) "t87.c":3:35 207 {*addsi_1}
     (expr_list:REG_DEAD (reg:SF 91)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

But even if I change 0x80000000 to 0x80000001 (to force it to stay XOR), I
still don't get the SSE instruction.
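
As a side note on why the two forms are interchangeable: adding 0x80000000 modulo 2^32 only ever toggles bit 31 (any carry falls off the top), so it gives the same result as xor with 0x80000000. With 0x80000001 the low bit can carry, so the two forms differ, which is presumably why that constant keeps the xor. A minimal sketch in C, where every function name is a hypothetical helper and not something from GCC or the testcase:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flip bit 31 with xor, mirroring the {*xorsi_1} form. */
static uint32_t flip_sign_xor(uint32_t bits)
{
    return bits ^ UINT32_C(0x80000000);
}

/* Add 0x80000000 modulo 2^32, mirroring the {*addsi_1} form. */
static uint32_t flip_sign_add(uint32_t bits)
{
    return bits + UINT32_C(0x80000000);
}

/* Returns 1 when xor and wrapping add with the same mask agree. */
static int xor_matches_add(uint32_t bits, uint32_t mask)
{
    return (bits ^ mask) == (uint32_t)(bits + mask);
}

/* Spot-check a few bit patterns; not an exhaustive proof. */
static void check_sign_flip_equivalence(void)
{
    static const uint32_t samples[] = {
        0x00000000u, 0x00000001u, 0x3f800000u, 0x7fffffffu,
        0x80000000u, 0xbf800000u, 0xffffffffu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(flip_sign_xor(samples[i]) == flip_sign_add(samples[i]));

    /* With 0x80000001 the low bit carries, so xor and add diverge. */
    assert(!xor_matches_add(UINT32_C(1), UINT32_C(0x80000001)));
}
```

Calling check_sign_flip_equivalence() should pass silently. It only samples a handful of values, but the carry argument above covers all 32-bit inputs.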

Note aarch64 gets it right though:
        .cfi_startproc
        fmul    s0, s0, s1
        movi    v1.2s, 0x80, lsl 24
        eor     v0.8b, v0.8b, v1.8b
        fcvtzs  w0, s0
        ret
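
For reference, the movi/eor pair above builds 0x80000000 in each 32-bit lane and flips the sign bit without leaving the FP/SIMD register. At the source level that operation looks roughly like the sketch below. This is not the t66.c testcase; the helper names are hypothetical, and it assumes float is IEEE 754 binary32:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: negate a float by flipping bit 31 of its
   representation (assumes IEEE 754 binary32 float). */
static float negate_via_sign_bit(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* view the float as an integer */
    bits ^= UINT32_C(0x80000000);     /* toggle only the sign bit */
    memcpy(&x, &bits, sizeof x);      /* view the integer as a float */
    return x;
}

/* Spot-check against ordinary floating-point negation. */
static void check_negate_via_sign_bit(void)
{
    assert(negate_via_sign_bit(1.5f) == -1.5f);
    assert(negate_via_sign_bit(-2.0f) == 2.0f);
}
```

The point of the comparison is that the xor operates on a value that already lives in a floating-point register, so keeping it there (as aarch64 does) avoids a round trip through the integer registers.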
