On Sun, Feb 17, 2019 at 6:10 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Sun, Feb 17, 2019 at 7:57 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Sun, Feb 17, 2019 at 4:53 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > > > > > > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > > > > > > emulate MMX intrinsics with SSE instructions. To support it, we added
> > > > > >
> > > > > >  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> > > > > >
> > > > > > ;; Define instruction set of MMX instructions
> > > > > > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> > > > > >   (const_string "base"))
> > > > > >
> > > > > >          (eq_attr "mmx_isa" "native")
> > > > > >            (symbol_ref "!TARGET_MMX_WITH_SSE")
> > > > > >          (eq_attr "mmx_isa" "x64")
> > > > > >            (symbol_ref "TARGET_MMX_WITH_SSE")
> > > > > >          (eq_attr "mmx_isa" "x64_avx")
> > > > > >            (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> > > > > >          (eq_attr "mmx_isa" "x64_noavx")
> > > > > >            (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> > > > > >
> > > > > > > We added SSE emulation to MMX patterns and disabled MMX alternatives with
> > > > > > > TARGET_MMX_WITH_SSE.
> > > > > >
> > > > > > > Most MMX instructions have equivalent SSE versions, and the results of
> > > > > > > some SSE versions need to be reshuffled into the right order for MMX.
> > > > > > > There are a couple of tricky cases:
> > > > > >
> > > > > > > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> > > > > > > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > > > > > > mask operand, and handle unmapped bits 64:127 at the memory address by
> > > > > > > adjusting the source and mask operands together with the memory address.
> > > > > >
> > > > > > > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > > > > > > in 64-bit mode.
> > > > > >
> > > > > > > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > > > > > > SSE emulation must clear bit 3 (the fourth bit) in each byte of the
> > > > > > > shuffle control mask.
> > > > > >
> > > > > > > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > > > > > > the upper 64 bits of the destination XMM register.
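
To make case 3 concrete, here is a minimal scalar sketch in C. It only
models the documented pshufb semantics (bit 7 of a control byte zeroes the
result byte, the low index bits select a source byte); it is not the code
from the patch. All function names are made up for illustration, and the
0xf7 mask is simply "clear bit 3 in each control byte" written out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the 128-bit pshufb: bits 3:0 of each control byte
   select a source byte, bit 7 forces the result byte to zero.  */
void pshufb128(uint8_t dst[16], const uint8_t src[16], const uint8_t ctl[16])
{
  uint8_t tmp[16];
  for (int i = 0; i < 16; i++)
    tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0f];
  memcpy(dst, tmp, 16);
}

/* Scalar model of the 64-bit MMX pshufb: only bits 2:0 select.  */
void pshufb64(uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t tmp[8];
  for (int i = 0; i < 8; i++)
    tmp[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
  memcpy(dst, tmp, 8);
}

/* Hypothetical emulation: clear bit 3 of every control byte, run the
   128-bit form with zeroed upper halves, keep the low 8 result bytes.  */
void pshufb64_via_128(uint8_t dst[8], const uint8_t src[8], const uint8_t ctl[8])
{
  uint8_t s[16] = {0}, c[16] = {0}, d[16];
  memcpy(s, src, 8);
  for (int i = 0; i < 8; i++)
    c[i] = ctl[i] & 0xf7;
  pshufb128(d, s, c);
  memcpy(dst, d, 8);
}

/* Returns 1 if the emulation agrees with the 64-bit model for every
   control byte value in every byte position, 0 otherwise.  */
int pshufb_emulation_matches(void)
{
  const uint8_t src[8] = {0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88};
  for (int v = 0; v < 256; v++) {
    uint8_t ctl[8], a[8], b[8];
    for (int i = 0; i < 8; i++)
      ctl[i] = (uint8_t)(v + i);
    pshufb64(a, src, ctl);
    pshufb64_via_128(b, src, ctl);
    if (memcmp(a, b, 8) != 0)
      return 0;
  }
  return 1;
}
```

Without the masking step, a control byte such as 0x08 would select byte 8
of the 128-bit source instead of byte 0, which is why the extra bit has to
be cleared.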
> > > > > >
> > > > > > Tests are also added to check each SSE emulation of MMX intrinsics.
> > > > > >
> > > > > > > There are no regressions on i686 and x86-64.  For x86-64, GCC is also
> > > > > > > tested with
> > > > > >
> > > > > > --with-arch=native --with-cpu=native
> > > > > >
> > > > > > on AVX2 and AVX512F machines.
> > > > >
> > > > > > An idea that would take the patch a step further, also on 32-bit targets:
> > > > >
> > > > > *Assuming* that operations on XMM registers are as fast (or perhaps
> > > > > faster) than operations on MMX registers, we can change mmx_isa
> > > > > attribute in e.g.
> > > > >
> > > > > +  "@
> > > > > +   p<logic>\t{%2, %0|%0, %2}
> > > > > +   p<logic>\t{%2, %0|%0, %2}
> > > > > +   vp<logic>\t{%2, %1, %0|%0, %1, %2}"
> > > > > +  [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
> > > > >
> > > > > to:
> > > > >
> > > > > [(set_attr "isa" "*,noavx,avx")
> > > > >  (set_attr "mmx_isa" "native,*,*")]
> > > > >
> > > > > So, for x86_64 everything stays the same, but for x86_32 we now allow
> > > > > intrinsics to use xmm registers in addition to mmx registers. We can't
> > > > > > disable MMX for x86_32 anyway due to ISA constraints (and some tricky
> > > > > > cases, e.g. movnti, which works only for 64bit targets, and maskmovq
> > > > > > & similar, which are more efficient with MMX regs), but RA has much
> > > > > more freedom to allocate the most effective register set even for
> > > > > 32bit targets.
> > > > >
> > > > > WDYT?
> > > > >
> > > >
> > > > Since MMX registers are used to pass and return __m64 values,
> > > > we can't really get rid of MMX instructions in 32-bit mode.  If people
> > > > have to stay with 32-bit mode, they need MMX.  I don't think we should
> > > > extend TARGET_MMX_WITH_SSE to 32-bit mode.
> > >
> > > No, TARGET_MMX_WITH_SSE is still enabled only for 64bit targets. We
> > > should not *disable* SSE alternatives on 32bit targets.
>
> I don't think my patch set disables any SSE alternatives in 32-bit mode.
> However, it DOES NOT enable any SSE alternatives in 32-bit mode.  To really
> enable SSE alternatives in
>
> (define_insn "*mmx_<code><mode>3"
>   [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv")
>         (any_logic:MMXMODEI
>           (match_operand:MMXMODEI 1 "register_mmxmem_operand" "%0,0,Yv")
>           (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,Yv")))]
>   "(TARGET_MMX || TARGET_MMX_WITH_SSE)
>    && ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)"
>   "@
>    p<logic>\t{%2, %0|%0, %2}
>    p<logic>\t{%2, %0|%0, %2}
>    vp<logic>\t{%2, %1, %0|%0, %1, %2}"
>   [(set_attr "mmx_isa" "native,x64_noavx,x64_avx")
>    (set_attr "type" "mmxadd,sselog,sselog")
>    (set_attr "mode" "DI,TI,TI")])
>
> register_mmxmem_operand must return true for SSE alternatives:

It returns true for register and memory operands for 32bit targets, because

#define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
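
As a rough illustration of that point, the predicate logic can be modelled
in plain C. This is only a sketch: struct target, enum operand_kind and the
function names are hypothetical stand-ins, not GCC's real data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant target flags.  */
struct target { bool is_64bit; bool sse2; };

enum operand_kind { OP_REGISTER, OP_MEMORY, OP_IMMEDIATE };

/* Mirrors: #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)  */
bool target_mmx_with_sse(struct target t)
{
  return t.is_64bit && t.sse2;
}

/* Model of register_mmxmem_operand: registers always match, memory
   matches only when TARGET_MMX_WITH_SSE is false.  */
bool register_mmxmem_operand_model(struct target t, enum operand_kind op)
{
  return op == OP_REGISTER
         || (!target_mmx_with_sse(t) && op == OP_MEMORY);
}

/* Usage example: a 32-bit SSE2 target accepts both kinds of operand,
   a 64-bit SSE2 target accepts registers only.  */
void predicate_usage_example(void)
{
  struct target t32 = { false, true };
  struct target t64 = { true, true };
  assert(register_mmxmem_operand_model(t32, OP_REGISTER));
  assert(register_mmxmem_operand_model(t32, OP_MEMORY));
  assert(register_mmxmem_operand_model(t64, OP_REGISTER));
  assert(!register_mmxmem_operand_model(t64, OP_MEMORY));
}
```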

> ;; Match register operands, but include memory operands for
> ;; !TARGET_MMX_WITH_SSE.
> (define_predicate "register_mmxmem_operand"
>   (ior (match_operand 0 "register_operand")
>        (and (not (match_test "TARGET_MMX_WITH_SSE"))
>             (match_operand 0 "memory_operand"))))
>
> How do you enable SSE alternatives in 32-bit mode without enabling
> TARGET_MMX_WITH_SSE for 32-bit mode?

Check the new attribute definitions below:

> > The correct isa attribute definition would be:
> >
> > [(set_attr "isa" "*,sse2_noavx,avx")
> >  (set_attr "mmx_isa" "native,*,*")]
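
To show the difference between the two attribute schemes, here is a small
C model of which alternative ends up enabled. It is a sketch under
assumptions: it looks only at the "isa" and "mmx_isa" attributes (the insn
condition itself is left out), it reads sse2_noavx as
TARGET_SSE2 && !TARGET_AVX and avx as TARGET_AVX, and struct target plus the
function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant target flags.  */
struct target { bool is_64bit; bool sse2; bool avx; };

bool mmx_with_sse(struct target t)
{
  return t.is_64bit && t.sse2;
}

/* Scheme in the patch: mmx_isa = "native,x64_noavx,x64_avx".  */
bool enabled_in_patch(struct target t, int alt)
{
  switch (alt) {
  case 0: return !mmx_with_sse(t);          /* native    */
  case 1: return mmx_with_sse(t) && !t.avx; /* x64_noavx */
  case 2: return mmx_with_sse(t) && t.avx;  /* x64_avx   */
  default: return false;
  }
}

/* Proposed scheme: isa = "*,sse2_noavx,avx", mmx_isa = "native,*,*".  */
bool enabled_in_proposal(struct target t, int alt)
{
  switch (alt) {
  case 0: return !mmx_with_sse(t);  /* mmx_isa native  */
  case 1: return t.sse2 && !t.avx;  /* isa sse2_noavx  */
  case 2: return t.avx;             /* isa avx         */
  default: return false;
  }
}

/* Usage example: on 64-bit both schemes agree; on a 32-bit SSE2 target
   only the proposal also offers the SSE alternative to the RA.  */
void isa_usage_example(void)
{
  struct target t32 = { false, true, false };
  struct target t64 = { true, true, false };
  for (int alt = 0; alt < 3; alt++)
    assert(enabled_in_patch(t64, alt) == enabled_in_proposal(t64, alt));
  assert(enabled_in_patch(t32, 0) && !enabled_in_patch(t32, 1));
  assert(enabled_in_proposal(t32, 0) && enabled_in_proposal(t32, 1));
}
```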

Uros.
