Re: [PATCH 00/40] V6: Emulate MMX intrinsics with SSE

H.J. Lu Fri, 15 Feb 2019 16:53:50 -0800

On Fri, Feb 15, 2019 at 9:50 AM Uros Bizjak <[email protected]> wrote:
>
> On Fri, Feb 15, 2019 at 2:58 PM H.J. Lu <[email protected]> wrote:
> >
> > On x86-64, since __m64 is returned and passed in XMM registers, we can
> > emulate MMX intrinsics with SSE instructions. To support it, we added
> >
> >  #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
> >
> > ;; Define instruction set of MMX instructions
> > (define_attr "mmx_isa" "base,native,x64,x64_noavx,x64_avx"
> >   (const_string "base"))
> >
> >          (eq_attr "mmx_isa" "native")
> >            (symbol_ref "!TARGET_MMX_WITH_SSE")
> >          (eq_attr "mmx_isa" "x64")
> >            (symbol_ref "TARGET_MMX_WITH_SSE")
> >          (eq_attr "mmx_isa" "x64_avx")
> >            (symbol_ref "TARGET_MMX_WITH_SSE && TARGET_AVX")
> >          (eq_attr "mmx_isa" "x64_noavx")
> >            (symbol_ref "TARGET_MMX_WITH_SSE && !TARGET_AVX")
> >
> > We added SSE emulation to MMX patterns and disabled MMX alternatives with
> > TARGET_MMX_WITH_SSE.
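
The attribute conditions quoted above can be restated as a small C model.
This is only a sketch for reading the table, not GCC code: the enum, the
struct, and both function names are made up for illustration, and treating
"base" as always enabled is an assumption, since the quoted fragment does
not show the fall-through case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the "mmx_isa" attribute values. */
enum mmx_isa {
    MMX_ISA_BASE,
    MMX_ISA_NATIVE,
    MMX_ISA_X64,
    MMX_ISA_X64_NOAVX,
    MMX_ISA_X64_AVX
};

/* Hypothetical stand-in for the TARGET_* macros. */
struct target_flags {
    bool is_64bit;  /* TARGET_64BIT */
    bool sse2;      /* TARGET_SSE2 */
    bool avx;       /* TARGET_AVX */
};

/* Models: #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2) */
bool target_mmx_with_sse(const struct target_flags *t)
{
    assert(t);
    return t->is_64bit && t->sse2;
}

/* Returns whether an alternative tagged with ISA would be enabled. */
bool mmx_isa_enabled(enum mmx_isa isa, const struct target_flags *t)
{
    bool with_sse = target_mmx_with_sse(t);

    switch (isa) {
    case MMX_ISA_NATIVE:    return !with_sse;
    case MMX_ISA_X64:       return with_sse;
    case MMX_ISA_X64_AVX:   return with_sse && t->avx;
    case MMX_ISA_X64_NOAVX: return with_sse && !t->avx;
    case MMX_ISA_BASE:      break;
    }
    return true;  /* Assumption: "base" alternatives are always enabled. */
}
```

With this model, a 64-bit SSE2 target without AVX enables the x64 and
x64_noavx alternatives and disables the native ones, while a 32-bit target
enables only the native ones (plus base, under the assumption above).
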
> >
> > Most MMX instructions have equivalent SSE versions, and the results of
> > some SSE versions need to be reshuffled into the right order for MMX.
> > There are a couple of tricky cases:
> >
> > 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent.  We emulate MMX
> > maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> > mask operand, and we handle the case where bits 64:127 at the memory
> > address are unmapped by adjusting the source and mask operands together
> > with the memory address.
> >
> > 2. MMX movntq is emulated with SSE2 DImode movnti, which is available
> > in 64-bit mode.
> >
> > 3. MMX pshufb takes a 3-bit index while SSE pshufb takes a 4-bit index.
> > SSE emulation must clear bit 3 (the fourth bit, counting from bit 0) of
> > each byte in the shuffle control mask.
> >
> > 4. To emulate MMX cvtpi2ps with SSE2 cvtdq2ps, we must properly preserve
> > the upper 64 bits of the destination XMM register.
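
Case 3 above can be checked with a portable C model of the two shuffles.
This is a sketch of the documented pshufb semantics (bit 7 of a control byte
zeroes the result byte, the low index bits select a source byte), not the
code from the patch set; all function names are hypothetical, and the 0xEE
filler stands in for whatever happens to be in the upper half of the XMM
register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 64-bit (MMX) pshufb model: index is bits 2:0 of each control byte. */
void pshufb64_model(uint8_t dst[8], const uint8_t src[8],
                    const uint8_t ctl[8])
{
    uint8_t out[8];
    for (int i = 0; i < 8; i++)
        out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x07];
    memcpy(dst, out, sizeof out);
}

/* 128-bit (SSE) pshufb model: index is bits 3:0 of each control byte. */
void pshufb128_model(uint8_t dst[16], const uint8_t src[16],
                     const uint8_t ctl[16])
{
    uint8_t out[16];
    for (int i = 0; i < 16; i++)
        out[i] = (ctl[i] & 0x80) ? 0 : src[ctl[i] & 0x0f];
    memcpy(dst, out, sizeof out);
}

/* Emulate the 64-bit shuffle with the 128-bit one by clearing bit 3
   (mask 0xf7) of every control byte, so the upper half of the wide
   source can never be selected. */
void pshufb64_via_128(uint8_t dst[8], const uint8_t src[8],
                      const uint8_t ctl[8])
{
    uint8_t wide_src[16], wide_ctl[16], wide_dst[16];

    memset(wide_src, 0xEE, sizeof wide_src);  /* don't-care upper half */
    memcpy(wide_src, src, 8);
    memset(wide_ctl, 0, sizeof wide_ctl);
    for (int i = 0; i < 8; i++)
        wide_ctl[i] = ctl[i] & 0xf7;

    pshufb128_model(wide_dst, wide_src, wide_ctl);
    memcpy(dst, wide_dst, 8);
}

/* Compare both models for every possible control byte value. */
void check_pshufb_emulation(void)
{
    const uint8_t src[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };

    for (int c = 0; c < 256; c++) {
        uint8_t ctl[8], want[8], got[8];
        memset(ctl, c, sizeof ctl);
        pshufb64_model(want, src, ctl);
        pshufb64_via_128(got, src, ctl);
        assert(memcmp(want, got, sizeof want) == 0);
    }
}
```

Calling check_pshufb_emulation() from a test driver exercises every control
byte.  Dropping the "& 0xf7" makes the check fail for control bytes such as
0x08, which would then select the filler in the upper half.
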
> >
> > Tests are also added to check each SSE emulation of MMX intrinsics.
> >
> > There are no regressions on i686 and x86-64.  For x86-64, GCC is also
> > tested with
> >
> > --with-arch=native --with-cpu=native
> >
> > on AVX2 and AVX512F machines.
>
> I went through the code again, and it looks OK in general, modulo the
> mmx_nonimmediate_operand issue and a couple of minor issues.
>
> Please replace the nonimmediate_operand predicate with
> mmx_nonimmediate_operand in expanders and insn patterns. Please note
> that the proposed convention is to name the operand
> register_mmxmem_operand (cf. register_ssemem_operand), so I suggest
> we name the predicate in this way.
>
> There is an issue with a change to the emms pattern.
>
> And let's remove _mm_empty () calls from testcases; they complicate
> things too much for no apparent benefit.
>
> With those issues fixed, the patchset is OK for gcc-10 when it opens.


The new patch set starts at

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01275.html

including

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01271.html

for

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89372

-- 
H.J.
