Re: [PATCH, i386] Fix PR 88998, bad codegen with mmx instructions

Uros Bizjak Wed, 23 Jan 2019 12:28:16 -0800

On Wed, Jan 23, 2019 at 8:52 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Wed, Jan 23, 2019 at 11:22 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > The attached patch adds SSE alternatives to sse2_cvtpi2pd,
> > sse2_cvtpd2pi and sse2_cvttpd2pi to avoid MMX registers when e.g.
> > the _mm_cvtepi32_pd intrinsic is used. Without the patch, the
> > testcase compiles to (-O2 -mavx):
> >
> > _Z7prepareii:
> >         vmovd   %edi, %xmm1
> >         vpinsrd $1, %esi, %xmm1, %xmm0
> >         movdq2q %xmm0, %mm0
> >         cvtpi2pd        %mm0, %xmm0
> >         vhaddpd %xmm0, %xmm0, %xmm0
> >         ret
> >
> > while patched gcc generates:
> >
> >         vmovd   %edi, %xmm1
> >         vpinsrd $1, %esi, %xmm1, %xmm0
> >         vcvtdq2pd       %xmm0, %xmm0
> >         vhaddpd %xmm0, %xmm0, %xmm0
> >         ret
> >
> > The latter avoids the transition of the FPU to MMX mode.
> >
>
> Is it possible to support 64-bit vectors, like V2SI, with SSE
> instead of MMX for x86-64 under a command-line switch?


SSE registers are already preferred for 64-bit vectors (see the number
of exclamation marks in *mov<mode>_internal in mmx.md), so the value
will be passed in SSE regs. The exception is a pure MMX instruction:
since it has no SSE alternative, RA has to allocate an MMX register.
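
For readers who want to inspect the code generation themselves, here is a
minimal sketch of a function in the spirit of the quoted testcase. Only the
name and signature prepare(int, int) follow from the mangled symbol
_Z7prepareii in the listing above; the function body, the double return
type and the target attribute are assumptions made for illustration and are
not taken from the PR. Compiling it with something like "g++ -O2 -mavx -S"
should show whether the conversion goes through an MMX register.

```cpp
// Illustrative sketch only: the body is an assumption, not the PR's testcase.
#include <emmintrin.h>  // SSE2: _mm_setr_epi32, _mm_cvtepi32_pd, _mm_cvtsd_f64
#include <pmmintrin.h>  // SSE3: _mm_hadd_pd

// The target attribute lets this build without -msse3 on GCC and Clang;
// it is redundant when compiling with -mavx as in the quoted example.
__attribute__((target("sse3")))
double prepare(int a, int b)
{
    // Place a and b in the two low 32-bit lanes of an SSE register.
    __m128i is = _mm_setr_epi32(a, b, 0, 0);
    // Convert the two low int32 lanes to two doubles.
    __m128d ds = _mm_cvtepi32_pd(is);
    // Horizontal add: both lanes of the result hold (double)a + (double)b.
    return _mm_cvtsd_f64(_mm_hadd_pd(ds, ds));
}
```

The result is the same with or without the patch; only the register
class used for the intermediate 64-bit vector differs.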

Uros.
