On Wed, Jan 23, 2019 at 11:22 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> The attached patch adds SSE alternatives to sse2_cvtpi2pd, sse2_cvtpd2pi
> and sse2_cvttpd2pi to avoid MMX registers when e.g. the _mm_cvtepi32_pd
> intrinsic is used. Without the patch, the testcase compiles to (-O2
> -mavx):
>
> _Z7prepareii:
>         vmovd   %edi, %xmm1
>         vpinsrd $1, %esi, %xmm1, %xmm0
>         movdq2q %xmm0, %mm0
>         cvtpi2pd        %mm0, %xmm0
>         vhaddpd %xmm0, %xmm0, %xmm0
>         ret
>
> while patched gcc generates:
>
>         vmovd   %edi, %xmm1
>         vpinsrd $1, %esi, %xmm1, %xmm0
>         vcvtdq2pd       %xmm0, %xmm0
>         vhaddpd %xmm0, %xmm0, %xmm0
>         ret
>
> The latter avoids the transition of the FPU to MMX mode.
>
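
For readers without the attachment, here is a minimal sketch of the two
source-level forms involved. The actual testcase is not reproduced in this
message, so everything below is an assumption: the function names
prepare_m64, prepare_m128 and sum_lanes are hypothetical, and the lane sum
is done in scalar code (rather than the horizontal add seen in the listing)
so that the sketch needs only SSE2.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the MMX ones)

// Hypothetical helper: add the two double lanes of a __m128d.
static double sum_lanes(__m128d v) {
    double out[2];
    _mm_storeu_pd(out, v);
    return out[0] + out[1];
}

// 64-bit (__m64) source operand: the form that corresponds to cvtpi2pd.
double prepare_m64(int a, int b) {
    __m64 v = _mm_set_pi32(b, a);    // low lane = a, high lane = b
    __m128d d = _mm_cvtpi32_pd(v);   // two int32 -> two doubles
    _mm_empty();                     // leave MMX state before returning
    return sum_lanes(d);
}

// 128-bit (__m128i) source operand: the form that corresponds to cvtdq2pd.
double prepare_m128(int a, int b) {
    __m128i v = _mm_set_epi32(0, 0, b, a);  // only the low two lanes are used
    __m128d d = _mm_cvtepi32_pd(v);         // low two int32 -> two doubles
    return sum_lanes(d);
}
```

Both functions should compute the same value; they differ only in whether
the integer pair is carried in a 64-bit or a 128-bit vector type.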

Is it possible to support 64-bit vectors, like V2SI, with SSE
instead of MMX for x86-64 under a command-line switch?

-- 
H.J.
