On Wed, Jan 23, 2019 at 8:52 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Wed, Jan 23, 2019 at 11:22 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > Attached patch adds SSE alternatives to sse2_cvtpi2pd, sse2_cvtpd2pi
> > and sse2_cvttpd2pi to avoid MMX registers when e.g. the _mm_cvtepi32_pd
> > intrinsic is used. Without the patch, the testcase compiles to (-O2
> > -mavx):
> >
> > _Z7prepareii:
> >         vmovd   %edi, %xmm1
> >         vpinsrd $1, %esi, %xmm1, %xmm0
> >         movdq2q %xmm0, %mm0
> >         cvtpi2pd        %mm0, %xmm0
> >         vhaddpd %xmm0, %xmm0, %xmm0
> >         ret
> >
> > while patched gcc generates:
> >
> >         vmovd   %edi, %xmm1
> >         vpinsrd $1, %esi, %xmm1, %xmm0
> >         vcvtdq2pd       %xmm0, %xmm0
> >         vhaddpd %xmm0, %xmm0, %xmm0
> >         ret
> >
> > The latter avoids transition of FPU to MMX mode.
> >
>
> Is it possible to support 64-bit vectors, like V2SI, with SSE
> instead of MMX for x86-64 under a command-line switch?

SSE registers are already preferred for 64-bit vectors (see the number
of exclamation marks in *mov<mode>_internal in mmx.md), so the value
will be passed in SSE registers unless a pure MMX instruction is
involved. In that case, because the pattern has no SSE alternative,
the RA has to allocate an MMX register.

Uros.
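
For reference, a testcase of the kind discussed in this thread might look
like the sketch below. It is a hypothetical reconstruction, not the
testcase attached to the patch: only the signature prepare(int, int) is
known, from the mangled name _Z7prepareii. The sketch assumes the __m64
form of the conversion, _mm_cvtpi32_pd, since that is the intrinsic that
maps to cvtpi2pd, and it sums the two lanes with SSE2-only operations
rather than a horizontal add so that it builds without extra -m flags.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the MMX __m64 type)

// Hypothetical reconstruction of a testcase for this thread.
// Converts two ints to doubles via a 64-bit (V2SI) vector and adds them.
double prepare(int a, int b)
{
    // 64-bit vector of two int32 values; the low element is a.
    __m64 v = _mm_set_pi32(b, a);

    // cvtpi2pd: convert both int32 lanes to double. This is the operation
    // that forces an MMX register when the pattern has no SSE alternative.
    __m128d d = _mm_cvtpi32_pd(v);

    // Sum the two lanes using SSE2 only: d[0] + d[1].
    __m128d hi = _mm_unpackhi_pd(d, d);
    double sum = _mm_cvtsd_f64(_mm_add_sd(d, hi));

    // Leave MMX mode in case the compiler did use an MMX register.
    _mm_empty();
    return sum;
}
```

Compiling this with -O2 -mavx and inspecting the assembly should show
whether the conversion goes through %mm0 or stays in SSE registers.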