On Mon, Feb 11, 2019 at 8:08 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Sun, Feb 10, 2019 at 2:48 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On 2/10/19, H.J. Lu <hjl.to...@gmail.com> wrote:
> > > Emulate MMX sse_cvtps2pi/sse_cvttps2pi with SSE.
> > >
> > > PR target/89021
> > > * config/i386/mmx.md (sse_cvtps2pi): Add SSE emulation.
> > > (sse_cvttps2pi): Likewise.
> >
> > It looks to me that this description is wrong. We don't have V4SF
> > modes here, but V2SF, so we have to fake 64bit load in case of MMX.
> > The cvtps2dq will access memory in true 128bit width, so this is
> > wrong.
> >
> > We have to fix the description to not fake wide mode.
> >
>
> What do you propose to implement
>
> __m64 _mm_cvtps_pi32 (__m128 __A);
Hm... In your original patch, we *do* have V4SF memory access, but the
original insn accesses it in __m64 mode. This should be OK, but then
accessing this memory in __m128 mode should also be OK. So, on a more
detailed look, the original patch looks OK to me. Luckily, a false
alarm...

>
> We also have
>
> (define_insn "sse2_cvtps2pd<mask_name>"
>   [(set (match_operand:V2DF 0 "register_operand" "=v")
>         (float_extend:V2DF
>           (vec_select:V2SF
>             (match_operand:V4SF 1 "vector_operand" "vm")
>             (parallel [(const_int 0) (const_int 1)]))))]
>   "TARGET_SSE2 && <mask_avx512vl_condition>"
>   "%vcvtps2pd\t{%1, %0<mask_operand2>|%0<mask_operand2>, %q1}"
>
> These aren't new problems introduced by my MMX work.

This one is not problematic, since the instruction accesses memory in
__m64 mode, which is narrower than V4SFmode.

Uros.