On Fri, Nov 30, 2012 at 1:34 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
> Hello,
>
> I experimented with the simplify-rtx transformation you suggested, see:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54855
>
> It works when the argument is a register, but not for memory (which is where
> the constant is in the testcase). And the description of the operation in
> sse.md does seem problematic. It says the second argument is:
>
> (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
>
> but Intel's documentation says "The source operand can be an XMM register or
> a 64-bit memory location", not quite the same.
>
> Do you think the .md description should really stay this way, or could we
> change it to something that better reflects "64-bit memory location"?

For reference, we are talking about:

(define_insn "<sse>_vm<plusminus_insn><mode>3"
  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
          (match_dup 1)
          (const_int 1)))]
  "TARGET_SSE"
  "@
   <plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %0|%0, %2}
   v<plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")
   (set_attr "type" "sseadd")
   (set_attr "prefix" "orig,vex")
   (set_attr "mode" "<ssescalarmode>")])

No, looking at your description, operand 2 should be a scalar
operand (we use the _s{s,d} scalar instructions here), and for doubles
it should refer to a 64-bit memory location.

I don't remember all the details about vec_merge scalar instructions,
but it looks to me that the canonical representation should be more
like your proposal:

+(define_insn "*sse2_vm<plusminus_insn>v2df3"
+  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
+    (vec_concat:V2DF
+      (plusminus:DF
+        (vec_select:DF
+          (match_operand:V2DF 1 "register_operand" "0,x")
+          (parallel [(const_int 0)]))
+        (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
+      (vec_select:DF (match_dup 1) (parallel [(const_int 1)]))))]
+  "TARGET_SSE2"

Uros.
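
As an illustration only, the difference between the two RTL forms can be modeled in plain C. This is a rough sketch, not GCC code: the v2df struct and both function names are hypothetical, and element 0 is assumed to be the low double. The vec_merge form takes a full 128-bit operand 2 even though only its low element is used, while the vec_concat form takes a scalar DF, which matches a 64-bit memory operand.

```c
#include <assert.h>

/* Hypothetical model of a V2DF value; e[0] is assumed to be the low element. */
typedef struct { double e[2]; } v2df;

/* Model of the existing pattern: a full-width add followed by
   vec_merge with mask (const_int 1).  Bit 0 of the mask is set, so
   element 0 comes from the sum and element 1 from operand 1.
   Operand 2 is a whole vector, although e[1] of it never matters. */
static v2df add_vec_merge(v2df op1, v2df op2)
{
    v2df sum = { { op1.e[0] + op2.e[0], op1.e[1] + op2.e[1] } };
    v2df res = { { sum.e[0], op1.e[1] } };
    return res;
}

/* Model of the proposed pattern: vec_concat of a scalar add on
   element 0 with element 1 of operand 1.  Operand 2 is a scalar
   double, i.e. only 64 bits are read. */
static v2df add_vec_concat(v2df op1, double op2)
{
    v2df res = { { op1.e[0] + op2, op1.e[1] } };
    return res;
}
```

Both functions give the same result whenever the scalar passed to the second equals the low element of the vector passed to the first; only the second makes it explicit that the upper half of operand 2 is never read.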