http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54855



--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> 2012-10-12 13:41:35 UTC ---

(In reply to comment #1)

> Does not work for + though, as -0.0 + 0.0 is 0.0.

[...]

> On the tree level we see in-memory v because of the component modification:
>
>   _7 = BIT_FIELD_REF <v, 64, 0>;
>   _8 = _7 - 1.0e+0;
>   BIT_FIELD_REF <v, 64, 0> = _8;
>   v.0_10 = v;
>   v.1_11 = v.0_10 * { 2.0e+0, 2.0e+0 };
>   v = v.1_11;
>
> so either lowering this differently in the first place or detecting
> this kind of pattern would fix it.



Do you mean that at the tree level v[0] -= 1.0 could be changed to
v -= {1., 0.}? That's not exactly what Ulrich was suggesting. It could be nice
too, but then we would need a different optimization in the back-end that
detects the special case of a vector subtraction where the second element of
one argument is 0, in order to produce the optimal code.
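To make the v[0] -= 1.0 versus v -= {1., 0.} question concrete, here is a
minimal sketch using the GCC vector extension. The function names are
hypothetical and not taken from the bug report. It assumes the default
round-to-nearest mode and no -ffast-math. It also shows why, as comment #1
notes, the same rewrite is not safe for +: adding 0.0 to a -0.0 in the upper
element flips its sign, while subtracting 0.0 leaves it alone.

```c
#include <math.h>

/* GCC/Clang vector extension: two doubles in one 128-bit vector. */
typedef double v2df __attribute__((vector_size(16)));

/* Element-wise form: only element 0 is modified. */
v2df sub_low_elem(v2df v)
{
    v[0] -= 1.0;
    return v;
}

/* Whole-vector form: subtract {1., 0.}.  The upper element computes
   x - 0.0, which returns x for every non-NaN x, including -0.0. */
v2df sub_low_vec(v2df v)
{
    v2df c = {1.0, 0.0};
    return v - c;
}

/* Hypothetical analogue for addition.  The upper element computes
   x + 0.0, which turns -0.0 into +0.0, so it is NOT equivalent to
   modifying only element 0. */
v2df add_low_vec(v2df v)
{
    v2df c = {1.0, 0.0};
    return v + c;
}

/* Helper for inspecting the sign of the upper element. */
int upper_is_negative(v2df v)
{
    return signbit(v[1]) != 0;
}
```

With v = {3.0, -0.0}, both subtraction forms keep the upper element at -0.0,
whereas add_low_vec changes it to +0.0.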



In the x86 md, the sd instruction is represented as:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
          (match_dup 1)
          (const_int 1)))]



which is going to be hard to recognize from:

(insn 26 53 28 4 (set (reg:DF 81 [ D.2546 ])
        (vec_select:DF (reg/v:V2DF 73 [ v ])
            (parallel [
                    (const_int 0 [0])
                ]))) d.c:12 1408 {sse2_storelpd}
     (nil))
(insn 28 26 29 4 (set (reg:DF 82 [ D.2546 ])
        (minus:DF (reg:DF 81 [ D.2546 ])
            (reg:DF 84))) d.c:12 760 {*fop_df_1_sse}
     (expr_list:REG_DEAD (reg:DF 81 [ D.2546 ])
        (nil)))
(insn 29 28 30 4 (set (reg/v:V2DF 73 [ v ])
        (vec_concat:V2DF (reg:DF 82 [ D.2546 ])
            (vec_select:DF (reg/v:V2DF 73 [ v ])
                (parallel [
                        (const_int 1 [0x1])
                    ])))) d.c:12 1411 {sse2_loadlpd}
     (expr_list:REG_DEAD (reg:DF 82 [ D.2546 ])
        (nil)))



However, since that's only 3 insns, providing an additional define_insn for the
same instruction, but with a pattern built from vec_select and vec_concat,
might be enough for combine to match it.
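As a rough, untested sketch of what such an extra pattern might look like for
the V2DF case: the pattern name is made up, the attributes a real pattern
would carry are omitted, and the output templates are only an assumption
modelled on the existing sd patterns. The RTL shape is simply what
substituting insns 26 and 28 into insn 29 would give.

```
;; Hypothetical pattern, not taken from the i386 back-end sources.
(define_insn "*sse2_vm<plusminus_insn>v2df3_concat"
  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
        (vec_concat:V2DF
          ;; low element: v[0] op scalar
          (plusminus:DF
            (vec_select:DF
              (match_operand:V2DF 1 "register_operand" "0,x")
              (parallel [(const_int 0)]))
            (match_operand:DF 2 "nonimmediate_operand" "xm,xm"))
          ;; high element: copied unchanged from the same vector
          (vec_select:DF
            (match_dup 1)
            (parallel [(const_int 1)]))))]
  "TARGET_SSE2"
  "@
   <plusminus_mnemonic>sd\t{%2, %0|%0, %2}
   v<plusminus_mnemonic>sd\t{%2, %1, %0|%0, %1, %2}"
  [(set_attr "isa" "noavx,avx")])
```

Because the high element here is copied rather than combined with 0, a
pattern of this shape would not run into the -0.0 + 0.0 problem mentioned in
comment #1.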
