http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54855
--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> 2012-10-12 13:41:35 UTC ---
(In reply to comment #1)
> Does not work for + though, as -0.0 + 0.0 is 0.0.
[...]
> On the tree level we see in-memory v because of the component modification:
>
>   _7 = BIT_FIELD_REF <v, 64, 0>;
>   _8 = _7 - 1.0e+0;
>   BIT_FIELD_REF <v, 64, 0> = _8;
>   v.0_10 = v;
>   v.1_11 = v.0_10 * { 2.0e+0, 2.0e+0 };
>   v = v.1_11;
>
> so either lowering this differently in the first place or detecting
> this kind of pattern would fix it.

Do you mean that at the tree level v[0] -= 1.0 could be changed to
v -= {1., 0.} ? That's not exactly what Ulrich was suggesting. It could be
nice too, but then we would need a different optimization in the back-end
that detects the special case of a vector subtraction where the second part
of one argument is 0, in order to produce the optimal code.

In the x86 md, the sd instruction is represented as:

  [(set (match_operand:VF_128 0 "register_operand" "=x,x")
        (vec_merge:VF_128
          (plusminus:VF_128
            (match_operand:VF_128 1 "register_operand" "0,x")
            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))
          (match_dup 1)
          (const_int 1)))]

which is going to be hard to recognize from:

(insn 26 53 28 4 (set (reg:DF 81 [ D.2546 ])
        (vec_select:DF (reg/v:V2DF 73 [ v ])
            (parallel [
                    (const_int 0 [0])
                ]))) d.c:12 1408 {sse2_storelpd}
     (nil))

(insn 28 26 29 4 (set (reg:DF 82 [ D.2546 ])
        (minus:DF (reg:DF 81 [ D.2546 ])
            (reg:DF 84))) d.c:12 760 {*fop_df_1_sse}
     (expr_list:REG_DEAD (reg:DF 81 [ D.2546 ])
        (nil)))

(insn 29 28 30 4 (set (reg/v:V2DF 73 [ v ])
        (vec_concat:V2DF (reg:DF 82 [ D.2546 ])
            (vec_select:DF (reg/v:V2DF 73 [ v ])
                (parallel [
                        (const_int 1 [0x1])
                    ])))) d.c:12 1411 {sse2_loadlpd}
     (expr_list:REG_DEAD (reg:DF 82 [ D.2546 ])
        (nil)))

However, since that's only 3 insn, providing an additional define_insn for
the same instruction but with a pattern of vec_select and vec_concat might
be enough for combine.
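
For reference, here is a minimal C sketch of the two source-level forms discussed above, plus the addition case that comment #1 rules out. It is not the test case from d.c, which is not shown here. The typedef v2df and the helpers sub_component, sub_vector, add_vector and sign_bit are hypothetical names chosen for illustration. The sketch assumes the GCC vector extension, IEEE 754 doubles, default round-to-nearest, and no -ffast-math.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical typedef: a vector of two doubles via the GCC vector extension. */
typedef double v2df __attribute__((vector_size(16)));

/* Component form: only element 0 is modified (v[0] -= 1.0). */
static v2df sub_component(v2df v)
{
    v[0] -= 1.0;
    return v;
}

/* Whole-vector form: v -= {1., 0.}.  Element 1 becomes x - 0.0,
   which is x for every x, including -0.0. */
static v2df sub_vector(v2df v)
{
    v2df d = {1.0, 0.0};
    return v - d;
}

/* Addition analogue: v += {1., 0.}.  Element 1 becomes x + 0.0,
   which turns -0.0 into +0.0, so this form is not equivalent
   to v[0] += 1.0 when element 1 holds -0.0. */
static v2df add_vector(v2df v)
{
    v2df d = {1.0, 0.0};
    return v + d;
}

/* Return the IEEE sign bit of x (1 for -0.0, 0 for +0.0). */
static int sign_bit(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (int)(bits >> 63);
}
```

With v = {3.0, -0.0}, sub_component and sub_vector should agree in both elements, while add_vector loses the sign of the zero in element 1. That is why the whole-vector rewrite is only a candidate for subtraction.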