On Mon, Sep 7, 2020 at 7:44 AM luoxhu <luo...@linux.ibm.com> wrote: > > Hi, > > On 2020/9/4 18:23, Segher Boessenkool wrote: > >> diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c > >> index 03b00738a5e..00c65311f76 100644 > >> --- a/gcc/config/rs6000/rs6000-c.c > >> +++ b/gcc/config/rs6000/rs6000-c.c > >> /* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2) = arg0. > >> */ > >> @@ -1654,15 +1656,8 @@ altivec_resolve_overloaded_builtin (location_t loc, > >> tree fndecl, > >> SET_EXPR_LOCATION (stmt, loc); > >> stmt = build1 (COMPOUND_LITERAL_EXPR, arg1_type, stmt); > >> } > >> - > >> - innerptrtype = build_pointer_type (arg1_inner_type); > >> - > >> - stmt = build_unary_op (loc, ADDR_EXPR, stmt, 0); > >> - stmt = convert (innerptrtype, stmt); > >> - stmt = build_binary_op (loc, PLUS_EXPR, stmt, arg2, 1); > >> - stmt = build_indirect_ref (loc, stmt, RO_NULL); > >> - stmt = build2 (MODIFY_EXPR, TREE_TYPE (stmt), stmt, > >> - convert (TREE_TYPE (stmt), arg0)); > >> + stmt = build_array_ref (loc, stmt, arg2); > >> + stmt = fold_build2 (MODIFY_EXPR, TREE_TYPE (arg0), stmt, arg0); > >> stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl); > >> return stmt; > >> } > > You should make a copy of the vector, not modify the original one in > > place? (If I read that correctly, heh.) Looks good otherwise. > > > > Segher, there is already existed code to make a copy of vector as we > discussed offline. Thanks for the reminder. > > cat pr79251.c.006t.gimple > __attribute__((noinline)) > test (__vector signed int v, int i, size_t n) > { > __vector signed int D.3192; > __vector signed int D.3194; > __vector signed int v1; > v1 = v; > D.3192 = v1; > _1 = n & 3; > VIEW_CONVERT_EXPR<int[4]>(D.3192)[_1] = i; > v1 = D.3192; > D.3194 = v1; > return D.3194; > } > > Attached the v2 patch which does: > 1) Build VIEW_CONVERT_EXPR for vec_insert (i, v, n) like v[n%4] = i to > unify the gimple code, then expander could use vec_set_optab to expand. > 2) Recognize the pattern in expander and use optabs to expand > VIEW_CONVERT_EXPR to vec_insert with variable index to fast instructions: > lvsl+xxperm+xxsel.
Looking at the RTL expander side I see several issues. @@ -5237,6 +5288,16 @@ expand_assignment (tree to, tree from, bool nontemporal) to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE); + if (TREE_CODE (to) == ARRAY_REF) + { + tree op0 = TREE_OPERAND (to, 0); + if (TREE_CODE (op0) == VIEW_CONVERT_EXPR + && expand_view_convert_to_vec_set (to, from, to_rtx)) + { + pop_temp_slots (); + return; + } + } you're placing this at an awkward spot IMHO, after to_rtx expansion but disregading parts of it and compensating just with 'to' matching. Is the pieces (offset, bitpos) really too awkward to work with for matching? Because as written you'll miscompile struct X { _vector signed int v; _vector singed int u; } x; test(int i, int a) { x.u[i] = a; } as I think you'll end up assigning to x.v. Are we just interested in the case were we store to a pseudo or also when the destination is memory? I guess only when it's a pseudo - correct? In that case handling this all in optimize_bitfield_assignment_op is probably the best thing to try. Note we possibly refrain from assigning a pseudo to such vector because we see a variable array-ref to it. Richard. > Thanks, > Xionghu