On Wed, Sep 16, 2020 at 8:15 AM luoxhu <luo...@linux.ibm.com> wrote: > > > > On 2020/9/15 14:51, Richard Biener wrote: > > > >> I only see VAR_DECL and PARM_DECL, is there any function to check the tree > >> variable is global? I added DECL_REGISTER, but the RTL still expands to > >> stack: > > > > is_global_var () or alternatively !auto_var_in_fn_p (), I think doing > > IFN_SET only > > makes sense if there's the chance we can promote the variable to a > > register. But it > > would be an incorrect transform (it stores the whole vector) if the > > vector storage > > could "escape" to another thread - which means you probably have to check > > !TREE_ADDRESSABLE as well. > > > > The tree of param "u" will be marked ADDRESSABLE when generating > "VIEW_CONVERT_EXPR<int[4]>(D.3190)[_1] = i;", if check !TREE_ADDRESSABLE, no > IFN_SET > will be produced in gimple-isel.
The TREE_ADDRESSABLE bit should be cleared later on GIMPLE during execute_update_address_taken. If it is not we should fix that. > #1 0x000000001066c700 in convert_vector_to_array_for_subscript (loc=5307072, > vecp=0x7fffffffc5d0, > index=<trunc_mod_expr 0x7ffff59c73a0>) at > ../../gcc/gcc/c-family/c-common.c:8169 > #2 0x0000000010553b54 in build_array_ref (loc=5307072, array=<parm_decl > 0x7ffff5ad0100 u>, index=< > trunc_mod_expr 0x7ffff59c73a0>) at ../../gcc/gcc/c/c-typeck.c:2668 > #3 0x00000000105c8824 in c_parser_postfix_expression_after_primary > (parser=0x7ffff7f703f0, expr_lo > c=5307040, expr=...) at ../../gcc/gcc/c/c-parser.c:10494 > #4 0x00000000105c7570 in c_parser_postfix_expression (parser=0x7ffff7f703f0) > at ../../gcc/gcc/c/c- > parser.c:10216 > > >> > >> My current implementation does: > >> > >> 1) v = vec_insert (i, u, n); > >> > >> =>gimple: > >> { > >> register __vector signed int D.3190; > >> D.3190 = u; // *new decl and copy u first.* > >> _1 = n & 3; > >> VIEW_CONVERT_EXPR<int[4]>(D.3190)[_1] = i; // *update op0 of > >> VIEW_CONVERT_EXPR* > >> _2 = D.3190; > >> ... > >> } > >> > >> =>isel: > >> { > >> register __vector signed int D.3190; > >> D.3190 = u_4(D); > >> _1 = n_6(D) & 3; > >> .VEC_SET (&D.3190, i_7(D), _1); > > > > why are you passing the address of D.3190 to .VEC_SET? That will not > > make D.3190 > > be expanded to a pseudo. You really need to have GIMPLE registers > > here (SSA name) > > and thus a return value, leaving the argument unmodified > > > > D.3190_3 = .VEC_SET (D.3190_4, i_7(D), _1); > > > > note this is why I asked about the actual CPU instruction - as I read > > Seghers mail > > the instruction modifies a vector register, not memory. > > > > Updated the code and got expected gimple-isel output and ASM for both 2 cases: > > pr79251.c.236t.isel: > > __attribute__((noinline)) > test (__vector signed int u, int i, size_t n) > { > long unsigned int _1; > __vector signed int _6; > vector(4) int _7; > vector(4) int vec_set_dst_8; > > <bb 2> [local count: 1073741824]: > _1 = n_2(D) & 3; > _7 = u; > vec_set_dst_8 = .VEC_SET (_7, i_4(D), _1); > u = vec_set_dst_8; > _6 = u; > return _6; > > } > > But tree variable "u" need to be set to "TREE_ADDRESSABLE (view_op0) = 0;" > (Maybe check IFN VEC_SET and set to 0 in discover_nonconstant_array_refs > is better later.) after generating the IFN VEC_SET, otherwise, "u" will still > be expanded to stack since expander: > "Replacing Expressions: _7 replace with --> _7 = u;". > > Setting "u" to non-addressable also seems really unreasonable as for below > case, as u[n+1] need be ADDRESSABLE: Why should u be addressable for u[n+1]? > __attribute__ ((noinline)) vector TYPE > test (vector TYPE u, TYPE i, size_t n) > { > u[n % 4] = i; > u[n+1] = i+1; > return u; > } > > => gimple-isel with both VEC_SET and VIEW_CONVERT_EXPR: > > test (__vector signed int u, int i, size_t n) > { > long unsigned int _1; > long unsigned int _2; > int _3; > __vector signed int _9; > vector(4) int _10; > vector(4) int vec_set_dst_11; > > <bb 2> [local count: 1073741824]: > _1 = n_4(D) & 3; > _10 = u; > vec_set_dst_11 = .VEC_SET (_10, i_6(D), _1); > u = vec_set_dst_11; > _2 = n_4(D) + 1; > _3 = i_6(D) + 1; > VIEW_CONVERT_EXPR<int[4]>(u)[_2] = _3; why is .VEC_SET not used here? > _9 = u; > return _9; > > } > > > Below code are generating the IFN call, create_tmp_reg or > create_tmp_reg_ssa_name > seems not create a variable that will be allocated on register? Yes, they create decls, use make_ssa_name (TREE_TYPE (view_op0)) to create a SSA name of the desired type. > > diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc > index b330cf4c20e..a699022cd09 100644 > --- a/gcc/gimple-isel.cc > +++ b/gcc/gimple-isel.cc > ... > + if (!is_global_var (view_op0) > + && TREE_CODE (TREE_TYPE (view_op0)) == VECTOR_TYPE > + && tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (view_op0))) > + && tree_to_uhwi (TYPE_SIZE (TREE_TYPE (view_op0))) == 128 > + && determine_value_range (pos, &minv, &maxv) == VR_RANGE > + && wi::geu_p (minv, 0) > + && wi::leu_p (maxv, (128 / GET_MODE_BITSIZE (innermode)))) > + { > + location_t loc = gimple_location (stmt); > + tree var_src = create_tmp_reg (TREE_TYPE (view_op0)); > + tree var_dst > + = make_temp_ssa_name (TREE_TYPE (view_op0), NULL, "vec_set_dst"); > + TREE_ADDRESSABLE (view_op0) = 0; > + > + ass_stmt = gimple_build_assign (var_src, view_op0); you should set the gimple_vuse to the vuse of 'stmt' > + gimple_set_location (ass_stmt, loc); > + gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); > + > + new_stmt > + = gimple_build_call_internal (IFN_VEC_SET, 3, var_src, val, pos); > + > + gimple_call_set_lhs (new_stmt, var_dst); > + > + gimple_set_location (new_stmt, loc); > + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); > + > + ass_stmt = gimple_build_assign (view_op0, var_dst); > + > + gimple_set_location (ass_stmt, loc); > + gsi_insert_before (gsi, ass_stmt, GSI_SAME_STMT); > + > + gimple_move_vops (ass_stmt, stmt); > + } > + } > + > + return ass_stmt; > > > Thanks, > Xionghu