On Tue, May 23, 2023 at 5:18 PM Richard Biener <rguent...@suse.de> wrote:
>
> The following also accounts for a GPR->XMM move cost for splat
> operations and properly guards eliding the cost when moving from
> memory only for SSE4.1 or HImode or larger operands.  This
> doesn't fix the PR fully yet.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
>         PR target/109944
>         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
>         For vector construction or splats apply GPR->XMM move
>         costing.  QImode memory can be handled directly only
>         with SSE4.1 pinsrb.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.cc | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 38125ce284a..011a1fb0d6d 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -23654,7 +23654,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>           stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>           stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
>         }
> -  else if (kind == vec_construct
> +  else if ((kind == vec_construct || kind == scalar_to_vec)
>            && node
>            && SLP_TREE_DEF_TYPE (node) == vect_external_def
>            && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> @@ -23687,7 +23687,9 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>              Likewise with a BIT_FIELD_REF extracting from a vector
>              register we can hope to avoid using a GPR.  */
>           if (!is_gimple_assign (def)
> -             || (!gimple_assign_load_p (def)
> +             || ((!gimple_assign_load_p (def)
> +                  || (!TARGET_SSE4_1
> +                      && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
>                   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
>                       || !VECTOR_TYPE_P (TREE_TYPE
>                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> --
> 2.35.3
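
For readers following the second hunk, here is a minimal standalone sketch of the predicate it changes, written outside of GCC. The DefInfo struct, its fields and needs_gpr_to_xmm_move are hypothetical stand-ins for the GIMPLE queries in the patch (is_gimple_assign, gimple_assign_load_p, the BIT_FIELD_REF check and GET_MODE_SIZE). They are not GCC APIs, and the sketch only models when the GPR->XMM move is assumed to be needed, not how the cost itself is computed.

```cpp
// Hypothetical model of the definition feeding one vector element.
struct DefInfo {
  bool is_assign;               // models is_gimple_assign (def)
  bool is_load;                 // models gimple_assign_load_p (def)
  bool is_vector_bit_field_ref; // models a BIT_FIELD_REF whose operand has vector type
  unsigned elem_size;           // models GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))), in bytes
};

// Returns true when the element is assumed to pass through a GPR, so a
// GPR->XMM move should be costed.  Mirrors the patched condition:
//   !assign || ((!load || (!SSE4.1 && size == 1)) && !vector BIT_FIELD_REF)
constexpr bool needs_gpr_to_xmm_move (const DefInfo &def, bool have_sse4_1)
{
  return !def.is_assign
         || ((!def.is_load || (!have_sse4_1 && def.elem_size == 1))
             && !def.is_vector_bit_field_ref);
}
```

Under this model a byte-sized load counts as needing a GPR only when SSE4.1 is unavailable, which matches the ChangeLog note that QImode memory can be handled directly only with pinsrb. Loads of two bytes or more and BIT_FIELD_REF extracts from a vector register still elide the move cost.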