On Tue, May 23, 2023 at 5:18 PM Richard Biener <rguent...@suse.de> wrote:
>
> The following also accounts for a GPR->XMM move cost for splat
> operations and properly guards eliding the cost when moving from
> memory only for SSE4.1 or HImode or larger operands.  This
> doesn't fix the PR fully yet.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
>         PR target/109944
>         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
>         For vector construction or splats apply GPR->XMM move
>         costing.  QImode memory can be handled directly only
>         with SSE4.1 pinsrb.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.cc | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 38125ce284a..011a1fb0d6d 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -23654,7 +23654,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>        stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>        stmt_cost *= (TYPE_VECTOR_SUBPARTS (vectype) + 1);
>      }
> -  else if (kind == vec_construct
> +  else if ((kind == vec_construct || kind == scalar_to_vec)
>            && node
>            && SLP_TREE_DEF_TYPE (node) == vect_external_def
>            && INTEGRAL_TYPE_P (TREE_TYPE (vectype)))
> @@ -23687,7 +23687,9 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>              Likewise with a BIT_FIELD_REF extracting from a vector
>              register we can hope to avoid using a GPR.  */
>           if (!is_gimple_assign (def)
> -             || (!gimple_assign_load_p (def)
> +             || ((!gimple_assign_load_p (def)
> +                  || (!TARGET_SSE4_1
> +                      && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
>                   && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
>                       || !VECTOR_TYPE_P (TREE_TYPE
>                                 (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> --
> 2.35.3
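
For readers following the second hunk, the patched condition can be read as a
standalone predicate. The sketch below is only a model of that boolean logic,
not GCC code: OperandInfo and needs_gpr_to_xmm_move are hypothetical names, and
the struct fields stand in for the GIMPLE queries named in the comments. It
assumes, per the commit message, that a true result means a GPR->XMM move is
costed for the operand.

```cpp
#include <cstddef>

// Hypothetical summary of one scalar operand of a vector construction or
// splat. Each field mirrors a query made in the patched condition.
struct OperandInfo {
  bool defined_by_assign;       // is_gimple_assign (def)
  bool is_load;                 // gimple_assign_load_p (def)
  bool is_vector_bit_field_ref; // rhs code is BIT_FIELD_REF on a vector type
  std::size_t mode_size;        // GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op)))
};

// Returns true when the operand is assumed to go through a GPR, so that a
// GPR->XMM move should be accounted for (hypothetical helper).
bool needs_gpr_to_xmm_move(const OperandInfo &op, bool target_sse4_1) {
  // Anything not defined by an assignment is assumed to live in a GPR.
  if (!op.defined_by_assign)
    return true;

  // A load avoids the GPR only if it can feed the vector insert directly.
  // For 1-byte (QImode) elements that needs SSE4.1 pinsrb.
  bool load_cannot_be_direct =
      !op.is_load || (!target_sse4_1 && op.mode_size == 1);

  // Extracting from a vector register via BIT_FIELD_REF also avoids the GPR.
  return load_cannot_be_direct && !op.is_vector_bit_field_ref;
}
```

Under this model a QImode load is charged the move without SSE4.1 and is not
charged with it, while HImode or larger loads are never charged, which matches
the description in the commit message.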
