I approve of this patch. I'm responsible for GCC for TX2 at Marvell. Andrew 
Pinski should certainly chime in if he wants.

Joel

On 7/6/20, 10:48 AM, "Gcc-patches on behalf of Richard Sandiford" 
<gcc-patches-boun...@gcc.gnu.org on behalf of richard.sandif...@arm.com> wrote:

    Anton Youdkevitch <anton.youdkevi...@bell-sw.com> writes:
    > This patch changes some vector costs for TX2 so that
    > more vectorizations beneficial for TX2 chip can happen.
    >
    > The new cost model makes the x264 benchmark of CPU2017
    > 7% faster with no negative performance impact on other
    > benchmarks.
    >
    > Bootstrapped on linux-aarch64
    >
    >   2020-07-06 Anton Youdkevitch <anton.youdkevi...@bell-sw.com>
    > gcc/
    >     * config/aarch64/aarch64.c (thunderx2t99_regmove_cost):
    >     Change instruction costs.
    >     (thunderx2t99_vector_cost): Likewise.

    OK if Andrew agrees.

    Thanks,
    Richard

    >
    > From 3440e019c05fe5b565041cad549c6eefa2004a2b Mon Sep 17 00:00:00 2001
    > From: Anton Youdkevitch <anton.youdkevi...@bell-sw.com>
    > Date: Tue, 26 May 2020 04:23:04 -0700
    > Subject: [PATCH] Change costs for TX2 to expose more vectorization opportunities
    >
    > Make the costs such that they do not exactly reflect
    > the actual instruction costs from the manual but make
    > the codegen emit the code we want it to.
    > ---
    >  gcc/config/aarch64/aarch64.c | 18 +++++++++---------
    >  1 file changed, 9 insertions(+), 9 deletions(-)
    >
    > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
    > index e92c7e6..18c2251 100644
    > --- a/gcc/config/aarch64/aarch64.c
    > +++ b/gcc/config/aarch64/aarch64.c
    > @@ -535,9 +535,9 @@ static const struct cpu_regmove_cost thunderx2t99_regmove_cost =
    >  {
    >    1, /* GP2GP  */
    >    /* Avoid the use of int<->fp moves for spilling.  */
    > -  8, /* GP2FP  */
    > -  8, /* FP2GP  */
    > -  4  /* FP2FP  */
    > +  5, /* GP2FP  */
    > +  6, /* FP2GP  */
    > +  3, /* FP2FP  */
    >  };
    >  
    >  static const struct cpu_regmove_cost thunderx3t110_regmove_cost =
    > @@ -704,15 +704,15 @@ static const struct cpu_vector_cost thunderx2t99_vector_cost =
    >    6, /* scalar_fp_stmt_cost  */
    >    4, /* scalar_load_cost  */
    >    1, /* scalar_store_cost  */
    > -  5, /* vec_int_stmt_cost  */
    > -  6, /* vec_fp_stmt_cost  */
    > +  4, /* vec_int_stmt_cost  */
    > +  5, /* vec_fp_stmt_cost  */
    >    10, /* vec_permute_cost  */
    >    6, /* vec_to_scalar_cost  */
    >    5, /* scalar_to_vec_cost  */
    > -  8, /* vec_align_load_cost  */
    > -  8, /* vec_unalign_load_cost  */
    > -  4, /* vec_unalign_store_cost  */
    > -  4, /* vec_store_cost  */
    > +  4, /* vec_align_load_cost  */
    > +  4, /* vec_unalign_load_cost  */
    > +  1, /* vec_unalign_store_cost  */
    > +  1, /* vec_store_cost  */
    >    2, /* cond_taken_branch_cost  */
    >    1  /* cond_not_taken_branch_cost  */
    >  };
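
For readers wondering how lower vector costs can expose more vectorization, here is a rough sketch of the kind of comparison involved. It is a toy model only, not GCC's actual vectorizer logic, which also accounts for prologue/epilogue, peeling and other overheads. The struct toy_vector_cost, the toy_* helpers, the loop shape (load, permute, FP op, store) and the vectorization factor of 2 are all assumptions made for illustration; only the numeric costs come from the diff above.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-in for GCC's cpu_vector_cost.
   Only the fields used by this toy model are kept.  */
struct toy_vector_cost
{
  int scalar_fp_stmt_cost;
  int scalar_load_cost;
  int scalar_store_cost;
  int vec_fp_stmt_cost;
  int vec_permute_cost;
  int vec_align_load_cost;
  int vec_store_cost;
};

/* Cost values copied from the diff: before and after the patch.  */
static const struct toy_vector_cost tx2_old = { 6, 4, 1, 6, 10, 8, 4 };
static const struct toy_vector_cost tx2_new = { 6, 4, 1, 5, 10, 4, 1 };

/* Cost of VF scalar iterations of an assumed loop body:
   one load, one FP statement, one store.  */
static int
toy_scalar_cost (const struct toy_vector_cost *c, int vf)
{
  assert (vf > 0);
  return vf * (c->scalar_load_cost
	       + c->scalar_fp_stmt_cost
	       + c->scalar_store_cost);
}

/* Cost of one vector iteration of the same assumed loop, which
   additionally needs one permute when vectorized.  */
static int
toy_vector_cost_of (const struct toy_vector_cost *c)
{
  return (c->vec_align_load_cost
	  + c->vec_permute_cost
	  + c->vec_fp_stmt_cost
	  + c->vec_store_cost);
}

/* Nonzero if the toy model considers the vector form cheaper.  */
static int
toy_vectorize_p (const struct toy_vector_cost *c, int vf)
{
  return toy_vector_cost_of (c) < toy_scalar_cost (c, vf);
}
```

In this toy model, with a vectorization factor of 2, toy_vectorize_p rejects the assumed loop under the old costs and accepts it under the new ones, because the cheaper vector loads, stores and FP statements outweigh the unchanged permute cost.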
