Re: [PATCH] RISC-V: Fix wrong tune parameters on int_div

Andrew Waterman Fri, 27 Oct 2023 10:22:27 -0700

On Fri, Oct 27, 2023 at 6:55 AM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 10/27/23 01:49, Robin Dapp wrote:
> >> @@ -346,7 +346,7 @@ static const struct riscv_tune_param rocket_tune_info = {
> >>     {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},  /* fp_mul */
> >>     {COSTS_N_INSNS (20), COSTS_N_INSNS (20)},        /* fp_div */
> >>     {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* int_mul */
> >> -  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},   /* int_div */
> >> +  {COSTS_N_INSNS (33), COSTS_N_INSNS (65)}, /* int_div */
> >>     1,                                               /* issue_rate */
> >>     3,                                               /* branch_cost */
> >>     5,                                               /* memory_cost */
> >> @@ -361,7 +361,7 @@ static const struct riscv_tune_param sifive_7_tune_info = {
> >>     {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},  /* fp_mul */
> >>     {COSTS_N_INSNS (20), COSTS_N_INSNS (20)},        /* fp_div */
> >>     {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* int_mul */
> >> -  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},   /* int_div */
> >> +  {COSTS_N_INSNS (33), COSTS_N_INSNS (65)}, /* int_div */
> >>     2,                                               /* issue_rate */
> >>     4,                                               /* branch_cost */
> >>     3,                                               /* memory_cost */
> >> @@ -376,7 +376,7 @@ static const struct riscv_tune_param thead_c906_tune_info = {
> >>     {COSTS_N_INSNS (4), COSTS_N_INSNS (5)}, /* fp_mul */
> >>     {COSTS_N_INSNS (20), COSTS_N_INSNS (20)}, /* fp_div */
> >>     {COSTS_N_INSNS (4), COSTS_N_INSNS (4)}, /* int_mul */
> >> -  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)}, /* int_div */
> >> +  {COSTS_N_INSNS (18), COSTS_N_INSNS (34)}, /* int_div */
> >>     1,            /* issue_rate */
> >>     3,            /* branch_cost */
> >>     5,            /* memory_cost */
> >
> > Instruction costs don't really correspond to latencies even though
> > sometimes they are used as if they were.  I'm a bit wary of using
> > e.g. 65 which would disparage each use of an integer division inside
> > a sequence.
> >
> > Could you check which costs we need in order to still emit your wanted
> > sequence?  Maybe we can use values a bit lower than yours and still
> > get the proper code.  Where is the decision being made actually?
> The main use of costing of a div/mod instruction is to guide the
> reciprocal division code when dividing by a constant.  In that context
> we're comparing costs against a sequence of multiplies, shifts, add/sub
> insns which are almost always costed by their latency.  So using latency
> for division is a reasonable place to start.
>
> The other thing that might be worth investigating for those processors
> would be to set "use_divmod_expansion" in the cost structure.  I've
> heard talk of fusing div/mod into divmod, though I'm not aware of any
> part implementing that fusion


I'm also unaware of existing implementations that fuse these
operations; div + mul + sub is probably best for most uarches...

> (from a prior life, that would seem to
> require a 2nd output port on the integer unit which could be highly
> undesirable).

...but it can be done more cheaply than this, so I wouldn't foreclose
on the possibility.  Nevertheless, future work, as you say.
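
For reference, the unfused div + mul + sub sequence mentioned above can be
sketched in C as follows. This is only an illustration of the idea: the
struct and function names are hypothetical and are not GCC or RISC-V
identifiers, and the sketch assumes a nonzero divisor and no
INT64_MIN / -1 overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type, not part of GCC.  */
struct divmod_result
{
  int64_t quot;
  int64_t rem;
};

/* Compute quotient and remainder with one divide, one multiply and one
   subtract, instead of issuing separate div and rem instructions.
   Assumes b != 0 and that a / b does not overflow.  */
static struct divmod_result
divmod_via_mul_sub (int64_t a, int64_t b)
{
  struct divmod_result r;

  assert (b != 0);
  r.quot = a / b;          /* div */
  r.rem = a - r.quot * b;  /* mul + sub recover the remainder */
  return r;
}
```

Because C division truncates toward zero, the remainder recovered this way
has the same sign as the dividend, matching what a separate remainder
operation would return.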

> Anyway, this could be a followup item for Yangyu if it
> looks profitable.
>
> jeff
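
To make the reciprocal-division point concrete, here is a rough sketch of
the kind of comparison being described. It is a simplified model under
stated assumptions, not the actual middle-end logic: the helper names are
hypothetical, the cost of one instruction for a shift/add/sub is assumed,
and the operation counts passed in are illustrative. Only COSTS_N_INSNS
and the int_mul and int_div values come from the code and patch quoted
above.

```c
#include <stdint.h>

/* Same definition as in GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

enum
{
  COST_INT_MUL = COSTS_N_INSNS (4),  /* int_mul in the quoted tables */
  COST_SIMPLE = COSTS_N_INSNS (1),   /* assumed cost of shift/add/sub */
  COST_DIV_OLD = COSTS_N_INSNS (6),  /* int_div before the patch */
  COST_DIV_NEW = COSTS_N_INSNS (33)  /* 32-bit int_div for rocket after it */
};

/* x / 10 for 32-bit unsigned x via a multiply by ceil (2^35 / 10)
   followed by a right shift of 35.  */
static uint32_t
udiv10_by_reciprocal (uint32_t x)
{
  return (uint32_t) (((uint64_t) x * UINT64_C (0xCCCCCCCD)) >> 35);
}

/* Hypothetical cost of a replacement sequence made of n_mul multiplies
   and n_simple shift/add/sub instructions.  */
static int
seq_cost (int n_mul, int n_simple)
{
  return n_mul * COST_INT_MUL + n_simple * COST_SIMPLE;
}

/* Simplified stand-in for the decision: use the multiply sequence only
   when it is cheaper than a single divide.  */
static int
prefer_reciprocal (int sequence_cost, int div_cost)
{
  return sequence_cost < div_cost;
}
```

With these numbers, a sequence of one multiply plus four simple
instructions costs the equivalent of 8 instructions, so it loses against
an int_div cost of 6 but wins against a cost of 33. That is the sense in
which a division cost close to the real latency steers the expansion
toward the multiply sequence.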
