On Thu, Apr 24, 2025 at 12:50 AM Jan Hubicka <hubi...@ucw.cz> wrote: > > > In some benchmark, I notice stv failed due to cost unprofitable, but the > > igain > > is inside the loop, but sse<->integer conversion is outside the loop, > > current cost > > model doesn't consider the frequency of those gain/cost. > > The patch weights those cost with frequency just like LRA does. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > Ok for GCC16? > > > > gcc/ChangeLog: > > > > * config/i386/i386-features.cc (scalar_chain::mark_dual_mode_def): > > (general_scalar_chain::compute_convert_gain): > > --- > > gcc/config/i386/i386-features.cc | 9 +++++++-- > > 1 file changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/gcc/config/i386/i386-features.cc > > b/gcc/config/i386/i386-features.cc > > index c35ac24fd8a..ae0844a70c2 100644 > > --- a/gcc/config/i386/i386-features.cc > > +++ b/gcc/config/i386/i386-features.cc > > @@ -337,18 +337,20 @@ scalar_chain::mark_dual_mode_def (df_ref def) > > /* Record the def/insn pair so we can later efficiently iterate over > > the defs to convert on insns not in the chain. */ > > bool reg_new = bitmap_set_bit (defs_conv, DF_REF_REGNO (def)); > > + unsigned frequency > > + = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (DF_REF_INSN (def))); > > I am generally trying to get rid of remaing uses of REG_FREQ since the > 10000 based fixed point arithmetics iot always working that well. > > You can do the sums in profile_count type (doing something reasonable > when count is uninitialized) and then convert it to sreal for the final > heuristics. Thanks for the suggestion, let me try. > > Typically such code also wants skip scaling by count when optimizing for > size (since in this case we want to count statically). Not sure how > important it is for vector code but I suppose it can happen. > > Honza > > if (!bitmap_bit_p (insns, DF_REF_INSN_UID (def))) > > { > > if (!bitmap_set_bit (insns_conv, DF_REF_INSN_UID (def)) > > && !reg_new) > > return; > > - n_integer_to_sse++; > > + n_integer_to_sse += frequency; > > } > > else > > { > > if (!reg_new) > > return; > > - n_sse_to_integer++; > > + n_sse_to_integer += frequency; > > } > > > > if (dump_file) > > @@ -556,6 +558,8 @@ general_scalar_chain::compute_convert_gain () > > rtx src = SET_SRC (def_set); > > rtx dst = SET_DEST (def_set); > > int igain = 0; > > + unsigned frequency > > + = REG_FREQ_FROM_BB (BLOCK_FOR_INSN (insn)); > > > > if (REG_P (src) && REG_P (dst)) > > igain += 2 * m - ix86_cost->xmm_move; > > @@ -755,6 +759,7 @@ general_scalar_chain::compute_convert_gain () > > } > > } > > > > + igain *= frequency; > > if (igain != 0 && dump_file) > > { > > fprintf (dump_file, " Instruction gain %d for ", igain); > > -- > > 2.34.1 > >
-- BR, Hongtao