On Sun, 19 Nov 2023, Jeff Law wrote:

> As I suspect you know a big part of the problem here is that BRANCH_COST and
> rtx_cost don't have any common scale and thus trying to compare BRANCH_COST to
> RTX_COST doesn't have well defined meaning.

 We do have preexisting places using COSTS_N_INSNS (BRANCH_COST ()) 
though, as documented in ifcvt.cc:

     ??? Actually, instead of the branch instruction costs we might want
     to use COSTS_N_INSNS (BRANCH_COST ()) as in other places.  */

so it seems the right direction, and given that we expose this measure to 
the user (and at the very least GCC developers implementing new tuning 
microarchitectures) I think it's the only sane way to do branch costing: 
define the measure in terms of how many ordinary ALU instructions a branch 
is statistically equivalent to.

 We may have to consider whether we want/need higher resolution here, 
effectively branch costs including a fractional part.

> That hasn't kept us from trying to do precisely that and the result has always
> been less than satisfactory.  You're introducing more, but I don't think
> there's a reasonable way out of this mess at this point.

 Ack.

> > FWIW I don't understand why the test cases absolutely HAD to have such
> > overlong names guaranteed to exceed our 80 column limit in any context.
> > It's such a pain to handle.
> I dislike the long names as well.  I nearly changed them myself as part of the
> eswin submission, but that seemed a bit gratituous to me so I left them as-is.
> 
> If you wanted to rename them, be my guest, consider it pre-approved ;-)

 Ack.

> WRT the extraneous zero-extension.  Isn't that arguably a bug in the scc
> expander for risc-v?  Fixing that isn't a prerequisite here, but it probably
> worth a bit of someone's time.

 I've looked at it already and it's the middle end that ends up with the 
zero-extension, specifically `convert_move' invoked from `emit_cstore' 
down the call to `noce_try_store_flag_mask', to widen the output from 
`cstoredi4', so I don't think we can do anything in the backend to prevent 
it from happening.  And neither I think we can do anything useful about 
`cstoredi4' having a SImode output, as it's a pattern matched by name 
rather than RTX, so we can't provide variants having a SImode and a DImode 
output each both at a time, as that would cause a name clash.

 What we could do is adding new named patterns with the output mode spelt 
out explicitly, such as `cstoresidi4', `cstoredidi4', etc., but that would 
be quite a big rewrite.  Maybe worth doing, maybe not, I don't know.  For 
the time being the hack I have made does the trick.

  Maciej

Reply via email to