On Sun, 19 Nov 2023, Jeff Law wrote: > As I suspect you know a big part of the problem here is that BRANCH_COST and > rtx_cost don't have any common scale and thus trying to compare BRANCH_COST to > RTX_COST doesn't have well defined meaning.
We do have preexisting places using COSTS_N_INSNS (BRANCH_COST ()) though, as documented in ifcvt.cc: ??? Actually, instead of the branch instruction costs we might want to use COSTS_N_INSNS (BRANCH_COST ()) as in other places. */ so it seems the right direction, and given that we expose this measure to the user (and at the very least GCC developers implementing new tuning microarchitectures) I think it's the only sane way to do branch costing: define the measure in terms of how many ordinary ALU instructions a branch is statistically equivalent to. We may have to consider whether we want/need higher resolution here, effectively branch costs including a fractional part. > That hasn't kept us from trying to do precisely that and the result has always > been less than satisfactory. You're introducing more, but I don't think > there's a reasonable way out of this mess at this point. Ack. > > FWIW I don't understand why the test cases absolutely HAD to have such > > overlong names guaranteed to exceed our 80 column limit in any context. > > It's such a pain to handle. > I dislike the long names as well. I nearly changed them myself as part of the > eswin submission, but that seemed a bit gratituous to me so I left them as-is. > > If you wanted to rename them, be my guest, consider it pre-approved ;-) Ack. > WRT the extraneous zero-extension. Isn't that arguably a bug in the scc > expander for risc-v? Fixing that isn't a prerequisite here, but it probably > worth a bit of someone's time. I've looked at it already and it's the middle end that ends up with the zero-extension, specifically `convert_move' invoked from `emit_cstore' down the call to `noce_try_store_flag_mask', to widen the output from `cstoredi4', so I don't think we can do anything in the backend to prevent it from happening. And neither I think we can do anything useful about `cstoredi4' having a SImode output, as it's a pattern matched by name rather than RTX, so we can't provide variants having a SImode and a DImode output each both at a time, as that would cause a name clash. What we could do is adding new named patterns with the output mode spelt out explicitly, such as `cstoresidi4', `cstoredidi4', etc., but that would be quite a big rewrite. Maybe worth doing, maybe not, I don't know. For the time being the hack I have made does the trick. Maciej