On 12/29/22 07:50, Philipp Tomsich wrote:
We have two issues around min/max here:
1. That it doesn't apply to the SImode abs case (which is due to
expand_abs_nojump() blindly testing for the current mode in smax_optab).
Mode testing is inherent in the optab query interface.
2. That we have to reduce the number of extensions to the least amount.
Yes. I'd already indicated to Raphael that y'all were poking in this
space and that I consider the two approaches complementary.
Essentially we want to reduce unnecessary extensions by type adjustment
in gimple when we can prove that safe while at the same time we want the
RTL bits to generate good code if we're unable to adjust the types.
The above addresses expand_abs_nojump(), but makes the general solution
harder as the middle-end needs to know there is no native SImode min/max
available.
I think we can probably deal with this without too much trouble. We
just need a reasonable API to ask "is this optab supported directly by
the target, or is it expanded by the target". I can envision a couple
ways to do that, one of which would be to query the rtx cost. If the
resultant cost is > COSTS_N_INSNS (1), then its synthesized, otherwise
it's native.
We still plan (proof-of-concept works, but a final patch will likely not
be ready before very late in January) to submit a patch to improve the
expansion of MIN_EXPR/MAX_EXPR that utilizes the type-precision and
value-ranges to not even create the sign-extensions in the first place.
Which would be fantastic.
If we do the above, the middle-end will blindly emit this sequence with
the 2 sign-extensions — which may or may not be eliminated later by
combining with a w-form.
I think the question becomes whether or not we think the work you're
doing would likely subsume what Raphael has done.
Your work and Raphael's work both target the gimple->RTL expansion
phase. Yours tries to adjust types so that we have fewer extensions.
Raphael's work adjusts how we expand when we're asked for a 32bit min/max.
Those seem complementary to me -- if you're unable to adjust the types,
then go ahead and ask for that 32bit min/max and let Raphael's expander
generate reasonable code for it. If there's an API that allows you to
query for native vs synthesized support, then you should have the
backend info necessary to drive your type adjustment work.
Note that, if we decide to go ahead with using this as a temporary
solution until our change is ready, you'll also need to add a cost for
the SImode max.
Agreed. I haven't exposed Raphael to rtx_cost yet, but now seems like a
good a time as any.
Raphael, you're going to want to look in riscv.cc::riscv_rtx_costs.
We'll want a case for MIN/MAX. For TARGET_ZBB and word sized
operations, the cost is probably COSTS_N_INSNS (1), for TARGET_ZBB a
subword operation (ie SImode for TARGET_64BIT) it'd be COSTS_N_INSNS
(2). In both of those cases I think we return true.
For !TARGET_ZBB, COSTS_N_INSNS (N) where N is the number of instructions
to implement a min/max without TARGET_ZBB would be the right value. I
just don't know it offhand.
Jeff