On 6/6/23 00:47, Richard Biener wrote:
> I wonder if there's some more generic target macro we can key the
> behavior off - SLOW_BYTE_ACCESS isn't a good fit, WORD_REGISTER_OPERATIONS
> is maybe closer but its exact implications are unknown to me. Maybe
> there's something else as well ...

LOAD_EXTEND_OP might help here, at least on some targets. Though not on
x86.

> The point about OPTAB_WIDEN above was that I wonder why we
> extend 'op0' and 'op1' before emitting the binop when we allow WIDEN
> anyway.

Ahh. I misunderstood. However, I think dropping the pre-widening will
result in byte ops on x86 which may not be wise given the partial
register stall problem that exists on some variants.
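
To make the narrow-versus-widened distinction concrete, here is a rough
source-level sketch in C. The function names are made up for
illustration and this is not what the expander does; it only shows that,
for an addition, working in the byte mode and widening first then
truncating produce the same low byte. Which instructions a compiler
actually emits for either form is target dependent.

```c
#include <stdint.h>

/* Add two bytes and keep only the byte-sized result (the "narrow" form). */
uint8_t add_narrow(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Zero-extend both operands, add in a 32-bit mode, truncate at the end. */
uint8_t add_widened(uint8_t a, uint8_t b)
{
    uint32_t wa = a;          /* pre-widen op0 */
    uint32_t wb = b;          /* pre-widen op1 */
    uint32_t sum = wa + wb;   /* the operation happens in the wider mode */
    return (uint8_t)sum;      /* narrow the result back to a byte */
}

/* Exhaustively check that both forms agree for every pair of bytes.
   Returns 1 if they always agree, 0 otherwise. */
int results_agree(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++)
            if (add_narrow((uint8_t)a, (uint8_t)b)
                != add_widened((uint8_t)a, (uint8_t)b))
                return 0;
    return 1;
}
```

Since the results match, the choice between the two forms is about cost
on the target (partial register stalls being one example), not about
correctness.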

> Yes, we want the result in 'mode' (but why? As you say we
> can extend at the end) and there's likely no way to tell expand_simple_binop
> to "expand as needed and not narrow the result". So I wonder if we should
> emulate that somehow (also taking into consideration the compare).

That's what I felt I was starting to build. Essentially looking at
costing (and probably other stuff eventually, like the ability to
compare/branch on narrower modes) to make a determination about whether
or not to do the operations in narrow or wider modes. With the costing
so mucked up on x86 though, I'm hesitant to pursue this path further at
this time.
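
Very roughly, the kind of decision described above could look like the
sketch below. Everything in it is hypothetical: the struct, its fields,
and the function are invented for illustration and are not GCC
interfaces, and the cost model is deliberately simplistic (two operand
extensions for the wide path, one result extension for the narrow path
when the target cannot compare in the narrow mode).

```c
#include <stdbool.h>

/* Hypothetical per-target cost summary; none of these are GCC names. */
struct mode_costs {
    int narrow_op;        /* cost of the binop in the narrow mode */
    int wide_op;          /* cost of the binop in the wider mode */
    int extend;           /* cost of extending one value */
    bool narrow_compare;  /* can the target compare/branch in the narrow mode? */
};

/* Return true if doing the operation in the wider mode looks cheaper. */
bool prefer_wide_mode(const struct mode_costs *c)
{
    /* Wide path: extend both operands, then a single wide operation. */
    int wide_total = 2 * c->extend + c->wide_op;

    /* Narrow path: a single narrow operation, plus one extension of the
       result if the compare has to be done in the wider mode. */
    int narrow_total = c->narrow_op + (c->narrow_compare ? 0 : c->extend);

    return wide_total < narrow_total;
}
```

A scheme like this is only as good as the costs fed into it, which is
the concern with the x86 costing.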
Jeff