On 8/22/24 9:02 PM, HAO CHEN GUI wrote:
Hi Hongtao,
On 2024/8/23 9:47, Hongtao Liu wrote:
On Thu, Aug 22, 2024 at 4:06 PM HAO CHEN GUI <guih...@linux.ibm.com> wrote:
Hi Hongtao,
On 2024/8/21 11:21, Hongtao Liu wrote:
r15-3058-gbb42c551905024 supports a const0 operand for movv16qi; please
rebase your patch and see whether the regressions are still there.
There are still regressions. The patch enables the V16QI const0 store,
but it also enables the V8QI const0 store. Vector modes are preferred
over scalar modes, so V8QI is used for an 8-byte memory clear instead of
DI, which is sub-optimal.
Could we check if mode_size is greater than HOST_BITS_PER_WIDE_INT?
Not sure if all targets prefer it. Richard & Jeff, what's your opinion?
Sorry, I haven't been following. That doesn't seem like a good test on
the surface (why would HOST_BITS_PER_WIDE_INT matter here? That's a
property of the host, not the target).
Additionally, selection of the "optimal" mode may be impossible as
there's just not going to be enough context. For a given target there
may be cases where something like V16QI is good and for the same target
cases where doing a series of DI accesses would be better.
So we have to pick sensible modes and give the targets ways to turn the
knobs to hopefully get better code depending on the desired behavior of
each (sub)target.
So how's that for a non-answer? :-)
IMHO, could we disable it in the predicate, or convert it to a DI mode
store, if the V8QI const0 store is sub-optimal on i386?
I'd look for ways to allow the x86 port to control behavior. Presumably
the problem is the move-by-pieces code is emitting stores directly
rather than going through an expander?
Jeff
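
To make the trade-off discussed above concrete, here is a toy C++ model of
picking a mode for a memory clear. It is only a sketch: ToyMode, ToyTarget,
min_vector_store_bytes and pick_clear_mode are hypothetical names invented
for illustration and are not GCC's by-pieces code or any real target hook.
It assumes the generic code prefers a vector mode over a scalar mode of the
same width, and that a target can veto small vector stores through a knob.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a machine mode; not GCC's machine_mode.
struct ToyMode {
    std::string name;
    std::size_t size_bytes;
    bool is_vector;
};

// Hypothetical per-target knob: the smallest vector mode the target wants
// used for a by-pieces store. 0 means any vector mode is acceptable.
struct ToyTarget {
    std::size_t min_vector_store_bytes;
};

// Pick the widest candidate that fits in `len` bytes. At equal width a
// vector mode wins over a scalar one, unless the target knob rejects it.
// Returns nullptr when no candidate fits.
inline const ToyMode *pick_clear_mode(const std::vector<ToyMode> &candidates,
                                      const ToyTarget &target,
                                      std::size_t len) {
    const ToyMode *best = nullptr;
    for (const ToyMode &m : candidates) {
        if (m.size_bytes > len)
            continue;  // too wide for the remaining bytes
        if (m.is_vector && m.size_bytes < target.min_vector_store_bytes)
            continue;  // target vetoes small vector stores
        bool wider = !best || m.size_bytes > best->size_bytes;
        bool same_width_vector = best && m.size_bytes == best->size_bytes
                                 && m.is_vector && !best->is_vector;
        if (wider || same_width_vector)
            best = &m;
    }
    return best;
}
```

With candidates DI (8 bytes, scalar), V8QI (8 bytes, vector) and V16QI
(16 bytes, vector), a knob of 0 selects V8QI for an 8-byte clear, which is
the behavior reported as a regression. A knob of 16 selects DI for 8 bytes
while still allowing V16QI for 16 bytes. Note that the decision depends only
on properties of the target, never on the host.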