Re: [PATCHv4, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

Jeff Law Fri, 23 Aug 2024 06:37:29 -0700



On 8/22/24 9:02 PM, HAO CHEN GUI wrote:

Hi Hongtao,

On 2024/8/23 9:47, Hongtao Liu wrote:

On Thu, Aug 22, 2024 at 4:06 PM HAO CHEN GUI <guih...@linux.ibm.com> wrote:


Hi Hongtao,

On 2024/8/21 11:21, Hongtao Liu wrote:

r15-3058-gbb42c551905024 supports a const0 operand for movv16qi. Please
rebase your patch and see if there are still regressions.


There are still regressions. The patch enables V16QI const0 stores, but
it also enables V8QI const0 stores. Vector modes are preferred over
scalar modes, so V8QI is used for an 8-byte memory clear instead of
DI. That's sub-optimal.

Could we check if mode_size is greater than HOST_BITS_PER_WIDE_INT?

Not sure if all targets prefer it. Richard & Jeff, what's your opinion?

Sorry, I haven't been following. That doesn't seem like a good test at the surface (why would HOST_BITS_PER_WIDE_INT matter here, that's a property of the host, not the target).

Additionally, selection of the "optimal" mode may be impossible as there's just not going to be enough context. For a given target there may be cases where something like V16QI is good and for the same target cases where doing a series of DI accesses would be better.

So we have to pick sensible modes and give the targets ways to turn the knobs to hopefully get better code depending on the desired behavior of each (sub)target.
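
To make the "sensible default plus target knob" idea concrete, here is a minimal toy sketch in C. It is not GCC's by-pieces code: the `toy_*` names, the mode table, and the veto callback are all hypothetical and exist only for illustration. The generic picker prefers the widest mode that fits and breaks ties in favor of vector modes, while a per-target callback can reject modes it considers a poor choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for machine modes.  These are NOT GCC's real
   machine_mode values; they only model size and vector-ness.  */
enum toy_mode
{
  TOY_QI, TOY_HI, TOY_SI, TOY_DI, TOY_V8QI, TOY_V16QI, TOY_NMODES
};

struct toy_mode_info
{
  const char *name;
  unsigned size;      /* Size in bytes.  */
  bool is_vector;
};

static const struct toy_mode_info toy_modes[TOY_NMODES] = {
  [TOY_QI]    = { "QI",    1,  false },
  [TOY_HI]    = { "HI",    2,  false },
  [TOY_SI]    = { "SI",    4,  false },
  [TOY_DI]    = { "DI",    8,  false },
  [TOY_V8QI]  = { "V8QI",  8,  true  },
  [TOY_V16QI] = { "V16QI", 16, true  },
};

/* Hypothetical target knob: return false to veto MODE for clears.  */
typedef bool (*toy_clear_mode_ok_fn) (enum toy_mode mode);

/* Default knob: accept every mode.  */
static bool
toy_generic_ok (enum toy_mode mode)
{
  (void) mode;
  return true;
}

/* Knob for an imagined target where V8QI zero stores are a bad idea.  */
static bool
toy_no_v8qi_ok (enum toy_mode mode)
{
  return mode != TOY_V8QI;
}

/* Pick the widest accepted mode of at most LEN bytes (LEN >= 1 is
   assumed), preferring a vector mode over a scalar one of equal size.
   Falls back to TOY_QI if nothing else qualifies.  */
static enum toy_mode
toy_pick_clear_mode (unsigned len, toy_clear_mode_ok_fn ok)
{
  assert (ok != NULL);
  enum toy_mode best = TOY_QI;
  for (int i = 0; i < TOY_NMODES; i++)
    {
      enum toy_mode m = (enum toy_mode) i;
      const struct toy_mode_info *mi = &toy_modes[m];
      const struct toy_mode_info *bi = &toy_modes[best];
      if (mi->size > len || !ok (m))
        continue;
      if (mi->size > bi->size
          || (mi->size == bi->size && mi->is_vector && !bi->is_vector))
        best = m;
    }
  return best;
}
```

In this toy model, an 8-byte clear with `toy_generic_ok` selects `TOY_V8QI`, while the same clear with `toy_no_v8qi_ok` selects `TOY_DI`; a 16-byte clear still selects `TOY_V16QI` under both. The generic code never needs to know why a given target dislikes a mode.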


So how's that for a non-answer?  :-)


IMHO, could we disable it in the predicate, or convert it to a DI mode
store, if a V8QI const0 store is sub-optimal on i386?

I'd look for ways to allow the x86 port to control behavior. Presumably the problem is the move-by-pieces code is emitting stores directly rather than going through an expander?



Jeff
