On 5/31/2021 6:04 AM, H.J. Lu wrote:
On Sun, May 30, 2021 at 11:49 AM Jeff Law <jeffreya...@gmail.com> wrote:


On 5/11/2021 5:35 PM, H.J. Lu via Gcc-patches wrote:
Add TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE to support
target instructions to duplicate QImode value to TImode/OImode/XImode
value for memset.  Define SCRATCH_SSE_REG as a scratch register for
ix86_gen_memset_value.
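
For readers unfamiliar with the operation being discussed, here is a rough
sketch in portable C of what "duplicating a QImode value" into a wider value
means.  It is not taken from the patch: the helper names
broadcast_byte_u64 and broadcast_byte_16 are hypothetical, and a real target
would presumably use a vector broadcast instruction rather than an integer
multiply.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): replicate one byte into
   every byte of a 64-bit word.  Each partial product is below 256,
   so no carry crosses a byte boundary.  */
static uint64_t
broadcast_byte_u64 (uint8_t c)
{
  return (uint64_t) c * UINT64_C (0x0101010101010101);
}

/* Hypothetical helper (not from the patch): build a 16-byte value,
   the size of TImode, in which every byte equals C.  */
static void
broadcast_byte_16 (uint8_t c, uint8_t out[16])
{
  uint64_t w = broadcast_byte_u64 (c);
  memcpy (out, &w, sizeof w);      /* low 8 bytes */
  memcpy (out + 8, &w, sizeof w);  /* high 8 bytes */
}
```

The same idea extends to 32-byte (OImode) and 64-byte (XImode) values by
repeating the word more times.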

gcc/

       PR middle-end/90773
       * builtins.c (builtin_memset_read_str): Call
       targetm.read_memset_value.
       (builtin_memset_gen_str): Call targetm.gen_memset_value.
       * target.def (read_memset_value): New hook.
       (gen_memset_value): Likewise.
       * targhooks.c: Include "builtins.h".
       (default_read_memset_value): New function.
       (default_gen_memset_value): Likewise.
       * targhooks.h (default_read_memset_value): New prototype.
       (default_gen_memset_value): Likewise.
       * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
       Make it global.
       * config/i386/i386-protos.h (ix86_minimum_incoming_stack_boundary):
       New.
       (ix86_expand_vector_init_duplicate): Likewise.
       * config/i386/i386.c (ix86_minimum_incoming_stack_boundary): Add
       an argument to ignore stack_alignment_estimated.  It is passed
       as false by default.
       (ix86_gen_memset_value_from_prev): New function.
       (ix86_gen_memset_value): Likewise.
       (ix86_read_memset_value): Likewise.
       (TARGET_GEN_MEMSET_VALUE): New.
       (TARGET_READ_MEMSET_VALUE): Likewise.
       * config/i386/i386.h (SCRATCH_SSE_REG): New.
       * doc/tm.texi.in: Add TARGET_READ_MEMSET_VALUE and
       TARGET_GEN_MEMSET_VALUE hooks.
       * doc/tm.texi: Regenerated.

gcc/testsuite/

       PR middle-end/90773
       * gcc.target/i386/pr90773-15.c: New test.
       * gcc.target/i386/pr90773-16.c: Likewise.
       * gcc.target/i386/pr90773-17.c: Likewise.
       * gcc.target/i386/pr90773-18.c: Likewise.
       * gcc.target/i386/pr90773-19.c: Likewise.

Why does this need target hooks?  ISTM the right way to go here is to
just emit the constant load to the target register and let the target
figure out how best to construct the constant into the register.  If
that means load it via QImode and broadcast, that's fine, but I'm not
sure why that's not all implemented in the target files.

I will submit a patch to add optabs instead.

I may be missing something, but I'm not even sure why we need special optabs.

Aren't you just trying to efficiently get a constant element broadcast across an entire vector?

jeff
