On 10/18/24 7:12 AM, Craig Blackmore wrote:
`expand_vec_setmem` only generated vectorized memset if it fitted into a
single vector store.  Extend it to generate a loop for longer and
unknown lengths.
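
For readers who do not have the patch open, the loop shape being described is roughly a strip-mined store loop. The following is only a minimal sketch in portable C, not the RTL that `expand_vec_setmem` emits. `SKETCH_VLMAX_BYTES`, `sketch_vsetvl` and `sketch_setmem` are hypothetical names invented for illustration, and the inner byte loop stands in for a single vector store.

```c
#include <stddef.h>

/* Hypothetical stand-in for the number of bytes one vector store can
   cover at the chosen LMUL; the value is an assumption, not taken from
   the patch.  */
#define SKETCH_VLMAX_BYTES 64

/* Simplified model of vsetvli: how many bytes to handle this iteration.
   Real hardware is allowed to choose differently near the tail.  */
static size_t
sketch_vsetvl (size_t remaining)
{
  return remaining < SKETCH_VLMAX_BYTES ? remaining : SKETCH_VLMAX_BYTES;
}

/* Strip-mined memset: works for any length, including lengths that are
   only known at run time.  */
void *
sketch_setmem (void *dst, int c, size_t len)
{
  unsigned char *p = dst;

  while (len > 0)
    {
      size_t vl = sketch_vsetvl (len);

      /* Stands in for one vector store of VL copies of the byte.  */
      for (size_t i = 0; i < vl; i++)
        p[i] = (unsigned char) c;

      p += vl;
      len -= vl;
    }

  return dst;
}
```

When `len` fits in one store, the loop body runs exactly once, which corresponds to the single-store case that was already handled.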

The test cases now use -O1 so that they are not sensitive to scheduling.

gcc/ChangeLog:

        * config/riscv/riscv-string.cc
        (use_vector_stringop_p): Add comment.
        (expand_vec_setmem): Use use_vector_stringop_p instead of
        check_vectorise_memory_operation.  Add loop generation.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/base/setmem-1.c: Use -O1.  Expect a loop
        instead of a libcall.  Add test for unknown length.
        * gcc.target/riscv/rvv/base/setmem-2.c: Likewise.
        * gcc.target/riscv/rvv/base/setmem-3.c: Likewise and expect smaller
        lmul.

So why handle memset differently from the other mem* routines, where we limit ourselves to what we can handle without needing loops?

My suspicion is that once we're moving enough data that we can't do it with a single big lmul store, calling out to the library variant probably isn't a big deal for memset. Do you have data that suggests otherwise?

Jeff

