On 10/18/24 7:12 AM, Craig Blackmore wrote:
Unlike the other vector string ops, expand_block_move was using max LMUL
m8 regardless of TARGET_MAX_LMUL.
Yea. In retrospect I should have caught this when we integrated the
original patches.
The check for whether to generate inline vector code for movmem has been
moved from movmem<mode> to riscv_vector::expand_block_move to avoid
maintaining multiple versions of similar logic. They already differed
on the minimum length for which they would generate vector code. Now
that the expand_block_move value is used, movmem will be generated for
smaller lengths.
ACK.
Limiting memcpy to m1 caused some memcpy loops to be generated in
the calling convention tests which makes it awkward to add suitable scan
assembler tests checking the return value being set, so
-mrvv-max-lmul=m8 has been added to these tests. Other tests have been
adjusted to expect the new memcpy m1 generation where reasonably
straight-forward, otherwise -mrvv-max-lmul=m8 has been added.
Seems quite reasonable.
pr111720-[0-9].c regressed because a memcpy loop is generated instead
of straight-line. This reveals an existing issue where a redundant
straight-line memcpy gets eliminated but a memcpy loop does not
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117205).
Hard to argue with Richi's comment though; opaque intrinsics are going
to be tough for the gimple phases to do much with. My hope is that not
too many developers make heavy use of the intrinsics interface. I
wouldn't consider that a major problem.
I have added -mrvv-max-lmul=m8 to pr111720-[0-9].c so that we continue
to test the elimination of straight-line memcpy.
ACK.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (get_lmul_mode): New prototype.
(expand_block_move): Add bool parameter for movmem_p.
* config/riscv/riscv-string.cc (riscv_expand_block_move_scalar):
Pass movmem_p as false to riscv_vector::expand_block_move.
(expand_block_move): Add movmem_p parameter. Return false if
loop needed and movmem_p is true. Respect TARGET_MAX_LMUL.
* config/riscv/riscv-v.cc (get_lmul_mode): New function.
* config/riscv/riscv.md (movmem<mode>): Move checking for
whether to generate inline vector code to
riscv_vector::expand_block_move by passing movmem_p as true.
Generally OK as well.
Robin is on PTO for the next few days. But I take his comment as a
request for a minor cleanup/enhancement that could be used in the rest
of the inline str*/mem* expanders, not as a "this has to be fixed to
move forward".
So I'll go ahead and push this to the trunk now. Please review Robin's
comment on this patch and take appropriate action.
Thanks,
jeff