On Sat, May 11, 2024 at 12:32 AM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 5/7/24 11:17 PM, Christoph Müllner wrote:
> > The RISC-V cpymemsi expansion is called whenever the by-pieces
> > infrastructure will not take care of the builtin expansion.
> > The code emitted by the by-pieces infrastructure may include
> > unaligned accesses if riscv_slow_unaligned_access_p
> > is false.
> >
> > The RISC-V cpymemsi expansion is handled via riscv_expand_block_move().
> > The current implementation of this function does not check
> > riscv_slow_unaligned_access_p and never emits unaligned accesses.
> >
> > Since by-pieces emits unaligned accesses, it is reasonable for the
> > cpymemsi expansion to behave the same way. That is what this patch
> > does.
> >
> > The patch checks riscv_slow_unaligned_access_p at the entry and sets
> > the allowed alignment accordingly. This alignment is then propagated
> > down to the routines that emit the actual instructions.
> >
> > The changes introduced by this patch can be seen in the adjustments
> > of the cpymem tests.
> >
> > gcc/ChangeLog:
> >
> >       * config/riscv/riscv-string.cc (riscv_block_move_straight): Add
> >       parameter align.
> >       (riscv_adjust_block_mem): Replace parameter length by align.
> >       (riscv_block_move_loop): Add parameter align.
> >       (riscv_expand_block_move_scalar): Set alignment properly if the
> >       target has fast unaligned access.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/riscv/cpymem-32-ooo.c: Adjust for unaligned access.
> >       * gcc.target/riscv/cpymem-64-ooo.c: Likewise.
> Mostly ok.  One concern noted below.
>
>
> >
> > Signed-off-by: Christoph Müllner <christoph.muell...@vrull.eu>
> > ---
> >   gcc/config/riscv/riscv-string.cc              | 53 +++++++++++--------
> >   .../gcc.target/riscv/cpymem-32-ooo.c          | 20 +++++--
> >   .../gcc.target/riscv/cpymem-64-ooo.c          | 14 ++++-
> >   3 files changed, 59 insertions(+), 28 deletions(-)
> >
> > @@ -730,8 +732,16 @@ riscv_expand_block_move_scalar (rtx dest, rtx src, rtx 
> > length)
> >     unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
> >     unsigned HOST_WIDE_INT factor, align;
> >
> > -  align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
> > -  factor = BITS_PER_WORD / align;
> > +  if (riscv_slow_unaligned_access_p)
> > +    {
> > +      align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
> > +      factor = BITS_PER_WORD / align;
> > +    }
> > +  else
> > +    {
> > +      align = hwi_length * BITS_PER_UNIT;
> > +      factor = 1;
> > +    }
> Not sure why you're using hwi_length here.  That's a property of the
> host, not the target.  ISTM you wanted BITS_PER_WORD here to encourage
> word sized moves irrespective of alignment.

We set 'align' here to pretend proper alignment, which forces unaligned
accesses (where needed).
'hwi_length' is defined above as:
  unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
So it is not a host property, but the number of bytes to copy.

Setting 'align' to BITS_PER_WORD has the same effect and is indeed the
better choice.
I'll also add the comment "Pretend alignment" to make readers aware
that we ignore the actual alignment.

> OK with that change after a fresh rounding of testing.

Pushed after adjusting as stated above and retesting:
* rv32 & rv64: GCC regression tests
* rv64: CPU 2017 intrate
