On 11/4/24 6:09 AM, Craig Blackmore wrote:
For targets with fast unaligned access, the by-pieces infrastructure uses pieces of up to UNITS_PER_WORD in size, resulting in more store instructions than needed.  For example, gcc.target/riscv/rvv/base/setmem-2.c:f1 (a C sketch of the test follows the assembly below), built with
`-O3 -march=rv64gcv -mtune=thead-c906`, compiles to:
```
f1:
         vsetivli        zero,8,e8,mf2,ta,ma
         vmv.v.x v1,a1
         vsetivli        zero,0,e32,mf2,ta,ma
         sb      a1,14(a0)
         vmv.x.s a4,v1
         vsetivli        zero,8,e16,m1,ta,ma
         vmv.x.s a5,v1
         vse8.v  v1,0(a0)
         sw      a4,8(a0)
         sh      a5,12(a0)
         ret
```
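
For context, f1 in setmem-2.c is presumably a small fixed-size memset along these lines; this is a reconstruction inferred from the assembly (15 bytes matches the sb offsets 0..14 and the `vsetivli zero,15`), not the exact testsuite source:
```
char *
f1 (char *a, int b)
{
  /* 15-byte memset: small enough to be expanded inline rather than
     calling memset, large enough to show the difference between
     scalar by-pieces stores and a single RVV store.  */
  return __builtin_memset (a, b, 15);
}
```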

The slow unaligned access version, built with `-O3 -march=rv64gcv`, uses 15 sb instructions:
```
f1:
         sb      a1,0(a0)
         sb      a1,1(a0)
         sb      a1,2(a0)
         sb      a1,3(a0)
         sb      a1,4(a0)
         sb      a1,5(a0)
         sb      a1,6(a0)
         sb      a1,7(a0)
         sb      a1,8(a0)
         sb      a1,9(a0)
         sb      a1,10(a0)
         sb      a1,11(a0)
         sb      a1,12(a0)
         sb      a1,13(a0)
         sb      a1,14(a0)
         ret
```

After this patch, the following is generated in both cases:
```
f1:
         vsetivli        zero,15,e8,m1,ta,ma
         vmv.v.x v1,a1
         vse8.v  v1,0(a0)
         ret
```

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_use_by_pieces_infrastructure_p):
        New function.
        (TARGET_USE_BY_PIECES_INFRASTRUCTURE_P): Define.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/pr113469.c: Expect mf2 setmem.
        * gcc.target/riscv/rvv/base/setmem-2.c: Update f1 to expect
        straight-line vector memset.
        * gcc.target/riscv/rvv/base/setmem-3.c: Likewise.
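
The patch body is not quoted above, so purely as an illustration of the shape of the change: the gcc/ChangeLog entry describes a new target hook roughly along the lines of the sketch below (the function body and the SET_BY_PIECES/CLEAR_BY_PIECES condition are assumptions, not the committed code).  The idea is to decline the by-pieces path for memset-style expansions when RVV is enabled, so the vector setmem expander is used instead, and to defer to the generic default otherwise.
```
/* Sketch only -- assumes the usual riscv.cc includes; the committed
   hook may use different conditions.  */
static bool
riscv_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
                                      unsigned int align,
                                      enum by_pieces_operation op,
                                      bool speed_p)
{
  /* When vector is available, let the target setmem expander handle
     memset-style expansions instead of emitting scalar stores.  */
  if (TARGET_VECTOR && (op == SET_BY_PIECES || op == CLEAR_BY_PIECES))
    return false;
  return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
}

#undef TARGET_USE_BY_PIECES_INFRASTRUCTURE_P
#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
  riscv_use_by_pieces_infrastructure_p
```
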
Pushed to the trunk.

Now I just need to make a trivial adjustment to my glibc patches and start blasting them out :-)

jeff
