On Thu, 3 Oct 2024, Thomas Schwinge wrote:

> Hi!
> 
> On 2024-09-06T11:30:06+0200, Richard Biener <rguent...@suse.de> wrote:
> > On Thu, 5 Sep 2024, Richard Biener wrote:
> >> The following enables single-lane loop SLP discovery for non-grouped stores
> >> and adjusts vectorizable_store to properly handle those.
> 
> > I have now pushed this as r15-3509-gd34cda72098867
> 
> >> --- a/gcc/testsuite/gcc.dg/vect/slp-26.c
> >> +++ b/gcc/testsuite/gcc.dg/vect/slp-26.c
> >> @@ -50,4 +50,5 @@ int main (void)
> >>  /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } } */
> >>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } */
> >>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { ! { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } } */
> >> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } */
> >> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { mips_msa || { amdgcn-*-* || loongarch_sx } } } } } */
> >> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target riscv_v } } } */
> 
> For '--target=amdgcn-amdhsa' (tested '-march=gfx908', '-march=gfx1100'),
> I see:
> 
>     PASS: gcc.dg/vect/slp-26.c (test for excess errors)
>     PASS: gcc.dg/vect/slp-26.c execution test
>     PASS: gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorized 1 loops" 1
>     [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> 
>     gcc.dg/vect/slp-26.c: pattern found 2 times
> 
> ..., so I suppose I'll apply the same change to 'amdgcn-*-*' as you did
> to 'riscv_v'?

I guess yes.  I don't remember the exact reason, but IIRC it's about the
unsigned division, which gcn might also be able to do; the 32817
value is explicitly excluded from pattern recognition.  We don't have
an effective target for unsigned [short] integer division.

Richard.

> 
> Grüße
>  Thomas
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
