On Thu, 3 Oct 2024, Thomas Schwinge wrote:

> Hi!
>
> On 2024-09-06T11:30:06+0200, Richard Biener <rguent...@suse.de> wrote:
> > On Thu, 5 Sep 2024, Richard Biener wrote:
> >> The following enables single-lane loop SLP discovery for non-grouped stores
> >> and adjusts vectorizable_store to properly handle those.
>
> > I have now pushed this as r15-3509-gd34cda72098867
>
> >> --- a/gcc/testsuite/gcc.dg/vect/slp-26.c
> >> +++ b/gcc/testsuite/gcc.dg/vect/slp-26.c
> >> @@ -50,4 +50,5 @@ int main (void)
> >>  /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } } */
> >>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } */
> >>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { ! { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } } */
> >> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { mips_msa || { amdgcn-*-* || { riscv_v || loongarch_sx } } } } } } */
> >> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { mips_msa || { amdgcn-*-* || loongarch_sx } } } } } */
> >> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target riscv_v } } } */
>
> For '--target=amdgcn-amdhsa' (tested '-march=gfx908', '-march=gfx1100'),
> I see:
>
>     PASS: gcc.dg/vect/slp-26.c (test for excess errors)
>     PASS: gcc.dg/vect/slp-26.c execution test
>     PASS: gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorized 1 loops" 1
>     [-PASS:-]{+FAIL:+} gcc.dg/vect/slp-26.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1
>
>     gcc.dg/vect/slp-26.c: pattern found 2 times
>
> ..., so I suppose I'll apply the same change to 'amdgcn-*-*' as you did
> to 'riscv_v'?

I guess yes, I don't remember exactly the reason but IIRC it's about
the unsigned division which gcn might also be able to do - the 32817 value
is explicitly excluded from pattern recognition.  We don't have
an effective target for unsigned [short] integer division.

Richard.

>
> Grüße
>  Thomas
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)