On Tue, 28 Sep 2021, Hongtao Liu wrote:

> On Tue, Sep 28, 2021 at 2:59 PM Richard Biener via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > On Mon, 27 Sep 2021, sunil.k.pandey wrote:
> >
> > > On Linux/x86_64,
> > >
> > > 6390c5047adb75960f86d56582e6322aaa4d9281 is the first bad commit
> > > commit 6390c5047adb75960f86d56582e6322aaa4d9281
> > > Author: Richard Biener <rguent...@suse.de>
> > > Date:   Wed Nov 18 09:36:57 2020 +0100
> > >
> > >     Allow different vector types for stmt groups
> > >
> > > caused
> > >
> > > FAIL: gcc.dg/vect/bb-slp-17.c -flto -ffat-lto-objects
> > > scan-tree-dump-times slp2 "optimized: basic block" 1
> > > FAIL: gcc.dg/vect/bb-slp-17.c scan-tree-dump-times slp2 "optimized: basic
> > > block" 1
> >
> > This shows that it is maybe a bad idea to support V2SImode vectorization
> > with -m32 when we refuse to implement even plus.
> >
> > OTOH it's just the mode that's available, autovectorize_vector_modes
> > doesn't include the corresponding mode but we still pick it up via
> > the related vector mode for group-size == 2.

It looks like we could define the vectorize.related_mode hook to reject
V2SImode when !TARGET_MMX_WITH_SSE - the default implementation just
checks for vector_mode_supported_p.

> > > FAIL: gcc.dg/vect/bb-slp-pr65935.c -flto -ffat-lto-objects
> > > scan-tree-dump-times slp1 "optimized: basic block" 10
> > > FAIL: gcc.dg/vect/bb-slp-pr65935.c scan-tree-dump-times slp1 "optimized:
> > > basic block" 10
> >
> > We are now vectorizing the SSE tail when vectorizing with AVX.  I'll
> > adjust the testcase to prefer SSE.
> >
> > > FAIL: gcc.target/i386/vect-pr97352.c scan-assembler-times vmov.pd 4
> >
> > With -march=cascadelake we get
> >
> >         vpermpd $68, c, %ymm0
> >         vpermpd $238, c, %ymm0
> >
> > instead of
> >
> >         vmovapd c, %ymm1
> >         vinsertf128 $1, %xmm1, %ymm1, %ymm0
> >         vperm2f128 $49, %ymm1, %ymm1, %ymm0
> >
> > what's a way to disallow additional -march= from taking effect?  It's
> I usually add -mno-{avx,avx512f} and -mtune=generic or sometimes
> -mprefer-vector-width=* to the testcases.

OK, I will try this route then.

Thanks,
Richard.
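
For illustration, a testcase header following that suggestion might look roughly like the sketch below. The exact option set, the array names and the add4 function are assumptions made up for this example; this is not the actual change to any of the testcases named above.

```c
/* { dg-do compile } */
/* Hypothetical option set: the explicit -mno-* switches and -mtune=generic
   are intended to keep an extra -march= from the test flags from changing
   the generated code.  */
/* { dg-options "-O3 -mno-avx -mno-avx512f -mtune=generic" } */

/* Illustrative arrays and loop, not taken from any existing testcase.  */
double a[4], b[4];

void
add4 (void)
{
  /* A simple candidate for vectorization.  */
  for (int i = 0; i < 4; i++)
    a[i] += b[i];
}
```

Alternatively, -mprefer-vector-width=128 could be added to the options to steer the vectorizer toward 128-bit vectors.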