https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90332

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, can't get the test to FAIL with a cross, somehow the dejagnu tests
always end up UNSUPPORTED.

The testcase for x86_64 has

/* With AVX256 or more we do not pull off the trick eliding the epilogue.  */
/* { dg-additional-options "-mprefer-avx128" { target { x86_64-*-* i?86-*-* } } } */

so we require the use of V16QImode -> V4SImode SAD with the V16QImode
loads split into two V8QImode ones.  There were insufficient dejagnu
effective targets to model the restriction in

+         /* If the gap splits the vector in half and the target
+            can do half-vector operations avoid the epilogue peeling
+            by simply loading half of the vector only.  Usually
+            the construction with an upper zero half will be elided.  */
+         dr_alignment_support alignment_support_scheme;
+         scalar_mode elmode = SCALAR_TYPE_MODE (TREE_TYPE (vectype));
+         machine_mode vmode;
+         if (overrun_p
+             && !masked_p
+             && (((alignment_support_scheme
+                     = vect_supportable_dr_alignment (first_dr_info, false)))
+                  == dr_aligned
+                 || alignment_support_scheme == dr_unaligned_supported)
+             && known_eq (nunits, (group_size - gap) * 2)
+             && mode_for_vector (elmode, (group_size - gap)).exists (&vmode)
+             && VECTOR_MODE_P (vmode)
+             && targetm.vector_mode_supported_p (vmode)
+             && (convert_optab_handler (vec_init_optab,
+                                        TYPE_MODE (vectype), vmode)
+                 != CODE_FOR_nothing))
+           overrun_p = false;

I see we probably need hw_misalign, so does

Index: gcc/testsuite/gcc.dg/vect/slp-reduc-sad-2.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/slp-reduc-sad-2.c (revision 270899)
+++ gcc/testsuite/gcc.dg/vect/slp-reduc-sad-2.c (working copy)
@@ -25,5 +25,5 @@ int x264_pixel_sad_8x8( uint8_t *pix1, u

 /* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */
 /* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
-/* { dg-final { scan-tree-dump-not "access with gaps requires scalar epilogue loop" "vect" } } */
+/* { dg-final { scan-tree-dump-not "access with gaps requires scalar epilogue loop" "vect" { xfail { ! vect_hw_misalign } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */

fix everything?
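
For readers following the quoted condition, here is a rough sketch of only its lane-count part, known_eq (nunits, (group_size - gap) * 2). This is not GCC code: the function name and the plain unsigned parameters are hypothetical, and the alignment, vector-mode and vec_init checks from the patch are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC.  It models only the lane-count
   test from the quoted patch: the vector has NUNITS lanes, and the
   elements actually accessed (GROUP_SIZE minus the trailing GAP) must
   fill exactly half of them.  Alignment support and the optab lookup
   are not modelled here.  */
bool
gap_splits_vector_in_half (unsigned nunits, unsigned group_size, unsigned gap)
{
  /* Assumption of this sketch: the gap is smaller than the group.  */
  assert (gap < group_size);
  return nunits == (group_size - gap) * 2;
}
```

With assumed values of 16 lanes, a group of 16 elements and a gap of 8, the helper returns true, which corresponds to loading only a V8QImode half of a V16QImode vector. With a gap of 4 it returns false, since 12 accessed elements do not make up half of the vector.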