https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98772

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
But the issue seems to be

t.c:3:22: note:   ==> examining statement: _34 = *pix1_19;
t.c:3:22: missed:   permutation requires at least three vectors _34 = *pix1_19;
t.c:3:22: missed:   unsupported load permutation
t.c:6:24: missed:   not vectorized: relevant stmt not supported: _34 = *pix1_19;
t.c:3:22: note:   removing SLP instance operations starting from: *_44 = _45;
t.c:3:22: missed:  unsupported SLP instances
t.c:3:22: note:  re-trying with SLP disabled

so SLP vectorization fails because of unsupported permutes with the larger
vector size, and the non-SLP case fails with

t.c:3:22: missed:  loop does not have enough iterations to support vectorization.
t.c:3:22: note:  ***** Analysis failed with vector mode V16QI

so I don't see the connection with the pattern.  Only for V8QI I see it
remotely mentioned, but there we have _different_ patterns matched...

I think the permute issue is "old" and goes away if you make it
strided-slp by incrementing pix1/2 by a non-constant, then we can
load the vector by char[4] pieces.  We just don't consider that
possibility when instead trying "strided" (with gap at the end).
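
To make the constant vs. non-constant increment distinction concrete, here is
a rough sketch.  The loop shape is an assumption on my side (it is not the
testcase from the PR), and the function names, the 4x4 block size and the
`stride` parameter are hypothetical.  Both functions compute the same result;
the only difference is whether pix1/pix2 advance by a compile-time constant or
by a runtime value.  Whether a given GCC version and target actually
vectorizes either variant is not something this sketch demonstrates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical loop shape: rows of 4 bytes, pointers advanced by a
   constant.  The loads form a group with a gap at the end of each row.  */
void
add_rows_const (uint16_t *restrict d, const uint8_t *restrict pix1,
                const uint8_t *restrict pix2)
{
  for (int y = 0; y < 4; y++)
    {
      for (int x = 0; x < 4; x++)
        d[x + y * 4] = pix1[x] + pix2[x];
      pix1 += 16;
      pix2 += 16;
    }
}

/* Same computation, but the increment is a runtime value, so the
   accesses are strided by a non-constant amount.  */
void
add_rows_strided (uint16_t *restrict d, const uint8_t *restrict pix1,
                  const uint8_t *restrict pix2, ptrdiff_t stride)
{
  for (int y = 0; y < 4; y++)
    {
      for (int x = 0; x < 4; x++)
        d[x + y * 4] = pix1[x] + pix2[x];
      pix1 += stride;
      pix2 += stride;
    }
}
```

Compiling both with something like -O3 -fdump-tree-vect-details and comparing
the vect dumps should show whether the load permutation complaint appears only
for the constant-increment variant.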

The widen patterns are a red herring here I think.
