https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116784

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-09-20
   Target Milestone|---                         |15.0
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
           Keywords|                            |testsuite-fail

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
It looks like powerpc supports the 3-lane interleaving scheme we use for SLP.

It might fit vect_strided3 (but powerpc isn't among the targets that covers).

In particular we use the following V8HImode permutes:

  _39 = VEC_PERM_EXPR <vect_a_13.25_22, vect_a_13.26_8, { 2, 2, 5, 5, 8, 8, 11, 11 }>;
  _40 = VEC_PERM_EXPR <vect_a_13.26_8, vect_a_13.27_6, { 6, 6, 9, 9, 12, 12, 15, 15 }>;
  _41 = VEC_PERM_EXPR <_39, _40, { 0, 2, 4, 6, 8, 10, 12, 14 }>;
  _35 = VEC_PERM_EXPR <vect_a_13.25_22, vect_a_13.26_8, { 1, 1, 4, 4, 7, 7, 10, 10 }>;
  _36 = VEC_PERM_EXPR <vect_a_13.26_8, vect_a_13.27_6, { 5, 5, 8, 8, 11, 11, 14, 14 }>;
  _37 = VEC_PERM_EXPR <_35, _36, { 0, 2, 4, 6, 8, 10, 12, 14 }>;
  _32 = VEC_PERM_EXPR <vect_a_13.25_22, vect_a_13.26_8, { 0, 0, 3, 3, 6, 6, 9, 9 }>;
  _33 = VEC_PERM_EXPR <vect_a_13.26_8, vect_a_13.27_6, { 4, 4, 7, 7, 10, 10, 13, 13 }>;
  _34 = VEC_PERM_EXPR <_32, _33, { 0, 2, 4, 6, 8, 10, 12, 14 }>;
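
For reference, the load permutes above can be checked with a small model.
This is only a sketch: it assumes the usual VEC_PERM_EXPR semantics (each
selector element indexes into the concatenation of the two inputs) and uses
made-up memory contents where element k of the interleaved stream is simply k.
The helper names are hypothetical and not taken from GCC.

```python
def vec_perm(v0, v1, sel):
    """Model VEC_PERM_EXPR <v0, v1, sel>: index into the concatenation v0 ++ v1."""
    both = v0 + v1
    return [both[i % len(both)] for i in sel]

# Assumed memory contents: 24 shorts, element k holds the value k.
mem = list(range(24))
a, b, c = mem[0:8], mem[8:16], mem[16:24]  # vect_a_13.25_22, .26_8, .27_6

EVEN = [0, 2, 4, 6, 8, 10, 12, 14]  # selector of the final permute

def load_lane(sel_lo, sel_hi):
    """Two permutes on adjacent vector pairs, then pick the even elements."""
    return vec_perm(vec_perm(a, b, sel_lo), vec_perm(b, c, sel_hi), EVEN)

lane2 = load_lane([2, 2, 5, 5, 8, 8, 11, 11], [6, 6, 9, 9, 12, 12, 15, 15])  # _41
lane1 = load_lane([1, 1, 4, 4, 7, 7, 10, 10], [5, 5, 8, 8, 11, 11, 14, 14])  # _37
lane0 = load_lane([0, 0, 3, 3, 6, 6, 9, 9], [4, 4, 7, 7, 10, 10, 13, 13])    # _34

print(lane0)
print(lane1)
print(lane2)
```

Under those assumptions each result holds every third element of the stream,
starting at offset 0, 1 and 2 respectively.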

  _45 = VEC_PERM_EXPR <vect__3.30_43, vect__4.31_44, { 0, 8, 0, 1, 9, 1, 2, 10 }>;
  _46 = VEC_PERM_EXPR <vect__3.30_43, vect__4.31_44, { 2, 3, 11, 3, 4, 12, 4, 5 }>;
  _47 = VEC_PERM_EXPR <vect__4.31_44, vect__3.30_43, { 5, 13, 14, 6, 14, 15, 7, 15 }>;
  _49 = VEC_PERM_EXPR <_45, vect__5.32_48, { 0, 1, 8, 3, 4, 9, 6, 7 }>;
  _50 = VEC_PERM_EXPR <vect__5.32_48, _46, { 2, 9, 10, 3, 12, 13, 4, 15 }>;
  _51 = VEC_PERM_EXPR <_47, vect__5.32_48, { 0, 13, 2, 3, 14, 5, 6, 15 }>;
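
The store side can be modelled the same way. Again a sketch only: it assumes
VEC_PERM_EXPR indexes into the concatenation of its two inputs, and that
vect__3.30_43, vect__4.31_44 and vect__5.32_48 carry lanes 0, 1 and 2 (called
x, y and z here, which are illustrative names, not GCC's).

```python
def vec_perm(v0, v1, sel):
    """Model VEC_PERM_EXPR <v0, v1, sel>: index into the concatenation v0 ++ v1."""
    both = v0 + v1
    return [both[i % len(both)] for i in sel]

# Symbolic lane contents so the interleaving is easy to read off.
x = [f"x{i}" for i in range(8)]  # vect__3.30_43 (assumed lane 0)
y = [f"y{i}" for i in range(8)]  # vect__4.31_44 (assumed lane 1)
z = [f"z{i}" for i in range(8)]  # vect__5.32_48 (assumed lane 2)

# First stage: combine lanes 0 and 1, leaving slots for lane 2.
t45 = vec_perm(x, y, [0, 8, 0, 1, 9, 1, 2, 10])
t46 = vec_perm(x, y, [2, 3, 11, 3, 4, 12, 4, 5])
t47 = vec_perm(y, x, [5, 13, 14, 6, 14, 15, 7, 15])

# Second stage: blend lane 2 into the remaining slots.
out = (vec_perm(t45, z, [0, 1, 8, 3, 4, 9, 6, 7])         # _49
       + vec_perm(z, t46, [2, 9, 10, 3, 12, 13, 4, 15])   # _50
       + vec_perm(t47, z, [0, 13, 2, 3, 14, 5, 6, 15]))   # _51

expected = [v for triple in zip(x, y, z) for v in triple]
print(out == expected)
```

With that lane assignment the three stored vectors concatenate to
x0 y0 z0 x1 y1 z1 ... x7 y7 z7, i.e. the 3-lane interleaved layout.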

Note the testcase has the "obvious" problem that the vectorized test uses
vect_perm_short while the SLP check uses vect_perm3_short.  In general
we should now be able to always SLP when we vectorize, so the checks should
match up (or be removed altogether).

I'll see if matching up causes problems elsewhere.
