https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122793

--- Comment #10 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #7)
> Seems it is the pack_p case,
> Commenting out
>       /* Check whether the input has twice as many lanes per vector.  */
>       else if (children.length () == 1
>                && known_eq (SLP_TREE_LANES (child) * nunits,
>                             SLP_TREE_LANES (node) * op_nunits * 2))
>         pack_p = true;
> makes the #c5 testcase pass, while commenting out
>       /* Check whether the output has N times as many lanes per vector.  */
>       else if (constant_multiple_p (SLP_TREE_LANES (node) * op_nunits,
>                                     SLP_TREE_LANES (child) * nunits,
>                                     &this_unpack_factor)
>                && (i == 0 || unpack_factor == this_unpack_factor))
>         unpack_factor = this_unpack_factor;
> instead doesn't fix it.

Yes, the change in r15-4114-g8157f3f2d211bf has a bug: repeating_p is
initialized to true, but after this change it's only set to false when !pack_p
&& !widen.

So repeating_p stays true even when the vector isn't repeating, and repeating_p
takes precedence over pack_p.

As a result, it treats this pack operation as a repeating sequence taken from
the wider vector:

(rr) p debug (node)
perm.c:18:1: note: node 0x580a950 (max_nunits=1, refcnt=1) vector(16) unsigned char
perm.c:18:1: note: op: VEC_PERM_EXPR
perm.c:18:1: note:      stmt 0 _8 = MEM[(unsigned char *)s_25 + 5B];
perm.c:18:1: note:      stmt 1 _9 = MEM[(unsigned char *)s_25 + 6B];
perm.c:18:1: note:      stmt 2 _13 = MEM[(unsigned char *)s_25 + 7B];
perm.c:18:1: note:      stmt 3 _13 = MEM[(unsigned char *)s_25 + 7B];
perm.c:18:1: note:      stmt 4 _13 = MEM[(unsigned char *)s_25 + 7B];
perm.c:18:1: note:      stmt 5 _13 = MEM[(unsigned char *)s_25 + 7B];
perm.c:18:1: note:      stmt 6 _13 = MEM[(unsigned char *)s_25 + 7B];
perm.c:18:1: note:      stmt 7 _13 = MEM[(unsigned char *)s_25 + 7B];
perm.c:18:1: note:      lane permutation { 0[7] 0[8] 0[9] 0[9] 0[9] 0[9] 0[9] 0[9] }
perm.c:18:1: note:      children 0x580a7a0
$1 = void
(rr) p debug (child)
perm.c:18:1: note: node 0x580a7a0 (max_nunits=16, refcnt=4) vector(16) unsigned char
perm.c:18:1: note: op template: _6 = MEM[(unsigned char *)s_25 + -2B];
perm.c:18:1: note:      stmt 0 _6 = MEM[(unsigned char *)s_25 + -2B];
perm.c:18:1: note:      stmt 1 ---
perm.c:18:1: note:      stmt 2 ---
perm.c:18:1: note:      stmt 3 ---
perm.c:18:1: note:      stmt 4 ---
perm.c:18:1: note:      stmt 5 ---
perm.c:18:1: note:      stmt 6 _12 = MEM[(unsigned char *)s_25 + 4B];
perm.c:18:1: note:      stmt 7 _8 = MEM[(unsigned char *)s_25 + 5B];
perm.c:18:1: note:      stmt 8 _9 = MEM[(unsigned char *)s_25 + 6B];
perm.c:18:1: note:      stmt 9 _13 = MEM[(unsigned char *)s_25 + 7B];
perm.c:18:1: note:      stmt 10 _21 = MEM[(unsigned char *)s_25 + 8B];
perm.c:18:1: note:      stmt 11 _29 = MEM[(unsigned char *)s_25 + 9B];
perm.c:18:1: note:      stmt 12 ---
perm.c:18:1: note:      stmt 13 ---
perm.c:18:1: note:      stmt 14 ---
perm.c:18:1: note:      stmt 15 ---
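
To make the dump concrete, here is a small standalone sketch (not GCC code)
that plugs the lane counts of the two nodes into the shape of the pack check
quoted above and applies the lane permutation to the child's byte offsets.
It assumes nunits and op_nunits are both 16, since both nodes are
vector(16) unsigned char. The names pack_condition, apply_lane_permutation
and GAP are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone illustration, not GCC code.  Same shape as the known_eq
// check quoted above, but on plain integers instead of poly_ints.
bool pack_condition (unsigned child_lanes, unsigned nunits,
                     unsigned node_lanes, unsigned op_nunits)
{
  return child_lanes * nunits == node_lanes * op_nunits * 2;
}

// Select lanes of a single child according to a lane permutation.
std::vector<int> apply_lane_permutation (const std::vector<int> &child,
                                         const std::vector<std::size_t> &perm)
{
  std::vector<int> out;
  for (std::size_t lane : perm)
    {
      assert (lane < child.size ());
      out.push_back (child[lane]);
    }
  return out;
}

// Hypothetical marker for the "---" lanes in the child dump.
const int GAP = -100;

// Byte offsets from s_25 for the child's 16 lanes, copied from the dump.
const std::vector<int> child_offsets
  = { -2, GAP, GAP, GAP, GAP, GAP, 4, 5, 6, 7, 8, 9, GAP, GAP, GAP, GAP };

// Lane permutation { 0[7] 0[8] 0[9] 0[9] 0[9] 0[9] 0[9] 0[9] }.
const std::vector<std::size_t> lane_perm = { 7, 8, 9, 9, 9, 9, 9, 9 };
```

Applying lane_perm to child_offsets gives the offsets 5, 6, 7, 7, 7, 7, 7, 7,
which matches stmts 0-7 of the permute node, and pack_condition (16, 16, 8, 16)
holds because 16 * 16 == 8 * 16 * 2. So under these assumptions the node is a
pack of a 16-lane child into 8 lanes.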

The code and the comments indicate to me that it was intended to support
repeated unpacks, but not repeated packs.

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index a5cd596fd28..7104835eb5a 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -10242,7 +10242,10 @@ vectorizable_slp_permutation_1 (vec_info *vinfo, gimple_stmt_iterator *gsi,
       if (children.length () == 1
          && known_eq (SLP_TREE_LANES (child) * nunits,
                       SLP_TREE_LANES (node) * op_nunits * 2))
-       pack_p = true;
+       {
+         pack_p = true;
+         repeating_p = false;
+       }
       /* Check whether the output has N times as many lanes per vector.  */
       else if (constant_multiple_p (SLP_TREE_LANES (node) * op_nunits,
                                    SLP_TREE_LANES (child) * nunits,

This fixes it, since if we're packing, we're not repeating the original vector.
I'm testing the above change.
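
For readers who don't have the function in front of them, the flag
interaction can be modelled roughly as below. This is a toy sketch under the
assumptions stated in this comment (repeating_p starts out true and is
consulted before pack_p); it is not the code in tree-vect-slp.cc, and
perm_kind, child_shape and classify are hypothetical names.

```cpp
#include <cassert>

// Toy model of the flag handling described above; not GCC code.
enum class perm_kind { repeating, pack, other };

struct child_shape
{
  bool is_pack;    // input has twice as many lanes per vector
  bool is_unpack;  // output has N times as many lanes per vector
};

// PATCHED selects whether the pack branch also clears repeating_p,
// as the diff above does.
perm_kind classify (child_shape c, bool patched)
{
  bool repeating_p = true;  // assumed to start out true, per the comment
  bool pack_p = false;

  if (c.is_pack)
    {
      pack_p = true;
      if (patched)
        repeating_p = false;
    }
  else if (c.is_unpack)
    {
      // Unpack leaves repeating_p alone; repeated unpacks look intended.
    }
  else
    repeating_p = false;

  // repeating_p is checked first, so it takes precedence over pack_p.
  if (repeating_p)
    return perm_kind::repeating;
  if (pack_p)
    return perm_kind::pack;
  return perm_kind::other;
}
```

In this model a pack-shaped child is classified as repeating without the
patch and as pack with it, while an unpack-shaped child is classified as
repeating either way.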
