https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079
--- Comment #4 from Christoph Müllner <cmuellner at gcc dot gnu.org> ---
The reason that we don't have "MEM <vector(4) unsigned int>" in the dump
anymore is that we now have "MEM <vector(8) unsigned char>".

Further, the size of the function in the test case shrinks from 225
instructions down to 109 (almost all vector instructions).

I tried to measure a performance difference on my 5950X (-march=native)
when calling the test function four times in a loop with
1024l * 1024 * 1024 * 1024 iterations.
However, I did not see enough evidence to claim that the new code is better
(memory bandwidth is probably the limit):
* old: 4m34.405s, 4m47.825s, 4m38.187s
* new: 4m34.722s, 4m34.936s, 4m34.922s

I propose to fix the failing test case by fixing the test condition.
A patch for that is on the list:
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673551.html

FWIW, here is a small code change that will bring back the old behavior for
analysis:

--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2595,7 +2595,7 @@ out:
       auto_vec<unsigned> two_op_perm_indices[2];
       vec<stmt_vec_info> two_op_scalar_stmts[2] = {vNULL, vNULL};
 
-      if (two_operators && oprnds_info.length () == 2 && group_size > 2)
+      if (false && two_operators && oprnds_info.length () == 2 && group_size > 2)
        {
          unsigned idx = 0;
          hash_map<gimple *, unsigned> seen;
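
For readers who want something small to experiment with: judging only by
the names in the condition above (two_operators, two operands, group_size
greater than 2), the disabled block concerns SLP groups whose lanes mix two
different operations. The following is a minimal sketch of such a group. It
is NOT the test case from this PR; the function name addsub4 and the lane
layout are made up for illustration, and whether GCC actually takes the
patched code path for it depends on the target and options.

```c
#include <stdint.h>

/* Hypothetical kernel (not from the PR): one store group of four lanes
   where even lanes use '+' and odd lanes use '-'.  Both operations take
   the same two operands, so the group has two operators, two operands
   and a group size of 4.  */
void
addsub4 (uint32_t *restrict out, const uint32_t *restrict a,
         const uint32_t *restrict b)
{
  out[0] = a[0] + b[0];
  out[1] = a[1] - b[1];
  out[2] = a[2] + b[2];
  out[3] = a[3] - b[3];
}
```

Compiling this with and without the one-line change above (for example at
-O3 with -fopt-info-vec) should be one way to compare what the vectorizer
does in each case.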