https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117079

--- Comment #4 from Christoph Müllner <cmuellner at gcc dot gnu.org> ---
The reason that we don't have "MEM <vector(4) unsigned int>" in the dump
anymore is that we now have "MEM <vector(8) unsigned char>".

Further, the size of the function in the test case shrinks from 225
instructions down to 109, almost all of which are vector instructions.

I tried to measure a performance difference on my 5950X (-march=native) by
calling the test function four times per iteration in a loop with
1024l * 1024 * 1024 * 1024 iterations.
However, the timings do not give enough evidence to claim that the new code
is faster (memory bandwidth is probably the limit):

* old: 4m34.405s, 4m47.825s, 4m38.187s
* new: 4m34.722s, 4m34.936s, 4m34.922s
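
For anyone who wants to repeat a measurement of this kind, a harness could
look roughly like the sketch below. It is not the code used for the numbers
above: kernel_under_test is a hypothetical stand-in for the function from the
test case, and run_benchmark, the buffer sizes and the checksum are
assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the function from the test case; the real
   kernel is not shown in this comment.  noinline keeps the call from being
   folded into the benchmark loop.  */
__attribute__ ((noinline)) void
kernel_under_test (uint32_t *restrict out, const uint8_t *restrict in)
{
  for (size_t i = 0; i < 8; i++)
    out[i] = (uint32_t) in[i] + in[i + 8];
}

/* Call the kernel four times per iteration and return a checksum of the
   results so that the compiler cannot discard the work.  */
uint64_t
run_benchmark (long iterations)
{
  uint8_t in[16];
  uint32_t out[8];
  uint64_t checksum = 0;

  for (size_t i = 0; i < 16; i++)
    in[i] = (uint8_t) i;

  for (long n = 0; n < iterations; n++)
    for (int call = 0; call < 4; call++)
      {
        /* Vary the input so that each call depends on the loop state.  */
        in[0] = (uint8_t) (n + call);
        kernel_under_test (out, in);
        checksum += out[0];
      }

  return checksum;
}
```

A main function would call run_benchmark (1024l * 1024 * 1024 * 1024), print
the checksum, and be timed externally, once per compiler build. With buffers
this small the loop stays in cache, so a harness like this would not reproduce
a memory-bandwidth limit.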

I propose to fix the failing test case by fixing the test condition.
A patch for that is on the list:
  https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673551.html

FWIW, here is a small code change that will bring back the old behavior for
analysis:

--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2595,7 +2595,7 @@ out:
   auto_vec<unsigned> two_op_perm_indices[2];
   vec<stmt_vec_info> two_op_scalar_stmts[2] = {vNULL, vNULL};

-  if (two_operators && oprnds_info.length () == 2 && group_size > 2)
+  if (false && two_operators && oprnds_info.length () == 2 && group_size > 2)
     {
       unsigned idx = 0;
       hash_map<gimple *, unsigned> seen;