https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96053

            Bug ID: 96053
           Summary: Missed optimization: finding SLP sequences from
                    reductions is sometimes better than finding them
                    from reduction chains
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhoukaipeng3 at huawei dot com
  Target Milestone: ---

command:
gcc -S -O2 -ftree-vectorize test.c -funsafe-math-optimizations 
-fno-tree-reassoc -march=armv8.2-a+sve -msve-vector-bits=128

gcc version 11.0.0 20200629

In vectorization, finding SLP sequences from reduction chains takes priority
over finding them from reductions.  But sometimes, finding SLP sequences from
reductions vectorizes the loop better than finding them from reduction chains.

testcase:
double f(double *a, double *b)
{
  double res1 = 0;
  double res0 = 0;
  for (int i = 0 ; i < 1000; i+=4) {
    res0 += a[i] * b[i];
    res1 += a[i+1] * b[i*1];
    res0 += a[i+2] * b[i+2];
    res1 += a[i+3] * b[i+3];
  }
  return res0 + res1;
}
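
To make the two groupings concrete, here is a rough scalar sketch of how the
lanes could be paired in each case.  This is only an illustration, not the
vectorizer's actual output; the function names f_scalar, f_chain_groups and
f_reduction_group are made up for this sketch, and the index b[i*1] is copied
verbatim from the testcase.

```c
/* Scalar reference: same loop body as the testcase. */
double f_scalar(const double *a, const double *b)
{
  double res1 = 0;
  double res0 = 0;
  for (int i = 0; i < 1000; i += 4) {
    res0 += a[i] * b[i];
    res1 += a[i + 1] * b[i * 1];
    res0 += a[i + 2] * b[i + 2];
    res1 += a[i + 3] * b[i + 3];
  }
  return res0 + res1;
}

/* Hypothetical grouping A: one SLP group per reduction chain.
   Each 2-lane accumulator holds the two statements of one chain,
   so its lanes read a[i] and a[i+2] (or a[i+1] and a[i+3]),
   which are not adjacent in memory.  Splitting a chain into two
   partial sums reassociates the additions. */
double f_chain_groups(const double *a, const double *b)
{
  double r0[2] = {0, 0};   /* lanes: the two res0 statements */
  double r1[2] = {0, 0};   /* lanes: the two res1 statements */
  for (int i = 0; i < 1000; i += 4) {
    r0[0] += a[i] * b[i];
    r0[1] += a[i + 2] * b[i + 2];
    r1[0] += a[i + 1] * b[i * 1];
    r1[1] += a[i + 3] * b[i + 3];
  }
  return (r0[0] + r0[1]) + (r1[0] + r1[1]);
}

/* Hypothetical grouping B: one SLP group across the reductions.
   The lanes are {res0, res1}, so each step reads an adjacent pair
   a[i], a[i+1] and then a[i+2], a[i+3]. */
double f_reduction_group(const double *a, const double *b)
{
  double acc[2] = {0, 0};  /* lane 0 = res0, lane 1 = res1 */
  for (int i = 0; i < 1000; i += 4) {
    acc[0] += a[i] * b[i];
    acc[1] += a[i + 1] * b[i * 1];
    acc[0] += a[i + 2] * b[i + 2];
    acc[1] += a[i + 3] * b[i + 3];
  }
  return acc[0] + acc[1];
}
```

In grouping B each lane adds its terms in the same order as the scalar loop,
while grouping A changes the order of the additions, so in general it needs
the reassociation that -funsafe-math-optimizations permits.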

I have two imperfect solutions: one is to add a control option, and the other
is to use the cost model to evaluate which is better.  The first is hard for
users to use, and the second is hard to implement.

Does anyone have a better suggestion?
