https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100696

            Bug ID: 100696
           Summary: mult_highpart is not vectorized
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcases:

--cut here--
#define N 4

short r[N], a[N], b[N];
unsigned short ur[N], ua[N], ub[N];

void mul (void)
{
  int i;

  for (i = 0; i < N; i++)
    r[i] = a[i] * b[i];
}

/* { dg-final { scan-assembler "pmullw" } } */

void mulhi (void)
{
  int i;

  for (i = 0; i < N; i++)
    r[i] = ((int) a[i] * b[i]) >> 16;
}

/* { dg-final { scan-assembler "pmulhw" } } */

void umulhi (void)
{
  int i;

  for (i = 0; i < N; i++)
    ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
}

/* { dg-final { scan-assembler "pmulhuw" } } */

void smulhrs (void)
{
  int i;

  for (i = 0; i < N; i++)
    r[i] = ((((int) a[i] * b[i]) >> 14) + 1) >> 1;
}

/* { dg-final { scan-assembler "pmulhrsw" } } */
--cut here--

should all be vectorized on x86_64 with "-O3 -mssse3" to the vector
instruction named in the corresponding scan-assembler directive.

Currently, the compiler vectorizes only mul (pmullw) and the much more
complex smulhrs (pmulhrsw), but not mulhi (pmulhw) or umulhi (pmulhuw).

For N = 2 (SLP vectorization?), the compiler vectorizes only mul and none
of the other testcases.
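
For reference, here is a minimal scalar sketch of what each high-part
instruction computes for a single 16-bit lane, mirroring the loop bodies in
the testcase. The helper names (mulhi_lane, umulhi_lane, smulhrs_lane,
check_lanes) are hypothetical and are not part of the testcase or of GCC.
The sketch assumes that right-shifting a negative int is an arithmetic
shift, which is what GCC implements.

```c
#include <assert.h>
#include <stdint.h>

/* pmulhw, one lane: high 16 bits of the signed 32-bit product.
   Assumes >> on a negative int is an arithmetic shift (true for GCC). */
static int16_t mulhi_lane(int16_t a, int16_t b)
{
  return (int16_t)(((int32_t)a * b) >> 16);
}

/* pmulhuw, one lane: high 16 bits of the unsigned 32-bit product. */
static uint16_t umulhi_lane(uint16_t a, uint16_t b)
{
  return (uint16_t)(((uint32_t)a * b) >> 16);
}

/* pmulhrsw, one lane: signed product shifted right by 14, rounded by
   adding 1, then shifted right by 1 more, as in the smulhrs loop. */
static int16_t smulhrs_lane(int16_t a, int16_t b)
{
  return (int16_t)(((((int32_t)a * b) >> 14) + 1) >> 1);
}

/* Hypothetical self-check with a few hand-computed values. */
static void check_lanes(void)
{
  assert(mulhi_lane(0x4000, 0x4000) == 0x1000);    /* 0x10000000 >> 16 */
  assert(mulhi_lane(3, 8192) == 0);                /* truncates 0.75 to 0 */
  assert(umulhi_lane(0xFFFF, 0xFFFF) == 0xFFFE);   /* 0xFFFE0001 >> 16 */
  assert(smulhrs_lane(3, 8192) == 1);              /* rounds 0.75 up to 1 */
}
```

The difference between mulhi_lane and smulhrs_lane on the inputs 3 and 8192
shows why the rounding variant is the more complex pattern to recognize.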
