https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100696
            Bug ID: 100696
           Summary: mult_higpart is not vectorized
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcases:

--cut here--
#define N 4

short r[N], a[N], b[N];
unsigned short ur[N], ua[N], ub[N];

void mul (void)
{
  int i;

  for (i = 0; i < N; i++)
    r[i] = a[i] * b[i];
}

/* { dg-final { scan-assembler "pmullw" } } */

void mulhi (void)
{
  int i;

  for (i = 0; i < N; i++)
    r[i] = ((int) a[i] * b[i]) >> 16;
}

/* { dg-final { scan-assembler "pmulhw" } } */

void umulhi (void)
{
  int i;

  for (i = 0; i < N; i++)
    ur[i] = ((unsigned int) ua[i] * ub[i]) >> 16;
}

/* { dg-final { scan-assembler "pmulhuw" } } */

void smulhrs (void)
{
  int i;

  for (i = 0; i < N; i++)
    r[i] = ((((int) a[i] * b[i]) >> 14) + 1) >> 1;
}

/* { dg-final { scan-assembler "pmulhrsw" } } */
--cut here--

should all vectorize for x86_64 with "-O3 -mssse3" to their respective vector
instructions.

Currently, the compiler vectorizes only pmullw and the much more complex
pmulhrsw, but not pmulhw and pmulhuw.

For N = 2 (SLP vectorization?), the compiler manages to vectorize only mul and
none of the other testcases.
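
For reference, the per-element arithmetic that the mulhi, umulhi and smulhrs loops above expect from pmulhw, pmulhuw and pmulhrsw can be sketched as scalar helpers. This is only an illustration of the intended semantics, not part of the testcase: the names mulhi_ref, umulhi_ref and smulhrs_ref are hypothetical, and the signed variants assume that >> on a negative int is an arithmetic shift, which is what GCC implements.

```c
#include <stdint.h>

/* Hypothetical helper: high 16 bits of the signed 16x16 -> 32 bit product,
   i.e. the per-lane result expected from pmulhw.
   Assumes >> on a negative value is an arithmetic shift (true for GCC). */
static int16_t mulhi_ref (int16_t a, int16_t b)
{
  int32_t p = (int32_t) a * (int32_t) b;
  return (int16_t) (p >> 16);
}

/* Hypothetical helper: high 16 bits of the unsigned 16x16 -> 32 bit product,
   i.e. the per-lane result expected from pmulhuw. */
static uint16_t umulhi_ref (uint16_t a, uint16_t b)
{
  uint32_t p = (uint32_t) a * (uint32_t) b;
  return (uint16_t) (p >> 16);
}

/* Hypothetical helper: signed product scaled down by 2^15 with rounding,
   written exactly as in the smulhrs loop: ((p >> 14) + 1) >> 1.
   This is the per-lane result expected from pmulhrsw.  The corner case
   a == b == -32768 does not fit in 16 bits and is not covered here. */
static int16_t smulhrs_ref (int16_t a, int16_t b)
{
  int32_t p = (int32_t) a * (int32_t) b;
  return (int16_t) (((p >> 14) + 1) >> 1);
}
```

For example, mulhi_ref (0x4000, 0x4000) yields 0x1000, while smulhrs_ref (0x4000, 0x4000) yields 0x2000, since the rounding variant keeps one more bit of the product.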