https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542

            Bug ID: 117542
           Summary: Missed loop vectorization for truncate from float to
                    __bf16.
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-*-* i?86-*-*

For loop vectorization, GCC relies on the optab vec_pack_trunc_m to check
whether the backend supports this conversion.
But that optab is already used for truncation from float to _Float16 and can't
be overloaded. The documentation only says that the destination has 2*N
elements of size S/2; it doesn't specify the destination mode, and there are
two kinds of half-precision floating-point format (_Float16 and __bf16).


------
‘vec_pack_trunc_m’
Narrow (demote) and merge the elements of two vectors. Operands 1 and 2
are vectors of the same mode having N integral or floating point elements of 
size S. Operand 0 is the resulting vector in which 2*N elements of size S/2 are
concatenated after narrowing them down using truncation.
----------
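
To illustrate why "size S/2" alone is ambiguous, here is a minimal sketch in
portable C that narrows a float to each 16-bit format using plain bit
manipulation. The helper names are hypothetical and exist only for this
illustration; they are not GCC internals. The bf16 helper rounds to nearest
even and does not special-case NaN. The binary16 helper assumes the input lies
in the binary16 normal range and simply drops the low mantissa bits.

```c
#include <stdint.h>
#include <string.h>

/* Both destination formats are exactly half the size of float. */
_Static_assert(sizeof(uint16_t) * 2 == sizeof(float), "S/2 == 16 bits");

/* Hypothetical helper: float -> bf16 bit pattern.
   bf16 keeps the sign, all 8 exponent bits and the top 7 mantissa bits.
   Rounds to nearest even; NaN inputs are not handled specially. */
static uint16_t float_to_bf16_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    uint32_t rounding = 0x7FFFu + ((u >> 16) & 1u);
    return (uint16_t)((u + rounding) >> 16);
}

/* Hypothetical helper: float -> IEEE binary16 (_Float16) bit pattern.
   binary16 has 5 exponent bits (bias 15) and 10 mantissa bits.
   Assumption: the input is within the binary16 normal range; the low
   13 mantissa bits are dropped without rounding to keep the sketch short. */
static uint16_t float_to_fp16_bits_normal(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    uint32_t sign = (u >> 16) & 0x8000u;
    uint32_t exp  = ((u >> 23) & 0xFFu) - 127u + 15u; /* rebias 127 -> 15 */
    uint32_t mant = (u >> 13) & 0x3FFu;
    return (uint16_t)(sign | (exp << 10) | mant);
}
```

For the same input 1.0f the two helpers give different 16-bit patterns
(0x3F80 for bf16, 0x3C00 for binary16), so an optab keyed only on the source
mode and the halved element size cannot tell which narrowing is being asked
for.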

void
foo (__bf16* a, float* b)
{
    for (int i = 0; i != 10000; i++)
      a[i] = b[i];
}

 couldn't vectorize loop
 not vectorized: no vectype for stmt: _4 = *_3;
