On Thu, Jan 03, 2019 at 11:48:12AM +0100, Marc Glisse wrote:
> > The following patch adds support for the __builtin_convertvector builtin.
> > C casts on generic vectors are just a reinterpretation of the bits (i.e. a
> > VCE); this builtin allows casting int/unsigned elements to float or vice
> > versa, or promoting/demoting them.  The doc/ change is missing; I will
> > write it soon.
> > 
> > The builtin appeared in, I think, clang 3.4 and is apparently in real-world
> > use, as e.g. Honza reported.  The first argument is an expression with vector
> > type, the second argument is a vector type (similarly e.g. to va_arg), to
> > which the first argument should be converted.  Both vector types need to
> > have the same number of elements.
> > 
> > I've implemented same element size (thus also whole vector size) conversions
> > efficiently - signed to unsigned and vice versa or same vector type just
> > using a VCE, for e.g. int <-> float or long long <-> double using
> > appropriate optab, possibly repeated multiple times for very large vectors.
> 
> IIUC, you only lower __builtin_convertvector to VCE or FLOAT_EXPR or
> whatever in tree-vect-generic. That seems quite late. At least for the
> "easy" same-size case, I think we should do it early (gimplification?),

No, it must not be done at gimplification time.  Think about OpenMP/OpenACC
offloading: the target before IPA optimizations might not be the target
after them.  While the two have to agree on ABI issues, the optabs definitely
can be and are different, and these optabs, originally added for the
vectorizer, have no fallback; whatever introduces such an operation into the
IL is responsible for verifying that it is supported.

It could be done in some post-IPA pass, perhaps by just calling the
tree-vect-generic.c function added in the patch from somewhere else, maybe
with a special argument that makes it handle only the single op cases and
not the others.

That said, I am not sure whether e.g. using the opaque builtin for the
conversion that supportable_convert_operation sometimes uses is better than
this ifn.  What exact optimization opportunities are you looking for if it is
lowered earlier?  I have the VECTOR_CST folding in place...
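
For reference, the difference between a plain C cast on generic vectors (a
bit reinterpretation, i.e. a VCE) and the builtin described in the quoted
patch text could be sketched roughly as below.  This is only an
illustration under stated assumptions: the typedef and function names
(v4si, v4sf, convert_values, reinterpret_bits) are made up for the example,
and the compiler check guarding the builtin is an assumption about where it
is available, with an elementwise loop as the fallback.

```c
/* Hypothetical 4-element vector types; both have the same number of
   elements, as the builtin requires.  */
typedef int   v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Value conversion: each int element becomes the float with the same
   numeric value.  */
static v4sf
convert_values (v4si v)
{
#if defined (__clang__) || (defined (__GNUC__) && __GNUC__ >= 9)
  /* Assumption: the builtin is available in this compiler.  */
  return __builtin_convertvector (v, v4sf);
#else
  /* Fallback with the same elementwise semantics.  */
  v4sf r;
  for (int i = 0; i < 4; i++)
    r[i] = (float) v[i];
  return r;
#endif
}

/* Bit reinterpretation: a C cast between same-size vector types keeps
   the bits and only changes the type.  */
static v4sf
reinterpret_bits (v4si v)
{
  return (v4sf) v;
}
```

With an input of { 1, 2, 3, -4 }, convert_values yields floats holding the
same numeric values, while reinterpret_bits yields floats whose bit
patterns are those of the original ints.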

> before we start optimizing, without checking if it is supported by the
> target (generic lowering can fix that up later). Of course that can be
> changed later, getting the basic functionality comes first.

        Jakub
