Joel Hutton <joel.hut...@arm.com> writes:
>>>> In practice this will only affect targets that choose to use mixed
>>>> vector sizes, and I think it's reasonable to optimise only for the
>>>> case in which such targets support widening conversions.  So what
>>>> do you think about the idea of emitting separate conversions and
>>>> a normal subtract?  We'd be relying on RTL to fuse them together,
>>>> but at least there would be no redundancy to eliminate.
>>>
>>> So in vectorizable_conversion for the widen-minus you'd check
>>> whether you can do a v4qi -> v4hi and then emit a conversion
>>> and a wide minus?
>>
>> Yeah.
>
> This seems reasonable.  As I recall, we decided against adding
> internal functions for the time being, as all the existing vec patterns
> code would have to be refactored.

FWIW, that was for the hi/lo part.  The internal function in this case
would have been a normal standalone operation that makes sense independently
of the hi/lo pairs, and could be generated independently of the vectoriser
(e.g. from match.pd patterns).

Using an internal function is actually less work than using a tree code,
because you don't need to update all the various tree_code switch
statements.

> So emit a v4qi->v8qi gimple conversion
> then a regular widen_lo/hi using the existing backend patterns/optabs?

I was thinking of using a v8qi->v8hi convert on each operand followed
by a normal v8hi subtraction.  That's what we'd generate if the target
didn't define the widening patterns.
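
To illustrate the equivalence, here is a rough scalar sketch in plain C, not
GCC internals.  The function names, the lane count N and the choice of
unsigned 8-bit inputs are assumptions made up for the example.  It models
"convert each operand, then do a normal subtract" next to a single fused
widening subtract, which is the form we'd be relying on RTL to recover:

```c
#include <stddef.h>
#include <stdint.h>

enum { N = 8 };  /* hypothetical lane count, standing in for v8qi/v8hi */

/* Separate steps: widen each operand (v8qi -> v8hi), then a normal
   v8hi subtraction.  */
static void convert_then_sub(const uint8_t *a, const uint8_t *b, int16_t *out)
{
    int16_t wa[N], wb[N];

    for (size_t i = 0; i < N; i++)
        wa[i] = (int16_t)a[i];          /* conversion of operand 0 */
    for (size_t i = 0; i < N; i++)
        wb[i] = (int16_t)b[i];          /* conversion of operand 1 */
    for (size_t i = 0; i < N; i++)
        out[i] = (int16_t)(wa[i] - wb[i]);  /* plain wide subtract */
}

/* Fused form: one widening subtract per lane.  */
static void widen_sub(const uint8_t *a, const uint8_t *b, int16_t *out)
{
    for (size_t i = 0; i < N; i++)
        out[i] = (int16_t)((int16_t)a[i] - (int16_t)b[i]);
}
```

Both functions produce the same result for every input, so the split form
leaves nothing redundant to clean up if the fusion doesn't happen.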

Thanks,
Richard
