Joel Hutton <joel.hut...@arm.com> writes:
>>>> In practice this will only affect targets that choose to use mixed
>>>> vector sizes, and I think it's reasonable to optimise only for the
>>>> case in which such targets support widening conversions.  So what
>>>> do you think about the idea of emitting separate conversions and
>>>> a normal subtract?  We'd be relying on RTL to fuse them together,
>>>> but at least there would be no redundancy to eliminate.
>>>
>>> So in vectorizable_conversion for the widen-minus you'd check
>>> whether you can do a v4qi -> v4hi and then emit a conversion
>>> and a wide minus?
>>
>> Yeah.
>
> This seems reasonable, as I recall we decided against adding
> internal functions for the time being as all the existing vec patterns
> code would have to be refactored.

FWIW, that was for the hi/lo part.  The internal function in this case
would have been a normal standalone operation that makes sense
independently of the hi/lo pairs, and could be generated independently
of the vectoriser (e.g. from match.pd patterns).

Using an internal function is actually less work than using a tree
code, because you don't need to update all the various tree_code
switch statements.

> So emit a v4qi->v8qi gimple conversion
> then a regular widen_lo/hi using the existing backend patterns/optabs?

I was thinking of using a v8qi->v8hi convert on each operand followed
by a normal v8hi subtraction.  That's what we'd generate if the target
didn't define the widening patterns.

Thanks,
Richard
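
For illustration only, the "convert each operand, then do a normal
subtract" shape described above corresponds roughly to the scalar C
below.  This is a minimal sketch, not GCC code: the function name
widen_then_sub is hypothetical, and the choice of unsigned 8-bit
inputs with a signed 16-bit result is an assumption made for the
example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen each 8-bit operand to 16 bits first, then
   perform an ordinary 16-bit subtraction.  There is no dedicated
   "widening subtract" step; fusing the two would be left to later
   passes.  Element types are an assumption for this sketch.  */
void widen_then_sub(const uint8_t *a, const uint8_t *b,
                    int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int16_t wa = (int16_t)a[i];   /* qi -> hi conversion of operand 1 */
        int16_t wb = (int16_t)b[i];   /* qi -> hi conversion of operand 2 */
        out[i] = (int16_t)(wa - wb);  /* plain hi-mode subtract */
    }
}
```

With n == 8, the loop body is the per-element view of two v8qi->v8hi
conversions feeding one v8hi subtraction.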