https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583
--- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 31 Jan 2023, tnfchris at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108583
>
> --- Comment #15 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
> > OK, hopefully I understand now.  Sorry for being slow.
>
> Not at all, Sorry if it came across a bit cranky, it wasn't meant that way!
>
> > If that's the condition we want to test for, it seems like something
> > we need to check in the vectoriser rather than the hook.  And it's
> > not something we can easily do in the vector form, since we don't
> > track ranges for vectors (AFAIK).
>
> Ack, that also tracks with what I tried before, we don't indeed track ranges
> for vector ops. The general case can still be handled slightly better (I think)
> but it doesn't become as clear of a win as this one.
>
> > You probably did so elsewhere some time ago, but what exactly are those
> > four instructions?  (pointers to specifications appreciated)
>
> For NEON we use:
> https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/ADDHN--ADDHN2--Add-returning-High-Narrow-

so that's an add + pack high

> https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/UADDW--UADDW2--Unsigned-Add-Wide-

and that unpacks (zero-extends) the high/low part of one operand of an add

I wonder if we'd open-code the pack / unpack and use regular add whether
combine can synthesize uaddw and addhn?  The pack and unpack would be
vec_perms on GIMPLE (plus V_C_E).

> In that order, and for SVE we use two
> https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/ADDHNB--Add-narrow-high-part--bottom--

probably similar.  So the difficulty here will be to decide whether
that's in the end better than what the pattern handling code does now, right?
Because I think most targets will be able to do the above but lacking
the special adds it will be slower because of the extra packing/unpacking?

That said, can we possibly do just that costing (would be a first in the pattern code I guess) with a target hook? Or add optabs for the addh operations so we can query support?
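
For reference, a minimal scalar sketch of what "add + pack high" and "unpack + add" mean per lane, as I read the two NEON descriptions linked above. The function names (addhn_lane, uaddw_lane, addhn_model, uaddw_model) are made up for illustration and are not GCC or ACLE identifiers; the 16-bit to 8-bit lane widths are likewise just an assumption to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-lane model of ADDHN on 16-bit lanes:
   add modulo 2^16, then keep only the high 8 bits ("pack high").  */
static inline uint8_t addhn_lane(uint16_t a, uint16_t b)
{
    uint16_t sum = (uint16_t)(a + b);
    return (uint8_t)(sum >> 8);
}

/* Hypothetical per-lane model of UADDW: zero-extend the narrow operand
   ("unpack"), then do a regular add modulo 2^16.  */
static inline uint16_t uaddw_lane(uint16_t wide, uint8_t narrow)
{
    return (uint16_t)(wide + (uint16_t)narrow);
}

/* Open-coded array forms: a plain add plus an explicit pack or unpack,
   which is the shape a target without the fused instructions would
   have to execute.  */
void addhn_model(uint8_t *dst, const uint16_t *a, const uint16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = addhn_lane(a[i], b[i]);
}

void uaddw_model(uint16_t *dst, const uint16_t *wide, const uint8_t *narrow,
                 size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = uaddw_lane(wide[i], narrow[i]);
}
```

On a target with the fused instructions each loop body is one operation per vector; without them the shift/truncate and the zero-extend become separate pack/unpack steps, which is the extra cost in question.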