Hi,

> -----Original Message-----
> From: Segher Boessenkool <seg...@kernel.crashing.org>
> Sent: Tuesday, May 11, 2021 4:43 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc@gcc.gnu.org; Richard Sandiford <richard.sandif...@arm.com>;
> Richard Biener <rguent...@suse.de>
> Subject: Re: [RFC] Implementing detection of saturation and rounding
> arithmetic
>
> Hi!
>
> On Tue, May 11, 2021 at 05:37:34AM +0000, Tamar Christina via Gcc wrote:
> > 2. Saturating abs:
> >    char sat (char a)
> >    {
> >       int tmp = abs (a);
> >       return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
> >    }
>
> That can be done quite a bit better, branchless at least.  Same for all
> examples here probably.

Do you mean in the example? Sure, I wanted to keep it simple 😊

> > 2. Saturation:
> >    a) Use match.pd to rewrite the various saturation expressions into
> >       min/max operations which opens up the expressions to further
> >       optimizations.
>
> You'll have to do the operation in a bigger mode for that.  (This is
> also true for rounding, in many cases).
>
> This makes new internal functions more attractive / more feasible.

True, but the internal function doesn't need to model the wider mode,
right? So if you're saturating an int, the internal-fn would work on
int,int,int.

> >    We could get the right instructions by using combine if we don't
> >    rewrite the instructions to an internal function, however then
> >    during vectorization we would overestimate the cost of performing
> >    the saturation.  The constants will then also be loaded into
> >    registers and so become a lot more difficult to clean up solely in
> >    the backend.
>
> Combine is almost never the right answer if you want to combine more
> than two or three RTL insns.  It can be done, but combine will always
> write the combined instruction in simplest terms, which tends to mean
> that if you combine more insns there can be very many outcomes that you
> all need to recognise as insns in your machine description.

Yeah, ideally I would like to handle it before it gets to expand.

> > The one thing I am wondering about is whether we would need an
> > internal function for all operations supported, or if it should be
> > modelled as an internal FN which just "marks" the operation as
> > rounding/saturating. After all, the only difference between a normal
> > and saturating expression in RTL is the xx_truncate RTL surrounding
> > the expression.  Doing so would also mean that all targets that have
> > saturating instructions would automatically benefit from this.
>
> I think you will have to experiment with both approaches to get a good
> feeling for the tradeoff.

Fair enough, thanks for the feedback!
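
For illustration only, here is roughly what a branchless form of the
saturating abs example, and a saturating add written as min/max in a
wider type, could look like in plain C. This is a sketch and not taken
from any patch or from GCC itself: the names sat_abs_s8, sat_add_s8,
min_i and max_i are made up for this mail, and int8_t is used instead of
char because the signedness of plain char is implementation-defined.
Whether a compiler actually emits branch-free code for it depends on the
target and optimization level.

```c
#include <stdint.h>

/* Hypothetical helpers, named for this example only. */
static int min_i(int x, int y) { return x < y ? x : y; }
static int max_i(int x, int y) { return x > y ? x : y; }

/* Saturating abs on a signed 8-bit value, written without explicit
   control flow.  The work is done in int, so |-128| = 128 is
   representable before it is clamped.  */
static int8_t sat_abs_s8(int8_t a)
{
    int wide = a;                     /* widen to the bigger type first */
    int mask = -(wide < 0);           /* 0 if a >= 0, -1 (all ones) if a < 0 */
    int mag  = (wide ^ mask) - mask;  /* |a|, in the range 0..128 */

    mag -= (mag > INT8_MAX);          /* 128 becomes 127, the rest is unchanged */
    return (int8_t)mag;
}

/* Saturating add expressed as min/max on the result computed in a wider
   type.  The sum of two int8_t values lies in -256..254, so it cannot
   overflow int.  */
static int8_t sat_add_s8(int8_t a, int8_t b)
{
    int wide = (int)a + (int)b;
    return (int8_t)min_i(max_i(wide, INT8_MIN), INT8_MAX);
}
```

Both functions take and return the narrow type; the wider type only
shows up inside the body, which is the same shape as an internal
function that works on int,int,int while the widening stays an
implementation detail.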

Cheers,
Tamar

>
>
> Segher