On Wed, May 12, 2021 at 09:13:38AM +0000, Tamar Christina wrote:
> > From: Segher Boessenkool <seg...@kernel.crashing.org>
> > On Tue, May 11, 2021 at 05:37:34AM +0000, Tamar Christina via Gcc wrote:
> > > 2. Saturating abs:
> > >     char sat (char a)
> > >     {
> > >       int tmp = abs (a);
> > >       return tmp > 127 ? 127 : ((tmp < -128) ? -128 : tmp);
> > >     }
> >
> > That can be done quite a bit better, branchless at least.  Same for all
> > examples here probably.
>
> Do you mean in the example? Sure, I wanted to keep it simple 😊

Point taken...  I think you need to look at these issues sooner rather
than later though.

> > > 2. Saturation:
> > >    a) Use match.pd to rewrite the various saturation expressions into
> > >       min/max operations which opens up the expressions to further
> > >       optimizations.
> >
> > You'll have to do the operation in a bigger mode for that.  (This is also
> > true for rounding, in many cases).
> >
> > This makes new internal functions more attractive / more feasible.
>
> True, but the internal function doesn't need to model the wider mode right?
> So if you're saturating an int, the internal-fn would work on int,int,int.

The ifn doesn't have to go to a wider mode, it's only if you want to
express it with more basic operations that you have to.  That is the
reason I find ifns more attractive for this, yup.

> > >   We could get the right instructions by using combine if we don't
> > >   rewrite the instructions to an internal function, however then during
> > >   Vectorization we would overestimate the cost of performing the
> > >   saturation.  The constants will the also be loaded into registers and
> > >   so becomes a lot more difficult to cleanup solely in the backend.
> >
> > Combine is almost never the right answer if you want to combine more than
> > two or three RTL insns.  It can be done, but combine will always write the
> > combined instruction in simplest terms, which tends to mean that if you
> > combine more insns there can be very many outcomes that you all need to
> > recognise as insns in your machine description.
>
> Yeah, ideally I would like to handle it before it gets to expand.

That is the right place yeah... but it will need some RTL work, in
simplify-rtx at least.


Segher
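
To make the "bigger mode" point concrete, here is a minimal sketch,
assuming a signed 8-bit saturating add as the operation (the thread does
not fix one).  The names sat_add_wide and sat_add_narrow are hypothetical
and not taken from GCC.  The first version spells the saturation with
basic operations, so the add has to happen in int before the min/max
clamp; the second stays in the narrow type and relies on the
__builtin_add_overflow builtin from GCC and Clang, which is roughly the
shape an int,int,int style internal function would have.

```c
#include <limits.h>

/* Hypothetical helper: saturating signed 8-bit add written with basic
   operations.  The add is done in int (the wider mode), so the
   intermediate value cannot wrap, and the clamp is a min/max pair.  */
static signed char
sat_add_wide (signed char a, signed char b)
{
  int tmp = (int) a + (int) b;               /* exact, in [-256, 254] */
  tmp = tmp > SCHAR_MAX ? SCHAR_MAX : tmp;   /* min (tmp, SCHAR_MAX) */
  tmp = tmp < SCHAR_MIN ? SCHAR_MIN : tmp;   /* max (tmp, SCHAR_MIN) */
  return (signed char) tmp;
}

/* Hypothetical helper: same result with no wider intermediate in the
   source.  __builtin_add_overflow returns true if the mathematically
   exact sum does not fit in *res.  */
static signed char
sat_add_narrow (signed char a, signed char b)
{
  signed char res;
  if (__builtin_add_overflow (a, b, &res))
    /* Overflow needs both operands to have the same sign, so the sign
       of A tells us which bound to saturate to.  */
    return a < 0 ? SCHAR_MIN : SCHAR_MAX;
  return res;
}
```

Both helpers should agree for every pair of inputs; only the first one
needs the wider type to be expressible at all.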