Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>> On 7 Aug 2024, at 15:09, Richard Sandiford <richard.sandif...@arm.com> wrote:
>> Also, gimple sometimes carries assumptions about undefined behaviour,
>> such as undefined overflow, whereas the intrinsics are defined to behave
>> in the same way as the underlying instructions.  For example, INT_MIN / -1
>> is well-defined when performed by svdiv_x.
>
> Thinking about this last night, I realized that this actually bites us here
> for this transformation, in the case of vector zeroes.
> The architecture pseudocode for the SVE SDIV and UDIV instructions gives a
> well-defined result of 0 when dividing a lane by 0.
> AFAIK integer division by zero is undefined for the generic vector extensions
> (and presumably GIMPLE) so we cannot do the fold arbitrarily.
> Furthermore, svdiv (ptrue, 0, 0) would give a vector of zeroes, not a vector
> of ones so we cannot have the backend fold this to a vector of 1s
> unconditionally.  We either leave it alone or have it generate some kind of
> CMPNE pN, pM, zX, 0 ; mov zD, pN, #1 sequence, which may be pushing the
> envelope of what we’re comfortable transforming.

Ah, yeah, good point.

> Thanks,
> Kyrill
>
> P.S. I still think it would be a good idea to have the backend perform
> constant folding when the operands of the intrinsics are known constants.

Yeah, that was my position too, which is what prompted the const_binop
suggestion earlier in the thread.  (But like you say, we have to be
careful for certain inputs.)

Thanks,
Richard
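
For illustration, here is a rough scalar sketch of the per-lane behaviour
discussed above.  It is plain C rather than SVE intrinsics, and the function
names (model_sdiv_lane, model_x_div_x_lane, model_self_check) are hypothetical
helpers invented for this example, not GCC or ACLE functions.  It assumes
32-bit lanes and that SDIV returns 0 for a zero divisor and wraps for
INT_MIN / -1, as described in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of one active 32-bit lane of SVE SDIV.
   Unlike the C '/' operator, both special cases below are well-defined.  */
int32_t
model_sdiv_lane (int32_t a, int32_t b)
{
  if (b == 0)
    return 0;                   /* Division by zero yields 0.  */
  if (a == INT32_MIN && b == -1)
    return INT32_MIN;           /* Result wraps; plain C would be UB here.  */
  return a / b;                 /* Otherwise round towards zero, as in C.  */
}

/* What a fold of x / x would have to produce per lane: 1 where x != 0,
   0 where x == 0.  This mirrors the CMPNE + MOV #1 idea, not a constant 1.  */
int32_t
model_x_div_x_lane (int32_t x)
{
  return x != 0 ? 1 : 0;
}

/* Check that the folded form matches the modelled division on a few
   inputs, including the ones that make an unconditional fold wrong.  */
void
model_self_check (void)
{
  const int32_t inputs[] = { 0, 1, -1, 42, INT32_MIN, INT32_MAX };
  for (unsigned i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    assert (model_sdiv_lane (inputs[i], inputs[i])
            == model_x_div_x_lane (inputs[i]));
}
```

The zero input is the case that rules out folding x / x to a vector of 1s
unconditionally: the modelled division gives 0 there, so any fold has to
keep the per-lane comparison or be skipped.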