Based on our discussion, I have submitted two new patches for folding SVE intrinsics (svdiv and svmul):
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660744.html
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660745.html

> On 13 Aug 2024, at 10:58, Richard Sandiford <richard.sandif...@arm.com> wrote:
>
> Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>>> On 7 Aug 2024, at 15:09, Richard Sandiford <richard.sandif...@arm.com>
>>> wrote:
>>> Also, gimple sometimes carries assumptions about undefined behaviour,
>>> such as undefined overflow, whereas the intrinsics are defined to behave
>>> in the same way as the underlying instructions.  For example, INT_MIN / -1
>>> is well-defined when performed by svdiv_x.
>>
>> Thinking about this last night, I realized that this actually bites us here
>> for this transformation, in the case of vector zeroes.
>> The architecture pseudocode for the SVE SDIV and UDIV instructions gives a
>> well-defined result of 0 when dividing a lane by 0.
>> AFAIK integer division by zero is undefined for the generic vector
>> extensions (and presumably GIMPLE) so we cannot do the fold arbitrarily.
>> Furthermore, svdiv (ptrue, 0, 0) would give a vector of zeroes, not a vector
>> of ones so we cannot have the backend fold this to a vector of 1s
>> unconditionally.  We either leave it alone or have it generate some kind of
>> CMPNE pN, pM, zX, 0 ; mov zD, pN, #1 sequence, which may be pushing the
>> envelope of what we’re comfortable transforming.
>
> Ah, yeah, good point.
>
>> Thanks,
>> Kyrill
>>
>> P.S. I still think it would be a good idea to have the backend perform
>> constant folding when the operands of the intrinsics are known constants.
>
> Yeah, that was my position too, which is what prompted the const_binop
> suggestion earlier in the thread.  (But like you say, we have to be
> careful for certain inputs.)
>
> Thanks,
> Richard
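
For reference, the lane semantics discussed in the quoted text can be modelled in plain C. This is only a sketch under stated assumptions: the function names (model_sdiv_lane, model_udiv_lane, model_fold_div_self, model_self_check) are hypothetical and are not part of GCC or the SVE intrinsics. The result of 0 for division by zero is taken from the description of the architecture pseudocode quoted above. The wrap-around of INT_MIN / -1 to INT_MIN is my assumption about what "well-defined" means here and is not stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one signed 32-bit division lane (assumed semantics). */
static int32_t model_sdiv_lane(int32_t a, int32_t b)
{
    if (b == 0)
        return 0;               /* division by zero yields 0, per the thread */
    if (a == INT32_MIN && b == -1)
        return INT32_MIN;       /* assumed wrap-around; also avoids C UB */
    return a / b;               /* C division truncates towards zero */
}

/* Scalar model of one unsigned 32-bit division lane. */
static uint32_t model_udiv_lane(uint32_t a, uint32_t b)
{
    return b == 0 ? 0u : a / b;
}

/* Model of a safe fold of "x / x" with all lanes active:
   1 where the lane is nonzero, 0 where it is zero.  This mirrors the
   compare-then-select idea mentioned in the quoted text.  */
static void model_fold_div_self(const int32_t *x, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (x[i] != 0) ? 1 : 0;
}

/* Check that the fold agrees with lane-by-lane division on a few inputs. */
static void model_self_check(void)
{
    const int32_t x[] = { 0, 1, -1, 42, INT32_MIN, INT32_MAX };
    int32_t folded[sizeof x / sizeof x[0]];
    size_t n = sizeof x / sizeof x[0];

    model_fold_div_self(x, folded, n);
    for (size_t i = 0; i < n; i++)
        assert(folded[i] == model_sdiv_lane(x[i], x[i]));
}
```

With this model, a fold of x / x to an unconditional vector of 1s disagrees with model_sdiv_lane only in the zero lanes, which is the case raised above.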