Re: [RFC][PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones

Richard Sandiford Tue, 13 Aug 2024 01:58:42 -0700

Kyrylo Tkachov <ktkac...@nvidia.com> writes:
>> On 7 Aug 2024, at 15:09, Richard Sandiford <richard.sandif...@arm.com> wrote:
>> Also, gimple sometimes carries assumptions about undefined behaviour,
>> such as undefined overflow, whereas the intrinsics are defined to behave
>> in the same way as the underlying instructions.  For example, INT_MIN / -1
>> is well-defined when performed by svdiv_x.
>
> Thinking about this last night, I realized that this actually bites us here 
> for this transformation, in the case of vector zeroes.
> The architecture pseudocode for the SVE SDIV and UDIV instructions gives a
> well-defined result of 0 when dividing a lane by 0.
> AFAIK integer division by zero is undefined for the generic vector extensions 
> (and presumably GIMPLE) so we cannot do the fold arbitrarily.
> Furthermore, svdiv (ptrue, 0, 0) would give a vector of zeroes, not a vector
> of ones, so we cannot have the backend fold this to a vector of 1s
> unconditionally. We either leave it alone or have it generate some kind of
> CMPNE pN, pM, zX, 0 ; mov zD, pN, #1 sequence, which may be pushing the
> envelope of what we’re comfortable transforming.


Ah, yeah, good point.
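
For illustration only, the lane semantics discussed above can be sketched
as a scalar C model.  This is not GCC code and does not use the real
intrinsics: the function names are hypothetical, the model covers a single
signed 32-bit lane, and the INT_MIN / -1 result assumes the usual wrapping
behaviour of the instruction.  The point it shows is that x / x is 1 only
for nonzero lanes, which is what the CMPNE-plus-mov sequence would have to
reproduce.

```c
#include <stdint.h>

/* Hypothetical scalar model of one signed 32-bit lane of SVE SDIV.
   Unlike C's '/', it is defined for every pair of inputs.  */
static inline int32_t
model_sdiv_lane (int32_t a, int32_t b)
{
  if (b == 0)
    return 0;                /* Division by zero gives 0, per the thread.  */
  if (a == INT32_MIN && b == -1)
    return INT32_MIN;        /* Assumption: the result wraps.  */
  return a / b;              /* Remaining cases are safe in C.  */
}

/* Per-lane result of svdiv (ptrue, x, x) under the model above.  */
static inline int32_t
model_div_self (int32_t x)
{
  return model_sdiv_lane (x, x);
}

/* Per-lane result of the compare-and-select alternative:
   1 where the lane is nonzero, 0 elsewhere.  */
static inline int32_t
model_folded (int32_t x)
{
  return x != 0 ? 1 : 0;
}
```

Under this model, model_div_self and model_folded agree on every input,
whereas an unconditional fold to 1 disagrees for x == 0.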

> Thanks,
> Kyrill
>
> P.S. I still think it would be a good idea to have the backend perform 
> constant folding when the operands of the intrinsics are known constants.

Yeah, that was my position too, which is what prompted the const_binop
suggestion earlier in the thread.  (But like you say, we have to be
careful for certain inputs.)

Thanks,
Richard
