================
@@ -46,9 +46,7 @@ _CLC_DEF _CLC_OVERLOAD __CLC_GENTYPE __clc_hypot(__CLC_GENTYPE x,
   __CLC_GENTYPE retval = __clc_sqrt(__clc_mad(fx, fx, fy * fy)) * fx_exp;
   retval = (ux > PINFBITPATT_SP32 || uy == 0) ? __CLC_AS_GENTYPE(ux) : retval;
-  retval = (ux == PINFBITPATT_SP32 || uy == PINFBITPATT_SP32)
-               ? __CLC_AS_GENTYPE((__CLC_UINTN)PINFBITPATT_SP32)
-               : retval;
+  retval = __clc_isinf(x) || __clc_isinf(y) ? __CLC_GENTYPE_INF : retval;
----------------
frasercrmck wrote:
I was looking at `V_FREXP_EXP_I32_F32` and `V_FREXP_MANT_F32` in the [RDNA 3.5 docs](https://www.amd.com/content/dam/amd/en/documents/radeon-tech-docs/instruction-set-architectures/rdna35_instruction_set_architecture.pdf). I had previously looked at the lowering of `llvm.frexp` in the CodeGen tests and saw more than just those two instructions, but I now realise I was only looking at the GFX6 output. On other architectures the lowering is indeed just those two instructions.

My broader point is that I need to align the semantics of these instructions with what other targets would have to do for the equivalent operations. Matching the AMDGPU behaviour like-for-like on other targets (separating out the exp and mant operations, supporting subnormals, etc.) would be far more expensive than the bithacking that `hypot` currently does to just shift out the mantissa. But if the AMDGPU version of "frexp_mant" does subnormal scaling and the versions on other targets don't, that could cause bugs between platforms, because we can no longer guarantee the bits that come out of the frexp_mant and frexp_exp helpers.

That's why I suggested it might be better to have AMDGPU override `hypot` directly, rather than rely on a "generic" `hypot` that calls into mant/exp helpers which AMDGPU specializes.

https://github.com/llvm/llvm-project/pull/129738
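To make the subnormal concern concrete, here is a minimal host-side C sketch. It is not the libclc code and not a model of the AMDGPU instructions: `bithack_exp`, `bithack_mant` and the two `roundtrips_*` functions are hypothetical names invented for this illustration, and the standard `frexpf` is used only as a stand-in for a subnormal-aware mant/exp split. It assumes IEEE-754 binary32 `float` with subnormals enabled on the host.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: unbiased exponent read straight from the exponent
   bit field, with no special handling for subnormal inputs. */
static int bithack_exp(float x) {
  uint32_t u;
  memcpy(&u, &x, sizeof u);
  return (int)((u >> 23) & 0xFFu) - 127;
}

/* Hypothetical helper: keep the sign and mantissa bits and force the
   exponent field to the bias (127), so a normal input maps into [1, 2). */
static float bithack_mant(float x) {
  uint32_t u;
  float m;
  memcpy(&u, &x, sizeof u);
  u = (u & 0x807FFFFFu) | 0x3F800000u;
  memcpy(&m, &u, sizeof m);
  return m;
}

/* Returns nonzero if mant * 2^exp from the bit-level split rebuilds x. */
static int roundtrips_bithack(float x) {
  return ldexpf(bithack_mant(x), bithack_exp(x)) == x;
}

/* Same check using frexpf, which normalizes subnormal inputs. */
static int roundtrips_frexp(float x) {
  int e;
  float m = frexpf(x, &e);
  return ldexpf(m, e) == x;
}
```

For a normal input such as `8.0f` both splits rebuild the original value, even though they use different mantissa ranges. For the smallest subnormal (`0x1p-149f`) only the `frexpf` split does, which is the kind of target-dependent mismatch a shared "generic" `hypot` built on specializable helpers would be exposed to.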