================
@@ -0,0 +1,21 @@
+// TYPE sign(TYPE x) {
+//   if (isnan(x)) {
+//     return 0.0F;
+//   }
+//   if (x > 0.0F) {
+//     return 1.0F;
+//   }
+//   if (x < 0.0F) {
+//     return -1.0F;
+//   }
+//   return x; /* -0.0 or +0.0 */
+// }
+_CLC_DEF _CLC_OVERLOAD __CLC_GENTYPE __clc_sign(__CLC_GENTYPE x) {
+  __CLC_BIT_INTN x_isnan = __clc_isnan(x);
+  __CLC_BIT_INTN x_isgreater_zero = x > __CLC_FP_LIT(0.0);
+  __CLC_BIT_INTN x_isless_zero = x < __CLC_FP_LIT(0.0);
+  __CLC_GENTYPE sel0 = __clc_select(x, __CLC_FP_LIT(1.0), x_isgreater_zero);
+  __CLC_GENTYPE sel1 = __clc_select(sel0, __CLC_FP_LIT(-1.0), x_isless_zero);
+  __CLC_GENTYPE sel2 = __clc_select(sel1, __CLC_FP_LIT(0.0), x_isnan);
+  return sel2;
+}
----------------
frasercrmck wrote:
Nice idea, thanks. I think we have to account for `%x` being NaN in the `copysign` though. Alive2 picks up on the fact that we'd copy the sign bit from a negative NaN and return `-0.0`: https://alive2.llvm.org/ce/z/8E6wp8. The `@src` there is what's generated by this patch.

Note that the OpenCL-CTS seems happy enough with either version, which is unexpected.

If we had `return copysign((__builtin_isnan(x) || (x == 0.0f)) ? 0.0f : 1.0f, __builtin_isnan(x) ? 0.0 : x);` then Alive2 is happy enough. The double check for NaN starts to look a bit off, but the IR produced is:

```llvm
define float @tgt(float %a) {
  %v0 = fcmp ord float %a, 0.000000e+00
  %v1 = fcmp ueq float %a, 0.000000e+00
  %cond.i = select i1 %v1, float 0.000000e+00, float 1.000000e+00
  %cond3.i = select i1 %v0, float %a, float 0.000000e+00
  %v2 = tail call noundef float @llvm.copysign.f32(float %cond.i, float %cond3.i)
  ret float %v2
}
```

What do you think? Still better than the triple fcmp + triple select?

https://github.com/llvm/llvm-project/pull/115699
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
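To make the negative-NaN case concrete, here is a minimal scalar sketch in plain C rather than the libclc macros. The names `sign_ref`, `sign_copysign_unguarded` and `sign_copysign_guarded` are hypothetical, and the unguarded variant is only an assumption about what a `copysign` form without the second NaN check would look like. It also assumes default floating-point semantics (no `-ffast-math`).

```c
#include <math.h>

/* Reference semantics, taken from the comment block in the patch. */
float sign_ref(float x) {
  if (isnan(x)) return 0.0f;
  if (x > 0.0f) return 1.0f;
  if (x < 0.0f) return -1.0f;
  return x; /* -0.0 or +0.0 */
}

/* Hypothetical unguarded form: the sign operand is x itself, so a
 * negative NaN passes its sign bit through and the result is -0.0. */
float sign_copysign_unguarded(float x) {
  return copysignf((isnan(x) || x == 0.0f) ? 0.0f : 1.0f, x);
}

/* Guarded form from the comment above: NaN is replaced by +0.0 in the
 * sign operand, so the result for any NaN is +0.0. */
float sign_copysign_guarded(float x) {
  return copysignf((isnan(x) || x == 0.0f) ? 0.0f : 1.0f,
                   isnan(x) ? 0.0f : x);
}
```

Feeding `copysignf(NAN, -1.0f)` to each function and checking `signbit()` on the result should show the difference: the reference and guarded forms return `+0.0`, while the unguarded form returns `-0.0`. The two zeros compare equal under `==`, which may be part of why a test suite comparing values rather than bit patterns would accept either version.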