lower_instruction: handle denorms and overflow in ldexp correctly

Matt Turner Thu, 28 Sep 2017 15:55:06 -0700

On Sat, Sep 16, 2017 at 4:23 AM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
> From: Nicolai Hähnle <nicolai.haeh...@amd.com>
>
> GLSL ES requires both, and while GLSL explicitly doesn't require correct
> overflow handling, it does appear to require handling input inf/denorms
> correctly.
>
> Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*
>
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/compiler/glsl/lower_instructions.cpp | 171 
> +++++++++++++++++++------------
>  1 file changed, 107 insertions(+), 64 deletions(-)
>
> diff --git a/src/compiler/glsl/lower_instructions.cpp 
> b/src/compiler/glsl/lower_instructions.cpp
> index 0c1408911d2..362562a4af6 100644
> --- a/src/compiler/glsl/lower_instructions.cpp
> +++ b/src/compiler/glsl/lower_instructions.cpp
> @@ -358,140 +358,183 @@ 
> lower_instructions_visitor::mod_to_floor(ir_expression *ir)
>  }
>
>  void
>  lower_instructions_visitor::ldexp_to_arith(ir_expression *ir)
>  {
>     /* Translates
>      *    ir_binop_ldexp x exp
>      * into
>      *
>      *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
> -    *    resulting_biased_exp = extracted_biased_exp + exp;
> +    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
>      *
> -    *    if (resulting_biased_exp < 1 || x == 0.0f) {
> -    *       return copysign(0.0, x);
> +    *    if (extracted_biased_exp >= 255)
> +    *       return x; // +/-inf, NaN
> +    *
> +    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
> +    *
> +    *    if (min(resulting_biased_exp, extracted_biased_exp) < 1)
> +    *       resulting_biased_exp = 0;
> +    *    if (resulting_biased_exp >= 255 ||
> +    *        min(resulting_biased_exp, extracted_biased_exp) < 1) {
> +    *       sign_mantissa &= sign_mask;
>      *    }
>      *
> -    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
> +    *    return bitcast_u2f(sign_mantissa |
>      *                       lshift(i2u(resulting_biased_exp), exp_shift));
>      *
>      * which we can't actually implement as such, since the GLSL IR doesn't
>      * have vectorized if-statements. We actually implement it without 
> branches
>      * using conditional-select:
>      *
>      *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
> -    *    resulting_biased_exp = extracted_biased_exp + exp;
> +    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
>      *
> -    *    is_not_zero_or_underflow = logic_and(nequal(x, 0.0f),
> -    *                                         gequal(resulting_biased_exp, 
> 1);
> -    *    x = csel(is_not_zero_or_underflow, x, copysign(0.0f, x));
> -    *    resulting_biased_exp = csel(is_not_zero_or_underflow,
> -    *                                resulting_biased_exp, 0);
> +    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
>      *
> -    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
> -    *                       lshift(i2u(resulting_biased_exp), exp_shift));
> +    *    flush_to_zero = lequal(min(resulting_biased_exp, 
> extracted_biased_exp), 0);
> +    *    resulting_biased_exp = csel(flush_to_zero, 0, resulting_biased_exp)
> +    *    zero_mantissa = logic_or(flush_to_zero,
> +    *                             gequal(resulting_biased_exp, 255));
> +    *    sign_mantissa = csel(zero_mantissa, sign_mantissa & sign_mask, 
> sign_mantissa);
> +    *
> +    *    result = sign_mantissa |
> +    *             lshift(i2u(resulting_biased_exp), exp_shift));
> +    *
> +    *    return csel(extracted_biased_exp >= 255, x, bitcast_u2f(result));
> +    *
> +    * The definition of ldexp in the GLSL spec says:
> +    *
> +    *    "If this product is too large to be represented in the
> +    *     floating-point type, the result is undefined."
> +    *
> +    * However, the definition of ldexp in the GLSL ES spec does not contain
> +    * this sentence, so we do need to handle overflow correctly.


Without contradictory evidence, I would expect that was an oversight
and that it's also undefined in GLSL ES.

Is this required to fix the dEQP test you mention? If it's actually
spec'd to behave differently between desktop and ES then that's really
embarrassing.

As mentioned, you may want to just replace this whole lowering pass
with what Jason's done in nir_opt_algebraic.py for ldexp.

Whatever you decide to do, this patch is:

Acked-by: Matt Turner <matts...@gmail.com>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 01/11] glsl/lower_instruction: handle denorms and overflow in ldexp correctly

Reply via email to