On 29.09.2017 00:54, Matt Turner wrote:
On Sat, Sep 16, 2017 at 4:23 AM, Nicolai Hähnle <nhaeh...@gmail.com> wrote:
From: Nicolai Hähnle <nicolai.haeh...@amd.com>

GLSL ES requires both, and while GLSL explicitly doesn't require correct
overflow handling, it does appear to require handling input inf/denorms
correctly.

Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*

Cc: mesa-sta...@lists.freedesktop.org
---
  src/compiler/glsl/lower_instructions.cpp | 171 +++++++++++++++++++------------
  1 file changed, 107 insertions(+), 64 deletions(-)

diff --git a/src/compiler/glsl/lower_instructions.cpp 
b/src/compiler/glsl/lower_instructions.cpp
index 0c1408911d2..362562a4af6 100644
--- a/src/compiler/glsl/lower_instructions.cpp
+++ b/src/compiler/glsl/lower_instructions.cpp
@@ -358,140 +358,183 @@ lower_instructions_visitor::mod_to_floor(ir_expression 
*ir)
  }

  void
  lower_instructions_visitor::ldexp_to_arith(ir_expression *ir)
  {
     /* Translates
      *    ir_binop_ldexp x exp
      * into
      *
      *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
-    *    resulting_biased_exp = extracted_biased_exp + exp;
+    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
      *
-    *    if (resulting_biased_exp < 1 || x == 0.0f) {
-    *       return copysign(0.0, x);
+    *    if (extracted_biased_exp >= 255)
+    *       return x; // +/-inf, NaN
+    *
+    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
+    *
+    *    if (min(resulting_biased_exp, extracted_biased_exp) < 1)
+    *       resulting_biased_exp = 0;
+    *    if (resulting_biased_exp >= 255 ||
+    *        min(resulting_biased_exp, extracted_biased_exp) < 1) {
+    *       sign_mantissa &= sign_mask;
      *    }
      *
-    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
+    *    return bitcast_u2f(sign_mantissa |
      *                       lshift(i2u(resulting_biased_exp), exp_shift));
      *
      * which we can't actually implement as such, since the GLSL IR doesn't
      * have vectorized if-statements. We actually implement it without branches
      * using conditional-select:
      *
      *    extracted_biased_exp = rshift(bitcast_f2i(abs(x)), exp_shift);
-    *    resulting_biased_exp = extracted_biased_exp + exp;
+    *    resulting_biased_exp = min(extracted_biased_exp + exp, 255);
      *
-    *    is_not_zero_or_underflow = logic_and(nequal(x, 0.0f),
-    *                                         gequal(resulting_biased_exp, 1);
-    *    x = csel(is_not_zero_or_underflow, x, copysign(0.0f, x));
-    *    resulting_biased_exp = csel(is_not_zero_or_underflow,
-    *                                resulting_biased_exp, 0);
+    *    sign_mantissa = bitcast_f2u(x) & sign_mantissa_mask;
      *
-    *    return bitcast_u2f((bitcast_f2u(x) & sign_mantissa_mask) |
-    *                       lshift(i2u(resulting_biased_exp), exp_shift));
+    *    flush_to_zero = lequal(min(resulting_biased_exp, 
extracted_biased_exp), 0);
+    *    resulting_biased_exp = csel(flush_to_zero, 0, resulting_biased_exp)
+    *    zero_mantissa = logic_or(flush_to_zero,
+    *                             gequal(resulting_biased_exp, 255));
+    *    sign_mantissa = csel(zero_mantissa, sign_mantissa & sign_mask, 
sign_mantissa);
+    *
+    *    result = sign_mantissa |
+    *             lshift(i2u(resulting_biased_exp), exp_shift));
+    *
+    *    return csel(extracted_biased_exp >= 255, x, bitcast_u2f(result));
+    *
+    * The definition of ldexp in the GLSL spec says:
+    *
+    *    "If this product is too large to be represented in the
+    *     floating-point type, the result is undefined."
+    *
+    * However, the definition of ldexp in the GLSL ES spec does not contain
+    * this sentence, so we do need to handle overflow correctly.

Without contradictory evidence, I would expect that was an oversight
and that it's also undefined in GLSL ES.

Is this required to fix the dEQP test you mention? If it's actually
spec'd to behave differently between desktop and ES then that's really
embarrassing.

Yes, it is, and yes, it is...



As mentioned, you may want to just replace this whole lowering pass
with what Jason's done in nir_opt_algebraic.py for ldexp.

Whatever you decide to do, this patch is:

Acked-by: Matt Turner <matts...@gmail.com>

Thanks.

Cheers,
Nicolai


--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to