From: Ian Romanick <ian.d.roman...@intel.com>

This doesn't help on Intel GPUs now because we always take the
"always_precise" path first.  It may help on other GPUs, and it does
prevent a bunch of regressions in "intel/compiler: Don't always require
precise lowering of flrp".

I have CC'ed everyone responsible for drivers that set lower_flrp32
or lower_flrp64.

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
Cc: Marek Olšák <marek.ol...@amd.com>
Cc: Rob Clark <robdcl...@gmail.com>
Cc: Eric Anholt <e...@anholt.net>
Cc: Dave Airlie <airl...@redhat.com>
Cc: Timothy Arceri <tarc...@itsqueeze.com>
---
 src/compiler/nir/nir_lower_flrp.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
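
Reviewer note, not part of the patch: a minimal standalone sketch of the
two lowerings being compared, in plain C rather than NIR. The function
names are hypothetical, and the "imprecise" form shown (x + t(y - x)) is
an assumption about what the existing fast path computes. With a constant
t, (1 - t) folds to a constant, so the strict form costs two multiplies
and an add (or a multiply and an FMA), and its two products do not depend
on each other.

```c
/* Strict form: x(1 - t) + yt.
 * When t is a compile-time constant, (1 - t) can be folded, leaving
 * mul, mul, add. The two multiplies are independent of each other. */
static float flrp_strict(float x, float y, float t)
{
   return x * (1.0f - t) + y * t;
}

/* Assumed imprecise form: x + t(y - x).
 * Costs sub, mul, add, and each operation depends on the previous one. */
static float flrp_imprecise(float x, float y, float t)
{
   return x + t * (y - x);
}
```

For finite inputs the strict form returns exactly x at t = 0 and exactly
y at t = 1, since one product is zero and the other is unchanged.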

diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c
index 1a3c55d07a2..24282c3cbcf 100644
--- a/src/compiler/nir/nir_lower_flrp.c
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -555,6 +555,23 @@ convert_flrp_instruction(nir_builder *bld,
       }
    }
 
+   /*
+    * - If t is constant:
+    *
+    *        x(1 - t) + yt
+    *
+    *   The cost is three instructions without FMA or two instructions with
+    *   FMA.  This is the same cost as the imprecise lowering, but it gives
+    *   the instruction scheduler a little more freedom.
+    *
+    *   There is no need to handle t = 0.5 specially.  nir_opt_algebraic
+    *   already has optimizations to convert 0.5x + 0.5y to 0.5(x + y).
+    */
+   if (alu->src[2].src.ssa->parent_instr->type == nir_instr_type_load_const) {
+      replace_with_strict(bld, dead_flrp, alu);
+      return;
+   }
+
    /*
     * - Otherwise
     *
-- 
2.14.4
