From: Ian Romanick <ian.d.roman...@intel.com>

This doesn't help on Intel GPUs now because we always take the
"always_precise" path first.  It may help on other GPUs, and it does
prevent a bunch of regressions in "intel/compiler: Don't always require
precise lowering of flrp".

I have CC'ed everyone responsible for drivers that set lower_flrp32 or
lower_flrp64.

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
Cc: Marek Olšák <marek.ol...@amd.com>
Cc: Rob Clark <robdcl...@gmail.com>
Cc: Eric Anholt <e...@anholt.net>
Cc: Dave Airlie <airl...@redhat.com>
Cc: Timothy Arceri <tarc...@itsqueeze.com>
---
 src/compiler/nir/nir_lower_flrp.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/compiler/nir/nir_lower_flrp.c b/src/compiler/nir/nir_lower_flrp.c
index 1a3c55d07a2..24282c3cbcf 100644
--- a/src/compiler/nir/nir_lower_flrp.c
+++ b/src/compiler/nir/nir_lower_flrp.c
@@ -555,6 +555,23 @@ convert_flrp_instruction(nir_builder *bld,
       }
    }
 
+   /*
+    * - If t is constant:
+    *
+    *        x(1 - t) + yt
+    *
+    *   The cost is three instructions without FMA or two instructions with
+    *   FMA.  This is the same cost as the imprecise lowering, but it gives
+    *   the instruction scheduler a little more freedom.
+    *
+    *   There is no need to handle t = 0.5 specially.  nir_opt_algebraic
+    *   already has optimizations to convert 0.5x + 0.5y to 0.5(x + y).
+    */
+   if (alu->src[2].src.ssa->parent_instr->type == nir_instr_type_load_const) {
+      replace_with_strict(bld, dead_flrp, alu);
+      return;
+   }
+
    /*
     * - Otherwise
     *
-- 
2.14.4
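
For readers comparing the two lowerings mentioned in the comment, here is a
minimal scalar sketch in plain C.  It is not code from nir_lower_flrp.c: the
names flrp_imprecise and flrp_strict are hypothetical, and it assumes the
strict form is x(1 - t) + yt as the comment states and that the imprecise
form is x + t(y - x).

```c
#include <stdio.h>

/* Hypothetical scalar model of the imprecise lowering: x + t * (y - x).
 * One subtract, one multiply, one add (or a subtract plus an FMA).  Each
 * operation consumes the result of the previous one, so they form a
 * single dependency chain.
 */
static float
flrp_imprecise(float x, float y, float t)
{
   return x + t * (y - x);
}

/* Hypothetical scalar model of the strict lowering: x * (1 - t) + y * t.
 * When t is a constant, (1 - t) can be folded ahead of time, which leaves
 * two multiplies and an add (or one multiply plus an FMA).  The two
 * multiplies do not depend on each other.
 */
static float
flrp_strict(float x, float y, float t)
{
   return x * (1.0f - t) + y * t;
}

/* Print both forms at the endpoints and the midpoint for one input pair. */
static void
flrp_print_samples(float x, float y)
{
   const float ts[] = { 0.0f, 0.5f, 1.0f };

   for (unsigned i = 0; i < sizeof(ts) / sizeof(ts[0]); i++) {
      printf("t=%g strict=%g imprecise=%g\n", ts[i],
             flrp_strict(x, y, ts[i]), flrp_imprecise(x, y, ts[i]));
   }
}
```

With t constant, both forms need the same number of operations, but in the
strict form the two products can be issued in either order, which is one
reading of the "a little more freedom" remark above.  The strict form also
returns x at t = 0 and y at t = 1 without relying on y - x being
representable.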