On Mon, Mar 23, 2015 at 8:13 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: > The idea here is that fusing multiply-add combinations too early can reduce > our ability to perform CSE and value-numbering. Instead, we split ffma > opcodes up-front, hope CSE cleans up, and then fuse after-the-fact. > Unless an algebraic pass does something silly where it inserts something > between the multiply and the add, splitting and re-fusing should never > cause a problem. We run the late algebraic optimizations after this so > that things like compare-with-zero don't hurt our ability to fuse things. > > shader-db results for fragment shaders on Haswell: > total instructions in shared programs: 4390538 -> 4379236 (-0.26%) > instructions in affected programs: 989359 -> 978057 (-1.14%) > helped: 5308 > HURT: 97 > GAINED: 78 > LOST: 5 > > This does, unfortunately, cause some substantial hurt to a shader in Kerbal > Space Program. However, the damage is caused by changing a single > instruction from a ffma to an add. This, in turn, *decreases* register > pressure in one part of the program causing it to fail to register allocate > and spill. Given the overwhelmingly positive results in other shaders and > the fact that the NIR for the Kerbal shaders is actually better, this > should be considered a positive. > --- > src/mesa/drivers/dri/i965/brw_context.c | 1 + > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 ++++++++-- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c > b/src/mesa/drivers/dri/i965/brw_context.c > index ed6fdff..7cab1c9 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.c > +++ b/src/mesa/drivers/dri/i965/brw_context.c > @@ -558,6 +558,7 @@ brw_initialize_context_constants(struct brw_context *brw) > > static const nir_shader_compiler_options gen6_nir_options = { > .native_integers = true, > + .lower_ffma = true,
I see what's going on, but this probably deserves a comment. > }; > > /* We want the GLSL compiler to emit code that uses condition codes */ > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index cdf9b01..b0b86e3 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -51,8 +51,6 @@ nir_optimize(nir_shader *nir) > nir_validate_shader(nir); > progress |= nir_opt_algebraic(nir); > nir_validate_shader(nir); > - progress |= nir_opt_peephole_ffma(nir); > - nir_validate_shader(nir); > progress |= nir_opt_constant_folding(nir); > nir_validate_shader(nir); > progress |= nir_opt_remove_phis(nir); > @@ -131,6 +129,12 @@ fs_visitor::emit_nir_code() > > nir_optimize(nir); > > + if (brw->gen >= 6) { > + /* Try and fuse multiply-adds */ > + nir_opt_peephole_ffma(nir); > + nir_validate_shader(nir); > + } > + > nir_opt_algebraic_late(nir); > nir_validate_shader(nir); > > @@ -141,6 +145,8 @@ fs_visitor::emit_nir_code() > nir_validate_shader(nir); > nir_copy_prop(nir); > nir_validate_shader(nir); > + nir_opt_dce(nir); > + nir_validate_shader(nir); Just to confirm: this did help something? > > if (unlikely(debug_enabled)) { > fprintf(stderr, "NIR (SSA form) for %s shader:\n", stage_name); > -- > 2.3.3 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev