On Fri, Oct 23, 2015 at 10:55 AM, Eduardo Lima Mitev <el...@igalia.com> wrote:
> When both fadd and fmul instructions have at least one operand that is a
> constant and it is only used once, the total number of instructions can
> be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because
> the constants will be propagated as immediate operands of fmul and fadd.
>
> This patch detects these situations and prevents fusing fmul+fadd into
> ffma.
>
> Shader-db results on i965 Haswell:
>
> total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
> instructions in affected programs:     1124094 -> 1114154 (-0.88%)
> total loops in shared programs:        1979 -> 1979 (0.00%)
> helped:                                7612
> HURT:                                  843
> GAINED:                                4
> LOST:                                  0
> ---
>  .../drivers/dri/i965/brw_nir_opt_peephole_ffma.c | 31 ++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> index a8448e7..c7fc15a 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir_opt_peephole_ffma.c
> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
>     return alu;
>  }
>
> +/**
> + * Given a list of (at least two) nir_alu_src's, tells if any of them is a
> + * constant value and is used only once.
> + */
> +static bool
> +any_alu_src_is_a_constant(nir_alu_src srcs[])
> +{
> +   for (unsigned i = 0; i < 2; i++) {
> +      if (srcs[i].src.ssa->parent_instr->type == nir_instr_type_load_const) {
> +         nir_load_const_instr *load_const =
> +            nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
> +
> +         if (list_is_single(&load_const->def.uses) &&
> +             list_empty(&load_const->def.if_uses)) {
> +            return true;
> +         }
> +      }
> +   }
> +
> +   return false;
> +}
> +

The comment above this function reads "Given a list of (at least two)
nir_alu_src's...", but the function checks exactly two. Was it your
intention to support lists with size > 2?

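If the intent was to handle more than two sources, one option might be to
pass the source count explicitly instead of hard-coding 2. Below is a
standalone sketch of that idea, not a proposed patch: mock_src and
any_src_is_single_use_constant are hypothetical names standing in for
nir_alu_src and the helper above, so that the snippet compiles outside
Mesa. The two counters mimic the list_is_single(uses) and
list_empty(if_uses) checks in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for nir_alu_src; not the real NIR type.
 * It records only the facts the check in the patch depends on. */
struct mock_src {
   bool is_load_const;    /* value is produced by a load_const instruction */
   unsigned num_uses;     /* number of regular uses of the value */
   unsigned num_if_uses;  /* number of uses as an if-condition */
};

/* Returns true if any of the first num_srcs sources is a constant that
 * has exactly one regular use and no if-condition uses. */
static bool
any_src_is_single_use_constant(const struct mock_src *srcs, unsigned num_srcs)
{
   for (unsigned i = 0; i < num_srcs; i++) {
      if (srcs[i].is_load_const &&
          srcs[i].num_uses == 1 &&
          srcs[i].num_if_uses == 0) {
         return true;
      }
   }

   return false;
}
```

With a signature like this the caller would state how many sources it is
passing (2 for both fmul and fadd), and the doc comment and the loop bound
could no longer disagree.
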
>  static bool
>  brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>  {
> @@ -183,6 +205,15 @@ brw_nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>        mul_src[0] = mul->src[0].src.ssa;
>        mul_src[1] = mul->src[1].src.ssa;
>
> +      /* If any of the operands of the fmul and any of the fadd is a constant,
> +       * we bypass because it will be more efficient as the constants will be
> +       * propagated as operands, potentially saving two load_const instructions.
> +       */
> +      if (any_alu_src_is_a_constant(mul->src) &&
> +          any_alu_src_is_a_constant(add->src)) {
> +         continue;
> +      }
> +
>        if (abs) {
>           for (unsigned i = 0; i < 2; i++) {
>              nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,
> --
> 2.5.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>