On Mon, Sep 28, 2015 at 3:26 PM, Matt Turner <matts...@gmail.com> wrote:
> total instructions in shared programs: 6496326 -> 6492315 (-0.06%)
> instructions in affected programs:     159282 -> 155271 (-2.52%)
> helped:                                411
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources        |   1 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp              |   1 +
>  src/mesa/drivers/dri/i965/brw_predicate_block.cpp | 104 ++++++++++++++++++++++
>  src/mesa/drivers/dri/i965/brw_shader.h            |   6 +-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp            |   1 +
>  5 files changed, 112 insertions(+), 1 deletion(-)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_predicate_block.cpp
>
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
> index cc3ecaf..9b1a039 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -90,6 +90,7 @@ i965_FILES = \
>  	brw_packed_float.c \
>  	brw_performance_monitor.c \
>  	brw_pipe_control.c \
> +	brw_predicate_block.cpp \
>  	brw_primitive_restart.c \
>  	brw_program.c \
>  	brw_program.h \
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 5ca5c26..7c7cb0d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -4844,6 +4844,7 @@ fs_visitor::optimize()
>        OPT(opt_cmod_propagation);
>        OPT(dead_code_eliminate);
>        OPT(opt_peephole_sel);
> +      OPT(opt_predicate_block, this);
>        OPT(dead_control_flow_eliminate, this);
>        OPT(opt_register_renaming);
>        OPT(opt_redundant_discard_jumps);
> diff --git a/src/mesa/drivers/dri/i965/brw_predicate_block.cpp b/src/mesa/drivers/dri/i965/brw_predicate_block.cpp
> new file mode 100644
> index 0000000..4973172
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_predicate_block.cpp
> @@ -0,0 +1,104 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "brw_cfg.h"
> +
> +/** @file brw_predicate_block.cpp
> + *
> + * This file contains the opt_predicate_block() optimization pass that moves a
> + * small block of instructions from inside an IF/ENDIF block to before the IF
> + * instruction by predicating them. For example,
> + *
> + * Before:
> + *
> + *    CMP.f0
> + *    (+f0) IF
> + *    MUL ...
> + *    ADD ...
> + *    ENDIF
> + *
> + * After:
> + *
> + *    CMP.f0
> + *    (+f0) MUL ...
> + *    (+f0) ADD ...
> + *    (+f0) IF
> + *    ENDIF
> + *
> + * dead_control_flow_eliminate() is then able to remove the IF/ENDIF pair and
> + * combine basic blocks.
> + */
> +
> +bool
> +opt_predicate_block(backend_shader *s)
> +{
> +   bool progress = false;
> +
> +   foreach_block_safe(block, s->cfg) {
> +      if (block->num == 0 || block->num == s->cfg->num_blocks - 1)
> +         continue;
> +
> +      if (block->end_ip - block->start_ip > 3)
> +         continue;
> +
> +      bblock_t *if_block = block->prev();
> +      backend_instruction *if_inst = if_block->end();
> +      if (if_inst->opcode != BRW_OPCODE_IF ||
> +          if_inst->conditional_mod != BRW_CONDITIONAL_NONE)
> +         continue;
> +
> +      backend_instruction *endif_inst = block->next()->start();
> +      if (endif_inst->opcode != BRW_OPCODE_ENDIF)
> +         continue;
> +
> +      bool skip = false;
> +
> +      foreach_inst_in_block(backend_instruction, inst, block) {
> +         if (inst->opcode <= BRW_OPCODE_NOP && !inst->is_control_flow()) {

I was looking at shaders and noticed that this doesn't handle math
instructions, so I added that, which gives an additional

total instructions in shared programs: 6491241 -> 6490857 (-0.01%)
instructions in affected programs:     16200 -> 15816 (-2.37%)
helped:                                65

But also

LOST:                                  2

which is, of course, unfortunate because one of them exhibits a pretty
sizable decrease:

FS SIMD8: 816 -> 786 (-3.68%)

Ilia also noted on IRC that the NVIDIA proprietary driver predicates blocks
of instructions but leaves the branches in place that jump if all channels
are off. That's interesting, but I think a lot of the benefit we see from
this on i965 is because it allows us to combine basic blocks so other passes
work better.

Moral of the story is, I think it's time to work on the instruction
scheduler.
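
For anyone following along without the tree handy, here is a toy model of the transformation discussed above. It is only a sketch, not the code from the patch: the Inst struct, the Op enum, and both function names are made up for illustration, and it works on a flat instruction list rather than on the CFG. The four-instruction limit mirrors the "end_ip - start_ip > 3" check in the quoted code, and skipping bodies that contain already-predicated instructions is my own assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction record; not Mesa's backend_instruction.
enum class Op { CMP, IF, ELSE, ENDIF, MUL, ADD };

struct Inst {
    Op op;
    bool predicated;
    Inst(Op o, bool p = false) : op(o), predicated(p) {}
};

static bool is_control_flow(Op op)
{
    return op == Op::IF || op == Op::ELSE || op == Op::ENDIF;
}

// Hoist small straight-line IF bodies in front of their IF by predicating
// each body instruction. max_body is an assumed limit (see lead-in).
bool predicate_small_if_blocks(std::vector<Inst> &prog, std::size_t max_body = 4)
{
    bool progress = false;

    for (std::size_t i = 0; i < prog.size(); ++i) {
        if (prog[i].op != Op::IF || !prog[i].predicated)
            continue;

        // Walk the body: stop at control flow or at an instruction that is
        // already predicated (assumption: we cannot re-predicate those).
        std::size_t j = i + 1;
        while (j < prog.size() && !is_control_flow(prog[j].op) &&
               !prog[j].predicated)
            ++j;

        const std::size_t body = j - (i + 1);
        if (j >= prog.size() || prog[j].op != Op::ENDIF)
            continue;
        if (body == 0 || body > max_body)
            continue;

        for (std::size_t k = i + 1; k < j; ++k)
            prog[k].predicated = true;

        // Move the IF from position i to just before the ENDIF at j.
        std::rotate(prog.begin() + i, prog.begin() + i + 1, prog.begin() + j);

        progress = true;
        i = j; // resume scanning after the ENDIF
    }

    return progress;
}

// Stand-in for what dead_control_flow_eliminate() does afterwards: drop
// IF/ENDIF pairs with nothing between them (single forward scan).
void remove_empty_ifs(std::vector<Inst> &prog)
{
    for (std::size_t i = 0; i + 1 < prog.size();) {
        if (prog[i].op == Op::IF && prog[i + 1].op == Op::ENDIF)
            prog.erase(prog.begin() + i, prog.begin() + i + 2);
        else
            ++i;
    }
}
```

Running the two functions back to back on the CMP / IF / MUL / ADD / ENDIF example from the file comment leaves CMP followed by the two predicated ALU instructions. Note that the toy version always deletes the empty branch, which is the opposite of the NVIDIA behavior Ilia described, where the branch stays in place.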