On Thu, Mar 20, 2014 at 12:28 PM, Eric Anholt <e...@anholt.net> wrote:
> Matt Turner <matts...@gmail.com> writes:
>
>> With an awful O(n^2) algorithm that searches previous instructions for
>> dead writes.
>>
>> total instructions in shared programs: 805582 -> 788074 (-2.17%)
>> instructions in affected programs:     144561 -> 127053 (-12.11%)
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 46 ++++++++++++++++++++++++++++++++++
>>  1 file changed, 46 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index 4ad398a..e9219a9 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -369,6 +369,7 @@ bool
>>  vec4_visitor::dead_code_eliminate()
>>  {
>>     bool progress = false;
>> +   bool seen_control_flow = false;
>>     int pc = -1;
>>
>>     calculate_live_intervals();
>> @@ -378,6 +379,8 @@ vec4_visitor::dead_code_eliminate()
>>
>>        pc++;
>>
>> +      seen_control_flow = inst->is_control_flow() || seen_control_flow;
>> +
>
> So, once there's control flow in the program, this piece of optimization
> doesn't happen ever after it?  Seems like in the walk backwards you
> could just stop the walk when you find a control flow instruction.

That's a good idea. I'll try to implement that today.

>>        if (inst->dst.file != GRF || inst->has_side_effects())
>>           continue;
>>
>> @@ -393,6 +396,49 @@ vec4_visitor::dead_code_eliminate()
>>        }
>>
>>        progress = try_eliminate_instruction(inst, write_mask) || progress;
>> +
>> +      if (seen_control_flow || inst->predicate || inst->prev == NULL)
>> +         continue;
>> +
>> +      int dead_channels = inst->dst.writemask;
>> +
>> +      for (int i = 0; i < 3; i++) {
>> +         if (inst->src[i].file != GRF ||
>> +             inst->src[i].reg != inst->dst.reg)
>> +            continue;
>> +
>> +         for (int j = 0; j < 4; j++) {
>> +            int swiz = BRW_GET_SWZ(inst->src[i].swizzle, j);
>> +            dead_channels &= ~(1 << swiz);
>> +         }
>> +      }
>> +
>> +      for (exec_node *node = inst->prev, *prev = node->prev;
>> +           prev != NULL && dead_channels != 0;
>> +           node = prev, prev = prev->prev) {
>
> You could potentially terminate the loop when you're out of the live
> range of the dst, which would reduce the pain of n^2.

Also a good idea. I thought about this a little and punted because I'd
have to modify the ip value when previous instructions were removed,
but that should be a pretty simple change. I'll try to do that too.

Thanks for the reviews and the ideas.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
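For readers following the thread, the two review suggestions (stop the backward walk at a control flow instruction, and stop once the walk leaves the live range of the destination) can be sketched roughly as below. This is a standalone toy model, not the i965 code: `ToySrc`, `ToyInst`, `channels_read`, `trim_dead_writes`, and the `live_start` parameter are all hypothetical names invented for illustration, and register offsets, side effects, and register files are ignored.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical toy types; these are NOT Mesa's vec4_instruction/src_reg.
struct ToySrc {
    int reg;                     // register number being read
    std::array<int, 4> swizzle;  // source channel (0..3) read for x, y, z, w
};

struct ToyInst {
    int dst_reg;                 // destination register, -1 if none
    unsigned writemask;          // bit c set => channel c is written
    std::vector<ToySrc> srcs;
    bool control_flow;
    bool predicated;
};

// Mask of the channels of `reg` that `inst` reads through its swizzles.
unsigned channels_read(const ToyInst &inst, int reg) {
    unsigned mask = 0;
    for (const ToySrc &s : inst.srcs) {
        if (s.reg != reg)
            continue;
        for (int swz : s.swizzle)
            mask |= 1u << swz;
    }
    return mask;
}

// Walk backwards from insts[pos] and clear writemask bits of earlier writes
// to the same register that insts[pos] overwrites before anything reads
// them. `live_start` is assumed to be the index of the first instruction in
// the destination's live range (supplied by the caller in this sketch).
bool trim_dead_writes(std::vector<ToyInst> &insts, int pos, int live_start) {
    assert(pos >= 0 && pos < static_cast<int>(insts.size()));
    const ToyInst inst = insts[pos];
    if (inst.dst_reg < 0 || inst.predicated)
        return false;

    // Channels written here are only "dead before this point" if this
    // instruction does not itself read them.
    unsigned dead = inst.writemask & ~channels_read(inst, inst.dst_reg);
    bool progress = false;

    for (int i = pos - 1; i >= live_start && i >= 0 && dead != 0; i--) {
        ToyInst &scan = insts[i];
        if (scan.control_flow)
            break;  // stop the walk instead of disabling the whole pass

        if (scan.dst_reg == inst.dst_reg) {
            unsigned trimmed = scan.writemask & ~dead;
            if (trimmed != scan.writemask) {
                scan.writemask = trimmed;  // 0 means the write is fully dead
                progress = true;
            }
        }
        // Channels read by this earlier instruction keep older writes alive.
        dead &= ~channels_read(scan, inst.dst_reg);
    }
    return progress;
}
```

The sketch leaves a fully dead instruction in place with a writemask of 0 rather than unlinking it, which sidesteps the ip renumbering concern mentioned above; a real pass would presumably remove it and adjust the indices.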