On Wed, Dec 25, 2013 at 2:11 AM, Pohjolainen, Topi <topi.pohjolai...@intel.com> wrote: > On Thu, Dec 19, 2013 at 01:40:23PM -0800, Matt Turner wrote: >> Previously we simply considered two registers whose live ranges >> overlapped to interfere. Cases such as >> >> set A ------ >> ... | >> mov B, A -- | >> ... | B | A >> use B -- | >> ... | >> use A ------ >> >> would be considered to interfere, even though B is an unmodified copy of >> A whose live range fit wholly inside that of A. >> >> If no writes to A or B occur between the mov B, A and the use of B then >> we can safely coalesce them. >> >> Instead of removing MOV instructions, we make them NOPs and remove them >> at once after the main pass is finished in order to avoid recomputing >> live intervals (which are needed to perform the previous step). >> >> total instructions in shared programs: 1543768 -> 1513077 (-1.99%) >> instructions in affected programs: 951563 -> 920872 (-3.23%) >> GAINED: 46 >> LOST: 22 >> --- >> src/mesa/drivers/dri/i965/brw_fs.cpp | 69 >> ++++++++++++++++++++++++++++++++---- >> 1 file changed, 62 insertions(+), 7 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >> b/src/mesa/drivers/dri/i965/brw_fs.cpp >> index e4ac0a5..ad56b87 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >> @@ -2273,7 +2273,7 @@ fs_visitor::register_coalesce() >> int last_use[MAX_SAMPLER_MESSAGE_SIZE]; >> int next_ip = 0; >> >> - foreach_list_safe(node, &this->instructions) { >> + foreach_list(node, &this->instructions) { >> fs_inst *inst = (fs_inst *)node; >> >> int ip = next_ip; >> @@ -2299,8 +2299,39 @@ fs_visitor::register_coalesce() >> int var_to = live_intervals->var_from_reg(&inst->dst); >> >> if (live_intervals->vars_interfere(var_from, var_to) && >> - !inst->dst.equals(inst->src[0])) >> - continue; >> + !inst->dst.equals(inst->src[0])) { >> + >> + if (live_intervals->end[var_to] > live_intervals->end[var_from]) >> + continue; > > Just checking if I understood things correctly and trying to fill in the caps > that I cannot figure out myself: Is the check for the start implicit from > somewhere else? This has to apply as well, right? > > if (live_intervals->start[var_to] < > live_intervals->start[var_from]) > continue;
We know that the live ranges of (A) var_from and (B) var_to interfere because of the ->vars_interfere() call above. If the end of B's live range is after the end of A's range, then we know two things: - the start of B's live range must be in A's live range (since we already know the two ranges interfere, this is the only remaining possibility) - the interference isn't of the form we're looking for (where B is entirely inside A) That's pretty tricky and it took me a while to recreate what I was thinking when I wrote the code, so let me know if this makes sense to you. :) I think I'll put an explanatory comment there for good measure. >> + >> + bool overwritten = false; >> + int scan_ip = -1; >> + >> + foreach_list(n, &this->instructions) { >> + fs_inst *scan_inst = (fs_inst *)n; >> + scan_ip++; >> + >> + if (scan_inst->is_control_flow()) { >> + overwritten = true; >> + break; >> + } > > Instructions are scanned from the very beginning of the program, and hence we > are also scanning for instructions that correspond to control flow changes > (i.e., jumps) before either of the start of live intervals of 'var_to' and > 'var_from' (in fact if I'm not mistakan above, this should read "before live > interval of 'var_from'"). > > And if any such instruction is found the consideration for any coalescing is > dropped. Isn't it enough to scan for control flow changes only after the start > of the live interval of 'var_from'? (It has been a while since I've been > thinking any of this sort of things and I'm easily missing things. But had to > ask.) I'm not sure because of DO instructions that mark the beginning of a loop -- if our live ranges are in a loop we probably need to think some more about how we can coalesce registers.. I think we can improve this later though, because I've seen a lot of shaders with loops that contain MOVs that could be removed if register coalescing were a bit smarter. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev