On 29 October 2013 00:28, Eric Anholt <e...@anholt.net> wrote: > Paul Berry <stereotype...@gmail.com> writes: > > > On 21 October 2013 11:20, Eric Anholt <e...@anholt.net> wrote: > > > >> Orbital Explorer was generating a 4000 instruction geometry shader, > which > >> was taking 275 trips through dead code elimination and register > >> coalescing, each of which updated live variables to get its work done, > and > >> invalidated those live variables afterwards. > >> > >> By using bitfields instead of bools (reducing the working set size by a > >> factor of 8) in live variables analysis, it drops from 88% of the > profile > >> to 57%, and reduces overall runtime from I-got-bored-and-killed-it (Paul > >> says 3+ minutes) to 10.5 seconds. > >> > >> Compare to f179f419d1d0a03fad36c2b0a58e8b853bae6118 on the FS side. > >> --- > >> .../drivers/dri/i965/brw_vec4_live_variables.cpp | 41 > >> ++++++++++++---------- > >> .../drivers/dri/i965/brw_vec4_live_variables.h | 10 +++--- > >> 2 files changed, 28 insertions(+), 23 deletions(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > >> b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > >> index db3787b..f6675c8 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > >> @@ -83,8 +83,8 @@ vec4_live_variables::setup_def_use() > >> > >> for (int j = 0; j < 4; j++) { > >> int c = BRW_GET_SWZ(inst->src[i].swizzle, j); > >> - if (!bd[b].def[reg * 4 + c]) > >> - bd[b].use[reg * 4 + c] = true; > >> + if (!BITSET_TEST(bd[b].def, reg * 4 + c)) > >> + BITSET_SET(bd[b].use, reg * 4 + c); > >> } > >> } > >> } > >> @@ -99,8 +99,8 @@ vec4_live_variables::setup_def_use() > >> for (int c = 0; c < 4; c++) { > >> if (inst->dst.writemask & (1 << c)) { > >> int reg = inst->dst.reg; > >> - if (!bd[b].use[reg * 4 + c]) > >> - bd[b].def[reg * 4 + c] = true; > >> + if (!BITSET_TEST(bd[b].use, reg * 4 + c)) > >> + BITSET_SET(bd[b].def, reg * 4 + c); > >> } > >> } > >> } > >> @@ -126,12 +126,12 @@ vec4_live_variables::compute_live_variables() > >> > >> for (int b = 0; b < cfg->num_blocks; b++) { > >> /* Update livein */ > >> - for (int i = 0; i < num_vars; i++) { > >> - if (bd[b].use[i] || (bd[b].liveout[i] && !bd[b].def[i])) { > >> - if (!bd[b].livein[i]) { > >> - bd[b].livein[i] = true; > >> - cont = true; > >> - } > >> + for (int i = 0; i < bitset_words; i++) { > >> + BITSET_WORD new_livein = (bd[b].use[i] | > >> + (bd[b].liveout[i] & > ~bd[b].def[i])); > >> + if (new_livein & ~bd[b].livein[i]) { > >> + bd[b].livein[i] |= new_livein; > >> + cont = true; > >> > > > > Personally, I think this would be slightly clearer as: > > > > BITSET_WORD new_livein = bd[b].livein[i] | bd[b].use[i] | > (bd[b].liveout[i] > > & ~bd[b].def[i]); > > if (new_livein != bd[b].livein[i]) { > > bd[b].livein[i] = new_livein; > > cont = true; > > } > > > > > >> } > >> } > >> > >> @@ -140,9 +140,11 @@ vec4_live_variables::compute_live_variables() > >> bblock_link *link = (bblock_link *)block_node; > >> bblock_t *block = link->block; > >> > >> - for (int i = 0; i < num_vars; i++) { > >> - if (bd[block->block_num].livein[i] && !bd[b].liveout[i]) > { > >> - bd[b].liveout[i] = true; > >> + for (int i = 0; i < bitset_words; i++) { > >> + BITSET_WORD new_liveout = > (bd[block->block_num].livein[i] & > >> + ~bd[b].liveout[i]); > >> + if (new_liveout) { > >> + bd[b].liveout[i] |= new_liveout; > >> cont = true; > >> } > >> > > > > Similarly, here I think it would be clearer to do: > > > > BITSET_WORD new_liveout = bd[block->block_num].livein[i]; > > if (new_liveout != bd[b].liveout[i]) { > > bd[b].liveout[i] |= new_liveout; > > cont = true; > > } > > I don't think this second block is correct -- you're missing a > "bd[b].liveout | " in setting up new_liveout. I'm leaving it as is to > keep it matching the FS side for now, at least. >
Yeah, you're right. I changed my approach halfway through writing it and failed to update it properly. FTR, what I was going for was: BITSET_WORD new_liveout = bd[b].liveout[i] | bd[block->block_num].livein[i]; if (new_liveout != bd[b].liveout[i]) { bd[b].liveout[i] = new_liveout; cont = true; } I'm fine leaving it as is, though. > > > Either way, the patch is: > > > > Reviewed-by: Paul Berry <stereotype...@gmail.com> > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev