Kenneth Graunke <kenn...@whitecape.org> writes: > On 09/05/2012 11:16 PM, Eric Anholt wrote: >> This fixes the blue zombies bug in l4d2. >> >> NOTE: This is a candidate for the 9.0 branch. >> --- >> .../drivers/dri/i965/brw_fs_live_variables.cpp | 24 >> ++++++++++++++++++++ >> 1 file changed, 24 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp >> index 16480ad..392691d 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp >> @@ -221,6 +221,30 @@ fs_visitor::calculate_live_intervals() >> } >> >> this->live_intervals_valid = true; >> + >> + /* Note in the non-control-flow code above, that we only take def[] as >> the >> + * first store, and use[] as the last use. We use this in dead code >> + * elimination, to determine when a store never gets used. However, we >> + * also use these arrays to answer the virtual_grf_interferes() question >> + * (live interval analysis), which is used for register coalescing and >> + * register allocation. >> + * >> + * So, there's a conflict over what the array should mean: if use[] >> + * considers a def after the last use, then the dead code elimination >> pass >> + * never does anything (and it's an important pass!). But if we don't >> + * include dead code, then virtual_grf_interferes() lies and we'll do >> + * horrible things like coalesce the register that is dead-code-written >> + * into another register that was live across the dead write (causing the >> + * use of the second register to take the dead write's source value >> instead >> + * of the coalesced MOV's source value). >> + * >> + * To resolve the conflict, immediately after calculating live intervals, >> + * detect dead code, nuke it, and if we changed anything, calculate again >> + * before returning to the caller. Now we happen to produce def[] and >> + * use[] arrays that will work for virtual_grf_interferes(). >> + */ >> + if (dead_code_eliminate()) >> + calculate_live_intervals(); >> } >> >> bool > > I am worried that this will come back to haunt us later. This is the > first time calculate_live_intervals() has actually *changed* the IR, > rather than analyze it. > > Specifically, if an optimization pass were to do: > > fs_cfg cfg(this); > calculate_live_intervals(); > > and the start/end of a basic block happened to be dead code, then when > we hit > > for (fs_inst *inst = block->start; > inst != block->end->next; > inst = (fs_inst *) inst->next) { > > we'll most assuredly start traversing data and hit a NULL pointer (if we > don't crash when dereferencing block->start) since you'll never hit > block->end. > > Granted, no code actually uses both CFGs and live intervals today, but > it sounds plausible. > > Here's one idea for safeguarding against this: > > 1. Add a new "int using_cfg" variable to fs_visitor. > 2. In the fs_cfg() constructor, increment v->using_cfg. > 3. In the fs_cfg() destructor, decrement v->using_cfg. > 4. In calculate_live_intervals, assert(!using_cfg); > > Then it'd least die with a reasonable message. > > What do you think? Worth doing? Something else? Am I being too paranoid? > > Regardless of the paranoia I think you should push this (unless you > think of a better solution). After spending nearly a whole week hitting > my head against the wall, I'm ready to see this fixed.
Other times we've written code like this (e.g. no_batch_wrap), I've found it irritating to maintain. For no_batch_wrap, it to catch very likely bugs, while I feel that in this case the bugs are not likely.
pgp3XPpFXohYz.pgp
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev