Iago Toral <ito...@igalia.com> writes: > On Tue, 2015-07-28 at 18:17 +0300, Francisco Jerez wrote: >> Iago Toral Quiroga <ito...@igalia.com> writes: >> >> > Link to v1: >> > http://lists.freedesktop.org/archives/mesa-dev/2015-July/089766.html >> > >> > Changes after review (Curro) >> > - Drop the patch that asserted that the reg size should always be 1 >> > - Expand this so that we do not unspill a register if we have just >> > unspilled it as well >> > - Use brw_mask_for_swizzle >> > - Update spilling costs accordingly >> > >> > New changes: >> > >> > - Expand the optimizations that are based on caching the >> > spilled/unspilled >> > so we keep using the cached register for as long as consecutive >> > instructions >> > keep reading the register (the previous version would only do this for >> > one >> > instruction). This is because we only see benefits for register >> > allocation >> > when there are gaps in the life span of a register where it is not used >> > (because these are the only instances in which we can use that reg for >> > a >> > different purpose), so as long as consecutive instructions keep >> > reading a >> > register we have just spilled or unspilled, we don't have to unspill it >> > again. >> > >> I think this may be a good idea (assuming you've managed to measure an >> improvement in practice), but I don't think that the explanation is >> strictly speaking correct. It *may* be beneficial to, say, unspill a >> variable for instruction i and then do it again for instruction i+1, >> because the set of variables live at instruction i may not be exactly >> the same as in instruction i+1, and by caching the value between both >> instructions you cause the temporary to interfere with the union of both >> sets simultaneously, what may increase the total number of registers >> required to register-allocate the program. > > This is true, although you also need to allocate a register for the new > vgrf used to unspill, so I think the chances of this being beneficial in > practice are very low. I'll make sure to update the comment to be more > precise though. > The difference is that if you split it into two temporaries each one may interfere with a subset of the nodes they'd have interfered with if they were a single variable, while they won't interfere with each other (because the live range of the first ends after the end of instruction i which is were the live range of the second starts), what means that the register allocator is still free to allocate the same physical register to both if need be -- or not, so it can only possibly decrease register usage never increase it.
>> That said I think that this may still be a good idea because the >> register-pressure benefit from separating the live ranges of temporaries >> used in consecutive instructions is likely to be tiny typically, the >> program is likely to have other spilling candidates which may simplify >> the interference graph drastically for the same amount of fill/spill >> bandwidth invested, so I think you're right that in most cases it's >> going to be silly to re-spill/fill the same variable in consecutive >> instructions. > > Right. The way I would expect this to work in practice is that we start > by spilling registers with the best benefit / cost ratio. That should be > registers that have a long life-span and usage gaps where the main > benefit for allocation comes from being able to allocate the register > for a different purpose during these gaps, so there should lose very > little for register allocation by doing this (if anything at all). > Yeah, agreed. >> In the future it may also be worth checking whether the heuristic can be >> refined to use some sort of register pressure-sensitive distance between >> uses of the same spilled variable as metric to decide whether the >> variable is worth re-spilling or if it makes sense for it to be cached >> between a pair of potentially non-consecutive uses. >> >> Anyway I'll have a closer look at the rest of your series soon-ish. > > Thanks Curro! > >> > Other >> > Iago Toral Quiroga (6): >> > i965/vec4: Only emit one scratch read per instruction for spilled >> > registers >> > i965/vec4: Remove checks for reladdr when checking for spillable >> > registers >> > i965/vec4: Don't emit scratch reads for a spilled register we have >> > just written >> > i965/vec4: Don't emit scratch reads for a register we have just >> > unspilled >> > i965/vec4: Adjust spilling cost for consecutive instructions >> > i965: Add a debug option for spilling everything in vec4 code >> > >> > src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 2 +- >> > src/mesa/drivers/dri/i965/brw_vec4.cpp | 2 +- >> > .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 145 >> > +++++++++++++++++++-- >> > src/mesa/drivers/dri/i965/intel_debug.c | 3 +- >> > src/mesa/drivers/dri/i965/intel_debug.h | 5 +- >> > 5 files changed, 139 insertions(+), 18 deletions(-) >> > >> > -- >> > 1.9.1
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev