On Thu, 2015-07-30 at 17:14 +0300, Francisco Jerez wrote: > Francisco Jerez <curroje...@riseup.net> writes: > > > Iago Toral Quiroga <ito...@igalia.com> writes: > > > >> When we have code such as this: > >> > >> mov vgrf1.0.x:F, vgrf2.xxxx:F > >> mov vgrf3.0.x:F, vgrf1.xxxx:F > >> ... > >> mov vgrf3.0.x:F, vgrf1.xxxx:F > >> > >> And vgrf1 is chosen for spilling, we can emit this: > >> > >> mov vgrf1.0.x:F, vgrf2.xxxx:F > >> gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D > >> mov vgrf3.0.x:F, vgrf1.xxxx:F > >> ... > >> gen4_scratch_read vgrf4.0.x:F, 22D > >> mov vgrf3.0.x:F, vgrf4.xxxx:F > >> > >> Instead of this: > >> > >> mov vgrf1.0.x:F, vgrf2.xxxx:F > >> gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D > >> gen4_scratch_read vgrf4.0.x:F, 22D > >> mov vgrf3.0.x:F, vgrf4.xxxx:F > >> ... > >> gen4_scratch_read vgrf5.0.x:F, 22D > >> mov vgrf3.0.x:F, vgrf5.xxxx:F > >> > >> And save one scratch read while still preserving the benefits of > >> spilling the register. > >> > >> In general, we avoid emitting scratch reads for as long as the next > >> instruction > >> keeps reading the spilled register. This should not harm the benefit of > >> spilling the register because gains for register allocation only come when > >> we > >> have chunks of program code where the register is alive but not really used > >> (because these are the points where we could effectively use that register > >> for > >> another purpose if we spilled it), so as long as consecutive instructions > >> use > >> that register we can avoid the scratch reads without losing anything. > >> --- > >> .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 37 > >> +++++++++++++++++++++- > >> 1 file changed, 36 insertions(+), 1 deletion(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > >> b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > >> index cff5406..fd56dae 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > >> @@ -340,11 +340,43 @@ vec4_visitor::spill_reg(int spill_reg_nr) > >> unsigned int spill_offset = last_scratch++; > >> > >> /* Generate spill/unspill instructions for the objects being spilled. > >> */ > >> + vec4_instruction *spill_write_inst = NULL; > >> foreach_block_and_inst(block, vec4_instruction, inst, cfg) { > >> + /* We don't spill registers used for scratch */ > >> + if (inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_READ || > >> + inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE) > >> + continue; > >> + > >> int scratch_reg = -1; > >> + bool spill_reg_was_read = false; > >> for (unsigned int i = 0; i < 3; i++) { > >> if (inst->src[i].file == GRF && inst->src[i].reg == > >> spill_reg_nr) { > >> - if (scratch_reg == -1) { > >> + if (!spill_reg_was_read) { > >> + spill_reg_was_read = (!inst->predicate || > >> + inst->opcode == BRW_OPCODE_SEL); > >> + } > >> + > >> + /* If we are reading the spilled register right after writing > >> + * to it we can skip the scratch read and use directly the > >> + * register we used as source for the scratch write. For this > >> + * to work we must check that: > >> + * > >> + * 1) The write is inconditional, that is, it is not > >> predicated or > >> + * it is a SEL. > >> + * 2) All the channels that we read have been written in that > >> + * last write instruction. > >> + * > >> + * We keep doing this for as long as the next instruction > >> + * keeps reading the spilled register and break as soon as we > >> + * find an instruction that doesn't. > >> + */ > >> + if (spill_write_inst && > >> + (!spill_write_inst->predicate || > >> + spill_write_inst->opcode == BRW_OPCODE_SEL) && > >> + ((brw_mask_for_swizzle(inst->src[i].swizzle) & > >> + ~spill_write_inst->dst.writemask) == 0)) { > >> + scratch_reg = spill_write_inst->dst.reg; > >> + } else if (scratch_reg == -1) { > > > > One suggestion: You could factor out the rather complex caching logic > > into a separate function (e.g. 'bool can_reuse_scratch_for_source(const > > vec4_instruction *, unsigned i, unsigned scratch_reg)'). The function > > would simply compare scratch_reg with the sources of the current > > instruction (up to src) and the sources and destination of the previous > > Huh, of course "up to i" is what I intended to write.
Up to i? Shouldn't we just check source i only? The way I understand this is that we would have a loop for each source in the current instruction and call this function for each one to see if it is cached. > > non-scratch_read/write instruction. If there's a match it would check > > that the regioning is compatible with the i-th source and return true in > > that case. This would have several benefits: > > > > - It would keep ::spill_reg() simple and the caching heuristic factored > > out. The only thing spill_reg() would still need to take care of is > > keep track of the last spilling temporary (e.g. as you're doing it > > now with the scratch_reg variable) and pass it to > > can_reuse_scratch_for_source(), but it would no longer be necessary > > to reset it to -1 when reuse is not possible. > > > > - It would allow you to use a single implementation of the caching > > policy to implement both spill_reg() and evaluate_spill_costs(), what > > would make sure that things don't go out of sync in the likely case > > that we want to change the caching heuristic in the future. > > > In case it's not obvious from my previous explanation > evaluate_spill_costs() could invoke the function I proposed like > "can_reuse_scratch_for_source(inst, i, inst->src[i].reg)" to find out if > the register would be cached by spill_reg(). Sounds like a good idea, I'll give it a try. Thanks for having a look at the patches by the way. Iago > >> scratch_reg = alloc.allocate(1); > >> src_reg temp = inst->src[i]; > >> temp.reg = scratch_reg; > >> @@ -357,6 +389,9 @@ vec4_visitor::spill_reg(int spill_reg_nr) > >> > >> if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) { > >> emit_scratch_write(block, inst, spill_offset); > >> + spill_write_inst = inst; > >> + } else if (spill_write_inst && !spill_reg_was_read) { > >> + spill_write_inst = NULL; > >> } > >> } > >> > >> -- > >> 1.9.1 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev