On Thu, 2015-07-30 at 17:08 +0300, Francisco Jerez wrote: > Iago Toral Quiroga <ito...@igalia.com> writes: > > > When we have code such as this: > > > > mov vgrf1.0.x:F, vgrf2.xxxx:F > > mov vgrf3.0.x:F, vgrf1.xxxx:F > > ... > > mov vgrf3.0.x:F, vgrf1.xxxx:F > > > > And vgrf1 is chosen for spilling, we can emit this: > > > > mov vgrf1.0.x:F, vgrf2.xxxx:F > > gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D > > mov vgrf3.0.x:F, vgrf1.xxxx:F > > ... > > gen4_scratch_read vgrf4.0.x:F, 22D > > mov vgrf3.0.x:F, vgrf4.xxxx:F > > > > Instead of this: > > > > mov vgrf1.0.x:F, vgrf2.xxxx:F > > gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D > > gen4_scratch_read vgrf4.0.x:F, 22D > > mov vgrf3.0.x:F, vgrf4.xxxx:F > > ... > > gen4_scratch_read vgrf5.0.x:F, 22D > > mov vgrf3.0.x:F, vgrf5.xxxx:F > > > > And save one scratch read while still preserving the benefits of > > spilling the register. > > > > In general, we avoid emitting scratch reads for as long as the next > > instruction > > keeps reading the spilled register. This should not harm the benefit of > > spilling the register because gains for register allocation only come when > > we > > have chunks of program code where the register is alive but not really used > > (because these are the points where we could effectively use that register > > for > > another purpose if we spilled it), so as long as consecutive instructions > > use > > that register we can avoid the scratch reads without losing anything. > > --- > > .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 37 > > +++++++++++++++++++++- > > 1 file changed, 36 insertions(+), 1 deletion(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > > index cff5406..fd56dae 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp > > @@ -340,11 +340,43 @@ vec4_visitor::spill_reg(int spill_reg_nr) > > unsigned int spill_offset = last_scratch++; > > > > /* Generate spill/unspill instructions for the objects being spilled. */ > > + vec4_instruction *spill_write_inst = NULL; > > foreach_block_and_inst(block, vec4_instruction, inst, cfg) { > > + /* We don't spill registers used for scratch */ > > + if (inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_READ || > > + inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE) > > + continue; > > + > > int scratch_reg = -1; > > + bool spill_reg_was_read = false; > > for (unsigned int i = 0; i < 3; i++) { > > if (inst->src[i].file == GRF && inst->src[i].reg == spill_reg_nr) > > { > > - if (scratch_reg == -1) { > > + if (!spill_reg_was_read) { > > + spill_reg_was_read = (!inst->predicate || > > + inst->opcode == BRW_OPCODE_SEL); > > + } > > + > > + /* If we are reading the spilled register right after writing > > + * to it we can skip the scratch read and use directly the > > + * register we used as source for the scratch write. For this > > + * to work we must check that: > > + * > > + * 1) The write is inconditional, that is, it is not > > predicated or > > + * it is a SEL. > > + * 2) All the channels that we read have been written in that > > + * last write instruction. > > + * > > + * We keep doing this for as long as the next instruction > > + * keeps reading the spilled register and break as soon as we > > + * find an instruction that doesn't. > > + */ > > + if (spill_write_inst && > > + (!spill_write_inst->predicate || > > + spill_write_inst->opcode == BRW_OPCODE_SEL) && > > + ((brw_mask_for_swizzle(inst->src[i].swizzle) & > > + ~spill_write_inst->dst.writemask) == 0)) { > > + scratch_reg = spill_write_inst->dst.reg; > > + } else if (scratch_reg == -1) { > > One suggestion: You could factor out the rather complex caching logic > into a separate function (e.g. 'bool can_reuse_scratch_for_source(const > vec4_instruction *, unsigned i, unsigned scratch_reg)'). The function > would simply compare scratch_reg with the sources of the current > instruction (up to src) and the sources and destination of the previous > non-scratch_read/write instruction. If there's a match it would check > that the regioning is compatible with the i-th source and return true in > that case. This would have several benefits:
I think this might need to be a bit more complex. The previous inst's src[i] might read only a subset of the channels that where loaded into scratch_reg so comparing only against that can lead us to think that we can't reuse scratch_reg when in fact we can. I think the process should be more like we loop back looking at the prev instruction for as long as the previous instruction reads the scratch_ref until we find the one that writes to scratch_reg (which must be a scratch read). At that point we check that the writemask in that instruction is compatible with our swizzle on src[i], in which case we can reuse scratch_reg. Does this make sense to you? In this case, if we have a successful match, we probably want to keep a reference to that instruction so that for the next src of the current instruction (or for the next instruction in our program) we don't have to repeat that process. Iago > - It would keep ::spill_reg() simple and the caching heuristic factored > out. The only thing spill_reg() would still need to take care of is > keep track of the last spilling temporary (e.g. as you're doing it > now with the scratch_reg variable) and pass it to > can_reuse_scratch_for_source(), but it would no longer be necessary > to reset it to -1 when reuse is not possible. > > - It would allow you to use a single implementation of the caching > policy to implement both spill_reg() and evaluate_spill_costs(), what > would make sure that things don't go out of sync in the likely case > that we want to change the caching heuristic in the future. > > > scratch_reg = alloc.allocate(1); > > src_reg temp = inst->src[i]; > > temp.reg = scratch_reg; > > @@ -357,6 +389,9 @@ vec4_visitor::spill_reg(int spill_reg_nr) > > > > if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) { > > emit_scratch_write(block, inst, spill_offset); > > + spill_write_inst = inst; > > + } else if (spill_write_inst && !spill_reg_was_read) { > > + spill_write_inst = NULL; > > } > > } > > > > -- > > 1.9.1 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev