Iago Toral <ito...@igalia.com> writes: > On Thu, 2015-07-30 at 17:14 +0300, Francisco Jerez wrote: >> Francisco Jerez <curroje...@riseup.net> writes: >> >> > Iago Toral Quiroga <ito...@igalia.com> writes: >> > >> >> When we have code such as this: >> >> >> >> mov vgrf1.0.x:F, vgrf2.xxxx:F >> >> mov vgrf3.0.x:F, vgrf1.xxxx:F >> >> ... >> >> mov vgrf3.0.x:F, vgrf1.xxxx:F >> >> >> >> And vgrf1 is chosen for spilling, we can emit this: >> >> >> >> mov vgrf1.0.x:F, vgrf2.xxxx:F >> >> gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D >> >> mov vgrf3.0.x:F, vgrf1.xxxx:F >> >> ... >> >> gen4_scratch_read vgrf4.0.x:F, 22D >> >> mov vgrf3.0.x:F, vgrf4.xxxx:F >> >> >> >> Instead of this: >> >> >> >> mov vgrf1.0.x:F, vgrf2.xxxx:F >> >> gen4_scratch_write hw_reg0:F, vgrf1.xxxx:D, 22D >> >> gen4_scratch_read vgrf4.0.x:F, 22D >> >> mov vgrf3.0.x:F, vgrf4.xxxx:F >> >> ... >> >> gen4_scratch_read vgrf5.0.x:F, 22D >> >> mov vgrf3.0.x:F, vgrf5.xxxx:F >> >> >> >> And save one scratch read while still preserving the benefits of >> >> spilling the register. >> >> >> >> In general, we avoid emitting scratch reads for as long as the next >> >> instruction >> >> keeps reading the spilled register. This should not harm the benefit of >> >> spilling the register because gains for register allocation only come >> >> when we >> >> have chunks of program code where the register is alive but not really >> >> used >> >> (because these are the points where we could effectively use that >> >> register for >> >> another purpose if we spilled it), so as long as consecutive instructions >> >> use >> >> that register we can avoid the scratch reads without losing anything. >> >> --- >> >> .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 37 >> >> +++++++++++++++++++++- >> >> 1 file changed, 36 insertions(+), 1 deletion(-) >> >> >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> >> b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> >> index cff5406..fd56dae 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> >> @@ -340,11 +340,43 @@ vec4_visitor::spill_reg(int spill_reg_nr) >> >> unsigned int spill_offset = last_scratch++; >> >> >> >> /* Generate spill/unspill instructions for the objects being spilled. >> >> */ >> >> + vec4_instruction *spill_write_inst = NULL; >> >> foreach_block_and_inst(block, vec4_instruction, inst, cfg) { >> >> + /* We don't spill registers used for scratch */ >> >> + if (inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_READ || >> >> + inst->opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE) >> >> + continue; >> >> + >> >> int scratch_reg = -1; >> >> + bool spill_reg_was_read = false; >> >> for (unsigned int i = 0; i < 3; i++) { >> >> if (inst->src[i].file == GRF && inst->src[i].reg == >> >> spill_reg_nr) { >> >> - if (scratch_reg == -1) { >> >> + if (!spill_reg_was_read) { >> >> + spill_reg_was_read = (!inst->predicate || >> >> + inst->opcode == BRW_OPCODE_SEL); >> >> + } >> >> + >> >> + /* If we are reading the spilled register right after writing >> >> + * to it we can skip the scratch read and use directly the >> >> + * register we used as source for the scratch write. For this >> >> + * to work we must check that: >> >> + * >> >> + * 1) The write is inconditional, that is, it is not >> >> predicated or >> >> + * it is a SEL. >> >> + * 2) All the channels that we read have been written in that >> >> + * last write instruction. >> >> + * >> >> + * We keep doing this for as long as the next instruction >> >> + * keeps reading the spilled register and break as soon as we >> >> + * find an instruction that doesn't. >> >> + */ >> >> + if (spill_write_inst && >> >> + (!spill_write_inst->predicate || >> >> + spill_write_inst->opcode == BRW_OPCODE_SEL) && >> >> + ((brw_mask_for_swizzle(inst->src[i].swizzle) & >> >> + ~spill_write_inst->dst.writemask) == 0)) { >> >> + scratch_reg = spill_write_inst->dst.reg; >> >> + } else if (scratch_reg == -1) { >> > >> > One suggestion: You could factor out the rather complex caching logic >> > into a separate function (e.g. 'bool can_reuse_scratch_for_source(const >> > vec4_instruction *, unsigned i, unsigned scratch_reg)'). The function >> > would simply compare scratch_reg with the sources of the current >> > instruction (up to src) and the sources and destination of the previous >> >> Huh, of course "up to i" is what I intended to write. > > Up to i? Shouldn't we just check source i only? The way I understand > this is that we would have a loop for each source in the current > instruction and call this function for each one to see if it is cached. > The purpose is to check whether the spill candidate has been used by the same instruction already, what would give roughly the same effect as your PATCH 1 using the same approach.
>> > non-scratch_read/write instruction. If there's a match it would check >> > that the regioning is compatible with the i-th source and return true in >> > that case. This would have several benefits: >> > >> > - It would keep ::spill_reg() simple and the caching heuristic factored >> > out. The only thing spill_reg() would still need to take care of is >> > keep track of the last spilling temporary (e.g. as you're doing it >> > now with the scratch_reg variable) and pass it to >> > can_reuse_scratch_for_source(), but it would no longer be necessary >> > to reset it to -1 when reuse is not possible. >> > >> > - It would allow you to use a single implementation of the caching >> > policy to implement both spill_reg() and evaluate_spill_costs(), what >> > would make sure that things don't go out of sync in the likely case >> > that we want to change the caching heuristic in the future. >> > >> In case it's not obvious from my previous explanation >> evaluate_spill_costs() could invoke the function I proposed like >> "can_reuse_scratch_for_source(inst, i, inst->src[i].reg)" to find out if >> the register would be cached by spill_reg(). > > Sounds like a good idea, I'll give it a try. Thanks for having a look at > the patches by the way. > > Iago > >> >> scratch_reg = alloc.allocate(1); >> >> src_reg temp = inst->src[i]; >> >> temp.reg = scratch_reg; >> >> @@ -357,6 +389,9 @@ vec4_visitor::spill_reg(int spill_reg_nr) >> >> >> >> if (inst->dst.file == GRF && inst->dst.reg == spill_reg_nr) { >> >> emit_scratch_write(block, inst, spill_offset); >> >> + spill_write_inst = inst; >> >> + } else if (spill_write_inst && !spill_reg_was_read) { >> >> + spill_write_inst = NULL; >> >> } >> >> } >> >> >> >> -- >> >> 1.9.1
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev