Francisco Jerez <curroje...@riseup.net> writes: > Iago Toral Quiroga <ito...@igalia.com> writes: > >> Previous patches made it so that we do not need to unspill the same vgrf >> with every instruction as long as these instructions come right after >> the register was spilled or unspilled. This means that actually spilling >> the register is now cheaper in these scenarios, so adjust the spilling >> cost calculation accordingly. >> --- >> .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 61 >> +++++++++++++++++++--- >> 1 file changed, 55 insertions(+), 6 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> index 9a69fac..652a68c 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp >> @@ -269,6 +269,21 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, >> bool *no_spill) >> { >> float loop_scale = 1.0; >> >> + /* If a spilled register is written (or unspilled) and then immediately >> read, >> + * we will avoid to emit scratch reads for that register for as long as >> + * consecutive instructions keep reading that register. This makes >> spilling >> + * the register cheaper, so we need to account for this here. Since any >> + * instruction can only have at most 3 src operands we can only have up >> to >> + * 3 cached registers (i.e. registers that we do not need to unspill if >> they >> + * are read in a consecutive instruction) at any given time. >> + * >> + * We keep two lists, cached[0] has the registers that are cached before >> + * the current instruction, and cached[1] has the registers that are >> + * cached after the current instruction. We copy cached[1] into cached[0] >> + * after processing each instruction. >> + */ >> + int cached[2][3] = { -1, -1, -1, -1, -1, -1 }; >> + > > Wouldn't it be easier to drop the array and just compare the sources of > each instruction with the previous one? > >> for (unsigned i = 0; i < this->alloc.count; i++) { >> spill_costs[i] = 0.0; >> no_spill[i] = alloc.sizes[i] != 1; >> @@ -279,18 +294,52 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, >> bool *no_spill) >> * loops run 10 times. >> */ >> foreach_block_and_inst(block, vec4_instruction, inst, cfg) { >> + memset(cached[1], -1, sizeof(cached[1])); >> + >> for (unsigned int i = 0; i < 3; i++) { >> - if (inst->src[i].file == GRF) { >> - spill_costs[inst->src[i].reg] += loop_scale; >> - assert(!inst->src[i].reladdr); >> - } >> + if (inst->src[i].file == GRF) { >> + /* Don't increase spilling cost if src[i] isn't going >> + * to be unspilled >> + */ >> + int slot; >> + for (slot = 0; slot < 3; slot++) { >> + if (inst->src[i].reg == cached[0][slot]) > > I guess this should take the reg_offset and writemask of the cached > register into account. I suggest you use backend_reg::in_range() to > check if there is overlap. >
It also doesn't look like this handles the case in which the same register is unspilled several times for the same instruction correctly, as you've implemented it in PATCH 1. >> + break; >> + } >> + if (slot == 3) { >> + /* Not cached, we will unspill src[i] */ >> + spill_costs[inst->src[i].reg] += loop_scale; >> + assert(!inst->src[i].reladdr); >> + } else { >> + /* It is cached. Since it is read in this instruction it >> + * stays cached for the next. >> + */ >> + cached[1][slot] = inst->src[i].reg; >> + } >> + } >> } >> >> if (inst->dst.file == GRF) { >> - spill_costs[inst->dst.reg] += loop_scale; >> - assert(!inst->dst.reladdr); >> + int slot; >> + /* Mark this reg as cached because if the next instruction reads it >> + * we won't unspill it. >> + */ >> + for (slot = 0; slot < 3; slot++) { >> + if (cached[1][slot] == -1) { >> + cached[1][slot] = inst->dst.reg; >> + break; >> + } >> + } >> + assert(slot < 3); >> + spill_costs[inst->dst.reg] += loop_scale; >> + assert(!inst->dst.reladdr); >> } >> >> + /* Cached registers after this instruction are the cached registers >> + * for the next instruction >> + */ >> + memcpy(cached[0], cached[1], sizeof(cached[0])); >> + >> switch (inst->opcode) { >> >> case BRW_OPCODE_DO: >> -- >> 1.9.1
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev