On Feb 16, 2015 8:35 AM, "Francisco Jerez" <curroje...@riseup.net> wrote:
>
> The round-robin allocation strategy is expected to decrease the amount
> of false dependencies created by the register allocator and give the
> post-RA scheduling pass more freedom to move instructions around.  On
> the other hand it has the disadvantage of increasing fragmentation and
> decreasing the number of equally-colored nearby nodes, what increases
> the likelihood of failure in presence of optimistically colorable
> nodes.
>
> This patch disables the round-robin strategy for optimistically
> colorable nodes.  These typically arise in situations of high register
> pressure or for registers with large live intervals, in both cases the
> task of the instruction scheduler shouldn't be constrained excessively
> by the dense packing of those nodes, and a spill (or on Intel hardware
> a fall-back to SIMD8 mode) is invariably worse than a slightly less
> optimal scheduling.
Actually, that's not true.  Matt was doing some experiments recently with
a noise shader from synmark and the difference between our 2nd and 3rd
choice schedulers is huge.  In that test he disabled the third choice
scheduler and the result was a shader that spilled 6 or 8 times but ran
something like 30% faster.  We really need to do some more experimentation
with scheduling and figure out better heuristics than "SIMD16 is always
faster" and "spilling is bad".

> Shader-db results on the i965 driver:
>
>      total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
>      instructions in affected programs:     1121 -> 1071 (-4.46%)
>      helped:                                1
>      HURT:                                  0
>      GAINED:                                49
>      LOST:                                  5
> ---
>  src/util/register_allocate.c | 22 +++++++++++++++++++++-
>  1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/src/util/register_allocate.c b/src/util/register_allocate.c
> index af7a20c..d63d8eb 100644
> --- a/src/util/register_allocate.c
> +++ b/src/util/register_allocate.c
> @@ -168,6 +168,12 @@ struct ra_graph {
>
>     unsigned int *stack;
>     unsigned int stack_count;
> +
> +   /**
> +    * Tracks the start of the set of optimistically-colored registers in the
> +    * stack.
> +    */
> +   unsigned int stack_optimistic_start;
>  };
>
>  /**
> @@ -454,6 +460,7 @@ static void
>  ra_simplify(struct ra_graph *g)
>  {
>     bool progress = true;
> +   unsigned int stack_optimistic_start = ~0;
>     int i;
>
>     while (progress) {
> @@ -483,12 +490,16 @@ ra_simplify(struct ra_graph *g)
>
>        if (!progress && best_optimistic_node != ~0U) {
>           decrement_q(g, best_optimistic_node);
> +         stack_optimistic_start =
> +            MIN2(stack_optimistic_start, g->stack_count);
>           g->stack[g->stack_count] = best_optimistic_node;
>           g->stack_count++;
>           g->nodes[best_optimistic_node].in_stack = true;
>           progress = true;
>        }
>     }
> +
> +   g->stack_optimistic_start = stack_optimistic_start;
>  }
>
>  /**
> @@ -542,7 +553,16 @@ ra_select(struct ra_graph *g)
>        g->nodes[n].reg = r;
>        g->stack_count--;
>
> -      if (g->regs->round_robin)
> +      /* Rotate the starting point except for optimistically colorable nodes.
> +       * The likelihood that we will succeed at allocating optimistically
> +       * colorable nodes is highly dependent on the way that the previous
> +       * nodes popped off the stack are laid out.  The round-robin strategy
> +       * increases the fragmentation of the register file and decreases the
> +       * number of nearby nodes assigned to the same color, what increases the
> +       * likelihood of spilling with respect to the dense packing strategy.
> +       */
> +      if (g->regs->round_robin &&
> +          g->stack_count <= g->stack_optimistic_start)
>           start_search_reg = r + 1;
>     }
>
> --
> 2.1.3
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev