Kenneth Graunke <kenn...@whitecape.org> writes:

> On 10/02/2012 07:52 PM, Eric Anholt wrote:
>> Based on split_virtual_grfs(), we choose the same set every time, so set it
>> in stone.  This will help us avoid regenerating the somewhat expensive
>> class/register set setup every compile.
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.h                |   1 +
>>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 101 +++++++++------------
>>  2 files changed, 42 insertions(+), 60 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
>> index e69de31..34747d3 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -375,6 +375,7 @@ public:
>>     unsigned output_components[BRW_MAX_DRAW_BUFFERS];
>>     fs_reg dual_src_output;
>>     int first_non_payload_grf;
>> +   /** Either BRW_MAX_GRF or GEN7_MRF_HACK_START */
>>     int max_grf;
>>     int urb_setup[FRAG_ATTRIB_MAX];
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
>> index 37c8917..d1d9949 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
>> @@ -72,13 +72,29 @@ fs_visitor::assign_regs_trivial()
>>  }
>>
>>  static void
>> -brw_alloc_reg_set_for_classes(struct brw_context *brw,
>> -                              int *class_sizes,
>> -                              int class_count,
>> -                              int reg_width,
>> -                              int base_reg_count)
>> +brw_alloc_reg_set(struct brw_context *brw, int reg_width, int base_reg_count)
>>  {
>>     struct intel_context *intel = &brw->intel;
>> +   /* The registers used to make up almost all values handled in the compiler
>> +    * are a scalar value occupying a single register (or 2 registers in the
>> +    * case of 16-wide, which is handled by dividing base_reg_count by 2 and
>> +    * multiplying allocated register numbers by 2).  Things that were
>> +    * aggregates of scalar values at the GLSL level were split to scalar
>> +    * values by split_virtual_grfs().
>> +    *
>> +    * However, texture SEND messages return a series of contiguous registers.
>> +    * We currently always ask for 4 registers, but we may convert that to use
>> +    * less some day.
>> +    *
>> +    * Additionally, on gen5 we need aligned pairs of registers for the PLN
>> +    * instruction.
>> +    *
>> +    * So we have a need for classes for 1, 2, and 4 registers currently, and
>> +    * we add in '3' to make indexing the array easier (since we'll probably
>> +    * want it for texturing later).
>> +    */
>> +   const int class_sizes[4] = {1, 2, 3, 4};
>> +   const int class_count = 4;
>>
>>     /* Compute the total number of registers across all classes. */
>>     int ra_reg_count = 0;
>> @@ -139,7 +155,6 @@ brw_alloc_reg_set_for_classes(struct brw_context *brw,
>>                                    pairs_base_reg + i);
>>        }
>>     }
>> -   class_count++;
>>  }
>>
>>     ra_set_finalize(brw->wm.regs, NULL);
>
> Would it be worthwhile to compute the q values here ourselves, rather
> than relying on the generic computation in ra?  Tom found that reduced a
> bunch of overhead in r600.
>
> I haven't looked into it, so it might be totally useless with the way
> our classes are set up...just a thought.

Someone could time it, but I suspect that now that it's a one-shot deal,
nobody will notice.
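
To make the class layout from the patch comment concrete, here is a minimal sketch of how the four fixed class sizes could translate into a total allocator register count. It assumes that a class of size n gets one allocator register for every base register at which n contiguous registers still fit. That counting rule, the helper name count_ra_regs(), and the sample base_reg_count of 128 are illustrative assumptions, not taken from the patch.

```cpp
// Hypothetical helper, not the Mesa code: counts allocator registers under
// the assumption that a class of size n may start at any base register that
// still leaves room for n contiguous registers.
static int
count_ra_regs(const int *class_sizes, int class_count, int base_reg_count)
{
   int total = 0;

   for (int i = 0; i < class_count; i++) {
      /* A size-n class cannot start in the last (n - 1) base registers. */
      total += base_reg_count - (class_sizes[i] - 1);
   }

   return total;
}

/* The fixed set from the patch: the class for size n lives at index n - 1,
 * which is why the unused size-3 class is kept in the array.
 */
static const int class_sizes[4] = {1, 2, 3, 4};
static const int class_count = 4;
```

With these assumptions, count_ra_regs(class_sizes, class_count, 128) adds up 128 + 127 + 126 + 125 registers. Since the inputs no longer vary per shader, the result only needs to be computed once.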