On 03/27/2015 11:33 AM, Kenneth Graunke wrote: > On Thursday, March 26, 2015 08:52:33 PM Ian Romanick wrote: >> From: Ian Romanick <ian.d.roman...@intel.com> >> >> The CSE algorithm will continuously allocate new ae_entry objects. As >> each new basic block is exited, all of the previously allocated objects >> are dumped. Instead, put them in a free list and re-use them in the >> next basic block. Reduce, reuse, recycle! >> >> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com> >> Cc: Jordan Justen <jordan.l.jus...@intel.com> >> >> --- >> src/glsl/opt_cse.cpp | 63 > +++++++++++++++++++++++++++++++++++++++++++++------- >> 1 file changed, 55 insertions(+), 8 deletions(-) >> >> diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp >> index 623268e..425eebc 100644 >> --- a/src/glsl/opt_cse.cpp >> +++ b/src/glsl/opt_cse.cpp >> @@ -63,6 +63,17 @@ public: >> var = NULL; >> } >> >> + void init(ir_instruction *base_ir, ir_rvalue **val) >> + { >> + this->val = val; >> + this->base_ir = base_ir; >> + this->var = NULL; >> + >> + assert(val); >> + assert(*val); >> + assert(base_ir); >> + } >> + >> /** >> * The pointer to the expression that we might be able to reuse >> * >> @@ -116,6 +127,18 @@ private: >> ir_rvalue *try_cse(ir_rvalue *rvalue); >> void add_to_ae(ir_rvalue **rvalue); >> >> + /** >> + * Move all nodes from the ae list to the free list >> + */ >> + void empty_ae_list(); >> + >> + /** >> + * Get and initialize a new ae_entry >> + * >> + * This will either come from the free list or be freshly allocated. >> + */ >> + ae_entry *get_ae_entry(ir_rvalue **rvalue); >> + >> /** List of ae_entry: The available expressions to reuse */ >> exec_list *ae; >> >> @@ -126,6 +149,11 @@ private: >> * right. >> */ >> exec_list *validate_instructions; >> + >> + /** >> + * List of available-for-use ae_entry objects. >> + */ >> + exec_list free_ae_entries; >> }; >> >> /** >> @@ -322,6 +350,25 @@ cse_visitor::try_cse(ir_rvalue *rvalue) >> return NULL; >> } >> >> +void >> +cse_visitor::empty_ae_list() >> +{ >> + free_ae_entries.append_list(ae); > > If you're just trying to save memory...it sure looks like you can just > ralloc_free(mem_ctx); mem_ctx = ralloc_context(NULL) here. Which would > be much simpler. > > But I suppose this does reduce the number of malloc calls considerably, > so it's probably more efficient.
It was a bit of both. The original code was clearly "doing it wrong," and using an object pool seemed like the obvious solution. Doing an extra ralloc_free and ralloc_context here will increase the time spent in ralloc, and that doesn't seem desirable. Now, if I really thought CSE was a significant performance bottleneck, I'd change get_ae_entry to use pop_tail to get the most recently added (and likely cache-hot) ae_entry. I'd probably also keep the ae_entry list across calls and free them all at the end of the optimization loop. It doesn't seem worth the effort. There are a bunch of places throughout Mesa that could benefit from an object pool. What may be worth the effort is making a general mechanism that could be used in all the places. That couldn't be more than a half day of effort. >> +} >> + >> +ae_entry * >> +cse_visitor::get_ae_entry(ir_rvalue **rvalue) >> +{ >> + ae_entry *entry = (ae_entry *) free_ae_entries.pop_head(); >> + if (entry) { >> + entry->init(base_ir, rvalue); >> + } else { >> + entry = new(mem_ctx) ae_entry(base_ir, rvalue); >> + } >> + >> + return entry; >> +} >> +
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev