Iago Toral <ito...@igalia.com> writes: > Hi Curro, > > On Tue, 2015-10-20 at 14:18 +0300, Francisco Jerez wrote: >> Iago Toral <ito...@igalia.com> writes: >> >> > On Tue, 2015-10-20 at 13:22 +0300, Francisco Jerez wrote: >> >> Iago Toral Quiroga <ito...@igalia.com> writes: >> >> >> >> > This allows us to re-use the results of previous ssbo loads in >> >> > situations >> >> > that are safe (i.e. when there are no stores, atomic operations or >> >> > memory barriers in between). >> >> > >> >> > This is particularly useful for things like matrix multiplications, >> >> > where >> >> > for a mat4 buffer variable we cut the number of loads from 16 (4 reads >> >> > of >> >> > each column) down to 4 (1 read of each column). >> >> > >> >> > The pass can only cache ssbo loads that involve constant blocks and >> >> > offsets, but could be extended to compare sub-expressions for these >> >> > as well, similar to a CSE pass. >> >> > >> >> > The way the cache works is simple: ssbo loads with constant block/offset >> >> > are included in a cache as they are seen. Stores invalidate cache >> >> > entries. >> >> > Stores with non-constant offset invalidate all cached loads for the >> >> > block >> >> > and stores with non-constant block invalidate all cache entries. There >> >> > is >> >> > room to improve this by using the actual variable name we are accessing >> >> > to >> >> > limit the entries that should be invalidated. We also need to invalidate >> >> > cache entries when we exit the block in which they have been defined >> >> > (i.e. inside if/else blocks or loops). >> >> > >> >> > The cache optimization is built as a separate pass, instead of merging >> >> > it >> >> > inside the lower_ubo_reference pass for a number of reasons: >> >> > >> >> > 1) The way we process assignments in visitors is that the LHS is >> >> > processed before the RHS. This creates a problem for an optimization >> >> > such as this when we do things like a = a + 1, since we would see the >> >> > store before the read when the actual execution order is reversed. >> >> > This could be fixed by re-implementing the logic in the visit_enter >> >> > method for ir_assignment in lower_ubo_reference and then returning >> >> > visit_continue_with_parent. >> >> > >> >> > 2) Some writes/reads need to be split into multiple smaller >> >> > writes/reads, and we need to handle caching for each one. This happens >> >> > deep inside the code that handles the lowering and some >> >> > of the information we need to do this is not available. This could also >> >> > be fixed by passing more data into the corresponding functions or by >> >> > making this data available as class members, but the current >> >> > implementation >> >> > is already complex enough and this would only contribute to the >> >> > complexity. >> >> > >> >> > 3) We can have ssbo loads in the LHS too (i.e. a[a[0]] = ..). In these >> >> > cases >> >> > the current code in lower_ubo_reference would see the store before the >> >> > read. >> >> > Probably fixable, but again would add more complexity to the lowering. >> >> > >> >> > On the other hand, a separate pass that runs after the lowering sees >> >> > all the individual loads and stores in the correct order (so we don't >> >> > need >> >> > to do any tricks) and it allows us to separate the lowering logic >> >> > (which >> >> > is already complex) from the caching logic. 
It also gives us a chance to >> >> > run it after other optimization passes have run and turned constant >> >> > expressions for block/offset into constants, enabling more opportunities >> >> > for caching. >> >> >> >> Seems like a restricted form of CSE that only handles SSBO loads, and >> >> only the ones with constant arguments. Why don't we CSE these? (and >> >> other memory access operations like image loads) >> > >> > There is not a CSE pass in GLSL IR any more so we would have to do it in >> > NIR and some drivers would lose the optimization. Doing it at GLSL IR >> > level seemed like a win from this perspective. >> > >> > Then there is the fact that we cannot just CSE these. We need to make >> > sure that we only CSE them when it is safe to do so (i.e. no >> > stores/atomics to the same offsets in between, no memory barriers, etc). >> > The current CSE pass in NIR does not support this as far as I can see. I >> > suppose that we could look into changing the pass to accommodate >> > restrictions such as this if we think it is worth it. >> > >> Not really sure if the NIR CSE pass would be adequate, but at least at >> the i965 back-end level this could be handled easily in the CSE pass >> (for typed and untyped surface read opcodes in general) in roughly the >> same way that source variable interference is handled -- Just kill >> potentially overlapping entries from the AEB whenever an atomic or write >> instruction for the same surface is seen. > > I've been having a quick look at this option but I don't think this is > going to work for us. The first thing I noticed is that the CSE pass in > i965 was not currently acting on ssbo loads at all: that is, it would > add them to the AEB but immediately remove them because of this: > > /* Kill any AEB entries using registers that don't get reused any > * more -- a sure sign they'll fail operands_match(). > */ > > The problem here is that operands_match checks that all operands in the > untyped reads match, but of course it won't use the values to check for > this, it'll check that they are actually the same registers. This is not > what we want to do; we want to identify identical ssbo loads by > inspecting the block index and offset (i.e. the values inside the > registers, which we can't even check because they are GRF registers at > this point). Because of this, the chances of the CSE pass identifying > matching loads are really slim, which explains why we never saw this > producing problems even when we never disabled it for SSBO loads or > images, at least until recently (Jordan found one example where CSE > kicked in and created a problem). > That sounds fine as long as the surface index and offset are eventually assigned to the same register if they are the same value -- If either of them is a copy they should be copy or constant-propagated, otherwise copy propagation has a bug (Reminds me of some earlier discussion with Kristian. Sure it's not useful?). If the payload spans several registers (e.g. header and several coordinate components) the LOAD_PAYLOAD instruction that initializes it should be CSE'ed recursively if its sources can also be determined to be the same, otherwise CSE has a bug.
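For illustration, here is a minimal, self-contained sketch of the invalidation rule suggested above for a back-end CSE pass. The types and names below (inst, aeb_entry, the opcode values) are made up for this example and are not the actual i965 data structures; the point is only that AEB entries generated by surface reads get killed whenever a write or atomic to a possibly-aliasing surface is seen, much the same way source variable interference is handled:

#include <list>

enum opcode { SURFACE_READ, SURFACE_WRITE, SURFACE_ATOMIC, OTHER };

struct inst {
   opcode op;
   int surface;            /* surface index; -1 when it is not a known constant */
};

struct aeb_entry {
   const inst *generator;  /* earlier instruction whose result would be reused */
};

static bool
surfaces_may_alias(int a, int b)
{
   /* A non-constant surface index conservatively aliases everything. */
   return a < 0 || b < 0 || a == b;
}

static void
kill_aliasing_surface_entries(std::list<aeb_entry> &aeb, const inst &ir)
{
   if (ir.op != SURFACE_WRITE && ir.op != SURFACE_ATOMIC)
      return;

   for (auto it = aeb.begin(); it != aeb.end();) {
      if (it->generator->op == SURFACE_READ &&
          surfaces_may_alias(it->generator->surface, ir.surface))
         it = aeb.erase(it);   /* never reuse a load across the write */
      else
         ++it;
   }
}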
> Iago > >> > Iago >> > >> >> >> >> > --- >> >> > src/glsl/Makefile.sources | 1 + >> >> > src/glsl/ir_optimization.h | 1 + >> >> > src/glsl/opt_ssbo_load.cpp | 338 >> >> > +++++++++++++++++++++++++++++++++++++++++++++ >> >> > 3 files changed, 340 insertions(+) >> >> > create mode 100644 src/glsl/opt_ssbo_load.cpp >> >> > >> >> > diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources >> >> > index ca87036..73c7514 100644 >> >> > --- a/src/glsl/Makefile.sources >> >> > +++ b/src/glsl/Makefile.sources >> >> > @@ -201,6 +201,7 @@ LIBGLSL_FILES = \ >> >> > opt_noop_swizzle.cpp \ >> >> > opt_rebalance_tree.cpp \ >> >> > opt_redundant_jumps.cpp \ >> >> > + opt_ssbo_load.cpp \ >> >> > opt_structure_splitting.cpp \ >> >> > opt_swizzle_swizzle.cpp \ >> >> > opt_tree_grafting.cpp \ >> >> > diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h >> >> > index ce5c492..26677d7 100644 >> >> > --- a/src/glsl/ir_optimization.h >> >> > +++ b/src/glsl/ir_optimization.h >> >> > @@ -125,6 +125,7 @@ bool lower_clip_distance(gl_shader *shader); >> >> > void lower_output_reads(unsigned stage, exec_list *instructions); >> >> > bool lower_packing_builtins(exec_list *instructions, int op_mask); >> >> > void lower_ubo_reference(struct gl_shader *shader, exec_list >> >> > *instructions); >> >> > +bool opt_ssbo_loads(struct gl_shader *shader, exec_list *instructions); >> >> > void lower_packed_varyings(void *mem_ctx, >> >> > unsigned locations_used, ir_variable_mode >> >> > mode, >> >> > unsigned gs_input_vertices, gl_shader >> >> > *shader); >> >> > diff --git a/src/glsl/opt_ssbo_load.cpp b/src/glsl/opt_ssbo_load.cpp >> >> > new file mode 100644 >> >> > index 0000000..5404907 >> >> > --- /dev/null >> >> > +++ b/src/glsl/opt_ssbo_load.cpp >> >> > @@ -0,0 +1,338 @@ >> >> > +/* >> >> > + * Copyright © 2015 Intel Corporation >> >> > + * >> >> > + * Permission is hereby granted, free of charge, to any person >> >> > obtaining a >> >> > + * copy of this software and associated documentation files (the >> >> > "Software"), >> >> > + * to deal in the Software without restriction, including without >> >> > limitation >> >> > + * the rights to use, copy, modify, merge, publish, distribute, >> >> > sublicense, >> >> > + * and/or sell copies of the Software, and to permit persons to whom >> >> > the >> >> > + * Software is furnished to do so, subject to the following conditions: >> >> > + * >> >> > + * The above copyright notice and this permission notice (including >> >> > the next >> >> > + * paragraph) shall be included in all copies or substantial portions >> >> > of the >> >> > + * Software. >> >> > + * >> >> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> >> > EXPRESS OR >> >> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> >> > MERCHANTABILITY, >> >> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT >> >> > SHALL >> >> > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES >> >> > OR OTHER >> >> > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> >> > ARISING >> >> > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER >> >> > + * DEALINGS IN THE SOFTWARE. >> >> > + */ >> >> > + >> >> > +/** >> >> > + * \file opt_ssbo_load.cpp >> >> > + * >> >> > + * IR optimization pass to reduce the number of SSBO loads by re-using >> >> > previous >> >> > + * loads when it is safe to do so (i.e. 
no stores have invalidated the >> >> > load, >> >> > + * no memory barriers in between, etc) >> >> > + */ >> >> > + >> >> > +#include "ir.h" >> >> > +#include "ir_builder.h" >> >> > +#include "program/hash_table.h" >> >> > + >> >> > +using namespace ir_builder; >> >> > + >> >> > +namespace { >> >> > +struct ssbo_load_cache_remove_closure { >> >> > + struct hash_table *ht; >> >> > + const char *key_prefix; >> >> > + unsigned depth; >> >> > +}; >> >> > + >> >> > +struct ssbo_load_cache_entry { >> >> > + ir_dereference *load; >> >> > + unsigned depth; >> >> > +}; >> >> > + >> >> > +class ssbo_load_cache_visitor : public ir_hierarchical_visitor { >> >> > +public: >> >> > + ssbo_load_cache_visitor(gl_shader *shader) >> >> > + : current_depth(0), progress(false) >> >> > + { >> >> > + mem_ctx = ralloc_parent(shader->ir); >> >> > + ssbo_load_cache = >> >> > + hash_table_ctor(0, hash_table_string_hash, >> >> > hash_table_string_compare); >> >> > + } >> >> > + >> >> > + ~ssbo_load_cache_visitor() >> >> > + { >> >> > + hash_table_dtor(ssbo_load_cache); >> >> > + } >> >> > + >> >> > + virtual ir_visitor_status visit_enter(ir_call *); >> >> > + >> >> > + char *build_ssbo_load_cache_key(unsigned block_index, >> >> > + unsigned offset); >> >> > + ir_dereference *ssbo_load_cache_find(ir_rvalue *block_index, >> >> > + ir_rvalue *offset); >> >> > + void ssbo_load_cache_add(ir_dereference *deref, >> >> > + ir_rvalue *block_index, >> >> > + ir_rvalue *offset, >> >> > + unsigned depth); >> >> > + void ssbo_load_cache_remove(ir_rvalue *block_index, >> >> > + ir_rvalue *offset); >> >> > + void ssbo_load_cache_remove_by_key_prefix(const char *prefix); >> >> > + void ssbo_load_cache_remove_all(); >> >> > + void ssbo_load_cache_remove_depth(unsigned depth); >> >> > + >> >> > + ir_visitor_status visit_enter(ir_if *ir); >> >> > + ir_visitor_status visit_enter(ir_loop *ir); >> >> > + ir_visitor_status visit_leave(ir_loop *ir); >> >> > + >> >> > + bool get_progress() { return this->progress; } >> >> > + >> >> > +private: >> >> > + void *mem_ctx; >> >> > + unsigned current_depth; >> >> > + struct hash_table *ssbo_load_cache; >> >> > + bool progress; >> >> > +}; >> >> > + >> >> > +ir_visitor_status >> >> > +ssbo_load_cache_visitor::visit_enter(ir_call *ir) >> >> > +{ >> >> > + if (!ir->callee->is_intrinsic) >> >> > + return visit_continue_with_parent; >> >> > + >> >> > + if (!strcmp(ir->callee_name(), "__intrinsic_load_ssbo")) { >> >> > + exec_node *param = ir->actual_parameters.get_head(); >> >> > + ir_rvalue *block = ((ir_instruction *)param)->as_rvalue(); >> >> > + >> >> > + param = param->get_next(); >> >> > + ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue(); >> >> > + >> >> > + ir_dereference *cached_load = ssbo_load_cache_find(block, >> >> > offset); >> >> > + if (cached_load) { >> >> > + ir_variable *var = ir->return_deref->variable_referenced(); >> >> > + base_ir->insert_before(assign(var, >> >> > cached_load->clone(mem_ctx, NULL))); >> >> > + ir->remove(); >> >> > + this->progress = true; >> >> > + } else { >> >> > + ssbo_load_cache_add(ir->return_deref, block, offset, >> >> > + this->current_depth); >> >> > + } >> >> > + } else if (!strcmp(ir->callee_name(), "__intrinsic_store_ssbo")) { >> >> > + exec_node *param = ir->actual_parameters.get_head(); >> >> > + ir_rvalue *block = ((ir_instruction *)param)->as_rvalue(); >> >> > + >> >> > + param = param->get_next(); >> >> > + ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue(); >> >> > + >> >> > + ssbo_load_cache_remove(block, offset); >> >> > + } 
else if (strstr(ir->callee_name(), "__intrinsic_ssbo_atomic") == >> >> > + ir->callee_name()) { >> >> > + exec_node *param = ir->actual_parameters.get_head(); >> >> > + ir_rvalue *block = ((ir_instruction *)param)->as_rvalue(); >> >> > + >> >> > + param = param->get_next(); >> >> > + ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue(); >> >> > + >> >> > + ssbo_load_cache_remove(block, offset); >> >> > + } else if (!strcmp(ir->callee_name(), >> >> > "__intrinsic_memory_barrier")) { >> >> > + ssbo_load_cache_remove_all(); >> >> > + } >> >> > + >> >> > + return visit_continue_with_parent; >> >> > +} >> >> > + >> >> > +ir_visitor_status >> >> > +ssbo_load_cache_visitor::visit_enter(ir_if *ir) >> >> > +{ >> >> > + ir->condition->accept(this); >> >> > + >> >> > + this->current_depth++; >> >> > + >> >> > + if (!ir->then_instructions.is_empty()) { >> >> > + visit_list_elements(this, &ir->then_instructions); >> >> > + ssbo_load_cache_remove_depth(this->current_depth); >> >> > + } >> >> > + >> >> > + if (!ir->else_instructions.is_empty()) { >> >> > + visit_list_elements(this, &ir->else_instructions); >> >> > + ssbo_load_cache_remove_depth(this->current_depth); >> >> > + } >> >> > + >> >> > + this->current_depth--; >> >> > + >> >> > + return visit_continue_with_parent; >> >> > +} >> >> > + >> >> > +ir_visitor_status >> >> > +ssbo_load_cache_visitor::visit_enter(ir_loop *ir) >> >> > +{ >> >> > + this->current_depth++; >> >> > + return visit_continue; >> >> > +} >> >> > + >> >> > +ir_visitor_status >> >> > +ssbo_load_cache_visitor::visit_leave(ir_loop *ir) >> >> > +{ >> >> > + ssbo_load_cache_remove_depth(this->current_depth); >> >> > + this->current_depth--; >> >> > + return visit_continue; >> >> > +} >> >> > + >> >> > +char * >> >> > +ssbo_load_cache_visitor::build_ssbo_load_cache_key(unsigned >> >> > block_index, >> >> > + unsigned offset) >> >> > +{ >> >> > + return ralloc_asprintf(mem_ctx, "%u-%u", block_index, offset); >> >> > +} >> >> > + >> >> > +ir_dereference * >> >> > +ssbo_load_cache_visitor::ssbo_load_cache_find(ir_rvalue *block_index, >> >> > + ir_rvalue *offset) >> >> > +{ >> >> > + ir_constant *const_block_index = block_index->as_constant(); >> >> > + if (!const_block_index) >> >> > + return NULL; >> >> > + >> >> > + ir_constant *const_offset = offset->as_constant(); >> >> > + if (!const_offset) >> >> > + return NULL; >> >> > + >> >> > + char *cache_key = >> >> > + build_ssbo_load_cache_key(const_block_index->value.u[0], >> >> > + const_offset->value.u[0]); >> >> > + >> >> > + struct ssbo_load_cache_entry *entry = (struct ssbo_load_cache_entry >> >> > *) >> >> > + hash_table_find(ssbo_load_cache, cache_key); >> >> > + return entry ? 
entry->load : NULL; >> >> > +} >> >> > + >> >> > +void >> >> > +ssbo_load_cache_visitor::ssbo_load_cache_add(ir_dereference *deref, >> >> > + ir_rvalue *block_index, >> >> > + ir_rvalue *offset, >> >> > + unsigned depth) >> >> > +{ >> >> > + /* We only support caching SSBO loads with constant block and >> >> > offset */ >> >> > + ir_constant *const_block_index = block_index->as_constant(); >> >> > + if (!const_block_index) >> >> > + return; >> >> > + >> >> > + ir_constant *const_offset = offset->as_constant(); >> >> > + if (!const_offset) >> >> > + return; >> >> > + >> >> > + char *cache_key = >> >> > + build_ssbo_load_cache_key(const_block_index->value.u[0], >> >> > + const_offset->value.u[0]); >> >> > + >> >> > + struct ssbo_load_cache_entry *entry = (struct ssbo_load_cache_entry >> >> > *) >> >> > + ralloc_size(this->mem_ctx, sizeof(struct ssbo_load_cache_entry)); >> >> > + entry->load = deref; >> >> > + entry->depth = depth; >> >> > + >> >> > + hash_table_replace(ssbo_load_cache, entry, cache_key); >> >> > +} >> >> > + >> >> > +static void >> >> > +ssbo_load_cache_remove_callback(const void *key, void *data, void >> >> > *closure) >> >> > +{ >> >> > + struct ssbo_load_cache_remove_closure *c = >> >> > + (struct ssbo_load_cache_remove_closure *) closure; >> >> > + >> >> > + /* If we have a key_prefix, then we only want to delete entries >> >> > + * with that key prefix. Otherwise, if we have depth > 0 we want >> >> > + * to delete keys with that depth (or larger). Otheriwse (no prefix >> >> > + * and depth == 0) we want to delete all keys. >> >> > + */ >> >> > + assert((c->key_prefix && c->depth == 0) || >> >> > + (!c->key_prefix && c->depth > 0) || >> >> > + (!c->key_prefix && c->depth == 0)); >> >> > + >> >> > + if (c->key_prefix && strstr((char *)key, (char *)c->key_prefix) != >> >> > key) >> >> > + return; >> >> > + >> >> > + if (c->depth > 0) { >> >> > + struct ssbo_load_cache_entry *entry = >> >> > + (struct ssbo_load_cache_entry *) data; >> >> > + if (entry->depth < c->depth) >> >> > + return; >> >> > + } >> >> > + >> >> > + hash_table_remove(c->ht, key); >> >> > +} >> >> > + >> >> > +void >> >> > +ssbo_load_cache_visitor::ssbo_load_cache_remove_by_key_prefix(const >> >> > char *prefix) >> >> > +{ >> >> > + struct ssbo_load_cache_remove_closure c; >> >> > + c.ht = ssbo_load_cache; >> >> > + c.key_prefix = prefix; >> >> > + c.depth = 0; >> >> > + >> >> > + hash_table_call_foreach(ssbo_load_cache, >> >> > + ssbo_load_cache_remove_callback, &c); >> >> > +} >> >> > + >> >> > +void >> >> > +ssbo_load_cache_visitor::ssbo_load_cache_remove_all() >> >> > +{ >> >> > + struct ssbo_load_cache_remove_closure c; >> >> > + c.ht = ssbo_load_cache; >> >> > + c.key_prefix = NULL; >> >> > + c.depth = 0; >> >> > + >> >> > + hash_table_call_foreach(ssbo_load_cache, >> >> > + ssbo_load_cache_remove_callback, &c); >> >> > +} >> >> > + >> >> > +void >> >> > +ssbo_load_cache_visitor::ssbo_load_cache_remove_depth(unsigned depth) >> >> > +{ >> >> > + struct ssbo_load_cache_remove_closure c; >> >> > + c.ht = ssbo_load_cache; >> >> > + c.key_prefix = NULL; >> >> > + c.depth = depth; >> >> > + >> >> > + hash_table_call_foreach(ssbo_load_cache, >> >> > + ssbo_load_cache_remove_callback, &c); >> >> > +} >> >> > + >> >> > +void >> >> > +ssbo_load_cache_visitor::ssbo_load_cache_remove(ir_rvalue *block_index, >> >> > + ir_rvalue *offset) >> >> > +{ >> >> > + ir_constant *const_block_index = block_index->as_constant(); >> >> > + if (!const_block_index) { >> >> > + /* If we don't know the block index, then 
invalidate the entire >> >> > cache. >> >> > + * We could try to do better, for example, considering the actual >> >> > + * field name we are accessing in the SSBO in the keys so we only >> >> > + * invalidate those. This requires some work though. >> >> > + */ >> >> > + ssbo_load_cache_remove_all(); >> >> > + return; >> >> > + } >> >> > + >> >> > + ir_constant *const_offset = offset->as_constant(); >> >> > + if (!const_offset) { >> >> > + /* We know the block but not the offset, so invalidate all >> >> > entries >> >> > + * for the given block >> >> > + */ >> >> > + ssbo_load_cache_remove_by_key_prefix( >> >> > + ralloc_asprintf(mem_ctx, "%u-", >> >> > const_block_index->value.u[0])); >> >> > + return; >> >> > + } >> >> > + >> >> > + /* We know block and offset, so invalidate that particular load >> >> > only */ >> >> > + char *cache_key = >> >> > + build_ssbo_load_cache_key(const_block_index->value.u[0], >> >> > + const_offset->value.u[0]); >> >> > + >> >> > + hash_table_remove(ssbo_load_cache, cache_key); >> >> > +} >> >> > + >> >> > +} /* Unnamed namespace */ >> >> > + >> >> > +bool >> >> > +opt_ssbo_loads(struct gl_shader *shader, exec_list *instructions) >> >> > +{ >> >> > + ssbo_load_cache_visitor v(shader); >> >> > + visit_list_elements(&v, instructions); >> >> > + return v.get_progress(); >> >> > +} >> >> > -- >> >> > 1.9.1 >> >> > >> >> > _______________________________________________ >> >> > mesa-dev mailing list >> >> > mesa-dev@lists.freedesktop.org >> >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
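As a usage note, the patch only adds the opt_ssbo_loads() entry point; it does not show where it gets called. A plausible hook-up (the call site and the surrounding loop are assumptions; only the two prototypes come from the patch and ir_optimization.h) would be to run it after lower_ubo_reference and iterate it together with the other GLSL IR passes, so that constant folding can first turn block/offset expressions into ir_constants and make the loads cacheable:

   /* Hypothetical call site -- not part of the patch. */
   lower_ubo_reference(shader, shader->ir);

   bool progress;
   do {
      progress = opt_ssbo_loads(shader, shader->ir);
      /* ... the rest of the GLSL IR optimization loop
       * (do_common_optimization() etc.) would run here ...
       */
   } while (progress);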