Oh, thanks! A very nice torture case. However, it seems to be related to a different problem. The test case has no IF or LOOP blocks, so speeding up the cloning of the table cannot possibly help it (though the torture case could be modified with a few ifs and fors on top...).

Anyway, on Gallium the time goes down from 1.91s before my WIP patch to 0.10s after my patch series, mostly due to improvements in glsl_to_tgsi_visitor::copy_propagation(). On Intel, however, I don't see much improvement: 157s (!) -> 132s.

This seems to be another case of linear scanning of the ACP table. Mesa has several variations of the same copy propagation algorithm, and all of them (with the exception of do_copy_propagation_elements, which uses a hash table) linearly scan the entire table when they want to kill all the copies of a variable. That makes the implementation O(N^2) and blows up execution time on huge functions like the ones from Shadertoy. However, most of the time is still spent in tree_grafting, which also uses all kinds of linear scans and tree walks instead of lookup tables (ir_expression::get_num_operands() gets executed > 6 billion times!) -- I'll try to look into it.
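
To make the complexity argument concrete, here is a minimal sketch (not Mesa code) contrasting a linear-scan kill with a kill that uses lookup tables for both sides of COPY(to, from). The names Var, Copy, LinearAcp and IndexedAcp are hypothetical, and plain integers stand in for ir_variable pointers.

```cpp
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an IR variable (Mesa would use ir_variable *).
using Var = int;

struct Copy { Var to; Var from; };

// Linear-scan table: every kill walks all entries, so N kills over a
// table of N copies cost O(N^2) in total.
struct LinearAcp {
    std::vector<Copy> entries;

    void add(Var to, Var from) {
        kill(to);  // a new write to `to` invalidates older copies
        entries.push_back({to, from});
    }
    void kill(Var v) {
        std::size_t out = 0;
        for (std::size_t i = 0; i < entries.size(); ++i)
            if (entries[i].to != v && entries[i].from != v)
                entries[out++] = entries[i];
        entries.resize(out);
    }
    std::size_t size() const { return entries.size(); }
};

// Indexed table: one map per side of COPY(to, from), so a kill only
// touches the entries that actually mention the variable.
struct IndexedAcp {
    std::unordered_map<Var, Var> from_of;                       // to -> from
    std::unordered_map<Var, std::unordered_set<Var>> users_of;  // from -> {to}

    void add(Var to, Var from) {
        kill(to);
        from_of[to] = from;
        users_of[from].insert(to);
    }
    void kill(Var v) {
        // Drop COPY(v, x), if any.
        auto it = from_of.find(v);
        if (it != from_of.end()) {
            auto users = users_of.find(it->second);
            if (users != users_of.end())
                users->second.erase(v);
            from_of.erase(it);
        }
        // Drop every COPY(x, v).
        auto users = users_of.find(v);
        if (users != users_of.end()) {
            for (Var to : users->second)
                from_of.erase(to);
            users_of.erase(users);
        }
    }
    std::size_t size() const { return from_of.size(); }
};
```

Both structures hold the same set of copies after the same sequence of add() and kill() calls; only the cost of kill() differs.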

Also, reading the bug report, I don't understand why you treated do_copy_propagation_elements() and do_copy_propagation() differently: in one case you added a hash table for both sides of COPY(to, from), in the other only for one side?


30.12.2016 09:08, Tapani Pälli wrote:


On 12/30/2016 05:53 AM, Vladislav Egorov wrote:
I've looked into it recently (I'm working on a series of various trivial optimizations), and it's faster to just memcpy() it. Just throwing out the superfluous hashing still keeps the slow hash-table insertion around, with its resizing, rehashing, memory allocation/deallocation, internal hash function through integer division, collisions and so on. It actually produces a nice speed improvement. It's possible to explore approaches without any copying at all on entering LOOP/IF, but I am not sure it will improve performance.
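
A rough sketch of the difference between the two cloning strategies, under stated assumptions: FlatTable and Slot below are hypothetical and are not Mesa's hash table. The table is open-addressed with a power-of-two capacity, so a clone can be a single memcpy() of the slot array, whereas per-entry insertion has to probe (and possibly grow) for every element even when the hash is already known.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical trivially copyable slot; pointers are used as keys.
struct Slot {
    const void *key;
    void *data;
    std::uint32_t hash;
    bool used;
};

struct FlatTable {
    std::vector<Slot> slots;  // capacity is always a power of two
    std::size_t count = 0;

    explicit FlatTable(std::size_t cap = 16)
        : slots(cap, Slot{nullptr, nullptr, 0, false}) {
        assert(cap >= 2 && (cap & (cap - 1)) == 0);
    }

    // Per-entry path: skips hashing but still probes and may grow.
    void insert_pre_hashed(std::uint32_t hash, const void *key, void *data) {
        if ((count + 1) * 2 > slots.size())
            grow();
        const std::size_t mask = slots.size() - 1;
        std::size_t i = hash & mask;
        while (slots[i].used && slots[i].key != key)
            i = (i + 1) & mask;
        if (!slots[i].used)
            ++count;
        slots[i] = Slot{key, data, hash, true};
    }

    void *find_pre_hashed(std::uint32_t hash, const void *key) const {
        const std::size_t mask = slots.size() - 1;
        // Load factor stays <= 1/2, so an empty slot always ends the probe.
        for (std::size_t i = hash & mask; slots[i].used; i = (i + 1) & mask)
            if (slots[i].key == key)
                return slots[i].data;
        return nullptr;
    }

    // Bulk path: one allocation and one memcpy(), no probing at all.
    void clone_from(const FlatTable &src) {
        slots.resize(src.slots.size());
        std::memcpy(slots.data(), src.slots.data(),
                    src.slots.size() * sizeof(Slot));
        count = src.count;
    }

private:
    void grow() {
        std::vector<Slot> old;
        old.swap(slots);
        slots.assign(old.size() * 2, Slot{nullptr, nullptr, 0, false});
        count = 0;
        for (const Slot &s : old)
            if (s.used)
                insert_pre_hashed(s.hash, s.key, s.data);
    }
};
```

The bulk copy preserves the source layout exactly, so lookups in the clone behave the same as in the original, and later insertions into the clone do not affect the original.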

When profiling copy_propagation(_elements) pass you can use Martina's testcase from this bug:

https://bugs.freedesktop.org/show_bug.cgi?id=94477

We still have a very long compile time for the WebGL case mentioned in the bug so would be cool to have some optimizations there.



30.12.2016 02:49, Thomas Helland wrote:
Really, we should have some kind of function for copying the whole table,
but this will work for now.
---
  src/compiler/glsl/opt_copy_propagation.cpp | 6 ++++--
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/opt_copy_propagation.cpp b/src/compiler/glsl/opt_copy_propagation.cpp
index 247c498..e9f82e0 100644
--- a/src/compiler/glsl/opt_copy_propagation.cpp
+++ b/src/compiler/glsl/opt_copy_propagation.cpp
@@ -210,7 +210,8 @@ ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
     /* Populate the initial acp with a copy of the original */
     struct hash_entry *entry;
     hash_table_foreach(orig_acp, entry) {
-      _mesa_hash_table_insert(acp, entry->key, entry->data);
+      _mesa_hash_table_insert_pre_hashed(acp, entry->hash,
+                                         entry->key, entry->data);
     }
       visit_list_elements(this, instructions);
@@ -259,7 +260,8 @@ ir_copy_propagation_visitor::handle_loop(ir_loop *ir, bool keep_acp)
     if (keep_acp) {
        struct hash_entry *entry;
        hash_table_foreach(orig_acp, entry) {
-         _mesa_hash_table_insert(acp, entry->key, entry->data);
+         _mesa_hash_table_insert_pre_hashed(acp, entry->hash,
+                                            entry->key, entry->data);
        }
     }


_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
