Oh, thanks! A very nice torture case. However, it seems to be related to
another problem. The test case has no IF or LOOP blocks, so it can't be
possibly improved with speeding up the cloning of the table (the torture
case can be modified with a few ifs and fors on top though...).
Anyway, on Gallium the time goes down from 1.91s before my WIP patch to
0.10s after my patch-series mostly due to improvements in
glsl_to_tgsi_visitor::copy_propagation(). On Intel I don't see much of
improvement however: 157s (!) -> 132s.
Seems to be another case of linear scanning of the ACP table. Mesa has
several variations of the same copy propagation algorithm, and all of
them (with exception do_copy_propagation_elements that uses a
hash-table) linearly scan the entire table when they want to kill all
the copies of a variable. Which makes the implementation O(N^2) and
blows up execution time on huge functions like the ones from shadertoy.
However most of the time it still spends in tree_grafting that also uses
all types of linear scans and tree walks instead of lookup tables
(ir_expression::get_num_operands() gets executed > 6 billion times!) --
I'll try to look into it.
Also, reading the bug report, I don't understand why did you treat
do_copy_propagation_elements() and do_copy_propagation() differently --
in one case added hash-table for both sides of COPY(to, from), in
another only for one side?
30.12.2016 09:08, Tapani Pälli пишет:
On 12/30/2016 05:53 AM, Vladislav Egorov wrote:
I've looked into it recently (I'm working on series of many various
trivial optimizations)
and it's faster to just memcpy() it. Just throwing out superfluous
hashing still keeps slow
hash-table insertion around -- with resizing, rehashing, memory
allocation/deallocation, internal
hash-function through integer division, collisions and so on. It
produces a nice speed improvement
actually. It's possible to explore approaches without any copying at
LOOP/IF entering at all,
but I am not sure it will improve performance.
When profiling copy_propagation(_elements) pass you can use Martina's
testcase from this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=94477
We still have a very long compile time for the WebGL case mentioned in
the bug so would be cool to have some optimizations there.
30.12.2016 02:49, Thomas Helland пишет:
Really, we should have some kind of function for copying the whole
table,
but this will work for now.
---
src/compiler/glsl/opt_copy_propagation.cpp | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/src/compiler/glsl/opt_copy_propagation.cpp
b/src/compiler/glsl/opt_copy_propagation.cpp
index 247c498..e9f82e0 100644
--- a/src/compiler/glsl/opt_copy_propagation.cpp
+++ b/src/compiler/glsl/opt_copy_propagation.cpp
@@ -210,7 +210,8 @@
ir_copy_propagation_visitor::handle_if_block(exec_list *instructions)
/* Populate the initial acp with a copy of the original */
struct hash_entry *entry;
hash_table_foreach(orig_acp, entry) {
- _mesa_hash_table_insert(acp, entry->key, entry->data);
+ _mesa_hash_table_insert_pre_hashed(acp, entry->hash,
+ entry->key, entry->data);
}
visit_list_elements(this, instructions);
@@ -259,7 +260,8 @@ ir_copy_propagation_visitor::handle_loop(ir_loop
*ir, bool keep_acp)
if (keep_acp) {
struct hash_entry *entry;
hash_table_foreach(orig_acp, entry) {
- _mesa_hash_table_insert(acp, entry->key, entry->data);
+ _mesa_hash_table_insert_pre_hashed(acp, entry->hash,
+ entry->key, entry->data);
}
}
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev