Caio Marcelo de Oliveira Filho <caio.olive...@intel.com> writes: > Keep information in acp_entry whether the entry is full or not, and > use the ACP in more nodes when visiting the instructions: > > - add_copy: write whole variables to the ACP state (regardless the > type). > > - visit(ir_dereference_variable *): perform the propagation here if we have a > full candidate. Element-wise here doesn't apply because the mask > isn't available at this point. > > - visit_leave(ir_assignment *): process beyond scalar and vector, as > the full variables might have other types. > > Also import an improvement from opt_copy_propagation.cpp: if ir_call > is an intrinsic, we know the variables affected, so keep going. > --- > .../glsl/opt_copy_propagation_elements.cpp | 142 ++++++++++++++---- > 1 file changed, 113 insertions(+), 29 deletions(-) > > diff --git a/src/compiler/glsl/opt_copy_propagation_elements.cpp > b/src/compiler/glsl/opt_copy_propagation_elements.cpp > index d2c24697203..701cce2dd30 100644 > --- a/src/compiler/glsl/opt_copy_propagation_elements.cpp > +++ b/src/compiler/glsl/opt_copy_propagation_elements.cpp > @@ -27,18 +27,9 @@ > * Replaces usage of recently-copied components of variables with the > * previous copy of the variable. > * > - * This pass can be compared with opt_copy_propagation, which operands > - * on arbitrary whole-variable copies. However, in order to handle > - * the copy propagation of swizzled variables or writemasked writes, > - * we want to track things on a channel-wise basis. I found that > - * trying to mix the swizzled/writemasked support here with the > - * whole-variable stuff in opt_copy_propagation.cpp just made a mess, > - * so this is separate despite the ACP handling being somewhat > - * similar. > - * > * This should reduce the number of MOV instructions in the generated > - * programs unless copy propagation is also done on the LIR, and may > - * help anyway by triggering other optimizations that live in the HIR. > + * programs and help triggering other optimizations that live in GLSL > + * level. > */ > > #include "ir.h" > @@ -58,6 +49,7 @@ class acp_entry > public: > DECLARE_LINEAR_ZALLOC_CXX_OPERATORS(acp_entry) >
A comment would have helped me: /* If set, rhs_full indicates that this ACP entry represents a * whole-variable copy. The rhs_element[] array will still be filled, * to allow the swizzling from its components in case the variable * was a vector (and to simplify some of the erase() and write_vector() * logic). */ > + ir_variable *rhs_full; > ir_variable *rhs_element[4]; > unsigned rhs_channel[4]; > > - void write(ir_variable *lhs, ir_variable *rhs, unsigned write_mask, int > swizzle[4]) > + void write_elements(ir_variable *lhs, ir_variable *rhs, unsigned > write_mask, int swizzle[4]) > { > acp_entry *lhs_entry = pull_acp(lhs); > + lhs_entry->rhs_full = NULL; > + bool already_reference_rhs = false; > > for (int i = 0; i < 4; i++) { > + if (lhs_entry->rhs_element[i] == rhs) > + already_reference_rhs = true; > + > if ((write_mask & (1 << i)) == 0) > continue; > ir_variable *to_remove = lhs_entry->rhs_element[i]; > @@ -160,8 +160,38 @@ public: > } > } > > + if (!already_reference_rhs) { > + acp_entry *rhs_entry = pull_acp(rhs); > + _mesa_set_add(rhs_entry->dsts, lhs); > + } > + } Though I think the logic would be simpler if we skip the checking that we haven't added to the ACP and just _mesa_set_add() no matter what. Optional, though. Either way: Reviewed-by: Eric Anholt <e...@anholt.net>
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev