Sounds like you're doing something similar to nouveau's GlobalCSE pass. On the off chance it's useful, you can have a look at how we do it here:
https://cgit.freedesktop.org/mesa/mesa/tree/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp#n3118 On Fri, Jul 29, 2016 at 4:29 AM, Kenneth Graunke <kenn...@whitecape.org> wrote: > I found a shader in Tales of Maj'Eyal that contains: > > if ssa_21 { > block block_1: > /* preds: block_0 */ > ...instructions that prevent the select peephole... > vec1 32 ssa_23 = imov ssa_4 > vec1 32 ssa_24 = imov ssa_4.y > vec1 32 ssa_25 = imov ssa_4.z > /* succs: block_3 */ > } else { > block block_2: > /* preds: block_0 */ > vec1 32 ssa_26 = imov ssa_4 > vec1 32 ssa_27 = imov ssa_4.y > vec1 32 ssa_28 = imov ssa_4.z > /* succs: block_3 */ > } > block block_3: > /* preds: block_1 block_2 */ > vec1 32 ssa_29 = phi block_1: ssa_23, block_2: ssa_26 > vec1 32 ssa_30 = phi block_1: ssa_24, block_2: ssa_27 > vec1 32 ssa_31 = phi block_1: ssa_25, block_2: ssa_28 > > Here, copy propagation will bail because phis cannot perform swizzles, > and CSE won't do anything because there is no dominance relationship > between the imovs. By making nir_opt_remove_phis handle identical moves, > we can eliminate the phis and rewrite everything to use ssa_4 directly, > so all the moves become dead and get eliminated. > > I don't think we need to check "exact" - just the alu sources. > Presumably phi sources should match in their exactness. > > On Broadwell: > > total instructions in shared programs: 11639872 -> 11638535 (-0.01%) > instructions in affected programs: 134222 -> 132885 (-1.00%) > helped: 338 > HURT: 0 > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > --- > src/compiler/nir/nir_opt_remove_phis.c | 24 +++++++++++++++++++++++- > 1 file changed, 23 insertions(+), 1 deletion(-) > > diff --git a/src/compiler/nir/nir_opt_remove_phis.c > b/src/compiler/nir/nir_opt_remove_phis.c > index f1e7bb0..ebcc856 100644 > --- a/src/compiler/nir/nir_opt_remove_phis.c > +++ b/src/compiler/nir/nir_opt_remove_phis.c > @@ -27,6 +27,26 @@ > > #include "nir.h" > > +static nir_alu_instr * > +get_parent_mov(nir_ssa_def *ssa) > +{ > + if (ssa->parent_instr->type != nir_instr_type_alu) > + return false; > + > + nir_alu_instr *alu = nir_instr_as_alu(ssa->parent_instr); > + return (alu->op == nir_op_imov || alu->op == nir_op_fmov) ? alu : NULL; > +} > + > +static bool > +matching_mov(nir_alu_instr *mov1, nir_ssa_def *ssa) > +{ > + if (!mov1) > + return false; > + > + nir_alu_instr *mov2 = get_parent_mov(ssa); > + return mov2 && nir_alu_srcs_equal(mov1, mov2, 0, 0); > +} > + > /* > * This is a pass for removing phi nodes that look like: > * a = phi(b, b, b, ...) > @@ -54,6 +74,7 @@ remove_phis_block(nir_block *block) > nir_phi_instr *phi = nir_instr_as_phi(instr); > > nir_ssa_def *def = NULL; > + nir_alu_instr *mov = NULL; > bool srcs_same = true; > > nir_foreach_phi_src(src, phi) { > @@ -75,8 +96,9 @@ remove_phis_block(nir_block *block) > > if (def == NULL) { > def = src->src.ssa; > + mov = get_parent_mov(def); > } else { > - if (src->src.ssa != def) { > + if (src->src.ssa != def && !matching_mov(mov, src->src.ssa)) { > srcs_same = false; > break; > } > -- > 2.9.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev