Upon further consideration, and after actually seeing the patch, I think using num_components would be better. Calling it writemask isn't really accurate, since it isn't the writemask of the mul. I thought about calling it read_mask, but that isn't accurate either, because it isn't the set of mul components that get read. It's only a write/read mask once it's combined with the swizzle. I guess we could call it swizzle_mask, but that just seems strange. Using the number of components nicely side-steps the whole problem. Since this pass fundamentally requires SSA, I don't think switching to a count is a problem.

--Jason
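To make the suggestion concrete, here is a rough sketch of what the swizzle loop might look like when it is driven by a component count instead of a mask. It is standalone C, not NIR code: compose_swizzle() and ssa_write_mask() are hypothetical names invented for illustration, and working from a temporary copy is a choice of this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: compose the swizzle
 * accumulated so far with the swizzle of the source that reads it,
 * touching only the first num_components channels. */
static void
compose_swizzle(uint8_t swizzle[4], const uint8_t src_swizzle[4],
                unsigned num_components)
{
   uint8_t tmp[4];

   assert(num_components >= 1 && num_components <= 4);

   /* Read from a copy so an earlier write cannot feed a later read. */
   for (unsigned i = 0; i < 4; i++)
      tmp[i] = swizzle[i];

   for (unsigned i = 0; i < num_components; i++)
      swizzle[i] = tmp[src_swizzle[i]];
}

/* Assumption: an SSA destination writes all of its components, so the
 * mask that would otherwise be passed down is always the contiguous low
 * bits. That is why a plain count carries the same information. */
static unsigned
ssa_write_mask(unsigned num_components)
{
   return (1u << num_components) - 1;
}
```

For example, starting from swizzle = {2, 0, 1, 3} and composing with a two-component source swizzle of {1, 0, 0, 0} leaves channels 0 and 1 as 0 and 2 and does not touch channels 2 and 3.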
On Wed, May 27, 2015 at 1:10 AM, Iago Toral Quiroga <ito...@igalia.com> wrote:
> When we compute the output swizzle we want to consider the writemask of the
> add operation, not the one from the multiplication.
> ---
>  src/glsl/nir/nir_opt_peephole_ffma.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c b/src/glsl/nir/nir_opt_peephole_ffma.c
> index b430eac..c895c22 100644
> --- a/src/glsl/nir/nir_opt_peephole_ffma.c
> +++ b/src/glsl/nir/nir_opt_peephole_ffma.c
> @@ -73,7 +73,8 @@ are_all_uses_fadd(nir_ssa_def *def)
>  }
>
>  static nir_alu_instr *
> -get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
> +get_mul_for_src(nir_alu_src *src, int writemask,
> +                uint8_t swizzle[4], bool *negate, bool *abs)
>  {
>     assert(src->src.is_ssa && !src->abs && !src->negate);
>
> @@ -85,16 +86,16 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
>     switch (alu->op) {
>     case nir_op_imov:
>     case nir_op_fmov:
> -      alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs);
> +      alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs);
>        break;
>
>     case nir_op_fneg:
> -      alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs);
> +      alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs);
>        *negate = !*negate;
>        break;
>
>     case nir_op_fabs:
> -      alu = get_mul_for_src(&alu->src[0], swizzle, negate, abs);
> +      alu = get_mul_for_src(&alu->src[0], writemask, swizzle, negate, abs);
>        *negate = false;
>        *abs = true;
>        break;
> @@ -116,7 +117,7 @@ get_mul_for_src(nir_alu_src *src, uint8_t swizzle[4], bool *negate, bool *abs)
>        return NULL;
>
>     for (unsigned i = 0; i < 4; i++) {
> -      if (!(alu->dest.write_mask & (1 << i)))
> +      if (!(writemask & (1 << i)))
>           break;
>
>        swizzle[i] = swizzle[src->swizzle[i]];
> @@ -160,7 +161,8 @@ nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>        negate = false;
>        abs = false;
>
> -      mul = get_mul_for_src(&add->src[add_mul_src], swizzle, &negate, &abs);
> +      mul = get_mul_for_src(&add->src[add_mul_src], add->dest.write_mask,
> +                            swizzle, &negate, &abs);
>
>        if (mul != NULL)
>           break;
> --
> 1.9.1
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev