On Fri, Jan 1, 2016 at 4:46 AM, Kenneth Graunke <kenn...@whitecape.org> wrote: > While most align16 instructions only support a SubRegNum of 0 or 4 > (using swizzling to control the other channels), 3-src instructions > actually support arbitrary SubRegNums. When the RepCtrl bit is set, > we believe it ignores the swizzle and uses the equivalent of a <0,1,0> > region from the subnr. > > In the past, we adopted a vec4-centric approach of specifying subnr of > 0 or 4 and a swizzle, then having brw_eu_emit.c convert that to a proper > SubRegNum. This isn't a great fit for the scalar backend, where we > don't set swizzles at all, and happily set subnrs in the range [0, 7]. > > This patch changes brw_eu_emit.c to use subnr and swizzle directly, > relying on the higher levels to set them sensibly. > > This should fix problems where scalar sources get copy propagated into > 3-src instructions in the FS backend. I've only observed this with > TES push model inputs, but I suppose it could happen in other cases. > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org> > Cc: Matt Turner <matts...@gmail.com> > --- > src/mesa/drivers/dri/i965/brw_eu_emit.c | 11 +++++------ > src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 +++++++++++++ > 2 files changed, 18 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > index 5fb9662..35d8039 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > @@ -847,12 +847,11 @@ brw_alu2(struct brw_codegen *p, unsigned opcode, > static int > get_3src_subreg_nr(struct brw_reg reg) > { > - if (reg.vstride == BRW_VERTICAL_STRIDE_0) { > - assert(brw_is_single_value_swizzle(reg.swizzle)); > - return reg.subnr / 4 + BRW_GET_SWZ(reg.swizzle, 0); > - } else { > - return reg.subnr / 4; > - } > + /* Normally, SubRegNum is in bytes (0..31). However, 3-src instructions > + * use 32-bit units (components 0..7). Since they only support F/D/UD > + * types, this doesn't lose any flexibility, but uses fewer bits. > + */ > + return reg.subnr / 4; > } > > static brw_inst * > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index dd22398..c6a52c5 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1784,9 +1784,22 @@ vec4_visitor::convert_to_hw_regs() > case ATTR: > unreachable("not reached"); > } > + > src = reg; > } > > + if (inst->is_3src()) { > + /* 3-src instructions with scalar sources support arbitrary subnr, > + * but don't actually use swizzles. Convert swizzle into subnr. > + */ > + for (int i = 0; i < 3; i++) { > + if (inst->src[i].vstride == BRW_VERTICAL_STRIDE_0) { > + assert(brw_is_single_value_swizzle(inst->src[i].swizzle)); > + inst->src[i].subnr += 4 * BRW_GET_SWZ(inst->src[i].swizzle, > 0); > + } > + } > + } > +
I believe this is correct. The only way scalar sources are propagated into 3-src instructions is if file == UNIFORM && brw_is_single_value_swizzle(). The series is Reviewed-by: Matt Turner <matts...@gmail.com> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev