On Thu, Dec 4, 2014 at 3:37 PM, Ben Widawsky <benjamin.widaw...@intel.com> wrote: > The GS has an interesting use for mul. It's essentially used as a fancy mov > (in > fact, I am not sure why a mov isn't used). The documentation in the function > has > a very good explanation from Paul on the mechanics.
What do you mean? The comment says * Therefore, we want to multiply DWORDs 0 and 4 of src0 (the x components * of the register for geometry shader invocations 0 and 1) by the * immediate value in src1, and store the result in DWORDs 3 and 4 of dst. I don't know why you'd think a MOV would do? I do see one of its uses is a MUL by 1u (the one in this patch), so that could be a move, but not in general. And a MOV and a MUL are equal in terms of execution speed, so there's no advantage to having a second opcode. > CHV has some quirks with regard to multiplication. While the documentation is > somewhat unclear, I've found that demoting the src1 operand in the GS mul > solves > all the problems. > I'd ask that any potential reviewer ignore the other instances > of mul for now (I have more patches), and simply make sure that what this > patch > does is correct. I'd cut this out of the commit message before you commit. > This fixes around 2000 piglit tests on BSW. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes) > Cc: "10.4" <mesa-sta...@lists.freedesktop.org> Are we actually expecting usable BSW support to be in 10.4.x? (The commit message uses both CHV and BSW. Can we pick one?) > Signed-off-by: Ben Widawsky <b...@bwidawsk.net> > > --- > src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 5 ++++- > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 3 ++- > 2 files changed, 6 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > index b353539..4f60797 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp > @@ -534,8 +534,11 @@ vec4_generator::generate_gs_set_write_offset(struct > brw_reg dst, > brw_push_insn_state(p); > brw_set_default_access_mode(p, BRW_ALIGN_1); > brw_set_default_mask_control(p, BRW_MASK_DISABLE); > + assert(src1.file == BRW_IMMEDIATE_VALUE && > + src1.type == BRW_REGISTER_TYPE_UD && > + src1.dw1.ud <= SHRT_MAX); > brw_MUL(p, suboffset(stride(dst, 2, 2, 1), 3), stride(src0, 8, 2, 4), > - src1); > + retype(src1, BRW_REGISTER_TYPE_UW)); I feel a little weird about this, but the alternative is setting the type in the visitor and that doesn't seem nice either. I guess this is probably the best thing. One thing, you're comparing with SHRT_MAX but then setting the type to unsigned-word. Shouldn't you compare with USHRT_MAX? With the commit message updated and trimmed and s/SHRT_MAX/USHRT_MAX/: Reviewed-by: Matt Turner <matts...@gmail.com> _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev