On Thu, Nov 19, 2015 at 6:40 AM, Matt Turner <matts...@gmail.com> wrote: > On Thu, Nov 19, 2015 at 2:05 AM, Iago Toral Quiroga <ito...@igalia.com> wrote: >> From: Connor Abbott <connor.w.abb...@intel.com> >> >> It appears that not only math instructions, but also MOV_BYTES or >> any instruction that uses Align1 mode cannot be in the middle >> of a dependency control sequence or the GPU will hang (at least on my >> BDW). This fixes GPU hangs in some fp64 tests. > > I'm pretty surprised by this assessment. Doubtful even. > >> Reviewed-by: Iago Toral Quiroga <ito...@igalia.com> >> --- >> src/mesa/drivers/dri/i965/brw_vec4.cpp | 17 ++++++++++++----- >> 1 file changed, 12 insertions(+), 5 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp >> b/src/mesa/drivers/dri/i965/brw_vec4.cpp >> index 3bcd5cb..bc0a33b 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp >> @@ -838,6 +838,17 @@ vec4_visitor::is_dep_ctrl_unsafe(const vec4_instruction >> *inst) >> } >> >> /* >> + * Instructions that use Align1 mode cause the GPU to hang when inserted >> + * between a NoDDClr and NoDDChk in Align16 mode. Discovered empirically. >> + */ >> + >> + if (inst->opcode == VEC4_OPCODE_PACK_BYTES || >> + inst->opcode == VEC4_OPCODE_MOV_BYTES || > > PACK_BYTES sets depctrl itself in the generator, and at the time I > added it I made a test that did > > vec4 foo = vec4(packUnorm4x8(...), > packUnorm4x8(...), > packUnorm4x8(...), > packUnorm4x8(...)) > > and confirmed that it set depctrl properly on the whole sequence. > There could of course be bugs somewhere, but the "hardware doesn't > work if you mix align1 and align16 depctrl" seems unlikely. > > Do you know of a test that this affects?
This only affects FP64 tests, since there we use an align1 mov to do double-to-float and float-to-double. However, I tried commenting out emit_nir_code() and just doing essentially: emit(MOV(...))->force_writemask_all = true; emit(VEC4_OPCODE_PACK_BYTES, ...); emit(MOV(...))->force_writemask_all = true; and on my BDW it hanged. In case it's not clear: this isn't about setting depctrl on the instruction itself, it just can't be inside of a depctrl sequence (which we were already disallowing for math instructions anyways). _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev