Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: > On Thu, 2017-02-09 at 12:18 -0800, Francisco Jerez wrote: >> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: >> >> > It is tested empirically that IVB/BYT don't support indirect >> > addressing >> > with doubles but it is not documented in the PRM. >> > >> > This patch applies the same solution than for Cherryview/Broxton >> > and >> > takes into account that we cannot double the stride, since the >> > hardware will do it internally. >> > >> > v2: >> > - Fix assert to take into account Indirect DF MOVs in IVB and HSW. >> > >> > Signed-off-by: Samuel Iglesias Gonsálvez <sigles...@igalia.com> >> > --- >> > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 27 >> > ++++++++++++++++++-------- >> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 ++++++----- >> > 2 files changed, 25 insertions(+), 13 deletions(-) >> > >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > index 487f2e90224..dd6cbab773c 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp >> > @@ -418,18 +418,29 @@ fs_generator::generate_mov_indirect(fs_inst >> > *inst, >> > brw_MOV(p, dst, reg); >> > } else { >> > /* Prior to Broadwell, there are only 8 address registers. >> > */ >> > - assert(inst->exec_size == 8 || devinfo->gen >= 8); >> > + assert(inst->exec_size == 8 || devinfo->gen >= 8 || >> > + (devinfo->gen == 7 && inst->exec_size < 8 && >> > + (type_sz(reg.type) == 8 || type_sz(dst.type) == >> > 8))); >> > >> > /* We use VxH indirect addressing, clobbering a0.0 through >> > a0.7. */ >> > struct brw_reg addr = vec8(brw_address_reg(0)); >> > >> > - /* The destination stride of an instruction (in bytes) must >> > be greater >> > - * than or equal to the size of the rest of the >> > instruction. Since the >> > - * address register is of type UW, we can't use a D-type >> > instruction. >> > - * In order to get around this, re retype to UW and use a >> > stride. >> > - */ >> > - indirect_byte_offset = >> > - retype(spread(indirect_byte_offset, 2), >> > BRW_REGISTER_TYPE_UW); >> > + if (devinfo->gen != 7 || devinfo->is_haswell || >> > type_sz(reg.type) != 8) { >> > + /* The destination stride of an instruction (in bytes) >> > must be greater >> > + * than or equal to the size of the rest of the >> > instruction. Since the >> > + * address register is of type UW, we can't use a D-type >> > instruction. >> > + * In order to get around this, re retype to UW and use a >> > stride. >> > + */ >> > + indirect_byte_offset = >> > + retype(spread(indirect_byte_offset, 2), >> > BRW_REGISTER_TYPE_UW); >> > + } else { >> > + /* In Ivybridge/Baytrail, when it operates with DF >> > operands, we >> > + * cannot double the stride, since the hardware will do >> > it >> > + * internally. Tested empirically. >> > + */ >> > + indirect_byte_offset = >> > + retype(indirect_byte_offset, BRW_REGISTER_TYPE_UW); >> > + } >> >> This doesn't make any sense, indirect_byte_offset is a 32-bit integer >> type regardless of the type of the data you're copying, the hardware >> won't do any stride doubling for you here. >> > > Yeah, I realised this was wrong when looking at the bug mentioned by > Matt. Hence my work to provide a proper solution, which we discussed > privately. > > I've verified on IVB hardware that indirect addressing works for 64- > bit >> moves, but getting it to do what you want is a bit trickier than on >> more >> recent hardware -- I'll reply with a patch that gets it working. >> > > OK > >> > >> > /* There are a number of reasons why we don't use the base >> > offset here. >> > * One reason is that the field is only 9 bits which means >> > we can only >> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> > index 8f745dff440..85f43b7b144 100644 >> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> > @@ -3822,17 +3822,18 @@ fs_visitor::nir_emit_intrinsic(const >> > fs_builder &bld, nir_intrinsic_instr *instr >> > (instr->num_components - 1) * type_sz(dest.type); >> > >> > fs_reg indirect_chv_high_32bit; >> > - bool is_chv_bxt_64bit = >> > - (devinfo->is_cherryview || devinfo->is_broxton) && >> > - type_sz(dest.type) == 8; >> > - if (is_chv_bxt_64bit) { >> > + bool is_ivb_byt_chv_bxt_64bit = >> >> Let's use a more sensible name here like "supports_64bit_indirects" >> and >> mark as const... Or just remove the variable altogether and fold >> this >> into the condition of the second if statement below. >> > > OK > >> > + (devinfo->is_cherryview || devinfo->is_broxton || >> > + devinfo->is_ivybridge || devinfo->is_baytrail) && >> > + type_sz(dest.type) == 8; >> > + if (is_ivb_byt_chv_bxt_64bit) { >> > indirect_chv_high_32bit = vgrf(glsl_type::uint_type); >> > /* Calculate indirect address to read high 32 bits */ >> > bld.ADD(indirect_chv_high_32bit, indirect, >> > brw_imm_ud(4)); >> >> The ADD and temporary seem redundant, you could take into account the >> additional 32bit offset by applying an immediate offset to the 'src' >> region used below. >> >> > } >> > >> > for (unsigned j = 0; j < instr->num_components; j++) { >> > - if (!is_chv_bxt_64bit) { >> > + if (!is_ivb_byt_chv_bxt_64bit) { >> >> The code below seems pretty broken... You lower a 64-bit to 64-bit >> indirect move into two 64-bit to UD type-converting moves, which is >> unlikely to do what you want. >> You need to lower it into two raw 32-bit >> to 32bit moves (e.g. by using subscript() in the source as well as on >> the destination). Please fix this (and CC the patch to mesa-stable) >> regardless of my patch to get 64-bit MOV_INDIRECT working, CHV/BXT >> are >> unlikely to work as expected except by luck. >> > > This is likely working because in fs_generator::generate_mov_indirect() > the source of the MOV is retyped to destination's type (at this case > an UD). So, at the end, there is no generation of a 64-bit to UD > converting move. >
Yeah, that likely prevented things from failing catastrophically [this is what I call luck :P], but generate_mov_indirect() dropping the source type on the floor likely qualifies as a bug in its own right... > But you are right, this is wrong. I'll fix it. > > Thanks for the feedback! > > Sam > >> > bld.emit(SHADER_OPCODE_MOV_INDIRECT, >> > offset(dest, bld, j), offset(src, bld, j), >> > indirect, brw_imm_ud(read_size)); >> > -- >> > 2.11.0 >> > >> > _______________________________________________ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev