The hardware only has two bits to specify the horizontal stride, so the maximum horizontal stride we can use is 4. The pass calculates strides from the ratio of the sizes of the types involved, and for conversions between 64-bit and 8-bit types this can lead to a stride of 8 (8 bytes / 1 byte).
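
As a rough illustration of the arithmetic above, here is a minimal standalone sketch. It is not Mesa code: required_stride, checked_stride and MAX_HSTRIDE are hypothetical names, and the 32-bit intermediate type used for the two-step case is an assumption made for the example.

```cpp
#include <cassert>

// Largest horizontal stride accepted by the assertion in this patch.
constexpr unsigned MAX_HSTRIDE = 4;

// Hypothetical helper: stride, in units of the destination type, needed to
// write one narrow result per slot of the wider execution type.
constexpr unsigned required_stride(unsigned exec_bytes, unsigned dst_bytes)
{
    return exec_bytes / dst_bytes;
}

// Mirrors the idea of the patch: fail early if the stride cannot be encoded.
inline unsigned checked_stride(unsigned exec_bytes, unsigned dst_bytes)
{
    const unsigned stride = required_stride(exec_bytes, dst_bytes);
    assert(stride <= MAX_HSTRIDE);
    return stride;
}

// A direct 64-bit -> 8-bit conversion would need a stride of 8: too large.
static_assert(required_stride(8, 1) == 8, "direct conversion needs stride 8");

// Split in two (assumed 32-bit intermediate), each step stays within the limit.
static_assert(required_stride(8, 4) == 2, "64-bit -> 32-bit needs stride 2");
static_assert(required_stride(4, 1) == 4, "32-bit -> 8-bit needs stride 4");
```

With these definitions, checked_stride(8, 4) and checked_stride(4, 1) succeed, while checked_stride(8, 1) would trip the assertion in a debug build.
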

The compiler should make sure that such conversions are handled in two steps to avoid that situation. If we fail to do this properly, the generated assembly will be invalid and validation will fail, but asserting here makes debugging easier.
---
 src/intel/compiler/brw_fs_lower_conversions.cpp | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/intel/compiler/brw_fs_lower_conversions.cpp b/src/intel/compiler/brw_fs_lower_conversions.cpp
index 145fb55f995..00781e824e8 100644
--- a/src/intel/compiler/brw_fs_lower_conversions.cpp
+++ b/src/intel/compiler/brw_fs_lower_conversions.cpp
@@ -90,6 +90,13 @@ fs_visitor::lower_conversions()
          fs_reg temp = ibld.vgrf(get_exec_type(inst));
          fs_reg strided_temp = subscript(temp, dst.type, 0);
 
+         /* Make sure we don't exceed hardware limits here. If we have code
+          * that hits this assertion it means that we need to split the
+          * instruction in two, using intermediary types (see for
+          * example nir_op_i2i8).
+          */
+         assert(strided_temp.stride <= 4);
+
          assert(inst->size_written == inst->dst.component_size(inst->exec_size));
          inst->dst = strided_temp;
          inst->saturate = false;
-- 
2.17.1