[Mesa-dev] Fix saturation when coalescing registers in NIR-vec4
With this fix, we can load the constants in NIR-vec4 as integers, as it is done in the FS backend. This is related to: http://lists.freedesktop.org/archives/mesa-dev/2015-July/089899.html ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] i965/vec4: Fix saturation errors when coalescing registers
If the register types do not match and the instruction that contains the final destination is saturated, register coalescing generated non-equivalent code. This did not happen when using IR because types usually matched, but it is visible in nir-vec4. For example, mov vgrf7:D vgrf2:D mov.sat m4:F vgrf7:F is coalesced to: mov.sat m4:D vgrf2:D The patch prevents coalescing in such scenario, unless the instruction we want to coalesce into is a MOV. In that case, the patch sets the register types to the type of the final destination. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 18 ++ 1 file changed, 18 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index f18915a..52982c3 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1053,6 +1053,15 @@ vec4_visitor::opt_register_coalesce() } } +/* This doesn't handle saturation on the instruction we + * want to coalesce away if the register types do not match. + * But if scan_inst is a 'mov' we can fix the types later. + */ +if (inst->saturate && +inst->dst.type != scan_inst->dst.type && +scan_inst->opcode != BRW_OPCODE_MOV) + break; + /* If we can't handle the swizzle, bail. */ if (!scan_inst->can_reswizzle(inst->dst.writemask, inst->src[0].swizzle, @@ -1128,6 +1137,15 @@ vec4_visitor::opt_register_coalesce() scan_inst->dst.file = inst->dst.file; scan_inst->dst.reg = inst->dst.reg; scan_inst->dst.reg_offset = inst->dst.reg_offset; + if (inst->saturate && + inst->dst.type != scan_inst->dst.type) { + /* If we have reached this point, scan_inst is a 'mov' and + * we can modify its register types to match the ones in inst. + * Otherwise, we could have an incorrect saturation result. + */ + scan_inst->dst.type = inst->dst.type; + scan_inst->src[0].type = inst->src[0].type; + } scan_inst->saturate |= inst->saturate; } scan_inst = (vec4_instruction *)scan_inst->next; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965/vec4_nir: Load constants as integers
Loads constants using integer as their register type, this is done for consistency with the FS backend. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 923e2d3..e4ae899 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -456,7 +456,7 @@ void vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) { dst_reg reg = dst_reg(GRF, alloc.allocate(1)); - reg.type = BRW_REGISTER_TYPE_F; + reg.type = BRW_REGISTER_TYPE_D; /* @FIXME: consider emitting vector operations to save some MOVs in * cases where the components are representable in 8 bits. @@ -464,7 +464,7 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) */ for (unsigned i = 0; i < instr->def.num_components; ++i) { reg.writemask = 1 << i; - emit(MOV(reg, src_reg(instr->value.f[i]))); + emit(MOV(reg, src_reg(instr->value.i[i]))); } /* Set final writemask */ -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/2] i965/vec4_nir: Load constants as integers
Loads constants using integer as their register type, this is done for consistency with the FS backend. --- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 632e409..23b2fab 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -456,7 +456,7 @@ void vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) { dst_reg reg = dst_reg(GRF, alloc.allocate(1)); - reg.type = BRW_REGISTER_TYPE_F; + reg.type = BRW_REGISTER_TYPE_D; unsigned remaining = brw_writemask_for_size(instr->def.num_components); @@ -477,7 +477,7 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) } reg.writemask = writemask; - emit(MOV(reg, src_reg(instr->value.f[i]))); + emit(MOV(reg, src_reg(instr->value.i[i]))); remaining &= ~writemask; } -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] i965: don't fail to shift height images for levels.
From: Dave Airlie This fixes one subtest of: GL44-CTS.shader_image_size.advanced-nonMS-fs-int I've no idea why this wouldn't be scaled up here, and I've no idea what else will break, but I might as well open for discussion. v2: Only shift height if the texture is not an 1D_ARRAY, it fixes assertion in GL44-CTS.texture_view.gettexparameter due to the original patch (Antia). Signed-off-by: Dave Airlie Signed-off-by: Antia Puentes --- I have not taken a deep look to the test so take this with a grain of salt. As I said in a previous email, this patch raises an assertion in GL44-CTS.texture_view.gettexparameter: "glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion `height0 = 1' failed." Looking at the code surrounding the assertion, we have: if (target == GL_TEXTURE_1D_ARRAY) assert(height0 == 1); which suggests that we should avoid shifting the height at least for TEXTURE_1D_ARRAYs. Sending a second version of the patch. src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c index 958f8bd..120e7e0 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_image.c +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c @@ -49,7 +49,7 @@ intel_miptree_create_for_teximage(struct brw_context *brw, /* Figure out image dimensions at start level. */ for (i = intelImage->base.Base.Level; i > 0; i--) { width <<= 1; - if (height != 1) + if (intelObj->base.Target != GL_TEXTURE_1D_ARRAY) height <<= 1; if (intelObj->base.Target == GL_TEXTURE_3D) depth <<= 1; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3] i965: Fix calculation of the image height at start level
- Fixes CTS tests: * GL44-CTS.shader_image_size.advanced-nonMS-cs-float * GL44-CTS.shader_image_size.advanced-nonMS-cs-int * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint * GL44-CTS.shader_image_size.advanced-nonMS-gs-float * GL44-CTS.shader_image_size.advanced-nonMS-gs-int * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint * GL44-CTS.shader_image_size.advanced-nonMS-tes-float * GL44-CTS.shader_image_size.advanced-nonMS-tes-int * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint * GL44-CTS.shader_image_size.advanced-nonMS-vs-float * GL44-CTS.shader_image_size.advanced-nonMS-vs-int * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint v1: (written by Dave Airlie) Always shift height images for levels. Fixed the CTS tests. v2: Only shift height if the texture is not an 1D_ARRAY, it fixes assertion in GL44-CTS.texture_view.gettexparameter due to the original patch (Antia). v3: Remove the loop. Do not shift height either for 1D textures. Use an explicit switch and add an assertion (levels == 0) for multisampled textures (Jason). Signed-off-by: Dave Airlie Signed-off-by: Antia Puentes --- src/mesa/drivers/dri/i965/intel_tex_image.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c index 7affe08..cfcbf3c 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_image.c +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct brw_context *brw, DBG("%s\n", __func__); /* Figure out image dimensions at start level. */ - for (i = intelImage->base.Base.Level; i > 0; i--) { - width <<= 1; - if (height != 1) - height <<= 1; - if (intelObj->base.Target == GL_TEXTURE_3D) - depth <<= 1; + switch(intelObj->base.Target) { + case GL_TEXTURE_2D_MULTISAMPLE: + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: + assert(intelImage->base.Base.Level == 0); + break; + case GL_TEXTURE_3D: + depth <<= intelImage->base.Base.Level; + /* Fall through */ + case GL_TEXTURE_2D: + case GL_TEXTURE_2D_ARRAY: + case GL_TEXTURE_RECTANGLE: + case GL_TEXTURE_CUBE_MAP: + case GL_TEXTURE_CUBE_MAP_ARRAY: + height <<= intelImage->base.Base.Level; + /* Fall through */ + case GL_TEXTURE_1D: + case GL_TEXTURE_1D_ARRAY: + width <<= intelImage->base.Base.Level; + break; + default: + unreachable("Unexpected target"); } /* Guess a reasonable value for lastLevel. This is probably going -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v4] i965: Fix calculation of the image height at start level
- Fixes CTS tests: * GL44-CTS.shader_image_size.advanced-nonMS-cs-float * GL44-CTS.shader_image_size.advanced-nonMS-cs-int * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint * GL44-CTS.shader_image_size.advanced-nonMS-gs-float * GL44-CTS.shader_image_size.advanced-nonMS-gs-int * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint * GL44-CTS.shader_image_size.advanced-nonMS-tes-float * GL44-CTS.shader_image_size.advanced-nonMS-tes-int * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint * GL44-CTS.shader_image_size.advanced-nonMS-vs-float * GL44-CTS.shader_image_size.advanced-nonMS-vs-int * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint v1: (written by Dave Airlie) Always shift height images for levels. Fixed the CTS test. v2: Only shift height if the texture is not an 1D_ARRAY, it fixes assertion in GL44-CTS.texture_view.gettexparameter due to the original patch (Antia). v3: Remove the loop. Do not shift height either for 1D textures. Use an explicit switch and add an assertion (levels == 0) for multisampled textures (Jason). v4: Rectangle textures can not have levels either (Ilia Mirkin). Signed-off-by: Dave Airlie Signed-off-by: Antia Puentes --- src/mesa/drivers/dri/i965/intel_tex_image.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c b/src/mesa/drivers/dri/i965/intel_tex_image.c index 7affe08..65962eb 100644 --- a/src/mesa/drivers/dri/i965/intel_tex_image.c +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct brw_context *brw, DBG("%s\n", __func__); /* Figure out image dimensions at start level. */ - for (i = intelImage->base.Base.Level; i > 0; i--) { - width <<= 1; - if (height != 1) - height <<= 1; - if (intelObj->base.Target == GL_TEXTURE_3D) - depth <<= 1; + switch(intelObj->base.Target) { + case GL_TEXTURE_2D_MULTISAMPLE: + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: + case GL_TEXTURE_RECTANGLE: + assert(intelImage->base.Base.Level == 0); + break; + case GL_TEXTURE_3D: + depth <<= intelImage->base.Base.Level; + /* Fall through */ + case GL_TEXTURE_2D: + case GL_TEXTURE_2D_ARRAY: + case GL_TEXTURE_CUBE_MAP: + case GL_TEXTURE_CUBE_MAP_ARRAY: + height <<= intelImage->base.Base.Level; + /* Fall through */ + case GL_TEXTURE_1D: + case GL_TEXTURE_1D_ARRAY: + width <<= intelImage->base.Base.Level; + break; + default: + unreachable("Unexpected target"); } /* Guess a reasonable value for lastLevel. This is probably going -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs
The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the passed to the VertexAttribL* commands was <= 2. This was because we were using the vertex array size when emitting vertices to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits) for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems when mapping the input variables to registers because, for deciding which registers contain the values uploaded for a certain variable, we use the size and type given to the variable in the shader, so we will be assigning 256-bits to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened when the vertex array size was <= 2. The patch uses the shader information to only emit as 128-bits those 64-bit floating point variables that were declared as double or dvec2 in the vertex shader. Dvec3 and dvec4 variables will be always uploaded as 256-bits, independently of the given to the VertexAttribL* command. From the ARB_vertex_attrib_64bit specification: "For the 64-bit double precision types listed in Table X.1, no default attribute values are provided if the values of the vertex attribute variable are specified with fewer components than required for the attribute variable. For example, the fourth component of a variable of type dvec4 will be undefined if specified using VertexAttribL3dv or using a vertex array specified with VertexAttribLPointer and a size of three." We are filling these unspecified components with zeros, which coincidentally is also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects. Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287 --- src/mesa/drivers/dri/i965/brw_compiler.h | 1 + src/mesa/drivers/dri/i965/brw_context.h | 2 +- src/mesa/drivers/dri/i965/brw_draw_upload.c | 4 +++- src/mesa/drivers/dri/i965/brw_vs.c | 1 + src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 5 files changed, 21 insertions(+), 22 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h index 819c7d6..c2400f9 100644 --- a/src/mesa/drivers/dri/i965/brw_compiler.h +++ b/src/mesa/drivers/dri/i965/brw_compiler.h @@ -641,6 +641,7 @@ struct brw_vs_prog_data { struct brw_vue_prog_data base; GLbitfield64 inputs_read; + GLbitfield64 double_inputs_read; unsigned nr_attributes; unsigned nr_attribute_slots; diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 308ba99..310372a 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -535,7 +535,7 @@ struct brw_vertex_element { const struct gl_vertex_array *glarray; int buffer; - + bool is_dual_slot; /** Offset of the first element within the buffer object */ unsigned int offset; }; diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c index da13e7a..19c8065 100644 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c @@ -472,7 +472,9 @@ brw_prepare_vertices(struct brw_context *brw) while (vs_inputs) { GLuint index = ffsll(vs_inputs) - 1; struct brw_vertex_element *input = &brw->vb.inputs[index]; - + input->is_dual_slot = + (brw->gen >= 8 && _mesa_bitcount_64(vs_prog_data->double_inputs_read & + BITFIELD64_BIT(index))); vs_inputs &= ~BITFIELD64_BIT(index); brw->vb.enabled[brw->vb.nr_enabled++] = input; } diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 842c516..02a88ca 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -151,6 +151,7 @@ brw_codegen_vs_prog(struct brw_context *brw, uint64_t outputs_written = brw_vs_outputs_written(brw, key, vp->program.info.outputs_written); prog_data.inputs_read = vp->program.info.inputs_read; + prog_data.double_inputs_read = vp->program.info.double_inputs_read; if (key->copy_edgeflag) { prog_data.inputs_read |= VERT_BIT_EDGEFLAG; diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/drivers/dri/i965/gen8_draw_upload.c index 23c7587..69ba8e9 100644 --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c @@ -230,8 +230,15 @@ gen8_emit_vertices(struct brw_context *brw) case 0: comp0 = BRW_VE1_COMPONENT_STORE_0; case 1: comp1 = BRW_VE1_COMPONENT_STORE_0; case 2: comp2 = BRW_VE1_COMPONENT_STORE_0; - case 3: comp3 = input->glarray->Integer ? BRW_VE1_COMPONENT_STORE_1_INT - : BRW_VE1_COMPONENT_STORE_1_FLT; + case 3: + if (i
[Mesa-dev] [PATCH] i965/vec4: Don't coalesce registers in gen6 math ops if reswizzling needed
Math operations in SandyBridge do not support source swizzling Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033 --- src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++- src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +-- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h index 966a410..a48bb68 100644 --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h @@ -175,7 +175,8 @@ public: bool is_send_from_grf(); unsigned regs_read(unsigned arg) const; - bool can_reswizzle(int dst_writemask, int swizzle, int swizzle_mask); + bool can_reswizzle(const struct brw_device_info *devinfo, int dst_writemask, + int swizzle, int swizzle_mask); void reswizzle(int dst_writemask, int swizzle); bool can_do_source_mods(const struct brw_device_info *devinfo); diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index ed49cd3..d7192e4 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -941,10 +941,17 @@ vec4_visitor::opt_set_dependency_control() } bool -vec4_instruction::can_reswizzle(int dst_writemask, +vec4_instruction::can_reswizzle(const struct brw_device_info *devinfo, +int dst_writemask, int swizzle, int swizzle_mask) { + + /* gen6 math instructions can not manage source swizzles */ + if (devinfo->gen == 6 && is_math() && + swizzle != BRW_SWIZZLE_XYZW) + return false; + /* If this instruction sets anything not referenced by swizzle, then we'd * totally break it when we reswizzle. */ @@ -1077,7 +1084,7 @@ vec4_visitor::opt_register_coalesce() break; /* If we can't handle the swizzle, bail. */ -if (!scan_inst->can_reswizzle(inst->dst.writemask, +if (!scan_inst->can_reswizzle(devinfo, inst->dst.writemask, inst->src[0].swizzle, chans_needed)) { break; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2] i965/vec4: Don't coalesce regs in Gen6 MATH ops if reswizzle/writemask needed
Gen6 MATH instructions can not execute in align16 mode, so swizzles or writemasking are not allowed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033 --- I have tried to find an example where the writemask check is strictly needed in order to avoid an incorrect register coalescing, but I have failed. If I have something like this in my shader: in vec4 attr_pos; void main() { vec4 a; a.xy = sin(attr_pos); .. } It is translated into the following vec4 instructions: sin vgrf4.0:F, vgrf3.xyzw:F mov vgrf0.0.xy:F, vgrf4.xyzw:F <-- (added by emit_math to fix the writemasking issue) And converted by the opt_reduce_swizzle optimization into: sin vgrf4.0:F, vgrf3.xyzw:F mov vgrf0.0.xy:F, vgrf4.xyyy:F The swizzle in the MOV instruction is not the identity anymore, so the method "can_reswizzle" in opt_register_coalesce will return FALSE because of the swizzle-check that my first version of the patch added. Out of curiosity, I disabled the opt_reduce_swizzle optimization to see what happens. Still "can_reswizzle" returns FALSE preventing the incorrect register coalescing, because the math function sets more things than the MOV instruction; i.e. the check "(dst.writemask & ~swizzle_mask) is positive. Although I have not found an example where the writemask-check is crucial, that does not neccessarily mean that it does not exists. I am sending the second version of the patch in case you prefer to have that check in place. src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++- src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +-- 2 files changed, 11 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h b/src/mesa/drivers/dri/i965/brw_ir_vec4.h index 966a410..a48bb68 100644 --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h @@ -175,7 +175,8 @@ public: bool is_send_from_grf(); unsigned regs_read(unsigned arg) const; - bool can_reswizzle(int dst_writemask, int swizzle, int swizzle_mask); + bool can_reswizzle(const struct brw_device_info *devinfo, int dst_writemask, + int swizzle, int swizzle_mask); void reswizzle(int dst_writemask, int swizzle); bool can_do_source_mods(const struct brw_device_info *devinfo); diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index ed49cd3..ae695f1 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -941,10 +941,17 @@ vec4_visitor::opt_set_dependency_control() } bool -vec4_instruction::can_reswizzle(int dst_writemask, +vec4_instruction::can_reswizzle(const struct brw_device_info *devinfo, +int dst_writemask, int swizzle, int swizzle_mask) { + /* Gen6 MATH instructions can not execute in align16 mode, so swizzles +* or writemasking are not allowed. */ + if (devinfo->gen == 6 && is_math() && + (swizzle != BRW_SWIZZLE_XYZW || dst_writemask != WRITEMASK_XYZW)) + return false; + /* If this instruction sets anything not referenced by swizzle, then we'd * totally break it when we reswizzle. */ @@ -1077,7 +1084,7 @@ vec4_visitor::opt_register_coalesce() break; /* If we can't handle the swizzle, bail. */ -if (!scan_inst->can_reswizzle(inst->dst.writemask, +if (!scan_inst->can_reswizzle(devinfo, inst->dst.writemask, inst->src[0].swizzle, chans_needed)) { break; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7
IvyBridge and Haswell PRM say that the JIP should be emitted with type W but we were using UD. The previous implementation did not show adverse effects, however changing the type to D caused a GPU hang, see bug 84557; IMHO it is safer to follow the specification thoroughly. --- src/mesa/drivers/dri/i965/brw_eu_emit.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 43e5783..1ca79a9 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -1332,7 +1332,7 @@ brw_IF(struct brw_compile *p, unsigned execute_size) } else if (brw->gen == 7) { brw_set_dest(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D))); brw_set_src0(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D))); - brw_set_src1(p, insn, brw_imm_ud(0)); + brw_set_src1(p, insn, brw_imm_w(0)); brw_inst_set_jip(brw, insn, 0); brw_inst_set_uip(brw, insn, 0); } else { @@ -1533,7 +1533,7 @@ brw_ELSE(struct brw_compile *p) } else if (brw->gen == 7) { brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); - brw_set_src1(p, insn, brw_imm_ud(0)); + brw_set_src1(p, insn, brw_imm_w(0)); brw_inst_set_jip(brw, insn, 0); brw_inst_set_uip(brw, insn, 0); } else { @@ -1610,7 +1610,7 @@ brw_ENDIF(struct brw_compile *p) } else if (brw->gen == 7) { brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); - brw_set_src1(p, insn, brw_imm_ud(0)); + brw_set_src1(p, insn, brw_imm_w(0)); } else { brw_set_src0(p, insn, brw_imm_d(0)); } @@ -1802,7 +1802,7 @@ brw_WHILE(struct brw_compile *p) } else if (brw->gen == 7) { brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D)); - brw_set_src1(p, insn, brw_imm_ud(0)); + brw_set_src1(p, insn, brw_imm_w(0)); brw_inst_set_jip(brw, insn, br * (do_insn - insn)); } else { brw_set_dest(p, insn, brw_imm_w(0)); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: respect the source number set by #line
From GLSL 1.30.10, section 3.3 (Preprocessor): "#line line source-string-number ... After processing this directive (including its new-line), the implementation will behave as if it is compiling at ... source string number source-string-number. Subsequent source strings will be numbered sequentially, until another #line directive overrides that numbering." In the previous implementation the source number was always zero. Subsequent source strings are still not numbered sequentially, because in the glShaderSource implementation we are concatenating the source code strings into one long string. Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815 --- src/glsl/glsl_lexer.ll | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll index 8dc3d10..f0e047e 100644 --- a/src/glsl/glsl_lexer.ll +++ b/src/glsl/glsl_lexer.ll @@ -36,14 +36,13 @@ static int classify_identifier(struct _mesa_glsl_parse_state *, const char *); #define YY_USER_ACTION \ do {\ - yylloc->source = 0; \ yylloc->first_column = yycolumn + 1; \ yylloc->first_line = yylloc->last_line = yylineno + 1; \ yycolumn += yyleng; \ yylloc->last_column = yycolumn + 1; \ } while(0); -#define YY_USER_INIT yylineno = 0; yycolumn = 0; +#define YY_USER_INIT yylineno = 0; yycolumn = 0; yylloc->source = 0; /* A macro for handling reserved words and keywords across language versions. * -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: Update the #line behaviour on GLSL 3.30+ and GLSL ES+
From GLSL 3.30 and GLSL ES 1.00 on, after processing the line directive (including its new-line), the implementation should behave as if it is compiling at the line number passed as argument. In previous versions, it behaved as if compiling at the passed line number + 1. Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815 --- src/glsl/glsl_lexer.ll | 17 + 1 file changed, 17 insertions(+) diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll index f0e047e..2785ed1 100644 --- a/src/glsl/glsl_lexer.ll +++ b/src/glsl/glsl_lexer.ll @@ -187,6 +187,15 @@ HASH ^{SPC}#{SPC} * one-based. */ yylineno = strtol(ptr, &ptr, 0) - 1; + + /* From GLSL 3.30 and GLSL ES on, after processing the +* line directive (including its new-line), the implementation +* will behave as if it is compiling at the line number passed +* as argument. It was line number + 1 in older specifications. +*/ + if (yyextra->is_version(330, 100)) + yylineno--; + yylloc->source = strtol(ptr, NULL, 0); } {HASH}line{SPCP}{INT}{SPC}${ @@ -202,6 +211,14 @@ HASH ^{SPC}#{SPC} * one-based. */ yylineno = strtol(ptr, &ptr, 0) - 1; + + /* From GLSL 3.30 and GLSL ES on, after processing the +* line directive (including its new-line), the implementation +* will behave as if it is compiling at the line number passed +* as argument. It was line number + 1 in older specifications. +*/ + if (yyextra->is_version(330, 100)) + yylineno--; } ^{SPC}#{SPC}pragma{SPCP}debug{SPC}\({SPC}on{SPC}\) { BEGIN PP; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE
Thanks for fixing this. Reviewed-by: Antia Puentes On 27/10/17 11:18, Alejandro Piñeiro wrote: From the spec: "IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the resource when used as an image textures is returned in . This is equivalent to calling GetTexParameter" So we would need to return None for any target not supported by GetTexParameter. By mistake, we were using the target check for GetTexLevelParameter. --- src/mesa/main/formatquery.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c index 77c7faa2251..39c628039b8 100644 --- a/src/mesa/main/formatquery.c +++ b/src/mesa/main/formatquery.c @@ -1430,7 +1430,13 @@ _mesa_GetInternalformativ(GLenum target, GLenum internalformat, GLenum pname, if (!_mesa_has_ARB_shader_image_load_store(ctx)) goto end; - if (!_mesa_legal_get_tex_level_parameter_target(ctx, target, true)) + /* As pointed by the spec quote below, this pname query should return + * the same value that GetTexParameter. So if the target is not valid + * for GetTexParameter we return the unsupported value. The check below + * is the same target check used by GetTextParameter. + */ + int targetIndex = _mesa_tex_target_to_index(ctx, target); + if (targetIndex < 0 || targetIndex == TEXTURE_BUFFER_INDEX) goto end; /* From spec: "Equivalent to calling GetTexParameter with set ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/9] nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics
Reviewed-by: Neil Roberts --- src/compiler/nir/nir.c | 4 src/compiler/nir/nir_gather_info.c | 1 + src/compiler/nir/nir_intrinsics.h | 1 + 3 files changed, 6 insertions(+) diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c index 7380bf436a8..6f0477b0676 100644 --- a/src/compiler/nir/nir.c +++ b/src/compiler/nir/nir.c @@ -1913,6 +1913,8 @@ nir_intrinsic_from_system_value(gl_system_value val) return nir_intrinsic_load_base_instance; case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: return nir_intrinsic_load_vertex_id_zero_base; + case SYSTEM_VALUE_BASE_VERTEX_ID: + return nir_intrinsic_load_base_vertex_id; case SYSTEM_VALUE_BASE_VERTEX: return nir_intrinsic_load_base_vertex; case SYSTEM_VALUE_INVOCATION_ID: @@ -1982,6 +1984,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin) return SYSTEM_VALUE_BASE_INSTANCE; case nir_intrinsic_load_vertex_id_zero_base: return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE; + case nir_intrinsic_load_base_vertex_id: + return SYSTEM_VALUE_BASE_VERTEX_ID; case nir_intrinsic_load_base_vertex: return SYSTEM_VALUE_BASE_VERTEX; case nir_intrinsic_load_invocation_id: diff --git a/src/compiler/nir/nir_gather_info.c b/src/compiler/nir/nir_gather_info.c index 13cdae39eca..cf2abb8b8ed 100644 --- a/src/compiler/nir/nir_gather_info.c +++ b/src/compiler/nir/nir_gather_info.c @@ -232,6 +232,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, nir_shader *shader) case nir_intrinsic_load_vertex_id: case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_base_vertex_id: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_instance_id: case nir_intrinsic_load_sample_id: diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h index 20bef339ac4..78123dd927a 100644 --- a/src/compiler/nir/nir_intrinsics.h +++ b/src/compiler/nir/nir_intrinsics.h @@ -326,6 +326,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx) SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx) +SYSTEM_VALUE(base_vertex_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx) SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID
This VS system value will contain the value passed as for indexed draw calls or the value passed as for non-indexed draw calls. It will be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_BASE_VERTEX_ID. Note that the current calculation which uses SYSTEM_VALUE_BASE_VERTEX is not right, as gl_BaseVertex should be zero for non-indexed calls. Reviewed-by: Neil Roberts --- src/compiler/shader_enums.c | 1 + src/compiler/shader_enums.h | 15 +++ 2 files changed, 16 insertions(+) diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c index b2ca80b49c2..cd8c256f227 100644 --- a/src/compiler/shader_enums.c +++ b/src/compiler/shader_enums.c @@ -215,6 +215,7 @@ gl_system_value_name(gl_system_value sysval) ENUM(SYSTEM_VALUE_INSTANCE_ID), ENUM(SYSTEM_VALUE_INSTANCE_INDEX), ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE), + ENUM(SYSTEM_VALUE_BASE_VERTEX_ID), ENUM(SYSTEM_VALUE_BASE_VERTEX), ENUM(SYSTEM_VALUE_BASE_INSTANCE), ENUM(SYSTEM_VALUE_DRAW_ID), diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h index 9d229d4199e..7437cccdd0a 100644 --- a/src/compiler/shader_enums.h +++ b/src/compiler/shader_enums.h @@ -474,6 +474,21 @@ typedef enum */ SYSTEM_VALUE_BASE_VERTEX, + + /** +* Depending on the type of the draw call (indexed or non-indexed), +* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and +* similar, or is the value of \c first passed to \c glDrawArrays and +* similar. +* +* \note +* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as +* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c SYSTEM_VALUE_BASE_VERTEX_ID. +* +* \sa SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, SYSTEM_VALUE_VERTEX_ID +*/ + SYSTEM_VALUE_BASE_VERTEX_ID, + /** * Value of \c baseinstance passed to instanced draw entry points * -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests
Hi, the series sets gl_BaseVertex to zero for gl*DrawArrays* in i965. Previously, its value was the value passed as in the non-indexed draw call. This was convinient because the gl_VertexID was calculated as gl_VertexIDBaseZero + gl_BaseVertex. However, as gl_BaseVertex must be zero for non-indexed draw calls, this calculation is broken. Apart from setting the basevertex to zero in i965, the following patches add a new VS system value which will contain or as needed to make the gl_VertexID calculation right. The relevant bits of the OpenGL 4.6 specification involved here are: * 11.1.3.9 Shader Inputs "gl_BaseVertex holds the integer value passed to the baseVertex parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertex is zero." "gl_VertexID holds the integer index i implicitly passed by DrawArrays or one of the other drawing commands defined in section 10.4." * 10.4. Drawing Commands Using Vertex Arrays: - Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." - Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Regards, Antia. Antia Puentes (5): compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics i965: emit basevertexid as a vertex element component i965: gl_BaseVertex must be zero for non-indexed draw calls intel/compiler: implement the basevertex(id) load intrinsics Neil Roberts (4): intel/compiler: Add a uses_basevertexid flag nir: Offset vertex_id by base_vertex_id instead of base_vertex i965: Let nir lower gl_VertexID instead of the linker spirv: Lower BaseVertex to BASE_VERTEX_ID instead of BASE_VERTEX src/compiler/nir/nir.c| 4 +++ src/compiler/nir/nir_gather_info.c| 1 + src/compiler/nir/nir_intrinsics.h | 1 + src/compiler/nir/nir_lower_system_values.c| 2 +- src/compiler/shader_enums.c | 1 + src/compiler/shader_enums.h | 15 +++ src/compiler/spirv/vtn_variables.c| 2 +- src/intel/compiler/brw_compiler.h | 1 + src/intel/compiler/brw_nir.c | 12 ++--- src/intel/compiler/brw_vec4.cpp | 18 - src/intel/vulkan/genX_cmd_buffer.c| 8 +++--- src/intel/vulkan/genX_pipeline.c | 4 +-- src/mesa/drivers/dri/i965/brw_context.c | 3 ++- src/mesa/drivers/dri/i965/brw_context.h | 36 ++--- src/mesa/drivers/dri/i965/brw_draw.c | 33 --- src/mesa/drivers/dri/i965/brw_draw_upload.c | 13 - src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++ 17 files changed, 132 insertions(+), 61 deletions(-) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/9] intel/compiler: implement the basevertex(id) load intrinsics
The gl_BaseVertex is in a new location now, and the new basevertexid occupies the old gl_BaseVertex place. Reviewed-by: Neil Roberts --- src/intel/compiler/brw_nir.c| 12 src/intel/compiler/brw_vec4.cpp | 14 -- 2 files changed, 16 insertions(+), 10 deletions(-) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index 8f3f77f89ae..a4fe2f5eb14 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -240,8 +240,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, */ const bool has_sgvs = nir->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | - BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | + (BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -262,6 +261,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); switch (intrin->intrinsic) { +case nir_intrinsic_load_base_vertex_id: case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_vertex_id_zero_base: @@ -279,7 +279,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { - case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_base_vertex_id: nir_intrinsic_set_component(load, 0); break; case nir_intrinsic_load_base_instance: @@ -292,11 +292,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_component(load, 3); break; case nir_intrinsic_load_draw_id: + case nir_intrinsic_load_base_vertex: /* gl_DrawID is stored right after gl_VertexID and friends * if any of them exist. */ nir_intrinsic_set_base(load, num_inputs + has_sgvs); - nir_intrinsic_set_component(load, 0); + if (intrin->intrinsic == nir_intrinsic_load_draw_id) + nir_intrinsic_set_component(load, 0); + else + nir_intrinsic_set_component(load, 1); break; default: unreachable("Invalid system value intrinsic"); diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index d996ab8c89f..5d86cc4ac7e 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2784,13 +2784,18 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, * incoming vertex attribute. So, add an extra slot. */ if (shader->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | -BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | + (BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) { nr_attribute_slots++; } + if (shader->info.system_values_read & + (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | +BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) { + nr_attribute_slots++; + } + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; @@ -2811,12 +2816,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)) prog_data->uses_instanceid = true; - /* gl_DrawID has its very own vec4 */ if (shader->info.system_values_read & - BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) { + BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) prog_data->uses_drawid = true; - nr_attribute_slots++; - } /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode. Empirically, in -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/9] nir: Offset vertex_id by base_vertex_id instead of base_vertex
From: Neil Roberts base_vertex will be zero for non-indexed calls, but we need it to include the ‘first’ parameter. This is true for both GL and Vulkan. I think this patch will also affect freedreno and radeonsi. I believe if they are relying on this lowering then they are currently already broken because they will have the wrong values for gl_BaseVertex. However this patch will also make them break for gl_VertexID and they will need to be fixed to use base_vertex_id instead. --- src/compiler/nir/nir_lower_system_values.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c index 3594f4ae5ce..5464a1b3980 100644 --- a/src/compiler/nir/nir_lower_system_values.c +++ b/src/compiler/nir/nir_lower_system_values.c @@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b) if (b->shader->options->vertex_id_zero_based) { sysval = nir_iadd(b, nir_load_vertex_id_zero_base(b), - nir_load_base_vertex(b)); + nir_load_base_vertex_id(b)); } else { sysval = nir_load_vertex_id(b); } -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/9] intel/compiler: Add a uses_basevertexid flag
From: Neil Roberts --- src/intel/compiler/brw_compiler.h | 1 + src/intel/compiler/brw_vec4.cpp | 4 2 files changed, 5 insertions(+) diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h index df6ee018546..6b5b73a54f0 100644 --- a/src/intel/compiler/brw_compiler.h +++ b/src/intel/compiler/brw_compiler.h @@ -969,6 +969,7 @@ struct brw_vs_prog_data { bool uses_vertexid; bool uses_instanceid; bool uses_basevertex; + bool uses_basevertexid; bool uses_baseinstance; bool uses_drawid; }; diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index bbe4585e0c7..d996ab8c89f 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2795,6 +2795,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; + if (shader->info.system_values_read & + BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX_ID)) + prog_data->uses_basevertexid = true; + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE)) prog_data->uses_baseinstance = true; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/9] i965: emit basevertexid as a vertex element component
The new basevertexid will be emitted in the position previously occupied by gl_BaseVertex. This way we can keep pointing the indirect buffer for indirect draw calls. The gl_BaseVertex is now emited as part of the draw_id VERTEX_ELEMENT. Reviewed-by: Neil Roberts --- src/mesa/drivers/dri/i965/brw_context.h | 36 ++--- src/mesa/drivers/dri/i965/brw_draw.c | 26 +++--- src/mesa/drivers/dri/i965/brw_draw_upload.c | 13 - src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++ 4 files changed, 72 insertions(+), 42 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 8aa0c5ff64c..8ad3cdd7ebd 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -807,28 +807,44 @@ struct brw_context struct { struct { - /** The value of gl_BaseVertex for the current _mesa_prim. */ - int gl_basevertex; + /** + * Either the value of gl_BaseVertex for indexed draw calls or the + * value of the argument for non-indexed draw calls, for the + * current _mesa_prim. + */ + int basevertexid; /** The value of gl_BaseInstance for the current _mesa_prim. */ int gl_baseinstance; } params; /** - * Buffer and offset used for GL_ARB_shader_draw_parameters - * (for now, only gl_BaseVertex). + * Buffer and offset used for GL_ARB_shader_draw_parameters. For indirect + * draw calls it will point to the indirect buffer. */ struct brw_bo *draw_params_bo; uint32_t draw_params_offset; + struct { + /** The value of gl_DrawID for the current _mesa_prim. */ + int gl_drawid; + + /** + * The value of gl_BaseVertex for the current _mesa_prim. It must be + * zero for non-indexed calls. + */ + int gl_basevertex; + } derived_params; + /** - * The value of gl_DrawID for the current _mesa_prim. This always comes - * in from it's own vertex buffer since it's not part of the indirect - * draw parameters. + * Buffer and offset also used for GL_ARB_shader_draw_parameters. As they + * are not part of the indirect buffer they will go in their own vertex + * element. Note that although gl_BaseVertex is part of the indirect + * buffer for indexed draw calls, that is not longer the case for + * non-indexed. */ - int gl_drawid; - struct brw_bo *draw_id_bo; - uint32_t draw_id_offset; + struct brw_bo *derived_draw_params_bo; + uint32_t derived_draw_params_offset; /** * Pointer to the the buffer storing the indirect draw parameters. It diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 809e7221b2d..8c26273035b 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -766,25 +766,25 @@ brw_draw_single_prim(struct gl_context *ctx, * always flag if the shader uses one of the values. For direct draws, * we only flag if the values change. */ - const int new_basevertex = + const int new_basevertexid = prim->indexed ? prim->basevertex : prim->start; const int new_baseinstance = prim->base_instance; const struct brw_vs_prog_data *vs_prog_data = brw_vs_prog_data(brw->vs.base.prog_data); if (prim_id > 0) { const bool uses_draw_parameters = - vs_prog_data->uses_basevertex || + vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance; if ((uses_draw_parameters && prim->is_indirect) || - (vs_prog_data->uses_basevertex && - brw->draw.params.gl_basevertex != new_basevertex) || + (vs_prog_data->uses_basevertexid && + brw->draw.params.basevertexid != new_basevertexid) || (vs_prog_data->uses_baseinstance && brw->draw.params.gl_baseinstance != new_baseinstance)) brw->ctx.NewDriverState |= BRW_NEW_VERTICES; } - brw->draw.params.gl_basevertex = new_basevertex; + brw->draw.params.basevertexid = new_basevertexid; brw->draw.params.gl_baseinstance = new_baseinstance; brw_bo_unreference(brw->draw.draw_params_bo); @@ -809,12 +809,20 @@ brw_draw_single_prim(struct gl_context *ctx, * valid vs_prog_data, but we always flag BRW_NEW_VERTICES before * the loop. */ - brw->draw.gl_drawid = prim->draw_id; - brw_bo_unreference(brw->draw.draw_id_bo); - brw->draw.draw_id_bo = NULL; - if (prim_id > 0 && vs_prog_data->uses_drawid) + const int new_basevertex = prim->indexed ? prim->basevertex : prim->start; + + if (prim_id > 0 && + (vs_prog_data->uses_drawid || +(vs_prog_data->uses_basevertex && + brw->draw.derived_params.gl_basevertex != new_basevertex))) brw->ctx.NewDr
[Mesa-dev] [PATCH 5/9] i965: gl_BaseVertex must be zero for non-indexed draw calls
- From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification: "gl_BaseVertex holds the integer value passed to the baseVertex parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertex is zero." - Fixes CTS tests: * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters Reviewed-by: Neil Roberts --- src/mesa/drivers/dri/i965/brw_draw.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 8c26273035b..54d35c0522c 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -239,6 +239,13 @@ brw_emit_prim(struct brw_context *brw, prim->indirect_offset + 12); brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo, prim->indirect_offset + 16); + + if (brw->draw.derived_draw_params_bo != NULL) { +brw_store_register_mem32(brw, brw->draw.derived_draw_params_bo, + GEN7_3DPRIM_BASE_VERTEX, + brw->draw.derived_draw_params_offset + 4); +brw_emit_mi_flush(brw); + } } else { brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo, prim->indirect_offset + 12); @@ -809,7 +816,7 @@ brw_draw_single_prim(struct gl_context *ctx, * valid vs_prog_data, but we always flag BRW_NEW_VERTICES before * the loop. */ - const int new_basevertex = prim->indexed ? prim->basevertex : prim->start; + const int new_basevertex = prim->indexed ? prim->basevertex : 0; if (prim_id > 0 && (vs_prog_data->uses_drawid || -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 9/9] spirv: Lower BaseVertex to BASE_VERTEX_ID instead of BASE_VERTEX
From: Neil Roberts The base vertex in Vulkan is different from GL in that for non-indexed primitives the value is taken from the firstVertex parameter instead of being set to zero. This coincides with the new SYSTEM_VALUE_BASE_VERTEX_ID instead of BASE_VERTEX. --- src/compiler/spirv/vtn_variables.c | 2 +- src/intel/vulkan/genX_cmd_buffer.c | 8 src/intel/vulkan/genX_pipeline.c | 4 ++-- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 1cf9d597cf0..82d04f2558b 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1155,7 +1155,7 @@ vtn_get_builtin_location(struct vtn_builder *b, set_mode_system_value(mode); break; case SpvBuiltInBaseVertex: - *location = SYSTEM_VALUE_BASE_VERTEX; + *location = SYSTEM_VALUE_BASE_VERTEX_ID; set_mode_system_value(mode); break; case SpvBuiltInBaseInstance: diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index fbb5706606a..4e9b884d3ac 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2175,7 +2175,7 @@ void genX(CmdDraw)( genX(cmd_buffer_flush_state)(cmd_buffer); - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, 0); @@ -2213,7 +2213,7 @@ void genX(CmdDrawIndexed)( genX(cmd_buffer_flush_state)(cmd_buffer); - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, 0); @@ -2369,7 +2369,7 @@ void genX(CmdDrawIndirect)( struct anv_bo *bo = buffer->bo; uint32_t bo_offset = buffer->offset + offset; - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance) emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, i); @@ -2408,7 +2408,7 @@ void genX(CmdDrawIndexedIndirect)( uint32_t bo_offset = buffer->offset + offset; /* TODO: We need to stomp base vertex to 0 somehow */ - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance) emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, i); diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index 7e3a785c584..01e48ce4559 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -97,7 +97,7 @@ emit_vertex_input(struct anv_pipeline *pipeline, const uint32_t elements_double = double_inputs_read >> VERT_ATTRIB_GENERIC0; const bool needs_svgs_elem = vs_prog_data->uses_vertexid || vs_prog_data->uses_instanceid || -vs_prog_data->uses_basevertex || +vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance; uint32_t elem_count = __builtin_popcount(elements) - @@ -177,7 +177,7 @@ emit_vertex_input(struct anv_pipeline *pipeline, * This means, that if we have BaseInstance, we need BaseVertex as * well. Just do all or nothing. */ - uint32_t base_ctrl = (vs_prog_data->uses_basevertex || + uint32_t base_ctrl = (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance) ? VFCOMP_STORE_SRC : VFCOMP_STORE_0; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/9] i965: Let nir lower gl_VertexID instead of the linker
From: Neil Roberts --- src/mesa/drivers/dri/i965/brw_context.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 19d5a2e3503..01c8e8cb3cf 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -586,7 +586,8 @@ brw_initialize_context_constants(struct brw_context *brw) ctx->Const.QuadsFollowProvokingVertexConvention = false; ctx->Const.NativeIntegers = true; - ctx->Const.VertexID_is_zero_based = true; + /* This is lowered by NIR instead */ + ctx->Const.VertexID_is_zero_based = false; /* Regarding the CMP instruction, the Ivybridge PRM says: * -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 0/7] Fix shader_draw_parameters CTS tests
This patch series is a v2 of the one we sent some weeks ago to fix gl_BaseVertex in i965. The original patch series can be found on here: https://patchwork.freedesktop.org/series/33623/ * Rationale: gl_BaseVertex must be zero for non-indexed draw comands. However, for this kind of draw comands, we were setting it to the value passed as argument of the command. The series fixes this bug for i965. Notice that we still need the value of the argument of the non-indexed draw call to calculate gl_VertexID, as for those gl_VertexID must be equal to: gl_VertexIDBaseZero + . The Vertex Elements' composition will be now as follows: * VE 1: * VE 2: - BaseVertex will be the passed to indexed draw calls, zero otherwise. - firstvertex will be the vertex passed to non-indexed draw calls, and will be the for indexed draw commands. Setting firstvertex to for indexed draw commands is convinient for two reasons: 1) In case of indirect draw calls, we can keep pointing to the indirect buffer when emitting the vertices. 2) We can calculate gl_VertexID as VertexID(basezero) + firstindex for both types of draw commands and these two values will be in the same Vertex Element. We will only emit VE2 if gl_DrawID or gl_BaseVertex are referenced in the shader. * Changes in v2: - Rename the new system value as SYSTEM_VALUE_FIRST_VERTEX and the related variables through the series. - Squash patches 1-2 of the original series, and 4 and 6. - Move patch 5 to the end of the patch series. - Swap the order of patches 7 and 8. * OpenGL 4.6 specification bits: - 11.1.3.9 Shader Inputs "gl_BaseVertex holds the integer value passed to the baseVertex parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertex is zero." "gl_VertexID holds the integer index i implicitly passed by DrawArrays or one of the other drawing commands defined in section 10.4." - 10.4. Drawing Commands Using Vertex Arrays: -> Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." -> Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Antia Puentes (3): compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics intel: emit first_vertex and reorder the VE' components i965: gl_BaseVertex must be zero for non-indexed draw calls Neil Roberts (4): intel/compiler: Add a uses_firstvertex flag i965: Let nir lower gl_VertexID instead of the linker nir: Offset vertex_id by first_vertex instead of base_vertex spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX src/compiler/nir/nir.c| 4 +++ src/compiler/nir/nir_gather_info.c| 1 + src/compiler/nir/nir_intrinsics.h | 1 + src/compiler/nir/nir_lower_system_values.c| 2 +- src/compiler/shader_enums.c | 1 + src/compiler/shader_enums.h | 15 +++ src/compiler/spirv/vtn_variables.c| 2 +- src/intel/compiler/brw_compiler.h | 1 + src/intel/compiler/brw_nir.c | 11 +--- src/intel/compiler/brw_vec4.cpp | 17 src/intel/vulkan/genX_cmd_buffer.c| 8 +++--- src/intel/vulkan/genX_pipeline.c | 4 +-- src/mesa/drivers/dri/i965/brw_context.c | 3 ++- src/mesa/drivers/dri/i965/brw_context.h | 36 ++--- src/mesa/drivers/dri/i965/brw_draw.c | 33 --- src/mesa/drivers/dri/i965/brw_draw_upload.c | 13 - src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++ 17 files changed, 132 insertions(+), 59 deletions(-) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/7] intel/compiler: Add a uses_firstvertex flag
From: Neil Roberts Reviewed-by: Kenneth Graunke --- src/intel/compiler/brw_compiler.h | 1 + src/intel/compiler/brw_vec4.cpp | 4 2 files changed, 5 insertions(+) diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h index 28aed833245..6661427e982 100644 --- a/src/intel/compiler/brw_compiler.h +++ b/src/intel/compiler/brw_compiler.h @@ -967,6 +967,7 @@ struct brw_vs_prog_data { bool uses_vertexid; bool uses_instanceid; bool uses_basevertex; + bool uses_firstvertex; bool uses_baseinstance; bool uses_drawid; }; diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 73c40ad6009..14f930e0264 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2800,6 +2800,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; + if (shader->info.system_values_read & + BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX)) + prog_data->uses_firstvertex = true; + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE)) prog_data->uses_baseinstance = true; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/7] intel: emit first_vertex and reorder the VE' components
The new order is: * VE 1: * VE 2: Previously it was: * VE 1: * VE 2: The gl_BaseVertex is in a new location now, and firstvertex occupies the old gl_BaseVertex place. This way we can keep pointing to the indirect buffer for indirect draw calls. Reviewed-by: Neil Roberts --- src/intel/compiler/brw_nir.c | 11 +--- src/intel/compiler/brw_vec4.cpp | 13 + src/mesa/drivers/dri/i965/brw_context.h | 36 ++--- src/mesa/drivers/dri/i965/brw_draw.c | 26 +++--- src/mesa/drivers/dri/i965/brw_draw_upload.c | 13 - src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++ 6 files changed, 88 insertions(+), 50 deletions(-) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index 8f3f77f89ae..f702f5b8534 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -240,7 +240,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, */ const bool has_sgvs = nir->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | + (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -262,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); switch (intrin->intrinsic) { +case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_vertex_id_zero_base: @@ -279,7 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { - case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_first_vertex: nir_intrinsic_set_component(load, 0); break; case nir_intrinsic_load_base_instance: @@ -292,11 +293,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_component(load, 3); break; case nir_intrinsic_load_draw_id: + case nir_intrinsic_load_base_vertex: /* gl_DrawID is stored right after gl_VertexID and friends * if any of them exist. */ nir_intrinsic_set_base(load, num_inputs + has_sgvs); - nir_intrinsic_set_component(load, 0); + if (intrin->intrinsic == nir_intrinsic_load_draw_id) + nir_intrinsic_set_component(load, 0); + else + nir_intrinsic_set_component(load, 1); break; default: unreachable("Invalid system value intrinsic"); diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 14f930e0264..70a197a9fa0 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2789,13 +2789,19 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, * incoming vertex attribute. So, add an extra slot. */ if (shader->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | + (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) { nr_attribute_slots++; } + if (shader->info.system_values_read & + (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | +BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) { + nr_attribute_slots++; + } + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; @@ -2816,12 +2822,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)) prog_data->uses_instanceid = true; - /* gl_DrawID has its very own vec4 */ if (shader->info.system_values_read & - BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) { + BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) prog_data->uses_drawid = true; - nr_attribute_slots++; - } /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode. Empirically, in diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 06704838061..c6d176bc243 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -843,28 +843,44 @@ struct brw_context struct { struct { - /** The value of gl_BaseVertex for the current _mesa_prim. */ - int gl_basevertex;
[Mesa-dev] [PATCH v2 7/7] i965: gl_BaseVertex must be zero for non-indexed draw calls
- From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification: "gl_BaseVertex holds the integer value passed to the baseVertex parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertex is zero." - Fixes CTS tests: * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters Reviewed-by: Neil Roberts --- src/mesa/drivers/dri/i965/brw_draw.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 18d26cfafca..3d831e7d866 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -239,6 +239,13 @@ brw_emit_prim(struct brw_context *brw, prim->indirect_offset + 12); brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo, prim->indirect_offset + 16); + + if (brw->draw.derived_draw_params_bo != NULL) { +brw_store_register_mem32(brw, brw->draw.derived_draw_params_bo, + GEN7_3DPRIM_BASE_VERTEX, + brw->draw.derived_draw_params_offset + 4); +brw_emit_mi_flush(brw); + } } else { brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo, prim->indirect_offset + 12); @@ -803,7 +810,7 @@ brw_draw_single_prim(struct gl_context *ctx, * valid vs_prog_data, but we always flag BRW_NEW_VERTICES before * the loop. */ - const int new_basevertex = prim->indexed ? prim->basevertex : prim->start; + const int new_basevertex = prim->indexed ? prim->basevertex : 0; if (prim_id > 0 && (vs_prog_data->uses_drawid || -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 6/7] spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX
From: Neil Roberts The base vertex in Vulkan is different from GL in that for non-indexed primitives the value is taken from the firstVertex parameter instead of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX instead of BASE_VERTEX. --- src/compiler/spirv/vtn_variables.c | 2 +- src/intel/vulkan/genX_cmd_buffer.c | 8 src/intel/vulkan/genX_pipeline.c | 4 ++-- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index c57f5541319..67833a4c134 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1155,7 +1155,7 @@ vtn_get_builtin_location(struct vtn_builder *b, set_mode_system_value(mode); break; case SpvBuiltInBaseVertex: - *location = SYSTEM_VALUE_BASE_VERTEX; + *location = SYSTEM_VALUE_FIRST_VERTEX; set_mode_system_value(mode); break; case SpvBuiltInBaseInstance: diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index ab5590d7cee..ac4882494be 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2085,7 +2085,7 @@ void genX(CmdDraw)( genX(cmd_buffer_flush_state)(cmd_buffer); - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, 0); @@ -2123,7 +2123,7 @@ void genX(CmdDrawIndexed)( genX(cmd_buffer_flush_state)(cmd_buffer); - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, 0); @@ -2279,7 +2279,7 @@ void genX(CmdDrawIndirect)( struct anv_bo *bo = buffer->bo; uint32_t bo_offset = buffer->offset + offset; - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance) emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, i); @@ -2318,7 +2318,7 @@ void genX(CmdDrawIndexedIndirect)( uint32_t bo_offset = buffer->offset + offset; /* TODO: We need to stomp base vertex to 0 somehow */ - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance) emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, i); diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index 7e3a785c584..4f88fbaf656 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -97,7 +97,7 @@ emit_vertex_input(struct anv_pipeline *pipeline, const uint32_t elements_double = double_inputs_read >> VERT_ATTRIB_GENERIC0; const bool needs_svgs_elem = vs_prog_data->uses_vertexid || vs_prog_data->uses_instanceid || -vs_prog_data->uses_basevertex || +vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance; uint32_t elem_count = __builtin_popcount(elements) - @@ -177,7 +177,7 @@ emit_vertex_input(struct anv_pipeline *pipeline, * This means, that if we have BaseInstance, we need BaseVertex as * well. Just do all or nothing. */ - uint32_t base_ctrl = (vs_prog_data->uses_basevertex || + uint32_t base_ctrl = (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance) ? VFCOMP_STORE_SRC : VFCOMP_STORE_0; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 4/7] i965: Let nir lower gl_VertexID instead of the linker
From: Neil Roberts --- src/mesa/drivers/dri/i965/brw_context.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index b62852d90c8..249f62847b9 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -585,7 +585,8 @@ brw_initialize_context_constants(struct brw_context *brw) ctx->Const.QuadsFollowProvokingVertexConvention = false; ctx->Const.NativeIntegers = true; - ctx->Const.VertexID_is_zero_based = true; + /* This is lowered by NIR instead */ + ctx->Const.VertexID_is_zero_based = false; /* Regarding the CMP instruction, the Ivybridge PRM says: * -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 5/7] nir: Offset vertex_id by first_vertex instead of base_vertex
From: Neil Roberts base_vertex will be zero for non-indexed calls, but we need it to include the ‘first’ parameter. This is true for both GL and Vulkan. I think this patch will also affect freedreno and radeonsi. I believe if they are relying on this lowering then they are currently already broken because they will have the wrong values for gl_BaseVertex. However this patch will also make them break for gl_VertexID and they will need to be fixed to use firt_vertex instead. --- src/compiler/nir/nir_lower_system_values.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c index 3594f4ae5ce..6f4fb8233ab 100644 --- a/src/compiler/nir/nir_lower_system_values.c +++ b/src/compiler/nir/nir_lower_system_values.c @@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b) if (b->shader->options->vertex_id_zero_based) { sysval = nir_iadd(b, nir_load_vertex_id_zero_base(b), - nir_load_base_vertex(b)); + nir_load_first_vertex(b)); } else { sysval = nir_load_vertex_id(b); } -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/7] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics
This VS system value will contain the value passed as for indexed draw calls or the value passed as for non-indexed draw calls. It will be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX. Note that the current calculation which uses SYSTEM_VALUE_BASE_VERTEX is not right, as gl_BaseVertex should be zero for non-indexed calls. v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth). Reviewed-by: Neil Roberts Reviewed-by: Kenneth Graunke --- src/compiler/nir/nir.c | 4 src/compiler/nir/nir_gather_info.c | 1 + src/compiler/nir/nir_intrinsics.h | 1 + src/compiler/shader_enums.c| 1 + src/compiler/shader_enums.h| 15 +++ 5 files changed, 22 insertions(+) diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c index 7380bf436a8..29730f5fa86 100644 --- a/src/compiler/nir/nir.c +++ b/src/compiler/nir/nir.c @@ -1913,6 +1913,8 @@ nir_intrinsic_from_system_value(gl_system_value val) return nir_intrinsic_load_base_instance; case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: return nir_intrinsic_load_vertex_id_zero_base; + case SYSTEM_VALUE_FIRST_VERTEX: + return nir_intrinsic_load_first_vertex; case SYSTEM_VALUE_BASE_VERTEX: return nir_intrinsic_load_base_vertex; case SYSTEM_VALUE_INVOCATION_ID: @@ -1982,6 +1984,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin) return SYSTEM_VALUE_BASE_INSTANCE; case nir_intrinsic_load_vertex_id_zero_base: return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE; + case nir_intrinsic_load_first_vertex: + return SYSTEM_VALUE_FIRST_VERTEX; case nir_intrinsic_load_base_vertex: return SYSTEM_VALUE_BASE_VERTEX; case nir_intrinsic_load_invocation_id: diff --git a/src/compiler/nir/nir_gather_info.c b/src/compiler/nir/nir_gather_info.c index 946939657ec..555ae77b1d3 100644 --- a/src/compiler/nir/nir_gather_info.c +++ b/src/compiler/nir/nir_gather_info.c @@ -247,6 +247,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, nir_shader *shader) case nir_intrinsic_load_vertex_id: case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_instance_id: case nir_intrinsic_load_sample_id: diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h index 20bef339ac4..a7770bf6a85 100644 --- a/src/compiler/nir/nir_intrinsics.h +++ b/src/compiler/nir/nir_intrinsics.h @@ -326,6 +326,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx) SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx) +SYSTEM_VALUE(first_vertex, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx) SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx) diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c index 2179c475abd..5e123f29f37 100644 --- a/src/compiler/shader_enums.c +++ b/src/compiler/shader_enums.c @@ -214,6 +214,7 @@ gl_system_value_name(gl_system_value sysval) ENUM(SYSTEM_VALUE_INSTANCE_ID), ENUM(SYSTEM_VALUE_INSTANCE_INDEX), ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE), + ENUM(SYSTEM_VALUE_FIRST_VERTEX), ENUM(SYSTEM_VALUE_BASE_VERTEX), ENUM(SYSTEM_VALUE_BASE_INSTANCE), ENUM(SYSTEM_VALUE_DRAW_ID), diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h index ffe551ab20f..76bb2cc4203 100644 --- a/src/compiler/shader_enums.h +++ b/src/compiler/shader_enums.h @@ -472,6 +472,21 @@ typedef enum */ SYSTEM_VALUE_BASE_VERTEX, + + /** +* Depending on the type of the draw call (indexed or non-indexed), +* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and +* similar, or is the value of \c first passed to \c glDrawArrays and +* similar. +* +* \note +* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as +* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c SYSTEM_VALUE_FIRST_VERTEX. +* +* \sa SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, SYSTEM_VALUE_VERTEX_ID +*/ + SYSTEM_VALUE_FIRST_VERTEX, + /** * Value of \c baseinstance passed to instanced draw entry points * -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID
Hi, On 30/11/17 21:43, Neil Roberts wrote: Kenneth Graunke writes: We have a number of similar names now: SYSTEM_VALUE_BASE_VERTEX SYSTEM_VALUE_BASE_VERTEX_ID SYSTEM_VALUE_VERTEX_ID SYSTEM_VALUE_VERTEX_ID_ZERO_BASE BASE_VERTEX and BASE_VERTEX_ID are really similar names, and honestly either one seems like it could be the name for gl_BaseVertex. I'm afraid it would be easy to mix them up by mistake. IMHO, it would be nice to pick a different word, just to keep some distinction between the two fairly related concepts... Perhaps SYSTEM_VALUE_FIRST_VERTEX...? That's only half the meaning, but it at least uses a different word, and makes you think "do I want BASE_VERTEX or FIRST_VERTEX?" Yes, naming this thing is really difficult. I’m not sure if you noticed, but for Vulkan, the BaseVertex builtin should actually have the value of BASE_VERTEX_ID unlike GL. So if we rename BASE_VERTEX to something without “base vertex” in it then it will still be confusing for Vulkan. So effectively the descriptive names are like: SYSTEM_VALUE_BASE_VERTEX_ON_GL_BUT_NOT_VULKAN SYSTEM_VALUE_BASE_VERTEX_ON_VULKAN_OR_OFFSET_FOR_VERTEX_ID_ON_GL I’m not sure whether that’s enough of an argument against FIRST_VERTEX though, so personally I don’t really mind either way. Antia, what do you think? I am fine renaming it to FIRST_VERTEX. I have sent a second version of the series with the renaming and addressing other feedback given by Kenneth. Thanks. Regards. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: silence coverity warning
Reviewed-by: Antia Puentes On 09/10/17 16:37, Lionel Landwerlin wrote: Also makes this statement a bit clearer. CID: 1418920 Signed-off-by: Lionel Landwerlin --- src/mesa/drivers/dri/i965/brw_draw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index c7ed7284501..7ebed98fcef 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -955,7 +955,7 @@ brw_draw_indirect_prims(struct gl_context *ctx, prim[draw_count - 1].end = 1; for (i = 0; i < draw_count; ++i, indirect_offset += stride) { prim[i].mode = mode; - prim[i].indexed = !!ib; + prim[i].indexed = ib != NULL; prim[i].indirect_offset = indirect_offset; prim[i].is_indirect = 1; prim[i].draw_id = i; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] intel: blorp: fix potential NULL pointer dereferences
Reviewed-by: Antia Puentes On 09/10/17 16:37, Lionel Landwerlin wrote: CID: 1418616, 1418607 Signed-off-by: Lionel Landwerlin --- src/intel/blorp/blorp_blit.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c index 11c2116a758..84e46d7d31b 100644 --- a/src/intel/blorp/blorp_blit.c +++ b/src/intel/blorp/blorp_blit.c @@ -2343,23 +2343,29 @@ blorp_surf_convert_to_uncompressed(const struct isl_device *isl_dev, */ blorp_surf_convert_to_single_slice(isl_dev, info); - if (width || height) { + if (width) { #ifndef NDEBUG uint32_t right_edge_px = info->tile_x_sa + *x + *width; - uint32_t bottom_edge_px = info->tile_y_sa + *y + *height; assert(*width % fmtl->bw == 0 || right_edge_px == info->surf.logical_level0_px.width); +#endif + *width = DIV_ROUND_UP(*width, fmtl->bw); + } + if (height) { +#ifndef NDEBUG + uint32_t bottom_edge_px = info->tile_y_sa + *y + *height; assert(*height % fmtl->bh == 0 || bottom_edge_px == info->surf.logical_level0_px.height); #endif - *width = DIV_ROUND_UP(*width, fmtl->bw); *height = DIV_ROUND_UP(*height, fmtl->bh); } - if (x || y) { + if (x) { assert(*x % fmtl->bw == 0); - assert(*y % fmtl->bh == 0); *x /= fmtl->bw; + } + if (y) { + assert(*y % fmtl->bh == 0); *y /= fmtl->bh; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: fix typo in isl
Reviewed-by: Antia Puentes On 12/10/17 13:24, Elie Tournier wrote: Signed-off-by: Elie Tournier --- src/intel/isl/meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/isl/meson.build b/src/intel/isl/meson.build index 789175e256..54024b4d11 100644 --- a/src/intel/isl/meson.build +++ b/src/intel/isl/meson.build @@ -101,5 +101,5 @@ if with_tests build_by_default : false, ) - test('isl_surf_get_imaage_offset', isl_surf_get_image_offset_test) + test('isl_surf_get_image_offset', isl_surf_get_image_offset_test) endif ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/5] intel/compiler: Add uses_is_indexed_draw flag
--- src/intel/compiler/brw_compiler.h | 1 + src/intel/compiler/brw_vec4.cpp | 4 2 files changed, 5 insertions(+) diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h index 24196248b8e..e3bf535a519 100644 --- a/src/intel/compiler/brw_compiler.h +++ b/src/intel/compiler/brw_compiler.h @@ -978,6 +978,7 @@ struct brw_vs_prog_data { bool uses_vertexid; bool uses_instanceid; bool uses_basevertex; + bool uses_is_indexed_draw; bool uses_firstvertex; bool uses_baseinstance; bool uses_drawid; diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 1e384f5bf4d..e583c549204 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2837,6 +2837,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; + if (shader->info.system_values_read & + BITFIELD64_BIT(SYSTEM_VALUE_IS_INDEXED_DRAW)) + prog_data->uses_is_indexed_draw = true; + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX)) prog_data->uses_firstvertex = true; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] intel: activate the gl_BaseVertex lowering
Surplus code related to the basevertex is removed. The Vertex Elements contain now: * VE 1: * VE 2: Also fixes unreachable message. Fixes OpenGL CTS tests: * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters Fixes Piglit tests: * arb_shader_draw_parameters-drawid-indirect baseinstance * arb_shader_draw_parameters-basevertex Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678 --- src/intel/compiler/brw_compiler.c | 3 ++- src/intel/compiler/brw_compiler.h | 1 - src/intel/compiler/brw_fs_nir.cpp | 8 src/intel/compiler/brw_nir.c | 5 + src/intel/compiler/brw_vec4.cpp | 7 +-- src/mesa/drivers/dri/i965/brw_draw.c | 8 ++-- src/mesa/drivers/dri/i965/brw_draw_upload.c | 5 + src/mesa/drivers/dri/i965/genX_state_upload.c | 1 - 8 files changed, 11 insertions(+), 27 deletions(-) diff --git a/src/intel/compiler/brw_compiler.c b/src/intel/compiler/brw_compiler.c index d5f483798a9..6480dbefbf6 100644 --- a/src/intel/compiler/brw_compiler.c +++ b/src/intel/compiler/brw_compiler.c @@ -45,7 +45,8 @@ .lower_device_index_to_zero = true,\ .native_integers = true, \ .use_interpolated_input_intrinsics = true, \ - .vertex_id_zero_based = true + .vertex_id_zero_based = true, \ + .lower_base_vertex = true #define COMMON_SCALAR_OPTIONS \ .lower_pack_half_2x16 = true, \ diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h index e3bf535a519..8b4e6fe2e29 100644 --- a/src/intel/compiler/brw_compiler.h +++ b/src/intel/compiler/brw_compiler.h @@ -977,7 +977,6 @@ struct brw_vs_prog_data { bool uses_vertexid; bool uses_instanceid; - bool uses_basevertex; bool uses_is_indexed_draw; bool uses_firstvertex; bool uses_baseinstance; diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 22beb0e00d1..02aaf144019 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -112,10 +112,10 @@ emit_system_values_block(nir_block *block, fs_visitor *v) nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); switch (intrin->intrinsic) { case nir_intrinsic_load_vertex_id: - unreachable("should be lowered by lower_vertex_id()."); + case nir_intrinsic_load_base_vertex: + unreachable("should be lowered by nir_lower_system_values()."); case nir_intrinsic_load_vertex_id_zero_base: - case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_is_indexed_draw: case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_instance_id: @@ -2420,10 +2420,10 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder &bld, switch (instr->intrinsic) { case nir_intrinsic_load_vertex_id: - unreachable("should be lowered by lower_vertex_id()"); + case nir_intrinsic_load_base_vertex: + unreachable("should be lowered by nir_lower_system_values()"); case nir_intrinsic_load_vertex_id_zero_base: - case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_instance_id: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_draw_id: { diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index a624deb6d2a..9998c59586e 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, */ const bool has_sgvs = nir->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | - BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | + (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -261,7 +260,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr); switch (intrin->intrinsic) { -case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_vertex_id_zero_base: @@ -280,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { -
[Mesa-dev] [PATCH 1/5] compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and instrinsics
This VS system value contains if the draw command used to start the rendering was an indexed draw command or a non-indexed one (~0/0 respectively). Useful to calculate the gl_BaseVertex as: (SYSTEM_VALUE_IS_INDEXED_DRAW & SYSTEM_VALUE_FIRST_VERTEX). --- src/compiler/nir/nir.c | 4 src/compiler/nir/nir_gather_info.c | 1 + src/compiler/nir/nir_intrinsics.py | 1 + src/compiler/shader_enums.c| 1 + src/compiler/shader_enums.h| 7 +++ 5 files changed, 14 insertions(+) diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c index ea28fbd1af5..dc1c560319e 100644 --- a/src/compiler/nir/nir.c +++ b/src/compiler/nir/nir.c @@ -1991,6 +1991,8 @@ nir_intrinsic_from_system_value(gl_system_value val) return nir_intrinsic_load_base_instance; case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: return nir_intrinsic_load_vertex_id_zero_base; + case SYSTEM_VALUE_IS_INDEXED_DRAW: + return nir_intrinsic_load_is_indexed_draw; case SYSTEM_VALUE_FIRST_VERTEX: return nir_intrinsic_load_first_vertex; case SYSTEM_VALUE_BASE_VERTEX: @@ -2070,6 +2072,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin) return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE; case nir_intrinsic_load_first_vertex: return SYSTEM_VALUE_FIRST_VERTEX; + case nir_intrinsic_load_is_indexed_draw: + return SYSTEM_VALUE_IS_INDEXED_DRAW; case nir_intrinsic_load_base_vertex: return SYSTEM_VALUE_BASE_VERTEX; case nir_intrinsic_load_invocation_id: diff --git a/src/compiler/nir/nir_gather_info.c b/src/compiler/nir/nir_gather_info.c index a6a699ab25f..dba9f199ec6 100644 --- a/src/compiler/nir/nir_gather_info.c +++ b/src/compiler/nir/nir_gather_info.c @@ -266,6 +266,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, nir_shader *shader) case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_first_vertex: + case nir_intrinsic_load_is_indexed_draw: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_instance_id: case nir_intrinsic_load_sample_id: diff --git a/src/compiler/nir/nir_intrinsics.py b/src/compiler/nir/nir_intrinsics.py index f26aaf35ee3..b1754a7e50e 100644 --- a/src/compiler/nir/nir_intrinsics.py +++ b/src/compiler/nir/nir_intrinsics.py @@ -414,6 +414,7 @@ system_value("front_face", 1) system_value("vertex_id", 1) system_value("vertex_id_zero_base", 1) system_value("first_vertex", 1) +system_value("is_indexed_draw", 1) system_value("base_vertex", 1) system_value("instance_id", 1) system_value("base_instance", 1) diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c index ebee076b43c..d596d7d04a0 100644 --- a/src/compiler/shader_enums.c +++ b/src/compiler/shader_enums.c @@ -217,6 +217,7 @@ gl_system_value_name(gl_system_value sysval) ENUM(SYSTEM_VALUE_INSTANCE_INDEX), ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE), ENUM(SYSTEM_VALUE_FIRST_VERTEX), + ENUM(SYSTEM_VALUE_IS_INDEXED_DRAW), ENUM(SYSTEM_VALUE_BASE_VERTEX), ENUM(SYSTEM_VALUE_BASE_INSTANCE), ENUM(SYSTEM_VALUE_DRAW_ID), diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h index 8a277a14f21..1ef4d5a33d0 100644 --- a/src/compiler/shader_enums.h +++ b/src/compiler/shader_enums.h @@ -517,6 +517,13 @@ typedef enum */ SYSTEM_VALUE_FIRST_VERTEX, + /** +* If the Draw command used to start the rendering was an indexed draw +* or not (~0/0). Useful to calculate \c SYSTEM_VALUE_BASE_VERTEX as +* \c SYSTEM_VALUE_IS_INDEXED_DRAW & \c SYSTEM_VALUE_FIRST_VERTEX. +*/ + SYSTEM_VALUE_IS_INDEXED_DRAW, + /** * Value of \c baseinstance passed to instanced draw entry points * -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/5] i965: Fix gl_BaseVertex for non-indexed draws
This series is the alternative to the discarded patch 6 of the series: https://patchwork.freedesktop.org/series/41307/ It fixes gl_BaseVertex in i965 by calculating it as: is_indexed_draw(~0/0) & firstvertex. I have run jenkins for the last patch of the series and the intermediate patch 3. No regressions found. Antia Puentes (5): compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and instrinsics intel/compiler: Add uses_is_indexed_draw flag intel: emit is_indexed_draw in the same VE than gl_DrawID compiler/nir: Add conditional lowering for gl_BaseVertex intel: activate the gl_BaseVertex lowering src/compiler/nir/nir.c| 4 +++ src/compiler/nir/nir.h| 6 src/compiler/nir/nir_gather_info.c| 1 + src/compiler/nir/nir_intrinsics.py| 1 + src/compiler/nir/nir_lower_system_values.c| 15 ++ src/compiler/shader_enums.c | 1 + src/compiler/shader_enums.h | 7 + src/intel/compiler/brw_compiler.c | 3 +- src/intel/compiler/brw_compiler.h | 2 +- src/intel/compiler/brw_fs_nir.cpp | 10 --- src/intel/compiler/brw_nir.c | 16 ++- src/intel/compiler/brw_vec4.cpp | 21 -- src/mesa/drivers/dri/i965/brw_context.h | 31 ++-- src/mesa/drivers/dri/i965/brw_draw.c | 29 ++- src/mesa/drivers/dri/i965/brw_draw_upload.c | 13 - src/mesa/drivers/dri/i965/genX_state_upload.c | 41 +-- 16 files changed, 128 insertions(+), 73 deletions(-) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/5] compiler/nir: Add conditional lowering for gl_BaseVertex
--- src/compiler/nir/nir.h | 6 ++ src/compiler/nir/nir_lower_system_values.c | 15 +++ 2 files changed, 21 insertions(+) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index f3326e6df94..1b1dd4dd31b 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -1933,6 +1933,12 @@ typedef struct nir_shader_compiler_options { /* Indicates that the driver only has zero-based vertex id */ bool vertex_id_zero_based; + /** +* If enabled, gl_BaseVertex will be lowered as: +* is_indexed_draw (~0/0) & firstvertex +*/ + bool lower_base_vertex; + bool lower_cs_local_index_from_id; bool lower_device_index_to_zero; diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c index 47709e9887b..487da042620 100644 --- a/src/compiler/nir/nir_lower_system_values.c +++ b/src/compiler/nir/nir_lower_system_values.c @@ -121,6 +121,21 @@ convert_block(nir_block *block, nir_builder *b) } break; + case SYSTEM_VALUE_BASE_VERTEX: + /** + * From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification: + * + * "gl_BaseVertex holds the integer value passed to the baseVertex + * parameter to the command that resulted in the current shader + * invocation. In the case where the command has no baseVertex + * parameter, the value of gl_BaseVertex is zero." + */ + if (b->shader->options->lower_base_vertex) +sysval = nir_iand(b, + nir_load_is_indexed_draw(b), + nir_load_first_vertex(b)); + break; + case SYSTEM_VALUE_INSTANCE_INDEX: sysval = nir_iadd(b, nir_load_instance_id(b), -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/5] intel: emit is_indexed_draw in the same VE than gl_DrawID
The Vertex Elements are now: * VE 1: * VE 2: VE1 is it kept as it was before, VE2 additionally contains the new system value. --- src/intel/compiler/brw_fs_nir.cpp | 2 ++ src/intel/compiler/brw_nir.c | 11 +-- src/intel/compiler/brw_vec4.cpp | 14 + src/mesa/drivers/dri/i965/brw_context.h | 31 +++- src/mesa/drivers/dri/i965/brw_draw.c | 21 +- src/mesa/drivers/dri/i965/brw_draw_upload.c | 8 ++--- src/mesa/drivers/dri/i965/genX_state_upload.c | 42 +-- 7 files changed, 80 insertions(+), 49 deletions(-) diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 9698a0111ef..22beb0e00d1 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -116,6 +116,7 @@ emit_system_values_block(nir_block *block, fs_visitor *v) case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_is_indexed_draw: case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_instance_id: case nir_intrinsic_load_base_instance: @@ -2460,6 +2461,7 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder &bld, } case nir_intrinsic_load_first_vertex: + case nir_intrinsic_load_is_indexed_draw: unreachable("lowered by brw_nir_lower_vs_inputs"); default: diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index 16b0d86814f..a624deb6d2a 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -266,6 +266,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, case nir_intrinsic_load_base_instance: case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_instance_id: +case nir_intrinsic_load_is_indexed_draw: case nir_intrinsic_load_draw_id: { b.cursor = nir_after_instr(&intrin->instr); @@ -293,11 +294,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_component(load, 3); break; case nir_intrinsic_load_draw_id: - /* gl_DrawID is stored right after gl_VertexID and friends - * if any of them exist. + case nir_intrinsic_load_is_indexed_draw: + /* gl_DrawID and IsIndexedDraw are stored right after + * gl_VertexID and friends if any of them exist. */ nir_intrinsic_set_base(load, num_inputs + has_sgvs); - nir_intrinsic_set_component(load, 0); + if (intrin->intrinsic == nir_intrinsic_load_draw_id) + nir_intrinsic_set_component(load, 0); + else + nir_intrinsic_set_component(load, 1); break; default: unreachable("Invalid system value intrinsic"); diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index e583c549204..898df90225f 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2833,6 +2833,13 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, nr_attribute_slots++; } + /* gl_DrawID and IsIndexedDraw share its very own vec4 */ + if (shader->info.system_values_read & + (BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID) | +BITFIELD64_BIT(SYSTEM_VALUE_IS_INDEXED_DRAW))) { + nr_attribute_slots++; + } + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; @@ -2857,12 +2864,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)) prog_data->uses_instanceid = true; - /* gl_DrawID has its very own vec4 */ if (shader->info.system_values_read & - BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) { - prog_data->uses_drawid = true; - nr_attribute_slots++; - } + BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) + prog_data->uses_drawid = true; /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode. Empirically, in diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 1e6a45eee1f..be43eab43cc 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -900,20 +900,35 @@ struct brw_context } params; /** - * Buffer and offset used for GL_ARB_shader_draw_parameters - * (for now, only gl_BaseVertex). + * Buffer and offset used for GL_ARB_shader_draw_parameters which will + * point to the indirect buffer for indirect draw calls. */ struct brw_bo *draw_params_bo; uint32_t draw_params_offset; + str
Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
On 26/01/18 11:31, Antia Puentes wrote: On 25/01/18 18:56, Roland Scheidegger wrote: Am 25.01.2018 um 17:56 schrieb Roland Scheidegger: Am 25.01.2018 um 16:30 schrieb Michel Dänzer: On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote: This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5. --- src/mesa/main/fbobject.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index d23916d1ad7..c72204e11a0 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat) ctx->Extensions.ARB_texture_float) || _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ ) ? GL_RGBA : 0; + case GL_RGB9_E5: + return (_mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_shared_exponent) + ? GL_RGB: 0; case GL_ALPHA16F_ARB: case GL_ALPHA32F_ARB: return ctx->API == API_OPENGL_COMPAT && Unfortunately, this broke the "spec@arb_internalformat_query2@samples and num_sample_counts pname checks" piglit tests with radeonsi and llvmpipe, see below. Any idea what might need to be done in Gallium to fix this? 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}} 32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}} PIGLIT: {"result": "fail" } Purely coincidentally, I was trying to clean up the formatquery code recently (should help some failures with r600 too), and I think these cleanups would fix it. Basically outright say "no" to target/pname combinations which don't make sense rather than trying to find a format suitable for another target and then asking the driver for the nonsense combination, plus some other small bits like not validating things again (sometimes, a third time...). Albeit it will cause some breakage with the piglit test, which I believe is a test error, but that might be open for debate... (For TEXTURE_BUFFER and the internalformat size/type queries, do you return valid values or unsupported? The problem here is ARB_tbo says you can't get these values via the equivalent GetTexLevelParameter queries, whereas with GL 3.1 you can. And internalformat_query2 says it returns "the same information" as GetTexLevelParameter, albeit it's not entirely true in any case since the equivalent of the internalformat stencil type doesn't even exist. My stance would be that valid values should be reported even without GL 3.1, but the piglit test thinks differently.) Err, actually this won't fix it I suppose - because rgb9e5 now is a valid fbo format. Was that commit really correct? It does not make sense to me, rgb9e5 cannot be a fbo/renderable format. Or was this just working around issues in formatquery.c (which I try to address with this patch)? I believe the commit is wrong, _mesa_base_fbo_format_ is used to validate the internalformat passed to RenderbufferStorage, which in the OpenGL 4.6 is said that nee
[Mesa-dev] [PATCH v3 1/8] i965: allocate a SGVS element when VertexID or InstanceID are read
From: Iago Toral Quiroga Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS to put these beyond the last vertex element it seems that we still need to allocate the SVGS element, otherwise we have observed cases where we end up reading garbage. Specifically, the CTS test mentioned below was flaky with a fail rate of ~1% on some gen9+ platforms caused by reading garbage for the gl_InstanceID value. The flakyness goes away as soon as we start allocating the SVGS element. v2: - Do this for gen8+, not just gen9+, and pull the boolean outside the #if block (Jason) Fixes flaky test: KHR-GL45.vertex_attrib_64bit.limits_test Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335 Reviewed-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/genX_state_upload.c | 17 ++--- 1 file changed, 2 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index 50ac5bc59ff..d0a980f9730 100644 --- a/src/mesa/drivers/dri/i965/genX_state_upload.c +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -486,26 +486,13 @@ genX(emit_vertices)(struct brw_context *brw) } else { brw_batch_emit(brw, GENX(3DSTATE_VF_SGVS), vfs); } +#endif - /* Normally we don't need an element for the SGVS attribute because the -* 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an -* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if -* we're using draw parameters then we need an element for the those -* values. Additionally if there is an edge flag element then the SGVS -* can't be inserted past that so we need a dummy element to ensure that -* the edge flag is the last one. -*/ - const bool needs_sgvs_element = (vs_prog_data->uses_basevertex || -vs_prog_data->uses_baseinstance || -((vs_prog_data->uses_instanceid || - vs_prog_data->uses_vertexid) - && uses_edge_flag)); -#else const bool needs_sgvs_element = (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance || vs_prog_data->uses_instanceid || vs_prog_data->uses_vertexid); -#endif + unsigned nr_elements = brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 4/8] intel: Handle firstvertex in an identical way to BaseVertex
Until we set gl_BaseVertex to zero for non-indexed draw calls both have an identical value. The Vertex Elements are kept like that: * VE 1: * VE 2: --- src/intel/compiler/brw_nir.c | 3 +++ src/intel/compiler/brw_vec4.cpp | 1 + src/mesa/drivers/dri/i965/brw_context.h | 8 ++-- src/mesa/drivers/dri/i965/brw_draw.c | 14 +- src/mesa/drivers/dri/i965/brw_draw_upload.c | 7 +-- src/mesa/drivers/dri/i965/genX_state_upload.c | 11 +++ 6 files changed, 31 insertions(+), 13 deletions(-) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index dbddef0d04d..34b1e44adf0 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -239,6 +239,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, const bool has_sgvs = nir->info.system_values_read & (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | + BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -261,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, switch (intrin->intrinsic) { case nir_intrinsic_load_base_vertex: +case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_instance_id: @@ -278,6 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_first_vertex: nir_intrinsic_set_component(load, 0); break; case nir_intrinsic_load_base_instance: diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 36e17d77d47..06c79630119 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2788,6 +2788,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, */ if (shader->info.system_values_read & (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | +BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) { diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 9046acd175c..0a20706567e 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -881,8 +881,12 @@ struct brw_context struct { struct { - /** The value of gl_BaseVertex for the current _mesa_prim. */ - int gl_basevertex; + /** + * Either the value of gl_BaseVertex for indexed draw calls or the + * value of the argument for non-indexed draw calls for the + * current _mesa_prim. + */ + int firstvertex; /** The value of gl_BaseInstance for the current _mesa_prim. */ int gl_baseinstance; diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 50cf8b12c74..a1a5161fd35 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -816,25 +816,29 @@ brw_draw_single_prim(struct gl_context *ctx, * always flag if the shader uses one of the values. For direct draws, * we only flag if the values change. */ - const int new_basevertex = + const int new_firstvertex = prim->indexed ? prim->basevertex : prim->start; const int new_baseinstance = prim->base_instance; const struct brw_vs_prog_data *vs_prog_data = brw_vs_prog_data(brw->vs.base.prog_data); if (prim_id > 0) { - const bool uses_draw_parameters = + const bool uses_firstvertex = vs_prog_data->uses_basevertex || + vs_prog_data->uses_firstvertex; + + const bool uses_draw_parameters = + uses_firstvertex || vs_prog_data->uses_baseinstance; if ((uses_draw_parameters && prim->is_indirect) || - (vs_prog_data->uses_basevertex && - brw->draw.params.gl_basevertex != new_basevertex) || + (uses_firstvertex && + brw->draw.params.firstvertex != new_firstvertex) || (vs_prog_data->uses_baseinstance && brw->draw.params.gl_baseinstance != new_baseinstance)) brw->ctx.NewDriverState |= BRW_NEW_VERTICES; } - brw->draw.params.gl_basevertex = new_basevertex; + brw->draw.params.firstvertex = new_firstvertex; brw->draw.params.gl_baseinstance = new_baseinstance; brw_bo_unreference(brw->draw.draw_params_bo); diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upl
[Mesa-dev] [PATCH v3 7/8] nir: Offset vertex_id by first_vertex instead of base_vertex
From: Neil Roberts base_vertex will be zero for non-indexed calls and in that case we need vertex_id to be offset by the ‘first’ parameter instead. That is what we get with first_vertex. This is true for both GL and Vulkan. The freedreno driver is also setting vertex_id_zero_based on nir_options. In order to avoid breakage this patch switches the relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can retain the same behavior. v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth). Cc: Rob Clark Cc: Marek Olšák --- src/compiler/nir/nir_lower_system_values.c | 2 +- src/gallium/drivers/freedreno/a3xx/fd3_emit.c| 2 +- src/gallium/drivers/freedreno/a4xx/fd4_emit.c| 2 +- src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 5 ++--- src/intel/vulkan/genX_cmd_buffer.c | 4 src/intel/vulkan/genX_pipeline.c | 4 +--- 6 files changed, 6 insertions(+), 13 deletions(-) diff --git a/src/compiler/nir/nir_lower_system_values.c b/src/compiler/nir/nir_lower_system_values.c index 3594f4ae5ce..6f4fb8233ab 100644 --- a/src/compiler/nir/nir_lower_system_values.c +++ b/src/compiler/nir/nir_lower_system_values.c @@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b) if (b->shader->options->vertex_id_zero_based) { sysval = nir_iadd(b, nir_load_vertex_id_zero_base(b), - nir_load_base_vertex(b)); + nir_load_first_vertex(b)); } else { sysval = nir_load_vertex_id(b); } diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c index b9e1af00e2c..3419ba86d46 100644 --- a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c +++ b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c @@ -374,7 +374,7 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring, struct fd3_emit *emit) continue; if (vp->inputs[i].sysval) { switch(vp->inputs[i].slot) { - case SYSTEM_VALUE_BASE_VERTEX: + case SYSTEM_VALUE_FIRST_VERTEX: /* handled elsewhere */ break; case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c index 5fec2b6b08a..42268ceea71 100644 --- a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c +++ b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c @@ -378,7 +378,7 @@ fd4_emit_vertex_bufs(struct fd_ringbuffer *ring, struct fd4_emit *emit) continue; if (vp->inputs[i].sysval) { switch(vp->inputs[i].slot) { - case SYSTEM_VALUE_BASE_VERTEX: + case SYSTEM_VALUE_FIRST_VERTEX: /* handled elsewhere */ break; case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c index 15a3aa4c802..d3a8dbec14e 100644 --- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c +++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c @@ -2073,11 +2073,10 @@ emit_intrinsic(struct ir3_context *ctx, nir_intrinsic_instr *intr) ctx->ir->outputs[n] = src[i]; } break; - case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_first_vertex: if (!ctx->basevertex) { ctx->basevertex = create_driver_param(ctx, IR3_DP_VTXID_BASE); - add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX, - ctx->basevertex); + add_sysval_input(ctx, SYSTEM_VALUE_FIRST_VERTEX, ctx->basevertex); } dst[0] = ctx->basevertex; break; diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index 9fc281bf4eb..d7dc14f387b 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2224,7 +2224,6 @@ void genX(CmdDraw)( genX(cmd_buffer_flush_state)(cmd_buffer); if (vs_prog_data->uses_firstvertex || - vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance); if (vs_prog_data->uses_drawid) @@ -2264,7 +2263,6 @@ void genX(CmdDrawIndexed)( genX(cmd_buffer_flush_state)(cmd_buffer); if (vs_prog_data->uses_firstvertex || - vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance); if (vs_prog_data->u
[Mesa-dev] [PATCH v3 5/8] spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX
From: Neil Roberts The base vertex in Vulkan is different from GL in that for non-indexed primitives the value is taken from the firstVertex parameter instead of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX instead of BASE_VERTEX. --- src/compiler/spirv/vtn_variables.c | 2 +- src/intel/vulkan/genX_cmd_buffer.c | 16 src/intel/vulkan/genX_pipeline.c | 2 ++ 3 files changed, 15 insertions(+), 5 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index eb306d0c4a8..3e5686af1d9 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1279,7 +1279,7 @@ vtn_get_builtin_location(struct vtn_builder *b, set_mode_system_value(b, mode); break; case SpvBuiltInBaseVertex: - *location = SYSTEM_VALUE_BASE_VERTEX; + *location = SYSTEM_VALUE_FIRST_VERTEX; set_mode_system_value(b, mode); break; case SpvBuiltInBaseInstance: diff --git a/src/intel/vulkan/genX_cmd_buffer.c b/src/intel/vulkan/genX_cmd_buffer.c index c23a54fb7b9..9fc281bf4eb 100644 --- a/src/intel/vulkan/genX_cmd_buffer.c +++ b/src/intel/vulkan/genX_cmd_buffer.c @@ -2223,7 +2223,9 @@ void genX(CmdDraw)( genX(cmd_buffer_flush_state)(cmd_buffer); - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || + vs_prog_data->uses_basevertex || + vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, 0); @@ -2261,7 +2263,9 @@ void genX(CmdDrawIndexed)( genX(cmd_buffer_flush_state)(cmd_buffer); - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || + vs_prog_data->uses_basevertex || + vs_prog_data->uses_baseinstance) emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, 0); @@ -2417,7 +2421,9 @@ void genX(CmdDrawIndirect)( struct anv_bo *bo = buffer->bo; uint32_t bo_offset = buffer->offset + offset; - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || + vs_prog_data->uses_basevertex || + vs_prog_data->uses_baseinstance) emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, i); @@ -2456,7 +2462,9 @@ void genX(CmdDrawIndexedIndirect)( uint32_t bo_offset = buffer->offset + offset; /* TODO: We need to stomp base vertex to 0 somehow */ - if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance) + if (vs_prog_data->uses_firstvertex || + vs_prog_data->uses_basevertex || + vs_prog_data->uses_baseinstance) emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12); if (vs_prog_data->uses_drawid) emit_draw_index(cmd_buffer, i); diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c index 82fdf206a95..5f4cf58b83d 100644 --- a/src/intel/vulkan/genX_pipeline.c +++ b/src/intel/vulkan/genX_pipeline.c @@ -98,6 +98,7 @@ emit_vertex_input(struct anv_pipeline *pipeline, const bool needs_svgs_elem = vs_prog_data->uses_vertexid || vs_prog_data->uses_instanceid || vs_prog_data->uses_basevertex || +vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance; uint32_t elem_count = __builtin_popcount(elements) - @@ -178,6 +179,7 @@ emit_vertex_input(struct anv_pipeline *pipeline, * well. Just do all or nothing. */ uint32_t base_ctrl = (vs_prog_data->uses_basevertex || +vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance) ? VFCOMP_STORE_SRC : VFCOMP_STORE_0; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 6/8] i965: Don't request GLSL IR lowering of gl_VertexID
From: Ian Romanick Let the lowering in NIR handle it instead. This hurts one shader that occurs twice in shader-db (SynMark GSCloth) on IVB and HSW. No other shaders or platforms were affected. total cycles in shared programs: 253438422 -> 253438426 (0.00%) cycles in affected programs: 412 -> 416 (0.97%) helped: 0 HURT: 2 Signed-off-by: Ian Romanick Reviewed-by: Antia Puentes --- src/mesa/drivers/dri/i965/brw_context.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 9ed8bc64bb3..7775468f98a 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -590,7 +590,6 @@ brw_initialize_context_constants(struct brw_context *brw) ctx->Const.QuadsFollowProvokingVertexConvention = false; ctx->Const.NativeIntegers = true; - ctx->Const.VertexID_is_zero_based = true; /* Regarding the CMP instruction, the Ivybridge PRM says: * -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 3/8] intel/compiler: Add a uses_firstvertex flag
From: Neil Roberts Reviewed-by: Kenneth Graunke --- src/intel/compiler/brw_compiler.h | 1 + src/intel/compiler/brw_vec4.cpp | 4 2 files changed, 5 insertions(+) diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_compiler.h index b1086bbcee5..0afe5757945 100644 --- a/src/intel/compiler/brw_compiler.h +++ b/src/intel/compiler/brw_compiler.h @@ -966,6 +966,7 @@ struct brw_vs_prog_data { bool uses_vertexid; bool uses_instanceid; bool uses_basevertex; + bool uses_firstvertex; bool uses_baseinstance; bool uses_drawid; }; diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index ad6d8f9d6bc..36e17d77d47 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2798,6 +2798,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; + if (shader->info.system_values_read & + BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX)) + prog_data->uses_firstvertex = true; + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE)) prog_data->uses_baseinstance = true; -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3 2/8] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics
This VS system value will contain the value passed as for indexed draw calls or the value passed as for non-indexed draw calls. It can be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX. From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays": - Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." - Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this will have to change when the value of gl_BaseVertex is fixed. Currently its value is broken for non-indexed draw calls because it must be zero but we are setting it to . v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth). Reviewed-by: Neil Roberts Reviewed-by: Kenneth Graunke --- src/compiler/nir/nir.c | 4 src/compiler/nir/nir_gather_info.c | 1 + src/compiler/nir/nir_intrinsics.h | 1 + src/compiler/shader_enums.c| 1 + src/compiler/shader_enums.h| 14 ++ 5 files changed, 21 insertions(+) diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c index bdd8960403c..e69c2accbbf 100644 --- a/src/compiler/nir/nir.c +++ b/src/compiler/nir/nir.c @@ -1919,6 +1919,8 @@ nir_intrinsic_from_system_value(gl_system_value val) return nir_intrinsic_load_base_instance; case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: return nir_intrinsic_load_vertex_id_zero_base; + case SYSTEM_VALUE_FIRST_VERTEX: + return nir_intrinsic_load_first_vertex; case SYSTEM_VALUE_BASE_VERTEX: return nir_intrinsic_load_base_vertex; case SYSTEM_VALUE_INVOCATION_ID: @@ -1990,6 +1992,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin) return SYSTEM_VALUE_BASE_INSTANCE; case nir_intrinsic_load_vertex_id_zero_base: return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE; + case nir_intrinsic_load_first_vertex: + return SYSTEM_VALUE_FIRST_VERTEX; case nir_intrinsic_load_base_vertex: return SYSTEM_VALUE_BASE_VERTEX; case nir_intrinsic_load_invocation_id: diff --git a/src/compiler/nir/nir_gather_info.c b/src/compiler/nir/nir_gather_info.c index 946939657ec..555ae77b1d3 100644 --- a/src/compiler/nir/nir_gather_info.c +++ b/src/compiler/nir/nir_gather_info.c @@ -247,6 +247,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, nir_shader *shader) case nir_intrinsic_load_vertex_id: case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_instance_id: case nir_intrinsic_load_sample_id: diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h index ede29277876..7d3421f0e30 100644 --- a/src/compiler/nir/nir_intrinsics.h +++ b/src/compiler/nir/nir_intrinsics.h @@ -333,6 +333,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx) SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx) +SYSTEM_VALUE(first_vertex, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx) SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx) diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c index 2179c475abd..5e123f29f37 100644 --- a/src/compiler/shader_enums.c +++ b/src/compiler/shader_enums.c @@ -214,6 +214,7 @@ gl_system_value_name(gl_system_value sysval) ENUM(SYSTEM_VALUE_INSTANCE_ID), ENUM(SYSTEM_VALUE_INSTANCE_INDEX), ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE), + ENUM(SYSTEM_VALUE_FIRST_VERTEX), ENUM(SYSTEM_VALUE_BASE_VERTEX), ENUM(SYSTEM_VALUE_BASE_INSTANCE), ENUM(SYSTEM_VALUE_DRAW_ID), diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h index ffe551ab20f..9f71194c146 100644 --- a/src/compiler/shader_enums.h +++ b/src/compiler/shader_enums.h @@ -472,6 +472,20 @@ typedef enum */ SYSTEM_VALUE_BASE_VERTEX, + /** +* Depending on the type of the draw call (indexed or non-indexed), +* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and +* similar, or is the value of \c first passed to \c glDrawArrays and +* similar. +* +* \note +* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as +* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c SYSTEM_VALUE_FIRST_VERTEX. +* +* \sa SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, SYSTEM_VALUE_VERT
[Mesa-dev] [PATCH v3 8/8] i965: gl_BaseVertex must be zero for non-indexed draw calls
We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID vertex element. The previous Vertex Elements order was: * VE 1: * VE 2: and now it is: * VE 1: * VE 2: To move the BaseVertex keeping VE1 as it is, allows to keep pointing the vertex buffer associated to VE 1 to the indirect buffer for indirect draw calls. From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification: "gl_BaseVertex holds the integer value passed to the baseVertex parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertex is zero." Fixes CTS tests: * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678 --- src/intel/compiler/brw_nir.c | 14 + src/intel/compiler/brw_vec4.cpp | 14 + src/mesa/drivers/dri/i965/brw_context.h | 32 ++- src/mesa/drivers/dri/i965/brw_draw.c | 45 ++- src/mesa/drivers/dri/i965/brw_draw_upload.c | 24 -- src/mesa/drivers/dri/i965/genX_state_upload.c | 38 +++--- 6 files changed, 105 insertions(+), 62 deletions(-) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index 34b1e44adf0..c10fa73f4fc 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, */ const bool has_sgvs = nir->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | - BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | + (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { - case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_first_vertex: nir_intrinsic_set_component(load, 0); break; @@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_component(load, 3); break; case nir_intrinsic_load_draw_id: - /* gl_DrawID is stored right after gl_VertexID and friends - * if any of them exist. + case nir_intrinsic_load_base_vertex: + /* gl_DrawID and gl_BaseVertex are stored right after + gl_VertexID and friends if any of them exist. */ nir_intrinsic_set_base(load, num_inputs + has_sgvs); - nir_intrinsic_set_component(load, 0); + if (intrin->intrinsic == nir_intrinsic_load_draw_id) + nir_intrinsic_set_component(load, 0); + else + nir_intrinsic_set_component(load, 1); break; default: unreachable("Invalid system value intrinsic"); diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 06c79630119..3b4b3c01b57 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2787,14 +2787,19 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, * incoming vertex attribute. So, add an extra slot. */ if (shader->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | -BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | + (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) { nr_attribute_slots++; } + if (shader->info.system_values_read & + (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | +BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) { + nr_attribute_slots++; + } + if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX)) prog_data->uses_basevertex = true; @@ -2815,12 +2820,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)) prog_data->uses_instanceid = true; - /* gl_DrawID has its very own vec4 */ if (shader->info.system_values_read & - BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) { + BITFIELD64_BIT(SYSTEM_VALUE_DRA
Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
On 25/01/18 18:56, Roland Scheidegger wrote: Am 25.01.2018 um 17:56 schrieb Roland Scheidegger: Am 25.01.2018 um 16:30 schrieb Michel Dänzer: On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote: This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5. --- src/mesa/main/fbobject.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index d23916d1ad7..c72204e11a0 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat) ctx->Extensions.ARB_texture_float) || _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ ) ? GL_RGBA : 0; + case GL_RGB9_E5: + return (_mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_shared_exponent) + ? GL_RGB: 0; case GL_ALPHA16F_ARB: case GL_ALPHA32F_ARB: return ctx->API == API_OPENGL_COMPAT && Unfortunately, this broke the "spec@arb_internalformat_query2@samples and num_sample_counts pname checks" piglit tests with radeonsi and llvmpipe, see below. Any idea what might need to be done in Gallium to fix this? 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}} 32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}} PIGLIT: {"result": "fail" } Purely coincidentally, I was trying to clean up the formatquery code recently (should help some failures with r600 too), and I think these cleanups would fix it. Basically outright say "no" to target/pname combinations which don't make sense rather than trying to find a format suitable for another target and then asking the driver for the nonsense combination, plus some other small bits like not validating things again (sometimes, a third time...). Albeit it will cause some breakage with the piglit test, which I believe is a test error, but that might be open for debate... (For TEXTURE_BUFFER and the internalformat size/type queries, do you return valid values or unsupported? The problem here is ARB_tbo says you can't get these values via the equivalent GetTexLevelParameter queries, whereas with GL 3.1 you can. And internalformat_query2 says it returns "the same information" as GetTexLevelParameter, albeit it's not entirely true in any case since the equivalent of the internalformat stencil type doesn't even exist. My stance would be that valid values should be reported even without GL 3.1, but the piglit test thinks differently.) Err, actually this won't fix it I suppose - because rgb9e5 now is a valid fbo format. Was that commit really correct? It does not make sense to me, rgb9e5 cannot be a fbo/renderable format. Or was this just working around issues in formatquery.c (which I try to address with this patch)? I believe the commit is wrong, _mesa_base_fbo_format_ is used to validate the internalformat passed to RenderbufferStorage, which in the OpenGL 4.6 is said that needs to be: An INVALID_ENUM error is generated if internalformat is not one of the color-renderable, depth-renderable, or stencil-renderable formats defined in section 9.4. and that's exactly the error returned by the
[Mesa-dev] [PATCH] Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format"
This reverts commit 513c2263cbff45edb105c7b46e58f316e06746ab. _mesa_base_fbo_format_ is used to validate the internalformat passed to RenderbufferStorage, which in the OpenGL 4.6 is said: "An INVALID_ENUM error is generated if internalformat is not one of the color-renderable, depth-renderable, or stencil-renderable formats defined in section 9.4." RGB9_E5 format is not renderable, as stated in the same specification (Bug 9338). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794 Cc: Juan A. Suarez Romero Cc: Kenneth Graunke --- src/mesa/main/fbobject.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index c72204e11a0..d23916d1ad7 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1976,9 +1976,6 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat) ctx->Extensions.ARB_texture_float) || _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ ) ? GL_RGBA : 0; - case GL_RGB9_E5: - return (_mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_shared_exponent) - ? GL_RGB: 0; case GL_ALPHA16F_ARB: case GL_ALPHA32F_ARB: return ctx->API == API_OPENGL_COMPAT && -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
Hi Roland, On 26/01/18 13:57, Roland Scheidegger wrote: Am 26.01.2018 um 11:31 schrieb Antia Puentes: On 25/01/18 18:56, Roland Scheidegger wrote: Am 25.01.2018 um 17:56 schrieb Roland Scheidegger: Am 25.01.2018 um 16:30 schrieb Michel Dänzer: On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote: This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5. --- src/mesa/main/fbobject.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index d23916d1ad7..c72204e11a0 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat) ctx->Extensions.ARB_texture_float) || _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ ) ? GL_RGBA : 0; + case GL_RGB9_E5: + return (_mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_shared_exponent) + ? GL_RGB: 0; case GL_ALPHA16F_ARB: case GL_ALPHA32F_ARB: return ctx->API == API_OPENGL_COMPAT && Unfortunately, this broke the "spec@arb_internalformat_query2@samples and num_sample_counts pname checks" piglit tests with radeonsi and llvmpipe, see below. Any idea what might need to be done in Gallium to fix this? 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}} 32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}} PIGLIT: {"result": "fail" } Purely coincidentally, I was trying to clean up the formatquery code recently (should help some failures with r600 too), and I think these cleanups would fix it. Basically outright say "no" to target/pname combinations which don't make sense rather than trying to find a format suitable for another target and then asking the driver for the nonsense combination, plus some other small bits like not validating things again (sometimes, a third time...). Albeit it will cause some breakage with the piglit test, which I believe is a test error, but that might be open for debate... (For TEXTURE_BUFFER and the internalformat size/type queries, do you return valid values or unsupported? The problem here is ARB_tbo says you can't get these values via the equivalent GetTexLevelParameter queries, whereas with GL 3.1 you can. And internalformat_query2 says it returns "the same information" as GetTexLevelParameter, albeit it's not entirely true in any case since the equivalent of the internalformat stencil type doesn't even exist. My stance would be that valid values should be reported even without GL 3.1, but the piglit test thinks differently.) Err, actually this won't fix it I suppose - because rgb9e5 now is a valid fbo format. Was that commit really correct? It does not make sense to me, rgb9e5 cannot be a fbo/renderable format. Or was this just working around issues in formatquery.c (which I try to address with this patch)? I believe the commit is wrong, _mesa_base_fbo_format_ is used to validate the internalformat passed to RenderbufferStora
Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format
On 26/01/18 14:19, Roland Scheidegger wrote: Am 26.01.2018 um 14:13 schrieb Antia Puentes: Hi Roland, On 26/01/18 13:57, Roland Scheidegger wrote: Am 26.01.2018 um 11:31 schrieb Antia Puentes: On 25/01/18 18:56, Roland Scheidegger wrote: Am 25.01.2018 um 17:56 schrieb Roland Scheidegger: Am 25.01.2018 um 16:30 schrieb Michel Dänzer: On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote: This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5. --- src/mesa/main/fbobject.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c index d23916d1ad7..c72204e11a0 100644 --- a/src/mesa/main/fbobject.c +++ b/src/mesa/main/fbobject.c @@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, GLenum internalFormat) ctx->Extensions.ARB_texture_float) || _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ ) ? GL_RGBA : 0; + case GL_RGB9_E5: + return (_mesa_is_desktop_gl(ctx) && ctx->Extensions.EXT_texture_shared_exponent) + ? GL_RGB: 0; case GL_ALPHA16F_ARB: case GL_ALPHA32F_ARB: return ctx->API == API_OPENGL_COMPAT && Unfortunately, this broke the "spec@arb_internalformat_query2@samples and num_sample_counts pname checks" piglit tests with radeonsi and llvmpipe, see below. Any idea what might need to be done in Gallium to fix this? 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}} 32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 32 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 64 bit failing case: pname = GL_SAMPLES, target = GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1 PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}} PIGLIT: {"result": "fail" } Purely coincidentally, I was trying to clean up the formatquery code recently (should help some failures with r600 too), and I think these cleanups would fix it. Basically outright say "no" to target/pname combinations which don't make sense rather than trying to find a format suitable for another target and then asking the driver for the nonsense combination, plus some other small bits like not validating things again (sometimes, a third time...). Albeit it will cause some breakage with the piglit test, which I believe is a test error, but that might be open for debate... (For TEXTURE_BUFFER and the internalformat size/type queries, do you return valid values or unsupported? The problem here is ARB_tbo says you can't get these values via the equivalent GetTexLevelParameter queries, whereas with GL 3.1 you can. And internalformat_query2 says it returns "the same information" as GetTexLevelParameter, albeit it's not entirely true in any case since the equivalent of the internalformat stencil type doesn't even exist. My stance would be that valid values should be reported even without GL 3.1, but the piglit test thinks differently.) Err, actually this won't fix it I suppose - because rgb9e5 now is a valid fbo format. Was that commit really correct? It does not make sense to me, rgb9e5 cannot be a fbo/renderable format. Or was this just working around issues in formatquery.c (which I try to address with this patch)? I beli
Re: [Mesa-dev] [PATCH] i965: perform 2 uploads with dual slot *64*PASSTHRU formats on gen<8
Hi, the patch looks good to me. Some small comments: On 29/01/18 01:26, Andres Gomez wrote: The emission of vertex attributes corresponding to dvec3 and dvec4 vertex shader input variables was not correct when the passed to the VertexAttribL* commands was <= 2. In 61a8a55f557 ("i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs"), for gen8+ we needed to determine if the attrib was dual slot to emit 128 or 256-bit, independently of the VAO size. Similarly, for gen < 8 we also need to determine whether the attrib is dual slot to force the emission of 265-bits through 2 uploads. ^^^ 256-bits Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this second upload to fill these unspecified components with zeros, as we also do for gen8+. Fixes the following test on Haswell: KHR-GL46.vertex_attrib_binding.basic-inputL-case1 Fixes: 75968a668e4 ("i965/gen7: expose OpenGL 4.2 on Haswell when supported") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006 Cc: Alejandro Piñeiro Cc: Juan A. Suarez Romero Cc: Antia Puentes Cc: Rafael Antognolli Cc: Kenneth Graunke Signed-off-by: Andres Gomez --- src/mesa/drivers/dri/i965/genX_state_upload.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index aa4d64d08e2..d4c89c05b41 100644 --- a/src/mesa/drivers/dri/i965/genX_state_upload.c +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -364,11 +364,15 @@ is_passthru_format(uint32_t format) } UNUSED static int -uploads_needed(uint32_t format) +uploads_needed(uint32_t format, + bool is_dual_slot) { if (!is_passthru_format(format)) return 1; + if (is_dual_slot) + return 2; + switch (format) { case ISL_FORMAT_R64_PASSTHRU: case ISL_FORMAT_R64G64_PASSTHRU: @@ -399,9 +403,11 @@ downsize_format_if_needed(uint32_t format, switch (format) { case ISL_FORMAT_R64_PASSTHRU: - return ISL_FORMAT_R32G32_FLOAT; + return !upload ? ISL_FORMAT_R32G32_FLOAT + : ISL_FORMAT_R32_FLOAT; case ISL_FORMAT_R64G64_PASSTHRU: - return ISL_FORMAT_R32G32B32A32_FLOAT; + return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT + : ISL_FORMAT_R32_FLOAT; case ISL_FORMAT_R64G64B64_PASSTHRU: return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT : ISL_FORMAT_R32G32_FLOAT; @@ -420,6 +426,8 @@ static int upload_format_size(uint32_t upload_format) { switch (upload_format) { + case ISL_FORMAT_R32_FLOAT: + return 0; I agree with Piñeiro about at least adding a comment here. case ISL_FORMAT_R32G32_FLOAT: return 2; case ISL_FORMAT_R32G32B32A32_FLOAT: @@ -517,7 +525,7 @@ genX(emit_vertices)(struct brw_context *brw) struct brw_vertex_element *input = brw->vb.enabled[i]; uint32_t format = brw_get_vertex_surface_type(brw, input->glarray); - if (uploads_needed(format) > 1) + if (uploads_needed(format, input->is_dual_slot) > 1) nr_elements++; } #endif @@ -613,7 +621,8 @@ genX(emit_vertices)(struct brw_context *brw) uint32_t comp1 = VFCOMP_STORE_SRC; uint32_t comp2 = VFCOMP_STORE_SRC; uint32_t comp3 = VFCOMP_STORE_SRC; - const unsigned num_uploads = GEN_GEN < 8 ? uploads_needed(format) : 1; + const unsigned num_uploads = GEN_GEN < 8 ? +uploads_needed(format, input->is_dual_slot) : 1; #if GEN_GEN >= 8 /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE): Thanks for fixing this. Reviewed-by: Antia Puentes ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 2/8] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics
This series is still waiting for review. On 25/01/18 19:15, Antia Puentes wrote: This VS system value will contain the value passed as for indexed draw calls or the value passed as for non-indexed draw calls. It can be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX. From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays": - Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." - Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this will have to change when the value of gl_BaseVertex is fixed. Currently its value is broken for non-indexed draw calls because it must be zero but we are setting it to . v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth). Reviewed-by: Neil Roberts Reviewed-by: Kenneth Graunke --- src/compiler/nir/nir.c | 4 src/compiler/nir/nir_gather_info.c | 1 + src/compiler/nir/nir_intrinsics.h | 1 + src/compiler/shader_enums.c| 1 + src/compiler/shader_enums.h| 14 ++ 5 files changed, 21 insertions(+) diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c index bdd8960403c..e69c2accbbf 100644 --- a/src/compiler/nir/nir.c +++ b/src/compiler/nir/nir.c @@ -1919,6 +1919,8 @@ nir_intrinsic_from_system_value(gl_system_value val) return nir_intrinsic_load_base_instance; case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE: return nir_intrinsic_load_vertex_id_zero_base; + case SYSTEM_VALUE_FIRST_VERTEX: + return nir_intrinsic_load_first_vertex; case SYSTEM_VALUE_BASE_VERTEX: return nir_intrinsic_load_base_vertex; case SYSTEM_VALUE_INVOCATION_ID: @@ -1990,6 +1992,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin) return SYSTEM_VALUE_BASE_INSTANCE; case nir_intrinsic_load_vertex_id_zero_base: return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE; + case nir_intrinsic_load_first_vertex: + return SYSTEM_VALUE_FIRST_VERTEX; case nir_intrinsic_load_base_vertex: return SYSTEM_VALUE_BASE_VERTEX; case nir_intrinsic_load_invocation_id: diff --git a/src/compiler/nir/nir_gather_info.c b/src/compiler/nir/nir_gather_info.c index 946939657ec..555ae77b1d3 100644 --- a/src/compiler/nir/nir_gather_info.c +++ b/src/compiler/nir/nir_gather_info.c @@ -247,6 +247,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, nir_shader *shader) case nir_intrinsic_load_vertex_id: case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_instance_id: case nir_intrinsic_load_sample_id: diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h index ede29277876..7d3421f0e30 100644 --- a/src/compiler/nir/nir_intrinsics.h +++ b/src/compiler/nir/nir_intrinsics.h @@ -333,6 +333,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx) SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx) +SYSTEM_VALUE(first_vertex, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx) SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx) SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx) diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c index 2179c475abd..5e123f29f37 100644 --- a/src/compiler/shader_enums.c +++ b/src/compiler/shader_enums.c @@ -214,6 +214,7 @@ gl_system_value_name(gl_system_value sysval) ENUM(SYSTEM_VALUE_INSTANCE_ID), ENUM(SYSTEM_VALUE_INSTANCE_INDEX), ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE), + ENUM(SYSTEM_VALUE_FIRST_VERTEX), ENUM(SYSTEM_VALUE_BASE_VERTEX), ENUM(SYSTEM_VALUE_BASE_INSTANCE), ENUM(SYSTEM_VALUE_DRAW_ID), diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h index ffe551ab20f..9f71194c146 100644 --- a/src/compiler/shader_enums.h +++ b/src/compiler/shader_enums.h @@ -472,6 +472,20 @@ typedef enum */ SYSTEM_VALUE_BASE_VERTEX, + /** +* Depending on the type of the draw call (indexed or non-indexed), +* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and +* similar, or is the value of \c first passed to \c glDrawArrays and +* similar. +* +* \note +* It can be used to calculate the \c SYSTE
[Mesa-dev] [PATCH 05/15] i965/fs: shuffle 32bits into 64bits for doubles
From: "Juan A. Suarez Romero" VS Thread Payload handles attributes in URB as vec4, no matter if they are actually single or double precision. So with double-precision types, value ends up in the registers split in 32bits chunks, in different positions. We need to shuffle the chunks to get the doubles correctly. --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 0ff3eaf..4362308 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -3173,6 +3173,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr for (unsigned j = 0; j < instr->num_components; j++) { bld.MOV(offset(dest, bld, j), offset(src, bld, j)); } + if (type_sz(src.type) == 8) + SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA(bld, + offset(dest, bld, 0), + offset(dest, bld, 0), + instr->num_components); + break; } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/15] i965/fs: calculate first non-payload GRF using attrib slots
From: "Juan A. Suarez Romero" When computing where the first non-payload GRF starts, we can't rely on the number of attributes, as each attribute can be using 1 or 2 slots depending on whether they are a dvec3/4 or other. Instead, we need to use the number of slots used by the attributes. --- src/mesa/drivers/dri/i965/brw_compiler.h | 1 + src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h index bc15bf3..51cf850 100644 --- a/src/mesa/drivers/dri/i965/brw_compiler.h +++ b/src/mesa/drivers/dri/i965/brw_compiler.h @@ -615,6 +615,7 @@ struct brw_vs_prog_data { GLbitfield64 inputs_read; unsigned nr_attributes; + unsigned nr_attribute_slots; bool uses_vertexid; bool uses_instanceid; diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 98cbf9f..5e20ef9 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1885,7 +1885,7 @@ fs_visitor::assign_vs_urb_setup() assert(stage == MESA_SHADER_VERTEX); /* Each attribute is 4 regs. */ - this->first_non_payload_grf += 4 * vs_prog_data->nr_attributes; + this->first_non_payload_grf += 4 * vs_prog_data->nr_attribute_slots; assert(vs_prog_data->base.urb_read_length <= 15); diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 005760d..9816f0d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -2149,6 +2149,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2); prog_data->nr_attributes = nr_attributes; + prog_data->nr_attribute_slots = nr_attribute_slots; /* Since vertex shaders reuse the same VUE entry for inputs and outputs * (overwriting the original contents), we need to make sure the size is -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/15] nir: add double input bitmap
From: "Juan A. Suarez Romero" This bitmap tracks which input attributes are double-precision. --- src/compiler/nir/glsl_to_nir.cpp | 1 + src/compiler/nir/nir.h | 2 ++ 2 files changed, 3 insertions(+) diff --git a/src/compiler/nir/glsl_to_nir.cpp b/src/compiler/nir/glsl_to_nir.cpp index fafa8bb..b6d25cd 100644 --- a/src/compiler/nir/glsl_to_nir.cpp +++ b/src/compiler/nir/glsl_to_nir.cpp @@ -154,6 +154,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog, shader->info.num_ssbos = sh->NumShaderStorageBlocks; shader->info.num_images = sh->NumImages; shader->info.inputs_read = sh->Program->InputsRead; + shader->info.double_inputs_read = sh->Program->DoubleInputsRead; shader->info.outputs_written = sh->Program->OutputsWritten; shader->info.patch_inputs_read = sh->Program->PatchInputsRead; shader->info.patch_outputs_written = sh->Program->PatchOutputsWritten; diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index a3ac57d..4f773f1 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -1693,6 +1693,8 @@ typedef struct nir_shader_info { /* Which inputs are actually read */ uint64_t inputs_read; + /* Which inputs are actually read and are double */ + uint64_t double_inputs_read; /* Which outputs are actually written */ uint64_t outputs_written; /* Which system values are actually read */ -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/15] i965: passthru formats cannot be used width edge flag enabled
From: Alejandro Piñeiro Add an assertion to detect this case. --- src/mesa/drivers/dri/i965/gen8_draw_upload.c | 20 1 file changed, 20 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/drivers/dri/i965/gen8_draw_upload.c index 14f7091..17a0a83 100644 --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c @@ -34,6 +34,20 @@ #include "intel_batchbuffer.h" #include "intel_buffer_objects.h" +static bool +is_passthru_format(uint32_t format) +{ + switch (format) { + case BRW_SURFACEFORMAT_R64_PASSTHRU: + case BRW_SURFACEFORMAT_R64G64_PASSTHRU: + case BRW_SURFACEFORMAT_R64G64B64_PASSTHRU: + case BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU: + return true; + default: + return false; + } +} + static void gen8_emit_vertices(struct brw_context *brw) { @@ -193,6 +207,12 @@ gen8_emit_vertices(struct brw_context *brw) uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC; uint32_t comp3 = BRW_VE1_COMPONENT_STORE_SRC; + /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE): + * "Any SourceElementFormat of *64*_PASSTHRU cannot be used with an + * element which has edge flag enabled." + */ + assert(!(is_passthru_format(format) && uses_edge_flag)); + /* The gen4 driver expects edgeflag to come in as a float, and passes * that float on to the tests in the clipper. Mesa's current vertex * attribute value for EdgeFlag is stored as a float, which works out. -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/15] i965/vec4: use attribute slots to calculate URB read length
From: "Juan A. Suarez Romero" Do not use total attributes because a dvec3/dvec4 attribute requires two slots. So rather use total attribute slots. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 ++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 4766be0..005760d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -2107,6 +2107,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, const unsigned *assembly = NULL; unsigned nr_attributes = _mesa_bitcount_64(prog_data->inputs_read); + unsigned nr_attribute_slots = 0; /* gl_VertexID and gl_InstanceID are system values, but arrive via an * incoming vertex attribute. So, add an extra slot. @@ -2117,11 +2118,23 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) { nr_attributes++; + nr_attribute_slots++; } /* gl_DrawID has its very own vec4 */ if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) { nr_attributes++; + nr_attribute_slots++; + } + + GLbitfield64 processed_attributes = 0; + foreach_list_typed(nir_variable, var, node, &shader->inputs) { + /* Only interested in values not already processed */ + if (processed_attributes & BITFIELD64_BIT(var->data.location)) + continue; + + nr_attribute_slots += glsl_count_attribute_slots(var->type, false); + processed_attributes |= BITFIELD64_BIT(var->data.location); } /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry @@ -2129,9 +2142,11 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, * vec4 mode, the hardware appears to wedge unless we read something. */ if (is_scalar) - prog_data->base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2); + prog_data->base.urb_read_length = + DIV_ROUND_UP(nr_attribute_slots, 2); else - prog_data->base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 2); + prog_data->base.urb_read_length = + DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2); prog_data->nr_attributes = nr_attributes; @@ -2140,7 +2155,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, * the larger of the two. */ const unsigned vue_entries = - MAX2(nr_attributes, (unsigned)prog_data->base.vue_map.num_slots); + MAX2(nr_attribute_slots, (unsigned)prog_data->base.vue_map.num_slots); if (compiler->devinfo->gen == 6) prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8); -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 13/15] i965: Enable ARB_vertex_attrib_64bit for gen8+
From: Alejandro Piñeiro --- src/mesa/drivers/dri/i965/intel_extensions.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 412dea0..d3905d0 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -371,6 +371,7 @@ intelInitExtensions(struct gl_context *ctx) if (brw->gen >= 8) { ctx->Extensions.ARB_stencil_texturing = true; ctx->Extensions.ARB_gpu_shader_fp64 = true; + ctx->Extensions.ARB_vertex_attrib_64bit = true; } if (brw->gen >= 9) { -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/15] i965: abort linking if we exhaust the registers
From: "Juan A. Suarez Romero" Even when the number of vertex attributes is under the limit, for shaders that use a high number of them, we can quickly exhaust the number of hardware registers. In this case, just abort the linking. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 4b8835d..387a266 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1857,6 +1857,12 @@ fs_visitor::convert_attr_sources_to_hw_regs(fs_inst *inst) inst->src[i].nr + inst->src[i].reg_offset; + if (grf >= 128) { +fail("Failure to register allocate. Reduce the number of " + "vertex input attributes to avoid this."); +return; + } + unsigned exec_size; /* As explained at brw_reg_from_fs_reg, From the Haswell PRM: * -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/15] i965: get the proper vertex surface type for doubles on gen8+
From: Alejandro Piñeiro This commit adds support for PASSTHRU format when pushing double-precision attributes. Check glarray->Doubles in order to know if we should choose a format that does a conversion to float, or just passthru the 64-bit double. --- src/mesa/drivers/dri/i965/brw_draw_upload.c | 30 ++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c index 58e0516..5af4583 100644 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c @@ -37,7 +37,7 @@ #include "intel_batchbuffer.h" #include "intel_buffer_objects.h" -static const GLuint double_types[5] = { +static const GLuint double_types_float[5] = { 0, BRW_SURFACEFORMAT_R64_FLOAT, BRW_SURFACEFORMAT_R64G64_FLOAT, @@ -45,6 +45,14 @@ static const GLuint double_types[5] = { BRW_SURFACEFORMAT_R64G64B64A64_FLOAT }; +static const GLuint double_types_passthru[5] = { + 0, + BRW_SURFACEFORMAT_R64_PASSTHRU, + BRW_SURFACEFORMAT_R64G64_PASSTHRU, + BRW_SURFACEFORMAT_R64G64B64_PASSTHRU, + BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU +}; + static const GLuint float_types[5] = { 0, BRW_SURFACEFORMAT_R32_FLOAT, @@ -213,6 +221,22 @@ static const GLuint byte_types_scale[5] = { BRW_SURFACEFORMAT_R8G8B8A8_SSCALED }; +static GLuint +double_types(struct brw_context *brw, + int size, + GLboolean doubles) +{ + /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE): +* "When SourceElementFormat is set to one of the *64*_PASSTHRU formats, +* 64-bit components are stored in the URB without any conversion." +* Also included on BDW PRM, Volume 7, page 470, table "Source Element +* Formats Supported in VF Unit" +* Previous PRMs don't include those references. +*/ + return (brw->gen >= 8 && doubles + ? double_types_passthru[size] + : double_types_float[size]); +} /** * Given vertex array type/size/format/normalized info, return @@ -245,7 +269,7 @@ brw_get_vertex_surface_type(struct brw_context *brw, return BRW_SURFACEFORMAT_R11G11B10_FLOAT; } else if (glarray->Normalized) { switch (glarray->Type) { - case GL_DOUBLE: return double_types[size]; + case GL_DOUBLE: return double_types(brw, size, glarray->Doubles); case GL_FLOAT: return float_types[size]; case GL_HALF_FLOAT: return half_float_types[size]; case GL_INT: return int_types_norm[size]; @@ -319,7 +343,7 @@ brw_get_vertex_surface_type(struct brw_context *brw, } assert(glarray->Format == GL_RGBA); /* sanity check */ switch (glarray->Type) { - case GL_DOUBLE: return double_types[size]; + case GL_DOUBLE: return double_types(brw, size, glarray->Doubles); case GL_FLOAT: return float_types[size]; case GL_HALF_FLOAT: return half_float_types[size]; case GL_INT: return int_types_scale[size]; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 15/15] i965: Expose OpenGL 4.1 for gen8+
From: Alejandro Piñeiro ARB_vertex_attrib_64bit was the only feature missing. --- src/mesa/drivers/dri/i965/intel_extensions.c | 2 +- src/mesa/drivers/dri/i965/intel_screen.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index d3905d0..0361d42 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -268,7 +268,7 @@ intelInitExtensions(struct gl_context *ctx) ctx->Extensions.OES_texture_half_float_linear = true; if (brw->gen >= 8) - ctx->Const.GLSLVersion = 400; + ctx->Const.GLSLVersion = 410; else if (brw->gen >= 6) ctx->Const.GLSLVersion = 330; else diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 6f685b1..f2ec450 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1386,7 +1386,7 @@ set_max_gl_versions(struct intel_screen *screen) switch (screen->devinfo->gen) { case 9: case 8: - psp->max_gl_core_version = 40; + psp->max_gl_core_version = 41; psp->max_gl_compat_version = 30; psp->max_gl_es1_version = 11; psp->max_gl_es2_version = 31; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 00/15] Add ARB_vertex_attrib_64bit for i965 scalar (gen8+)
Hello, the following series adds the implementation for the ARB_vertex_attrib_64bit extension on the i965 scalar backend (gen8+). It is the result of working on https://bugs.freedesktop.org/show_bug.cgi?id=94442. As this work depends on the ARB_gpu_shader_fp64 i965 functionality [1], which is work in progress, the aim of sending the series now is to get early feedback and parallelise the review process. The series applies on top of the current version of the https://github.com/Igalia/mesa/tree/i965-fp64 branch, which contains the last fp64 work for gen8. A frozen version of the fp64 branch + the series is available in our repo: $ git clone -b i965-attrib64-v1 https://github.com/Igalia/mesa.git [1] https://bugs.freedesktop.org/show_bug.cgi?id=92760 Alejandro Piñeiro (6): i965: get the proper vertex surface type for doubles on gen8+ i965: passthru formats cannot be used width edge flag enabled i965/fs: half exec_size when dealing with 64 bits attributes i965: Enable ARB_vertex_attrib_64bit for gen8+ docs: Mark ARB_vertex_attrib_64bit as done for i965 i965: Expose OpenGL 4.1 for gen8+ Antia Puentes (1): i965: Configure how to store *64*PASSTHRU vertex components Juan A. Suarez Romero (8): i965/fs: shuffle 32bits into 64bits for doubles nir: add double input bitmap i965: take care of doubles when remapping VS attributes i965/vec4: use attribute slots to calculate URB read length i965/fs: calculate first non-payload GRF using attrib slots i965: take care of doubles when lowering VS inputs i965: abort linking if we exhaust the registers i965: abort linking if URB read length greater than 15 docs/GL3.txt | 2 +- src/compiler/nir/glsl_to_nir.cpp | 1 + src/compiler/nir/nir.h | 2 ++ src/mesa/drivers/dri/i965/brw_compiler.h | 1 + src/mesa/drivers/dri/i965/brw_draw_upload.c | 30 ++-- src/mesa/drivers/dri/i965/brw_fs.cpp | 44 +-- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 src/mesa/drivers/dri/i965/brw_nir.c | 29 +++ src/mesa/drivers/dri/i965/brw_shader.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 30 ++-- src/mesa/drivers/dri/i965/gen8_draw_upload.c | 54 src/mesa/drivers/dri/i965/intel_extensions.c | 3 +- src/mesa/drivers/dri/i965/intel_screen.c | 2 +- 13 files changed, 177 insertions(+), 28 deletions(-) -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 12/15] i965: abort linking if URB read length greater than 15
From: "Juan A. Suarez Romero" In scalar mode, URB read length limit is 15. Abort if we go beyond it. --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 9816f0d..04287fb 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -2148,6 +2148,14 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, prog_data->base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2); + if (is_scalar && prog_data->base.urb_read_length > 15) { + if (error_str) + *error_str = ralloc_strdup(mem_ctx, +"Too many attributes. Try to reduce the " +"number of attributes or their size"); + return NULL; + } + prog_data->nr_attributes = nr_attributes; prog_data->nr_attribute_slots = nr_attribute_slots; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 14/15] docs: Mark ARB_vertex_attrib_64bit as done for i965
From: Alejandro Piñeiro --- docs/GL3.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 95ee508..3f1f7a3 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -142,7 +142,7 @@ GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi GL_ARB_get_program_binary DONE (0 binary formats) GL_ARB_separate_shader_objectsDONE (all drivers) GL_ARB_shader_precision DONE (all drivers that support GLSL 4.10) - GL_ARB_vertex_attrib_64bitDONE (llvmpipe, softpipe) + GL_ARB_vertex_attrib_64bitDONE (i965, llvmpipe, softpipe) GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe) -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/15] i965/fs: half exec_size when dealing with 64 bits attributes
From: Alejandro Piñeiro The HW has a restriction that only vertical stride may cross register boundaries. Until now this was only handled on VGRFs at rw_reg_from_fs_reg, but it is also needed for attributes. This can be seen as the equivalent of commit 552cfa9 but for attributes. Take a look to the git message on that commit for further info. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 23 +-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index ce84d0a..98cbf9f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1844,11 +1844,30 @@ fs_visitor::convert_attr_sources_to_hw_regs(fs_inst *inst) inst->src[i].nr + inst->src[i].reg_offset; - unsigned width = inst->src[i].stride == 0 ? 1 : inst->exec_size; + unsigned exec_size; + /* As explained at brw_reg_from_fs_reg, From the Haswell PRM: + * + * VertStride must be used to cross GRF register boundaries. This + * rule implies that elements within a 'Width' cannot cross GRF + * boundaries. + * + * So, for registers that are large enough, we have to split the exec + * size in two and trust the compression state to sort it out. + */ + unsigned total_size = inst->exec_size * + inst->src[i].stride * + type_sz(inst->src[i].type); + if (total_size <= 32) { +exec_size = inst->exec_size; + } else { +assert(total_size / 2 <= 32); +exec_size = inst->exec_size / 2; + } + unsigned width = inst->src[i].stride == 0 ? 1 : exec_size; struct brw_reg reg = stride(byte_offset(retype(brw_vec8_grf(grf, 0), inst->src[i].type), inst->src[i].subreg_offset), - inst->exec_size * inst->src[i].stride, + exec_size * inst->src[i].stride, width, inst->src[i].stride); reg.abs = inst->src[i].abs; reg.negate = inst->src[i].negate; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/15] i965: take care of doubles when remapping VS attributes
From: "Juan A. Suarez Romero" Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots for anything else. --- src/mesa/drivers/dri/i965/brw_nir.c | 26 +++--- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 42cfbaa..1d14437 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -103,7 +103,7 @@ add_const_offset_to_base(nir_shader *nir, nir_variable_mode mode) static bool remap_vs_attrs(nir_block *block, void *closure) { - GLbitfield64 inputs_read = *((GLbitfield64 *) closure); + struct nir_shader_info *nir_info = (struct nir_shader_info *) closure; nir_foreach_instr(block, instr) { if (instr->type != nir_instr_type_intrinsic) @@ -118,9 +118,11 @@ remap_vs_attrs(nir_block *block, void *closure) * before it and counting the bits. */ int attr = intrin->const_index[0]; - int slot = _mesa_bitcount_64(inputs_read & BITFIELD64_MASK(attr)); - - intrin->const_index[0] = 4 * slot; + int slot = _mesa_bitcount_64(nir_info->inputs_read & + BITFIELD64_MASK(attr)); + int dslot = _mesa_bitcount_64(nir_info->double_inputs_read & + BITFIELD64_MASK(attr)); + intrin->const_index[0] = 4 * (slot + dslot); } } return true; @@ -214,9 +216,9 @@ brw_nir_lower_vs_inputs(nir_shader *nir, var->data.driver_location = var->data.location; } - /* Now use nir_lower_io to walk dereference chains. Attribute arrays -* are loaded as one vec4 per element (or matrix column), so we use -* type_size_vec4 here. + /* Now use nir_lower_io to walk dereference chains. Attribute arrays are +* loaded as one vec4 or dvec4 per element (or matrix column), depending on +* whether it is a double-precision type or not. */ nir_lower_io(nir, nir_var_shader_in, type_size_vec4); @@ -229,17 +231,11 @@ brw_nir_lower_vs_inputs(nir_shader *nir, vs_attrib_wa_flags); if (is_scalar) { - /* Finally, translate VERT_ATTRIB_* values into the actual registers. - * - * Note that we can use nir->info.inputs_read instead of - * key->inputs_read since the two are identical aside from Gen4-5 - * edge flag differences. - */ - GLbitfield64 inputs_read = nir->info.inputs_read; + /* Finally, translate VERT_ATTRIB_* values into the actual registers. */ nir_foreach_function(nir, function) { if (function->impl) { -nir_foreach_block_call(function->impl, remap_vs_attrs, &inputs_read); +nir_foreach_block_call(function->impl, remap_vs_attrs, &nir->info); } } } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/15] i965: Configure how to store *64*PASSTHRU vertex components
From the Broadwell specification, structure VERTEX_ELEMENT_STATE description: "When SourceElementFormat is set to one of the *64*_PASSTHRU formats, 64-bit components are stored in the URB without any conversion. In this case, vertex elements must be written as 128 or 256 bits, with VFCOMP_STORE_0 being used to pad the output as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component into the URB, Component 1 must be specified as VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element. Likewise, use of R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element." Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for dvec3 and dvec4 vertex elements. Signed-off-by: Juan A. Suarez Romero Signed-off-by: Antia Puentes --- src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 1 file changed, 34 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/drivers/dri/i965/gen8_draw_upload.c index fe5ed35..14f7091 100644 --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c @@ -217,6 +217,40 @@ gen8_emit_vertices(struct brw_context *brw) break; } + /* From the BDW PRM, Volume 2d, page 586 (VERTEX_ELEMENT_STATE): + * + * "When SourceElementFormat is set to one of the *64*_PASSTHRU + * formats, 64-bit components are stored in the URB without any + * conversion. In this case, vertex elements must be written as 128 + * or 256 bits, with VFCOMP_STORE_0 being used to pad the output + * as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red + * component into the URB, Component 1 must be specified as + * VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) + * in order to output a 128-bit vertex element, or Components 1-3 must + * be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex + * element. Likewise, use of R64G64B64_PASSTHRU requires Component 3 + * to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex + * element." + */ + if (input->glarray->Doubles) { + switch (input->glarray->Size) { + case 0: + case 1: + case 2: +/* Use 128-bits instead of 256-bits to write double and dvec2 + * vertex elements. + */ +comp2 = BRW_VE1_COMPONENT_NOSTORE; +comp3 = BRW_VE1_COMPONENT_NOSTORE; +break; + case 3: +/* Pad the output using VFCOMP_STORE_0 as suggested + * by the BDW PRM. + */ +comp3 = BRW_VE1_COMPONENT_STORE_0; + } + } + OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) | GEN6_VE0_VALID | (format << BRW_VE0_FORMAT_SHIFT) | -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/15] i965: take care of doubles when lowering VS inputs
From: "Juan A. Suarez Romero" Input attributes can require 2 vec4 or 1 vec4 depending on whether they are double-precision or not. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 13 + src/mesa/drivers/dri/i965/brw_nir.c| 3 ++- src/mesa/drivers/dri/i965/brw_shader.h | 1 + 3 files changed, 16 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 5e20ef9..4b8835d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -645,6 +645,19 @@ type_size_vec4_times_4(const struct glsl_type *type) return 4 * type_size_vec4(type); } +/* Attribute arrays are loaded as one vec4 per element (or matrix column), + * except for double-precision types, which are loaded as one dvec4. + */ +extern "C" int +type_size_vs_input(const struct glsl_type *type) +{ + if (type->is_double()) { + return type_size_vec4(type) / 2; + } else { + return type_size_vec4(type); + } +} + /** * Create a MOV to read the timestamp register. * diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 1d14437..8b7cd8e 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -23,6 +23,7 @@ #include "brw_nir.h" #include "brw_shader.h" +#include "compiler/glsl_types.h" #include "compiler/nir/glsl_to_nir.h" #include "compiler/nir/nir_builder.h" #include "program/prog_to_nir.h" @@ -220,7 +221,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, * loaded as one vec4 or dvec4 per element (or matrix column), depending on * whether it is a double-precision type or not. */ - nir_lower_io(nir, nir_var_shader_in, type_size_vec4); + nir_lower_io(nir, nir_var_shader_in, type_size_vs_input); /* This pass needs actual constants */ nir_opt_constant_folding(nir); diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h index f6f6167..8af058b 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.h +++ b/src/mesa/drivers/dri/i965/brw_shader.h @@ -292,6 +292,7 @@ struct gl_shader *brw_new_shader(struct gl_context *ctx, GLuint name, GLuint typ int type_size_scalar(const struct glsl_type *type); int type_size_vec4(const struct glsl_type *type); int type_size_vec4_times_4(const struct glsl_type *type); +int type_size_vs_input(const struct glsl_type *type); unsigned tesslevel_outer_components(GLenum tes_primitive_mode); unsigned tesslevel_inner_components(GLenum tes_primitive_mode); -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 10/15] i965: take care of doubles when lowering VS inputs
From: "Juan A. Suarez Romero" Input attributes can require 2 vec4 or 1 vec4 depending on whether they are double-precision or not. Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs.cpp | 13 + src/mesa/drivers/dri/i965/brw_nir.c| 3 ++- src/mesa/drivers/dri/i965/brw_shader.h | 1 + 3 files changed, 16 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 970846c..2e3cad9 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -638,6 +638,19 @@ type_size_vec4_times_4(const struct glsl_type *type) return 4 * type_size_vec4(type); } +/* Attribute arrays are loaded as one vec4 per element (or matrix column), + * except for double-precision types, which are loaded as one dvec4. + */ +extern "C" int +type_size_vs_input(const struct glsl_type *type) +{ + if (type->is_double()) { + return type_size_vec4(type) / 2; + } else { + return type_size_vec4(type); + } +} + /** * Create a MOV to read the timestamp register. * diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index f37bf3a..9afd036 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -23,6 +23,7 @@ #include "brw_nir.h" #include "brw_shader.h" +#include "compiler/glsl_types.h" #include "compiler/nir/glsl_to_nir.h" #include "compiler/nir/nir_builder.h" #include "program/prog_to_nir.h" @@ -205,7 +206,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, * loaded as one vec4 or dvec4 per element (or matrix column), depending on * whether it is a double-precision type or not. */ - nir_lower_io(nir, nir_var_shader_in, type_size_vec4); + nir_lower_io(nir, nir_var_shader_in, type_size_vs_input); /* This pass needs actual constants */ nir_opt_constant_folding(nir); diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h index 35e7d7a..60f3b5f 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.h +++ b/src/mesa/drivers/dri/i965/brw_shader.h @@ -294,6 +294,7 @@ struct gl_shader *brw_new_shader(struct gl_context *ctx, GLuint name, GLuint typ int type_size_scalar(const struct glsl_type *type); int type_size_vec4(const struct glsl_type *type); int type_size_vec4_times_4(const struct glsl_type *type); +int type_size_vs_input(const struct glsl_type *type); unsigned tesslevel_outer_components(GLenum tes_primitive_mode); unsigned tesslevel_inner_components(GLenum tes_primitive_mode); -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 15/15] i965: Expose OpenGL 4.2 for gen8+
From: Alejandro Piñeiro ARB_vertex_attrib_64bit was the only feature missing. v2: we can expose 4.2 instead of 4.1 (Ian Romanick) Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/intel_extensions.c | 2 +- src/mesa/drivers/dri/i965/intel_screen.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index 1c3e8bd..71605f8 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -268,7 +268,7 @@ intelInitExtensions(struct gl_context *ctx) ctx->Extensions.OES_texture_half_float_linear = true; if (brw->gen >= 8) - ctx->Const.GLSLVersion = 400; + ctx->Const.GLSLVersion = 420; else if (brw->gen >= 6) ctx->Const.GLSLVersion = 330; else diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index c2efc6e..1a0541a 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1376,7 +1376,7 @@ set_max_gl_versions(struct intel_screen *screen) switch (screen->devinfo->gen) { case 9: case 8: - psp->max_gl_core_version = 40; + psp->max_gl_core_version = 42; psp->max_gl_compat_version = 30; psp->max_gl_es1_version = 11; psp->max_gl_es2_version = 31; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 12/15] dri/i965: dvec3/dvec4 consume twice input vertex attributes
From: "Juan A. Suarez Romero" From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes): "A program with more than the value of MAX_VERTEX_ATTRIBS active attribute variables may fail to link, unless device-dependent optimizations are able to make the program fit within available hardware resources. For the purposes of this test, attribute variables of the type dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may count as consuming twice as many attributes as equivalent single-precision types. While these types use the same number of generic attributes as their single-precision equivalents, implementations are permitted to consume two single-precision vectors of internal storage for each three- or four-component double-precision vector." This commit sets i965 driver backend to consume twice as many attributes as equivalent single-precision types for dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3 and dmat4. This prevents running out of registers in case we use too many double-precision vertex input attributes. --- src/mesa/drivers/dri/i965/brw_context.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c index 26514a0..d6998b6 100644 --- a/src/mesa/drivers/dri/i965/brw_context.c +++ b/src/mesa/drivers/dri/i965/brw_context.c @@ -619,6 +619,8 @@ brw_initialize_context_constants(struct brw_context *brw) ctx->Const.NativeIntegers = true; ctx->Const.VertexID_is_zero_based = true; + ctx->Const.FP64Vector34Consumes2Locations = true; + /* Regarding the CMP instruction, the Ivybridge PRM says: * * "For each enabled channel 0b or 1b is assigned to the appropriate flag -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 09/15] i965/fs: calculate first non-payload GRF using attrib slots
From: "Juan A. Suarez Romero" When computing where the first non-payload GRF starts, we can't rely on the number of attributes, as each attribute can be using 1 or 2 slots depending on whether they are a dvec3/4 or other. Instead, we need to use the number of slots used by the attributes. Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_compiler.h | 1 + src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h b/src/mesa/drivers/dri/i965/brw_compiler.h index 5807305..446c609 100644 --- a/src/mesa/drivers/dri/i965/brw_compiler.h +++ b/src/mesa/drivers/dri/i965/brw_compiler.h @@ -607,6 +607,7 @@ struct brw_vs_prog_data { GLbitfield64 inputs_read; unsigned nr_attributes; + unsigned nr_attribute_slots; bool uses_vertexid; bool uses_instanceid; diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index e2a87a7..970846c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1875,7 +1875,7 @@ fs_visitor::assign_vs_urb_setup() assert(stage == MESA_SHADER_VERTEX); /* Each attribute is 4 regs. */ - this->first_non_payload_grf += 4 * vs_prog_data->nr_attributes; + this->first_non_payload_grf += 4 * vs_prog_data->nr_attribute_slots; assert(vs_prog_data->base.urb_read_length <= 15); diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index b144fe8..88d9317 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -2120,6 +2120,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2); prog_data->nr_attributes = nr_attributes; + prog_data->nr_attribute_slots = nr_attribute_slots; /* Since vertex shaders reuse the same VUE entry for inputs and outputs * (overwriting the original contents), we need to make sure the size is -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 01/15] i965: get the proper vertex surface type for doubles on gen8+
From: Alejandro Piñeiro This commit adds support for PASSTHRU format when pushing double-precision attributes. Check glarray->Doubles in order to know if we should choose a format that does a conversion to float, or just passthru the 64-bit double. Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_draw_upload.c | 30 ++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c index 58e0516..5af4583 100644 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c @@ -37,7 +37,7 @@ #include "intel_batchbuffer.h" #include "intel_buffer_objects.h" -static const GLuint double_types[5] = { +static const GLuint double_types_float[5] = { 0, BRW_SURFACEFORMAT_R64_FLOAT, BRW_SURFACEFORMAT_R64G64_FLOAT, @@ -45,6 +45,14 @@ static const GLuint double_types[5] = { BRW_SURFACEFORMAT_R64G64B64A64_FLOAT }; +static const GLuint double_types_passthru[5] = { + 0, + BRW_SURFACEFORMAT_R64_PASSTHRU, + BRW_SURFACEFORMAT_R64G64_PASSTHRU, + BRW_SURFACEFORMAT_R64G64B64_PASSTHRU, + BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU +}; + static const GLuint float_types[5] = { 0, BRW_SURFACEFORMAT_R32_FLOAT, @@ -213,6 +221,22 @@ static const GLuint byte_types_scale[5] = { BRW_SURFACEFORMAT_R8G8B8A8_SSCALED }; +static GLuint +double_types(struct brw_context *brw, + int size, + GLboolean doubles) +{ + /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE): +* "When SourceElementFormat is set to one of the *64*_PASSTHRU formats, +* 64-bit components are stored in the URB without any conversion." +* Also included on BDW PRM, Volume 7, page 470, table "Source Element +* Formats Supported in VF Unit" +* Previous PRMs don't include those references. +*/ + return (brw->gen >= 8 && doubles + ? double_types_passthru[size] + : double_types_float[size]); +} /** * Given vertex array type/size/format/normalized info, return @@ -245,7 +269,7 @@ brw_get_vertex_surface_type(struct brw_context *brw, return BRW_SURFACEFORMAT_R11G11B10_FLOAT; } else if (glarray->Normalized) { switch (glarray->Type) { - case GL_DOUBLE: return double_types[size]; + case GL_DOUBLE: return double_types(brw, size, glarray->Doubles); case GL_FLOAT: return float_types[size]; case GL_HALF_FLOAT: return half_float_types[size]; case GL_INT: return int_types_norm[size]; @@ -319,7 +343,7 @@ brw_get_vertex_surface_type(struct brw_context *brw, } assert(glarray->Format == GL_RGBA); /* sanity check */ switch (glarray->Type) { - case GL_DOUBLE: return double_types[size]; + case GL_DOUBLE: return double_types(brw, size, glarray->Doubles); case GL_FLOAT: return float_types[size]; case GL_HALF_FLOAT: return half_float_types[size]; case GL_INT: return int_types_scale[size]; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 13/15] i965: Enable ARB_vertex_attrib_64bit for gen8+
From: Alejandro Piñeiro Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/intel_extensions.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c index dcb9831..1c3e8bd 100644 --- a/src/mesa/drivers/dri/i965/intel_extensions.c +++ b/src/mesa/drivers/dri/i965/intel_extensions.c @@ -376,6 +376,7 @@ intelInitExtensions(struct gl_context *ctx) ctx->Extensions.ARB_stencil_texturing = true; ctx->Extensions.ARB_texture_stencil8 = true; ctx->Extensions.ARB_gpu_shader_fp64 = true; + ctx->Extensions.ARB_vertex_attrib_64bit = true; } if (brw->gen >= 9) { -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 02/15] i965: Configure how to store *64*PASSTHRU vertex components
From the Broadwell specification, structure VERTEX_ELEMENT_STATE description: "When SourceElementFormat is set to one of the *64*_PASSTHRU formats, 64-bit components are stored in the URB without any conversion. In this case, vertex elements must be written as 128 or 256 bits, with VFCOMP_STORE_0 being used to pad the output as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component into the URB, Component 1 must be specified as VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element. Likewise, use of R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex element." Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for dvec3 and dvec4 vertex elements. Signed-off-by: Juan A. Suarez Romero Signed-off-by: Antia Puentes Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 1 file changed, 35 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/drivers/dri/i965/gen8_draw_upload.c index fe5ed35..c862f05 100644 --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c @@ -217,6 +217,41 @@ gen8_emit_vertices(struct brw_context *brw) break; } + /* From the BDW PRM, Volume 2d, page 586 (VERTEX_ELEMENT_STATE): + * + * "When SourceElementFormat is set to one of the *64*_PASSTHRU + * formats, 64-bit components are stored in the URB without any + * conversion. In this case, vertex elements must be written as 128 + * or 256 bits, with VFCOMP_STORE_0 being used to pad the output + * as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red + * component into the URB, Component 1 must be specified as + * VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) + * in order to output a 128-bit vertex element, or Components 1-3 must + * be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex + * element. Likewise, use of R64G64B64_PASSTHRU requires Component 3 + * to be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex + * element." + */ + if (input->glarray->Doubles) { + switch (input->glarray->Size) { + case 0: + case 1: + case 2: +/* Use 128-bits instead of 256-bits to write double and dvec2 + * vertex elements. + */ +comp2 = BRW_VE1_COMPONENT_NOSTORE; +comp3 = BRW_VE1_COMPONENT_NOSTORE; +break; + case 3: +/* Pad the output using VFCOMP_STORE_0 as suggested + * by the BDW PRM. + */ +comp3 = BRW_VE1_COMPONENT_STORE_0; +break; + } + } + OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) | GEN6_VE0_VALID | (format << BRW_VE0_FORMAT_SHIFT) | -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 03/15] i965: passthru formats cannot be used width edge flag enabled
From: Alejandro Piñeiro Add an assertion to detect this case. Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/gen8_draw_upload.c | 20 1 file changed, 20 insertions(+) diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/drivers/dri/i965/gen8_draw_upload.c index c862f05..dce11dd 100644 --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c @@ -34,6 +34,20 @@ #include "intel_batchbuffer.h" #include "intel_buffer_objects.h" +static bool +is_passthru_format(uint32_t format) +{ + switch (format) { + case BRW_SURFACEFORMAT_R64_PASSTHRU: + case BRW_SURFACEFORMAT_R64G64_PASSTHRU: + case BRW_SURFACEFORMAT_R64G64B64_PASSTHRU: + case BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU: + return true; + default: + return false; + } +} + static void gen8_emit_vertices(struct brw_context *brw) { @@ -193,6 +207,12 @@ gen8_emit_vertices(struct brw_context *brw) uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC; uint32_t comp3 = BRW_VE1_COMPONENT_STORE_SRC; + /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE): + * "Any SourceElementFormat of *64*_PASSTHRU cannot be used with an + * element which has edge flag enabled." + */ + assert(!(is_passthru_format(format) && uses_edge_flag)); + /* The gen4 driver expects edgeflag to come in as a float, and passes * that float on to the tests in the clipper. Mesa's current vertex * attribute value for EdgeFlag is stored as a float, which works out. -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 07/15] i965: take care of doubles when remapping VS attributes
From: "Juan A. Suarez Romero" Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots for anything else. Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_nir.c | 26 +++--- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index c501bc1..f37bf3a 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -96,7 +96,7 @@ add_const_offset_to_base(nir_shader *nir, nir_variable_mode mode) } static bool -remap_vs_attrs(nir_block *block, GLbitfield64 inputs_read) +remap_vs_attrs(nir_block *block, struct nir_shader_info *nir_info) { nir_foreach_instr(instr, block) { if (instr->type != nir_instr_type_intrinsic) @@ -111,9 +111,11 @@ remap_vs_attrs(nir_block *block, GLbitfield64 inputs_read) * before it and counting the bits. */ int attr = intrin->const_index[0]; - int slot = _mesa_bitcount_64(inputs_read & BITFIELD64_MASK(attr)); - - intrin->const_index[0] = 4 * slot; + int slot = _mesa_bitcount_64(nir_info->inputs_read & + BITFIELD64_MASK(attr)); + int dslot = _mesa_bitcount_64(nir_info->double_inputs_read & + BITFIELD64_MASK(attr)); + intrin->const_index[0] = 4 * (slot + dslot); } } return true; @@ -199,9 +201,9 @@ brw_nir_lower_vs_inputs(nir_shader *nir, var->data.driver_location = var->data.location; } - /* Now use nir_lower_io to walk dereference chains. Attribute arrays -* are loaded as one vec4 per element (or matrix column), so we use -* type_size_vec4 here. + /* Now use nir_lower_io to walk dereference chains. Attribute arrays are +* loaded as one vec4 or dvec4 per element (or matrix column), depending on +* whether it is a double-precision type or not. */ nir_lower_io(nir, nir_var_shader_in, type_size_vec4); @@ -214,18 +216,12 @@ brw_nir_lower_vs_inputs(nir_shader *nir, vs_attrib_wa_flags); if (is_scalar) { - /* Finally, translate VERT_ATTRIB_* values into the actual registers. - * - * Note that we can use nir->info.inputs_read instead of - * key->inputs_read since the two are identical aside from Gen4-5 - * edge flag differences. - */ - GLbitfield64 inputs_read = nir->info.inputs_read; + /* Finally, translate VERT_ATTRIB_* values into the actual registers. */ nir_foreach_function(function, nir) { if (function->impl) { nir_foreach_block(block, function->impl) { - remap_vs_attrs(block, inputs_read); + remap_vs_attrs(block, &nir->info); } } } -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 06/15] nir: add double input bitmap
From: "Juan A. Suarez Romero" This bitmap tracks which input attributes are double-precision. Reviewed-by: Kenneth Graunke --- src/compiler/nir/glsl_to_nir.cpp | 1 + src/compiler/nir/nir.h | 2 ++ 2 files changed, 3 insertions(+) diff --git a/src/compiler/nir/glsl_to_nir.cpp b/src/compiler/nir/glsl_to_nir.cpp index a86b966..9b1b098 100644 --- a/src/compiler/nir/glsl_to_nir.cpp +++ b/src/compiler/nir/glsl_to_nir.cpp @@ -152,6 +152,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog, shader->info.num_ssbos = sh->NumShaderStorageBlocks; shader->info.num_images = sh->NumImages; shader->info.inputs_read = sh->Program->InputsRead; + shader->info.double_inputs_read = sh->Program->DoubleInputsRead; shader->info.outputs_written = sh->Program->OutputsWritten; shader->info.patch_inputs_read = sh->Program->PatchInputsRead; shader->info.patch_outputs_written = sh->Program->PatchOutputsWritten; diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index 20927a2..5007f4c 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -1697,6 +1697,8 @@ typedef struct nir_shader_info { /* Which inputs are actually read */ uint64_t inputs_read; + /* Which inputs are actually read and are double */ + uint64_t double_inputs_read; /* Which outputs are actually written */ uint64_t outputs_written; /* Which system values are actually read */ -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 08/15] i965/vec4: use attribute slots to calculate URB read length
From: "Juan A. Suarez Romero" Do not use total attributes because a dvec3/dvec4 attribute requires two slots. So rather use total attribute slots. v2: do not use loop to calculate required attribute slots (Kenneth Graunke) Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_vec4.cpp | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 815eaed..b144fe8 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -2104,14 +2104,20 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, nr_attributes++; } + unsigned nr_attribute_slots = + nr_attributes + + _mesa_bitcount_64(shader->info.double_inputs_read); + /* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode. Empirically, in * vec4 mode, the hardware appears to wedge unless we read something. */ if (is_scalar) - prog_data->base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2); + prog_data->base.urb_read_length = + DIV_ROUND_UP(nr_attribute_slots, 2); else - prog_data->base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 2); + prog_data->base.urb_read_length = + DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2); prog_data->nr_attributes = nr_attributes; @@ -2120,7 +2126,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, * the larger of the two. */ const unsigned vue_entries = - MAX2(nr_attributes, (unsigned)prog_data->base.vue_map.num_slots); + MAX2(nr_attribute_slots, (unsigned)prog_data->base.vue_map.num_slots); if (compiler->devinfo->gen == 6) prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8); -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 11/15] glsl/linker: dvec3/dvec4 may consume twice input vertex attributes
From: "Juan A. Suarez Romero" From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes): "A program with more than the value of MAX_VERTEX_ATTRIBS active attribute variables may fail to link, unless device-dependent optimizations are able to make the program fit within available hardware resources. For the purposes of this test, attribute variables of the type dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may count as consuming twice as many attributes as equivalent single-precision types. While these types use the same number of generic attributes as their single-precision equivalents, implementations are permitted to consume two single-precision vectors of internal storage for each three- or four-component double-precision vector." This commits adds a flag that allows driver to specify if dvec3, dvec4, dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3 and dmat4 count as consuming twice as many attributes as equivalent single-precision types (default value being false). --- src/compiler/glsl/linker.cpp | 72 +++- src/mesa/main/context.c | 2 ++ src/mesa/main/mtypes.h | 13 3 files changed, 66 insertions(+), 21 deletions(-) diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp index 0268b74..ffec007 100644 --- a/src/compiler/glsl/linker.cpp +++ b/src/compiler/glsl/linker.cpp @@ -2434,6 +2434,37 @@ resize_tes_inputs(struct gl_context *ctx, } /** + * From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes): + * + * "A program with more than the value of MAX_VERTEX_ATTRIBS + * active attribute variables may fail to link, unless + * device-dependent optimizations are able to make the program + * fit within available hardware resources. For the purposes + * of this test, attribute variables of the type dvec3, dvec4, + * dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may + * count as consuming twice as many attributes as equivalent + * single-precision types. While these types use the same number + * of generic attributes as their single-precision equivalents, + * implementations are permitted to consume two single-precision + * vectors of internal storage for each three- or four-component + * double-precision vector." + * + * Returns true if three- or four-component double-precision vector consumes + * two single-precision vectors of internal storage + */ + +static inline bool +attribute_consumes_two_locations(struct gl_constants *constants, + ir_variable *var) +{ + if (var->type->without_array()->is_dual_slot_double() && + constants->FP64Vector34Consumes2Locations) + return true; + else + return false; +} + +/** * Find a contiguous set of available bits in a bitmask. * * \param used_mask Bits representing used (1) and unused (0) locations @@ -2725,27 +2756,7 @@ assign_attribute_or_color_locations(gl_shader_program *prog, used_locations |= (use_mask << attr); -/* From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes): - * - * "A program with more than the value of MAX_VERTEX_ATTRIBS - * active attribute variables may fail to link, unless - * device-dependent optimizations are able to make the program - * fit within available hardware resources. For the purposes - * of this test, attribute variables of the type dvec3, dvec4, - * dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may - * count as consuming twice as many attributes as equivalent - * single-precision types. While these types use the same number - * of generic attributes as their single-precision equivalents, - * implementations are permitted to consume two single-precision - * vectors of internal storage for each three- or four-component - * double-precision vector." - * - * Mark this attribute slot as taking up twice as much space - * so we can count it properly against limits. According to - * issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this - * is optional behavior, but it seems preferable. - */ -if (var->type->without_array()->is_dual_slot_double()) +if (attribute_consumes_two_locations(constants, var)) double_storage_locations |= (use_mask << attr); } @@ -2818,6 +2829,25 @@ assign_attribute_or_color_locations(gl_shader_program *prog, to_assign[i].var->data.location = generic_base + location; to_assign[i].var->data.is_unmatched_generic_inout = 0; used_locations |= (use_mask << location); + + if (attribute_consumes_two_locations(constants, to_assign[i].var)) + double_storage_locations |= (use_mask << location); + } + + /* Now that we have all the locations, take in account that dvec3/4 c
[Mesa-dev] [PATCH v2 14/15] docs: Mark ARB_vertex_attrib_64bit as done for i965/gen8+
From: Alejandro Piñeiro v2: label as done for i965/gen8+ instead of i965 (Kenneth Graunke) Reviewed-by: Kenneth Graunke --- docs/GL3.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index c957604..a5ab7ee 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -142,7 +142,7 @@ GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi GL_ARB_get_program_binary DONE (0 binary formats) GL_ARB_separate_shader_objectsDONE (all drivers) GL_ARB_shader_precision DONE (all drivers that support GLSL 4.10) - GL_ARB_vertex_attrib_64bitDONE (llvmpipe, softpipe) + GL_ARB_vertex_attrib_64bitDONE (i965/gen8+, llvmpipe, softpipe) GL_ARB_viewport_array DONE (i965, nv50, llvmpipe, softpipe) -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 04/15] i965/fs: half exec_size when dealing with 64 bits attributes
From: Alejandro Piñeiro The HW has a restriction that only vertical stride may cross register boundaries. Until now this was only handled on VGRFs at rw_reg_from_fs_reg, but it is also needed for attributes. v2: * Remove reference to commit id on commit message (Juan Suarez) * Simplify code that compute final exec_size (Ian Romanick) * Use REG_SIZE on that same code (Kenneth Graunke) Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 3c58ccb..e2a87a7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1836,11 +1836,28 @@ fs_visitor::convert_attr_sources_to_hw_regs(fs_inst *inst) inst->src[i].nr + inst->src[i].reg_offset; - unsigned width = inst->src[i].stride == 0 ? 1 : inst->exec_size; + /* As explained at brw_reg_from_fs_reg, From the Haswell PRM: + * + * VertStride must be used to cross GRF register boundaries. This + * rule implies that elements within a 'Width' cannot cross GRF + * boundaries. + * + * So, for registers that are large enough, we have to split the exec + * size in two and trust the compression state to sort it out. + */ + unsigned total_size = inst->exec_size * + inst->src[i].stride * + type_sz(inst->src[i].type); + + assert(total_size <= 2 * REG_SIZE); + const unsigned exec_size = +(total_size <= REG_SIZE) ? inst->exec_size : inst->exec_size / 2; + + unsigned width = inst->src[i].stride == 0 ? 1 : exec_size; struct brw_reg reg = stride(byte_offset(retype(brw_vec8_grf(grf, 0), inst->src[i].type), inst->src[i].subreg_offset), - inst->exec_size * inst->src[i].stride, + exec_size * inst->src[i].stride, width, inst->src[i].stride); reg.abs = inst->src[i].abs; reg.negate = inst->src[i].negate; -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 00/15] Add ARB_vertex_attrib_64bit for i965 scalar (gen8+)
Hello, the following series is a resend of the implementation for the ARB_vertex_attrib_64bit extension on the i965 scalar backend (gen8+). This resend includes all the feedback received to v1 plus: * Minor changes after rebasing the series against the updated version of the arb_gpu_shader_fp64 extension, that is being reviewed on the list too right now. Those changes were basically adapt our function calls after changes on their signatures, and a retype of the dest register when shufling (patch 05) * Two new patches, to handle properly the linking, as agreed with Kenneth Graunke on this email [1]. Those two are the only ones totally unreviewed right now: [PATCH 11/15] glsl/linker: dvec3/dvec4 may consume twice input vertex attributes [PATCH 12/15] dri/i965: dvec3/dvec4 consume twice input vertex attributes Again, as this work depends on the ARB_gpu_shader_fp64 i965 functionality [2], which is work in progress, the aim of sending the series now is to get early feedback and parallelise the review process. The series applies on top of the current version of the https://github.com/Igalia/mesa/tree/i965-fp64 branch, which contains the last fp64 work for gen8. A frozen version of the branch containing the fp64 patches + the series is available in our repo: $ git clone -b i965-attrib64-v2 https://github.com/Igalia/mesa.git [1] https://lists.freedesktop.org/archives/mesa-dev/2016-May/116381.html [2] https://bugs.freedesktop.org/show_bug.cgi?id=92760 Alejandro Piñeiro (6): i965: get the proper vertex surface type for doubles on gen8+ i965: passthru formats cannot be used width edge flag enabled i965/fs: half exec_size when dealing with 64 bits attributes i965: Enable ARB_vertex_attrib_64bit for gen8+ docs: Mark ARB_vertex_attrib_64bit as done for i965/gen8+ i965: Expose OpenGL 4.2 for gen8+ Antia Puentes (1): i965: Configure how to store *64*PASSTHRU vertex components Juan A. Suarez Romero (8): i965/fs: shuffle 32bits into 64bits for doubles nir: add double input bitmap i965: take care of doubles when remapping VS attributes i965/vec4: use attribute slots to calculate URB read length i965/fs: calculate first non-payload GRF using attrib slots i965: take care of doubles when lowering VS inputs glsl/linker: dvec3/dvec4 may consume twice input vertex attributes dri/i965: dvec3/dvec4 consume twice input vertex attributes docs/GL3.txt | 2 +- src/compiler/glsl/linker.cpp | 72 src/compiler/nir/glsl_to_nir.cpp | 1 + src/compiler/nir/nir.h | 2 + src/mesa/drivers/dri/i965/brw_compiler.h | 1 + src/mesa/drivers/dri/i965/brw_context.c | 2 + src/mesa/drivers/dri/i965/brw_draw_upload.c | 30 ++-- src/mesa/drivers/dri/i965/brw_fs.cpp | 36 -- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 +++ src/mesa/drivers/dri/i965/brw_nir.c | 29 +-- src/mesa/drivers/dri/i965/brw_shader.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.cpp | 13 +++-- src/mesa/drivers/dri/i965/gen8_draw_upload.c | 55 + src/mesa/drivers/dri/i965/intel_extensions.c | 3 +- src/mesa/drivers/dri/i965/intel_screen.c | 2 +- src/mesa/main/context.c | 2 + src/mesa/main/mtypes.h | 13 + 17 files changed, 221 insertions(+), 49 deletions(-) -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 05/15] i965/fs: shuffle 32bits into 64bits for doubles
From: "Juan A. Suarez Romero" VS Thread Payload handles attributes in URB as vec4, no matter if they are actually single or double precision. So with double-precision types, value ends up in the registers split in 32bits chunks, in different positions. We need to shuffle the chunks to get the doubles correctly. v2: use dest directly (Kenneth Graunke) Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 57ab020..5ac515f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -3691,6 +3691,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr for (unsigned j = 0; j < instr->num_components; j++) { bld.MOV(offset(dest, bld, j), offset(src, bld, j)); } + if (type_sz(src.type) == 8) + shuffle_32bit_load_result_to_64bit_data(bld, + dest, + retype(dest, BRW_REGISTER_TYPE_F), + instr->num_components); + break; } -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 4/8] intel: Handle firstvertex in an identical way to BaseVertex
Thanks for looking at this. On 27/03/18 03:06, Ian Romanick wrote: It looks like there are a couple places in brw_fs_nir.cpp (emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic) that handle nir_intrinsic_load_base_instance but not nir_intrinsic_load_first_vertex. Should those cases be added? We should definitely include it in the emit_system_values_block to mark it as unreachable, as like the others, it is lowered by brw_nir_lower_vs_inputs. About fs_visitor::nir_emit_vs_intrinsic, it makes sense to add it there too. As far as I can see, the vs_inputs intrinsics should be lowered by then, we may want to replace the code by an "unreachable (lowered by brw_nir_lower_vs_inputs)". I have pushed a branch to my fd.o repo called gl_VertexID-fixes that rebases this series on current master. I had to apply some fixes to the last patch in the series to get it to build. I have tested that whole series patch-by-patch in the CI with no regressions. I have also applied my R-b on several patches in the series. I think once we decide what to do about this patch and the squash! patch in my branch, we can (finally) push this. Excellent. On 01/25/2018 10:15 AM, Antia Puentes wrote: Until we set gl_BaseVertex to zero for non-indexed draw calls both have an identical value. The Vertex Elements are kept like that: * VE 1: * VE 2: --- src/intel/compiler/brw_nir.c | 3 +++ src/intel/compiler/brw_vec4.cpp | 1 + src/mesa/drivers/dri/i965/brw_context.h | 8 ++-- src/mesa/drivers/dri/i965/brw_draw.c | 14 +- src/mesa/drivers/dri/i965/brw_draw_upload.c | 7 +-- src/mesa/drivers/dri/i965/genX_state_upload.c | 11 +++ 6 files changed, 31 insertions(+), 13 deletions(-) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index dbddef0d04d..34b1e44adf0 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -239,6 +239,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, const bool has_sgvs = nir->info.system_values_read & (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | + BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -261,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, switch (intrin->intrinsic) { case nir_intrinsic_load_base_vertex: +case nir_intrinsic_load_first_vertex: case nir_intrinsic_load_base_instance: case nir_intrinsic_load_vertex_id_zero_base: case nir_intrinsic_load_instance_id: @@ -278,6 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { case nir_intrinsic_load_base_vertex: + case nir_intrinsic_load_first_vertex: nir_intrinsic_set_component(load, 0); break; case nir_intrinsic_load_base_instance: diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 36e17d77d47..06c79630119 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2788,6 +2788,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, */ if (shader->info.system_values_read & (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | +BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) { diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 9046acd175c..0a20706567e 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -881,8 +881,12 @@ struct brw_context struct { struct { - /** The value of gl_BaseVertex for the current _mesa_prim. */ - int gl_basevertex; + /** + * Either the value of gl_BaseVertex for indexed draw calls or the + * value of the argument for non-indexed draw calls for the + * current _mesa_prim. + */ + int firstvertex; /** The value of gl_BaseInstance for the current _mesa_prim. */ int gl_baseinstance; diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index 50cf8b12c74..a1a5161fd35 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -816,25 +816,29 @@ brw_draw_single_prim(struct gl_context *ctx, * always flag if the shader uses one of the values. For direct draws, * we only flag if the values change. */ - const int new_basevertex =
Re: [Mesa-dev] [PATCH v4 6/6] i965: gl_BaseVertex must be zero for non-indexed draw calls
On 07/04/18 08:21, Jason Ekstrand wrote: On Fri, Apr 6, 2018 at 2:53 PM, Ian Romanick <mailto:i...@freedesktop.org>> wrote: From: Antia Puentes mailto:apuen...@igalia.com>> We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID vertex element. The previous Vertex Elements order was: * VE 1: * VE 2: and now it is: * VE 1: * VE 2: To move the BaseVertex keeping VE1 as it is, allows to keep pointing the vertex buffer associated to VE 1 to the indirect buffer for indirect draw calls. From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification: "gl_BaseVertex holds the integer value passed to the baseVertex parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertex is zero." Fixes CTS tests: * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters v2 (idr): Make changes to brw_prepare_shader_draw_parameters matching those in genX(emit_vertices). Reformat commit message to 72 columns. Signed-off-by: Ian Romanick mailto:ian.d.roman...@intel.com>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678 <https://bugs.freedesktop.org/show_bug.cgi?id=102678> --- src/intel/compiler/brw_nir.c | 14 + src/intel/compiler/brw_vec4.cpp | 14 + src/mesa/drivers/dri/i965/brw_context.h | 32 ++- src/mesa/drivers/dri/i965/brw_draw.c | 45 ++- src/mesa/drivers/dri/i965/brw_draw_upload.c | 14 - src/mesa/drivers/dri/i965/genX_state_upload.c | 38 +++--- 6 files changed, 97 insertions(+), 60 deletions(-) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index 16b0d86814f..16ab529737b 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, */ const bool has_sgvs = nir->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | - BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | + (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { - case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_first_vertex: nir_intrinsic_set_component(load, 0); break; @@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_component(load, 3); break; case nir_intrinsic_load_draw_id: - /* gl_DrawID is stored right after gl_VertexID and friends - * if any of them exist. + case nir_intrinsic_load_base_vertex: + /* gl_DrawID and gl_BaseVertex are stored right after + gl_VertexID and friends if any of them exist. */ nir_intrinsic_set_base(load, num_inputs + has_sgvs); - nir_intrinsic_set_component(load, 0); + if (intrin->intrinsic == nir_intrinsic_load_draw_id) + nir_intrinsic_set_component(load, 0); + else + nir_intrinsic_set_component(load, 1); break; default: unreachable("Invalid system value intrinsic"); diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 1e384f5bf4d..d33caefdea9 100644 --- a/src/intel/compiler/brw_vec4.cpp +++ b/src/intel/compiler/brw_vec4.cpp @@ -2825,14 +2825,19 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, * incoming vertex attribute. So, add an extra slot. */ if (shader->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | - BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | +
Re: [Mesa-dev] [PATCH v4 6/6] i965: gl_BaseVertex must be zero for non-indexed draw calls
On 10/04/18 18:26, Jason Ekstrand wrote: On Tue, Apr 10, 2018 at 1:28 AM, Antia Puentes <mailto:apuen...@igalia.com>> wrote: On 07/04/18 08:21, Jason Ekstrand wrote: On Fri, Apr 6, 2018 at 2:53 PM, Ian Romanick mailto:i...@freedesktop.org>> wrote: From: Antia Puentes mailto:apuen...@igalia.com>> We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID vertex element. The previous Vertex Elements order was: * VE 1: * VE 2: and now it is: * VE 1: * VE 2: To move the BaseVertex keeping VE1 as it is, allows to keep pointing the vertex buffer associated to VE 1 to the indirect buffer for indirect draw calls. From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification: "gl_BaseVertex holds the integer value passed to the baseVertex parameter to the command that resulted in the current shader invocation. In the case where the command has no baseVertex parameter, the value of gl_BaseVertex is zero." Fixes CTS tests: * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters v2 (idr): Make changes to brw_prepare_shader_draw_parameters matching those in genX(emit_vertices). Reformat commit message to 72 columns. Signed-off-by: Ian Romanick mailto:ian.d.roman...@intel.com>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678 <https://bugs.freedesktop.org/show_bug.cgi?id=102678> --- src/intel/compiler/brw_nir.c | 14 + src/intel/compiler/brw_vec4.cpp | 14 + src/mesa/drivers/dri/i965/brw_context.h | 32 ++- src/mesa/drivers/dri/i965/brw_draw.c | 45 ++- src/mesa/drivers/dri/i965/brw_draw_upload.c | 14 - src/mesa/drivers/dri/i965/genX_state_upload.c | 38 +++--- 6 files changed, 97 insertions(+), 60 deletions(-) diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c index 16b0d86814f..16ab529737b 100644 --- a/src/intel/compiler/brw_nir.c +++ b/src/intel/compiler/brw_nir.c @@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir, */ const bool has_sgvs = nir->info.system_values_read & - (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) | - BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | + (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) | BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) | BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) | BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID)); @@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_base(load, num_inputs); switch (intrin->intrinsic) { - case nir_intrinsic_load_base_vertex: case nir_intrinsic_load_first_vertex: nir_intrinsic_set_component(load, 0); break; @@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir, nir_intrinsic_set_component(load, 3); break; case nir_intrinsic_load_draw_id: - /* gl_DrawID is stored right after gl_VertexID and friends - * if any of them exist. + case nir_intrinsic_load_base_vertex: + /* gl_DrawID and gl_BaseVertex are stored right after + gl_VertexID and friends if any of them exist. */ nir_intrinsic_set_base(load, num_inputs + has_sgvs); - nir_intrinsic_set_component(load, 0); + if (intrin->intrinsic == nir_intrinsic_load_draw_id) + nir_intrinsic_set_component(load, 0); + else + nir_intrinsic_set_component(load, 1); break; default: unreachable("Invalid system value intrinsic"); diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp index 1e384f5bf4d..d33caefdea9 100644 --- a/src/intel/compiler/br