from:"Antia Puentes"

[Mesa-dev] Fix saturation when coalescing registers in NIR-vec4

2015-08-14 Thread Antia Puentes

With this fix, we can load the constants in NIR-vec4 as integers,
as it is done in the FS backend.

This is related to:
http://lists.freedesktop.org/archives/mesa-dev/2015-July/089899.html

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] i965/vec4: Fix saturation errors when coalescing registers

2015-08-14 Thread Antia Puentes

If the register types do not match and the instruction
that contains the final destination is saturated, register
coalescing generated non-equivalent code.

This did not happen when using IR because types usually
matched, but it is visible in nir-vec4.

For example,
   mov  vgrf7:D vgrf2:D
   mov.sat  m4:F vgrf7:F

is coalesced to:
   mov.sat  m4:D vgrf2:D

The patch prevents coalescing in such scenario, unless the
instruction we want to coalesce into is a MOV. In that case,
the patch sets the register types to the type of the final
destination.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f18915a..52982c3 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1053,6 +1053,15 @@ vec4_visitor::opt_register_coalesce()
}
 }
 
+/* This doesn't handle saturation on the instruction we
+ * want to coalesce away if the register types do not match.
+ * But if scan_inst is a 'mov' we can fix the types later.
+ */
+if (inst->saturate &&
+inst->dst.type != scan_inst->dst.type &&
+scan_inst->opcode != BRW_OPCODE_MOV)
+   break;
+
 /* If we can't handle the swizzle, bail. */
 if (!scan_inst->can_reswizzle(inst->dst.writemask,
   inst->src[0].swizzle,
@@ -1128,6 +1137,15 @@ vec4_visitor::opt_register_coalesce()
   scan_inst->dst.file = inst->dst.file;
   scan_inst->dst.reg = inst->dst.reg;
   scan_inst->dst.reg_offset = inst->dst.reg_offset;
+   if (inst->saturate &&
+   inst->dst.type != scan_inst->dst.type) {
+  /* If we have reached this point, scan_inst is a 'mov' and
+   * we can modify its register types to match the ones in 
inst.
+   * Otherwise, we could have an incorrect saturation result.
+   */
+  scan_inst->dst.type = inst->dst.type;
+  scan_inst->src[0].type = inst->src[0].type;
+   }
   scan_inst->saturate |= inst->saturate;
}
scan_inst = (vec4_instruction *)scan_inst->next;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] i965/vec4_nir: Load constants as integers

2015-08-14 Thread Antia Puentes

Loads constants using integer as their register type, this is done
for consistency with the FS backend.
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 923e2d3..e4ae899 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -456,7 +456,7 @@ void
 vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
 {
dst_reg reg = dst_reg(GRF, alloc.allocate(1));
-   reg.type =  BRW_REGISTER_TYPE_F;
+   reg.type =  BRW_REGISTER_TYPE_D;
 
/* @FIXME: consider emitting vector operations to save some MOVs in
 * cases where the components are representable in 8 bits.
@@ -464,7 +464,7 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr 
*instr)
 */
for (unsigned i = 0; i < instr->def.num_components; ++i) {
   reg.writemask = 1 << i;
-  emit(MOV(reg, src_reg(instr->value.f[i])));
+  emit(MOV(reg, src_reg(instr->value.i[i])));
}
 
/* Set final writemask */
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 2/2] i965/vec4_nir: Load constants as integers

2015-08-24 Thread Antia Puentes

Loads constants using integer as their register type, this is done
for consistency with the FS backend.
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 632e409..23b2fab 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -456,7 +456,7 @@ void
 vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
 {
dst_reg reg = dst_reg(GRF, alloc.allocate(1));
-   reg.type =  BRW_REGISTER_TYPE_F;
+   reg.type =  BRW_REGISTER_TYPE_D;
 
unsigned remaining = brw_writemask_for_size(instr->def.num_components);
 
@@ -477,7 +477,7 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr 
*instr)
   }
 
   reg.writemask = writemask;
-  emit(MOV(reg, src_reg(instr->value.f[i])));
+  emit(MOV(reg, src_reg(instr->value.i[i])));
 
   remaining &= ~writemask;
}
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] i965: don't fail to shift height images for levels.

2016-08-23 Thread Antia Puentes

From: Dave Airlie 

This fixes one subtest of:
GL44-CTS.shader_image_size.advanced-nonMS-fs-int

I've no idea why this wouldn't be scaled up here,
and I've no idea what else will break, but I might
as well open for discussion.

v2: Only shift height if the texture is not an 1D_ARRAY,
it fixes assertion in GL44-CTS.texture_view.gettexparameter
due to the original patch (Antia).

Signed-off-by: Dave Airlie 
Signed-off-by: Antia Puentes 
---

I have not taken a deep look to the test so take this with a grain of salt.
As I said in a previous email, this patch raises an assertion in
GL44-CTS.texture_view.gettexparameter:

"glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion
`height0 = 1' failed."

Looking at the code surrounding the assertion, we have:

   if (target == GL_TEXTURE_1D_ARRAY)
 assert(height0 == 1);

which suggests that we should avoid shifting the height at least for
TEXTURE_1D_ARRAYs. Sending a second version of the patch.

 src/mesa/drivers/dri/i965/intel_tex_image.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 958f8bd..120e7e0 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -49,7 +49,7 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
/* Figure out image dimensions at start level. */
for (i = intelImage->base.Base.Level; i > 0; i--) {
   width <<= 1;
-  if (height != 1)
+  if (intelObj->base.Target != GL_TEXTURE_1D_ARRAY)
  height <<= 1;
   if (intelObj->base.Target == GL_TEXTURE_3D)
  depth <<= 1;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3] i965: Fix calculation of the image height at start level

2016-09-02 Thread Antia Puentes

- Fixes CTS tests:

* GL44-CTS.shader_image_size.advanced-nonMS-cs-float
* GL44-CTS.shader_image_size.advanced-nonMS-cs-int
* GL44-CTS.shader_image_size.advanced-nonMS-cs-uint
* GL44-CTS.shader_image_size.advanced-nonMS-gs-float
* GL44-CTS.shader_image_size.advanced-nonMS-gs-int
* GL44-CTS.shader_image_size.advanced-nonMS-gs-uint
* GL44-CTS.shader_image_size.advanced-nonMS-tes-float
* GL44-CTS.shader_image_size.advanced-nonMS-tes-int
* GL44-CTS.shader_image_size.advanced-nonMS-tes-uint
* GL44-CTS.shader_image_size.advanced-nonMS-vs-float
* GL44-CTS.shader_image_size.advanced-nonMS-vs-int
* GL44-CTS.shader_image_size.advanced-nonMS-vs-uint

v1: (written by Dave Airlie) Always shift height images for levels.
Fixed the CTS tests.

v2: Only shift height if the texture is not an 1D_ARRAY,
it fixes assertion in GL44-CTS.texture_view.gettexparameter
due to the original patch (Antia).

v3: Remove the loop. Do not shift height either for 1D textures.
Use an explicit switch and add an assertion (levels == 0) for
multisampled textures (Jason).

Signed-off-by: Dave Airlie 
Signed-off-by: Antia Puentes 
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 7affe08..cfcbf3c 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
DBG("%s\n", __func__);
 
/* Figure out image dimensions at start level. */
-   for (i = intelImage->base.Base.Level; i > 0; i--) {
-  width <<= 1;
-  if (height != 1)
- height <<= 1;
-  if (intelObj->base.Target == GL_TEXTURE_3D)
- depth <<= 1;
+   switch(intelObj->base.Target) {
+   case GL_TEXTURE_2D_MULTISAMPLE:
+   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+  assert(intelImage->base.Base.Level == 0);
+  break;
+   case GL_TEXTURE_3D:
+  depth <<= intelImage->base.Base.Level;
+  /* Fall through */
+   case GL_TEXTURE_2D:
+   case GL_TEXTURE_2D_ARRAY:
+   case GL_TEXTURE_RECTANGLE:
+   case GL_TEXTURE_CUBE_MAP:
+   case GL_TEXTURE_CUBE_MAP_ARRAY:
+  height <<= intelImage->base.Base.Level;
+  /* Fall through */
+   case GL_TEXTURE_1D:
+   case GL_TEXTURE_1D_ARRAY:
+  width <<= intelImage->base.Base.Level;
+  break;
+   default:
+  unreachable("Unexpected target");
}
 
/* Guess a reasonable value for lastLevel.  This is probably going
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v4] i965: Fix calculation of the image height at start level

2016-09-02 Thread Antia Puentes

- Fixes CTS tests:

* GL44-CTS.shader_image_size.advanced-nonMS-cs-float
* GL44-CTS.shader_image_size.advanced-nonMS-cs-int
* GL44-CTS.shader_image_size.advanced-nonMS-cs-uint
* GL44-CTS.shader_image_size.advanced-nonMS-gs-float
* GL44-CTS.shader_image_size.advanced-nonMS-gs-int
* GL44-CTS.shader_image_size.advanced-nonMS-gs-uint
* GL44-CTS.shader_image_size.advanced-nonMS-tes-float
* GL44-CTS.shader_image_size.advanced-nonMS-tes-int
* GL44-CTS.shader_image_size.advanced-nonMS-tes-uint
* GL44-CTS.shader_image_size.advanced-nonMS-vs-float
* GL44-CTS.shader_image_size.advanced-nonMS-vs-int
* GL44-CTS.shader_image_size.advanced-nonMS-vs-uint

v1: (written by Dave Airlie) Always shift height images for levels.
Fixed the CTS test.

v2: Only shift height if the texture is not an 1D_ARRAY,
it fixes assertion in GL44-CTS.texture_view.gettexparameter
due to the original patch (Antia).

v3: Remove the loop. Do not shift height either for 1D textures.
Use an explicit switch and add an assertion (levels == 0) for
multisampled textures (Jason).

v4: Rectangle textures can not have levels either (Ilia Mirkin).

Signed-off-by: Dave Airlie 
Signed-off-by: Antia Puentes 
---
 src/mesa/drivers/dri/i965/intel_tex_image.c | 27 +--
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
b/src/mesa/drivers/dri/i965/intel_tex_image.c
index 7affe08..65962eb 100644
--- a/src/mesa/drivers/dri/i965/intel_tex_image.c
+++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
@@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct brw_context *brw,
DBG("%s\n", __func__);
 
/* Figure out image dimensions at start level. */
-   for (i = intelImage->base.Base.Level; i > 0; i--) {
-  width <<= 1;
-  if (height != 1)
- height <<= 1;
-  if (intelObj->base.Target == GL_TEXTURE_3D)
- depth <<= 1;
+   switch(intelObj->base.Target) {
+   case GL_TEXTURE_2D_MULTISAMPLE:
+   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
+   case GL_TEXTURE_RECTANGLE:
+  assert(intelImage->base.Base.Level == 0);
+  break;
+   case GL_TEXTURE_3D:
+  depth <<= intelImage->base.Base.Level;
+  /* Fall through */
+   case GL_TEXTURE_2D:
+   case GL_TEXTURE_2D_ARRAY:
+   case GL_TEXTURE_CUBE_MAP:
+   case GL_TEXTURE_CUBE_MAP_ARRAY:
+  height <<= intelImage->base.Base.Level;
+  /* Fall through */
+   case GL_TEXTURE_1D:
+   case GL_TEXTURE_1D_ARRAY:
+  width <<= intelImage->base.Base.Level;
+  break;
+   default:
+  unreachable("Unexpected target");
}
 
/* Guess a reasonable value for lastLevel.  This is probably going
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs

2016-10-31 Thread Antia Puentes

The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the  passed
to the VertexAttribL* commands was <= 2.

This was because we were using the vertex array size when emitting vertices
to decide if we uploaded a 64-bit floating point attribute as 1 slot (128-bits)
for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This caused problems
when mapping the input variables to registers because, for deciding which
registers contain the values uploaded for a certain variable, we use the size
and type given to the variable in the shader, so we will be assigning 256-bits
to dvec3/4 variables, even if we only uploaded 128-bits for them, which happened
when the vertex array size was <= 2.

The patch uses the shader information to only emit as 128-bits those 64-bit 
floating
point variables that were declared as double or dvec2 in the vertex shader. 
Dvec3 and
dvec4 variables will be always uploaded as 256-bits, independently of the 
 given
to the VertexAttribL* command.

From the ARB_vertex_attrib_64bit specification:

   "For the 64-bit double precision types listed in Table X.1, no default
attribute values are provided if the values of the vertex attribute variable
are specified with fewer components than required for the attribute
variable. For example, the fourth component of a variable of type dvec4
will be undefined if specified using VertexAttribL3dv or using a vertex
array specified with VertexAttribLPointer and a size of three."

We are filling these unspecified components with zeros, which coincidentally is
also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 expects.

Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287
---
 src/mesa/drivers/dri/i965/brw_compiler.h |  1 +
 src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c  |  4 +++-
 src/mesa/drivers/dri/i965/brw_vs.c   |  1 +
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 
 5 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
b/src/mesa/drivers/dri/i965/brw_compiler.h
index 819c7d6..c2400f9 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -641,6 +641,7 @@ struct brw_vs_prog_data {
struct brw_vue_prog_data base;
 
GLbitfield64 inputs_read;
+   GLbitfield64 double_inputs_read;
 
unsigned nr_attributes;
unsigned nr_attribute_slots;
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 308ba99..310372a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -535,7 +535,7 @@ struct brw_vertex_element {
const struct gl_vertex_array *glarray;
 
int buffer;
-
+   bool is_dual_slot;
/** Offset of the first element within the buffer object */
unsigned int offset;
 };
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index da13e7a..19c8065 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -472,7 +472,9 @@ brw_prepare_vertices(struct brw_context *brw)
while (vs_inputs) {
   GLuint index = ffsll(vs_inputs) - 1;
   struct brw_vertex_element *input = &brw->vb.inputs[index];
-
+  input->is_dual_slot =
+ (brw->gen >= 8 && _mesa_bitcount_64(vs_prog_data->double_inputs_read &
+ BITFIELD64_BIT(index)));
   vs_inputs &= ~BITFIELD64_BIT(index);
   brw->vb.enabled[brw->vb.nr_enabled++] = input;
}
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 842c516..02a88ca 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -151,6 +151,7 @@ brw_codegen_vs_prog(struct brw_context *brw,
uint64_t outputs_written =
   brw_vs_outputs_written(brw, key, vp->program.info.outputs_written);
prog_data.inputs_read = vp->program.info.inputs_read;
+   prog_data.double_inputs_read = vp->program.info.double_inputs_read;
 
if (key->copy_edgeflag) {
   prog_data.inputs_read |= VERT_BIT_EDGEFLAG;
diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index 23c7587..69ba8e9 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -230,8 +230,15 @@ gen8_emit_vertices(struct brw_context *brw)
   case 0: comp0 = BRW_VE1_COMPONENT_STORE_0;
   case 1: comp1 = BRW_VE1_COMPONENT_STORE_0;
   case 2: comp2 = BRW_VE1_COMPONENT_STORE_0;
-  case 3: comp3 = input->glarray->Integer ? BRW_VE1_COMPONENT_STORE_1_INT
-  : BRW_VE1_COMPONENT_STORE_1_FLT;
+  case 3:
+ if (i

[Mesa-dev] [PATCH] i965/vec4: Don't coalesce registers in gen6 math ops if reswizzling needed

2015-09-20 Thread Antia Puentes

Math operations in SandyBridge do not support source swizzling

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033
---
 src/mesa/drivers/dri/i965/brw_ir_vec4.h |  3 ++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp  | 11 +--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index 966a410..a48bb68 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -175,7 +175,8 @@ public:
 
bool is_send_from_grf();
unsigned regs_read(unsigned arg) const;
-   bool can_reswizzle(int dst_writemask, int swizzle, int swizzle_mask);
+   bool can_reswizzle(const struct brw_device_info *devinfo, int dst_writemask,
+  int swizzle, int swizzle_mask);
void reswizzle(int dst_writemask, int swizzle);
bool can_do_source_mods(const struct brw_device_info *devinfo);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index ed49cd3..d7192e4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -941,10 +941,17 @@ vec4_visitor::opt_set_dependency_control()
 }
 
 bool
-vec4_instruction::can_reswizzle(int dst_writemask,
+vec4_instruction::can_reswizzle(const struct brw_device_info *devinfo,
+int dst_writemask,
 int swizzle,
 int swizzle_mask)
 {
+
+   /* gen6 math instructions can not manage source swizzles */
+   if (devinfo->gen == 6 && is_math() &&
+   swizzle != BRW_SWIZZLE_XYZW)
+  return false;
+
/* If this instruction sets anything not referenced by swizzle, then we'd
 * totally break it when we reswizzle.
 */
@@ -1077,7 +1084,7 @@ vec4_visitor::opt_register_coalesce()
break;
 
 /* If we can't handle the swizzle, bail. */
-if (!scan_inst->can_reswizzle(inst->dst.writemask,
+if (!scan_inst->can_reswizzle(devinfo, inst->dst.writemask,
   inst->src[0].swizzle,
   chans_needed)) {
break;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] i965/vec4: Don't coalesce regs in Gen6 MATH ops if reswizzle/writemask needed

2015-09-22 Thread Antia Puentes

Gen6 MATH instructions can not execute in align16 mode, so swizzles or
writemasking are not allowed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033
---

I have tried to find an example where the writemask check is strictly needed
in order to avoid an incorrect register coalescing, but I have failed.

If I have something like this in my shader:

  in vec4 attr_pos;

  void main() {
vec4 a;
a.xy = sin(attr_pos);
..
  }

It is translated into the following vec4 instructions:
  sin vgrf4.0:F, vgrf3.xyzw:F
  mov vgrf0.0.xy:F, vgrf4.xyzw:F <-- (added by emit_math to fix the 
writemasking issue)

And converted by the opt_reduce_swizzle optimization into:
  sin vgrf4.0:F, vgrf3.xyzw:F
  mov vgrf0.0.xy:F, vgrf4.xyyy:F

The swizzle in the MOV instruction is not the identity anymore, so the method 
"can_reswizzle"
in opt_register_coalesce will return FALSE because of the swizzle-check that my 
first
version of the patch added.

Out of curiosity, I disabled the opt_reduce_swizzle optimization to see what 
happens.
Still "can_reswizzle" returns FALSE preventing the incorrect register 
coalescing,
because the math function sets more things than the MOV instruction; i.e.
the check "(dst.writemask & ~swizzle_mask) is positive.

Although I have not found an example where the writemask-check is crucial, that 
does not neccessarily
mean that it does not exists. I am sending the second version of the patch in 
case you prefer to have
that check in place.

 src/mesa/drivers/dri/i965/brw_ir_vec4.h |  3 ++-
 src/mesa/drivers/dri/i965/brw_vec4.cpp  | 11 +--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index 966a410..a48bb68 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -175,7 +175,8 @@ public:
 
bool is_send_from_grf();
unsigned regs_read(unsigned arg) const;
-   bool can_reswizzle(int dst_writemask, int swizzle, int swizzle_mask);
+   bool can_reswizzle(const struct brw_device_info *devinfo, int dst_writemask,
+  int swizzle, int swizzle_mask);
void reswizzle(int dst_writemask, int swizzle);
bool can_do_source_mods(const struct brw_device_info *devinfo);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index ed49cd3..ae695f1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -941,10 +941,17 @@ vec4_visitor::opt_set_dependency_control()
 }
 
 bool
-vec4_instruction::can_reswizzle(int dst_writemask,
+vec4_instruction::can_reswizzle(const struct brw_device_info *devinfo,
+int dst_writemask,
 int swizzle,
 int swizzle_mask)
 {
+   /* Gen6 MATH instructions can not execute in align16 mode, so swizzles
+* or writemasking are not allowed. */
+   if (devinfo->gen == 6 && is_math() &&
+   (swizzle != BRW_SWIZZLE_XYZW || dst_writemask != WRITEMASK_XYZW))
+  return false;
+
/* If this instruction sets anything not referenced by swizzle, then we'd
 * totally break it when we reswizzle.
 */
@@ -1077,7 +1084,7 @@ vec4_visitor::opt_register_coalesce()
break;
 
 /* If we can't handle the swizzle, bail. */
-if (!scan_inst->can_reswizzle(inst->dst.writemask,
+if (!scan_inst->can_reswizzle(devinfo, inst->dst.writemask,
   inst->src[0].swizzle,
   chans_needed)) {
break;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7

2015-03-13 Thread Antia Puentes

IvyBridge and Haswell PRM say that the JIP should be emitted
with type W but we were using UD. The previous implementation
did not show adverse effects, however changing the type to
D caused a GPU hang, see bug 84557; IMHO it is safer to
follow the specification thoroughly.
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 43e5783..1ca79a9 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -1332,7 +1332,7 @@ brw_IF(struct brw_compile *p, unsigned execute_size)
} else if (brw->gen == 7) {
   brw_set_dest(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D)));
   brw_set_src0(p, insn, vec1(retype(brw_null_reg(), BRW_REGISTER_TYPE_D)));
-  brw_set_src1(p, insn, brw_imm_ud(0));
+  brw_set_src1(p, insn, brw_imm_w(0));
   brw_inst_set_jip(brw, insn, 0);
   brw_inst_set_uip(brw, insn, 0);
} else {
@@ -1533,7 +1533,7 @@ brw_ELSE(struct brw_compile *p)
} else if (brw->gen == 7) {
   brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
   brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
-  brw_set_src1(p, insn, brw_imm_ud(0));
+  brw_set_src1(p, insn, brw_imm_w(0));
   brw_inst_set_jip(brw, insn, 0);
   brw_inst_set_uip(brw, insn, 0);
} else {
@@ -1610,7 +1610,7 @@ brw_ENDIF(struct brw_compile *p)
} else if (brw->gen == 7) {
   brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
   brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
-  brw_set_src1(p, insn, brw_imm_ud(0));
+  brw_set_src1(p, insn, brw_imm_w(0));
} else {
   brw_set_src0(p, insn, brw_imm_d(0));
}
@@ -1802,7 +1802,7 @@ brw_WHILE(struct brw_compile *p)
   } else if (brw->gen == 7) {
  brw_set_dest(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
  brw_set_src0(p, insn, retype(brw_null_reg(), BRW_REGISTER_TYPE_D));
- brw_set_src1(p, insn, brw_imm_ud(0));
+ brw_set_src1(p, insn, brw_imm_w(0));
  brw_inst_set_jip(brw, insn, br * (do_insn - insn));
   } else {
  brw_set_dest(p, insn, brw_imm_w(0));
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl: respect the source number set by #line

2015-03-23 Thread Antia Puentes

From GLSL 1.30.10, section 3.3 (Preprocessor):
"#line line source-string-number ... After processing this directive
(including its new-line), the implementation will behave as if it is
compiling at ... source string number source-string-number. Subsequent
source strings will be numbered sequentially, until another #line
directive overrides that numbering."

In the previous implementation the source number was always zero.
Subsequent source strings are still not numbered sequentially, because
in the glShaderSource implementation we are concatenating the source code
strings into one long string.

Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815
---
 src/glsl/glsl_lexer.ll | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index 8dc3d10..f0e047e 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -36,14 +36,13 @@ static int classify_identifier(struct 
_mesa_glsl_parse_state *, const char *);
 
 #define YY_USER_ACTION \
do {\
-  yylloc->source = 0;  \
   yylloc->first_column = yycolumn + 1; \
   yylloc->first_line = yylloc->last_line = yylineno + 1;   \
   yycolumn += yyleng;  \
   yylloc->last_column = yycolumn + 1;  \
} while(0);
 
-#define YY_USER_INIT yylineno = 0; yycolumn = 0;
+#define YY_USER_INIT yylineno = 0; yycolumn = 0; yylloc->source = 0;
 
 /* A macro for handling reserved words and keywords across language versions.
  *
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl: Update the #line behaviour on GLSL 3.30+ and GLSL ES+

2015-03-23 Thread Antia Puentes

From GLSL 3.30 and GLSL ES 1.00 on, after processing the line
directive (including its new-line), the implementation should
behave as if it is compiling at the line number passed as
argument. In previous versions, it behaved as if compiling
at the passed line number + 1.

Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=88815
---
 src/glsl/glsl_lexer.ll | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index f0e047e..2785ed1 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -187,6 +187,15 @@ HASH   ^{SPC}#{SPC}
* one-based.
*/
   yylineno = strtol(ptr, &ptr, 0) - 1;
+
+   /* From GLSL 3.30 and GLSL ES on, after 
processing the
+* line directive (including its new-line), 
the implementation
+* will behave as if it is compiling at the 
line number passed
+* as argument. It was line number + 1 in 
older specifications.
+*/
+   if (yyextra->is_version(330, 100))
+  yylineno--;
+
   yylloc->source = strtol(ptr, NULL, 0);
}
 {HASH}line{SPCP}{INT}{SPC}${
@@ -202,6 +211,14 @@ HASH   ^{SPC}#{SPC}
* one-based.
*/
   yylineno = strtol(ptr, &ptr, 0) - 1;
+
+   /* From GLSL 3.30 and GLSL ES on, after 
processing the
+* line directive (including its new-line), 
the implementation
+* will behave as if it is compiling at the 
line number passed
+* as argument. It was line number + 1 in 
older specifications.
+*/
+   if (yyextra->is_version(330, 100))
+  yylineno--;
}
 ^{SPC}#{SPC}pragma{SPCP}debug{SPC}\({SPC}on{SPC}\) {
  BEGIN PP;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE

2017-10-27 Thread Antia Puentes


Thanks for fixing this.

Reviewed-by: Antia Puentes 


On 27/10/17 11:18, Alejandro Piñeiro wrote:

 From the spec:
"IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the
 resource when used as an image textures is returned in
 . This is equivalent to calling GetTexParameter"

So we would need to return None for any target not supported by
GetTexParameter. By mistake, we were using the target check for
GetTexLevelParameter.
---
  src/mesa/main/formatquery.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c
index 77c7faa2251..39c628039b8 100644
--- a/src/mesa/main/formatquery.c
+++ b/src/mesa/main/formatquery.c
@@ -1430,7 +1430,13 @@ _mesa_GetInternalformativ(GLenum target, GLenum 
internalformat, GLenum pname,
if (!_mesa_has_ARB_shader_image_load_store(ctx))
   goto end;
  
-  if (!_mesa_legal_get_tex_level_parameter_target(ctx, target, true))

+  /* As pointed by the spec quote below, this pname query should return
+   * the same value that GetTexParameter. So if the target is not valid
+   * for GetTexParameter we return the unsupported value. The check below
+   * is the same target check used by GetTextParameter.
+   */
+  int targetIndex = _mesa_tex_target_to_index(ctx, target);
+  if (targetIndex < 0 || targetIndex == TEXTURE_BUFFER_INDEX)
   goto end;
  
/* From spec: "Equivalent to calling GetTexParameter with  set


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/9] nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics

2017-11-10 Thread Antia Puentes

Reviewed-by: Neil Roberts 
---
 src/compiler/nir/nir.c | 4 
 src/compiler/nir/nir_gather_info.c | 1 +
 src/compiler/nir/nir_intrinsics.h  | 1 +
 3 files changed, 6 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 7380bf436a8..6f0477b0676 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1913,6 +1913,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
   return nir_intrinsic_load_base_instance;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_BASE_VERTEX_ID:
+  return nir_intrinsic_load_base_vertex_id;
case SYSTEM_VALUE_BASE_VERTEX:
   return nir_intrinsic_load_base_vertex;
case SYSTEM_VALUE_INVOCATION_ID:
@@ -1982,6 +1984,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_BASE_INSTANCE;
case nir_intrinsic_load_vertex_id_zero_base:
   return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
+   case nir_intrinsic_load_base_vertex_id:
+  return SYSTEM_VALUE_BASE_VERTEX_ID;
case nir_intrinsic_load_base_vertex:
   return SYSTEM_VALUE_BASE_VERTEX;
case nir_intrinsic_load_invocation_id:
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 13cdae39eca..cf2abb8b8ed 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -232,6 +232,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
case nir_intrinsic_load_vertex_id:
case nir_intrinsic_load_vertex_id_zero_base:
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_base_vertex_id:
case nir_intrinsic_load_base_instance:
case nir_intrinsic_load_instance_id:
case nir_intrinsic_load_sample_id:
diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index 20bef339ac4..78123dd927a 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -326,6 +326,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx)
 SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx)
+SYSTEM_VALUE(base_vertex_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx)
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID

2017-11-10 Thread Antia Puentes

This VS system value will contain the value passed as 
for indexed draw calls or the value passed as  for non-indexed
draw calls. It will be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_BASE_VERTEX_ID.
Note that the current calculation which uses SYSTEM_VALUE_BASE_VERTEX
is not right, as gl_BaseVertex should be zero for non-indexed calls.

Reviewed-by: Neil Roberts 
---
 src/compiler/shader_enums.c |  1 +
 src/compiler/shader_enums.h | 15 +++
 2 files changed, 16 insertions(+)

diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
index b2ca80b49c2..cd8c256f227 100644
--- a/src/compiler/shader_enums.c
+++ b/src/compiler/shader_enums.c
@@ -215,6 +215,7 @@ gl_system_value_name(gl_system_value sysval)
  ENUM(SYSTEM_VALUE_INSTANCE_ID),
  ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
  ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
+ ENUM(SYSTEM_VALUE_BASE_VERTEX_ID),
  ENUM(SYSTEM_VALUE_BASE_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_INSTANCE),
  ENUM(SYSTEM_VALUE_DRAW_ID),
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index 9d229d4199e..7437cccdd0a 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -474,6 +474,21 @@ typedef enum
 */
SYSTEM_VALUE_BASE_VERTEX,
 
+
+   /**
+* Depending on the type of the draw call (indexed or non-indexed),
+* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and
+* similar, or is the value of \c first passed to \c glDrawArrays and
+* similar.
+*
+* \note
+* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as
+* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c SYSTEM_VALUE_BASE_VERTEX_ID.
+*
+* \sa SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, SYSTEM_VALUE_VERTEX_ID
+*/
+   SYSTEM_VALUE_BASE_VERTEX_ID,
+
/**
 * Value of \c baseinstance passed to instanced draw entry points
 *
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/9] Fix shader_draw_parameters CTS tests

2017-11-10 Thread Antia Puentes

Hi,

the series sets gl_BaseVertex to zero for gl*DrawArrays* in i965.
Previously, its value was the value passed as  in the non-indexed
draw call. This was convinient because the gl_VertexID was calculated as
gl_VertexIDBaseZero + gl_BaseVertex.

However, as gl_BaseVertex must be zero for non-indexed draw calls, this
calculation is broken.

Apart from setting the basevertex to zero in i965, the following patches add
a new VS system value which will contain  or  as needed
to make the gl_VertexID calculation right.


The relevant bits of the OpenGL 4.6 specification involved here are:

* 11.1.3.9 Shader Inputs

"gl_BaseVertex holds the integer value passed to the baseVertex parameter to the
command that resulted in the current shader invocation. In the case where the
command has no baseVertex parameter, the value of gl_BaseVertex is zero."

"gl_VertexID holds the integer index i implicitly passed by DrawArrays or
one of the other drawing commands defined in section 10.4."

* 10.4. Drawing Commands Using Vertex Arrays:

  - Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is first + i."

  - Page 355:
"The index of any element transferred to the GL by DrawElementsOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is the sum of basevertex and the
value stored in the currently bound element array buffer at offset indices + i."

Regards,
Antia.

Antia Puentes (5):
  compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID
  nir: Add SYSTEM_VALUE_BASE_VERTEX_ID instrinsics
  i965: emit basevertexid as a vertex element component
  i965: gl_BaseVertex must be zero for non-indexed draw calls
  intel/compiler: implement the basevertex(id) load intrinsics

Neil Roberts (4):
  intel/compiler: Add a uses_basevertexid flag
  nir: Offset vertex_id by base_vertex_id instead of base_vertex
  i965: Let nir lower gl_VertexID instead of the linker
  spirv: Lower BaseVertex to BASE_VERTEX_ID instead of BASE_VERTEX

 src/compiler/nir/nir.c|  4 +++
 src/compiler/nir/nir_gather_info.c|  1 +
 src/compiler/nir/nir_intrinsics.h |  1 +
 src/compiler/nir/nir_lower_system_values.c|  2 +-
 src/compiler/shader_enums.c   |  1 +
 src/compiler/shader_enums.h   | 15 +++
 src/compiler/spirv/vtn_variables.c|  2 +-
 src/intel/compiler/brw_compiler.h |  1 +
 src/intel/compiler/brw_nir.c  | 12 ++---
 src/intel/compiler/brw_vec4.cpp   | 18 -
 src/intel/vulkan/genX_cmd_buffer.c|  8 +++---
 src/intel/vulkan/genX_pipeline.c  |  4 +--
 src/mesa/drivers/dri/i965/brw_context.c   |  3 ++-
 src/mesa/drivers/dri/i965/brw_context.h   | 36 ++---
 src/mesa/drivers/dri/i965/brw_draw.c  | 33 ---
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 13 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++
 17 files changed, 132 insertions(+), 61 deletions(-)

-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 6/9] intel/compiler: implement the basevertex(id) load intrinsics

2017-11-10 Thread Antia Puentes

The gl_BaseVertex is in a new location now, and the new basevertexid occupies
the old gl_BaseVertex place.

Reviewed-by: Neil Roberts 
---
 src/intel/compiler/brw_nir.c| 12 
 src/intel/compiler/brw_vec4.cpp | 14 --
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 8f3f77f89ae..a4fe2f5eb14 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -240,8 +240,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 */
const bool has_sgvs =
   nir->info.system_values_read &
-  (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-   BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
+  (BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
 
@@ -262,6 +261,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
 
 switch (intrin->intrinsic) {
+case nir_intrinsic_load_base_vertex_id:
 case nir_intrinsic_load_base_vertex:
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_vertex_id_zero_base:
@@ -279,7 +279,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
-   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_base_vertex_id:
   nir_intrinsic_set_component(load, 0);
   break;
case nir_intrinsic_load_base_instance:
@@ -292,11 +292,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   nir_intrinsic_set_component(load, 3);
   break;
case nir_intrinsic_load_draw_id:
+   case nir_intrinsic_load_base_vertex:
   /* gl_DrawID is stored right after gl_VertexID and friends
* if any of them exist.
*/
   nir_intrinsic_set_base(load, num_inputs + has_sgvs);
-  nir_intrinsic_set_component(load, 0);
+  if (intrin->intrinsic ==  nir_intrinsic_load_draw_id)
+ nir_intrinsic_set_component(load, 0);
+  else
+ nir_intrinsic_set_component(load, 1);
   break;
default:
   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index d996ab8c89f..5d86cc4ac7e 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2784,13 +2784,18 @@ brw_compile_vs(const struct brw_compiler *compiler, 
void *log_data,
 * incoming vertex attribute.  So, add an extra slot.
 */
if (shader->info.system_values_read &
-   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
+   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
   nr_attribute_slots++;
}
 
+   if (shader->info.system_values_read &
+   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) {
+  nr_attribute_slots++;
+   }
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
@@ -2811,12 +2816,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))
   prog_data->uses_instanceid = true;
 
-   /* gl_DrawID has its very own vec4 */
if (shader->info.system_values_read &
-   BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) {
+   BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))
   prog_data->uses_drawid = true;
-  nr_attribute_slots++;
-   }
 
/* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
 * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 7/9] nir: Offset vertex_id by base_vertex_id instead of base_vertex

2017-11-10 Thread Antia Puentes

From: Neil Roberts 

base_vertex will be zero for non-indexed calls, but we need it to
include the ‘first’ parameter. This is true for both GL and Vulkan.

I think this patch will also affect freedreno and radeonsi. I believe
if they are relying on this lowering then they are currently already
broken because they will have the wrong values for gl_BaseVertex.
However this patch will also make them break for gl_VertexID and they
will need to be fixed to use base_vertex_id instead.
---
 src/compiler/nir/nir_lower_system_values.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_system_values.c 
b/src/compiler/nir/nir_lower_system_values.c
index 3594f4ae5ce..5464a1b3980 100644
--- a/src/compiler/nir/nir_lower_system_values.c
+++ b/src/compiler/nir/nir_lower_system_values.c
@@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b)
  if (b->shader->options->vertex_id_zero_based) {
 sysval = nir_iadd(b,
   nir_load_vertex_id_zero_base(b),
-  nir_load_base_vertex(b));
+  nir_load_base_vertex_id(b));
  } else {
 sysval = nir_load_vertex_id(b);
  }
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/9] intel/compiler: Add a uses_basevertexid flag

2017-11-10 Thread Antia Puentes

From: Neil Roberts 

---
 src/intel/compiler/brw_compiler.h | 1 +
 src/intel/compiler/brw_vec4.cpp   | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index df6ee018546..6b5b73a54f0 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -969,6 +969,7 @@ struct brw_vs_prog_data {
bool uses_vertexid;
bool uses_instanceid;
bool uses_basevertex;
+   bool uses_basevertexid;
bool uses_baseinstance;
bool uses_drawid;
 };
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index bbe4585e0c7..d996ab8c89f 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2795,6 +2795,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
 
+   if (shader->info.system_values_read &
+   BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX_ID))
+  prog_data->uses_basevertexid = true;
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE))
   prog_data->uses_baseinstance = true;
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/9] i965: emit basevertexid as a vertex element component

2017-11-10 Thread Antia Puentes

The new basevertexid will be emitted in the position previously occupied by
gl_BaseVertex. This way we can keep pointing the indirect buffer for indirect
draw calls. The gl_BaseVertex is now emited as part of the draw_id
VERTEX_ELEMENT.

Reviewed-by: Neil Roberts 
---
 src/mesa/drivers/dri/i965/brw_context.h   | 36 ++---
 src/mesa/drivers/dri/i965/brw_draw.c  | 26 +++---
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 13 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++
 4 files changed, 72 insertions(+), 42 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 8aa0c5ff64c..8ad3cdd7ebd 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -807,28 +807,44 @@ struct brw_context
 
struct {
   struct {
- /** The value of gl_BaseVertex for the current _mesa_prim. */
- int gl_basevertex;
+ /**
+  * Either the value of gl_BaseVertex for indexed draw calls or the
+  * value of the argument  for non-indexed draw calls, for the
+  * current _mesa_prim.
+  */
+ int basevertexid;
 
  /** The value of gl_BaseInstance for the current _mesa_prim. */
  int gl_baseinstance;
   } params;
 
   /**
-   * Buffer and offset used for GL_ARB_shader_draw_parameters
-   * (for now, only gl_BaseVertex).
+   * Buffer and offset used for GL_ARB_shader_draw_parameters. For indirect
+   * draw calls it will point to the indirect buffer.
*/
   struct brw_bo *draw_params_bo;
   uint32_t draw_params_offset;
 
+  struct {
+ /** The value of gl_DrawID for the current _mesa_prim. */
+ int gl_drawid;
+
+ /**
+  * The value of gl_BaseVertex for the current _mesa_prim. It must be
+  * zero for non-indexed calls.
+  */
+ int gl_basevertex;
+  } derived_params;
+
   /**
-   * The value of gl_DrawID for the current _mesa_prim. This always comes
-   * in from it's own vertex buffer since it's not part of the indirect
-   * draw parameters.
+   * Buffer and offset also used for GL_ARB_shader_draw_parameters. As they
+   * are not part of the indirect buffer they will go in their own vertex
+   * element. Note that although gl_BaseVertex is part of the indirect
+   * buffer for indexed draw calls, that is not longer the case for
+   * non-indexed.
*/
-  int gl_drawid;
-  struct brw_bo *draw_id_bo;
-  uint32_t draw_id_offset;
+  struct brw_bo *derived_draw_params_bo;
+  uint32_t derived_draw_params_offset;
 
   /**
* Pointer to the the buffer storing the indirect draw parameters. It
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 809e7221b2d..8c26273035b 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -766,25 +766,25 @@ brw_draw_single_prim(struct gl_context *ctx,
 * always flag if the shader uses one of the values. For direct draws,
 * we only flag if the values change.
 */
-   const int new_basevertex =
+   const int new_basevertexid =
   prim->indexed ? prim->basevertex : prim->start;
const int new_baseinstance = prim->base_instance;
const struct brw_vs_prog_data *vs_prog_data =
   brw_vs_prog_data(brw->vs.base.prog_data);
if (prim_id > 0) {
   const bool uses_draw_parameters =
- vs_prog_data->uses_basevertex ||
+ vs_prog_data->uses_basevertexid ||
  vs_prog_data->uses_baseinstance;
 
   if ((uses_draw_parameters && prim->is_indirect) ||
-  (vs_prog_data->uses_basevertex &&
-   brw->draw.params.gl_basevertex != new_basevertex) ||
+  (vs_prog_data->uses_basevertexid &&
+   brw->draw.params.basevertexid != new_basevertexid) ||
   (vs_prog_data->uses_baseinstance &&
brw->draw.params.gl_baseinstance != new_baseinstance))
  brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
}
 
-   brw->draw.params.gl_basevertex = new_basevertex;
+   brw->draw.params.basevertexid = new_basevertexid;
brw->draw.params.gl_baseinstance = new_baseinstance;
brw_bo_unreference(brw->draw.draw_params_bo);
 
@@ -809,12 +809,20 @@ brw_draw_single_prim(struct gl_context *ctx,
 * valid vs_prog_data, but we always flag BRW_NEW_VERTICES before
 * the loop.
 */
-   brw->draw.gl_drawid = prim->draw_id;
-   brw_bo_unreference(brw->draw.draw_id_bo);
-   brw->draw.draw_id_bo = NULL;
-   if (prim_id > 0 && vs_prog_data->uses_drawid)
+   const int new_basevertex = prim->indexed ? prim->basevertex : prim->start;
+
+   if (prim_id > 0 &&
+   (vs_prog_data->uses_drawid ||
+(vs_prog_data->uses_basevertex &&
+ brw->draw.derived_params.gl_basevertex != new_basevertex)))
   brw->ctx.NewDr

[Mesa-dev] [PATCH 5/9] i965: gl_BaseVertex must be zero for non-indexed draw calls

2017-11-10 Thread Antia Puentes

- From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

"gl_BaseVertex holds the integer value passed to the baseVertex parameter to the
command that resulted in the current shader invocation. In the case where the
command has no baseVertex parameter, the value of gl_BaseVertex is zero."

- Fixes CTS tests:

* KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
* KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
* KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
* KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
* KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

Reviewed-by: Neil Roberts 
---
 src/mesa/drivers/dri/i965/brw_draw.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 8c26273035b..54d35c0522c 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -239,6 +239,13 @@ brw_emit_prim(struct brw_context *brw,
prim->indirect_offset + 12);
  brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo,
prim->indirect_offset + 16);
+
+ if (brw->draw.derived_draw_params_bo != NULL) {
+brw_store_register_mem32(brw, brw->draw.derived_draw_params_bo,
+ GEN7_3DPRIM_BASE_VERTEX,
+ brw->draw.derived_draw_params_offset + 4);
+brw_emit_mi_flush(brw);
+ }
   } else {
  brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo,
prim->indirect_offset + 12);
@@ -809,7 +816,7 @@ brw_draw_single_prim(struct gl_context *ctx,
 * valid vs_prog_data, but we always flag BRW_NEW_VERTICES before
 * the loop.
 */
-   const int new_basevertex = prim->indexed ? prim->basevertex : prim->start;
+   const int new_basevertex = prim->indexed ? prim->basevertex : 0;
 
if (prim_id > 0 &&
(vs_prog_data->uses_drawid ||
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 9/9] spirv: Lower BaseVertex to BASE_VERTEX_ID instead of BASE_VERTEX

2017-11-10 Thread Antia Puentes

From: Neil Roberts 

The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new
SYSTEM_VALUE_BASE_VERTEX_ID instead of BASE_VERTEX.
---
 src/compiler/spirv/vtn_variables.c | 2 +-
 src/intel/vulkan/genX_cmd_buffer.c | 8 
 src/intel/vulkan/genX_pipeline.c   | 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index 1cf9d597cf0..82d04f2558b 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1155,7 +1155,7 @@ vtn_get_builtin_location(struct vtn_builder *b,
   set_mode_system_value(mode);
   break;
case SpvBuiltInBaseVertex:
-  *location = SYSTEM_VALUE_BASE_VERTEX;
+  *location = SYSTEM_VALUE_BASE_VERTEX_ID;
   set_mode_system_value(mode);
   break;
case SpvBuiltInBaseInstance:
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index fbb5706606a..4e9b884d3ac 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2175,7 +2175,7 @@ void genX(CmdDraw)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2213,7 +2213,7 @@ void genX(CmdDrawIndexed)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2369,7 +2369,7 @@ void genX(CmdDrawIndirect)(
   struct anv_bo *bo = buffer->bo;
   uint32_t bo_offset = buffer->offset + offset;
 
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
@@ -2408,7 +2408,7 @@ void genX(CmdDrawIndexedIndirect)(
   uint32_t bo_offset = buffer->offset + offset;
 
   /* TODO: We need to stomp base vertex to 0 somehow */
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_basevertexid || vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 7e3a785c584..01e48ce4559 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -97,7 +97,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
const uint32_t elements_double = double_inputs_read >> VERT_ATTRIB_GENERIC0;
const bool needs_svgs_elem = vs_prog_data->uses_vertexid ||
 vs_prog_data->uses_instanceid ||
-vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_basevertexid ||
 vs_prog_data->uses_baseinstance;
 
uint32_t elem_count = __builtin_popcount(elements) -
@@ -177,7 +177,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
* This means, that if we have BaseInstance, we need BaseVertex as
* well.  Just do all or nothing.
*/
-  uint32_t base_ctrl = (vs_prog_data->uses_basevertex ||
+  uint32_t base_ctrl = (vs_prog_data->uses_basevertexid ||
 vs_prog_data->uses_baseinstance) ?
VFCOMP_STORE_SRC : VFCOMP_STORE_0;
 
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 8/9] i965: Let nir lower gl_VertexID instead of the linker

2017-11-10 Thread Antia Puentes

From: Neil Roberts 

---
 src/mesa/drivers/dri/i965/brw_context.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 19d5a2e3503..01c8e8cb3cf 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -586,7 +586,8 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.QuadsFollowProvokingVertexConvention = false;
 
ctx->Const.NativeIntegers = true;
-   ctx->Const.VertexID_is_zero_based = true;
+   /* This is lowered by NIR instead */
+   ctx->Const.VertexID_is_zero_based = false;
 
/* Regarding the CMP instruction, the Ivybridge PRM says:
 *
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 0/7] Fix shader_draw_parameters CTS tests

2017-12-04 Thread Antia Puentes


This patch series is a v2 of the one we sent some weeks ago to fix
gl_BaseVertex in i965. The original patch series can be found on here:
https://patchwork.freedesktop.org/series/33623/

* Rationale:

gl_BaseVertex must be zero for non-indexed draw comands. However,
for this kind of draw comands, we were setting it to the value passed
as  argument of the command.

The series fixes this bug for i965. Notice that we still need the value of
the  argument of the non-indexed draw call to calculate gl_VertexID,
as for those gl_VertexID must be equal to: gl_VertexIDBaseZero + .

The Vertex Elements' composition will be now as follows:
* VE 1: 
* VE 2: 

- BaseVertex will be the  passed to indexed draw calls,
zero otherwise.
- firstvertex will be the  vertex passed to non-indexed draw calls,
and will be the  for indexed draw commands. Setting firstvertex
to  for indexed draw commands is convinient for two reasons:
 1) In case of indirect draw calls, we can keep pointing to the indirect buffer
 when emitting the vertices.
 2) We can calculate gl_VertexID as VertexID(basezero) + firstindex for both
 types of draw commands and these two values will be in the same Vertex Element.
 We will only emit VE2 if gl_DrawID or gl_BaseVertex are referenced in the
 shader.

* Changes in v2:

- Rename the new system value as SYSTEM_VALUE_FIRST_VERTEX and
  the related variables through the series.
- Squash patches 1-2 of the original series, and 4 and 6.
- Move patch 5 to the end of the patch series.
- Swap the order of patches 7 and 8.

* OpenGL 4.6 specification bits:

- 11.1.3.9 Shader Inputs

"gl_BaseVertex holds the integer value passed to the baseVertex parameter to the
command that resulted in the current shader invocation. In the case where the
command has no baseVertex parameter, the value of gl_BaseVertex is zero."

"gl_VertexID holds the integer index i implicitly passed by DrawArrays or
one of the other drawing commands defined in section 10.4."

- 10.4. Drawing Commands Using Vertex Arrays:

  -> Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is first + i."

  -> Page 355:
"The index of any element transferred to the GL by DrawElementsOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is the sum of basevertex and the
value stored in the currently bound element array buffer at offset indices + i."

Antia Puentes (3):
  compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics
  intel: emit first_vertex and reorder the VE' components
  i965: gl_BaseVertex must be zero for non-indexed draw calls

Neil Roberts (4):
  intel/compiler: Add a uses_firstvertex flag
  i965: Let nir lower gl_VertexID instead of the linker
  nir: Offset vertex_id by first_vertex instead of base_vertex
  spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX

 src/compiler/nir/nir.c|  4 +++
 src/compiler/nir/nir_gather_info.c|  1 +
 src/compiler/nir/nir_intrinsics.h |  1 +
 src/compiler/nir/nir_lower_system_values.c|  2 +-
 src/compiler/shader_enums.c   |  1 +
 src/compiler/shader_enums.h   | 15 +++
 src/compiler/spirv/vtn_variables.c|  2 +-
 src/intel/compiler/brw_compiler.h |  1 +
 src/intel/compiler/brw_nir.c  | 11 +---
 src/intel/compiler/brw_vec4.cpp   | 17 
 src/intel/vulkan/genX_cmd_buffer.c|  8 +++---
 src/intel/vulkan/genX_pipeline.c  |  4 +--
 src/mesa/drivers/dri/i965/brw_context.c   |  3 ++-
 src/mesa/drivers/dri/i965/brw_context.h   | 36 ++---
 src/mesa/drivers/dri/i965/brw_draw.c  | 33 ---
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 13 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++
 17 files changed, 132 insertions(+), 59 deletions(-)

-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 2/7] intel/compiler: Add a uses_firstvertex flag

2017-12-04 Thread Antia Puentes

From: Neil Roberts 

Reviewed-by: Kenneth Graunke 
---
 src/intel/compiler/brw_compiler.h | 1 +
 src/intel/compiler/brw_vec4.cpp   | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index 28aed833245..6661427e982 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -967,6 +967,7 @@ struct brw_vs_prog_data {
bool uses_vertexid;
bool uses_instanceid;
bool uses_basevertex;
+   bool uses_firstvertex;
bool uses_baseinstance;
bool uses_drawid;
 };
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 73c40ad6009..14f930e0264 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2800,6 +2800,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
 
+   if (shader->info.system_values_read &
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX))
+  prog_data->uses_firstvertex = true;
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE))
   prog_data->uses_baseinstance = true;
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 3/7] intel: emit first_vertex and reorder the VE' components

2017-12-04 Thread Antia Puentes

The new order is:
* VE 1: 
* VE 2: 

Previously it was:
* VE 1: 
* VE 2: 

The gl_BaseVertex is in a new location now, and firstvertex occupies
the old gl_BaseVertex place. This way we can keep pointing to the
indirect buffer for indirect draw calls.

Reviewed-by: Neil Roberts 
---
 src/intel/compiler/brw_nir.c  | 11 +---
 src/intel/compiler/brw_vec4.cpp   | 13 +
 src/mesa/drivers/dri/i965/brw_context.h   | 36 ++---
 src/mesa/drivers/dri/i965/brw_draw.c  | 26 +++---
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 13 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 39 +++
 6 files changed, 88 insertions(+), 50 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 8f3f77f89ae..f702f5b8534 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -240,7 +240,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 */
const bool has_sgvs =
   nir->info.system_values_read &
-  (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+  (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -262,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
 
 switch (intrin->intrinsic) {
+case nir_intrinsic_load_first_vertex:
 case nir_intrinsic_load_base_vertex:
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_vertex_id_zero_base:
@@ -279,7 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
-   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
   nir_intrinsic_set_component(load, 0);
   break;
case nir_intrinsic_load_base_instance:
@@ -292,11 +293,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   nir_intrinsic_set_component(load, 3);
   break;
case nir_intrinsic_load_draw_id:
+   case nir_intrinsic_load_base_vertex:
   /* gl_DrawID is stored right after gl_VertexID and friends
* if any of them exist.
*/
   nir_intrinsic_set_base(load, num_inputs + has_sgvs);
-  nir_intrinsic_set_component(load, 0);
+  if (intrin->intrinsic ==  nir_intrinsic_load_draw_id)
+ nir_intrinsic_set_component(load, 0);
+  else
+ nir_intrinsic_set_component(load, 1);
   break;
default:
   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 14f930e0264..70a197a9fa0 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2789,13 +2789,19 @@ brw_compile_vs(const struct brw_compiler *compiler, 
void *log_data,
 * incoming vertex attribute.  So, add an extra slot.
 */
if (shader->info.system_values_read &
-   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+   (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
   nr_attribute_slots++;
}
 
+   if (shader->info.system_values_read &
+   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) {
+  nr_attribute_slots++;
+   }
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
@@ -2816,12 +2822,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))
   prog_data->uses_instanceid = true;
 
-   /* gl_DrawID has its very own vec4 */
if (shader->info.system_values_read &
-   BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) {
+   BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))
   prog_data->uses_drawid = true;
-  nr_attribute_slots++;
-   }
 
/* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
 * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 06704838061..c6d176bc243 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -843,28 +843,44 @@ struct brw_context
 
struct {
   struct {
- /** The value of gl_BaseVertex for the current _mesa_prim. */
- int gl_basevertex;

[Mesa-dev] [PATCH v2 7/7] i965: gl_BaseVertex must be zero for non-indexed draw calls

2017-12-04 Thread Antia Puentes

- From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

"gl_BaseVertex holds the integer value passed to the baseVertex parameter to the
command that resulted in the current shader invocation. In the case where the
command has no baseVertex parameter, the value of gl_BaseVertex is zero."

- Fixes CTS tests:

* KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
* KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
* KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
* KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
* KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

Reviewed-by: Neil Roberts 
---
 src/mesa/drivers/dri/i965/brw_draw.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 18d26cfafca..3d831e7d866 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -239,6 +239,13 @@ brw_emit_prim(struct brw_context *brw,
prim->indirect_offset + 12);
  brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo,
prim->indirect_offset + 16);
+
+ if (brw->draw.derived_draw_params_bo != NULL) {
+brw_store_register_mem32(brw, brw->draw.derived_draw_params_bo,
+ GEN7_3DPRIM_BASE_VERTEX,
+ brw->draw.derived_draw_params_offset + 4);
+brw_emit_mi_flush(brw);
+ }
   } else {
  brw_load_register_mem(brw, GEN7_3DPRIM_START_INSTANCE, bo,
prim->indirect_offset + 12);
@@ -803,7 +810,7 @@ brw_draw_single_prim(struct gl_context *ctx,
 * valid vs_prog_data, but we always flag BRW_NEW_VERTICES before
 * the loop.
 */
-   const int new_basevertex = prim->indexed ? prim->basevertex : prim->start;
+   const int new_basevertex = prim->indexed ? prim->basevertex : 0;
 
if (prim_id > 0 &&
(vs_prog_data->uses_drawid ||
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 6/7] spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX

2017-12-04 Thread Antia Puentes

From: Neil Roberts 

The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.
---
 src/compiler/spirv/vtn_variables.c | 2 +-
 src/intel/vulkan/genX_cmd_buffer.c | 8 
 src/intel/vulkan/genX_pipeline.c   | 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index c57f5541319..67833a4c134 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1155,7 +1155,7 @@ vtn_get_builtin_location(struct vtn_builder *b,
   set_mode_system_value(mode);
   break;
case SpvBuiltInBaseVertex:
-  *location = SYSTEM_VALUE_BASE_VERTEX;
+  *location = SYSTEM_VALUE_FIRST_VERTEX;
   set_mode_system_value(mode);
   break;
case SpvBuiltInBaseInstance:
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index ab5590d7cee..ac4882494be 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2085,7 +2085,7 @@ void genX(CmdDraw)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2123,7 +2123,7 @@ void genX(CmdDrawIndexed)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2279,7 +2279,7 @@ void genX(CmdDrawIndirect)(
   struct anv_bo *bo = buffer->bo;
   uint32_t bo_offset = buffer->offset + offset;
 
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
@@ -2318,7 +2318,7 @@ void genX(CmdDrawIndexedIndirect)(
   uint32_t bo_offset = buffer->offset + offset;
 
   /* TODO: We need to stomp base vertex to 0 somehow */
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex || vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 7e3a785c584..4f88fbaf656 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -97,7 +97,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
const uint32_t elements_double = double_inputs_read >> VERT_ATTRIB_GENERIC0;
const bool needs_svgs_elem = vs_prog_data->uses_vertexid ||
 vs_prog_data->uses_instanceid ||
-vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance;
 
uint32_t elem_count = __builtin_popcount(elements) -
@@ -177,7 +177,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
* This means, that if we have BaseInstance, we need BaseVertex as
* well.  Just do all or nothing.
*/
-  uint32_t base_ctrl = (vs_prog_data->uses_basevertex ||
+  uint32_t base_ctrl = (vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance) ?
VFCOMP_STORE_SRC : VFCOMP_STORE_0;
 
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 4/7] i965: Let nir lower gl_VertexID instead of the linker

2017-12-04 Thread Antia Puentes

From: Neil Roberts 

---
 src/mesa/drivers/dri/i965/brw_context.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index b62852d90c8..249f62847b9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -585,7 +585,8 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.QuadsFollowProvokingVertexConvention = false;
 
ctx->Const.NativeIntegers = true;
-   ctx->Const.VertexID_is_zero_based = true;
+   /* This is lowered by NIR instead */
+   ctx->Const.VertexID_is_zero_based = false;
 
/* Regarding the CMP instruction, the Ivybridge PRM says:
 *
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 5/7] nir: Offset vertex_id by first_vertex instead of base_vertex

2017-12-04 Thread Antia Puentes

From: Neil Roberts 

base_vertex will be zero for non-indexed calls, but we need it to
include the ‘first’ parameter. This is true for both GL and Vulkan.

I think this patch will also affect freedreno and radeonsi. I believe
if they are relying on this lowering then they are currently already
broken because they will have the wrong values for gl_BaseVertex.
However this patch will also make them break for gl_VertexID and they
will need to be fixed to use firt_vertex instead.
---
 src/compiler/nir/nir_lower_system_values.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_lower_system_values.c 
b/src/compiler/nir/nir_lower_system_values.c
index 3594f4ae5ce..6f4fb8233ab 100644
--- a/src/compiler/nir/nir_lower_system_values.c
+++ b/src/compiler/nir/nir_lower_system_values.c
@@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b)
  if (b->shader->options->vertex_id_zero_based) {
 sysval = nir_iadd(b,
   nir_load_vertex_id_zero_base(b),
-  nir_load_base_vertex(b));
+  nir_load_first_vertex(b));
  } else {
 sysval = nir_load_vertex_id(b);
  }
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 1/7] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics

2017-12-04 Thread Antia Puentes

This VS system value will contain the value passed as 
for indexed draw calls or the value passed as  for non-indexed
draw calls. It will be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.
Note that the current calculation which uses SYSTEM_VALUE_BASE_VERTEX
is not right, as gl_BaseVertex should be zero for non-indexed calls.

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

Reviewed-by: Neil Roberts 
Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir/nir.c |  4 
 src/compiler/nir/nir_gather_info.c |  1 +
 src/compiler/nir/nir_intrinsics.h  |  1 +
 src/compiler/shader_enums.c|  1 +
 src/compiler/shader_enums.h| 15 +++
 5 files changed, 22 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index 7380bf436a8..29730f5fa86 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1913,6 +1913,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
   return nir_intrinsic_load_base_instance;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_FIRST_VERTEX:
+  return nir_intrinsic_load_first_vertex;
case SYSTEM_VALUE_BASE_VERTEX:
   return nir_intrinsic_load_base_vertex;
case SYSTEM_VALUE_INVOCATION_ID:
@@ -1982,6 +1984,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_BASE_INSTANCE;
case nir_intrinsic_load_vertex_id_zero_base:
   return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
+   case nir_intrinsic_load_first_vertex:
+  return SYSTEM_VALUE_FIRST_VERTEX;
case nir_intrinsic_load_base_vertex:
   return SYSTEM_VALUE_BASE_VERTEX;
case nir_intrinsic_load_invocation_id:
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 946939657ec..555ae77b1d3 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -247,6 +247,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
case nir_intrinsic_load_vertex_id:
case nir_intrinsic_load_vertex_id_zero_base:
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
case nir_intrinsic_load_base_instance:
case nir_intrinsic_load_instance_id:
case nir_intrinsic_load_sample_id:
diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index 20bef339ac4..a7770bf6a85 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -326,6 +326,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx)
 SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx)
+SYSTEM_VALUE(first_vertex, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx)
diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
index 2179c475abd..5e123f29f37 100644
--- a/src/compiler/shader_enums.c
+++ b/src/compiler/shader_enums.c
@@ -214,6 +214,7 @@ gl_system_value_name(gl_system_value sysval)
  ENUM(SYSTEM_VALUE_INSTANCE_ID),
  ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
  ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
+ ENUM(SYSTEM_VALUE_FIRST_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_INSTANCE),
  ENUM(SYSTEM_VALUE_DRAW_ID),
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index ffe551ab20f..76bb2cc4203 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -472,6 +472,21 @@ typedef enum
 */
SYSTEM_VALUE_BASE_VERTEX,
 
+
+   /**
+* Depending on the type of the draw call (indexed or non-indexed),
+* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and
+* similar, or is the value of \c first passed to \c glDrawArrays and
+* similar.
+*
+* \note
+* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as
+* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c SYSTEM_VALUE_FIRST_VERTEX.
+*
+* \sa SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, SYSTEM_VALUE_VERTEX_ID
+*/
+   SYSTEM_VALUE_FIRST_VERTEX,
+
/**
 * Value of \c baseinstance passed to instanced draw entry points
 *
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/9] compiler: Add new system value SYSTEM_VALUE_BASE_VERTEX_ID

2017-12-04 Thread Antia Puentes


Hi,

On 30/11/17 21:43, Neil Roberts wrote:


Kenneth Graunke  writes:


We have a number of similar names now:

SYSTEM_VALUE_BASE_VERTEX
SYSTEM_VALUE_BASE_VERTEX_ID
SYSTEM_VALUE_VERTEX_ID
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE

BASE_VERTEX and BASE_VERTEX_ID are really similar names, and honestly
either one seems like it could be the name for gl_BaseVertex.  I'm
afraid it would be easy to mix them up by mistake.  IMHO, it would be
nice to pick a different word, just to keep some distinction between
the two fairly related concepts...

Perhaps SYSTEM_VALUE_FIRST_VERTEX...?  That's only half the meaning,
but it at least uses a different word, and makes you think "do I want
BASE_VERTEX or FIRST_VERTEX?"

Yes, naming this thing is really difficult. I’m not sure if you noticed,
but for Vulkan, the BaseVertex builtin should actually have the value of
BASE_VERTEX_ID unlike GL. So if we rename BASE_VERTEX to something
without “base vertex” in it then it will still be confusing for Vulkan.
So effectively the descriptive names are like:

SYSTEM_VALUE_BASE_VERTEX_ON_GL_BUT_NOT_VULKAN
SYSTEM_VALUE_BASE_VERTEX_ON_VULKAN_OR_OFFSET_FOR_VERTEX_ID_ON_GL

I’m not sure whether that’s enough of an argument against FIRST_VERTEX
though, so personally I don’t really mind either way.

Antia, what do you think?

I am fine renaming it to FIRST_VERTEX. I have sent a second version of 
the series with the renaming and addressing other feedback given by Kenneth.

Thanks.

Regards.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] i965: silence coverity warning

2017-10-09 Thread Antia Puentes


Reviewed-by: Antia Puentes 


On 09/10/17 16:37, Lionel Landwerlin wrote:

Also makes this statement a bit clearer.

CID: 1418920
Signed-off-by: Lionel Landwerlin 
---
  src/mesa/drivers/dri/i965/brw_draw.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index c7ed7284501..7ebed98fcef 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -955,7 +955,7 @@ brw_draw_indirect_prims(struct gl_context *ctx,
 prim[draw_count - 1].end = 1;
 for (i = 0; i < draw_count; ++i, indirect_offset += stride) {
prim[i].mode = mode;
-  prim[i].indexed = !!ib;
+  prim[i].indexed = ib != NULL;
prim[i].indirect_offset = indirect_offset;
prim[i].is_indirect = 1;
prim[i].draw_id = i;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] intel: blorp: fix potential NULL pointer dereferences

2017-10-09 Thread Antia Puentes


Reviewed-by: Antia Puentes 


On 09/10/17 16:37, Lionel Landwerlin wrote:

CID: 1418616, 1418607
Signed-off-by: Lionel Landwerlin 
---
  src/intel/blorp/blorp_blit.c | 16 +++-
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/intel/blorp/blorp_blit.c b/src/intel/blorp/blorp_blit.c
index 11c2116a758..84e46d7d31b 100644
--- a/src/intel/blorp/blorp_blit.c
+++ b/src/intel/blorp/blorp_blit.c
@@ -2343,23 +2343,29 @@ blorp_surf_convert_to_uncompressed(const struct 
isl_device *isl_dev,
  */
 blorp_surf_convert_to_single_slice(isl_dev, info);
  
-   if (width || height) {

+   if (width) {
  #ifndef NDEBUG
uint32_t right_edge_px = info->tile_x_sa + *x + *width;
-  uint32_t bottom_edge_px = info->tile_y_sa + *y + *height;
assert(*width % fmtl->bw == 0 ||
   right_edge_px == info->surf.logical_level0_px.width);
+#endif
+  *width = DIV_ROUND_UP(*width, fmtl->bw);
+   }
+   if (height) {
+#ifndef NDEBUG
+  uint32_t bottom_edge_px = info->tile_y_sa + *y + *height;
assert(*height % fmtl->bh == 0 ||
   bottom_edge_px == info->surf.logical_level0_px.height);
  #endif
-  *width = DIV_ROUND_UP(*width, fmtl->bw);
*height = DIV_ROUND_UP(*height, fmtl->bh);
 }
  
-   if (x || y) {

+   if (x) {
assert(*x % fmtl->bw == 0);
-  assert(*y % fmtl->bh == 0);
*x /= fmtl->bw;
+   }
+   if (y) {
+  assert(*y % fmtl->bh == 0);
*y /= fmtl->bh;
 }
  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] meson: fix typo in isl

2017-10-12 Thread Antia Puentes


Reviewed-by: Antia Puentes 


On 12/10/17 13:24, Elie Tournier wrote:

Signed-off-by: Elie Tournier 
---
  src/intel/isl/meson.build | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/isl/meson.build b/src/intel/isl/meson.build
index 789175e256..54024b4d11 100644
--- a/src/intel/isl/meson.build
+++ b/src/intel/isl/meson.build
@@ -101,5 +101,5 @@ if with_tests
  build_by_default : false,
)
  
-  test('isl_surf_get_imaage_offset', isl_surf_get_image_offset_test)

+  test('isl_surf_get_image_offset', isl_surf_get_image_offset_test)
  endif


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/5] intel/compiler: Add uses_is_indexed_draw flag

2018-04-28 Thread Antia Puentes

---
 src/intel/compiler/brw_compiler.h | 1 +
 src/intel/compiler/brw_vec4.cpp   | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index 24196248b8e..e3bf535a519 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -978,6 +978,7 @@ struct brw_vs_prog_data {
bool uses_vertexid;
bool uses_instanceid;
bool uses_basevertex;
+   bool uses_is_indexed_draw;
bool uses_firstvertex;
bool uses_baseinstance;
bool uses_drawid;
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 1e384f5bf4d..e583c549204 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2837,6 +2837,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
 
+   if (shader->info.system_values_read &
+   BITFIELD64_BIT(SYSTEM_VALUE_IS_INDEXED_DRAW))
+  prog_data->uses_is_indexed_draw = true;
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX))
   prog_data->uses_firstvertex = true;
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 5/5] intel: activate the gl_BaseVertex lowering

2018-04-28 Thread Antia Puentes

Surplus code related to the basevertex is removed.

The Vertex Elements contain now:
* VE 1: 
* VE 2: 

Also fixes unreachable message.

Fixes OpenGL CTS tests:
* KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
* KHR-GL46.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysParameters
* KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters

Fixes Piglit tests:
* arb_shader_draw_parameters-drawid-indirect baseinstance
* arb_shader_draw_parameters-basevertex

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
---
 src/intel/compiler/brw_compiler.c | 3 ++-
 src/intel/compiler/brw_compiler.h | 1 -
 src/intel/compiler/brw_fs_nir.cpp | 8 
 src/intel/compiler/brw_nir.c  | 5 +
 src/intel/compiler/brw_vec4.cpp   | 7 +--
 src/mesa/drivers/dri/i965/brw_draw.c  | 8 ++--
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 5 +
 src/mesa/drivers/dri/i965/genX_state_upload.c | 1 -
 8 files changed, 11 insertions(+), 27 deletions(-)

diff --git a/src/intel/compiler/brw_compiler.c 
b/src/intel/compiler/brw_compiler.c
index d5f483798a9..6480dbefbf6 100644
--- a/src/intel/compiler/brw_compiler.c
+++ b/src/intel/compiler/brw_compiler.c
@@ -45,7 +45,8 @@
.lower_device_index_to_zero = true,\
.native_integers = true,   \
.use_interpolated_input_intrinsics = true, \
-   .vertex_id_zero_based = true
+   .vertex_id_zero_based = true,  \
+   .lower_base_vertex = true
 
 #define COMMON_SCALAR_OPTIONS \
.lower_pack_half_2x16 = true,  \
diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index e3bf535a519..8b4e6fe2e29 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -977,7 +977,6 @@ struct brw_vs_prog_data {
 
bool uses_vertexid;
bool uses_instanceid;
-   bool uses_basevertex;
bool uses_is_indexed_draw;
bool uses_firstvertex;
bool uses_baseinstance;
diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 22beb0e00d1..02aaf144019 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -112,10 +112,10 @@ emit_system_values_block(nir_block *block, fs_visitor *v)
   nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
   switch (intrin->intrinsic) {
   case nir_intrinsic_load_vertex_id:
- unreachable("should be lowered by lower_vertex_id().");
+  case nir_intrinsic_load_base_vertex:
+ unreachable("should be lowered by nir_lower_system_values().");
 
   case nir_intrinsic_load_vertex_id_zero_base:
-  case nir_intrinsic_load_base_vertex:
   case nir_intrinsic_load_is_indexed_draw:
   case nir_intrinsic_load_first_vertex:
   case nir_intrinsic_load_instance_id:
@@ -2420,10 +2420,10 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder &bld,
 
switch (instr->intrinsic) {
case nir_intrinsic_load_vertex_id:
-  unreachable("should be lowered by lower_vertex_id()");
+   case nir_intrinsic_load_base_vertex:
+  unreachable("should be lowered by nir_lower_system_values()");
 
case nir_intrinsic_load_vertex_id_zero_base:
-   case nir_intrinsic_load_base_vertex:
case nir_intrinsic_load_instance_id:
case nir_intrinsic_load_base_instance:
case nir_intrinsic_load_draw_id: {
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index a624deb6d2a..9998c59586e 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 */
const bool has_sgvs =
   nir->info.system_values_read &
-  (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+  (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -261,7 +260,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
 
 switch (intrin->intrinsic) {
-case nir_intrinsic_load_base_vertex:
 case nir_intrinsic_load_first_vertex:
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_vertex_id_zero_base:
@@ -280,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
-

[Mesa-dev] [PATCH 1/5] compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and instrinsics

2018-04-28 Thread Antia Puentes

This VS system value contains if the draw command used to start the
rendering was an indexed draw command or a non-indexed one
(~0/0 respectively). Useful to calculate the gl_BaseVertex as:
(SYSTEM_VALUE_IS_INDEXED_DRAW & SYSTEM_VALUE_FIRST_VERTEX).
---
 src/compiler/nir/nir.c | 4 
 src/compiler/nir/nir_gather_info.c | 1 +
 src/compiler/nir/nir_intrinsics.py | 1 +
 src/compiler/shader_enums.c| 1 +
 src/compiler/shader_enums.h| 7 +++
 5 files changed, 14 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index ea28fbd1af5..dc1c560319e 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1991,6 +1991,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
   return nir_intrinsic_load_base_instance;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_IS_INDEXED_DRAW:
+  return nir_intrinsic_load_is_indexed_draw;
case SYSTEM_VALUE_FIRST_VERTEX:
   return nir_intrinsic_load_first_vertex;
case SYSTEM_VALUE_BASE_VERTEX:
@@ -2070,6 +2072,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
case nir_intrinsic_load_first_vertex:
   return SYSTEM_VALUE_FIRST_VERTEX;
+   case nir_intrinsic_load_is_indexed_draw:
+  return SYSTEM_VALUE_IS_INDEXED_DRAW;
case nir_intrinsic_load_base_vertex:
   return SYSTEM_VALUE_BASE_VERTEX;
case nir_intrinsic_load_invocation_id:
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index a6a699ab25f..dba9f199ec6 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -266,6 +266,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
case nir_intrinsic_load_vertex_id_zero_base:
case nir_intrinsic_load_base_vertex:
case nir_intrinsic_load_first_vertex:
+   case nir_intrinsic_load_is_indexed_draw:
case nir_intrinsic_load_base_instance:
case nir_intrinsic_load_instance_id:
case nir_intrinsic_load_sample_id:
diff --git a/src/compiler/nir/nir_intrinsics.py 
b/src/compiler/nir/nir_intrinsics.py
index f26aaf35ee3..b1754a7e50e 100644
--- a/src/compiler/nir/nir_intrinsics.py
+++ b/src/compiler/nir/nir_intrinsics.py
@@ -414,6 +414,7 @@ system_value("front_face", 1)
 system_value("vertex_id", 1)
 system_value("vertex_id_zero_base", 1)
 system_value("first_vertex", 1)
+system_value("is_indexed_draw", 1)
 system_value("base_vertex", 1)
 system_value("instance_id", 1)
 system_value("base_instance", 1)
diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
index ebee076b43c..d596d7d04a0 100644
--- a/src/compiler/shader_enums.c
+++ b/src/compiler/shader_enums.c
@@ -217,6 +217,7 @@ gl_system_value_name(gl_system_value sysval)
  ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
  ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
  ENUM(SYSTEM_VALUE_FIRST_VERTEX),
+ ENUM(SYSTEM_VALUE_IS_INDEXED_DRAW),
  ENUM(SYSTEM_VALUE_BASE_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_INSTANCE),
  ENUM(SYSTEM_VALUE_DRAW_ID),
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index 8a277a14f21..1ef4d5a33d0 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -517,6 +517,13 @@ typedef enum
 */
SYSTEM_VALUE_FIRST_VERTEX,
 
+   /**
+* If the Draw command used to start the rendering was an indexed draw
+* or not (~0/0). Useful to calculate \c SYSTEM_VALUE_BASE_VERTEX as
+* \c SYSTEM_VALUE_IS_INDEXED_DRAW & \c SYSTEM_VALUE_FIRST_VERTEX.
+*/
+   SYSTEM_VALUE_IS_INDEXED_DRAW,
+
/**
 * Value of \c baseinstance passed to instanced draw entry points
 *
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 0/5] i965: Fix gl_BaseVertex for non-indexed draws

2018-04-28 Thread Antia Puentes

This series is the alternative to the discarded patch 6 of the series:
https://patchwork.freedesktop.org/series/41307/

It fixes gl_BaseVertex in i965 by calculating it as:
is_indexed_draw(~0/0) & firstvertex.

I have run jenkins for the last patch of the series and
the intermediate patch 3. No regressions found.

Antia Puentes (5):
  compiler: Add SYSTEM_VALUE_IS_INDEXED_DRAW and instrinsics
  intel/compiler: Add uses_is_indexed_draw flag
  intel: emit is_indexed_draw in the same VE than gl_DrawID
  compiler/nir: Add conditional lowering for gl_BaseVertex
  intel: activate the gl_BaseVertex lowering

 src/compiler/nir/nir.c|  4 +++
 src/compiler/nir/nir.h|  6 
 src/compiler/nir/nir_gather_info.c|  1 +
 src/compiler/nir/nir_intrinsics.py|  1 +
 src/compiler/nir/nir_lower_system_values.c| 15 ++
 src/compiler/shader_enums.c   |  1 +
 src/compiler/shader_enums.h   |  7 +
 src/intel/compiler/brw_compiler.c |  3 +-
 src/intel/compiler/brw_compiler.h |  2 +-
 src/intel/compiler/brw_fs_nir.cpp | 10 ---
 src/intel/compiler/brw_nir.c  | 16 ++-
 src/intel/compiler/brw_vec4.cpp   | 21 --
 src/mesa/drivers/dri/i965/brw_context.h   | 31 ++--
 src/mesa/drivers/dri/i965/brw_draw.c  | 29 ++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 13 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 41 +--
 16 files changed, 128 insertions(+), 73 deletions(-)

-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/5] compiler/nir: Add conditional lowering for gl_BaseVertex

2018-04-28 Thread Antia Puentes

---
 src/compiler/nir/nir.h |  6 ++
 src/compiler/nir/nir_lower_system_values.c | 15 +++
 2 files changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index f3326e6df94..1b1dd4dd31b 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1933,6 +1933,12 @@ typedef struct nir_shader_compiler_options {
/* Indicates that the driver only has zero-based vertex id */
bool vertex_id_zero_based;
 
+   /**
+* If enabled, gl_BaseVertex will be lowered as:
+* is_indexed_draw (~0/0) & firstvertex
+*/
+   bool lower_base_vertex;
+
bool lower_cs_local_index_from_id;
 
bool lower_device_index_to_zero;
diff --git a/src/compiler/nir/nir_lower_system_values.c 
b/src/compiler/nir/nir_lower_system_values.c
index 47709e9887b..487da042620 100644
--- a/src/compiler/nir/nir_lower_system_values.c
+++ b/src/compiler/nir/nir_lower_system_values.c
@@ -121,6 +121,21 @@ convert_block(nir_block *block, nir_builder *b)
  }
  break;
 
+  case SYSTEM_VALUE_BASE_VERTEX:
+ /**
+  * From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:
+  *
+  * "gl_BaseVertex holds the integer value passed to the baseVertex
+  * parameter to the command that resulted in the current shader
+  * invocation. In the case where the command has no baseVertex
+  * parameter, the value of gl_BaseVertex is zero."
+  */
+ if (b->shader->options->lower_base_vertex)
+sysval = nir_iand(b,
+  nir_load_is_indexed_draw(b),
+  nir_load_first_vertex(b));
+ break;
+
   case SYSTEM_VALUE_INSTANCE_INDEX:
  sysval = nir_iadd(b,
nir_load_instance_id(b),
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/5] intel: emit is_indexed_draw in the same VE than gl_DrawID

2018-04-28 Thread Antia Puentes

The Vertex Elements are now:
* VE 1: 
* VE 2: 

VE1 is it kept as it was before, VE2 additionally contains the new
system value.
---
 src/intel/compiler/brw_fs_nir.cpp |  2 ++
 src/intel/compiler/brw_nir.c  | 11 +--
 src/intel/compiler/brw_vec4.cpp   | 14 +
 src/mesa/drivers/dri/i965/brw_context.h   | 31 +++-
 src/mesa/drivers/dri/i965/brw_draw.c  | 21 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   |  8 ++---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 42 +--
 7 files changed, 80 insertions(+), 49 deletions(-)

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 9698a0111ef..22beb0e00d1 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -116,6 +116,7 @@ emit_system_values_block(nir_block *block, fs_visitor *v)
 
   case nir_intrinsic_load_vertex_id_zero_base:
   case nir_intrinsic_load_base_vertex:
+  case nir_intrinsic_load_is_indexed_draw:
   case nir_intrinsic_load_first_vertex:
   case nir_intrinsic_load_instance_id:
   case nir_intrinsic_load_base_instance:
@@ -2460,6 +2461,7 @@ fs_visitor::nir_emit_vs_intrinsic(const fs_builder &bld,
}
 
case nir_intrinsic_load_first_vertex:
+   case nir_intrinsic_load_is_indexed_draw:
   unreachable("lowered by brw_nir_lower_vs_inputs");
 
default:
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 16b0d86814f..a624deb6d2a 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -266,6 +266,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_vertex_id_zero_base:
 case nir_intrinsic_load_instance_id:
+case nir_intrinsic_load_is_indexed_draw:
 case nir_intrinsic_load_draw_id: {
b.cursor = nir_after_instr(&intrin->instr);
 
@@ -293,11 +294,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   nir_intrinsic_set_component(load, 3);
   break;
case nir_intrinsic_load_draw_id:
-  /* gl_DrawID is stored right after gl_VertexID and friends
-   * if any of them exist.
+   case nir_intrinsic_load_is_indexed_draw:
+  /* gl_DrawID and IsIndexedDraw are stored right after
+   * gl_VertexID and friends if any of them exist.
*/
   nir_intrinsic_set_base(load, num_inputs + has_sgvs);
-  nir_intrinsic_set_component(load, 0);
+  if (intrin->intrinsic == nir_intrinsic_load_draw_id)
+ nir_intrinsic_set_component(load, 0);
+  else
+ nir_intrinsic_set_component(load, 1);
   break;
default:
   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index e583c549204..898df90225f 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2833,6 +2833,13 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
   nr_attribute_slots++;
}
 
+   /* gl_DrawID and IsIndexedDraw share its very own vec4 */
+   if (shader->info.system_values_read &
+   (BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID) |
+BITFIELD64_BIT(SYSTEM_VALUE_IS_INDEXED_DRAW))) {
+  nr_attribute_slots++;
+   }
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
@@ -2857,12 +2864,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))
   prog_data->uses_instanceid = true;
 
-   /* gl_DrawID has its very own vec4 */
if (shader->info.system_values_read &
-   BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) {
-  prog_data->uses_drawid = true;
-  nr_attribute_slots++;
-   }
+   BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))
+  prog_data->uses_drawid = true;
 
/* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
 * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 1e6a45eee1f..be43eab43cc 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -900,20 +900,35 @@ struct brw_context
   } params;
 
   /**
-   * Buffer and offset used for GL_ARB_shader_draw_parameters
-   * (for now, only gl_BaseVertex).
+   * Buffer and offset used for GL_ARB_shader_draw_parameters which will
+   * point to the indirect buffer for indirect draw calls.
*/
   struct brw_bo *draw_params_bo;
   uint32_t draw_params_offset;
 
+  str

Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

2018-02-20 Thread Antia Puentes




On 26/01/18 11:31, Antia Puentes wrote:

On 25/01/18 18:56, Roland Scheidegger wrote:


Am 25.01.2018 um 17:56 schrieb Roland Scheidegger:

Am 25.01.2018 um 16:30 schrieb Michel Dänzer:

On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote:

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.
---
  src/mesa/main/fbobject.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index d23916d1ad7..c72204e11a0 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct 
gl_context *ctx, GLenum internalFormat)

 ctx->Extensions.ARB_texture_float) ||
    _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
   ? GL_RGBA : 0;
+   case GL_RGB9_E5:
+  return (_mesa_is_desktop_gl(ctx) && 
ctx->Extensions.EXT_texture_shared_exponent)

+ ? GL_RGB: 0;
 case GL_ALPHA16F_ARB:
 case GL_ALPHA32F_ARB:
    return ctx->API == API_OPENGL_COMPAT &&


Unfortunately, this broke the "spec@arb_internalformat_query2@samples
and num_sample_counts pname checks" piglit tests with radeonsi and
llvmpipe, see below.

Any idea what might need to be done in Gallium to fix this?


 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, 
params[0] = (1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, 
params[0] = (1,GL_TRUE), supported=1

PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}}
 32 bit failing case: pname = GL_SAMPLES, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 32 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 32 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, 
params[0] = (1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_SAMPLES, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, 
params[0] = (1,GL_TRUE), supported=1

PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}}
PIGLIT: {"result": "fail" }



Purely coincidentally, I was trying to clean up the formatquery code
recently (should help some failures with r600 too), and I think these
cleanups would fix it.
Basically outright say "no" to target/pname combinations which don't
make sense rather than trying to find a format suitable for another
target and then asking the driver for the nonsense combination, plus
some other small bits like not validating things again (sometimes, a
third time...).
Albeit it will cause some breakage with the piglit test, which I 
believe

is a test error, but that might be open for debate...
(For TEXTURE_BUFFER and the internalformat size/type queries, do you
return valid values or unsupported? The problem here is ARB_tbo says 
you

can't get these values via the equivalent GetTexLevelParameter queries,
whereas with GL 3.1 you can. And internalformat_query2 says it returns
"the same information" as GetTexLevelParameter, albeit it's not 
entirely
true in any case since the equivalent of the internalformat stencil 
type

doesn't even exist. My stance would be that valid values should be
reported even without GL 3.1, but the piglit test thinks differently.)


Err, actually this won't fix it I suppose - because rgb9e5 now is a
valid fbo format. Was that commit really correct? It does not make sense
to me, rgb9e5 cannot be a fbo/renderable format. Or was this just
working around issues in formatquery.c (which I try to address with this
patch)?

I believe the commit is wrong, _mesa_base_fbo_format_ is used to 
validate the internalformat
passed to RenderbufferStorage, which in the OpenGL 4.6 is said that 
nee

[Mesa-dev] [PATCH v3 1/8] i965: allocate a SGVS element when VertexID or InstanceID are read

2018-01-25 Thread Antia Puentes

From: Iago Toral Quiroga 

Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element it seems that we still
need to allocate the SVGS element, otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakyness goes
away as soon as we start allocating the SVGS element.

v2:
  - Do this for gen8+, not just gen9+, and pull the boolean
outside the #if block (Jason)

Fixes flaky test:
KHR-GL45.vertex_attrib_64bit.limits_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335

Reviewed-by: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/genX_state_upload.c | 17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index 50ac5bc59ff..d0a980f9730 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -486,26 +486,13 @@ genX(emit_vertices)(struct brw_context *brw)
} else {
   brw_batch_emit(brw, GENX(3DSTATE_VF_SGVS), vfs);
}
+#endif
 
-   /* Normally we don't need an element for the SGVS attribute because the
-* 3DSTATE_VF_SGVS instruction lets you store the generated attribute in an
-* element that is past the list in 3DSTATE_VERTEX_ELEMENTS. However if
-* we're using draw parameters then we need an element for the those
-* values.  Additionally if there is an edge flag element then the SGVS
-* can't be inserted past that so we need a dummy element to ensure that
-* the edge flag is the last one.
-*/
-   const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
-vs_prog_data->uses_baseinstance ||
-((vs_prog_data->uses_instanceid ||
-  vs_prog_data->uses_vertexid)
- && uses_edge_flag));
-#else
const bool needs_sgvs_element = (vs_prog_data->uses_basevertex ||
 vs_prog_data->uses_baseinstance ||
 vs_prog_data->uses_instanceid ||
 vs_prog_data->uses_vertexid);
-#endif
+
unsigned nr_elements =
   brw->vb.nr_enabled + needs_sgvs_element + vs_prog_data->uses_drawid;
 
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 4/8] intel: Handle firstvertex in an identical way to BaseVertex

2018-01-25 Thread Antia Puentes

Until we set gl_BaseVertex to zero for non-indexed draw calls
both have an identical value.

The Vertex Elements are kept like that:
* VE 1: 
* VE 2: 
---
 src/intel/compiler/brw_nir.c  |  3 +++
 src/intel/compiler/brw_vec4.cpp   |  1 +
 src/mesa/drivers/dri/i965/brw_context.h   |  8 ++--
 src/mesa/drivers/dri/i965/brw_draw.c  | 14 +-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   |  7 +--
 src/mesa/drivers/dri/i965/genX_state_upload.c | 11 +++
 6 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index dbddef0d04d..34b1e44adf0 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -239,6 +239,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
const bool has_sgvs =
   nir->info.system_values_read &
   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -261,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
 switch (intrin->intrinsic) {
 case nir_intrinsic_load_base_vertex:
+case nir_intrinsic_load_first_vertex:
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_vertex_id_zero_base:
 case nir_intrinsic_load_instance_id:
@@ -278,6 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
   nir_intrinsic_set_component(load, 0);
   break;
case nir_intrinsic_load_base_instance:
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 36e17d77d47..06c79630119 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2788,6 +2788,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
 */
if (shader->info.system_values_read &
(BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9046acd175c..0a20706567e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -881,8 +881,12 @@ struct brw_context
 
struct {
   struct {
- /** The value of gl_BaseVertex for the current _mesa_prim. */
- int gl_basevertex;
+ /**
+  * Either the value of gl_BaseVertex for indexed draw calls or the
+  * value of the argument  for non-indexed draw calls for the
+  * current _mesa_prim.
+  */
+ int firstvertex;
 
  /** The value of gl_BaseInstance for the current _mesa_prim. */
  int gl_baseinstance;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 50cf8b12c74..a1a5161fd35 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -816,25 +816,29 @@ brw_draw_single_prim(struct gl_context *ctx,
 * always flag if the shader uses one of the values. For direct draws,
 * we only flag if the values change.
 */
-   const int new_basevertex =
+   const int new_firstvertex =
   prim->indexed ? prim->basevertex : prim->start;
const int new_baseinstance = prim->base_instance;
const struct brw_vs_prog_data *vs_prog_data =
   brw_vs_prog_data(brw->vs.base.prog_data);
if (prim_id > 0) {
-  const bool uses_draw_parameters =
+  const bool uses_firstvertex =
  vs_prog_data->uses_basevertex ||
+ vs_prog_data->uses_firstvertex;
+
+  const bool uses_draw_parameters =
+ uses_firstvertex ||
  vs_prog_data->uses_baseinstance;
 
   if ((uses_draw_parameters && prim->is_indirect) ||
-  (vs_prog_data->uses_basevertex &&
-   brw->draw.params.gl_basevertex != new_basevertex) ||
+  (uses_firstvertex &&
+   brw->draw.params.firstvertex != new_firstvertex) ||
   (vs_prog_data->uses_baseinstance &&
brw->draw.params.gl_baseinstance != new_baseinstance))
  brw->ctx.NewDriverState |= BRW_NEW_VERTICES;
}
 
-   brw->draw.params.gl_basevertex = new_basevertex;
+   brw->draw.params.firstvertex = new_firstvertex;
brw->draw.params.gl_baseinstance = new_baseinstance;
brw_bo_unreference(brw->draw.draw_params_bo);
 
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upl

[Mesa-dev] [PATCH v3 7/8] nir: Offset vertex_id by first_vertex instead of base_vertex

2018-01-25 Thread Antia Puentes

From: Neil Roberts 

base_vertex will be zero for non-indexed calls and in that case we
need vertex_id to be offset by the ‘first’ parameter instead. That is
what we get with first_vertex. This is true for both GL and Vulkan.

The freedreno driver is also setting vertex_id_zero_based on
nir_options. In order to avoid breakage this patch switches the
relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can
retain the same behavior.

v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from
SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth).

Cc: Rob Clark 
Cc: Marek Olšák 
---
 src/compiler/nir/nir_lower_system_values.c   | 2 +-
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c| 2 +-
 src/gallium/drivers/freedreno/a4xx/fd4_emit.c| 2 +-
 src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c | 5 ++---
 src/intel/vulkan/genX_cmd_buffer.c   | 4 
 src/intel/vulkan/genX_pipeline.c | 4 +---
 6 files changed, 6 insertions(+), 13 deletions(-)

diff --git a/src/compiler/nir/nir_lower_system_values.c 
b/src/compiler/nir/nir_lower_system_values.c
index 3594f4ae5ce..6f4fb8233ab 100644
--- a/src/compiler/nir/nir_lower_system_values.c
+++ b/src/compiler/nir/nir_lower_system_values.c
@@ -105,7 +105,7 @@ convert_block(nir_block *block, nir_builder *b)
  if (b->shader->options->vertex_id_zero_based) {
 sysval = nir_iadd(b,
   nir_load_vertex_id_zero_base(b),
-  nir_load_base_vertex(b));
+  nir_load_first_vertex(b));
  } else {
 sysval = nir_load_vertex_id(b);
  }
diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c 
b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
index b9e1af00e2c..3419ba86d46 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
@@ -374,7 +374,7 @@ fd3_emit_vertex_bufs(struct fd_ringbuffer *ring, struct 
fd3_emit *emit)
continue;
if (vp->inputs[i].sysval) {
switch(vp->inputs[i].slot) {
-   case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_FIRST_VERTEX:
/* handled elsewhere */
break;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
diff --git a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c 
b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
index 5fec2b6b08a..42268ceea71 100644
--- a/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
+++ b/src/gallium/drivers/freedreno/a4xx/fd4_emit.c
@@ -378,7 +378,7 @@ fd4_emit_vertex_bufs(struct fd_ringbuffer *ring, struct 
fd4_emit *emit)
continue;
if (vp->inputs[i].sysval) {
switch(vp->inputs[i].slot) {
-   case SYSTEM_VALUE_BASE_VERTEX:
+   case SYSTEM_VALUE_FIRST_VERTEX:
/* handled elsewhere */
break;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index 15a3aa4c802..d3a8dbec14e 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -2073,11 +2073,10 @@ emit_intrinsic(struct ir3_context *ctx, 
nir_intrinsic_instr *intr)
ctx->ir->outputs[n] = src[i];
}
break;
-   case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
if (!ctx->basevertex) {
ctx->basevertex = create_driver_param(ctx, 
IR3_DP_VTXID_BASE);
-   add_sysval_input(ctx, SYSTEM_VALUE_BASE_VERTEX,
-   ctx->basevertex);
+   add_sysval_input(ctx, SYSTEM_VALUE_FIRST_VERTEX, 
ctx->basevertex);
}
dst[0] = ctx->basevertex;
break;
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index 9fc281bf4eb..d7dc14f387b 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2224,7 +2224,6 @@ void genX(CmdDraw)(
genX(cmd_buffer_flush_state)(cmd_buffer);
 
if (vs_prog_data->uses_firstvertex ||
-   vs_prog_data->uses_basevertex ||
vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
@@ -2264,7 +2263,6 @@ void genX(CmdDrawIndexed)(
genX(cmd_buffer_flush_state)(cmd_buffer);
 
if (vs_prog_data->uses_firstvertex ||
-   vs_prog_data->uses_basevertex ||
vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance);
if (vs_prog_data->u

[Mesa-dev] [PATCH v3 5/8] spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEX

2018-01-25 Thread Antia Puentes

From: Neil Roberts 

The base vertex in Vulkan is different from GL in that for non-indexed
primitives the value is taken from the firstVertex parameter instead
of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX
instead of BASE_VERTEX.
---
 src/compiler/spirv/vtn_variables.c |  2 +-
 src/intel/vulkan/genX_cmd_buffer.c | 16 
 src/intel/vulkan/genX_pipeline.c   |  2 ++
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/compiler/spirv/vtn_variables.c 
b/src/compiler/spirv/vtn_variables.c
index eb306d0c4a8..3e5686af1d9 100644
--- a/src/compiler/spirv/vtn_variables.c
+++ b/src/compiler/spirv/vtn_variables.c
@@ -1279,7 +1279,7 @@ vtn_get_builtin_location(struct vtn_builder *b,
   set_mode_system_value(b, mode);
   break;
case SpvBuiltInBaseVertex:
-  *location = SYSTEM_VALUE_BASE_VERTEX;
+  *location = SYSTEM_VALUE_FIRST_VERTEX;
   set_mode_system_value(b, mode);
   break;
case SpvBuiltInBaseInstance:
diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
b/src/intel/vulkan/genX_cmd_buffer.c
index c23a54fb7b9..9fc281bf4eb 100644
--- a/src/intel/vulkan/genX_cmd_buffer.c
+++ b/src/intel/vulkan/genX_cmd_buffer.c
@@ -2223,7 +2223,9 @@ void genX(CmdDraw)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex ||
+   vs_prog_data->uses_basevertex ||
+   vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, firstVertex, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2261,7 +2263,9 @@ void genX(CmdDrawIndexed)(
 
genX(cmd_buffer_flush_state)(cmd_buffer);
 
-   if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+   if (vs_prog_data->uses_firstvertex ||
+   vs_prog_data->uses_basevertex ||
+   vs_prog_data->uses_baseinstance)
   emit_base_vertex_instance(cmd_buffer, vertexOffset, firstInstance);
if (vs_prog_data->uses_drawid)
   emit_draw_index(cmd_buffer, 0);
@@ -2417,7 +2421,9 @@ void genX(CmdDrawIndirect)(
   struct anv_bo *bo = buffer->bo;
   uint32_t bo_offset = buffer->offset + offset;
 
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex ||
+  vs_prog_data->uses_basevertex ||
+  vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 8);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
@@ -2456,7 +2462,9 @@ void genX(CmdDrawIndexedIndirect)(
   uint32_t bo_offset = buffer->offset + offset;
 
   /* TODO: We need to stomp base vertex to 0 somehow */
-  if (vs_prog_data->uses_basevertex || vs_prog_data->uses_baseinstance)
+  if (vs_prog_data->uses_firstvertex ||
+  vs_prog_data->uses_basevertex ||
+  vs_prog_data->uses_baseinstance)
  emit_base_vertex_instance_bo(cmd_buffer, bo, bo_offset + 12);
   if (vs_prog_data->uses_drawid)
  emit_draw_index(cmd_buffer, i);
diff --git a/src/intel/vulkan/genX_pipeline.c b/src/intel/vulkan/genX_pipeline.c
index 82fdf206a95..5f4cf58b83d 100644
--- a/src/intel/vulkan/genX_pipeline.c
+++ b/src/intel/vulkan/genX_pipeline.c
@@ -98,6 +98,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
const bool needs_svgs_elem = vs_prog_data->uses_vertexid ||
 vs_prog_data->uses_instanceid ||
 vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance;
 
uint32_t elem_count = __builtin_popcount(elements) -
@@ -178,6 +179,7 @@ emit_vertex_input(struct anv_pipeline *pipeline,
* well.  Just do all or nothing.
*/
   uint32_t base_ctrl = (vs_prog_data->uses_basevertex ||
+vs_prog_data->uses_firstvertex ||
 vs_prog_data->uses_baseinstance) ?
VFCOMP_STORE_SRC : VFCOMP_STORE_0;
 
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 6/8] i965: Don't request GLSL IR lowering of gl_VertexID

2018-01-25 Thread Antia Puentes

From: Ian Romanick 

Let the lowering in NIR handle it instead.

This hurts one shader that occurs twice in shader-db (SynMark GSCloth)
on IVB and HSW.  No other shaders or platforms were affected.

total cycles in shared programs: 253438422 -> 253438426 (0.00%)
cycles in affected programs: 412 -> 416 (0.97%)
helped: 0
HURT: 2

Signed-off-by: Ian Romanick 
Reviewed-by: Antia Puentes 
---
 src/mesa/drivers/dri/i965/brw_context.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 9ed8bc64bb3..7775468f98a 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -590,7 +590,6 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.QuadsFollowProvokingVertexConvention = false;
 
ctx->Const.NativeIntegers = true;
-   ctx->Const.VertexID_is_zero_based = true;
 
/* Regarding the CMP instruction, the Ivybridge PRM says:
 *
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 3/8] intel/compiler: Add a uses_firstvertex flag

2018-01-25 Thread Antia Puentes

From: Neil Roberts 

Reviewed-by: Kenneth Graunke 
---
 src/intel/compiler/brw_compiler.h | 1 +
 src/intel/compiler/brw_vec4.cpp   | 4 
 2 files changed, 5 insertions(+)

diff --git a/src/intel/compiler/brw_compiler.h 
b/src/intel/compiler/brw_compiler.h
index b1086bbcee5..0afe5757945 100644
--- a/src/intel/compiler/brw_compiler.h
+++ b/src/intel/compiler/brw_compiler.h
@@ -966,6 +966,7 @@ struct brw_vs_prog_data {
bool uses_vertexid;
bool uses_instanceid;
bool uses_basevertex;
+   bool uses_firstvertex;
bool uses_baseinstance;
bool uses_drawid;
 };
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index ad6d8f9d6bc..36e17d77d47 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2798,6 +2798,10 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
 
+   if (shader->info.system_values_read &
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX))
+  prog_data->uses_firstvertex = true;
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE))
   prog_data->uses_baseinstance = true;
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 2/8] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics

2018-01-25 Thread Antia Puentes

This VS system value will contain the value passed as 
for indexed draw calls or the value passed as  for non-indexed
draw calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.

From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":

-  Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is first + i."

  - Page 355:
"The index of any element transferred to the GL by DrawElementsOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is the sum of basevertex and the
value stored in the currently bound element array buffer at offset indices + i."

Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this 
will
have to change when the value of gl_BaseVertex is fixed. Currently its value is
broken for non-indexed draw calls because it must be zero but we are setting it
to .

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

Reviewed-by: Neil Roberts 
Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir/nir.c |  4 
 src/compiler/nir/nir_gather_info.c |  1 +
 src/compiler/nir/nir_intrinsics.h  |  1 +
 src/compiler/shader_enums.c|  1 +
 src/compiler/shader_enums.h| 14 ++
 5 files changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index bdd8960403c..e69c2accbbf 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1919,6 +1919,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
   return nir_intrinsic_load_base_instance;
case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
   return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_FIRST_VERTEX:
+  return nir_intrinsic_load_first_vertex;
case SYSTEM_VALUE_BASE_VERTEX:
   return nir_intrinsic_load_base_vertex;
case SYSTEM_VALUE_INVOCATION_ID:
@@ -1990,6 +1992,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
   return SYSTEM_VALUE_BASE_INSTANCE;
case nir_intrinsic_load_vertex_id_zero_base:
   return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
+   case nir_intrinsic_load_first_vertex:
+  return SYSTEM_VALUE_FIRST_VERTEX;
case nir_intrinsic_load_base_vertex:
   return SYSTEM_VALUE_BASE_VERTEX;
case nir_intrinsic_load_invocation_id:
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 946939657ec..555ae77b1d3 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -247,6 +247,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
case nir_intrinsic_load_vertex_id:
case nir_intrinsic_load_vertex_id_zero_base:
case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
case nir_intrinsic_load_base_instance:
case nir_intrinsic_load_instance_id:
case nir_intrinsic_load_sample_id:
diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index ede29277876..7d3421f0e30 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -333,6 +333,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx)
 SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx)
+SYSTEM_VALUE(first_vertex, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx)
 SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx)
diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
index 2179c475abd..5e123f29f37 100644
--- a/src/compiler/shader_enums.c
+++ b/src/compiler/shader_enums.c
@@ -214,6 +214,7 @@ gl_system_value_name(gl_system_value sysval)
  ENUM(SYSTEM_VALUE_INSTANCE_ID),
  ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
  ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
+ ENUM(SYSTEM_VALUE_FIRST_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_VERTEX),
  ENUM(SYSTEM_VALUE_BASE_INSTANCE),
  ENUM(SYSTEM_VALUE_DRAW_ID),
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index ffe551ab20f..9f71194c146 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -472,6 +472,20 @@ typedef enum
 */
SYSTEM_VALUE_BASE_VERTEX,
 
+   /**
+* Depending on the type of the draw call (indexed or non-indexed),
+* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and
+* similar, or is the value of \c first passed to \c glDrawArrays and
+* similar.
+*
+* \note
+* It can be used to calculate the \c SYSTEM_VALUE_VERTEX_ID as
+* \c SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus \c SYSTEM_VALUE_FIRST_VERTEX.
+*
+* \sa SYSTEM_VALUE_VERTEX_ID_ZERO_BASE, SYSTEM_VALUE_VERT

[Mesa-dev] [PATCH v3 8/8] i965: gl_BaseVertex must be zero for non-indexed draw calls

2018-01-25 Thread Antia Puentes

We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID vertex
element. The previous Vertex Elements order was:

  * VE 1: 
  * VE 2: 

and now it is:

  * VE 1: 
  * VE 2: 

To move the BaseVertex keeping VE1 as it is, allows to keep pointing the vertex
buffer associated to VE 1 to the indirect buffer for indirect draw calls.

From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

  "gl_BaseVertex holds the integer value passed to the baseVertex parameter to 
the
  command that resulted in the current shader invocation. In the case where the
  command has no baseVertex parameter, the value of gl_BaseVertex is zero."

Fixes CTS tests:

  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
  * KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
  * 
KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
  * KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
---
 src/intel/compiler/brw_nir.c  | 14 +
 src/intel/compiler/brw_vec4.cpp   | 14 +
 src/mesa/drivers/dri/i965/brw_context.h   | 32 ++-
 src/mesa/drivers/dri/i965/brw_draw.c  | 45 ++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 24 --
 src/mesa/drivers/dri/i965/genX_state_upload.c | 38 +++---
 6 files changed, 105 insertions(+), 62 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 34b1e44adf0..c10fa73f4fc 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 */
const bool has_sgvs =
   nir->info.system_values_read &
-  (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+  (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 
nir_intrinsic_set_base(load, num_inputs);
switch (intrin->intrinsic) {
-   case nir_intrinsic_load_base_vertex:
case nir_intrinsic_load_first_vertex:
   nir_intrinsic_set_component(load, 0);
   break;
@@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   nir_intrinsic_set_component(load, 3);
   break;
case nir_intrinsic_load_draw_id:
-  /* gl_DrawID is stored right after gl_VertexID and friends
-   * if any of them exist.
+   case nir_intrinsic_load_base_vertex:
+  /* gl_DrawID and gl_BaseVertex are stored right after
+ gl_VertexID and friends if any of them exist.
*/
   nir_intrinsic_set_base(load, num_inputs + has_sgvs);
-  nir_intrinsic_set_component(load, 0);
+  if (intrin->intrinsic == nir_intrinsic_load_draw_id)
+ nir_intrinsic_set_component(load, 0);
+  else
+ nir_intrinsic_set_component(load, 1);
   break;
default:
   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 06c79630119..3b4b3c01b57 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2787,14 +2787,19 @@ brw_compile_vs(const struct brw_compiler *compiler, 
void *log_data,
 * incoming vertex attribute.  So, add an extra slot.
 */
if (shader->info.system_values_read &
-   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+   (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
   nr_attribute_slots++;
}
 
+   if (shader->info.system_values_read &
+   (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID))) {
+  nr_attribute_slots++;
+   }
+
if (shader->info.system_values_read &
BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX))
   prog_data->uses_basevertex = true;
@@ -2815,12 +2820,9 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))
   prog_data->uses_instanceid = true;
 
-   /* gl_DrawID has its very own vec4 */
if (shader->info.system_values_read &
-   BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) {
+   BITFIELD64_BIT(SYSTEM_VALUE_DRA

Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

2018-01-26 Thread Antia Puentes


On 25/01/18 18:56, Roland Scheidegger wrote:


Am 25.01.2018 um 17:56 schrieb Roland Scheidegger:

Am 25.01.2018 um 16:30 schrieb Michel Dänzer:

On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote:

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.
---
  src/mesa/main/fbobject.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index d23916d1ad7..c72204e11a0 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context *ctx, 
GLenum internalFormat)
 ctx->Extensions.ARB_texture_float) ||
_mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
   ? GL_RGBA : 0;
+   case GL_RGB9_E5:
+  return (_mesa_is_desktop_gl(ctx) && 
ctx->Extensions.EXT_texture_shared_exponent)
+ ? GL_RGB: 0;
 case GL_ALPHA16F_ARB:
 case GL_ALPHA32F_ARB:
return ctx->API == API_OPENGL_COMPAT &&


Unfortunately, this broke the "spec@arb_internalformat_query2@samples
and num_sample_counts pname checks" piglit tests with radeonsi and
llvmpipe, see below.

Any idea what might need to be done in Gallium to fix this?


 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
supported=1
 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), 
supported=1
 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}}
 32 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
 32 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 32 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_SAMPLES, target = GL_RENDERBUFFER, 
internalformat = GL_RGB9_E5, params[0] = (1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
 64 bit failing case: pname = GL_SAMPLES, target = 
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5, params[0] = 
(1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}}
PIGLIT: {"result": "fail" }



Purely coincidentally, I was trying to clean up the formatquery code
recently (should help some failures with r600 too), and I think these
cleanups would fix it.
Basically outright say "no" to target/pname combinations which don't
make sense rather than trying to find a format suitable for another
target and then asking the driver for the nonsense combination, plus
some other small bits like not validating things again (sometimes, a
third time...).
Albeit it will cause some breakage with the piglit test, which I believe
is a test error, but that might be open for debate...
(For TEXTURE_BUFFER and the internalformat size/type queries, do you
return valid values or unsupported? The problem here is ARB_tbo says you
can't get these values via the equivalent GetTexLevelParameter queries,
whereas with GL 3.1 you can. And internalformat_query2 says it returns
"the same information" as GetTexLevelParameter, albeit it's not entirely
true in any case since the equivalent of the internalformat stencil type
doesn't even exist. My stance would be that valid values should be
reported even without GL 3.1, but the piglit test thinks differently.)


Err, actually this won't fix it I suppose - because rgb9e5 now is a
valid fbo format. Was that commit really correct? It does not make sense
to me, rgb9e5 cannot be a fbo/renderable format. Or was this just
working around issues in formatquery.c (which I try to address with this
patch)?

I believe the commit is wrong, _mesa_base_fbo_format_ is used to 
validate the internalformat
passed to RenderbufferStorage, which in the OpenGL 4.6 is said that 
needs to be:


An INVALID_ENUM error is generated if internalformat is not one of the
color-renderable, depth-renderable, or stencil-renderable formats defined in
section 9.4.

and that's exactly the error returned by the

[Mesa-dev] [PATCH] Revert "mesa: add missing RGB9_E5 format in _mesa_base_fbo_format"

2018-01-26 Thread Antia Puentes

This reverts commit 513c2263cbff45edb105c7b46e58f316e06746ab.

_mesa_base_fbo_format_ is used to validate the internalformat
passed to RenderbufferStorage, which in the OpenGL 4.6 is said:

"An INVALID_ENUM error is generated if internalformat is not one of the
color-renderable, depth-renderable, or stencil-renderable formats defined
in section 9.4."

RGB9_E5 format is not renderable, as stated in the same specification
(Bug 9338).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104794

Cc: Juan A. Suarez Romero 
Cc: Kenneth Graunke 
---
 src/mesa/main/fbobject.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index c72204e11a0..d23916d1ad7 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1976,9 +1976,6 @@ _mesa_base_fbo_format(const struct gl_context *ctx, 
GLenum internalFormat)
ctx->Extensions.ARB_texture_float) ||
   _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
  ? GL_RGBA : 0;
-   case GL_RGB9_E5:
-  return (_mesa_is_desktop_gl(ctx) && 
ctx->Extensions.EXT_texture_shared_exponent)
- ? GL_RGB: 0;
case GL_ALPHA16F_ARB:
case GL_ALPHA32F_ARB:
   return ctx->API == API_OPENGL_COMPAT &&
-- 
2.14.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

2018-01-26 Thread Antia Puentes


Hi Roland,

On 26/01/18 13:57, Roland Scheidegger wrote:

Am 26.01.2018 um 11:31 schrieb Antia Puentes:

On 25/01/18 18:56, Roland Scheidegger wrote:


Am 25.01.2018 um 17:56 schrieb Roland Scheidegger:

Am 25.01.2018 um 16:30 schrieb Michel Dänzer:

On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote:

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.
---
   src/mesa/main/fbobject.c | 3 +++
   1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index d23916d1ad7..c72204e11a0 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context
*ctx, GLenum internalFormat)
  ctx->Extensions.ARB_texture_float) ||
     _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
    ? GL_RGBA : 0;
+   case GL_RGB9_E5:
+  return (_mesa_is_desktop_gl(ctx) &&
ctx->Extensions.EXT_texture_shared_exponent)
+ ? GL_RGB: 0;
  case GL_ALPHA16F_ARB:
  case GL_ALPHA32F_ARB:
     return ctx->API == API_OPENGL_COMPAT &&


Unfortunately, this broke the "spec@arb_internalformat_query2@samples
and num_sample_counts pname checks" piglit tests with radeonsi and
llvmpipe, see below.

Any idea what might need to be done in Gallium to fix this?


  32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
  64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}}
  32 bit failing case: pname = GL_SAMPLES, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  32 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  32 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
  64 bit failing case: pname = GL_SAMPLES, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  64 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
  64 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}}
PIGLIT: {"result": "fail" }



Purely coincidentally, I was trying to clean up the formatquery code
recently (should help some failures with r600 too), and I think these
cleanups would fix it.
Basically outright say "no" to target/pname combinations which don't
make sense rather than trying to find a format suitable for another
target and then asking the driver for the nonsense combination, plus
some other small bits like not validating things again (sometimes, a
third time...).
Albeit it will cause some breakage with the piglit test, which I believe
is a test error, but that might be open for debate...
(For TEXTURE_BUFFER and the internalformat size/type queries, do you
return valid values or unsupported? The problem here is ARB_tbo says you
can't get these values via the equivalent GetTexLevelParameter queries,
whereas with GL 3.1 you can. And internalformat_query2 says it returns
"the same information" as GetTexLevelParameter, albeit it's not entirely
true in any case since the equivalent of the internalformat stencil type
doesn't even exist. My stance would be that valid values should be
reported even without GL 3.1, but the piglit test thinks differently.)


Err, actually this won't fix it I suppose - because rgb9e5 now is a
valid fbo format. Was that commit really correct? It does not make sense
to me, rgb9e5 cannot be a fbo/renderable format. Or was this just
working around issues in formatquery.c (which I try to address with this
patch)?


I believe the commit is wrong, _mesa_base_fbo_format_ is used to
validate the internalformat
passed to RenderbufferStora

Re: [Mesa-dev] [PATCH] mesa: add missing RGB9_E5 format in _mesa_base_fbo_format

2018-01-26 Thread Antia Puentes



On 26/01/18 14:19, Roland Scheidegger wrote:

Am 26.01.2018 um 14:13 schrieb Antia Puentes:

Hi Roland,

On 26/01/18 13:57, Roland Scheidegger wrote:

Am 26.01.2018 um 11:31 schrieb Antia Puentes:

On 25/01/18 18:56, Roland Scheidegger wrote:


Am 25.01.2018 um 17:56 schrieb Roland Scheidegger:

Am 25.01.2018 um 16:30 schrieb Michel Dänzer:

On 2018-01-24 05:38 PM, Juan A. Suarez Romero wrote:

This fixes KHR-GL45.internalformat.renderbuffer.rgb9_e5.
---
    src/mesa/main/fbobject.c | 3 +++
    1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
index d23916d1ad7..c72204e11a0 100644
--- a/src/mesa/main/fbobject.c
+++ b/src/mesa/main/fbobject.c
@@ -1976,6 +1976,9 @@ _mesa_base_fbo_format(const struct gl_context
*ctx, GLenum internalFormat)
   ctx->Extensions.ARB_texture_float) ||
  _mesa_is_gles3(ctx) /* EXT_color_buffer_float */ )
     ? GL_RGBA : 0;
+   case GL_RGB9_E5:
+  return (_mesa_is_desktop_gl(ctx) &&
ctx->Extensions.EXT_texture_shared_exponent)
+ ? GL_RGB: 0;
   case GL_ALPHA16F_ARB:
   case GL_ALPHA32F_ARB:
  return ctx->API == API_OPENGL_COMPAT &&


Unfortunately, this broke the "spec@arb_internalformat_query2@samples
and num_sample_counts pname checks" piglit tests with radeonsi and
llvmpipe, see below.

Any idea what might need to be done in Gallium to fix this?


   32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   32 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
   64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   64 bit failing case: pname = GL_NUM_SAMPLE_COUNTS, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_NUM_SAMPLE_COUNTS" : "fail"}}
   32 bit failing case: pname = GL_SAMPLES, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   32 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   32 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
   64 bit failing case: pname = GL_SAMPLES, target =
GL_RENDERBUFFER, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   64 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE, internalformat = GL_RGB9_E5, params[0] =
(1,GL_TRUE), supported=1
   64 bit failing case: pname = GL_SAMPLES, target =
GL_TEXTURE_2D_MULTISAMPLE_ARRAY, internalformat = GL_RGB9_E5,
params[0] = (1,GL_TRUE), supported=1
PIGLIT: {"subtest": {"GL_SAMPLES" : "fail"}}
PIGLIT: {"result": "fail" }



Purely coincidentally, I was trying to clean up the formatquery code
recently (should help some failures with r600 too), and I think these
cleanups would fix it.
Basically outright say "no" to target/pname combinations which don't
make sense rather than trying to find a format suitable for another
target and then asking the driver for the nonsense combination, plus
some other small bits like not validating things again (sometimes, a
third time...).
Albeit it will cause some breakage with the piglit test, which I
believe
is a test error, but that might be open for debate...
(For TEXTURE_BUFFER and the internalformat size/type queries, do you
return valid values or unsupported? The problem here is ARB_tbo
says you
can't get these values via the equivalent GetTexLevelParameter
queries,
whereas with GL 3.1 you can. And internalformat_query2 says it returns
"the same information" as GetTexLevelParameter, albeit it's not
entirely
true in any case since the equivalent of the internalformat stencil
type
doesn't even exist. My stance would be that valid values should be
reported even without GL 3.1, but the piglit test thinks differently.)


Err, actually this won't fix it I suppose - because rgb9e5 now is a
valid fbo format. Was that commit really correct? It does not make
sense
to me, rgb9e5 cannot be a fbo/renderable format. Or was this just
working around issues in formatquery.c (which I try to address with
this
patch)?


I beli

Re: [Mesa-dev] [PATCH] i965: perform 2 uploads with dual slot 64PASSTHRU formats on gen<8

2018-01-29 Thread Antia Puentes


Hi,

the patch looks good to me. Some small comments:

On 29/01/18 01:26, Andres Gomez wrote:

The emission of vertex attributes corresponding to dvec3 and dvec4
vertex shader input variables was not correct when the  passed
to the VertexAttribL* commands was <= 2.

In 61a8a55f557 ("i965/gen8: Fix vertex attrib upload for dvec3/4
shader inputs"), for gen8+ we needed to determine if the attrib was
dual slot to emit 128 or 256-bit, independently of the VAO size.

Similarly, for gen < 8 we also need to determine whether the attrib is
dual slot to force the emission of 265-bits through 2 uploads.

    ^^^
    256-bits

Additionally, we make use of the ISL_FORMAT_R32_FLOAT format in this
second upload to fill these unspecified components with zeros, as we
also do for gen8+.

Fixes the following test on Haswell:
KHR-GL46.vertex_attrib_binding.basic-inputL-case1

Fixes: 75968a668e4 ("i965/gen7: expose OpenGL 4.2 on Haswell when
supported")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103006
Cc: Alejandro Piñeiro 
Cc: Juan A. Suarez Romero 
Cc: Antia Puentes 
Cc: Rafael Antognolli 
Cc: Kenneth Graunke 
Signed-off-by: Andres Gomez 
---
  src/mesa/drivers/dri/i965/genX_state_upload.c | 19 ++-
  1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c 
b/src/mesa/drivers/dri/i965/genX_state_upload.c
index aa4d64d08e2..d4c89c05b41 100644
--- a/src/mesa/drivers/dri/i965/genX_state_upload.c
+++ b/src/mesa/drivers/dri/i965/genX_state_upload.c
@@ -364,11 +364,15 @@ is_passthru_format(uint32_t format)
  }
  
  UNUSED static int

-uploads_needed(uint32_t format)
+uploads_needed(uint32_t format,
+  bool is_dual_slot)
  {
 if (!is_passthru_format(format))
return 1;
  
+   if (is_dual_slot)

+  return 2;
+
 switch (format) {
 case ISL_FORMAT_R64_PASSTHRU:
 case ISL_FORMAT_R64G64_PASSTHRU:
@@ -399,9 +403,11 @@ downsize_format_if_needed(uint32_t format,
  
 switch (format) {

 case ISL_FORMAT_R64_PASSTHRU:
-  return ISL_FORMAT_R32G32_FLOAT;
+  return !upload ? ISL_FORMAT_R32G32_FLOAT
+ : ISL_FORMAT_R32_FLOAT;
 case ISL_FORMAT_R64G64_PASSTHRU:
-  return ISL_FORMAT_R32G32B32A32_FLOAT;
+  return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
+ : ISL_FORMAT_R32_FLOAT;
 case ISL_FORMAT_R64G64B64_PASSTHRU:
return !upload ? ISL_FORMAT_R32G32B32A32_FLOAT
   : ISL_FORMAT_R32G32_FLOAT;
@@ -420,6 +426,8 @@ static int
  upload_format_size(uint32_t upload_format)
  {
 switch (upload_format) {
+   case ISL_FORMAT_R32_FLOAT:
+  return 0;

I agree with Piñeiro about at least adding a comment here.

 case ISL_FORMAT_R32G32_FLOAT:
return 2;
 case ISL_FORMAT_R32G32B32A32_FLOAT:
@@ -517,7 +525,7 @@ genX(emit_vertices)(struct brw_context *brw)
struct brw_vertex_element *input = brw->vb.enabled[i];
uint32_t format = brw_get_vertex_surface_type(brw, input->glarray);
  
-  if (uploads_needed(format) > 1)

+  if (uploads_needed(format, input->is_dual_slot) > 1)
   nr_elements++;
 }
  #endif
@@ -613,7 +621,8 @@ genX(emit_vertices)(struct brw_context *brw)
uint32_t comp1 = VFCOMP_STORE_SRC;
uint32_t comp2 = VFCOMP_STORE_SRC;
uint32_t comp3 = VFCOMP_STORE_SRC;
-  const unsigned num_uploads = GEN_GEN < 8 ? uploads_needed(format) : 1;
+  const unsigned num_uploads = GEN_GEN < 8 ?
+uploads_needed(format, input->is_dual_slot) : 1;
  
  #if GEN_GEN >= 8

/* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):

Thanks for fixing this.

Reviewed-by: Antia Puentes 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 2/8] compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsics

2018-02-07 Thread Antia Puentes


This series is still waiting for review.


On 25/01/18 19:15, Antia Puentes wrote:

This VS system value will contain the value passed as 
for indexed draw calls or the value passed as  for non-indexed
draw calls. It can be used to calculate the gl_VertexID as
SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX.

 From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays":

-  Page 352:
"The index of any element transferred to the GL by DrawArraysOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is first + i."

   - Page 355:
"The index of any element transferred to the GL by DrawElementsOneInstance is
referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID.
The vertex ID of the ith element transferred is the sum of basevertex and the
value stored in the currently bound element array buffer at offset indices + i."

Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this 
will
have to change when the value of gl_BaseVertex is fixed. Currently its value is
broken for non-indexed draw calls because it must be zero but we are setting it
to .

v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of
SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth).

Reviewed-by: Neil Roberts 
Reviewed-by: Kenneth Graunke 
---
  src/compiler/nir/nir.c |  4 
  src/compiler/nir/nir_gather_info.c |  1 +
  src/compiler/nir/nir_intrinsics.h  |  1 +
  src/compiler/shader_enums.c|  1 +
  src/compiler/shader_enums.h| 14 ++
  5 files changed, 21 insertions(+)

diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
index bdd8960403c..e69c2accbbf 100644
--- a/src/compiler/nir/nir.c
+++ b/src/compiler/nir/nir.c
@@ -1919,6 +1919,8 @@ nir_intrinsic_from_system_value(gl_system_value val)
return nir_intrinsic_load_base_instance;
 case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_FIRST_VERTEX:
+  return nir_intrinsic_load_first_vertex;
 case SYSTEM_VALUE_BASE_VERTEX:
return nir_intrinsic_load_base_vertex;
 case SYSTEM_VALUE_INVOCATION_ID:
@@ -1990,6 +1992,8 @@ nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
return SYSTEM_VALUE_BASE_INSTANCE;
 case nir_intrinsic_load_vertex_id_zero_base:
return SYSTEM_VALUE_VERTEX_ID_ZERO_BASE;
+   case nir_intrinsic_load_first_vertex:
+  return SYSTEM_VALUE_FIRST_VERTEX;
 case nir_intrinsic_load_base_vertex:
return SYSTEM_VALUE_BASE_VERTEX;
 case nir_intrinsic_load_invocation_id:
diff --git a/src/compiler/nir/nir_gather_info.c 
b/src/compiler/nir/nir_gather_info.c
index 946939657ec..555ae77b1d3 100644
--- a/src/compiler/nir/nir_gather_info.c
+++ b/src/compiler/nir/nir_gather_info.c
@@ -247,6 +247,7 @@ gather_intrinsic_info(nir_intrinsic_instr *instr, 
nir_shader *shader)
 case nir_intrinsic_load_vertex_id:
 case nir_intrinsic_load_vertex_id_zero_base:
 case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
 case nir_intrinsic_load_base_instance:
 case nir_intrinsic_load_instance_id:
 case nir_intrinsic_load_sample_id:
diff --git a/src/compiler/nir/nir_intrinsics.h 
b/src/compiler/nir/nir_intrinsics.h
index ede29277876..7d3421f0e30 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -333,6 +333,7 @@ SYSTEM_VALUE(frag_coord, 4, 0, xx, xx, xx)
  SYSTEM_VALUE(front_face, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(vertex_id, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(vertex_id_zero_base, 1, 0, xx, xx, xx)
+SYSTEM_VALUE(first_vertex, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(base_vertex, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(instance_id, 1, 0, xx, xx, xx)
  SYSTEM_VALUE(base_instance, 1, 0, xx, xx, xx)
diff --git a/src/compiler/shader_enums.c b/src/compiler/shader_enums.c
index 2179c475abd..5e123f29f37 100644
--- a/src/compiler/shader_enums.c
+++ b/src/compiler/shader_enums.c
@@ -214,6 +214,7 @@ gl_system_value_name(gl_system_value sysval)
   ENUM(SYSTEM_VALUE_INSTANCE_ID),
   ENUM(SYSTEM_VALUE_INSTANCE_INDEX),
   ENUM(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE),
+ ENUM(SYSTEM_VALUE_FIRST_VERTEX),
   ENUM(SYSTEM_VALUE_BASE_VERTEX),
   ENUM(SYSTEM_VALUE_BASE_INSTANCE),
   ENUM(SYSTEM_VALUE_DRAW_ID),
diff --git a/src/compiler/shader_enums.h b/src/compiler/shader_enums.h
index ffe551ab20f..9f71194c146 100644
--- a/src/compiler/shader_enums.h
+++ b/src/compiler/shader_enums.h
@@ -472,6 +472,20 @@ typedef enum
  */
 SYSTEM_VALUE_BASE_VERTEX,
  
+   /**

+* Depending on the type of the draw call (indexed or non-indexed),
+* is the value of \c basevertex passed to \c glDrawElementsBaseVertex and
+* similar, or is the value of \c first passed to \c glDrawArrays and
+* similar.
+*
+* \note
+* It can be used to calculate the \c SYSTE

[Mesa-dev] [PATCH 05/15] i965/fs: shuffle 32bits into 64bits for doubles

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

VS Thread Payload handles attributes in URB as vec4, no matter if they
are actually single or double precision.

So with double-precision types, value ends up in the registers split in
32bits chunks, in different positions.

We need to shuffle the chunks to get the doubles correctly.
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 0ff3eaf..4362308 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -3173,6 +3173,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
   for (unsigned j = 0; j < instr->num_components; j++) {
  bld.MOV(offset(dest, bld, j), offset(src, bld, j));
   }
+  if (type_sz(src.type) == 8)
+ SHUFFLE_32BIT_LOAD_RESULT_TO_64BIT_DATA(bld,
+ offset(dest, bld, 0),
+ offset(dest, bld, 0),
+ instr->num_components);
+
   break;
}
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/15] i965/fs: calculate first non-payload GRF using attrib slots

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

When computing where the first non-payload GRF starts, we can't rely on
the number of attributes, as each attribute can be using 1 or 2 slots
depending on whether they are a dvec3/4 or other.

Instead, we need to use the number of slots used by the attributes.
---
 src/mesa/drivers/dri/i965/brw_compiler.h | 1 +
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
b/src/mesa/drivers/dri/i965/brw_compiler.h
index bc15bf3..51cf850 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -615,6 +615,7 @@ struct brw_vs_prog_data {
GLbitfield64 inputs_read;
 
unsigned nr_attributes;
+   unsigned nr_attribute_slots;
 
bool uses_vertexid;
bool uses_instanceid;
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 98cbf9f..5e20ef9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1885,7 +1885,7 @@ fs_visitor::assign_vs_urb_setup()
assert(stage == MESA_SHADER_VERTEX);
 
/* Each attribute is 4 regs. */
-   this->first_non_payload_grf += 4 * vs_prog_data->nr_attributes;
+   this->first_non_payload_grf += 4 * vs_prog_data->nr_attribute_slots;
 
assert(vs_prog_data->base.urb_read_length <= 15);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 005760d..9816f0d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -2149,6 +2149,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
  DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2);
 
prog_data->nr_attributes = nr_attributes;
+   prog_data->nr_attribute_slots = nr_attribute_slots;
 
/* Since vertex shaders reuse the same VUE entry for inputs and outputs
 * (overwriting the original contents), we need to make sure the size is
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/15] nir: add double input bitmap

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

This bitmap tracks which input attributes are double-precision.
---
 src/compiler/nir/glsl_to_nir.cpp | 1 +
 src/compiler/nir/nir.h   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/src/compiler/nir/glsl_to_nir.cpp b/src/compiler/nir/glsl_to_nir.cpp
index fafa8bb..b6d25cd 100644
--- a/src/compiler/nir/glsl_to_nir.cpp
+++ b/src/compiler/nir/glsl_to_nir.cpp
@@ -154,6 +154,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
shader->info.num_ssbos = sh->NumShaderStorageBlocks;
shader->info.num_images = sh->NumImages;
shader->info.inputs_read = sh->Program->InputsRead;
+   shader->info.double_inputs_read = sh->Program->DoubleInputsRead;
shader->info.outputs_written = sh->Program->OutputsWritten;
shader->info.patch_inputs_read = sh->Program->PatchInputsRead;
shader->info.patch_outputs_written = sh->Program->PatchOutputsWritten;
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index a3ac57d..4f773f1 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1693,6 +1693,8 @@ typedef struct nir_shader_info {
 
/* Which inputs are actually read */
uint64_t inputs_read;
+   /* Which inputs are actually read and are double */
+   uint64_t double_inputs_read;
/* Which outputs are actually written */
uint64_t outputs_written;
/* Which system values are actually read */
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 03/15] i965: passthru formats cannot be used width edge flag enabled

2016-04-28 Thread Antia Puentes

From: Alejandro Piñeiro 

Add an assertion to detect this case.
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index 14f7091..17a0a83 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -34,6 +34,20 @@
 #include "intel_batchbuffer.h"
 #include "intel_buffer_objects.h"
 
+static bool
+is_passthru_format(uint32_t format)
+{
+   switch (format) {
+   case BRW_SURFACEFORMAT_R64_PASSTHRU:
+   case BRW_SURFACEFORMAT_R64G64_PASSTHRU:
+   case BRW_SURFACEFORMAT_R64G64B64_PASSTHRU:
+   case BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU:
+  return true;
+   default:
+  return false;
+   }
+}
+
 static void
 gen8_emit_vertices(struct brw_context *brw)
 {
@@ -193,6 +207,12 @@ gen8_emit_vertices(struct brw_context *brw)
   uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC;
   uint32_t comp3 = BRW_VE1_COMPONENT_STORE_SRC;
 
+  /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):
+   * "Any SourceElementFormat of *64*_PASSTHRU cannot be used with an
+   * element which has edge flag enabled."
+   */
+  assert(!(is_passthru_format(format) && uses_edge_flag));
+
   /* The gen4 driver expects edgeflag to come in as a float, and passes
* that float on to the tests in the clipper.  Mesa's current vertex
* attribute value for EdgeFlag is stored as a float, which works out.
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/15] i965/vec4: use attribute slots to calculate URB read length

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

Do not use total attributes because a dvec3/dvec4 attribute requires two
slots. So rather use total attribute slots.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 4766be0..005760d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -2107,6 +2107,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
const unsigned *assembly = NULL;
 
unsigned nr_attributes = _mesa_bitcount_64(prog_data->inputs_read);
+   unsigned nr_attribute_slots = 0;
 
/* gl_VertexID and gl_InstanceID are system values, but arrive via an
 * incoming vertex attribute.  So, add an extra slot.
@@ -2117,11 +2118,23 @@ brw_compile_vs(const struct brw_compiler *compiler, 
void *log_data,
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
   nr_attributes++;
+  nr_attribute_slots++;
}
 
/* gl_DrawID has its very own vec4 */
if (shader->info.system_values_read & BITFIELD64_BIT(SYSTEM_VALUE_DRAW_ID)) 
{
   nr_attributes++;
+  nr_attribute_slots++;
+   }
+
+   GLbitfield64 processed_attributes = 0;
+   foreach_list_typed(nir_variable, var, node, &shader->inputs) {
+  /* Only interested in values not already processed */
+  if (processed_attributes & BITFIELD64_BIT(var->data.location))
+ continue;
+
+  nr_attribute_slots += glsl_count_attribute_slots(var->type, false);
+  processed_attributes |= BITFIELD64_BIT(var->data.location);
}
 
/* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
@@ -2129,9 +2142,11 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
 * vec4 mode, the hardware appears to wedge unless we read something.
 */
if (is_scalar)
-  prog_data->base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2);
+  prog_data->base.urb_read_length =
+ DIV_ROUND_UP(nr_attribute_slots, 2);
else
-  prog_data->base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 
2);
+  prog_data->base.urb_read_length =
+ DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2);
 
prog_data->nr_attributes = nr_attributes;
 
@@ -2140,7 +2155,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
 * the larger of the two.
 */
const unsigned vue_entries =
-  MAX2(nr_attributes, (unsigned)prog_data->base.vue_map.num_slots);
+  MAX2(nr_attribute_slots, (unsigned)prog_data->base.vue_map.num_slots);
 
if (compiler->devinfo->gen == 6)
   prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8);
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 13/15] i965: Enable ARB_vertex_attrib_64bit for gen8+

2016-04-28 Thread Antia Puentes

From: Alejandro Piñeiro 

---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 412dea0..d3905d0 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -371,6 +371,7 @@ intelInitExtensions(struct gl_context *ctx)
if (brw->gen >= 8) {
   ctx->Extensions.ARB_stencil_texturing = true;
   ctx->Extensions.ARB_gpu_shader_fp64 = true;
+  ctx->Extensions.ARB_vertex_attrib_64bit = true;
}
 
if (brw->gen >= 9) {
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 11/15] i965: abort linking if we exhaust the registers

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

Even when the number of vertex attributes is under the limit, for
shaders that use a high number of them, we can quickly exhaust the
number of hardware registers.

In this case, just abort the linking.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4b8835d..387a266 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1857,6 +1857,12 @@ fs_visitor::convert_attr_sources_to_hw_regs(fs_inst 
*inst)
inst->src[i].nr +
inst->src[i].reg_offset;
 
+ if (grf >= 128) {
+fail("Failure to register allocate.  Reduce the number of "
+ "vertex input attributes to avoid this.");
+return;
+ }
+
  unsigned exec_size;
  /* As explained at brw_reg_from_fs_reg, From the Haswell PRM:
   *
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 01/15] i965: get the proper vertex surface type for doubles on gen8+

2016-04-28 Thread Antia Puentes

From: Alejandro Piñeiro 

This commit adds support for PASSTHRU format when pushing
double-precision attributes.

Check glarray->Doubles in order to know if we should choose a format
that does a conversion to float, or just passthru the 64-bit double.
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 30 ++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 58e0516..5af4583 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -37,7 +37,7 @@
 #include "intel_batchbuffer.h"
 #include "intel_buffer_objects.h"
 
-static const GLuint double_types[5] = {
+static const GLuint double_types_float[5] = {
0,
BRW_SURFACEFORMAT_R64_FLOAT,
BRW_SURFACEFORMAT_R64G64_FLOAT,
@@ -45,6 +45,14 @@ static const GLuint double_types[5] = {
BRW_SURFACEFORMAT_R64G64B64A64_FLOAT
 };
 
+static const GLuint double_types_passthru[5] = {
+   0,
+   BRW_SURFACEFORMAT_R64_PASSTHRU,
+   BRW_SURFACEFORMAT_R64G64_PASSTHRU,
+   BRW_SURFACEFORMAT_R64G64B64_PASSTHRU,
+   BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU
+};
+
 static const GLuint float_types[5] = {
0,
BRW_SURFACEFORMAT_R32_FLOAT,
@@ -213,6 +221,22 @@ static const GLuint byte_types_scale[5] = {
BRW_SURFACEFORMAT_R8G8B8A8_SSCALED
 };
 
+static GLuint
+double_types(struct brw_context *brw,
+ int size,
+ GLboolean doubles)
+{
+   /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):
+* "When SourceElementFormat is set to one of the *64*_PASSTHRU formats,
+* 64-bit components are stored in the URB without any conversion."
+* Also included on BDW PRM, Volume 7, page 470, table "Source Element
+* Formats Supported in VF Unit"
+* Previous PRMs don't include those references.
+*/
+   return (brw->gen >= 8 && doubles
+   ? double_types_passthru[size]
+   : double_types_float[size]);
+}
 
 /**
  * Given vertex array type/size/format/normalized info, return
@@ -245,7 +269,7 @@ brw_get_vertex_surface_type(struct brw_context *brw,
   return BRW_SURFACEFORMAT_R11G11B10_FLOAT;
} else if (glarray->Normalized) {
   switch (glarray->Type) {
-  case GL_DOUBLE: return double_types[size];
+  case GL_DOUBLE: return double_types(brw, size, glarray->Doubles);
   case GL_FLOAT: return float_types[size];
   case GL_HALF_FLOAT: return half_float_types[size];
   case GL_INT: return int_types_norm[size];
@@ -319,7 +343,7 @@ brw_get_vertex_surface_type(struct brw_context *brw,
   }
   assert(glarray->Format == GL_RGBA); /* sanity check */
   switch (glarray->Type) {
-  case GL_DOUBLE: return double_types[size];
+  case GL_DOUBLE: return double_types(brw, size, glarray->Doubles);
   case GL_FLOAT: return float_types[size];
   case GL_HALF_FLOAT: return half_float_types[size];
   case GL_INT: return int_types_scale[size];
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 15/15] i965: Expose OpenGL 4.1 for gen8+

2016-04-28 Thread Antia Puentes

From: Alejandro Piñeiro 

ARB_vertex_attrib_64bit was the only feature missing.
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
 src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index d3905d0..0361d42 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -268,7 +268,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.OES_texture_half_float_linear = true;
 
if (brw->gen >= 8)
-  ctx->Const.GLSLVersion = 400;
+  ctx->Const.GLSLVersion = 410;
else if (brw->gen >= 6)
   ctx->Const.GLSLVersion = 330;
else
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 6f685b1..f2ec450 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1386,7 +1386,7 @@ set_max_gl_versions(struct intel_screen *screen)
switch (screen->devinfo->gen) {
case 9:
case 8:
-  psp->max_gl_core_version = 40;
+  psp->max_gl_core_version = 41;
   psp->max_gl_compat_version = 30;
   psp->max_gl_es1_version = 11;
   psp->max_gl_es2_version = 31;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 00/15] Add ARB_vertex_attrib_64bit for i965 scalar (gen8+)

2016-04-28 Thread Antia Puentes

Hello,

the following series adds the implementation for the ARB_vertex_attrib_64bit
extension on the i965 scalar backend (gen8+). It is the result of working on
https://bugs.freedesktop.org/show_bug.cgi?id=94442.

As this work depends on the ARB_gpu_shader_fp64 i965 functionality [1],
which is work in progress, the aim of sending the series now is to get
early feedback and parallelise the review process.

The series applies on top of the current version of the
https://github.com/Igalia/mesa/tree/i965-fp64 branch,
which contains the last fp64 work for gen8.

A frozen version of the fp64 branch + the series is available in our repo:
$ git clone -b i965-attrib64-v1 https://github.com/Igalia/mesa.git

[1] https://bugs.freedesktop.org/show_bug.cgi?id=92760


Alejandro Piñeiro (6):
  i965: get the proper vertex surface type for doubles on gen8+
  i965: passthru formats cannot be used width edge flag enabled
  i965/fs: half exec_size when dealing with 64 bits attributes
  i965: Enable ARB_vertex_attrib_64bit for gen8+
  docs: Mark ARB_vertex_attrib_64bit as done for i965
  i965: Expose OpenGL 4.1 for gen8+

Antia Puentes (1):
  i965: Configure how to store *64*PASSTHRU vertex components

Juan A. Suarez Romero (8):
  i965/fs: shuffle 32bits into 64bits for doubles
  nir: add double input bitmap
  i965: take care of doubles when remapping VS attributes
  i965/vec4: use attribute slots to calculate URB read length
  i965/fs: calculate first non-payload GRF using attrib slots
  i965: take care of doubles when lowering VS inputs
  i965: abort linking if we exhaust the registers
  i965: abort linking if URB read length greater than 15

 docs/GL3.txt |  2 +-
 src/compiler/nir/glsl_to_nir.cpp |  1 +
 src/compiler/nir/nir.h   |  2 ++
 src/mesa/drivers/dri/i965/brw_compiler.h |  1 +
 src/mesa/drivers/dri/i965/brw_draw_upload.c  | 30 ++--
 src/mesa/drivers/dri/i965/brw_fs.cpp | 44 +--
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  6 
 src/mesa/drivers/dri/i965/brw_nir.c  | 29 +++
 src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp   | 30 ++--
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 54 
 src/mesa/drivers/dri/i965/intel_extensions.c |  3 +-
 src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
 13 files changed, 177 insertions(+), 28 deletions(-)

-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 12/15] i965: abort linking if URB read length greater than 15

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

In scalar mode, URB read length limit is 15. Abort if we go beyond it.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9816f0d..04287fb 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -2148,6 +2148,14 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
   prog_data->base.urb_read_length =
  DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2);
 
+   if (is_scalar && prog_data->base.urb_read_length > 15) {
+  if (error_str)
+ *error_str = ralloc_strdup(mem_ctx,
+"Too many attributes. Try to reduce the "
+"number of attributes or their size");
+  return NULL;
+   }
+
prog_data->nr_attributes = nr_attributes;
prog_data->nr_attribute_slots = nr_attribute_slots;
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 14/15] docs: Mark ARB_vertex_attrib_64bit as done for i965

2016-04-28 Thread Antia Puentes

From: Alejandro Piñeiro 

---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 95ee508..3f1f7a3 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -142,7 +142,7 @@ GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_get_program_binary DONE (0 binary formats)
   GL_ARB_separate_shader_objectsDONE (all drivers)
   GL_ARB_shader_precision   DONE (all drivers that 
support GLSL 4.10)
-  GL_ARB_vertex_attrib_64bitDONE (llvmpipe, 
softpipe)
+  GL_ARB_vertex_attrib_64bitDONE (i965, llvmpipe, 
softpipe)
   GL_ARB_viewport_array DONE (i965, nv50, 
llvmpipe, softpipe)
 
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 04/15] i965/fs: half exec_size when dealing with 64 bits attributes

2016-04-28 Thread Antia Puentes

From: Alejandro Piñeiro 

The HW has a restriction that only vertical stride may cross register
boundaries. Until now this was only handled on VGRFs at
rw_reg_from_fs_reg, but it is also needed for attributes.

This can be seen as the equivalent of commit 552cfa9 but for
attributes. Take a look to the git message on that commit for further
info.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ce84d0a..98cbf9f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1844,11 +1844,30 @@ fs_visitor::convert_attr_sources_to_hw_regs(fs_inst 
*inst)
inst->src[i].nr +
inst->src[i].reg_offset;
 
- unsigned width = inst->src[i].stride == 0 ? 1 : inst->exec_size;
+ unsigned exec_size;
+ /* As explained at brw_reg_from_fs_reg, From the Haswell PRM:
+  *
+  * VertStride must be used to cross GRF register boundaries. This
+  * rule implies that elements within a 'Width' cannot cross GRF
+  * boundaries.
+  *
+  * So, for registers that are large enough, we have to split the exec
+  * size in two and trust the compression state to sort it out.
+  */
+ unsigned total_size = inst->exec_size *
+   inst->src[i].stride *
+   type_sz(inst->src[i].type);
+ if (total_size <= 32) {
+exec_size = inst->exec_size;
+ } else {
+assert(total_size / 2 <= 32);
+exec_size = inst->exec_size / 2;
+ }
+ unsigned width = inst->src[i].stride == 0 ? 1 : exec_size;
  struct brw_reg reg =
 stride(byte_offset(retype(brw_vec8_grf(grf, 0), inst->src[i].type),
inst->src[i].subreg_offset),
-   inst->exec_size * inst->src[i].stride,
+   exec_size * inst->src[i].stride,
width, inst->src[i].stride);
  reg.abs = inst->src[i].abs;
  reg.negate = inst->src[i].negate;
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/15] i965: take care of doubles when remapping VS attributes

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots 
for
anything else.
---
 src/mesa/drivers/dri/i965/brw_nir.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 42cfbaa..1d14437 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -103,7 +103,7 @@ add_const_offset_to_base(nir_shader *nir, nir_variable_mode 
mode)
 static bool
 remap_vs_attrs(nir_block *block, void *closure)
 {
-   GLbitfield64 inputs_read = *((GLbitfield64 *) closure);
+   struct nir_shader_info *nir_info = (struct nir_shader_info *) closure;
 
nir_foreach_instr(block, instr) {
   if (instr->type != nir_instr_type_intrinsic)
@@ -118,9 +118,11 @@ remap_vs_attrs(nir_block *block, void *closure)
   * before it and counting the bits.
   */
  int attr = intrin->const_index[0];
- int slot = _mesa_bitcount_64(inputs_read & BITFIELD64_MASK(attr));
-
- intrin->const_index[0] = 4 * slot;
+ int slot = _mesa_bitcount_64(nir_info->inputs_read &
+  BITFIELD64_MASK(attr));
+ int dslot = _mesa_bitcount_64(nir_info->double_inputs_read &
+   BITFIELD64_MASK(attr));
+ intrin->const_index[0] = 4 * (slot + dslot);
   }
}
return true;
@@ -214,9 +216,9 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   var->data.driver_location = var->data.location;
}
 
-   /* Now use nir_lower_io to walk dereference chains.  Attribute arrays
-* are loaded as one vec4 per element (or matrix column), so we use
-* type_size_vec4 here.
+   /* Now use nir_lower_io to walk dereference chains.  Attribute arrays are
+* loaded as one vec4 or dvec4 per element (or matrix column), depending on
+* whether it is a double-precision type or not.
 */
nir_lower_io(nir, nir_var_shader_in, type_size_vec4);
 
@@ -229,17 +231,11 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
vs_attrib_wa_flags);
 
if (is_scalar) {
-  /* Finally, translate VERT_ATTRIB_* values into the actual registers.
-   *
-   * Note that we can use nir->info.inputs_read instead of
-   * key->inputs_read since the two are identical aside from Gen4-5
-   * edge flag differences.
-   */
-  GLbitfield64 inputs_read = nir->info.inputs_read;
+  /* Finally, translate VERT_ATTRIB_* values into the actual registers. */
 
   nir_foreach_function(nir, function) {
  if (function->impl) {
-nir_foreach_block_call(function->impl, remap_vs_attrs, 
&inputs_read);
+nir_foreach_block_call(function->impl, remap_vs_attrs, &nir->info);
  }
   }
}
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 02/15] i965: Configure how to store 64PASSTHRU vertex components

2016-04-28 Thread Antia Puentes

From the Broadwell specification, structure VERTEX_ELEMENT_STATE
description:

   "When SourceElementFormat is set to one of the *64*_PASSTHRU
formats,  64-bit components are stored in the URB without any
conversion. In this case, vertex elements must be written as 128
or 256 bits, with VFCOMP_STORE_0 being used to pad the output
as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component 
into
the URB, Component 1 must be specified as VFCOMP_STORE_0 (with
Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit
vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0
in order to output a 256-bit vertex element. Likewise, use of
R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0
in order to output a 256-bit vertex element."

Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for
dvec3 and dvec4 vertex elements.

Signed-off-by: Juan A. Suarez Romero 
Signed-off-by: Antia Puentes 
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 
 1 file changed, 34 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index fe5ed35..14f7091 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -217,6 +217,40 @@ gen8_emit_vertices(struct brw_context *brw)
  break;
   }
 
+  /* From the BDW PRM, Volume 2d, page 586 (VERTEX_ELEMENT_STATE):
+   *
+   * "When SourceElementFormat is set to one of the *64*_PASSTHRU
+   * formats, 64-bit components are stored in the URB without any
+   * conversion. In this case, vertex elements must be written as 128
+   * or 256 bits, with VFCOMP_STORE_0 being used to pad the output
+   * as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red
+   * component into the URB, Component 1 must be specified as
+   * VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE)
+   * in order to output a 128-bit vertex element, or Components 1-3 
must
+   * be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex
+   * element. Likewise, use of R64G64B64_PASSTHRU requires Component 3
+   * to be specified as VFCOMP_STORE_0 in order to output a 256-bit 
vertex
+   * element."
+   */
+  if (input->glarray->Doubles) {
+ switch (input->glarray->Size) {
+ case 0:
+ case 1:
+ case 2:
+/*  Use 128-bits instead of 256-bits to write double and dvec2
+ *  vertex elements.
+ */
+comp2 = BRW_VE1_COMPONENT_NOSTORE;
+comp3 = BRW_VE1_COMPONENT_NOSTORE;
+break;
+ case 3:
+/* Pad the output using VFCOMP_STORE_0 as suggested
+ * by the BDW PRM.
+ */
+comp3 = BRW_VE1_COMPONENT_STORE_0;
+ }
+  }
+
   OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) |
 GEN6_VE0_VALID |
 (format << BRW_VE0_FORMAT_SHIFT) |
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/15] i965: take care of doubles when lowering VS inputs

2016-04-28 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

Input attributes can require 2 vec4 or 1 vec4 depending on whether they
are double-precision or not.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 13 +
 src/mesa/drivers/dri/i965/brw_nir.c|  3 ++-
 src/mesa/drivers/dri/i965/brw_shader.h |  1 +
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5e20ef9..4b8835d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -645,6 +645,19 @@ type_size_vec4_times_4(const struct glsl_type *type)
return 4 * type_size_vec4(type);
 }
 
+/* Attribute arrays are loaded as one vec4 per element (or matrix column),
+ * except for double-precision types, which are loaded as one dvec4.
+ */
+extern "C" int
+type_size_vs_input(const struct glsl_type *type)
+{
+   if (type->is_double()) {
+  return type_size_vec4(type) / 2;
+   } else {
+  return type_size_vec4(type);
+   }
+}
+
 /**
  * Create a MOV to read the timestamp register.
  *
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 1d14437..8b7cd8e 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -23,6 +23,7 @@
 
 #include "brw_nir.h"
 #include "brw_shader.h"
+#include "compiler/glsl_types.h"
 #include "compiler/nir/glsl_to_nir.h"
 #include "compiler/nir/nir_builder.h"
 #include "program/prog_to_nir.h"
@@ -220,7 +221,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 * loaded as one vec4 or dvec4 per element (or matrix column), depending on
 * whether it is a double-precision type or not.
 */
-   nir_lower_io(nir, nir_var_shader_in, type_size_vec4);
+   nir_lower_io(nir, nir_var_shader_in, type_size_vs_input);
 
/* This pass needs actual constants */
nir_opt_constant_folding(nir);
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index f6f6167..8af058b 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -292,6 +292,7 @@ struct gl_shader *brw_new_shader(struct gl_context *ctx, 
GLuint name, GLuint typ
 int type_size_scalar(const struct glsl_type *type);
 int type_size_vec4(const struct glsl_type *type);
 int type_size_vec4_times_4(const struct glsl_type *type);
+int type_size_vs_input(const struct glsl_type *type);
 
 unsigned tesslevel_outer_components(GLenum tes_primitive_mode);
 unsigned tesslevel_inner_components(GLenum tes_primitive_mode);
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 10/15] i965: take care of doubles when lowering VS inputs

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

Input attributes can require 2 vec4 or 1 vec4 depending on whether they
are double-precision or not.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 13 +
 src/mesa/drivers/dri/i965/brw_nir.c|  3 ++-
 src/mesa/drivers/dri/i965/brw_shader.h |  1 +
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 970846c..2e3cad9 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -638,6 +638,19 @@ type_size_vec4_times_4(const struct glsl_type *type)
return 4 * type_size_vec4(type);
 }
 
+/* Attribute arrays are loaded as one vec4 per element (or matrix column),
+ * except for double-precision types, which are loaded as one dvec4.
+ */
+extern "C" int
+type_size_vs_input(const struct glsl_type *type)
+{
+   if (type->is_double()) {
+  return type_size_vec4(type) / 2;
+   } else {
+  return type_size_vec4(type);
+   }
+}
+
 /**
  * Create a MOV to read the timestamp register.
  *
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index f37bf3a..9afd036 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -23,6 +23,7 @@
 
 #include "brw_nir.h"
 #include "brw_shader.h"
+#include "compiler/glsl_types.h"
 #include "compiler/nir/glsl_to_nir.h"
 #include "compiler/nir/nir_builder.h"
 #include "program/prog_to_nir.h"
@@ -205,7 +206,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 * loaded as one vec4 or dvec4 per element (or matrix column), depending on
 * whether it is a double-precision type or not.
 */
-   nir_lower_io(nir, nir_var_shader_in, type_size_vec4);
+   nir_lower_io(nir, nir_var_shader_in, type_size_vs_input);
 
/* This pass needs actual constants */
nir_opt_constant_folding(nir);
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index 35e7d7a..60f3b5f 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -294,6 +294,7 @@ struct gl_shader *brw_new_shader(struct gl_context *ctx, 
GLuint name, GLuint typ
 int type_size_scalar(const struct glsl_type *type);
 int type_size_vec4(const struct glsl_type *type);
 int type_size_vec4_times_4(const struct glsl_type *type);
+int type_size_vs_input(const struct glsl_type *type);
 
 unsigned tesslevel_outer_components(GLenum tes_primitive_mode);
 unsigned tesslevel_inner_components(GLenum tes_primitive_mode);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 15/15] i965: Expose OpenGL 4.2 for gen8+

2016-05-12 Thread Antia Puentes

From: Alejandro Piñeiro 

ARB_vertex_attrib_64bit was the only feature missing.

v2: we can expose 4.2 instead of 4.1 (Ian Romanick)

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
 src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 1c3e8bd..71605f8 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -268,7 +268,7 @@ intelInitExtensions(struct gl_context *ctx)
ctx->Extensions.OES_texture_half_float_linear = true;
 
if (brw->gen >= 8)
-  ctx->Const.GLSLVersion = 400;
+  ctx->Const.GLSLVersion = 420;
else if (brw->gen >= 6)
   ctx->Const.GLSLVersion = 330;
else
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index c2efc6e..1a0541a 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1376,7 +1376,7 @@ set_max_gl_versions(struct intel_screen *screen)
switch (screen->devinfo->gen) {
case 9:
case 8:
-  psp->max_gl_core_version = 40;
+  psp->max_gl_core_version = 42;
   psp->max_gl_compat_version = 30;
   psp->max_gl_es1_version = 11;
   psp->max_gl_es2_version = 31;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 12/15] dri/i965: dvec3/dvec4 consume twice input vertex attributes

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):

"A program with more than the value of MAX_VERTEX_ATTRIBS
active attribute variables may fail to link, unless
device-dependent optimizations are able to make the program
fit within available hardware resources. For the purposes
of this test, attribute variables of the type dvec3, dvec4,
dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
count as consuming twice as many attributes as equivalent
single-precision types. While these types use the same number
of generic attributes as their single-precision equivalents,
implementations are permitted to consume two single-precision
vectors of internal storage for each three- or four-component
double-precision vector."

This commit sets i965 driver backend to consume twice as many attributes
as equivalent single-precision types for dvec3, dvec4, dmat2x3, dmat2x4,
dmat3, dmat3x4, dmat4x3 and dmat4.

This prevents running out of registers in case we use too many
double-precision vertex input attributes.
---
 src/mesa/drivers/dri/i965/brw_context.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 26514a0..d6998b6 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -619,6 +619,8 @@ brw_initialize_context_constants(struct brw_context *brw)
ctx->Const.NativeIntegers = true;
ctx->Const.VertexID_is_zero_based = true;
 
+   ctx->Const.FP64Vector34Consumes2Locations = true;
+
/* Regarding the CMP instruction, the Ivybridge PRM says:
 *
 *   "For each enabled channel 0b or 1b is assigned to the appropriate flag
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 09/15] i965/fs: calculate first non-payload GRF using attrib slots

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

When computing where the first non-payload GRF starts, we can't rely on
the number of attributes, as each attribute can be using 1 or 2 slots
depending on whether they are a dvec3/4 or other.

Instead, we need to use the number of slots used by the attributes.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_compiler.h | 1 +
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp   | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h 
b/src/mesa/drivers/dri/i965/brw_compiler.h
index 5807305..446c609 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.h
+++ b/src/mesa/drivers/dri/i965/brw_compiler.h
@@ -607,6 +607,7 @@ struct brw_vs_prog_data {
GLbitfield64 inputs_read;
 
unsigned nr_attributes;
+   unsigned nr_attribute_slots;
 
bool uses_vertexid;
bool uses_instanceid;
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index e2a87a7..970846c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1875,7 +1875,7 @@ fs_visitor::assign_vs_urb_setup()
assert(stage == MESA_SHADER_VERTEX);
 
/* Each attribute is 4 regs. */
-   this->first_non_payload_grf += 4 * vs_prog_data->nr_attributes;
+   this->first_non_payload_grf += 4 * vs_prog_data->nr_attribute_slots;
 
assert(vs_prog_data->base.urb_read_length <= 15);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index b144fe8..88d9317 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -2120,6 +2120,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
  DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2);
 
prog_data->nr_attributes = nr_attributes;
+   prog_data->nr_attribute_slots = nr_attribute_slots;
 
/* Since vertex shaders reuse the same VUE entry for inputs and outputs
 * (overwriting the original contents), we need to make sure the size is
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 01/15] i965: get the proper vertex surface type for doubles on gen8+

2016-05-12 Thread Antia Puentes

From: Alejandro Piñeiro 

This commit adds support for PASSTHRU format when pushing
double-precision attributes.

Check glarray->Doubles in order to know if we should choose a format
that does a conversion to float, or just passthru the 64-bit double.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_draw_upload.c | 30 ++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c 
b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 58e0516..5af4583 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -37,7 +37,7 @@
 #include "intel_batchbuffer.h"
 #include "intel_buffer_objects.h"
 
-static const GLuint double_types[5] = {
+static const GLuint double_types_float[5] = {
0,
BRW_SURFACEFORMAT_R64_FLOAT,
BRW_SURFACEFORMAT_R64G64_FLOAT,
@@ -45,6 +45,14 @@ static const GLuint double_types[5] = {
BRW_SURFACEFORMAT_R64G64B64A64_FLOAT
 };
 
+static const GLuint double_types_passthru[5] = {
+   0,
+   BRW_SURFACEFORMAT_R64_PASSTHRU,
+   BRW_SURFACEFORMAT_R64G64_PASSTHRU,
+   BRW_SURFACEFORMAT_R64G64B64_PASSTHRU,
+   BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU
+};
+
 static const GLuint float_types[5] = {
0,
BRW_SURFACEFORMAT_R32_FLOAT,
@@ -213,6 +221,22 @@ static const GLuint byte_types_scale[5] = {
BRW_SURFACEFORMAT_R8G8B8A8_SSCALED
 };
 
+static GLuint
+double_types(struct brw_context *brw,
+ int size,
+ GLboolean doubles)
+{
+   /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):
+* "When SourceElementFormat is set to one of the *64*_PASSTHRU formats,
+* 64-bit components are stored in the URB without any conversion."
+* Also included on BDW PRM, Volume 7, page 470, table "Source Element
+* Formats Supported in VF Unit"
+* Previous PRMs don't include those references.
+*/
+   return (brw->gen >= 8 && doubles
+   ? double_types_passthru[size]
+   : double_types_float[size]);
+}
 
 /**
  * Given vertex array type/size/format/normalized info, return
@@ -245,7 +269,7 @@ brw_get_vertex_surface_type(struct brw_context *brw,
   return BRW_SURFACEFORMAT_R11G11B10_FLOAT;
} else if (glarray->Normalized) {
   switch (glarray->Type) {
-  case GL_DOUBLE: return double_types[size];
+  case GL_DOUBLE: return double_types(brw, size, glarray->Doubles);
   case GL_FLOAT: return float_types[size];
   case GL_HALF_FLOAT: return half_float_types[size];
   case GL_INT: return int_types_norm[size];
@@ -319,7 +343,7 @@ brw_get_vertex_surface_type(struct brw_context *brw,
   }
   assert(glarray->Format == GL_RGBA); /* sanity check */
   switch (glarray->Type) {
-  case GL_DOUBLE: return double_types[size];
+  case GL_DOUBLE: return double_types(brw, size, glarray->Doubles);
   case GL_FLOAT: return float_types[size];
   case GL_HALF_FLOAT: return half_float_types[size];
   case GL_INT: return int_types_scale[size];
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 13/15] i965: Enable ARB_vertex_attrib_64bit for gen8+

2016-05-12 Thread Antia Puentes

From: Alejandro Piñeiro 

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index dcb9831..1c3e8bd 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -376,6 +376,7 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_stencil_texturing = true;
   ctx->Extensions.ARB_texture_stencil8 = true;
   ctx->Extensions.ARB_gpu_shader_fp64 = true;
+  ctx->Extensions.ARB_vertex_attrib_64bit = true;
}
 
if (brw->gen >= 9) {
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 02/15] i965: Configure how to store 64PASSTHRU vertex components

2016-05-12 Thread Antia Puentes

From the Broadwell specification, structure VERTEX_ELEMENT_STATE
description:

   "When SourceElementFormat is set to one of the *64*_PASSTHRU
formats,  64-bit components are stored in the URB without any
conversion. In this case, vertex elements must be written as 128
or 256 bits, with VFCOMP_STORE_0 being used to pad the output
as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red component 
into
the URB, Component 1 must be specified as VFCOMP_STORE_0 (with
Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit
vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0
in order to output a 256-bit vertex element. Likewise, use of
R64G64B64_PASSTHRU requires Component 3 to be specified as VFCOMP_STORE_0
in order to output a 256-bit vertex element."

Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for
dvec3 and dvec4 vertex elements.

Signed-off-by: Juan A. Suarez Romero 
Signed-off-by: Antia Puentes 

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 
 1 file changed, 35 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index fe5ed35..c862f05 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -217,6 +217,41 @@ gen8_emit_vertices(struct brw_context *brw)
  break;
   }
 
+  /* From the BDW PRM, Volume 2d, page 586 (VERTEX_ELEMENT_STATE):
+   *
+   * "When SourceElementFormat is set to one of the *64*_PASSTHRU
+   * formats, 64-bit components are stored in the URB without any
+   * conversion. In this case, vertex elements must be written as 128
+   * or 256 bits, with VFCOMP_STORE_0 being used to pad the output
+   * as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red
+   * component into the URB, Component 1 must be specified as
+   * VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE)
+   * in order to output a 128-bit vertex element, or Components 1-3 
must
+   * be specified as VFCOMP_STORE_0 in order to output a 256-bit vertex
+   * element. Likewise, use of R64G64B64_PASSTHRU requires Component 3
+   * to be specified as VFCOMP_STORE_0 in order to output a 256-bit 
vertex
+   * element."
+   */
+  if (input->glarray->Doubles) {
+ switch (input->glarray->Size) {
+ case 0:
+ case 1:
+ case 2:
+/*  Use 128-bits instead of 256-bits to write double and dvec2
+ *  vertex elements.
+ */
+comp2 = BRW_VE1_COMPONENT_NOSTORE;
+comp3 = BRW_VE1_COMPONENT_NOSTORE;
+break;
+ case 3:
+/* Pad the output using VFCOMP_STORE_0 as suggested
+ * by the BDW PRM.
+ */
+comp3 = BRW_VE1_COMPONENT_STORE_0;
+break;
+ }
+  }
+
   OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) |
 GEN6_VE0_VALID |
 (format << BRW_VE0_FORMAT_SHIFT) |
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 03/15] i965: passthru formats cannot be used width edge flag enabled

2016-05-12 Thread Antia Puentes

From: Alejandro Piñeiro 

Add an assertion to detect this case.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c 
b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
index c862f05..dce11dd 100644
--- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
@@ -34,6 +34,20 @@
 #include "intel_batchbuffer.h"
 #include "intel_buffer_objects.h"
 
+static bool
+is_passthru_format(uint32_t format)
+{
+   switch (format) {
+   case BRW_SURFACEFORMAT_R64_PASSTHRU:
+   case BRW_SURFACEFORMAT_R64G64_PASSTHRU:
+   case BRW_SURFACEFORMAT_R64G64B64_PASSTHRU:
+   case BRW_SURFACEFORMAT_R64G64B64A64_PASSTHRU:
+  return true;
+   default:
+  return false;
+   }
+}
+
 static void
 gen8_emit_vertices(struct brw_context *brw)
 {
@@ -193,6 +207,12 @@ gen8_emit_vertices(struct brw_context *brw)
   uint32_t comp2 = BRW_VE1_COMPONENT_STORE_SRC;
   uint32_t comp3 = BRW_VE1_COMPONENT_STORE_SRC;
 
+  /* From the BDW PRM, Volume 2d, page 588 (VERTEX_ELEMENT_STATE):
+   * "Any SourceElementFormat of *64*_PASSTHRU cannot be used with an
+   * element which has edge flag enabled."
+   */
+  assert(!(is_passthru_format(format) && uses_edge_flag));
+
   /* The gen4 driver expects edgeflag to come in as a float, and passes
* that float on to the tests in the clipper.  Mesa's current vertex
* attribute value for EdgeFlag is stored as a float, which works out.
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 07/15] i965: take care of doubles when remapping VS attributes

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

Double-precision types require 1 slot in VUE for double and dvec2, and 2 slots 
for
anything else.

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_nir.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index c501bc1..f37bf3a 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -96,7 +96,7 @@ add_const_offset_to_base(nir_shader *nir, nir_variable_mode 
mode)
 }
 
 static bool
-remap_vs_attrs(nir_block *block, GLbitfield64 inputs_read)
+remap_vs_attrs(nir_block *block, struct nir_shader_info *nir_info)
 {
nir_foreach_instr(instr, block) {
   if (instr->type != nir_instr_type_intrinsic)
@@ -111,9 +111,11 @@ remap_vs_attrs(nir_block *block, GLbitfield64 inputs_read)
   * before it and counting the bits.
   */
  int attr = intrin->const_index[0];
- int slot = _mesa_bitcount_64(inputs_read & BITFIELD64_MASK(attr));
-
- intrin->const_index[0] = 4 * slot;
+ int slot = _mesa_bitcount_64(nir_info->inputs_read &
+  BITFIELD64_MASK(attr));
+ int dslot = _mesa_bitcount_64(nir_info->double_inputs_read &
+   BITFIELD64_MASK(attr));
+ intrin->const_index[0] = 4 * (slot + dslot);
   }
}
return true;
@@ -199,9 +201,9 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
   var->data.driver_location = var->data.location;
}
 
-   /* Now use nir_lower_io to walk dereference chains.  Attribute arrays
-* are loaded as one vec4 per element (or matrix column), so we use
-* type_size_vec4 here.
+   /* Now use nir_lower_io to walk dereference chains.  Attribute arrays are
+* loaded as one vec4 or dvec4 per element (or matrix column), depending on
+* whether it is a double-precision type or not.
 */
nir_lower_io(nir, nir_var_shader_in, type_size_vec4);
 
@@ -214,18 +216,12 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
vs_attrib_wa_flags);
 
if (is_scalar) {
-  /* Finally, translate VERT_ATTRIB_* values into the actual registers.
-   *
-   * Note that we can use nir->info.inputs_read instead of
-   * key->inputs_read since the two are identical aside from Gen4-5
-   * edge flag differences.
-   */
-  GLbitfield64 inputs_read = nir->info.inputs_read;
+  /* Finally, translate VERT_ATTRIB_* values into the actual registers. */
 
   nir_foreach_function(function, nir) {
  if (function->impl) {
 nir_foreach_block(block, function->impl) {
-   remap_vs_attrs(block, inputs_read);
+   remap_vs_attrs(block, &nir->info);
 }
  }
   }
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 06/15] nir: add double input bitmap

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

This bitmap tracks which input attributes are double-precision.

Reviewed-by: Kenneth Graunke 
---
 src/compiler/nir/glsl_to_nir.cpp | 1 +
 src/compiler/nir/nir.h   | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/src/compiler/nir/glsl_to_nir.cpp b/src/compiler/nir/glsl_to_nir.cpp
index a86b966..9b1b098 100644
--- a/src/compiler/nir/glsl_to_nir.cpp
+++ b/src/compiler/nir/glsl_to_nir.cpp
@@ -152,6 +152,7 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
shader->info.num_ssbos = sh->NumShaderStorageBlocks;
shader->info.num_images = sh->NumImages;
shader->info.inputs_read = sh->Program->InputsRead;
+   shader->info.double_inputs_read = sh->Program->DoubleInputsRead;
shader->info.outputs_written = sh->Program->OutputsWritten;
shader->info.patch_inputs_read = sh->Program->PatchInputsRead;
shader->info.patch_outputs_written = sh->Program->PatchOutputsWritten;
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 20927a2..5007f4c 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1697,6 +1697,8 @@ typedef struct nir_shader_info {
 
/* Which inputs are actually read */
uint64_t inputs_read;
+   /* Which inputs are actually read and are double */
+   uint64_t double_inputs_read;
/* Which outputs are actually written */
uint64_t outputs_written;
/* Which system values are actually read */
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 08/15] i965/vec4: use attribute slots to calculate URB read length

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

Do not use total attributes because a dvec3/dvec4 attribute requires two
slots. So rather use total attribute slots.

v2: do not use loop to calculate required attribute slots (Kenneth
Graunke)

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 815eaed..b144fe8 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -2104,14 +2104,20 @@ brw_compile_vs(const struct brw_compiler *compiler, 
void *log_data,
   nr_attributes++;
}
 
+   unsigned nr_attribute_slots =
+  nr_attributes +
+  _mesa_bitcount_64(shader->info.double_inputs_read);
+
/* The 3DSTATE_VS documentation lists the lower bound on "Vertex URB Entry
 * Read Length" as 1 in vec4 mode, and 0 in SIMD8 mode.  Empirically, in
 * vec4 mode, the hardware appears to wedge unless we read something.
 */
if (is_scalar)
-  prog_data->base.urb_read_length = DIV_ROUND_UP(nr_attributes, 2);
+  prog_data->base.urb_read_length =
+ DIV_ROUND_UP(nr_attribute_slots, 2);
else
-  prog_data->base.urb_read_length = DIV_ROUND_UP(MAX2(nr_attributes, 1), 
2);
+  prog_data->base.urb_read_length =
+ DIV_ROUND_UP(MAX2(nr_attribute_slots, 1), 2);
 
prog_data->nr_attributes = nr_attributes;
 
@@ -2120,7 +2126,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
 * the larger of the two.
 */
const unsigned vue_entries =
-  MAX2(nr_attributes, (unsigned)prog_data->base.vue_map.num_slots);
+  MAX2(nr_attribute_slots, (unsigned)prog_data->base.vue_map.num_slots);
 
if (compiler->devinfo->gen == 6)
   prog_data->base.urb_entry_size = DIV_ROUND_UP(vue_entries, 8);
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 11/15] glsl/linker: dvec3/dvec4 may consume twice input vertex attributes

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):

"A program with more than the value of MAX_VERTEX_ATTRIBS
active attribute variables may fail to link, unless
device-dependent optimizations are able to make the program
fit within available hardware resources. For the purposes
of this test, attribute variables of the type dvec3, dvec4,
dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
count as consuming twice as many attributes as equivalent
single-precision types. While these types use the same number
of generic attributes as their single-precision equivalents,
implementations are permitted to consume two single-precision
vectors of internal storage for each three- or four-component
double-precision vector."

This commits adds a flag that allows driver to specify if dvec3, dvec4,
dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3 and dmat4 count as consuming
twice as many attributes as equivalent single-precision types (default
value being false).
---
 src/compiler/glsl/linker.cpp | 72 +++-
 src/mesa/main/context.c  |  2 ++
 src/mesa/main/mtypes.h   | 13 
 3 files changed, 66 insertions(+), 21 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 0268b74..ffec007 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2434,6 +2434,37 @@ resize_tes_inputs(struct gl_context *ctx,
 }
 
 /**
+ * From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
+ *
+ * "A program with more than the value of MAX_VERTEX_ATTRIBS
+ *  active attribute variables may fail to link, unless
+ *  device-dependent optimizations are able to make the program
+ *  fit within available hardware resources. For the purposes
+ *  of this test, attribute variables of the type dvec3, dvec4,
+ *  dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
+ *  count as consuming twice as many attributes as equivalent
+ *  single-precision types. While these types use the same number
+ *  of generic attributes as their single-precision equivalents,
+ *  implementations are permitted to consume two single-precision
+ *  vectors of internal storage for each three- or four-component
+ *  double-precision vector."
+ *
+ * Returns true if three- or four-component double-precision vector consumes
+ * two single-precision vectors of internal storage
+ */
+
+static inline bool
+attribute_consumes_two_locations(struct gl_constants *constants,
+ ir_variable *var)
+{
+   if (var->type->without_array()->is_dual_slot_double() &&
+   constants->FP64Vector34Consumes2Locations)
+  return true;
+   else
+  return false;
+}
+
+/**
  * Find a contiguous set of available bits in a bitmask.
  *
  * \param used_mask Bits representing used (1) and unused (0) locations
@@ -2725,27 +2756,7 @@ assign_attribute_or_color_locations(gl_shader_program 
*prog,
 
used_locations |= (use_mask << attr);
 
-/* From the GL 4.5 core spec, section 11.1.1 (Vertex Attributes):
- *
- * "A program with more than the value of MAX_VERTEX_ATTRIBS
- *  active attribute variables may fail to link, unless
- *  device-dependent optimizations are able to make the program
- *  fit within available hardware resources. For the purposes
- *  of this test, attribute variables of the type dvec3, dvec4,
- *  dmat2x3, dmat2x4, dmat3, dmat3x4, dmat4x3, and dmat4 may
- *  count as consuming twice as many attributes as equivalent
- *  single-precision types. While these types use the same number
- *  of generic attributes as their single-precision equivalents,
- *  implementations are permitted to consume two single-precision
- *  vectors of internal storage for each three- or four-component
- *  double-precision vector."
- *
- * Mark this attribute slot as taking up twice as much space
- * so we can count it properly against limits.  According to
- * issue (3) of the GL_ARB_vertex_attrib_64bit behavior, this
- * is optional behavior, but it seems preferable.
- */
-if (var->type->without_array()->is_dual_slot_double())
+if (attribute_consumes_two_locations(constants, var))
double_storage_locations |= (use_mask << attr);
 }
 
@@ -2818,6 +2829,25 @@ assign_attribute_or_color_locations(gl_shader_program 
*prog,
   to_assign[i].var->data.location = generic_base + location;
   to_assign[i].var->data.is_unmatched_generic_inout = 0;
   used_locations |= (use_mask << location);
+
+  if (attribute_consumes_two_locations(constants, to_assign[i].var))
+ double_storage_locations |= (use_mask << location);
+   }
+
+   /* Now that we have all the locations, take in account that dvec3/4 c

[Mesa-dev] [PATCH v2 14/15] docs: Mark ARB_vertex_attrib_64bit as done for i965/gen8+

2016-05-12 Thread Antia Puentes

From: Alejandro Piñeiro 

v2: label as done for i965/gen8+ instead of i965 (Kenneth Graunke)

Reviewed-by: Kenneth Graunke 
---
 docs/GL3.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index c957604..a5ab7ee 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -142,7 +142,7 @@ GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi
   GL_ARB_get_program_binary DONE (0 binary formats)
   GL_ARB_separate_shader_objectsDONE (all drivers)
   GL_ARB_shader_precision   DONE (all drivers that 
support GLSL 4.10)
-  GL_ARB_vertex_attrib_64bitDONE (llvmpipe, 
softpipe)
+  GL_ARB_vertex_attrib_64bitDONE (i965/gen8+, 
llvmpipe, softpipe)
   GL_ARB_viewport_array DONE (i965, nv50, 
llvmpipe, softpipe)
 
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 04/15] i965/fs: half exec_size when dealing with 64 bits attributes

2016-05-12 Thread Antia Puentes

From: Alejandro Piñeiro 

The HW has a restriction that only vertical stride may cross register
boundaries. Until now this was only handled on VGRFs at
rw_reg_from_fs_reg, but it is also needed for attributes.

v2:
 * Remove reference to commit id on commit message (Juan Suarez)
 * Simplify code that compute final exec_size (Ian Romanick)
 * Use REG_SIZE on that same code (Kenneth Graunke)

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3c58ccb..e2a87a7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1836,11 +1836,28 @@ fs_visitor::convert_attr_sources_to_hw_regs(fs_inst 
*inst)
inst->src[i].nr +
inst->src[i].reg_offset;
 
- unsigned width = inst->src[i].stride == 0 ? 1 : inst->exec_size;
+ /* As explained at brw_reg_from_fs_reg, From the Haswell PRM:
+  *
+  * VertStride must be used to cross GRF register boundaries. This
+  * rule implies that elements within a 'Width' cannot cross GRF
+  * boundaries.
+  *
+  * So, for registers that are large enough, we have to split the exec
+  * size in two and trust the compression state to sort it out.
+  */
+ unsigned total_size = inst->exec_size *
+   inst->src[i].stride *
+   type_sz(inst->src[i].type);
+
+ assert(total_size <= 2 * REG_SIZE);
+ const unsigned exec_size =
+(total_size <= REG_SIZE) ? inst->exec_size : inst->exec_size / 2;
+
+ unsigned width = inst->src[i].stride == 0 ? 1 : exec_size;
  struct brw_reg reg =
 stride(byte_offset(retype(brw_vec8_grf(grf, 0), inst->src[i].type),
inst->src[i].subreg_offset),
-   inst->exec_size * inst->src[i].stride,
+   exec_size * inst->src[i].stride,
width, inst->src[i].stride);
  reg.abs = inst->src[i].abs;
  reg.negate = inst->src[i].negate;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 00/15] Add ARB_vertex_attrib_64bit for i965 scalar (gen8+)

2016-05-12 Thread Antia Puentes

Hello,

the following series is a resend of the implementation for the
ARB_vertex_attrib_64bit extension on the i965 scalar backend (gen8+).

This resend includes all the feedback received to v1 plus:

 * Minor changes after rebasing the series against the updated version
of the arb_gpu_shader_fp64 extension, that is being reviewed on the
   list too right now. Those changes were basically adapt our function
  calls after changes on their signatures, and a retype of the dest
 register when shufling (patch 05)

 * Two new patches, to handle properly the linking, as agreed with
Kenneth Graunke on this email [1]. Those two are the only ones
   totally unreviewed right now:

 [PATCH 11/15] glsl/linker: dvec3/dvec4 may consume twice input vertex 
attributes
  [PATCH 12/15] dri/i965: dvec3/dvec4 consume twice input vertex 
attributes

Again, as this work depends on the ARB_gpu_shader_fp64 i965
functionality [2], which is work in progress, the aim of sending the
series now is to get early feedback and parallelise the review
process.

The series applies on top of the current version of the
https://github.com/Igalia/mesa/tree/i965-fp64 branch,
which contains the last fp64 work for gen8.

A frozen version of the branch containing the fp64 patches + the
series is available in our repo:

$ git clone -b i965-attrib64-v2 https://github.com/Igalia/mesa.git

[1] https://lists.freedesktop.org/archives/mesa-dev/2016-May/116381.html
[2] https://bugs.freedesktop.org/show_bug.cgi?id=92760


Alejandro Piñeiro (6):
  i965: get the proper vertex surface type for doubles on gen8+
  i965: passthru formats cannot be used width edge flag enabled
  i965/fs: half exec_size when dealing with 64 bits attributes
  i965: Enable ARB_vertex_attrib_64bit for gen8+
  docs: Mark ARB_vertex_attrib_64bit as done for i965/gen8+
  i965: Expose OpenGL 4.2 for gen8+

Antia Puentes (1):
  i965: Configure how to store *64*PASSTHRU vertex components

Juan A. Suarez Romero (8):
  i965/fs: shuffle 32bits into 64bits for doubles
  nir: add double input bitmap
  i965: take care of doubles when remapping VS attributes
  i965/vec4: use attribute slots to calculate URB read length
  i965/fs: calculate first non-payload GRF using attrib slots
  i965: take care of doubles when lowering VS inputs
  glsl/linker: dvec3/dvec4 may consume twice input vertex attributes
  dri/i965: dvec3/dvec4 consume twice input vertex attributes

 docs/GL3.txt |  2 +-
 src/compiler/glsl/linker.cpp | 72 
 src/compiler/nir/glsl_to_nir.cpp |  1 +
 src/compiler/nir/nir.h   |  2 +
 src/mesa/drivers/dri/i965/brw_compiler.h |  1 +
 src/mesa/drivers/dri/i965/brw_context.c  |  2 +
 src/mesa/drivers/dri/i965/brw_draw_upload.c  | 30 ++--
 src/mesa/drivers/dri/i965/brw_fs.cpp | 36 --
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  6 +++
 src/mesa/drivers/dri/i965/brw_nir.c  | 29 +--
 src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp   | 13 +++--
 src/mesa/drivers/dri/i965/gen8_draw_upload.c | 55 +
 src/mesa/drivers/dri/i965/intel_extensions.c |  3 +-
 src/mesa/drivers/dri/i965/intel_screen.c |  2 +-
 src/mesa/main/context.c  |  2 +
 src/mesa/main/mtypes.h   | 13 +
 17 files changed, 221 insertions(+), 49 deletions(-)

-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2 05/15] i965/fs: shuffle 32bits into 64bits for doubles

2016-05-12 Thread Antia Puentes

From: "Juan A. Suarez Romero" 

VS Thread Payload handles attributes in URB as vec4, no matter if they
are actually single or double precision.

So with double-precision types, value ends up in the registers split in
32bits chunks, in different positions.

We need to shuffle the chunks to get the doubles correctly.

v2: use dest directly (Kenneth Graunke)

Reviewed-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 57ab020..5ac515f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -3691,6 +3691,12 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
   for (unsigned j = 0; j < instr->num_components; j++) {
  bld.MOV(offset(dest, bld, j), offset(src, bld, j));
   }
+  if (type_sz(src.type) == 8)
+ shuffle_32bit_load_result_to_64bit_data(bld,
+ dest,
+ retype(dest, 
BRW_REGISTER_TYPE_F),
+ instr->num_components);
+
   break;
}
 
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 4/8] intel: Handle firstvertex in an identical way to BaseVertex

2018-03-27 Thread Antia Puentes


Thanks for looking at this.

On 27/03/18 03:06, Ian Romanick wrote:


It looks like there are a couple places in brw_fs_nir.cpp
(emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic) that
handle nir_intrinsic_load_base_instance but not
nir_intrinsic_load_first_vertex.  Should those cases be added?
We should definitely include it in the emit_system_values_block to mark 
it as unreachable,

as like the others, it is lowered by brw_nir_lower_vs_inputs.

About fs_visitor::nir_emit_vs_intrinsic, it makes sense to add it there 
too. As far as I can see,
the vs_inputs intrinsics should be lowered by then,  we may want to 
replace the code by an

"unreachable (lowered by brw_nir_lower_vs_inputs)".


I have pushed a branch to my fd.o repo called gl_VertexID-fixes that
rebases this series on current master.  I had to apply some fixes to the
last patch in the series to get it to build.  I have tested that whole
series patch-by-patch in the CI with no regressions.  I have also
applied my R-b on several patches in the series.  I think once we decide
what to do about this patch and the squash! patch in my branch, we can
(finally) push this.

Excellent.


On 01/25/2018 10:15 AM, Antia Puentes wrote:

Until we set gl_BaseVertex to zero for non-indexed draw calls
both have an identical value.

The Vertex Elements are kept like that:
* VE 1: 
* VE 2: 
---
  src/intel/compiler/brw_nir.c  |  3 +++
  src/intel/compiler/brw_vec4.cpp   |  1 +
  src/mesa/drivers/dri/i965/brw_context.h   |  8 ++--
  src/mesa/drivers/dri/i965/brw_draw.c  | 14 +-
  src/mesa/drivers/dri/i965/brw_draw_upload.c   |  7 +--
  src/mesa/drivers/dri/i965/genX_state_upload.c | 11 +++
  6 files changed, 31 insertions(+), 13 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index dbddef0d04d..34b1e44adf0 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -239,6 +239,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 const bool has_sgvs =
nir->info.system_values_read &
(BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+   BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
 BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
 BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
 BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -261,6 +262,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
  
  switch (intrin->intrinsic) {

  case nir_intrinsic_load_base_vertex:
+case nir_intrinsic_load_first_vertex:
  case nir_intrinsic_load_base_instance:
  case nir_intrinsic_load_vertex_id_zero_base:
  case nir_intrinsic_load_instance_id:
@@ -278,6 +280,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 nir_intrinsic_set_base(load, num_inputs);
 switch (intrin->intrinsic) {
 case nir_intrinsic_load_base_vertex:
+   case nir_intrinsic_load_first_vertex:
nir_intrinsic_set_component(load, 0);
break;
 case nir_intrinsic_load_base_instance:
diff --git a/src/intel/compiler/brw_vec4.cpp b/src/intel/compiler/brw_vec4.cpp
index 36e17d77d47..06c79630119 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2788,6 +2788,7 @@ brw_compile_vs(const struct brw_compiler *compiler, void 
*log_data,
  */
 if (shader->info.system_values_read &
 (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
+BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
  BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
  BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
  BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID))) {
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9046acd175c..0a20706567e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -881,8 +881,12 @@ struct brw_context
  
 struct {

struct {
- /** The value of gl_BaseVertex for the current _mesa_prim. */
- int gl_basevertex;
+ /**
+  * Either the value of gl_BaseVertex for indexed draw calls or the
+  * value of the argument  for non-indexed draw calls for the
+  * current _mesa_prim.
+  */
+ int firstvertex;
  
   /** The value of gl_BaseInstance for the current _mesa_prim. */

   int gl_baseinstance;
diff --git a/src/mesa/drivers/dri/i965/brw_draw.c 
b/src/mesa/drivers/dri/i965/brw_draw.c
index 50cf8b12c74..a1a5161fd35 100644
--- a/src/mesa/drivers/dri/i965/brw_draw.c
+++ b/src/mesa/drivers/dri/i965/brw_draw.c
@@ -816,25 +816,29 @@ brw_draw_single_prim(struct gl_context *ctx,
  * always flag if the shader uses one of the values. For direct draws,
  * we only flag if the values change.
  */
-   const int new_basevertex =

Re: [Mesa-dev] [PATCH v4 6/6] i965: gl_BaseVertex must be zero for non-indexed draw calls

2018-04-10 Thread Antia Puentes


On 07/04/18 08:21, Jason Ekstrand wrote:

On Fri, Apr 6, 2018 at 2:53 PM, Ian Romanick <mailto:i...@freedesktop.org>> wrote:


    From: Antia Puentes mailto:apuen...@igalia.com>>

We keep 'firstvertex' as it is and move gl_BaseVertex to the drawID
vertex element. The previous Vertex Elements order was:

  * VE 1: 
  * VE 2: 

and now it is:

  * VE 1: 
  * VE 2: 

To move the BaseVertex keeping VE1 as it is, allows to keep
pointing the
vertex buffer associated to VE 1 to the indirect buffer for indirect
draw calls.

From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

  "gl_BaseVertex holds the integer value passed to the baseVertex
  parameter to the command that resulted in the current shader
  invocation. In the case where the command has no baseVertex
parameter,
  the value of gl_BaseVertex is zero."

Fixes CTS tests:

  * KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
  *
KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
  *
KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
  *

KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
  *
KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

v2 (idr): Make changes to brw_prepare_shader_draw_parameters matching
those in genX(emit_vertices).  Reformat commit message to 72 columns.

Signed-off-by: Ian Romanick mailto:ian.d.roman...@intel.com>>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
<https://bugs.freedesktop.org/show_bug.cgi?id=102678>
---
 src/intel/compiler/brw_nir.c                  | 14 +
 src/intel/compiler/brw_vec4.cpp               | 14 +
 src/mesa/drivers/dri/i965/brw_context.h       | 32
++-
 src/mesa/drivers/dri/i965/brw_draw.c          | 45
++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c   | 14 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 38
+++---
 6 files changed, 97 insertions(+), 60 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c
b/src/intel/compiler/brw_nir.c
index 16b0d86814f..16ab529737b 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
     */
    const bool has_sgvs =
       nir->info.system_values_read &
-      (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-       BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+      (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
        BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
        BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
        BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,

                nir_intrinsic_set_base(load, num_inputs);
                switch (intrin->intrinsic) {
-               case nir_intrinsic_load_base_vertex:
                case nir_intrinsic_load_first_vertex:
                   nir_intrinsic_set_component(load, 0);
                   break;
@@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
                   nir_intrinsic_set_component(load, 3);
                   break;
                case nir_intrinsic_load_draw_id:
-                  /* gl_DrawID is stored right after gl_VertexID
and friends
-                   * if any of them exist.
+               case nir_intrinsic_load_base_vertex:
+                  /* gl_DrawID and gl_BaseVertex are stored right
after
+                     gl_VertexID and friends if any of them exist.
                    */
                   nir_intrinsic_set_base(load, num_inputs +
has_sgvs);
-                  nir_intrinsic_set_component(load, 0);
+                  if (intrin->intrinsic ==
nir_intrinsic_load_draw_id)
+                     nir_intrinsic_set_component(load, 0);
+                  else
+                     nir_intrinsic_set_component(load, 1);
                   break;
                default:
                   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp
b/src/intel/compiler/brw_vec4.cpp
index 1e384f5bf4d..d33caefdea9 100644
--- a/src/intel/compiler/brw_vec4.cpp
+++ b/src/intel/compiler/brw_vec4.cpp
@@ -2825,14 +2825,19 @@ brw_compile_vs(const struct brw_compiler
*compiler, void *log_data,
     * incoming vertex attribute.  So, add an extra slot.
     */
    if (shader->info.system_values_read &
-       (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-        BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+

Re: [Mesa-dev] [PATCH v4 6/6] i965: gl_BaseVertex must be zero for non-indexed draw calls

2018-04-11 Thread Antia Puentes



On 10/04/18 18:26, Jason Ekstrand wrote:
On Tue, Apr 10, 2018 at 1:28 AM, Antia Puentes <mailto:apuen...@igalia.com>> wrote:


On 07/04/18 08:21, Jason Ekstrand wrote:


On Fri, Apr 6, 2018 at 2:53 PM, Ian Romanick mailto:i...@freedesktop.org>> wrote:

    From: Antia Puentes mailto:apuen...@igalia.com>>

We keep 'firstvertex' as it is and move gl_BaseVertex to the
drawID
vertex element. The previous Vertex Elements order was:

  * VE 1: 
  * VE 2: 

and now it is:

  * VE 1: 
  * VE 2: 

To move the BaseVertex keeping VE1 as it is, allows to keep
pointing the
vertex buffer associated to VE 1 to the indirect buffer for
indirect
draw calls.

From the OpenGL 4.6 (11.1.3.9 Shader Inputs) specification:

  "gl_BaseVertex holds the integer value passed to the baseVertex
  parameter to the command that resulted in the current shader
  invocation. In the case where the command has no baseVertex
parameter,
  the value of gl_BaseVertex is zero."

Fixes CTS tests:

  *
KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysParameters
  *

KHR-GL45.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters
  *
KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters
  *

KHR-GL45.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters
  *

KHR-GL45.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters

v2 (idr): Make changes to brw_prepare_shader_draw_parameters
matching
those in genX(emit_vertices).  Reformat commit message to 72
columns.

Signed-off-by: Ian Romanick mailto:ian.d.roman...@intel.com>>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
<https://bugs.freedesktop.org/show_bug.cgi?id=102678>
---
 src/intel/compiler/brw_nir.c     | 14 +
 src/intel/compiler/brw_vec4.cpp          | 14 +
 src/mesa/drivers/dri/i965/brw_context.h      | 32
++-
 src/mesa/drivers/dri/i965/brw_draw.c         | 45
++-
 src/mesa/drivers/dri/i965/brw_draw_upload.c  | 14 -
 src/mesa/drivers/dri/i965/genX_state_upload.c | 38
+++---
 6 files changed, 97 insertions(+), 60 deletions(-)

diff --git a/src/intel/compiler/brw_nir.c
b/src/intel/compiler/brw_nir.c
index 16b0d86814f..16ab529737b 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -238,8 +238,7 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
     */
    const bool has_sgvs =
       nir->info.system_values_read &
-      (BITFIELD64_BIT(SYSTEM_VALUE_BASE_VERTEX) |
-       BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
+      (BITFIELD64_BIT(SYSTEM_VALUE_FIRST_VERTEX) |
        BITFIELD64_BIT(SYSTEM_VALUE_BASE_INSTANCE) |
        BITFIELD64_BIT(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE) |
        BITFIELD64_BIT(SYSTEM_VALUE_INSTANCE_ID));
@@ -279,7 +278,6 @@ brw_nir_lower_vs_inputs(nir_shader *nir,

nir_intrinsic_set_base(load, num_inputs);
                switch (intrin->intrinsic) {
-               case nir_intrinsic_load_base_vertex:
                case nir_intrinsic_load_first_vertex:
 nir_intrinsic_set_component(load, 0);
                   break;
@@ -293,11 +291,15 @@ brw_nir_lower_vs_inputs(nir_shader *nir,
 nir_intrinsic_set_component(load, 3);
                   break;
                case nir_intrinsic_load_draw_id:
-                  /* gl_DrawID is stored right after
gl_VertexID and friends
-                   * if any of them exist.
+               case nir_intrinsic_load_base_vertex:
+                  /* gl_DrawID and gl_BaseVertex are stored
right after
+                     gl_VertexID and friends if any of them
exist.
                    */
 nir_intrinsic_set_base(load, num_inputs + has_sgvs);
- nir_intrinsic_set_component(load, 0);
+                  if (intrin->intrinsic ==
nir_intrinsic_load_draw_id)
+  nir_intrinsic_set_component(load, 0);
+                  else
+  nir_intrinsic_set_component(load, 1);
                   break;
                default:
                   unreachable("Invalid system value intrinsic");
diff --git a/src/intel/compiler/brw_vec4.cpp
b/src/intel/compiler/brw_vec4.cpp
index 1e384f5bf4d..d33caefdea9 100644
--- a/src/intel/compiler/br

92 matches

Mail list logo