Re: [Mesa-dev] [PATCH] i965/skl: Use 1 register for uniform pull constant payload

2015-02-20 Thread Timo Aaltonen
On 20.02.2015 08:48, Ben Widawsky wrote:
> When under dispatch_width=16 the previous code would allocate 2 registers for
> the payload when only one is needed. This manifested itself through bugs on 
> SKL
> which needs to mess with this instruction.
> 
> Ken though this might impact shader-db, but apparently it doesn't
> 
> Cc: Kenneth Graunke 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index c46e1d7..24125cc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3062,7 +3062,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
>   assert(const_offset_reg.file == IMM &&
>  const_offset_reg.type == BRW_REGISTER_TYPE_UD);
>   const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
> - fs_reg payload = vgrf(glsl_type::uint_type);
> + fs_reg payload = fs_reg(GRF, alloc.allocate(1));
>  
>   /* We have to use a message header on Skylake to get SIMD4x2 mode.
>* Reserve space for the register.

Yup, tested on both master and with s/alloc.allocate/virtual_grf_alloc/
on 10.5.0-rc1, so

Tested-by: Timo Aaltonen 

One thing I noticed was that master needed drm-intel-next-2015-02-14 and
10.5.0-rc1 works with an earlier version of dinq, otherwise starting
lightdm might "hang" or logging in the session would take an eternity.
Not related to this patch though, just an observation while testing it
on master & 10.5.0..


-- 
t
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 23/23] main: Cosmetic changes to GetBufferSubData.

2015-02-20 Thread Martin Peres

Please squash this commit with the one introducing GetBufferSubData.

It should be pretty easy to do with git rebase -i :)

On 12/02/2015 04:06, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 2 +-
  src/mesa/main/bufferobj.h | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 0272704..38d8b5a 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1668,7 +1668,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptr offset,
 }
  
 ASSERT(ctx->Driver.GetBufferSubData);

-   ctx->Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
+   ctx->Driver.GetBufferSubData(ctx, offset, size, data, bufObj);
  }
  
  void GLAPIENTRY

diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index feeaa8b..b5d73ae 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -230,8 +230,8 @@ _mesa_NamedBufferSubData(GLuint buffer, GLintptr offset,
   GLsizeiptr size, const GLvoid *data);
  
  void GLAPIENTRY

-_mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
-   GLsizeiptrARB size, void * data);
+_mesa_GetBufferSubData(GLenum target, GLintptr offset,
+   GLsizeiptr size, GLvoid *data);
  
  void GLAPIENTRY

  _mesa_GetNamedBufferSubData(GLuint buffer, GLintptr offset,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: add double support for packing varyings

2015-02-20 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---

This works with a modified varying-packing test for everything except
dvec4 array which crashes in st/mesa somewhere. Still working on that
one. The IR generated here kinda stinks, but I couldn't get an
assignment with a swizzle to work.

 src/glsl/lower_packed_varyings.cpp | 88 +-
 1 file changed, 67 insertions(+), 21 deletions(-)

diff --git a/src/glsl/lower_packed_varyings.cpp 
b/src/glsl/lower_packed_varyings.cpp
index 5e844c7..f9e3e85 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -146,7 +146,11 @@
 
 #include "glsl_symbol_table.h"
 #include "ir.h"
+#include "ir_builder.h"
 #include "ir_optimization.h"
+#include "program/prog_instruction.h"
+
+using namespace ir_builder;
 
 namespace {
 
@@ -168,8 +172,8 @@ public:
void run(exec_list *instructions);
 
 private:
-   ir_assignment *bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
-   ir_assignment *bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
unsigned lower_rvalue(ir_rvalue *rvalue, unsigned fine_location,
  ir_variable *unpacked_var, const char *name,
  bool gs_input_toplevel, unsigned vertex_index);
@@ -274,6 +278,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
}
 }
 
+#define SWIZZLE_ZWZW MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, SWIZZLE_Z, SWIZZLE_W)
 
 /**
  * Make an ir_assignment from \c rhs to \c lhs, performing appropriate
@@ -281,7 +286,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
  *
  * This function is called when packing varyings.
  */
-ir_assignment *
+void
 lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
ir_rvalue *rhs)
 {
@@ -300,12 +305,28 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  rhs = new(this->mem_ctx)
 ir_expression(ir_unop_bitcast_f2i, lhs->type, rhs);
  break;
+  case GLSL_TYPE_DOUBLE:
+ assert(rhs->type->vector_elements <= 2);
+ if (rhs->type->vector_elements == 2) {
+ir_variable *t = new(mem_ctx) ir_variable(lhs->type, "pack", 
ir_var_temporary);
+
+assert(lhs->type->vector_elements == 4);
+this->out_instructions->push_tail(t);
+this->out_instructions->push_tail(
+  assign(t, u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_x(rhs->clone(mem_ctx, NULL, 0x3));
+this->out_instructions->push_tail(
+  assign(t,  u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_y(rhs))), 0xc));
+rhs = deref(t).val;
+ } else {
+rhs = u2i(expr(ir_unop_unpack_double_2x32, rhs));
+ }
+ break;
   default:
  assert(!"Unexpected type conversion while lowering varyings");
  break;
   }
}
-   return new(this->mem_ctx) ir_assignment(lhs, rhs);
+   this->out_instructions->push_tail(new (this->mem_ctx) ir_assignment(lhs, 
rhs));
 }
 
 
@@ -315,7 +336,7 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  *
  * This function is called when unpacking varyings.
  */
-ir_assignment *
+void
 lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
  ir_rvalue *rhs)
 {
@@ -334,12 +355,27 @@ 
lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
  rhs = new(this->mem_ctx)
 ir_expression(ir_unop_bitcast_i2f, lhs->type, rhs);
  break;
+  case GLSL_TYPE_DOUBLE:
+ assert(lhs->type->vector_elements <= 2);
+ if (lhs->type->vector_elements == 2) {
+ir_variable *t = new(mem_ctx) ir_variable(lhs->type, "unpack", 
ir_var_temporary);
+assert(rhs->type->vector_elements == 4);
+this->out_instructions->push_tail(t);
+this->out_instructions->push_tail(
+  assign(t, expr(ir_unop_pack_double_2x32, 
i2u(swizzle_xy(rhs->clone(mem_ctx, NULL, 0x1));
+this->out_instructions->push_tail(
+  assign(t, expr(ir_unop_pack_double_2x32, 
i2u(swizzle(rhs->clone(mem_ctx, NULL), SWIZZLE_ZWZW, 2))), 0x2));
+rhs = deref(t).val;
+ } else {
+rhs = expr(ir_unop_pack_double_2x32, i2u(rhs));
+ }
+ break;
   default:
  assert(!"Unexpected type conversion while lowering varyings");
  break;
   }
}
-   return new(this->mem_ctx) ir_assignment(lhs, rhs);
+   this->out_instructions->push_tail(new(this->mem_ctx) ir_assignment(lhs, 
rhs));
 }
 
 
@@ -372,6 +408,7 @@ lower_packed_varyings_visitor::lower_rvalue(ir_rvalue 
*rvalue,
 bool gs_input_toplevel,
 unsigned verte

Re: [Mesa-dev] [PATCH 08/23] main: Add entry point for NamedBufferSubData.

2015-02-20 Thread Martin Peres

On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Ian Romanick
- Remove "_mesa" from name of static software fallback buffer_sub_data.
- Remove mappedRange from _mesa_buffer_sub_data.
- Removed some cosmetic changes to a separate commit.
---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |   7 ++
  src/mesa/main/bufferobj.c  | 129 +++--
  src/mesa/main/bufferobj.h  |   9 ++
  src/mesa/main/tests/dispatch_sanity.cpp|   1 +
  4 files changed, 97 insertions(+), 49 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 7779262..6d70b8e 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -28,6 +28,13 @@

 
  
+   

+  
+  
+  
+  
+   
+
 
  
 

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index ac8eed1..4f89748 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -227,67 +227,58 @@ bufferobj_range_mapped(const struct gl_buffer_object *obj,
   * \c glClearBufferSubData.
   *
   * \param ctx GL context.
- * \param target  Buffer object target on which to operate.
+ * \param bufObj  The buffer object.
   * \param offset  Offset of the first byte of the subdata range.
   * \param sizeSize, in bytes, of the subdata range.
   * \param mappedRange  If true, checks if an overlapping range is mapped.
   * If false, checks if buffer is mapped.
- * \param errorNoBuffer  Error code if no buffer is bound to target.
   * \param caller  Name of calling function for recording errors.
- * \return   A pointer to the buffer object bound to \c target in the
- *   specified context or \c NULL if any of the parameter or state
- *   conditions are invalid.
+ * \return   false if error, true otherwise
   *
   * \sa glBufferSubDataARB, glGetBufferSubDataARB, glClearBufferSubData
   */
-static struct gl_buffer_object *
-buffer_object_subdata_range_good(struct gl_context * ctx, GLenum target,
- GLintptrARB offset, GLsizeiptrARB size,
- bool mappedRange, GLenum errorNoBuffer,
- const char *caller)
+static bool
+buffer_object_subdata_range_good(struct gl_context *ctx,
+ struct gl_buffer_object *bufObj,
+ GLintptr offset, GLsizeiptr size,
+ bool mappedRange, const char *caller)
  {
-   struct gl_buffer_object *bufObj;
-
 if (size < 0) {
_mesa_error(ctx, GL_INVALID_VALUE, "%s(size < 0)", caller);
-  return NULL;
+  return false;
 }
  
 if (offset < 0) {

_mesa_error(ctx, GL_INVALID_VALUE, "%s(offset < 0)", caller);
-  return NULL;
+  return false;
 }
  
-   bufObj = get_buffer(ctx, caller, target, errorNoBuffer);

-   if (!bufObj)
-  return NULL;
-
 if (offset + size > bufObj->Size) {
_mesa_error(ctx, GL_INVALID_VALUE,
"%s(offset %lu + size %lu > buffer size %lu)", caller,
(unsigned long) offset,
(unsigned long) size,
(unsigned long) bufObj->Size);
-  return NULL;
+  return false;
 }
  
 if (bufObj->Mappings[MAP_USER].AccessFlags & GL_MAP_PERSISTENT_BIT)

-  return bufObj;
+  return true;
  
 if (mappedRange) {

if (bufferobj_range_mapped(bufObj, offset, size)) {
   _mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
- return NULL;
+ return false;
}
 }
 else {
if (_mesa_bufferobj_mapped(bufObj, MAP_USER)) {
   _mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
- return NULL;
+ return false;
}
 }
  
-   return bufObj;

+   return true;
  }
  
  
@@ -602,9 +593,9 @@ buffer_data_fallback(struct gl_context *ctx, GLenum target, GLsizeiptr size,

   * \sa glBufferSubDataARB, dd_function_table::BufferSubData.
   */
  static void
-_mesa_buffer_subdata( struct gl_context *ctx, GLintptrARB offset,
- GLsizeiptrARB size, const GLvoid * data,
- struct gl_buffer_object * bufObj )
+buffer_sub_data_fallback(struct gl_context *ctx, GLintptr offset,
+ GLsizeiptr size, const GLvoid *data,
+ struct gl_buffer_object *bufObj)
  {
 (void) ctx;
  
@@ -1113,7 +1104,7 @@ _mesa_init_buffer_object_functions(struct dd_function_table *driver)

 driver->NewBufferObject = _mesa_new_buffer_object;
 driver->DeleteBuffer = _mesa_delete_buffer_object;
 driver->BufferData = buffer_data_fallback;
-   driver->BufferSubData = _mesa_buffer_subdata;
+   driver->BufferSubData = buffer_sub_data_fallback;
 driver->GetBufferSubData = _mesa_buffer_get_subdata;
 driver->UnmapBuff

Re: [Mesa-dev] [PATCH] glsl: add double support for packing varyings

2015-02-20 Thread Ilia Mirkin
On Fri, Feb 20, 2015 at 4:10 AM, Ilia Mirkin  wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>
> This works with a modified varying-packing test for everything except
> dvec4 array which crashes in st/mesa somewhere. Still working on that
> one. The IR generated here kinda stinks, but I couldn't get an
> assignment with a swizzle to work.

Ugh, looks like this is a no-go for GS due to the temp vars. But this
change gets it mostly working. I also had to switch doubles to
*always* getting packed, even for dvec2 and dvec4... I think there's
some variable storage/linking-related logic which doesn't take the
double size into account properly.

  -ilia

>
>  src/glsl/lower_packed_varyings.cpp | 88 
> +-
>  1 file changed, 67 insertions(+), 21 deletions(-)
>
> diff --git a/src/glsl/lower_packed_varyings.cpp 
> b/src/glsl/lower_packed_varyings.cpp
> index 5e844c7..f9e3e85 100644
> --- a/src/glsl/lower_packed_varyings.cpp
> +++ b/src/glsl/lower_packed_varyings.cpp
> @@ -146,7 +146,11 @@
>
>  #include "glsl_symbol_table.h"
>  #include "ir.h"
> +#include "ir_builder.h"
>  #include "ir_optimization.h"
> +#include "program/prog_instruction.h"
> +
> +using namespace ir_builder;
>
>  namespace {
>
> @@ -168,8 +172,8 @@ public:
> void run(exec_list *instructions);
>
>  private:
> -   ir_assignment *bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
> -   ir_assignment *bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
> +   void bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
> +   void bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
> unsigned lower_rvalue(ir_rvalue *rvalue, unsigned fine_location,
>   ir_variable *unpacked_var, const char *name,
>   bool gs_input_toplevel, unsigned vertex_index);
> @@ -274,6 +278,7 @@ lower_packed_varyings_visitor::run(exec_list 
> *instructions)
> }
>  }
>
> +#define SWIZZLE_ZWZW MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, SWIZZLE_Z, 
> SWIZZLE_W)
>
>  /**
>   * Make an ir_assignment from \c rhs to \c lhs, performing appropriate
> @@ -281,7 +286,7 @@ lower_packed_varyings_visitor::run(exec_list 
> *instructions)
>   *
>   * This function is called when packing varyings.
>   */
> -ir_assignment *
> +void
>  lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
> ir_rvalue *rhs)
>  {
> @@ -300,12 +305,28 @@ 
> lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
>   rhs = new(this->mem_ctx)
>  ir_expression(ir_unop_bitcast_f2i, lhs->type, rhs);
>   break;
> +  case GLSL_TYPE_DOUBLE:
> + assert(rhs->type->vector_elements <= 2);
> + if (rhs->type->vector_elements == 2) {
> +ir_variable *t = new(mem_ctx) ir_variable(lhs->type, "pack", 
> ir_var_temporary);
> +
> +assert(lhs->type->vector_elements == 4);
> +this->out_instructions->push_tail(t);
> +this->out_instructions->push_tail(
> +  assign(t, u2i(expr(ir_unop_unpack_double_2x32, 
> swizzle_x(rhs->clone(mem_ctx, NULL, 0x3));
> +this->out_instructions->push_tail(
> +  assign(t,  u2i(expr(ir_unop_unpack_double_2x32, 
> swizzle_y(rhs))), 0xc));
> +rhs = deref(t).val;
> + } else {
> +rhs = u2i(expr(ir_unop_unpack_double_2x32, rhs));
> + }
> + break;
>default:
>   assert(!"Unexpected type conversion while lowering varyings");
>   break;
>}
> }
> -   return new(this->mem_ctx) ir_assignment(lhs, rhs);
> +   this->out_instructions->push_tail(new (this->mem_ctx) ir_assignment(lhs, 
> rhs));
>  }
>
>
> @@ -315,7 +336,7 @@ 
> lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
>   *
>   * This function is called when unpacking varyings.
>   */
> -ir_assignment *
> +void
>  lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
>   ir_rvalue *rhs)
>  {
> @@ -334,12 +355,27 @@ 
> lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
>   rhs = new(this->mem_ctx)
>  ir_expression(ir_unop_bitcast_i2f, lhs->type, rhs);
>   break;
> +  case GLSL_TYPE_DOUBLE:
> + assert(lhs->type->vector_elements <= 2);
> + if (lhs->type->vector_elements == 2) {
> +ir_variable *t = new(mem_ctx) ir_variable(lhs->type, "unpack", 
> ir_var_temporary);
> +assert(rhs->type->vector_elements == 4);
> +this->out_instructions->push_tail(t);
> +this->out_instructions->push_tail(
> +  assign(t, expr(ir_unop_pack_double_2x32, 
> i2u(swizzle_xy(rhs->clone(mem_ctx, NULL, 0x1));
> +this->out_instructions->push_tail(
> +  assign(t, expr(ir_unop_pack_double_2x32, 
> i2u(swizzle(rhs->clone(mem_ctx, NULL), SWIZZLE_ZWZW, 2))), 0x2));
> +rhs = d

Re: [Mesa-dev] [PATCH 09/23] main: Cosmetic changes to BufferSubData infrastructure.

2015-02-20 Thread Martin Peres

Be more precise in the commit message. I propose:
report more explicit error messages in the BufferSubData infrastructure

Other than that:
Reviewed-by: Martin Peres 

On 12/02/2015 04:05, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 10 +++---
  src/mesa/main/bufferobj.h |  4 ++--
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 4f89748..fc01d02 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -267,13 +267,17 @@ buffer_object_subdata_range_good(struct gl_context *ctx,
  
 if (mappedRange) {

if (bufferobj_range_mapped(bufObj, offset, size)) {
- _mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "%s(range is mapped without persistent bit)",
+ caller);
   return false;
}
 }
 else {
if (_mesa_bufferobj_mapped(bufObj, MAP_USER)) {
- _mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "%s(buffer is mapped without persistent bit)",
+ caller);
   return false;
}
 }
@@ -1613,7 +1617,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct 
gl_buffer_object *bufObj,
 bufObj->Written = GL_TRUE;
  
 ASSERT(ctx->Driver.BufferSubData);

-   ctx->Driver.BufferSubData( ctx, offset, size, data, bufObj );
+   ctx->Driver.BufferSubData(ctx, offset, size, data, bufObj);
  }
  
  void GLAPIENTRY

diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index 889bbb1..d15ad00 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -190,8 +190,8 @@ _mesa_NamedBufferData(GLuint buffer, GLsizeiptr size,
const GLvoid *data, GLenum usage);
  
  void GLAPIENTRY

-_mesa_BufferSubData(GLenum target, GLintptrARB offset,
-GLsizeiptrARB size, const GLvoid * data);
+_mesa_BufferSubData(GLenum target, GLintptr offset,
+GLsizeiptr size, const GLvoid *data);
  
  void GLAPIENTRY

  _mesa_NamedBufferSubData(GLuint buffer, GLintptr offset,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Set pixel/sample mask for compute shaders atomic ops

2015-02-20 Thread Francisco Jerez
Jordan Justen  writes:

> On 2015-02-19 21:40:37, Ben Widawsky wrote:
>> On Thu, Feb 19, 2015 at 03:42:05PM -0800, Jordan Justen wrote:
>> > For fragment programs, we pull this mask from the payload header. The same
>> > mask doesn't exist for compute shaders, so we set all bits to enabled.
>> > 
>> > Note: this mask is ANDed with the execution mask, so some channels may not 
>> > end
>> > up issuing the atomic operation.
>> > 
>> > Signed-off-by: Jordan Justen 
>> > Cc: Ben Widawsky 
>> > Cc: Francisco Jerez 
>> 
>> Just add to the commit message that this is needed specifically because 
>> compute
>> is invoked as SIMD16 (and perhaps reference the other commits?) and it's:
>> Reviewed-by: Ben Widawsky 
>
> Good idea. I'll add those.
>
>> Sorry it advance...
>> we may as well just go for 0x in case we ever support SIMD32.
>
> I had been setting all 32-bits previously. I mentioned to you that I
> thought this was needed for SIMD32. I wanted to double check it before
> sending the patch out. I think I found the field for IVB in the PRM:
>
> IVB Vol 4 Part 1 3.9.9.9 Message Header
> Pixel/Sample Mask
>
> ...and it looks like it is only 16-bits. Maybe Francisco can confirm
> that I got it right.
>
> I couldn't find this same information in the HSW PRMs.
>
> I'm not sure what that means for SIMD32.
>
Yeah, it's only 16 bits.  For SIMD32 it means that, *sigh*, we'll have
to split up the message in several SIMD16 chunks (my image load store
branch has to do something similar to do typed surface operations in
SIMD16 mode because there are only 8-wide variants of those messages).
To initialize the header we would just copy the first or second 16-bit
half of the sample mask, and set the quarter control bits appropriately
for each half so the execution mask is taken into account correctly.

With Ben's and Matt's suggestions this patch is:
Reviewed-by: Francisco Jerez 


> -Jordan
>
>> > ---
>> >  While it's fresh in our minds. :)
>> > 
>> >  This seems to work for gen7 & gen8 CS. For CS simd16, we need the
>> >  0x change, but it seems to work fine for simd8 as well.
>> > 
>> >  I also tested gen8 (simd8vs), and there were no piglit regressions.
>> > 
>> >  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 8 
>> >  1 file changed, 4 insertions(+), 4 deletions(-)
>> > 
>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
>> > b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> > index 24cc118..960a0aa 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> > @@ -2998,9 +2998,9 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, 
>> > unsigned surf_index,
>> > * mask sent in the header to compute the actual set of channels 
>> > that execute
>> > * the atomic operation.
>> > */
>> > -  assert(stage == MESA_SHADER_VERTEX);
>> > +  assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
>> >emit(MOV(component(sources[0], 7),
>> > -   brw_imm_ud(0xff)))->force_writemask_all = true;
>> > +   brw_imm_ud(0x)))->force_writemask_all = true;
>> > }
>> > length++;
>> >  
>> > @@ -3061,9 +3061,9 @@ fs_visitor::emit_untyped_surface_read(unsigned 
>> > surf_index, fs_reg dst,
>> > * mask sent in the header to compute the actual set of channels 
>> > that execute
>> > * the atomic operation.
>> > */
>> > -  assert(stage == MESA_SHADER_VERTEX);
>> > +  assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
>> >emit(MOV(component(sources[0], 7),
>> > -   brw_imm_ud(0xff)))->force_writemask_all = true;
>> > +   brw_imm_ud(0x)))->force_writemask_all = true;
>> > }
>> >  
>> > /* Set the surface read offset. */
>> > -- 
>> > 2.1.4
>> > 


pgpppOi6R6AAi.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/skl: Use 1 register for uniform pull constant payload

2015-02-20 Thread Kenneth Graunke
On Thursday, February 19, 2015 10:48:08 PM Ben Widawsky wrote:
> When under dispatch_width=16 the previous code would allocate 2 registers for
> the payload when only one is needed. This manifested itself through bugs on 
> SKL
> which needs to mess with this instruction.
> 
> Ken though this might impact shader-db, but apparently it doesn't
> 
> Cc: Kenneth Graunke 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index c46e1d7..24125cc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3062,7 +3062,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
>   assert(const_offset_reg.file == IMM &&
>  const_offset_reg.type == BRW_REGISTER_TYPE_UD);
>   const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
> - fs_reg payload = vgrf(glsl_type::uint_type);
> + fs_reg payload = fs_reg(GRF, alloc.allocate(1));
>  
>   /* We have to use a message header on Skylake to get SIMD4x2 mode.
>* Reserve space for the register.
> 

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand  writes:

> I'm still a little pensive.  But
>
> Reviewed-by: Jason Ekstrand 
>
Thanks.

> Now for a little aside.  I have come to the conclusion that I made a grave
> mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
> just subclassed fs_inst for load_payload.  The problem is that we need to
> snag a bunch of information for the sources when we create the
> load_payload.  In particular, we need to know the width of the source so
> that we know how much space it consumes in the payload and we need to know
> the information required to properly re-create the mov such as
> force_sechalf and force_writemask_all.  Really, in order to do things
> properly, we need to gather this information *before* we do any
> optimizations.  The nasty pile of code that you're editing together with
> the "effective_width" parameter is a lame attempt to capture/reconstruct
> this information.  Really, we should just subclass, capture the information
> up-front, and do it properly.
>
Yes, absolutely, this lowering pass is a real mess.  There are four more
unreviewed patches in the mailing list fixing bugs of the metadata
guessing of lower_load_payload() [1][2][3][4], you may want to take a
look at them.  There are more bugs I'm aware of but it didn't seem
practical to fix them.

That said, I don't think it would be worth subclassing fs_inst.

My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
into a series of MOVs with force_writemask_all enabled in all cases,
simply rely on the optimizer to eliminate those where possible.  Then
get rid of the metadata and effective_width guessing.  Instead agree on
immediates and uniforms being exec_size-wide by convention
(i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
then prevent constant propagation from propagating an immediate load
into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
program, and maybe just run constant propagation after lowering so we
can be sure those cases are cleaned up properly where register coalesce
isn't enough.

[1] http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
[2] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
[3] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
[4] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html


> --Jason
>
> On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand 
> wrote:
>
>>
>>
>> On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez 
>> wrote:
>>
>>> Jason Ekstrand  writes:
>>>
>>> > On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez <
>>> curroje...@riseup.net>
>>> > wrote:
>>> >
>>> >> Jason Ekstrand  writes:
>>> >>
>>> >> > On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez <
>>> curroje...@riseup.net>
>>> >> > wrote:
>>> >> >
>>> >> >> Hey Matt,
>>> >> >>
>>> >> >> Matt Turner  writes:
>>> >> >>
>>> >> >> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <
>>> >> curroje...@riseup.net>
>>> >> >> wrote:
>>> >> >> >> MRFs cannot be read from anyway so they cannot possibly be a
>>> valid
>>> >> >> >> source of LOAD_PAYLOAD.
>>> >> >> >> ---
>>> >> >> >
>>> >> >> > The function only seems to test inst->dst.file == MRF. I don't
>>> see any
>>> >> >> > code for handling MRF sources. What am I missing?
>>> >> >>
>>> >> >> That test is for "handling" MRF sources -- More precisely, it's
>>> >> >> collecting the writemask and half flags for MRF writes, which can
>>> only
>>> >> >> possibly be useful if we're going to use them later on to read
>>> something
>>> >> >> out of an MRF into a payload, which we shouldn't be doing in the
>>> first
>>> >> >> place.
>>> >> >>
>>> >> >> Aside from simplifying the function somewhat, that allows us to
>>> drop the
>>> >> >> 16 register gap reserved for MRFs at register offset zero, what will
>>> >> >> allow us to drop the vgrf_to_reg[] offset calculation completely
>>> (also
>>> >> >> in split_virtual_grfs()) in a future patch (not sent for review
>>> yet).
>>> >> >>
>>> >> >
>>> >> > No, we do read from MRF's sort-of...  Send messages have an implicit
>>> >> "read"
>>> >> > from an MRF.
>>> >>
>>> >> Heh, and that's pretty much the only way you "read" from it.
>>> >>
>>> >> > This was written precicely so that we could use LOAD_PAYLOAD
>>> >> > to build MRF payloads.  We do on pre-GEN6.
>>> >> >
>>> >> I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
>>> >> *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
>>> >> anyway.
>>> >>
>>> >
>>> > And no one is using it that way.  All of the metadata checks you are
>>> > deleting are checks on the *destination*.
>>> >
>>>
>>> Didn't you write this code yourself?  The only use for the collected
>>> metadata is initializing the instruction flags of the MOVs subsequent
>>> LOAD_PAYLOAD instructions are lowered to, based on the metadata already
>>> collected for its source registers, which can never be MRFs, so the
>>> metadata you collect from

[Mesa-dev] [PATCH] mesa: Adds missing error condition in _mesa_check_sample_count()

2015-02-20 Thread Eduardo Lima Mitev
This corrects a trivial error introduced in commit
19252fee46b835cb4f6b1cce18d7737d62b64a2e. That patch was merged recently
and omits one condition (that 'samples' is greater than zero) in one of
the error checks. That error will definitely cause regressions.
---
 src/mesa/main/multisample.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
index b696de9..50a8a11 100644
--- a/src/mesa/main/multisample.c
+++ b/src/mesa/main/multisample.c
@@ -155,7 +155,8 @@ _mesa_check_sample_count(struct gl_context *ctx, GLenum 
target,
 * "If internalformat is a signed or unsigned integer format and samples
 * is greater than zero, then the error INVALID_OPERATION is generated."
 */
-   if (_mesa_is_gles3(ctx) && _mesa_is_enum_format_integer(internalFormat)) {
+   if (_mesa_is_gles3(ctx) && _mesa_is_enum_format_integer(internalFormat)
+   && samples > 0) {
   return GL_INVALID_OPERATION;
}
 
-- 
2.1.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/23] main: Add entry point for CopyNamedBufferSubData.

2015-02-20 Thread Martin Peres

Looks good to me.

Reviewed-by: Martin Peres 

On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: remove _mesa in front of static software fallback.
---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  8 +++
  src/mesa/main/bufferobj.c  | 99 +-
  src/mesa/main/bufferobj.h  | 12 
  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
  4 files changed, 87 insertions(+), 33 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 6d70b8e..042b2a8 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -35,6 +35,14 @@

 
  
+   

+  
+  
+  
+  
+  
+   
+
 
  
 

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index fc01d02..7225b64 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -761,11 +761,11 @@ _mesa_buffer_unmap(struct gl_context *ctx, struct 
gl_buffer_object *bufObj,
   * Called via glCopyBufferSubData().
   */
  static void
-_mesa_copy_buffer_subdata(struct gl_context *ctx,
-  struct gl_buffer_object *src,
-  struct gl_buffer_object *dst,
-  GLintptr readOffset, GLintptr writeOffset,
-  GLsizeiptr size)
+copy_buffer_sub_data_fallback(struct gl_context *ctx,
+  struct gl_buffer_object *src,
+  struct gl_buffer_object *dst,
+  GLintptr readOffset, GLintptr writeOffset,
+  GLsizeiptr size)
  {
 GLubyte *srcPtr, *dstPtr;
  
@@ -1120,7 +1120,7 @@ _mesa_init_buffer_object_functions(struct dd_function_table *driver)

 driver->FlushMappedBufferRange = _mesa_buffer_flush_mapped_range;
  
 /* GL_ARB_copy_buffer */

-   driver->CopyBufferSubData = _mesa_copy_buffer_subdata;
+   driver->CopyBufferSubData = copy_buffer_sub_data_fallback;
  }
  
  
@@ -2101,65 +2101,54 @@ _mesa_GetBufferPointerv(GLenum target, GLenum pname, GLvoid **params)

  }
  
  
-void GLAPIENTRY

-_mesa_CopyBufferSubData(GLenum readTarget, GLenum writeTarget,
-GLintptr readOffset, GLintptr writeOffset,
-GLsizeiptr size)
+void
+_mesa_copy_buffer_sub_data(struct gl_context *ctx,
+   struct gl_buffer_object *src,
+   struct gl_buffer_object *dst,
+   GLintptr readOffset, GLintptr writeOffset,
+   GLsizeiptr size, const char *func)
  {
-   GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object *src, *dst;
-
-   src = get_buffer(ctx, "glCopyBufferSubData", readTarget,
-GL_INVALID_OPERATION);
-   if (!src)
-  return;
-
-   dst = get_buffer(ctx, "glCopyBufferSubData", writeTarget,
-GL_INVALID_OPERATION);
-   if (!dst)
-  return;
-
 if (_mesa_check_disallowed_mapping(src)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
-  "glCopyBufferSubData(readBuffer is mapped)");
+  "%s(readBuffer is mapped)", func);
return;
 }
  
 if (_mesa_check_disallowed_mapping(dst)) {

_mesa_error(ctx, GL_INVALID_OPERATION,
-  "glCopyBufferSubData(writeBuffer is mapped)");
+  "%s(writeBuffer is mapped)", func);
return;
 }
  
 if (readOffset < 0) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  "glCopyBufferSubData(readOffset = %d)", (int) readOffset);
+  "%s(readOffset %d < 0)", func, (int) readOffset);
return;
 }
  
 if (writeOffset < 0) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  "glCopyBufferSubData(writeOffset = %d)", (int) writeOffset);
+  "%s(writeOffset %d < 0)", func, (int) writeOffset);
return;
 }
  
 if (size < 0) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  "glCopyBufferSubData(writeOffset = %d)", (int) size);
+  "%s(size %d < 0)", func, (int) size);
return;
 }
  
 if (readOffset + size > src->Size) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  "glCopyBufferSubData(readOffset + size = %d)",
-  (int) (readOffset + size));
+  "%s(readOffset %d + size %d > src_buffer_size %d)", func,
+  (int) readOffset, (int) size, (int) src->Size);
return;
 }
  
 if (writeOffset + size > dst->Size) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  "glCopyBufferSubData(writeOffset + size = %d)",
-  (int) (writeOffset + size));
+  "%s(writeOffset %d + size %d > dst_buffer_size %d)", func,
+  (int) writeOffset, (int) size, (int) dst->Size);
return;
 }
  
@@ -2173,7 +2162,7 @@ _mesa_Co

Re: [Mesa-dev] [PATCH 11/23] main: Refactor ClearBuffer[Sub]Data.

2015-02-20 Thread Martin Peres

On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Jason Ekstrand
- Split refactor of clear buffer sub data from addition of DSA entry
  points.
---
  src/mesa/main/bufferobj.c| 125 ---
  src/mesa/main/bufferobj.h|  19 ++--
  src/mesa/state_tracker/st_cb_bufferobjects.c |   4 +-
  3 files changed, 69 insertions(+), 79 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 7225b64..b8fa917 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -660,11 +660,11 @@ _mesa_buffer_get_subdata( struct gl_context *ctx, 
GLintptrARB offset,
   * dd_function_table::ClearBufferSubData.
   */
  void
-_mesa_buffer_clear_subdata(struct gl_context *ctx,
-   GLintptr offset, GLsizeiptr size,
-   const GLvoid *clearValue,
-   GLsizeiptr clearValueSize,
-   struct gl_buffer_object *bufObj)
+_mesa_ClearBufferSubData_sw(struct gl_context *ctx,


Interesting choice of naming the function as a mix of camel case and 
underscores.


Since it is an internal function, shouldn't it only have underscores?

Other than that:

Reviewed-by: Martin Peres 

+GLintptr offset, GLsizeiptr size,
+const GLvoid *clearValue,
+GLsizeiptr clearValueSize,
+struct gl_buffer_object *bufObj)
  {
 GLsizeiptr i;
 GLubyte *dest;
@@ -1113,7 +1113,7 @@ _mesa_init_buffer_object_functions(struct 
dd_function_table *driver)
 driver->UnmapBuffer = _mesa_buffer_unmap;
  
 /* GL_ARB_clear_buffer_object */

-   driver->ClearBufferSubData = _mesa_buffer_clear_subdata;
+   driver->ClearBufferSubData = _mesa_ClearBufferSubData_sw;
  
 /* GL_ARB_map_buffer_range */

 driver->MapBufferRange = _mesa_buffer_map_range;
@@ -1671,57 +1671,77 @@ _mesa_GetBufferSubData(GLenum target, GLintptr offset,
 ctx->Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
  }
  
-

-void GLAPIENTRY
-_mesa_ClearBufferData(GLenum target, GLenum internalformat, GLenum format,
-  GLenum type, const GLvoid* data)
+/**
+ * \param subdata   true if caller is *SubData, false if *Data
+ */
+void
+_mesa_clear_buffer_sub_data(struct gl_context *ctx,
+struct gl_buffer_object *bufObj,
+GLenum internalformat,
+GLintptr offset, GLsizeiptr size,
+GLenum format, GLenum type,
+const GLvoid *data,
+const char *func, bool subdata)
  {
-   GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object* bufObj;
 mesa_format mesaFormat;
 GLubyte clearValue[MAX_PIXEL_BYTES];
 GLsizeiptr clearValueSize;
  
-   bufObj = get_buffer(ctx, "glClearBufferData", target, GL_INVALID_VALUE);

-   if (!bufObj) {
-  return;
-   }
-
-   if (_mesa_check_disallowed_mapping(bufObj)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glClearBufferData(buffer currently mapped)");
+   /* This checks for disallowed mappings. */
+   if (!buffer_object_subdata_range_good(ctx, bufObj, offset, size,
+ subdata, func)) {
return;
 }
  
 mesaFormat = validate_clear_buffer_format(ctx, internalformat,

- format, type,
- "glClearBufferData");
+ format, type, func);
+
 if (mesaFormat == MESA_FORMAT_NONE) {
return;
 }
  
 clearValueSize = _mesa_get_format_bytes(mesaFormat);

-   if (bufObj->Size % clearValueSize != 0) {
+   if (offset % clearValueSize != 0 || size % clearValueSize != 0) {
_mesa_error(ctx, GL_INVALID_VALUE,
-  "glClearBufferData(size is not a multiple of "
-  "internalformat size)");
+  "%s(offset or size is not a multiple of "
+  "internalformat size)", func);
return;
 }
  
 if (data == NULL) {

/* clear to zeros, per the spec */
-  ctx->Driver.ClearBufferSubData(ctx, 0, bufObj->Size,
- NULL, clearValueSize, bufObj);
+  if (size > 0) {
+ ctx->Driver.ClearBufferSubData(ctx, offset, size,
+NULL, clearValueSize, bufObj);
+  }
return;
 }
  
 if (!convert_clear_buffer_data(ctx, mesaFormat, clearValue,

-  format, type, data, "glClearBufferData")) {
+  format, type, data, func)) {
return;
 }
  
-   ctx->Driver.ClearBufferSubData(ctx, 0, bufObj->Size,

-  clearValue, clearValueSize, bufObj);
+   if (size > 0) {
+  ctx->Driver.ClearBufferSubData(

Re: [Mesa-dev] [PATCH 13/23] main: Minor whitespace fixes in ClearNamedBuffer[Sub]Data.

2015-02-20 Thread Martin Peres

Please squash this in the previous commit (with git rebase -i).

On 12/02/2015 04:05, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 4 ++--
  src/mesa/main/bufferobj.h | 4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index bd21c8a..88230d6 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1765,10 +1765,10 @@ void GLAPIENTRY
  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
   GLintptr offset, GLsizeiptr size,
   GLenum format, GLenum type,
- const GLvoid* data)
+ const GLvoid *data)
  {
 GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object* bufObj;
+   struct gl_buffer_object *bufObj;
  
 bufObj = get_buffer(ctx, "glClearBufferSubData", target, GL_INVALID_VALUE);

 if (!bufObj)
diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index 5254727..2a66444 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -220,7 +220,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
  void GLAPIENTRY
  _mesa_ClearBufferData(GLenum target, GLenum internalformat,
GLenum format, GLenum type,
-  const GLvoid * data);
+  const GLvoid *data);
  
  void GLAPIENTRY

  _mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
@@ -231,7 +231,7 @@ void GLAPIENTRY
  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
   GLintptr offset, GLsizeiptr size,
   GLenum format, GLenum type,
- const GLvoid * data);
+ const GLvoid *data);
  
  void GLAPIENTRY

  _mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add lowering for double divide to rcp/mul

2015-02-20 Thread Roland Scheidegger
Am 19.02.2015 um 23:47 schrieb Dave Airlie:
> From: Dave Airlie 
> 
> It looks like no hw does div anyways, so we should just
> lower at the GLSL level.
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/glsl/lower_instructions.cpp | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
> index e8a69e7..ac6715b 100644
> --- a/src/glsl/lower_instructions.cpp
> +++ b/src/glsl/lower_instructions.cpp
> @@ -199,7 +199,7 @@ lower_instructions_visitor::sub_to_add_neg(ir_expression 
> *ir)
>  void
>  lower_instructions_visitor::div_to_mul_rcp(ir_expression *ir)
>  {
> -   assert(ir->operands[1]->type->is_float());
> +   assert(ir->operands[1]->type->is_float() || 
> ir->operands[1]->type->is_double());
>  
> /* New expression for the 1.0 / op1 */
> ir_rvalue *expr;
> @@ -327,7 +327,7 @@ lower_instructions_visitor::mod_to_floor(ir_expression 
> *ir)
> /* Don't generate new IR that would need to be lowered in an additional
>  * pass.
>  */
> -   if (lowering(DIV_TO_MUL_RCP) && ir->type->is_float())
> +   if (lowering(DIV_TO_MUL_RCP) && (ir->type->is_float() || 
> ir->type->is_double()))
>div_to_mul_rcp(div_expr);
>  
> ir_expression *const floor_expr =
> @@ -1014,7 +1014,7 @@ lower_instructions_visitor::visit_leave(ir_expression 
> *ir)
> case ir_binop_div:
>if (ir->operands[1]->type->is_integer() && 
> lowering(INT_DIV_TO_MUL_RCP))
>int_div_to_mul_rcp(ir);
> -  else if (ir->operands[1]->type->is_float() && lowering(DIV_TO_MUL_RCP))
> +  else if ((ir->operands[1]->type->is_float() || 
> ir->operands[1]->type->is_double())&& lowering(DIV_TO_MUL_RCP))
>div_to_mul_rcp(ir);
>break;
>  
> 

FWIW I suspect lowering ddiv doesn't really cut it always. Can be done
later though. (d3d11 in fact has ddiv as an "extended double feature"
not just the ordinary double feature, though along with dfma and drcp,
which makes me wonder how you'd do a division with the unextended
version if you don't even have rcp... In any case ddiv is required to be
precise to 0.5 ULP if supported, drcp only 1.0 ULP.)

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/23] main: Add entry points for ClearNamedBuffer[Sub]Data.

2015-02-20 Thread Martin Peres
It could have been split in two, but as no changes affect any 
already-running code,

I do not see any good reason to spend time on splitting this.

Reviewed-by: Martin Peres 

On 12/02/2015 04:05, Laura Ekstrand wrote:

---
  src/mapi/glapi/gen/ARB_direct_state_access.xml | 18 +
  src/mesa/main/bufferobj.c  | 35 ++
  src/mesa/main/bufferobj.h  | 11 
  src/mesa/main/tests/dispatch_sanity.cpp|  2 ++
  4 files changed, 66 insertions(+)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 042b2a8..ce4017b 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -43,6 +43,24 @@

 
  
+   

+  
+  
+  
+  
+  
+   
+
+   
+  
+  
+  
+  
+  
+  
+  
+   
+
 
  
 

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index b8fa917..bd21c8a 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1744,6 +1744,22 @@ _mesa_ClearBufferData(GLenum target, GLenum 
internalformat, GLenum format,
 "glClearBufferData", false);
  }
  
+void GLAPIENTRY

+_mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
+   GLenum format, GLenum type, const GLvoid *data)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   struct gl_buffer_object *bufObj;
+
+   bufObj = _mesa_lookup_bufferobj_err(ctx, buffer, "glClearNamedBufferData");
+   if (!bufObj)
+  return;
+
+   _mesa_clear_buffer_sub_data(ctx, bufObj, internalformat, 0, bufObj->Size,
+   format, type, data,
+   "glClearNamedBufferData", false);
+}
+
  
  void GLAPIENTRY

  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
@@ -1763,6 +1779,25 @@ _mesa_ClearBufferSubData(GLenum target, GLenum 
internalformat,
 "glClearBufferSubData", true);
  }
  
+void GLAPIENTRY

+_mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,
+  GLintptr offset, GLsizeiptr size,
+  GLenum format, GLenum type,
+  const GLvoid *data)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   struct gl_buffer_object *bufObj;
+
+   bufObj = _mesa_lookup_bufferobj_err(ctx, buffer,
+   "glClearNamedBufferSubData");
+   if (!bufObj)
+  return;
+
+   _mesa_clear_buffer_sub_data(ctx, bufObj, internalformat, offset, size,
+   format, type, data,
+   "glClearNamedBufferSubData", true);
+}
+
  
  void * GLAPIENTRY

  _mesa_MapBuffer(GLenum target, GLenum access)
diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index 5911270..5254727 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -223,11 +223,22 @@ _mesa_ClearBufferData(GLenum target, GLenum 
internalformat,
const GLvoid * data);
  
  void GLAPIENTRY

+_mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
+   GLenum format, GLenum type,
+   const GLvoid *data);
+
+void GLAPIENTRY
  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
   GLintptr offset, GLsizeiptr size,
   GLenum format, GLenum type,
   const GLvoid * data);
  
+void GLAPIENTRY

+_mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,
+  GLintptr offset, GLsizeiptr size,
+  GLenum format, GLenum type,
+  const GLvoid *data);
+
  void * GLAPIENTRY
  _mesa_MapBuffer(GLenum target, GLenum access);
  
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp b/src/mesa/main/tests/dispatch_sanity.cpp

index aa8e352..4bd3e10 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -960,6 +960,8 @@ const struct function gl_core_functions_possible[] = {
 { "glNamedBufferData", 45, -1 },
 { "glNamedBufferSubData", 45, -1 },
 { "glCopyNamedBufferSubData", 45, -1 },
+   { "glClearNamedBufferData", 45, -1 },
+   { "glClearNamedBufferSubData", 45, -1 },
 { "glCreateTextures", 45, -1 },
 { "glTextureStorage1D", 45, -1 },
 { "glTextureStorage2D", 45, -1 },


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] gallium: add some more double opcodes to avoid unnecessary lowering

2015-02-20 Thread Roland Scheidegger
Just some minor nits.

Am 20.02.2015 um 00:52 schrieb Ilia Mirkin:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/gallium/auxiliary/tgsi/tgsi_info.c |  5 
>  src/gallium/docs/source/tgsi.rst   | 39 
> ++
>  src/gallium/include/pipe/p_shader_tokens.h |  7 +-
>  3 files changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
> b/src/gallium/auxiliary/tgsi/tgsi_info.c
> index d04f9da..4d838fd 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
> @@ -257,6 +257,11 @@ static const struct tgsi_opcode_info 
> opcode_info[TGSI_OPCODE_LAST] =
> { 1, 1, 0, 0, 0, 0, COMP, "D2U", TGSI_OPCODE_D2U },
> { 1, 1, 0, 0, 0, 0, COMP, "U2D", TGSI_OPCODE_U2D },
> { 1, 1, 0, 0 ,0, 0, COMP, "DRSQ", TGSI_OPCODE_DRSQ },
> +   { 1, 1, 0, 0, 0, 0, COMP, "DTRUNC", TGSI_OPCODE_DTRUNC },
> +   { 1, 1, 0, 0, 0, 0, COMP, "DCEIL", TGSI_OPCODE_DCEIL },
> +   { 1, 1, 0, 0, 0, 0, COMP, "DFLR", TGSI_OPCODE_DFLR },
> +   { 1, 1, 0, 0, 0, 0, COMP, "DROUND", TGSI_OPCODE_DROUND },
> +   { 1, 1, 0, 0, 0, 0, COMP, "DSSG", TGSI_OPCODE_DSSG },
>  };
>  
>  const struct tgsi_opcode_info *
> diff --git a/src/gallium/docs/source/tgsi.rst 
> b/src/gallium/docs/source/tgsi.rst
> index e20af79..15f1e9f 100644
> --- a/src/gallium/docs/source/tgsi.rst
> +++ b/src/gallium/docs/source/tgsi.rst
> @@ -1861,6 +1861,45 @@ two-component vectors with doubled precision in each 
> component.
>  
>dst.zw = src.zw - \lfloor src.zw\rfloor
>  
> +.. opcode:: DTRUNC - Truncate
> +
> +.. math::
> +
> +  dst.xy = trunc(src.xy)
> +
> +  dst.zw = trunc(src.zw)
> +
> +.. opcode:: DCEIL - Ceiling
> +
> +.. math::
> +
> +  dst.xy = \lceil src.xy\rceil
> +
> +  dst.zw = \lceil src.zw\rceil
> +
> +.. opcode:: DFLR - Floor
> +
> +.. math::
> +
> +  dst.xy = \lfloor src.xy\rfloor
> +
> +  dst.zw = \lfloor src.zw\rfloor
> +
> +.. opcode:: DROUND - Fraction
> +
> +.. math::
> +
> +  dst.xy = round(src.xy)
> +
> +  dst.zw = round(src.zw)
> +
> +.. opcode:: DSSG - Set Sign
> +
> +.. math::
> +
> +  dst.xy = (src.xy > 0) ? 1 : (src.xy < 0) ? -1 : 0
> +
> +  dst.zw = (src.zw > 0) ? 1 : (src.zw < 0) ? -1 : 0
I think this would be more obvious written as 1.0/-1.0/0.0 (same goes
for the non-double version, of course).
(As a side note, I'm wondering if this actually has defined NaN
behavior, the glsl language sort of implies a number has to be exactly
one of these 3 cases, and doesn't give an answer what should be returned
for a NaN - maybe a NaN?)

>  
>  .. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
>  
> diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
> b/src/gallium/include/pipe/p_shader_tokens.h
> index fc41cc9..95ac590 100644
> --- a/src/gallium/include/pipe/p_shader_tokens.h
> +++ b/src/gallium/include/pipe/p_shader_tokens.h
> @@ -519,7 +519,12 @@ struct tgsi_property_data {
>  #define TGSI_OPCODE_D2U 215
>  #define TGSI_OPCODE_U2D 216
>  #define TGSI_OPCODE_DRSQ217 /* eg, cayman also has DRSQ */
> -#define TGSI_OPCODE_LAST218
> +#define TGSI_OPCODE_DTRUNC  218 /* nvc0 */
> +#define TGSI_OPCODE_DCEIL   219 /* nvc0 */
> +#define TGSI_OPCODE_DFLR220 /* nvc0 */
> +#define TGSI_OPCODE_DROUND  221 /* nvc0 */
> +#define TGSI_OPCODE_DSSG222
> +#define TGSI_OPCODE_LAST223

I don't think these hw-specific references (nvc0) really fit in there -
the same can be said about some existing ones (cayman...)

>  #define TGSI_SAT_NONE0  /* do not saturate */
>  #define TGSI_SAT_ZERO_ONE1  /* clamp to [0,1] */
> 

Otherwise looks good to me.

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (master): tgsi: add support for flt64 constants

2015-02-20 Thread Brian Paul

On 02/19/2015 03:53 PM, Dave Airlie wrote:

Module: Mesa
Branch: master
Commit: fa43e0443e206740e219d45abefee65bdb2c3ecb
URL:
https://urldefense.proofpoint.com/v2/url?u=http-3A__cgit.freedesktop.org_mesa_mesa_commit_-3Fid-3Dfa43e0443e206740e219d45abefee65bdb2c3ecb&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=wliriAj2Oy285HDmqyh5ciHQpkhmeNqaYMAXn6TDVgo&s=QfqfqTGO4IS5Ib7ukIrfSIR3EGdcXECY38l_K3GlYlo&e=

Author: Dave Airlie 
Date:   Wed Aug 27 09:56:14 2014 +1000

tgsi: add support for flt64 constants

These act like flt32 except they take up two slots, and you
can only add 2 x flt64 constants in one slot.

The main reason they are different is we don't want to match half a flt64
constants against a flt32 constant in the matching code, we need to make
sure we treat both parts of the flt64 as an single structure.

Cleaned up printing/parsing by Ilia Mirkin 

Reviewed-by: Ilia Mirkin 
Signed-off-by: Dave Airlie 

---

  src/gallium/auxiliary/tgsi/tgsi_dump.c |8 +++
  src/gallium/auxiliary/tgsi/tgsi_parse.c|1 +
  src/gallium/auxiliary/tgsi/tgsi_strings.c  |5 +-
  src/gallium/auxiliary/tgsi/tgsi_strings.h  |2 +-
  src/gallium/auxiliary/tgsi/tgsi_text.c |   22 
  src/gallium/auxiliary/tgsi/tgsi_ureg.c |   75 ++--
  src/gallium/auxiliary/tgsi/tgsi_ureg.h |5 ++
  src/gallium/include/pipe/p_shader_tokens.h |1 +
  8 files changed, 113 insertions(+), 6 deletions(-)



[...]


diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index f965b01..5069d13 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -232,6 +232,24 @@ static boolean parse_float( const char **pcur, float *val )
 return TRUE;
  }

+static boolean parse_double( const char **pcur, uint32_t *val0, uint32_t *val1)
+{
+   const char *cur = *pcur;
+   union {
+  double dval;
+  uint32_t uval[2];
+   } v;
+
+   v.dval = strtod(cur, pcur);
+   if (*pcur == cur)
+  return FALSE;
+
+   *val0 = v.uval[0];
+   *val1 = v.uval[1];
+
+   return TRUE;
+}


Hi Dave, I'm seeing a new compiler warning in that code:

tgsi/tgsi_text.c: In function 'parse_double':
tgsi/tgsi_text.c:243:4: warning: passing argument 2 of 'strtod' from 
incompatible pointer type [enabled by default]

v.dval = strtod(cur, pcur);
^
In file included from ../../../src/gallium/include/pipe/p_compiler.h:36:0,
 from ./os/os_misc.h:38,
 from ./util/u_debug.h:42,
 from tgsi/tgsi_text.c:28:
/usr/include/stdlib.h:164:15: note: expected 'char ** restrict' but 
argument is of type 'const char **'

 extern double strtod (const char *__restrict __nptr,
   ^

Can you clean that up?

gcc 4.8.2

-Brian



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v5] mesa: use fi_type in vertex attribute code

2015-02-20 Thread Brian Paul

On 02/19/2015 11:42 AM, marius.pre...@intel.com wrote:

From: Marius Predut 

For 32-bit builds, floating point operations use x86 FPU registers,
not SSE registers.  If we're actually storing an integer in a float
variable, the value might get modified when written to memory.  This
patch changes the VBO code to use the fi_type (float/int union) to
store/copy vertex attributes.

Also, this can improve performance on x86 because moving floats with
integer registers instead of FP registers is faster.

Neil Roberts review:
- include changes on all places that are storing attribute values.
- check with and without -O3 compiler flag.
Brian Paul review:
- use fi_type type instead gl_constant_value type
- fix a bunch of nit-picks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82668
Signed-off-by: Marius Predut 


Looks good.  Thanks.

Unfortunately, there's a few new compiler warnings after applying the 
patch.  I'll try to fix those up before I push the patch.  I'll also do 
some piglit testing...


-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] glCompressedTex(Sub)Image(2,3) on mapped PBOs

2015-02-20 Thread Eduardo Lima Mitev
Hello,

While working on the following dEQP failing tests I ran into an issue
that I think deserves clarification from more experienced people:

dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target
dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target
dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target
dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target

All above tests fail because a CompressedTex(Sub)Image(2,3)D operation
is performed while there is a pixel buffer object bound to
GL_PIXEL_UNPACK_BUFFER and that pixel buffer object is currently mapped.
The tests expect this situation to throw an GL_INVALID_OPERATION but no
error is returned on i965.

In the spec GLES 3.1 spec, page 57, there is this: (and similar wording
in GL 4.5 spec, page 76)

"6.3.2 Effects of Mapping Buffers on Other GL Commands

"Any GL command which attempts to read from, write to, or change the
state of a buffer object may generate an INVALID_OPERATION error if all
or part of the buffer object is mapped. However, only commands which
explicitly describe this error are required to do so. If an error is not
generated, using such commands to perform invalid reads, writes, or
state changes will have undefined results and may result in GL
interruption or termination."

There is no further restriction in gl(Compressed)Tex(Sub)Image(2,3)D
that makes throwing an error mandatory, hence I suspect these tests
might be wrong.


However, I found that glTex(Sub)Image(2,3)D do emits an
GL_INVALID_OPERATION error when executed while a mapped PBO is bound.
But the check is performed in driver code (as opposed to API entry
points). (Concretely, in i965/intel_tex_image.c::intelTexImage() and
i965/intel_tex_subimage.c::intelTexSubImage() which both call
common/meta_tex_subimage.c::_mesa_meta_pbo_TexSubImage()).

So, the question is:

Shouldn't these operations on compressed texture images have the same
restriction as their non-compressed counter-parts?

Or is there a reason why this restriction is enforced only on
non-compressed images?

Also, shouldn't this restriction be moved to common API entry points, or
are there drivers that do write to PBOs while they are mapped?

cheers,
Eduardo
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] mesa: Make bitset.h not rely on Mesa-specific types and functions.

2015-02-20 Thread Jose Fonseca

On 12/02/15 00:48, Eric Anholt wrote:

Note that we can't use u_math.h's align() because it's a function instead
of a macro, while BITSET_DECLARE needs a constant expression for nouveau's
usage in global declarations.
---
  src/mesa/main/bitset.h | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/bitset.h b/src/mesa/main/bitset.h
index 2558da4..6f9bed9 100644
--- a/src/mesa/main/bitset.h
+++ b/src/mesa/main/bitset.h
@@ -31,19 +31,18 @@
  #ifndef BITSET_H
  #define BITSET_H

-#include "imports.h"
-#include "macros.h"
+#include "util/u_math.h"

  /
   * generic bitset implementation
   */

-#define BITSET_WORD GLuint
+#define BITSET_WORD unsigned int
  #define BITSET_WORDBITS (sizeof (BITSET_WORD) * 8)

  /* bitset declarations
   */
-#define BITSET_WORDS(bits) (ALIGN(bits, BITSET_WORDBITS) / BITSET_WORDBITS)
+#define BITSET_WORDS(bits) ((bits + BITSET_WORDBITS - 1) / BITSET_WORDBITS)


bits -> (bits)

just in case...


  #define BITSET_DECLARE(name, bits) BITSET_WORD name[BITSET_WORDS(bits)]

  /* bitset operations



With it

Reviewed-by: Jose Fonseca 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] mesa: Add gallium include dirs to more parts of the tree.

2015-02-20 Thread Jose Fonseca

On 13/02/15 00:49, Eric Anholt wrote:

Jose Fonseca  writes:


Thanks for doing this. I appreciate it.

I have no objection with the series.  I'm happy to see more reuse.  We
can always move things around later, and it will be much easier when
then are less entangled/duplicated.

We'll need to update SCons include paths too.  If you have a git repos
with your series that I can pull from I'll give it a go tomorrow and
provide a patch for it.


gallium-includes of my tree.



Sorry it took this long to come back to on this.  I was sick for a 
couple days then had to rush to catch up.


I've tried to build your branch, and it looks like with your "v2: Try to 
patch up the scons bits." update everything just works.



Apart of one minor comment (in a seperate email), the series looks good:

Reviewed-by: Jose Fonseca 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] glCompressedTex(Sub)Image(2,3) on mapped PBOs

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 8:13 AM, Eduardo Lima Mitev 
wrote:

> Hello,
>
> While working on the following dEQP failing tests I ran into an issue
> that I think deserves clarification from more experienced people:
>
>
> dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target
>
> dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target
>
> dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target
>
> dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target
>
> All above tests fail because a CompressedTex(Sub)Image(2,3)D operation
> is performed while there is a pixel buffer object bound to
> GL_PIXEL_UNPACK_BUFFER and that pixel buffer object is currently mapped.
> The tests expect this situation to throw an GL_INVALID_OPERATION but no
> error is returned on i965.
>
> In the spec GLES 3.1 spec, page 57, there is this: (and similar wording
> in GL 4.5 spec, page 76)
>
> "6.3.2 Effects of Mapping Buffers on Other GL Commands
>
> "Any GL command which attempts to read from, write to, or change the
> state of a buffer object may generate an INVALID_OPERATION error if all
> or part of the buffer object is mapped. However, only commands which
> explicitly describe this error are required to do so. If an error is not
> generated, using such commands to perform invalid reads, writes, or
> state changes will have undefined results and may result in GL
> interruption or termination."
>
> There is no further restriction in gl(Compressed)Tex(Sub)Image(2,3)D
> that makes throwing an error mandatory, hence I suspect these tests
> might be wrong.
>
>
> However, I found that glTex(Sub)Image(2,3)D do emits an
> GL_INVALID_OPERATION error when executed while a mapped PBO is bound.
> But the check is performed in driver code (as opposed to API entry
> points). (Concretely, in i965/intel_tex_image.c::intelTexImage() and
> i965/intel_tex_subimage.c::intelTexSubImage() which both call
> common/meta_tex_subimage.c::_mesa_meta_pbo_TexSubImage()).
>
> So, the question is:
>
> Shouldn't these operations on compressed texture images have the same
> restriction as their non-compressed counter-parts?
>
> Or is there a reason why this restriction is enforced only on
> non-compressed images?
>
> Also, shouldn't this restriction be moved to common API entry points, or
> are there drivers that do write to PBOs while they are mapped?
>

IMHO, that's exactly what we need to do.  I've been toying with writing a
patch to do just that.  Right now, we're leaving checking if the buffer is
mapped and checking if it has enough size up to the device-specific driver
backend.  This is, in general, not how the rest of mesa does error-checking
so it seems very odd.  Also, it means that if a driver misses the check it
causes a GL API failure not just a texture upload error.  To make it worse,
they all check it by calling the same function inside mesa/main so there's
really no excuse.  I haven't seen any good reason why we aren't just
calling that in the error-checking section of the entry-point.

--Jason


> cheers,
> Eduardo
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > I'm still a little pensive.  But
> >
> > Reviewed-by: Jason Ekstrand 
> >
> Thanks.
>
> > Now for a little aside.  I have come to the conclusion that I made a
> grave
> > mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
> > just subclassed fs_inst for load_payload.  The problem is that we need to
> > snag a bunch of information for the sources when we create the
> > load_payload.  In particular, we need to know the width of the source so
> > that we know how much space it consumes in the payload and we need to
> know
> > the information required to properly re-create the mov such as
> > force_sechalf and force_writemask_all.  Really, in order to do things
> > properly, we need to gather this information *before* we do any
> > optimizations.  The nasty pile of code that you're editing together with
> > the "effective_width" parameter is a lame attempt to capture/reconstruct
> > this information.  Really, we should just subclass, capture the
> information
> > up-front, and do it properly.
> >
> Yes, absolutely, this lowering pass is a real mess.  There are four more
> unreviewed patches in the mailing list fixing bugs of the metadata
> guessing of lower_load_payload() [1][2][3][4], you may want to take a
> look at them.  There are more bugs I'm aware of but it didn't seem
> practical to fix them.
>

Yeah, Matt pointed me at those.  I'll give them a look later today.


> That said, I don't think it would be worth subclassing fs_inst.
>
> My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
> into a series of MOVs with force_writemask_all enabled in all cases,
> simply rely on the optimizer to eliminate those where possible.  Then
> get rid of the metadata and effective_width guessing.  Instead agree on
> immediates and uniforms being exec_size-wide by convention
> (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
> then prevent constant propagation from propagating an immediate load
> into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
> program, and maybe just run constant propagation after lowering so we
> can be sure those cases are cleaned up properly where register coalesce
> isn't enough.
>

First off, simply setting force_writemask_all isn't an option.  Consider
the following scenario:

a = foo;
if (bar) {
   b = 7;
} else {
   use(a);
}
use(b);

If "use(a)" is the last use of the variable a, then the live ranges of a
and b don't actually over-lap and we can assign a and b to the same
register.  However, if force_writemask_all is set on b, it will destroy the
value in a before its last use.  Right now, we don't actually do this
because our liveness analysis pass flattens everything so it will think
that b and a over-lap even though they don't.  However, if we ever decide
to make the liveness information more accurate, having a pile of
force_writemask_all instructions will be a major problem.  So, while it is
*technically* safe for now, it's a really bad idea in the long-term.

Regarding the other suggestion of just requiring width == exec_size for
immediates and uniforms, that seems pretty reasonable to me.  I'd like to
know what it will do to optimizations, but it seems ok initially.  I'm
still a bigger fan of just subclassing and stashing everything we need to
know up-front.  If we do it right, the only things that will actually need
to know about the subclass are the function for creating a LOAD_PAYLOAD and
lower_load_payloads.

--Jason


>
> [1]
> http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
> [2]
> http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
> [3]
> http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
> [4]
> http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html
>
>
> > --Jason
> >
> > On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand 
> > wrote:
> >
> >>
> >>
> >> On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez  >
> >> wrote:
> >>
> >>> Jason Ekstrand  writes:
> >>>
> >>> > On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez <
> >>> curroje...@riseup.net>
> >>> > wrote:
> >>> >
> >>> >> Jason Ekstrand  writes:
> >>> >>
> >>> >> > On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez <
> >>> curroje...@riseup.net>
> >>> >> > wrote:
> >>> >> >
> >>> >> >> Hey Matt,
> >>> >> >>
> >>> >> >> Matt Turner  writes:
> >>> >> >>
> >>> >> >> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <
> >>> >> curroje...@riseup.net>
> >>> >> >> wrote:
> >>> >> >> >> MRFs cannot be read from anyway so they cannot possibly be a
> >>> valid
> >>> >> >> >> source of LOAD_PAYLOAD.
> >>> >> >> >> ---
> >>> >> >> >
> >>> >> >> > The function only seems to test inst->dst.file == MRF. I don't
> >>> see any
> >>> >> >> > code for handling MRF sources. What am I missing?
> >>> >> >>
> >>> >> >> That test is for "handling" MRF sources -- More precisely, it's
> >>> >> >> c

Re: [Mesa-dev] [PATCH] nir: Copy-propagate vecN operations that are actually moves

2015-02-20 Thread Jason Ekstrand
On Thu, Feb 19, 2015 at 11:03 PM, Matt Turner  wrote:

> On Thu, Feb 19, 2015 at 11:01 PM, Connor Abbott 
> wrote:
> > I agree with Ken that the regressions are small enough, and it seems
> > they're mostly stuff we can prevent by being smarter when doing the
> > sel peephole, so it seems like the cleanup that will probably help
> > other passes is worth it.
>
> So, usually we do that as a preparatory patch. Why aren't we doing that
> here?
>

Do what in a preparatory patch?  Fix up the sel peephole to be able to
handle "if (foo) bar = baz;"?  Sure, I can put that patch together.


> NIR instruction counts is not the metric we care about.
>

No, but cleaning things up means that we can do other optimizations
better.  Also, in each of those cases, the non-ssa NIR code was better it
was just less cleanable by the backend.  We need to work on that, but I
don't think it's an indicator of a problem.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Copy-propagate vecN operations that are actually moves

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 9:23 AM, Jason Ekstrand  wrote:
>
>
> On Thu, Feb 19, 2015 at 11:03 PM, Matt Turner  wrote:
>>
>> On Thu, Feb 19, 2015 at 11:01 PM, Connor Abbott 
>> wrote:
>> > I agree with Ken that the regressions are small enough, and it seems
>> > they're mostly stuff we can prevent by being smarter when doing the
>> > sel peephole, so it seems like the cleanup that will probably help
>> > other passes is worth it.
>>
>> So, usually we do that as a preparatory patch. Why aren't we doing that
>> here?
>
>
> Do what in a preparatory patch?  Fix up the sel peephole to be able to
> handle "if (foo) bar = baz;"?  Sure, I can put that patch together.

I thought that this would get turned into a conditional select as a
side-effect of going out of SSA anyway (assuming baz was
unconditionally set before the if)? Are these shaders not setting baz
unconditionally?

>>
>> NIR instruction counts is not the metric we care about.
>
>
> No, but cleaning things up means that we can do other optimizations better.
> Also, in each of those cases, the non-ssa NIR code was better it was just
> less cleanable by the backend.  We need to work on that, but I don't think
> it's an indicator of a problem.
> --Jason

Okay, that seems fine. I wouldn't bother updating the SEL peephole. It
sounds like a number of the regressions we have are related to it not
handling something so we might need to do a larger evaluation.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 23/23] main: Cosmetic changes to GetBufferSubData.

2015-02-20 Thread Laura Ekstrand
Ian has asked that this not be squashed.  In fact, they were the same
commit last time around on the ML.

On Fri, Feb 20, 2015 at 12:29 AM, Martin Peres  wrote:

> Please squash this commit with the one introducing GetBufferSubData.
>
> It should be pretty easy to do with git rebase -i :)
>
>
> On 12/02/2015 04:06, Laura Ekstrand wrote:
>
>> ---
>>   src/mesa/main/bufferobj.c | 2 +-
>>   src/mesa/main/bufferobj.h | 4 ++--
>>   2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
>> index 0272704..38d8b5a 100644
>> --- a/src/mesa/main/bufferobj.c
>> +++ b/src/mesa/main/bufferobj.c
>> @@ -1668,7 +1668,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptr
>> offset,
>>  }
>>ASSERT(ctx->Driver.GetBufferSubData);
>> -   ctx->Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
>> +   ctx->Driver.GetBufferSubData(ctx, offset, size, data, bufObj);
>>   }
>> void GLAPIENTRY
>> diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
>> index feeaa8b..b5d73ae 100644
>> --- a/src/mesa/main/bufferobj.h
>> +++ b/src/mesa/main/bufferobj.h
>> @@ -230,8 +230,8 @@ _mesa_NamedBufferSubData(GLuint buffer, GLintptr
>> offset,
>>GLsizeiptr size, const GLvoid *data);
>> void GLAPIENTRY
>> -_mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
>> -   GLsizeiptrARB size, void * data);
>> +_mesa_GetBufferSubData(GLenum target, GLintptr offset,
>> +   GLsizeiptr size, GLvoid *data);
>> void GLAPIENTRY
>>   _mesa_GetNamedBufferSubData(GLuint buffer, GLintptr offset,
>>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Copy-propagate vecN operations that are actually moves

2015-02-20 Thread Jason Ekstrand
On Feb 20, 2015 9:27 AM, "Matt Turner"  wrote:
>
> On Fri, Feb 20, 2015 at 9:23 AM, Jason Ekstrand 
wrote:
> >
> >
> > On Thu, Feb 19, 2015 at 11:03 PM, Matt Turner 
wrote:
> >>
> >> On Thu, Feb 19, 2015 at 11:01 PM, Connor Abbott 
> >> wrote:
> >> > I agree with Ken that the regressions are small enough, and it seems
> >> > they're mostly stuff we can prevent by being smarter when doing the
> >> > sel peephole, so it seems like the cleanup that will probably help
> >> > other passes is worth it.
> >>
> >> So, usually we do that as a preparatory patch. Why aren't we doing that
> >> here?
> >
> >
> > Do what in a preparatory patch?  Fix up the sel peephole to be able to
> > handle "if (foo) bar = baz;"?  Sure, I can put that patch together.
>
> I thought that this would get turned into a conditional select as a
> side-effect of going out of SSA anyway (assuming baz was
> unconditionally set before the if)? Are these shaders not setting baz
> unconditionally?

The NIR peephole only triggers if the only instructions in the if or else
are moves.  In this particular case, it was a uniform load.  Before the
change, it was lowered to moves in both sides of the if and, since it ended
up being a push constant, got peepholed.  After the change, it was lowered
to moves on only one side of the if (which is theoretically better) and the
select didn't trigger.

> >>
> >> NIR instruction counts is not the metric we care about.
> >
> >
> > No, but cleaning things up means that we can do other optimizations
better.
> > Also, in each of those cases, the non-ssa NIR code was better it was
just
> > less cleanable by the backend.  We need to work on that, but I don't
think
> > it's an indicator of a problem.
> > --Jason
>
> Okay, that seems fine. I wouldn't bother updating the SEL peephole. It
> sounds like a number of the regressions we have are related to it not
> handling something so we might need to do a larger evaluation.

Yeah, it just needs to handle one-sided it's better.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/23] main: Refactor ClearBuffer[Sub]Data.

2015-02-20 Thread Laura Ekstrand
This naming convention tries to match the corresponding DD table entry.

There's a thread discussing the naming convention for external software
fallback functions.  Feel free to add your input to the discussion there :)

On Fri, Feb 20, 2015 at 6:18 AM, Martin Peres  wrote:

> On 12/02/2015 04:05, Laura Ekstrand wrote:
>
>> v2: review by Jason Ekstrand
>> - Split refactor of clear buffer sub data from addition of DSA entry
>>   points.
>> ---
>>   src/mesa/main/bufferobj.c| 125
>> ---
>>   src/mesa/main/bufferobj.h|  19 ++--
>>   src/mesa/state_tracker/st_cb_bufferobjects.c |   4 +-
>>   3 files changed, 69 insertions(+), 79 deletions(-)
>>
>> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
>> index 7225b64..b8fa917 100644
>> --- a/src/mesa/main/bufferobj.c
>> +++ b/src/mesa/main/bufferobj.c
>> @@ -660,11 +660,11 @@ _mesa_buffer_get_subdata( struct gl_context *ctx,
>> GLintptrARB offset,
>>* dd_function_table::ClearBufferSubData.
>>*/
>>   void
>> -_mesa_buffer_clear_subdata(struct gl_context *ctx,
>> -   GLintptr offset, GLsizeiptr size,
>> -   const GLvoid *clearValue,
>> -   GLsizeiptr clearValueSize,
>> -   struct gl_buffer_object *bufObj)
>> +_mesa_ClearBufferSubData_sw(struct gl_context *ctx,
>>
>
> Interesting choice of naming the function as a mix of camel case and
> underscores.
>
> Since it is an internal function, shouldn't it only have underscores?
>
> Other than that:
>
> Reviewed-by: Martin Peres 
>
>  +GLintptr offset, GLsizeiptr size,
>> +const GLvoid *clearValue,
>> +GLsizeiptr clearValueSize,
>> +struct gl_buffer_object *bufObj)
>>   {
>>  GLsizeiptr i;
>>  GLubyte *dest;
>> @@ -1113,7 +1113,7 @@ _mesa_init_buffer_object_functions(struct
>> dd_function_table *driver)
>>  driver->UnmapBuffer = _mesa_buffer_unmap;
>>/* GL_ARB_clear_buffer_object */
>> -   driver->ClearBufferSubData = _mesa_buffer_clear_subdata;
>> +   driver->ClearBufferSubData = _mesa_ClearBufferSubData_sw;
>>/* GL_ARB_map_buffer_range */
>>  driver->MapBufferRange = _mesa_buffer_map_range;
>> @@ -1671,57 +1671,77 @@ _mesa_GetBufferSubData(GLenum target, GLintptr
>> offset,
>>  ctx->Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
>>   }
>>   -
>> -void GLAPIENTRY
>> -_mesa_ClearBufferData(GLenum target, GLenum internalformat, GLenum
>> format,
>> -  GLenum type, const GLvoid* data)
>> +/**
>> + * \param subdata   true if caller is *SubData, false if *Data
>> + */
>> +void
>> +_mesa_clear_buffer_sub_data(struct gl_context *ctx,
>> +struct gl_buffer_object *bufObj,
>> +GLenum internalformat,
>> +GLintptr offset, GLsizeiptr size,
>> +GLenum format, GLenum type,
>> +const GLvoid *data,
>> +const char *func, bool subdata)
>>   {
>> -   GET_CURRENT_CONTEXT(ctx);
>> -   struct gl_buffer_object* bufObj;
>>  mesa_format mesaFormat;
>>  GLubyte clearValue[MAX_PIXEL_BYTES];
>>  GLsizeiptr clearValueSize;
>>   -   bufObj = get_buffer(ctx, "glClearBufferData", target,
>> GL_INVALID_VALUE);
>> -   if (!bufObj) {
>> -  return;
>> -   }
>> -
>> -   if (_mesa_check_disallowed_mapping(bufObj)) {
>> -  _mesa_error(ctx, GL_INVALID_OPERATION,
>> -  "glClearBufferData(buffer currently mapped)");
>> +   /* This checks for disallowed mappings. */
>> +   if (!buffer_object_subdata_range_good(ctx, bufObj, offset, size,
>> + subdata, func)) {
>> return;
>>  }
>>mesaFormat = validate_clear_buffer_format(ctx, internalformat,
>> - format, type,
>> - "glClearBufferData");
>> + format, type, func);
>> +
>>  if (mesaFormat == MESA_FORMAT_NONE) {
>> return;
>>  }
>>clearValueSize = _mesa_get_format_bytes(mesaFormat);
>> -   if (bufObj->Size % clearValueSize != 0) {
>> +   if (offset % clearValueSize != 0 || size % clearValueSize != 0) {
>> _mesa_error(ctx, GL_INVALID_VALUE,
>> -  "glClearBufferData(size is not a multiple of "
>> -  "internalformat size)");
>> +  "%s(offset or size is not a multiple of "
>> +  "internalformat size)", func);
>> return;
>>  }
>>if (data == NULL) {
>> /* clear to zeros, per the spec */
>> -  ctx->Driver.ClearBufferSubData(ctx, 0, bufObj->Size,
>> - NULL, clearValueSize, bufObj);
>> +  if (size > 0) {
>> +

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand  writes:

> On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez 
> wrote:
>
>> Jason Ekstrand  writes:
>>
>> > I'm still a little pensive.  But
>> >
>> > Reviewed-by: Jason Ekstrand 
>> >
>> Thanks.
>>
>> > Now for a little aside.  I have come to the conclusion that I made a
>> grave
>> > mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
>> > just subclassed fs_inst for load_payload.  The problem is that we need to
>> > snag a bunch of information for the sources when we create the
>> > load_payload.  In particular, we need to know the width of the source so
>> > that we know how much space it consumes in the payload and we need to
>> know
>> > the information required to properly re-create the mov such as
>> > force_sechalf and force_writemask_all.  Really, in order to do things
>> > properly, we need to gather this information *before* we do any
>> > optimizations.  The nasty pile of code that you're editing together with
>> > the "effective_width" parameter is a lame attempt to capture/reconstruct
>> > this information.  Really, we should just subclass, capture the
>> information
>> > up-front, and do it properly.
>> >
>> Yes, absolutely, this lowering pass is a real mess.  There are four more
>> unreviewed patches in the mailing list fixing bugs of the metadata
>> guessing of lower_load_payload() [1][2][3][4], you may want to take a
>> look at them.  There are more bugs I'm aware of but it didn't seem
>> practical to fix them.
>>
>
> Yeah, Matt pointed me at those.  I'll give them a look later today.
>
>
>> That said, I don't think it would be worth subclassing fs_inst.
>>
>> My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
>> into a series of MOVs with force_writemask_all enabled in all cases,
>> simply rely on the optimizer to eliminate those where possible.  Then
>> get rid of the metadata and effective_width guessing.  Instead agree on
>> immediates and uniforms being exec_size-wide by convention
>> (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
>> then prevent constant propagation from propagating an immediate load
>> into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
>> program, and maybe just run constant propagation after lowering so we
>> can be sure those cases are cleaned up properly where register coalesce
>> isn't enough.
>>
>
> First off, simply setting force_writemask_all isn't an option.  Consider
> the following scenario:
>
> a = foo;
> if (bar) {
>b = 7;
> } else {
>use(a);
> }
> use(b);
>
> If "use(a)" is the last use of the variable a, then the live ranges of a
> and b don't actually over-lap and we can assign a and b to the same
> register.  However, if force_writemask_all is set on b, it will destroy the
> value in a before its last use.  Right now, we don't actually do this
> because our liveness analysis pass flattens everything so it will think
> that b and a over-lap even though they don't.  However, if we ever decide
> to make the liveness information more accurate, having a pile of
> force_writemask_all instructions will be a major problem.  So, while it is
> *technically* safe for now, it's a really bad idea in the long-term.
>
The thing is we *will* have to deal with that scenario.  Building a
message payload inherently involves crossing channel boundaries (because
of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
say it's a bug in the liveness analysis pass if it wouldn't consider the
liveness interval of a and b to overlap in your example if the
assignment of b had force_writemask_all set.

A reasonable compromise might be to leave it up to the caller whether to
set the force_writemask_all and force_sechalf flags or not.  I.e. just
copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
instructions.  That would still allow reducing the liveness intervals in
cases where we know that the payload respects channel boundaries.

Tracking the flag information per-register in cases where the same
payload has well- and ill-behaved values seems rather premature and
rather useless to me because the optimizer is likely to be able to get
rid of redundant copies with force_writemask_all -- register coalesce
is doing this already AFAIK, maybe by accident.

> Regarding the other suggestion of just requiring width == exec_size for
> immediates and uniforms, that seems pretty reasonable to me.  I'd like to
> know what it will do to optimizations, but it seems ok initially.  I'm
> still a bigger fan of just subclassing and stashing everything we need to
> know up-front.  If we do it right, the only things that will actually need
> to know about the subclass are the function for creating a LOAD_PAYLOAD and
> lower_load_payloads.
>
> --Jason
>
>
>>
>> [1]
>> http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
>> [2]
>> http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
>> [3]
>> http://lists.freedesktop.or

Re: [Mesa-dev] [PATCH] svga: add missing padding to SVGA3dSize

2015-02-20 Thread Sinclair Yeh
On Fri, Feb 20, 2015 at 09:22:20AM +, Van Der Wath, DanielX J wrote:
> From: Daniel van der Wath 
> 
> The kernel side equivalent of struct SVGA3dSize (struct drm_vmw_size) has an
> extra padding word that SVGA3dSize lacks. This was causing data to be written
> past the end of "size" in vmw_drm_surface_from_handle(), corrupting other data

The drm_vmw_* types are used to exchange data with the VMW DRM so none of the
DRM IOCTL functions, e.g. drmCommandWriteRead(), should be using the SVGA3d*
types.

In vmw_drm_surface_from_handle(), "size" is of type "struct drm_vmw_size",
and is being used here: rep->size_addr = (unsigned long)&size, to call
drmCommandWriteRead().  So there shouldn't be a user/kernel mismatch here.

At which point during this function do you see handle being over written?
Also, which version of MESA are you seeing this on?

I'll see if I can see this on my end.

> and in this case leading to Weston being unable to render anything on screen.
> ---
>  src/gallium/drivers/svga/include/svga3d_types.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/gallium/drivers/svga/include/svga3d_types.h 
> b/src/gallium/drivers/svga/include/svga3d_types.h
> index fc4a6b9..3ce6814 100644
> --- a/src/gallium/drivers/svga/include/svga3d_types.h
> +++ b/src/gallium/drivers/svga/include/svga3d_types.h
> @@ -1280,6 +1280,7 @@ struct {
> uint32   width;
> uint32   height;
> uint32   depth;
> +   uint32   pad64;
>  }
>  #include "vmware_pack_end.h"
>  SVGA3dSize;
> -- 
> 1.7.11.7
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/skl: Use 1 register for uniform pull constant payload

2015-02-20 Thread Ben Widawsky
On Fri, Feb 20, 2015 at 03:34:21AM -0800, Kenneth Graunke wrote:
> On Thursday, February 19, 2015 10:48:08 PM Ben Widawsky wrote:
> > When under dispatch_width=16 the previous code would allocate 2 registers 
> > for
> > the payload when only one is needed. This manifested itself through bugs on 
> > SKL
> > which needs to mess with this instruction.
> > 
> > Ken though this might impact shader-db, but apparently it doesn't
> > 
> > Cc: Kenneth Graunke 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
> > Signed-off-by: Ben Widawsky 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index c46e1d7..24125cc 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -3062,7 +3062,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
> >   assert(const_offset_reg.file == IMM &&
> >  const_offset_reg.type == BRW_REGISTER_TYPE_UD);
> >   const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
> > - fs_reg payload = vgrf(glsl_type::uint_type);
> > + fs_reg payload = fs_reg(GRF, alloc.allocate(1));
> >  
> >   /* We have to use a message header on Skylake to get SIMD4x2 mode.
> >* Reserve space for the register.
> > 
> 
> Reviewed-by: Kenneth Graunke 

Given that this seems to be required to get SKL stable, do I want to Cc 1.5 too?

-- 
Ben Widawsky, Intel Open Source Technology Center
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 04/16] main: Added entry point for glCreateTransformFeedbacks

2015-02-20 Thread Laura Ekstrand
Looks good to me

Reviewed-by: Laura Ekstrand 

On Mon, Feb 16, 2015 at 6:13 AM, Martin Peres 
wrote:

> v2: Review from Laura Ekstrand
> - generate the name of the gl method once
> - shorten some lines to stay in the 78 chars limit
>
> v3: Review from Fredrik Höglund 
> - rename gl_mthd_name to func
> - set EverBound in _mesa_create_transform_feedbacks in the dsa case
>
> v4:
> - rename _mesa_create_transform_feedbacks to create_transform_feedbacks and
>   make it static
>
> Signed-off-by: Martin Peres 
> ---
>  src/mapi/glapi/gen/ARB_direct_state_access.xml |  7 +++
>  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
>  src/mesa/main/transformfeedback.c  | 67
> --
>  src/mesa/main/transformfeedback.h  |  3 ++
>  4 files changed, 63 insertions(+), 15 deletions(-)
>
> diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
> b/src/mapi/glapi/gen/ARB_direct_state_access.xml
> index 2fe1638..15b00c2 100644
> --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
> +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
> @@ -7,6 +7,13 @@
> 
> 
>
> +   
> +
> +  
> +  
> +  
> +   
> +
> 
>
> 
> diff --git a/src/mesa/main/tests/dispatch_sanity.cpp
> b/src/mesa/main/tests/dispatch_sanity.cpp
> index 1f1a3a8..bf320bf 100644
> --- a/src/mesa/main/tests/dispatch_sanity.cpp
> +++ b/src/mesa/main/tests/dispatch_sanity.cpp
> @@ -955,6 +955,7 @@ const struct function gl_core_functions_possible[] = {
> { "glClipControl", 45, -1 },
>
> /* GL_ARB_direct_state_access */
> +   { "glCreateTransformFeedbacks", 45, -1 },
> { "glCreateTextures", 45, -1 },
> { "glTextureStorage1D", 45, -1 },
> { "glTextureStorage2D", 45, -1 },
> diff --git a/src/mesa/main/transformfeedback.c
> b/src/mesa/main/transformfeedback.c
> index a737463..10bb6a1 100644
> --- a/src/mesa/main/transformfeedback.c
> +++ b/src/mesa/main/transformfeedback.c
> @@ -825,25 +825,24 @@ _mesa_lookup_transform_feedback_object(struct
> gl_context *ctx, GLuint name)
>   _mesa_HashLookup(ctx->TransformFeedback.Objects, name);
>  }
>
> -
> -/**
> - * Create new transform feedback objects.   Transform feedback objects
> - * encapsulate the state related to transform feedback to allow quickly
> - * switching state (and drawing the results, below).
> - * Part of GL_ARB_transform_feedback2.
> - */
> -void GLAPIENTRY
> -_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
> +static void
> +create_transform_feedbacks(struct gl_context *ctx, GLsizei n, GLuint *ids,
> +   bool dsa)
>  {
> GLuint first;
> -   GET_CURRENT_CONTEXT(ctx);
> +   const char* func;
> +
> +   if (dsa)
> +  func = "glCreateTransformFeedbacks";
> +   else
> +  func = "glGenTransformFeedbacks";
>
> if (n < 0) {
> -  _mesa_error(ctx, GL_INVALID_VALUE, "glGenTransformFeedbacks(n <
> 0)");
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s(n < 0)", func);
>return;
> }
>
> -   if (!names)
> +   if (!ids)
>return;
>
> /* we don't need contiguous IDs, but this might be faster */
> @@ -854,18 +853,56 @@ _mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
>   struct gl_transform_feedback_object *obj
>  = ctx->Driver.NewTransformFeedback(ctx, first + i);
>   if (!obj) {
> -_mesa_error(ctx, GL_OUT_OF_MEMORY, "glGenTransformFeedbacks");
> +_mesa_error(ctx, GL_OUT_OF_MEMORY, "%s", func);
>  return;
>   }
> - names[i] = first + i;
> + ids[i] = first + i;
>   _mesa_HashInsert(ctx->TransformFeedback.Objects, first + i, obj);
> + if (dsa) {
> +/* this is normally done at bind time in the non-dsa case */
> +obj->EverBound = GL_TRUE;
> + }
>}
> }
> else {
> -  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glGenTransformFeedbacks");
> +  _mesa_error(ctx, GL_OUT_OF_MEMORY, "%s", func);
> }
>  }
>
> +/**
> + * Create new transform feedback objects.   Transform feedback objects
> + * encapsulate the state related to transform feedback to allow quickly
> + * switching state (and drawing the results, below).
> + * Part of GL_ARB_transform_feedback2.
> + */
> +void GLAPIENTRY
> +_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +
> +   /* GenTransformFeedbacks should just reserve the object names that a
> +* subsequent call to BindTransformFeedback should actively create. For
> +* the sake of simplicity, we reserve the names and create the objects
> +* straight away.
> +*/
> +
> +   create_transform_feedbacks(ctx, n, names, false);
> +}
> +
> +/**
> + * Create new transform feedback objects.   Transform feedback objects
> + * encapsulate the state related to transform feedback to allow quickly
> + * switching state (and drawing the results, below).
> + * Part of GL_ARB_direct_state_access.
> + */
> +void GLAPIENTRY
> +_mesa_CreateTransformFeedbacks(GLs

Re: [Mesa-dev] [PATCH 03/16] main: fix the validation of the number of samples

2015-02-20 Thread Laura Ekstrand
Please provide a page number and a section title in your spec comment.

Thanks.

On Mon, Feb 16, 2015 at 6:13 AM, Martin Peres 
wrote:

> Maybe this should be the job of the dispatch layer.
>
> Signed-off-by: Martin Peres 
> ---
>  src/mesa/main/multisample.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
> index 1f3fa0c..a0a659b 100644
> --- a/src/mesa/main/multisample.c
> +++ b/src/mesa/main/multisample.c
> @@ -150,6 +150,15 @@ GLenum
>  _mesa_check_sample_count(struct gl_context *ctx, GLenum target,
>   GLenum internalFormat, GLsizei samples)
>  {
> +   /* From the OpenGL core 3.0 spec, section 2.5:
> +*
> +* "If a negative number is provided where an argument of type sizei or
> +* sizeiptr is specified, the error INVALID VALUE is generated."
> +*/
> +   if (samples < 0) {
> +  return GL_INVALID_VALUE;
> +   }
> +
> /* If ARB_internalformat_query is supported, then treat its highest
>  * returned sample count as the absolute maximum for this format; it is
>  * allowed to exceed MAX_SAMPLES.
> --
> 2.3.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] glsl: add double support for packing varyings

2015-02-20 Thread Ilia Mirkin
Doubles are always packed, but a single double will never cross a slot
boundary -- single slots can still be wasted in some situations.

Signed-off-by: Ilia Mirkin 
---

Packing *all* doubles is a bit of a hack... we shouldn't need to do it
for dvec2/dvec4. And yet, here we are. This means that setting
explicit locations on doubles is also unlikely to go well. But this
should be enough to get it mostly going.

v1 -> v2:
 - split out variables into a separate list so that it can work for GS

 src/glsl/lower_packed_varyings.cpp | 117 -
 1 file changed, 90 insertions(+), 27 deletions(-)

diff --git a/src/glsl/lower_packed_varyings.cpp 
b/src/glsl/lower_packed_varyings.cpp
index 5e844c7..2c9a1c4 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -146,7 +146,11 @@
 
 #include "glsl_symbol_table.h"
 #include "ir.h"
+#include "ir_builder.h"
 #include "ir_optimization.h"
+#include "program/prog_instruction.h"
+
+using namespace ir_builder;
 
 namespace {
 
@@ -163,13 +167,14 @@ public:
lower_packed_varyings_visitor(void *mem_ctx, unsigned locations_used,
  ir_variable_mode mode,
  unsigned gs_input_vertices,
- exec_list *out_instructions);
+ exec_list *out_instructions,
+ exec_list *out_variables);
 
void run(exec_list *instructions);
 
 private:
-   ir_assignment *bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
-   ir_assignment *bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
unsigned lower_rvalue(ir_rvalue *rvalue, unsigned fine_location,
  ir_variable *unpacked_var, const char *name,
  bool gs_input_toplevel, unsigned vertex_index);
@@ -221,13 +226,19 @@ private:
 * appropriate place in the shader once the visitor has finished running.
 */
exec_list *out_instructions;
+
+   /**
+* Exec list into which the visitor should insert any new variables.
+*/
+   exec_list *out_variables;
 };
 
 } /* anonymous namespace */
 
 lower_packed_varyings_visitor::lower_packed_varyings_visitor(
   void *mem_ctx, unsigned locations_used, ir_variable_mode mode,
-  unsigned gs_input_vertices, exec_list *out_instructions)
+  unsigned gs_input_vertices, exec_list *out_instructions,
+  exec_list *out_variables)
: mem_ctx(mem_ctx),
  locations_used(locations_used),
  packed_varyings((ir_variable **)
@@ -235,7 +246,8 @@ 
lower_packed_varyings_visitor::lower_packed_varyings_visitor(
 locations_used)),
  mode(mode),
  gs_input_vertices(gs_input_vertices),
- out_instructions(out_instructions)
+ out_instructions(out_instructions),
+ out_variables(out_variables)
 {
 }
 
@@ -274,6 +286,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
}
 }
 
+#define SWIZZLE_ZWZW MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, SWIZZLE_Z, SWIZZLE_W)
 
 /**
  * Make an ir_assignment from \c rhs to \c lhs, performing appropriate
@@ -281,7 +294,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
  *
  * This function is called when packing varyings.
  */
-ir_assignment *
+void
 lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
ir_rvalue *rhs)
 {
@@ -300,12 +313,28 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  rhs = new(this->mem_ctx)
 ir_expression(ir_unop_bitcast_f2i, lhs->type, rhs);
  break;
+  case GLSL_TYPE_DOUBLE:
+ assert(rhs->type->vector_elements <= 2);
+ if (rhs->type->vector_elements == 2) {
+ir_variable *t = new(mem_ctx) ir_variable(lhs->type, "pack", 
ir_var_temporary);
+
+assert(lhs->type->vector_elements == 4);
+this->out_variables->push_tail(t);
+this->out_instructions->push_tail(
+  assign(t, u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_x(rhs->clone(mem_ctx, NULL, 0x3));
+this->out_instructions->push_tail(
+  assign(t,  u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_y(rhs))), 0xc));
+rhs = deref(t).val;
+ } else {
+rhs = u2i(expr(ir_unop_unpack_double_2x32, rhs));
+ }
+ break;
   default:
  assert(!"Unexpected type conversion while lowering varyings");
  break;
   }
}
-   return new(this->mem_ctx) ir_assignment(lhs, rhs);
+   this->out_instructions->push_tail(new (this->mem_ctx) ir_assignment(lhs, 
rhs));
 }
 
 
@@ -315,7 +344,7 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  *
  * This function is called when unpacking varyings.
  */
-ir_assignment *
+void
 

Re: [Mesa-dev] [PATCH] i965/skl: Use 1 register for uniform pull constant payload

2015-02-20 Thread Kenneth Graunke
On Friday, February 20, 2015 10:33:18 AM Ben Widawsky wrote:
> On Fri, Feb 20, 2015 at 03:34:21AM -0800, Kenneth Graunke wrote:
> > On Thursday, February 19, 2015 10:48:08 PM Ben Widawsky wrote:
> > > When under dispatch_width=16 the previous code would allocate 2 registers 
> > > for
> > > the payload when only one is needed. This manifested itself through bugs 
> > > on SKL
> > > which needs to mess with this instruction.
> > > 
> > > Ken though this might impact shader-db, but apparently it doesn't
> > > 
> > > Cc: Kenneth Graunke 
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
> > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
> > > Signed-off-by: Ben Widawsky 
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> > > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > index c46e1d7..24125cc 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > > @@ -3062,7 +3062,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
> > >   assert(const_offset_reg.file == IMM &&
> > >  const_offset_reg.type == BRW_REGISTER_TYPE_UD);
> > >   const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
> > > - fs_reg payload = vgrf(glsl_type::uint_type);
> > > + fs_reg payload = fs_reg(GRF, alloc.allocate(1));
> > >  
> > >   /* We have to use a message header on Skylake to get SIMD4x2 
> > > mode.
> > >* Reserve space for the register.
> > > 
> > 
> > Reviewed-by: Kenneth Graunke 
> 
> Given that this seems to be required to get SKL stable, do I want to Cc 1.5 
> too?

That seems reasonable to me.  It likely won't apply (or build), since I
don't think 10.5 has alloc.allocate().  To be nice to Emil, I'm guessing
the right thing to do is probably to cherry-pick it to 10.5, change
alloc.allocate(1) to virtual_grf_alloc(1), and send that patch to
mesa-sta...@lists.freedesktop.org marking it as a backport.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/16] main: fix the validation of the number of samples

2015-02-20 Thread Ilia Mirkin
On Mon, Feb 16, 2015 at 9:13 AM, Martin Peres
 wrote:
> Maybe this should be the job of the dispatch layer.
>
> Signed-off-by: Martin Peres 
> ---
>  src/mesa/main/multisample.c | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
> index 1f3fa0c..a0a659b 100644
> --- a/src/mesa/main/multisample.c
> +++ b/src/mesa/main/multisample.c
> @@ -150,6 +150,15 @@ GLenum
>  _mesa_check_sample_count(struct gl_context *ctx, GLenum target,
>   GLenum internalFormat, GLsizei samples)
>  {
> +   /* From the OpenGL core 3.0 spec, section 2.5:

I think this is a bit of a contradiction... there is no OpenGL 3.0
core spec. Core didn't become a thing until 3.1 afaik.

> +*
> +* "If a negative number is provided where an argument of type sizei or
> +* sizeiptr is specified, the error INVALID VALUE is generated."
> +*/
> +   if (samples < 0) {
> +  return GL_INVALID_VALUE;
> +   }
> +
> /* If ARB_internalformat_query is supported, then treat its highest
>  * returned sample count as the absolute maximum for this format; it is
>  * allowed to exceed MAX_SAMPLES.
> --
> 2.3.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez 
> > wrote:
> >
> >> Jason Ekstrand  writes:
> >>
> >> > I'm still a little pensive.  But
> >> >
> >> > Reviewed-by: Jason Ekstrand 
> >> >
> >> Thanks.
> >>
> >> > Now for a little aside.  I have come to the conclusion that I made a
> >> grave
> >> > mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
> have
> >> > just subclassed fs_inst for load_payload.  The problem is that we
> need to
> >> > snag a bunch of information for the sources when we create the
> >> > load_payload.  In particular, we need to know the width of the source
> so
> >> > that we know how much space it consumes in the payload and we need to
> >> know
> >> > the information required to properly re-create the mov such as
> >> > force_sechalf and force_writemask_all.  Really, in order to do things
> >> > properly, we need to gather this information *before* we do any
> >> > optimizations.  The nasty pile of code that you're editing together
> with
> >> > the "effective_width" parameter is a lame attempt to
> capture/reconstruct
> >> > this information.  Really, we should just subclass, capture the
> >> information
> >> > up-front, and do it properly.
> >> >
> >> Yes, absolutely, this lowering pass is a real mess.  There are four more
> >> unreviewed patches in the mailing list fixing bugs of the metadata
> >> guessing of lower_load_payload() [1][2][3][4], you may want to take a
> >> look at them.  There are more bugs I'm aware of but it didn't seem
> >> practical to fix them.
> >>
> >
> > Yeah, Matt pointed me at those.  I'll give them a look later today.
> >
> >
> >> That said, I don't think it would be worth subclassing fs_inst.
> >>
> >> My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
> >> into a series of MOVs with force_writemask_all enabled in all cases,
> >> simply rely on the optimizer to eliminate those where possible.  Then
> >> get rid of the metadata and effective_width guessing.  Instead agree on
> >> immediates and uniforms being exec_size-wide by convention
> >> (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
> >> then prevent constant propagation from propagating an immediate load
> >> into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
> >> program, and maybe just run constant propagation after lowering so we
> >> can be sure those cases are cleaned up properly where register coalesce
> >> isn't enough.
> >>
> >
> > First off, simply setting force_writemask_all isn't an option.  Consider
> > the following scenario:
> >
> > a = foo;
> > if (bar) {
> >b = 7;
> > } else {
> >use(a);
> > }
> > use(b);
> >
> > If "use(a)" is the last use of the variable a, then the live ranges of a
> > and b don't actually over-lap and we can assign a and b to the same
> > register.  However, if force_writemask_all is set on b, it will destroy
> the
> > value in a before its last use.  Right now, we don't actually do this
> > because our liveness analysis pass flattens everything so it will think
> > that b and a over-lap even though they don't.  However, if we ever decide
> > to make the liveness information more accurate, having a pile of
> > force_writemask_all instructions will be a major problem.  So, while it
> is
> > *technically* safe for now, it's a really bad idea in the long-term.
> >
> The thing is we *will* have to deal with that scenario.  Building a
> message payload inherently involves crossing channel boundaries (because
> of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
> say it's a bug in the liveness analysis pass if it wouldn't consider the
> liveness interval of a and b to overlap in your example if the
> assignment of b had force_writemask_all set.
>

Yes, I'm aware of that.  However, the part of that register that has to
squash everything is usually only a couple of registers while the entire
payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
squash everything if we can.  All that being said, our liveness/register
allocation can't handle this and making register allocation handle parts of
registers interfering but not other bits is going to be a real pain.  Maybe
not even worth it.  If you'd rather force_writemask_all everything, I'll
sign off on that for now.  I just wanted to point out that it may not be a
good permanent solution.


> A reasonable compromise might be to leave it up to the caller whether to
> set the force_writemask_all and force_sechalf flags or not.  I.e. just
> copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
> instructions.  That would still allow reducing the liveness intervals in
> cases where we know that the payload respects channel boundaries.
>
> Tracking the flag information per-register in cases where the same
> payload has well- and ill-behaved values seems rather premature and
> rather useles

Re: [Mesa-dev] glCompressedTex(Sub)Image(2,3) on mapped PBOs

2015-02-20 Thread Ian Romanick
On 02/20/2015 08:13 AM, Eduardo Lima Mitev wrote:
> Hello,
> 
> While working on the following dEQP failing tests I ran into an issue
> that I think deserves clarification from more experienced people:
> 
> dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target
> dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target
> dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target
> dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target
> 
> All above tests fail because a CompressedTex(Sub)Image(2,3)D operation
> is performed while there is a pixel buffer object bound to
> GL_PIXEL_UNPACK_BUFFER and that pixel buffer object is currently mapped.
> The tests expect this situation to throw an GL_INVALID_OPERATION but no
> error is returned on i965.
> 
> In the spec GLES 3.1 spec, page 57, there is this: (and similar wording
> in GL 4.5 spec, page 76)
> 
> "6.3.2 Effects of Mapping Buffers on Other GL Commands
> 
> "Any GL command which attempts to read from, write to, or change the
> state of a buffer object may generate an INVALID_OPERATION error if all
> or part of the buffer object is mapped. However, only commands which
> explicitly describe this error are required to do so. If an error is not
> generated, using such commands to perform invalid reads, writes, or
> state changes will have undefined results and may result in GL
> interruption or termination."

This language is intentionally loose.  It is often perfectly safe to
have a portion of a buffer mapped while a GL command operates on a
different part of the buffer.  Determining whether a particular
operation is safe can be expensive.  Imagine trying to determine whether
a glDrawElements command will source data from a mapped region of a
vertex buffer object.  Many vendors objected to having to put expensive
(or cheap) checks in performance critical paths.

I don't think an of the TexSubImage commands are performance critical
enough to warrant excluding the test.  I believe some benchmarks use
some of the TexSubImage commands, so it's probably worth measuring them
on, say, Baytrail with and without the checks.  My expectation is that
there's enough CPU wasted in those paths that the checks will be lost in
the noise.  We won't know until we measure.

> There is no further restriction in gl(Compressed)Tex(Sub)Image(2,3)D
> that makes throwing an error mandatory, hence I suspect these tests
> might be wrong.

I'm also inclined to think they are wrong... or at least overly strict.

Do you know of any commands that require this error be generated?

> However, I found that glTex(Sub)Image(2,3)D do emits an
> GL_INVALID_OPERATION error when executed while a mapped PBO is bound.
> But the check is performed in driver code (as opposed to API entry
> points). (Concretely, in i965/intel_tex_image.c::intelTexImage() and
> i965/intel_tex_subimage.c::intelTexSubImage() which both call
> common/meta_tex_subimage.c::_mesa_meta_pbo_TexSubImage()).
> 
> So, the question is:
> 
> Shouldn't these operations on compressed texture images have the same
> restriction as their non-compressed counter-parts?
> 
> Or is there a reason why this restriction is enforced only on
> non-compressed images?
> 
> Also, shouldn't this restriction be moved to common API entry points, or
> are there drivers that do write to PBOs while they are mapped?

Looking at that function, I find it odd that either of the errors it
generates are generated there.  We should either do these checks in a
single, common place, or we should not do them at all.

> cheers,
> Eduardo
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] i965/fs: Less broken handling of force_writemask_all in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Sat, Jan 17, 2015 at 6:04 PM, Francisco Jerez 
wrote:

> It's perfectly fine to read the second half of a register written with
> force_writemask_all from a first half MOV instruction or vice versa, and
> lower_load_payload shouldn't mark the whole MOV as belonging to the second
> half in that case.  Replicate the same metadata to both halves of the
> destination when writemasking is disabled.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 20 +---
>  1 file changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 4a61943..d585a67 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3059,9 +3059,11 @@ fs_visitor::lower_load_payload()
>}
>
>if (inst->dst.file == MRF || inst->dst.file == GRF) {
> - bool force_sechalf = inst->force_sechalf;
> + bool force_sechalf = inst->force_sechalf &&
> +  !inst->force_writemask_all;
>   bool toggle_sechalf = inst->dst.width == 16 &&
> -   type_sz(inst->dst.type) == 4;
> +   type_sz(inst->dst.type) == 4 &&
> +   !inst->force_writemask_all;
>   for (int i = 0; i < inst->regs_written; ++i) {
>  metadata[dst_reg + i].written = true;
>  metadata[dst_reg + i].force_sechalf = force_sechalf;
> @@ -3104,11 +3106,15 @@ fs_visitor::lower_load_payload()
>mov->force_writemask_all =
> metadata[src_reg].force_writemask_all;
>metadata[dst_reg] = metadata[src_reg];
>if (dst.width * type_sz(dst.type) > 32) {
> - assert((!metadata[src_reg].written ||
> - !metadata[src_reg].force_sechalf) &&
> -(!metadata[src_reg + 1].written ||
> - metadata[src_reg + 1].force_sechalf));
> - metadata[dst_reg + 1] = metadata[src_reg + 1];
> + if (metadata[src_reg].force_writemask_all) {
> +metadata[dst_reg + 1] = metadata[src_reg];
> + } else {
> +assert((!metadata[src_reg].written ||
> +!metadata[src_reg].force_sechalf) &&
> +   (!metadata[src_reg + 1].written ||
> +metadata[src_reg + 1].force_sechalf));
> +metadata[dst_reg + 1] = metadata[src_reg + 1];
> + }
>}
> } else {
>metadata[dst_reg].force_writemask_all = false;
> --
> 2.1.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/32] i965/fs: Fix lower_load_payload() not to use an incorrect half for immediates and uniforms.

2015-02-20 Thread Jason Ekstrand
Yeah... More proof that the lower_load_payload code is plenty bogus...

Reviewed-by: Jason Ekstrand 

On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez 
wrote:

> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 8da1f47..e2ebf7e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3116,6 +3116,14 @@ fs_visitor::lower_load_payload()
>  inst->src[i].reg_offset;
>mov->force_sechalf = metadata[src_reg].force_sechalf;
>mov->force_writemask_all =
> metadata[src_reg].force_writemask_all;
> +   } else {
> +  /* We don't have any useful metadata for immediates or
> +   * uniforms.  Assume that any of the channels of the
> +   * destination may be used.
> +   */
> +  assert(inst->src[i].file == IMM ||
> + inst->src[i].file == UNIFORM);
> +  mov->force_writemask_all = true;
> }
>
> if (dst.file == GRF) {
> --
> 2.1.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/32] i965/fs: Fix lower_load_payload() to take into account stride in the metadata guess.

2015-02-20 Thread Jason Ekstrand
Where are you getting a destination stride in a load_payload?  The whole
point of load_payload was to remove partial writes so this seems to go
against everything it's intended for.
--Jason

On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez 
wrote:

> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 163aa41..8da1f47 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3074,7 +3074,7 @@ fs_visitor::lower_load_payload()
>   bool force_sechalf = inst->force_sechalf &&
>!inst->force_writemask_all;
>   bool toggle_sechalf = inst->dst.width == 16 &&
> -   type_sz(inst->dst.type) == 4 &&
> +   type_sz(inst->dst.type) * inst->dst.stride
> == 4 &&
> !inst->force_writemask_all;
>   for (int i = 0; i < inst->regs_written; ++i) {
>  metadata[dst_reg + i].written = true;
> --
> 2.1.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] gallium: add some more double opcodes to avoid unnecessary lowering

2015-02-20 Thread Ilia Mirkin
On Fri, Feb 20, 2015 at 9:37 AM, Roland Scheidegger  wrote:
> Just some minor nits.
>
> Am 20.02.2015 um 00:52 schrieb Ilia Mirkin:
>> Signed-off-by: Ilia Mirkin 
>> ---
>>  src/gallium/auxiliary/tgsi/tgsi_info.c |  5 
>>  src/gallium/docs/source/tgsi.rst   | 39 
>> ++
>>  src/gallium/include/pipe/p_shader_tokens.h |  7 +-
>>  3 files changed, 50 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
>> b/src/gallium/auxiliary/tgsi/tgsi_info.c
>> index d04f9da..4d838fd 100644
>> --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
>> +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
>> @@ -257,6 +257,11 @@ static const struct tgsi_opcode_info 
>> opcode_info[TGSI_OPCODE_LAST] =
>> { 1, 1, 0, 0, 0, 0, COMP, "D2U", TGSI_OPCODE_D2U },
>> { 1, 1, 0, 0, 0, 0, COMP, "U2D", TGSI_OPCODE_U2D },
>> { 1, 1, 0, 0 ,0, 0, COMP, "DRSQ", TGSI_OPCODE_DRSQ },
>> +   { 1, 1, 0, 0, 0, 0, COMP, "DTRUNC", TGSI_OPCODE_DTRUNC },
>> +   { 1, 1, 0, 0, 0, 0, COMP, "DCEIL", TGSI_OPCODE_DCEIL },
>> +   { 1, 1, 0, 0, 0, 0, COMP, "DFLR", TGSI_OPCODE_DFLR },
>> +   { 1, 1, 0, 0, 0, 0, COMP, "DROUND", TGSI_OPCODE_DROUND },
>> +   { 1, 1, 0, 0, 0, 0, COMP, "DSSG", TGSI_OPCODE_DSSG },
>>  };
>>
>>  const struct tgsi_opcode_info *
>> diff --git a/src/gallium/docs/source/tgsi.rst 
>> b/src/gallium/docs/source/tgsi.rst
>> index e20af79..15f1e9f 100644
>> --- a/src/gallium/docs/source/tgsi.rst
>> +++ b/src/gallium/docs/source/tgsi.rst
>> @@ -1861,6 +1861,45 @@ two-component vectors with doubled precision in each 
>> component.
>>
>>dst.zw = src.zw - \lfloor src.zw\rfloor
>>
>> +.. opcode:: DTRUNC - Truncate
>> +
>> +.. math::
>> +
>> +  dst.xy = trunc(src.xy)
>> +
>> +  dst.zw = trunc(src.zw)
>> +
>> +.. opcode:: DCEIL - Ceiling
>> +
>> +.. math::
>> +
>> +  dst.xy = \lceil src.xy\rceil
>> +
>> +  dst.zw = \lceil src.zw\rceil
>> +
>> +.. opcode:: DFLR - Floor
>> +
>> +.. math::
>> +
>> +  dst.xy = \lfloor src.xy\rfloor
>> +
>> +  dst.zw = \lfloor src.zw\rfloor
>> +
>> +.. opcode:: DROUND - Fraction
>> +
>> +.. math::
>> +
>> +  dst.xy = round(src.xy)
>> +
>> +  dst.zw = round(src.zw)
>> +
>> +.. opcode:: DSSG - Set Sign
>> +
>> +.. math::
>> +
>> +  dst.xy = (src.xy > 0) ? 1 : (src.xy < 0) ? -1 : 0
>> +
>> +  dst.zw = (src.zw > 0) ? 1 : (src.zw < 0) ? -1 : 0
> I think this would be more obvious written as 1.0/-1.0/0.0 (same goes
> for the non-double version, of course).

Yep, that was Dave's comment too. I've fixed it here, left SSG alone.

> (As a side note, I'm wondering if this actually has defined NaN
> behavior, the glsl language sort of implies a number has to be exactly
> one of these 3 cases, and doesn't give an answer what should be returned
> for a NaN - maybe a NaN?)

I'll leave that for another day :)

>
>>
>>  .. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
>>
>> diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
>> b/src/gallium/include/pipe/p_shader_tokens.h
>> index fc41cc9..95ac590 100644
>> --- a/src/gallium/include/pipe/p_shader_tokens.h
>> +++ b/src/gallium/include/pipe/p_shader_tokens.h
>> @@ -519,7 +519,12 @@ struct tgsi_property_data {
>>  #define TGSI_OPCODE_D2U 215
>>  #define TGSI_OPCODE_U2D 216
>>  #define TGSI_OPCODE_DRSQ217 /* eg, cayman also has DRSQ */
>> -#define TGSI_OPCODE_LAST218
>> +#define TGSI_OPCODE_DTRUNC  218 /* nvc0 */
>> +#define TGSI_OPCODE_DCEIL   219 /* nvc0 */
>> +#define TGSI_OPCODE_DFLR220 /* nvc0 */
>> +#define TGSI_OPCODE_DROUND  221 /* nvc0 */
>> +#define TGSI_OPCODE_DSSG222
>> +#define TGSI_OPCODE_LAST223
>
> I don't think these hw-specific references (nvc0) really fit in there -
> the same can be said about some existing ones (cayman...)

I went with the flow... started by Dave, I believe. Happy to remove
them all too. Dave -- thoughts?

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/32] i965/fs: Fix lower_load_payload() to take into account non-zero reg_offset.

2015-02-20 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez 
wrote:

> Fixes metadata guess when instructions in the program specify a
> destination register with non-zero reg_offset and when the payload of
> a LOAD_PAYLOAD spans several registers.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index eccae06..163aa41 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3070,7 +3070,7 @@ fs_visitor::lower_load_payload()
>
> foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
>if (inst->dst.file == GRF) {
> - const int dst_reg = vgrf_to_reg[inst->dst.reg];
> + const int dst_reg = vgrf_to_reg[inst->dst.reg] +
> inst->dst.reg_offset;
>   bool force_sechalf = inst->force_sechalf &&
>!inst->force_writemask_all;
>   bool toggle_sechalf = inst->dst.width == 16 &&
> @@ -3119,7 +3119,7 @@ fs_visitor::lower_load_payload()
> }
>
> if (dst.file == GRF) {
> -  const int dst_reg = vgrf_to_reg[dst.reg];
> +  const int dst_reg = vgrf_to_reg[dst.reg] +
> dst.reg_offset;
>const bool force_writemask = mov->force_writemask_all;
>metadata[dst_reg].force_writemask_all = force_writemask;
>metadata[dst_reg].force_sechalf = mov->force_sechalf;
> --
> 2.1.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] gallium/sw/kms: don't redefine DEBUG

2015-02-20 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index ed34dfa..781818a 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -53,9 +53,9 @@
 #include "state_tracker/drm_driver.h"
 
 #if 0
-#define DEBUG(msg, ...) fprintf(stderr, msg, __VA_ARGS__)
+#define DEBUG_PRINT(msg, ...) fprintf(stderr, msg, __VA_ARGS__)
 #else
-#define DEBUG(msg, ...)
+#define DEBUG_PRINT(msg, ...)
 #endif
 
 struct sw_winsys;
@@ -145,7 +145,7 @@ kms_sw_displaytarget_create(struct sw_winsys *ws,
 
list_add(&kms_sw_dt->link, &kms_sw->bo_list);
 
-   DEBUG("KMS-DEBUG: created buffer %u (size %u)\n", kms_sw_dt->handle, 
kms_sw_dt->size);
+   DEBUG_PRINT("KMS-DEBUG: created buffer %u (size %u)\n", kms_sw_dt->handle, 
kms_sw_dt->size);
 
*stride = kms_sw_dt->stride;
return (struct sw_displaytarget *)kms_sw_dt;
@@ -177,7 +177,7 @@ kms_sw_displaytarget_destroy(struct sw_winsys *ws,
 
list_del(&kms_sw_dt->link);
 
-   DEBUG("KMS-DEBUG: destroyed buffer %u\n", kms_sw_dt->handle);
+   DEBUG_PRINT("KMS-DEBUG: destroyed buffer %u\n", kms_sw_dt->handle);
 
FREE(kms_sw_dt);
 }
@@ -205,7 +205,7 @@ kms_sw_displaytarget_map(struct sw_winsys *ws,
if (kms_sw_dt->mapped == MAP_FAILED)
   return NULL;
 
-   DEBUG("KMS-DEBUG: mapped buffer %u (size %u) at %p\n",
+   DEBUG_PRINT("KMS-DEBUG: mapped buffer %u (size %u) at %p\n",
  kms_sw_dt->handle, kms_sw_dt->size, kms_sw_dt->mapped);
 
return kms_sw_dt->mapped;
@@ -249,7 +249,7 @@ kms_sw_displaytarget_unmap(struct sw_winsys *ws,
 {
struct kms_sw_displaytarget *kms_sw_dt = kms_sw_displaytarget(dt);
 
-   DEBUG("KMS-DEBUG: unmapped buffer %u (was %p)\n", kms_sw_dt->handle, 
kms_sw_dt->mapped);
+   DEBUG_PRINT("KMS-DEBUG: unmapped buffer %u (was %p)\n", kms_sw_dt->handle, 
kms_sw_dt->mapped);
 
munmap(kms_sw_dt->mapped, kms_sw_dt->size);
kms_sw_dt->mapped = NULL;
@@ -283,7 +283,7 @@ kms_sw_displaytarget_from_handle(struct sw_winsys *ws,
  if (kms_sw_dt->handle == whandle->handle) {
 kms_sw_dt->ref_count++;
 
-DEBUG("KMS-DEBUG: imported buffer %u (size %u)\n", 
kms_sw_dt->handle, kms_sw_dt->size);
+DEBUG_PRINT("KMS-DEBUG: imported buffer %u (size %u)\n", 
kms_sw_dt->handle, kms_sw_dt->size);
 
 *stride = kms_sw_dt->stride;
 return (struct sw_displaytarget *)kms_sw_dt;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] gallivm: fix uninitialized-variable warnings

2015-02-20 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/gallivm/lp_bld_init.c   | 2 +-
 src/gallium/auxiliary/gallivm/lp_bld_sample.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index b9593de..6133883 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -509,7 +509,7 @@ void
 gallivm_compile_module(struct gallivm_state *gallivm)
 {
LLVMValueRef func;
-   int64_t time_begin;
+   int64_t time_begin = 0;
 
assert(!gallivm->compiled);
 
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index 093c8fc..7d18ee5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -221,7 +221,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
struct lp_build_context *coord_bld = &bld->coord_bld;
struct lp_build_context *rho_bld = &bld->lodf_bld;
const unsigned dims = bld->dims;
-   LLVMValueRef ddx_ddy[2];
+   LLVMValueRef ddx_ddy[2] = {NULL};
LLVMBuilderRef builder = bld->gallivm->builder;
LLVMTypeRef i32t = LLVMInt32TypeInContext(bld->gallivm->context);
LLVMValueRef index0 = LLVMConstInt(i32t, 0, 0);
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] gallium/sw/kms: fix a type-mismatch warning

2015-02-20 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index 781818a..ce3de78 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -313,7 +313,7 @@ kms_sw_displaytarget_get_handle(struct sw_winsys *winsys,
   return TRUE;
case DRM_API_HANDLE_TYPE_FD:
   if (!drmPrimeHandleToFD(kms_sw->fd, kms_sw_dt->handle,
- DRM_CLOEXEC, &whandle->handle)) {
+ DRM_CLOEXEC, (int*)&whandle->handle)) {
  whandle->stride = kms_sw_dt->stride;
  return TRUE;
   }
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] tgsi: fix type-mismatch warning

2015-02-20 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/auxiliary/tgsi/tgsi_text.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 5069d13..b6b3585 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -240,7 +240,7 @@ static boolean parse_double( const char **pcur, uint32_t 
*val0, uint32_t *val1)
   uint32_t uval[2];
} v;
 
-   v.dval = strtod(cur, pcur);
+   v.dval = strtod(cur, (char**)pcur);
if (*pcur == cur)
   return FALSE;
 
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] targets/d3dadapter9: remove an unused variable

2015-02-20 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/targets/d3dadapter9/drm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/targets/d3dadapter9/drm.c 
b/src/gallium/targets/d3dadapter9/drm.c
index cb0c48f..ec594fd 100644
--- a/src/gallium/targets/d3dadapter9/drm.c
+++ b/src/gallium/targets/d3dadapter9/drm.c
@@ -216,7 +216,7 @@ drm_create_adapter( int fd,
 {
 struct d3dadapter9drm_context *ctx = CALLOC_STRUCT(d3dadapter9drm_context);
 HRESULT hr;
-int i, different_device;
+int different_device;
 const struct drm_conf_ret *throttle_ret = NULL;
 const struct drm_conf_ret *dmabuf_ret = NULL;
 driOptionCache defaultInitOptions;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] vbo: fix an unitialized-variable warning

2015-02-20 Thread Marek Olšák
From: Marek Olšák 

It looks like a bug to me.

Cc: 10.5 10.4 10.3 
---
 src/mesa/vbo/vbo_attrib_tmp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h
index ec66934..0c44540 100644
--- a/src/mesa/vbo/vbo_attrib_tmp.h
+++ b/src/mesa/vbo/vbo_attrib_tmp.h
@@ -210,6 +210,7 @@ static inline float conv_i2_to_norm_float(const struct 
gl_context *ctx, int i2)
   }\
} else if ((type) == GL_UNSIGNED_INT_10F_11F_11F_REV) { \
   float res[4];\
+  res[3] = 1;   \
   r11g11b10f_to_float3((arg), res);\
   ATTR##val##FV((attr), res);  \
} else  \
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] gallivm: fix uninitialized-variable warnings

2015-02-20 Thread Brian Paul

On 02/20/2015 12:22 PM, Marek Olšák wrote:

From: Marek Olšák 

---
  src/gallium/auxiliary/gallivm/lp_bld_init.c   | 2 +-
  src/gallium/auxiliary/gallivm/lp_bld_sample.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index b9593de..6133883 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -509,7 +509,7 @@ void
  gallivm_compile_module(struct gallivm_state *gallivm)
  {
 LLVMValueRef func;
-   int64_t time_begin;
+   int64_t time_begin = 0;

 assert(!gallivm->compiled);

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index 093c8fc..7d18ee5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -221,7 +221,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
 struct lp_build_context *coord_bld = &bld->coord_bld;
 struct lp_build_context *rho_bld = &bld->lodf_bld;
 const unsigned dims = bld->dims;
-   LLVMValueRef ddx_ddy[2];
+   LLVMValueRef ddx_ddy[2] = {NULL};
 LLVMBuilderRef builder = bld->gallivm->builder;
 LLVMTypeRef i32t = LLVMInt32TypeInContext(bld->gallivm->context);
 LLVMValueRef index0 = LLVMConstInt(i32t, 0, 0);



For the series, Reviewed-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
One more comment here...  This particularly regards your plan of separating
it into "things that match the destination" and "other things" and not copy
prop uniforms or immediates into the "other things".  There is another case
we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write messages
are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual (r0,
r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect SIMD16
load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
those platforms, there is a special type of MOV-to-MRF that can move from
(gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
now.  However, it doesn't follow the standard rules.
--Jason

On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand 
wrote:

>
>
> On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez 
> wrote:
>
>> Jason Ekstrand  writes:
>>
>> > On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez > >
>> > wrote:
>> >
>> >> Jason Ekstrand  writes:
>> >>
>> >> > I'm still a little pensive.  But
>> >> >
>> >> > Reviewed-by: Jason Ekstrand 
>> >> >
>> >> Thanks.
>> >>
>> >> > Now for a little aside.  I have come to the conclusion that I made a
>> >> grave
>> >> > mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
>> have
>> >> > just subclassed fs_inst for load_payload.  The problem is that we
>> need to
>> >> > snag a bunch of information for the sources when we create the
>> >> > load_payload.  In particular, we need to know the width of the
>> source so
>> >> > that we know how much space it consumes in the payload and we need to
>> >> know
>> >> > the information required to properly re-create the mov such as
>> >> > force_sechalf and force_writemask_all.  Really, in order to do things
>> >> > properly, we need to gather this information *before* we do any
>> >> > optimizations.  The nasty pile of code that you're editing together
>> with
>> >> > the "effective_width" parameter is a lame attempt to
>> capture/reconstruct
>> >> > this information.  Really, we should just subclass, capture the
>> >> information
>> >> > up-front, and do it properly.
>> >> >
>> >> Yes, absolutely, this lowering pass is a real mess.  There are four
>> more
>> >> unreviewed patches in the mailing list fixing bugs of the metadata
>> >> guessing of lower_load_payload() [1][2][3][4], you may want to take a
>> >> look at them.  There are more bugs I'm aware of but it didn't seem
>> >> practical to fix them.
>> >>
>> >
>> > Yeah, Matt pointed me at those.  I'll give them a look later today.
>> >
>> >
>> >> That said, I don't think it would be worth subclassing fs_inst.
>> >>
>> >> My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
>> >> into a series of MOVs with force_writemask_all enabled in all cases,
>> >> simply rely on the optimizer to eliminate those where possible.  Then
>> >> get rid of the metadata and effective_width guessing.  Instead agree on
>> >> immediates and uniforms being exec_size-wide by convention
>> >> (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
>> >> then prevent constant propagation from propagating an immediate load
>> >> into a LOAD_PAYLOAD if it would lead to a change in the semantics of
>> the
>> >> program, and maybe just run constant propagation after lowering so we
>> >> can be sure those cases are cleaned up properly where register coalesce
>> >> isn't enough.
>> >>
>> >
>> > First off, simply setting force_writemask_all isn't an option.  Consider
>> > the following scenario:
>> >
>> > a = foo;
>> > if (bar) {
>> >b = 7;
>> > } else {
>> >use(a);
>> > }
>> > use(b);
>> >
>> > If "use(a)" is the last use of the variable a, then the live ranges of a
>> > and b don't actually over-lap and we can assign a and b to the same
>> > register.  However, if force_writemask_all is set on b, it will destroy
>> the
>> > value in a before its last use.  Right now, we don't actually do this
>> > because our liveness analysis pass flattens everything so it will think
>> > that b and a over-lap even though they don't.  However, if we ever
>> decide
>> > to make the liveness information more accurate, having a pile of
>> > force_writemask_all instructions will be a major problem.  So, while it
>> is
>> > *technically* safe for now, it's a really bad idea in the long-term.
>> >
>> The thing is we *will* have to deal with that scenario.  Building a
>> message payload inherently involves crossing channel boundaries (because
>> of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
>> say it's a bug in the liveness analysis pass if it wouldn't consider the
>> liveness interval of a and b to overlap in your example if the
>> assignment of b had force_writemask_all set.
>>
>
> Yes, I'm aware of that.  However, the part of that register that has to
> squash everything is usually only a couple of registers while the entire
> payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
> squ

[Mesa-dev] [Bug 67676] Transparent windows no longer work

2015-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67676

Chad Versace  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|mesa-dev@lists.freedesktop. |jonny.l...@collabora.co.uk
   |org |

--- Comment #14 from Chad Versace  ---
n reply to Jonny Lamb from comment #13)
> Created attachment 113563 [details]
> version three

Comments:

- Good call on specifying pre-multiplied alpha.

- I think you should remove the paragraph

To determine if the EGL implementation supports this extension, clients
should query the EGL_EXTENSIONS string of the current active display.

  because that's the normal way of querying extensions and needs no
  explanation. I found that paragraph confusing because it made me ask "Why is
  the extension telling me this? Is there some magic in this extension that
I've
  failed to see that warrants this explanation?". The platform extension
  specification that you used as a template explicitly describes how to perform
  the extension query because its query mechanism differs from 95% of existing
  EGL extensions.

- You should remove the "New Behavior" section. Normally, the fine details of
  an extension specification are defined in the "Additions to the EGL x.y
  specification" section as a diff against the EGL x.y spec. This extension is
no
  different. A few, special extensions, though, define behavior that should
*not*
  be added to the EGL spec, such as the platform extensions. That's what the
"New
  Behavior" section should be used for, and this alpha extension defines no
such
  "outside-of-the-spec" behavior.

- The paragraph

To obtain an EGL window surface with a meaningful alpha channel, use an
EGLConfig with EGL_TRANSPARENT_TYPE set to EGL_TRANSPARENT_ALPHA_MESA.

  is still valuable, despite belonging in the "New Behavior" section. The
  proper location for it is the "Overview" section.

- Regarding issue #1:

RESOLUTION: Yes. Non-window surfaces created with
EGL_TRANSPARENT_TYPE set to EGL_TRANSPARENT_ALPHA_MESA will
generate an error.

  What error does it generate? If we refer to this paragraph of Section 3.5.1
as precedent,

If config does not support rendering to windows (the EGL_SURFACE_TYPE
attribute does not contain EGL_WINDOW_BIT ), an EGL_BAD_MATCH error is
generated.

  then the resolution text can be clarified as:

RESOLUTION: Attempting to create a non-window surface with a config in
which EGL_TRANSPARENT_TYPE is EGL_TRANSPARENT_ALPHA_MESA, then an
EGL_BAD_MATCH error is generated.


- Other than the above (mostly sylistic) issues, the extension looks good to
  me.  Thanks for writing it! When you submit to mesa-dev, please CC me.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/23] main: Minor whitespace fixes in ClearNamedBuffer[Sub]Data.

2015-02-20 Thread Laura Ekstrand
Again, Ian requested that this be a separate commit.

On Fri, Feb 20, 2015 at 6:22 AM, Martin Peres  wrote:

> Please squash this in the previous commit (with git rebase -i).
>
>
> On 12/02/2015 04:05, Laura Ekstrand wrote:
>
>> ---
>>   src/mesa/main/bufferobj.c | 4 ++--
>>   src/mesa/main/bufferobj.h | 4 ++--
>>   2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
>> index bd21c8a..88230d6 100644
>> --- a/src/mesa/main/bufferobj.c
>> +++ b/src/mesa/main/bufferobj.c
>> @@ -1765,10 +1765,10 @@ void GLAPIENTRY
>>   _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
>>GLintptr offset, GLsizeiptr size,
>>GLenum format, GLenum type,
>> - const GLvoid* data)
>> + const GLvoid *data)
>>   {
>>  GET_CURRENT_CONTEXT(ctx);
>> -   struct gl_buffer_object* bufObj;
>> +   struct gl_buffer_object *bufObj;
>>bufObj = get_buffer(ctx, "glClearBufferSubData", target,
>> GL_INVALID_VALUE);
>>  if (!bufObj)
>> diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
>> index 5254727..2a66444 100644
>> --- a/src/mesa/main/bufferobj.h
>> +++ b/src/mesa/main/bufferobj.h
>> @@ -220,7 +220,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptrARB
>> offset,
>>   void GLAPIENTRY
>>   _mesa_ClearBufferData(GLenum target, GLenum internalformat,
>> GLenum format, GLenum type,
>> -  const GLvoid * data);
>> +  const GLvoid *data);
>> void GLAPIENTRY
>>   _mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
>> @@ -231,7 +231,7 @@ void GLAPIENTRY
>>   _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
>>GLintptr offset, GLsizeiptr size,
>>GLenum format, GLenum type,
>> - const GLvoid * data);
>> + const GLvoid *data);
>> void GLAPIENTRY
>>   _mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,
>>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/32] i965/fs: Fix lower_load_payload() to take into account stride in the metadata guess.

2015-02-20 Thread Francisco Jerez
Jason Ekstrand  writes:

> Where are you getting a destination stride in a load_payload?  The whole
> point of load_payload was to remove partial writes so this seems to go
> against everything it's intended for.
> --Jason

Yeah, maybe...  This case was triggered by some of my image_load_store
code that ends up using byte types with stride=4 for some image formats.

>
> On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez 
> wrote:
>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 163aa41..8da1f47 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -3074,7 +3074,7 @@ fs_visitor::lower_load_payload()
>>   bool force_sechalf = inst->force_sechalf &&
>>!inst->force_writemask_all;
>>   bool toggle_sechalf = inst->dst.width == 16 &&
>> -   type_sz(inst->dst.type) == 4 &&
>> +   type_sz(inst->dst.type) * inst->dst.stride
>> == 4 &&
>> !inst->force_writemask_all;
>>   for (int i = 0; i < inst->regs_written; ++i) {
>>  metadata[dst_reg + i].written = true;
>> --
>> 2.1.3
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>


pgpsLmQeSeN85.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand  writes:

> On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez 
> wrote:
>
>> Jason Ekstrand  writes:
>>
>> > On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez 
>> > wrote:
>> >
>> >> Jason Ekstrand  writes:
>> >>
>> >> > I'm still a little pensive.  But
>> >> >
>> >> > Reviewed-by: Jason Ekstrand 
>> >> >
>> >> Thanks.
>> >>
>> >> > Now for a little aside.  I have come to the conclusion that I made a
>> >> grave
>> >> > mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
>> have
>> >> > just subclassed fs_inst for load_payload.  The problem is that we
>> need to
>> >> > snag a bunch of information for the sources when we create the
>> >> > load_payload.  In particular, we need to know the width of the source
>> so
>> >> > that we know how much space it consumes in the payload and we need to
>> >> know
>> >> > the information required to properly re-create the mov such as
>> >> > force_sechalf and force_writemask_all.  Really, in order to do things
>> >> > properly, we need to gather this information *before* we do any
>> >> > optimizations.  The nasty pile of code that you're editing together
>> with
>> >> > the "effective_width" parameter is a lame attempt to
>> capture/reconstruct
>> >> > this information.  Really, we should just subclass, capture the
>> >> information
>> >> > up-front, and do it properly.
>> >> >
>> >> Yes, absolutely, this lowering pass is a real mess.  There are four more
>> >> unreviewed patches in the mailing list fixing bugs of the metadata
>> >> guessing of lower_load_payload() [1][2][3][4], you may want to take a
>> >> look at them.  There are more bugs I'm aware of but it didn't seem
>> >> practical to fix them.
>> >>
>> >
>> > Yeah, Matt pointed me at those.  I'll give them a look later today.
>> >
>> >
>> >> That said, I don't think it would be worth subclassing fs_inst.
>> >>
>> >> My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
>> >> into a series of MOVs with force_writemask_all enabled in all cases,
>> >> simply rely on the optimizer to eliminate those where possible.  Then
>> >> get rid of the metadata and effective_width guessing.  Instead agree on
>> >> immediates and uniforms being exec_size-wide by convention
>> >> (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
>> >> then prevent constant propagation from propagating an immediate load
>> >> into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
>> >> program, and maybe just run constant propagation after lowering so we
>> >> can be sure those cases are cleaned up properly where register coalesce
>> >> isn't enough.
>> >>
>> >
>> > First off, simply setting force_writemask_all isn't an option.  Consider
>> > the following scenario:
>> >
>> > a = foo;
>> > if (bar) {
>> >b = 7;
>> > } else {
>> >use(a);
>> > }
>> > use(b);
>> >
>> > If "use(a)" is the last use of the variable a, then the live ranges of a
>> > and b don't actually over-lap and we can assign a and b to the same
>> > register.  However, if force_writemask_all is set on b, it will destroy
>> the
>> > value in a before its last use.  Right now, we don't actually do this
>> > because our liveness analysis pass flattens everything so it will think
>> > that b and a over-lap even though they don't.  However, if we ever decide
>> > to make the liveness information more accurate, having a pile of
>> > force_writemask_all instructions will be a major problem.  So, while it
>> is
>> > *technically* safe for now, it's a really bad idea in the long-term.
>> >
>> The thing is we *will* have to deal with that scenario.  Building a
>> message payload inherently involves crossing channel boundaries (because
>> of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
>> say it's a bug in the liveness analysis pass if it wouldn't consider the
>> liveness interval of a and b to overlap in your example if the
>> assignment of b had force_writemask_all set.
>>
>
> Yes, I'm aware of that.  However, the part of that register that has to
> squash everything is usually only a couple of registers while the entire
> payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
> squash everything if we can.  All that being said, our liveness/register
> allocation can't handle this and making register allocation handle parts of
> registers interfering but not other bits is going to be a real pain.  Maybe
> not even worth it.  If you'd rather force_writemask_all everything, I'll
> sign off on that for now.  I just wanted to point out that it may not be a
> good permanent solution.
>
>
>> A reasonable compromise might be to leave it up to the caller whether to
>> set the force_writemask_all and force_sechalf flags or not.  I.e. just
>> copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
>> instructions.  That would still allow reducing the liveness intervals in
>> cases where we know that the payload respects channel boundaries.
>>
>> Trac

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand  writes:

> One more comment here...  This particularly regards your plan of separating
> it into "things that match the destination" and "other things" and not copy
> prop uniforms or immediates into the "other things".  There is another case
> we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write messages
> are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual (r0,
> r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect SIMD16
> load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
> those platforms, there is a special type of MOV-to-MRF that can move from
> (gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
> now.  However, it doesn't follow the standard rules.
> --Jason
>
I don't see why it couldn't be handled in roughly the same way you do it
now?  Recognize when src[i + 4] is the same 8-wide register as src[i]
shifted by 8 and emit a COMPR4 copy in that case?

> On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand 
> wrote:
>
>>
>>
>> On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez 
>> wrote:
>>
>>> Jason Ekstrand  writes:
>>>
>>> > On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez >> >
>>> > wrote:
>>> >
>>> >> Jason Ekstrand  writes:
>>> >>
>>> >> > I'm still a little pensive.  But
>>> >> >
>>> >> > Reviewed-by: Jason Ekstrand 
>>> >> >
>>> >> Thanks.
>>> >>
>>> >> > Now for a little aside.  I have come to the conclusion that I made a
>>> >> grave
>>> >> > mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
>>> have
>>> >> > just subclassed fs_inst for load_payload.  The problem is that we
>>> need to
>>> >> > snag a bunch of information for the sources when we create the
>>> >> > load_payload.  In particular, we need to know the width of the
>>> source so
>>> >> > that we know how much space it consumes in the payload and we need to
>>> >> know
>>> >> > the information required to properly re-create the mov such as
>>> >> > force_sechalf and force_writemask_all.  Really, in order to do things
>>> >> > properly, we need to gather this information *before* we do any
>>> >> > optimizations.  The nasty pile of code that you're editing together
>>> with
>>> >> > the "effective_width" parameter is a lame attempt to
>>> capture/reconstruct
>>> >> > this information.  Really, we should just subclass, capture the
>>> >> information
>>> >> > up-front, and do it properly.
>>> >> >
>>> >> Yes, absolutely, this lowering pass is a real mess.  There are four
>>> more
>>> >> unreviewed patches in the mailing list fixing bugs of the metadata
>>> >> guessing of lower_load_payload() [1][2][3][4], you may want to take a
>>> >> look at them.  There are more bugs I'm aware of but it didn't seem
>>> >> practical to fix them.
>>> >>
>>> >
>>> > Yeah, Matt pointed me at those.  I'll give them a look later today.
>>> >
>>> >
>>> >> That said, I don't think it would be worth subclassing fs_inst.
>>> >>
>>> >> My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
>>> >> into a series of MOVs with force_writemask_all enabled in all cases,
>>> >> simply rely on the optimizer to eliminate those where possible.  Then
>>> >> get rid of the metadata and effective_width guessing.  Instead agree on
>>> >> immediates and uniforms being exec_size-wide by convention
>>> >> (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
>>> >> then prevent constant propagation from propagating an immediate load
>>> >> into a LOAD_PAYLOAD if it would lead to a change in the semantics of
>>> the
>>> >> program, and maybe just run constant propagation after lowering so we
>>> >> can be sure those cases are cleaned up properly where register coalesce
>>> >> isn't enough.
>>> >>
>>> >
>>> > First off, simply setting force_writemask_all isn't an option.  Consider
>>> > the following scenario:
>>> >
>>> > a = foo;
>>> > if (bar) {
>>> >b = 7;
>>> > } else {
>>> >use(a);
>>> > }
>>> > use(b);
>>> >
>>> > If "use(a)" is the last use of the variable a, then the live ranges of a
>>> > and b don't actually over-lap and we can assign a and b to the same
>>> > register.  However, if force_writemask_all is set on b, it will destroy
>>> the
>>> > value in a before its last use.  Right now, we don't actually do this
>>> > because our liveness analysis pass flattens everything so it will think
>>> > that b and a over-lap even though they don't.  However, if we ever
>>> decide
>>> > to make the liveness information more accurate, having a pile of
>>> > force_writemask_all instructions will be a major problem.  So, while it
>>> is
>>> > *technically* safe for now, it's a really bad idea in the long-term.
>>> >
>>> The thing is we *will* have to deal with that scenario.  Building a
>>> message payload inherently involves crossing channel boundaries (because
>>> of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
>>> say it's a bug in the liveness analysis pass if it wouldn't consider the
>>> live

[Mesa-dev] [PATCH 7/7] i965: Fix variable indexing of sampler arrays under non-uniform control flow.

2015-02-20 Thread Francisco Jerez
ARB_gpu_shader5 requires sampler array indexing expressions to be
dynamically uniform, this however doesn't have any implications on the
control flow that leads to the evaluation of that expression being
uniform.  Use emit_uniformize() to obtain an arbitrary live value from
the binding table index calculation instead of assuming that the first
channel is always live.

Fixes the following Piglit test cases:
  
arb_gpu_shader5/execution/sampler_array_indexing/fs-nonuniform-control-flow.shader_test
  
arb_gpu_shader5/execution/sampler_array_indexing/vs-nonuniform-control-flow.shader_test

part of the series:
  http://lists.freedesktop.org/archives/piglit/2015-February/014615.html
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 5 +++--
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index ad77752..030ce24 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1563,8 +1563,8 @@ fs_visitor::nir_emit_texture(nir_tex_instr *instr)
 
  /* Emit code to evaluate the actual indexing expression */
  sampler_reg = vgrf(glsl_type::uint_type);
- emit(ADD(sampler_reg, src, fs_reg(sampler)))
- ->force_writemask_all = true;
+ emit(ADD(sampler_reg, src, fs_reg(sampler)));
+ emit_uniformize(sampler_reg, sampler_reg);
  break;
   }
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index dee6a6d..d7cad9f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2166,8 +2166,9 @@ fs_visitor::visit(ir_texture *ir)
   /* Emit code to evaluate the actual indexing expression */
   nonconst_sampler_index->accept(this);
   fs_reg temp = vgrf(glsl_type::uint_type);
-  emit(ADD(temp, this->result, fs_reg(sampler)))
-->force_writemask_all = true;
+  emit(ADD(temp, this->result, fs_reg(sampler)));
+  emit_uniformize(temp, temp);
+
   sampler_reg = temp;
} else {
   /* Single sampler, or constant array index; the indexing expression
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index d5848d2..c260381 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2497,8 +2497,9 @@ vec4_visitor::visit(ir_texture *ir)
   /* Emit code to evaluate the actual indexing expression */
   nonconst_sampler_index->accept(this);
   dst_reg temp(this, glsl_type::uint_type);
-  emit(ADD(temp, this->result, src_reg(sampler)))
- ->force_writemask_all = true;
+  emit(ADD(temp, this->result, src_reg(sampler)));
+  emit_uniformize(temp, src_reg(temp));
+
   sampler_reg = src_reg(temp);
} else {
   /* Single sampler, or constant array index; the indexing expression
-- 
2.1.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/7] i965: Define helper function to copy an arbitrary live component from some register.

2015-02-20 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.h |  2 ++
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 12 
 src/mesa/drivers/dri/i965/brw_vec4.h   |  3 +++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++
 4 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 6e5fa1d..df6b825 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -306,6 +306,8 @@ public:
  const fs_reg &a);
void emit_minmax(enum brw_conditional_mod conditionalmod, const fs_reg &dst,
 const fs_reg &src0, const fs_reg &src1);
+   /** Copy any live channel from \p src to the first channel of \p dst. */
+   void emit_uniformize(const fs_reg &dst, const fs_reg &src);
bool try_emit_saturate(ir_expression *ir);
bool try_emit_line(ir_expression *ir);
bool try_emit_mad(ir_expression *ir);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4b48f2d..6f1ff20 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -337,6 +337,18 @@ fs_visitor::emit_minmax(enum brw_conditional_mod 
conditionalmod, const fs_reg &d
}
 }
 
+void
+fs_visitor::emit_uniformize(const fs_reg &dst, const fs_reg &src)
+{
+   const fs_reg chan_index = vgrf(glsl_type::uint_type);
+
+   emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, component(chan_index, 0))
+  ->force_writemask_all = true;
+   emit(SHADER_OPCODE_BROADCAST, component(dst, 0),
+src, component(chan_index, 0))
+  ->force_writemask_all = true;
+}
+
 bool
 fs_visitor::try_emit_saturate(ir_expression *ir)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 7a92d59..adb5fe4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -298,6 +298,9 @@ public:
void emit_lrp(const dst_reg &dst,
  const src_reg &x, const src_reg &y, const src_reg &a);
 
+   /** Copy any live channel from \p src to the first channel of \p dst. */
+   void emit_uniformize(const dst_reg &dst, const src_reg &src);
+
void emit_block_move(dst_reg *dst, src_reg *src,
 const struct glsl_type *type, brw_predicate predicate);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index f6f589d..4515fc7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1304,6 +1304,17 @@ vec4_visitor::emit_lrp(const dst_reg &dst,
 }
 
 void
+vec4_visitor::emit_uniformize(const dst_reg &dst, const src_reg &src)
+{
+   const src_reg chan_index(this, glsl_type::uint_type);
+
+   emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, dst_reg(chan_index))
+  ->force_writemask_all = true;
+   emit(SHADER_OPCODE_BROADCAST, dst, src, chan_index)
+  ->force_writemask_all = true;
+}
+
+void
 vec4_visitor::visit(ir_expression *ir)
 {
unsigned int operand;
-- 
2.1.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] i965: Introduce the BROADCAST pseudo-opcode.

2015-02-20 Thread Francisco Jerez
The BROADCAST instruction picks the channel from its first source
given by an index passed in as second source.  This will be used in
situations where all channels from the same SIMD thread have to agree
on the value of something, e.g. a surface binding table index.
---
 src/mesa/drivers/dri/i965/brw_defines.h  |  6 ++
 src/mesa/drivers/dri/i965/brw_eu.h   |  6 ++
 src/mesa/drivers/dri/i965/brw_eu_emit.c  | 77 
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  4 ++
 src/mesa/drivers/dri/i965/brw_shader.cpp |  3 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  4 ++
 6 files changed, 100 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 17c27dd..d4930e3 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -911,6 +911,12 @@ enum opcode {
 
SHADER_OPCODE_URB_WRITE_SIMD8,
 
+   /**
+* Pick the channel from its first source register given by the index
+* specified as second source.  Useful for variable indexing of surfaces.
+*/
+   SHADER_OPCODE_BROADCAST,
+
VEC4_OPCODE_MOV_BYTES,
VEC4_OPCODE_PACK_BYTES,
VEC4_OPCODE_UNPACK_UNIFORM,
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
b/src/mesa/drivers/dri/i965/brw_eu.h
index a94ea42..2505480 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -413,6 +413,12 @@ brw_pixel_interpolator_query(struct brw_compile *p,
  unsigned msg_length,
  unsigned response_length);
 
+void
+brw_broadcast(struct brw_compile *p,
+  struct brw_reg dst,
+  struct brw_reg src,
+  struct brw_reg idx);
+
 /***
  * brw_eu_util.c:
  */
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 1d6fd67..d7e3995 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2854,6 +2854,83 @@ brw_pixel_interpolator_query(struct brw_compile *p,
brw_inst_set_pi_message_data(brw, insn, data);
 }
 
+void
+brw_broadcast(struct brw_compile *p,
+  struct brw_reg dst,
+  struct brw_reg src,
+  struct brw_reg idx)
+{
+   const struct brw_context *brw = p->brw;
+   const bool align1 = (brw_inst_access_mode(brw, p->current) == BRW_ALIGN_1);
+   brw_inst *inst;
+
+   assert(src.file == BRW_GENERAL_REGISTER_FILE &&
+  src.address_mode == BRW_ADDRESS_DIRECT);
+
+   if ((src.vstride == 0 && (src.hstride == 0 || !align1)) ||
+   idx.file == BRW_IMMEDIATE_VALUE) {
+  /* Trivial, the source is already uniform or the index is a constant.
+   * We will typically not get here if the optimizer is doing its job, but
+   * asserting would be mean.
+   */
+  const unsigned i = (idx.file == BRW_IMMEDIATE_VALUE ? idx.dw1.ud : 0);
+  brw_MOV(p, dst,
+  (align1 ? stride(suboffset(src, i), 0, 1, 0) :
+   stride(suboffset(src, 4 * i), 0, 4, 1)));
+
+   } else {
+  if (align1) {
+ const struct brw_reg addr =
+retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD);
+ const unsigned offset = src.nr * REG_SIZE + src.subnr;
+ /* Limit in bytes of the signed indirect addressing immediate. */
+ const unsigned limit = 512;
+
+ brw_push_insn_state(p);
+ brw_set_default_mask_control(p, BRW_MASK_DISABLE);
+ brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
+
+ /* Take into account the component size and horizontal stride. */
+ assert(src.vstride == src.hstride + src.width);
+ brw_SHL(p, addr, vec1(idx),
+ brw_imm_ud(_mesa_logbase2(type_sz(src.type)) +
+src.hstride - 1));
+
+ /* We can only address up to limit bytes using the indirect
+  * addressing immediate, account for the difference if the source
+  * register is above this limit.
+  */
+ if (offset >= limit)
+brw_ADD(p, addr, addr, brw_imm_ud(offset - offset % limit));
+
+ brw_pop_insn_state(p);
+
+ /* Use indirect addressing to fetch the specified component. */
+ brw_MOV(p, dst,
+ retype(brw_vec1_indirect(addr.subnr, offset % limit),
+src.type));
+
+  } else {
+ /* In SIMD4x2 mode the index can be either zero or one, replicate it
+  * to all bits of a flag register,
+  */
+ inst = brw_MOV(p,
+brw_null_reg(),
+stride(brw_swizzle1(idx, 0), 0, 4, 1));
+ brw_inst_set_pred_control(brw, inst, BRW_PREDICATE_NONE);
+ brw_inst_set_cond_modifier(brw, inst, BRW_CONDITIONAL_NZ);
+ brw_inst_set_flag_reg_nr(brw, inst, 1);
+
+ /* and use predic

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 2:43 PM, Francisco Jerez 
wrote:

> Jason Ekstrand  writes:
>
> > One more comment here...  This particularly regards your plan of
> separating
> > it into "things that match the destination" and "other things" and not
> copy
> > prop uniforms or immediates into the "other things".  There is another
> case
> > we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write
> messages
> > are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual
> (r0,
> > r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect
> SIMD16
> > load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
> > those platforms, there is a special type of MOV-to-MRF that can move from
> > (gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
> > now.  However, it doesn't follow the standard rules.
> > --Jason
> >
> I don't see why it couldn't be handled in roughly the same way you do it
> now?  Recognize when src[i + 4] is the same 8-wide register as src[i]
> shifted by 8 and emit a COMPR4 copy in that case?
>

Sure.  I haven't thought about it hard enough to prove it can't be done.
Just wanted to make sure you were aware of that particular monkey-wrench.
--Jason


> > On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand 
> > wrote:
> >
> >>
> >>
> >> On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez  >
> >> wrote:
> >>
> >>> Jason Ekstrand  writes:
> >>>
> >>> > On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez <
> curroje...@riseup.net
> >>> >
> >>> > wrote:
> >>> >
> >>> >> Jason Ekstrand  writes:
> >>> >>
> >>> >> > I'm still a little pensive.  But
> >>> >> >
> >>> >> > Reviewed-by: Jason Ekstrand 
> >>> >> >
> >>> >> Thanks.
> >>> >>
> >>> >> > Now for a little aside.  I have come to the conclusion that I
> made a
> >>> >> grave
> >>> >> > mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I
> should
> >>> have
> >>> >> > just subclassed fs_inst for load_payload.  The problem is that we
> >>> need to
> >>> >> > snag a bunch of information for the sources when we create the
> >>> >> > load_payload.  In particular, we need to know the width of the
> >>> source so
> >>> >> > that we know how much space it consumes in the payload and we
> need to
> >>> >> know
> >>> >> > the information required to properly re-create the mov such as
> >>> >> > force_sechalf and force_writemask_all.  Really, in order to do
> things
> >>> >> > properly, we need to gather this information *before* we do any
> >>> >> > optimizations.  The nasty pile of code that you're editing
> together
> >>> with
> >>> >> > the "effective_width" parameter is a lame attempt to
> >>> capture/reconstruct
> >>> >> > this information.  Really, we should just subclass, capture the
> >>> >> information
> >>> >> > up-front, and do it properly.
> >>> >> >
> >>> >> Yes, absolutely, this lowering pass is a real mess.  There are four
> >>> more
> >>> >> unreviewed patches in the mailing list fixing bugs of the metadata
> >>> >> guessing of lower_load_payload() [1][2][3][4], you may want to take
> a
> >>> >> look at them.  There are more bugs I'm aware of but it didn't seem
> >>> >> practical to fix them.
> >>> >>
> >>> >
> >>> > Yeah, Matt pointed me at those.  I'll give them a look later today.
> >>> >
> >>> >
> >>> >> That said, I don't think it would be worth subclassing fs_inst.
> >>> >>
> >>> >> My suggestion would have been to keep it simple and lower
> LOAD_PAYLOAD
> >>> >> into a series of MOVs with force_writemask_all enabled in all cases,
> >>> >> simply rely on the optimizer to eliminate those where possible.
> Then
> >>> >> get rid of the metadata and effective_width guessing.  Instead
> agree on
> >>> >> immediates and uniforms being exec_size-wide by convention
> >>> >> (i.e. LOAD_PAYLOAD's exec_size rather than the original
> instruction's),
> >>> >> then prevent constant propagation from propagating an immediate load
> >>> >> into a LOAD_PAYLOAD if it would lead to a change in the semantics of
> >>> the
> >>> >> program, and maybe just run constant propagation after lowering so
> we
> >>> >> can be sure those cases are cleaned up properly where register
> coalesce
> >>> >> isn't enough.
> >>> >>
> >>> >
> >>> > First off, simply setting force_writemask_all isn't an option.
> Consider
> >>> > the following scenario:
> >>> >
> >>> > a = foo;
> >>> > if (bar) {
> >>> >b = 7;
> >>> > } else {
> >>> >use(a);
> >>> > }
> >>> > use(b);
> >>> >
> >>> > If "use(a)" is the last use of the variable a, then the live ranges
> of a
> >>> > and b don't actually over-lap and we can assign a and b to the same
> >>> > register.  However, if force_writemask_all is set on b, it will
> destroy
> >>> the
> >>> > value in a before its last use.  Right now, we don't actually do this
> >>> > because our liveness analysis pass flattens everything so it will
> think
> >>> > that b and a over-lap even though they don't.  However, if we ever
> >>> decide
> >>> > to make the liveness information more a

[Mesa-dev] [PATCH 4/7] i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode.

2015-02-20 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 49 ++
 src/mesa/drivers/dri/i965/brw_fs.h |  1 +
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 41 +
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  1 +
 6 files changed, 94 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d567c2b..4537900 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2734,6 +2734,54 @@ fs_visitor::compute_to_mrf()
 }
 
 /**
+ * Eliminate FIND_LIVE_CHANNEL instructions occurring outside any control
+ * flow.  We could probably do better here with some form of divergence
+ * analysis.
+ */
+bool
+fs_visitor::eliminate_find_live_channel()
+{
+   bool progress = false;
+   unsigned depth = 0;
+
+   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
+  switch (inst->opcode) {
+  case BRW_OPCODE_IF:
+  case BRW_OPCODE_DO:
+ depth++;
+ break;
+
+  case BRW_OPCODE_ENDIF:
+  case BRW_OPCODE_WHILE:
+ depth--;
+ break;
+
+  case FS_OPCODE_DISCARD_JUMP:
+ /* This can potentially make control flow non-uniform until the end
+  * of the program.
+  */
+ depth++;
+ break;
+
+  case SHADER_OPCODE_FIND_LIVE_CHANNEL:
+ if (depth == 0) {
+inst->opcode = BRW_OPCODE_MOV;
+inst->src[0] = fs_reg(0);
+inst->sources = 1;
+inst->force_writemask_all = true;
+progress = true;
+ }
+ break;
+
+  default:
+ break;
+  }
+   }
+
+   return progress;
+}
+
+/**
  * Once we've generated code, try to convert normal FS_OPCODE_FB_WRITE
  * instructions to FS_OPCODE_REP_FB_WRITE.
  */
@@ -3678,6 +3726,7 @@ fs_visitor::optimize()
   OPT(opt_saturate_propagation);
   OPT(register_coalesce);
   OPT(compute_to_mrf);
+  OPT(eliminate_find_live_channel);
 
   OPT(compact_virtual_grfs);
} while (progress);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 9375412..6e5fa1d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -228,6 +228,7 @@ public:
bool opt_register_renaming();
bool register_coalesce();
bool compute_to_mrf();
+   bool eliminate_find_live_channel();
bool dead_code_eliminate();
bool remove_duplicate_mrf_writes();
bool virtual_grf_interferes(int a, int b);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index bbe0747..04a073e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -89,6 +89,7 @@ is_expression(const fs_inst *const inst)
case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD:
case FS_OPCODE_CINTERP:
case FS_OPCODE_LINTERP:
+   case SHADER_OPCODE_FIND_LIVE_CHANNEL:
case SHADER_OPCODE_BROADCAST:
   return true;
case SHADER_OPCODE_RCP:
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 601b8c1..219c05e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1250,6 +1250,46 @@ vec4_visitor::opt_register_coalesce()
 }
 
 /**
+ * Eliminate FIND_LIVE_CHANNEL instructions occurring outside any control
+ * flow.  We could probably do better here with some form of divergence
+ * analysis.
+ */
+bool
+vec4_visitor::eliminate_find_live_channel()
+{
+   bool progress = false;
+   unsigned depth = 0;
+
+   foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
+  switch (inst->opcode) {
+  case BRW_OPCODE_IF:
+  case BRW_OPCODE_DO:
+ depth++;
+ break;
+
+  case BRW_OPCODE_ENDIF:
+  case BRW_OPCODE_WHILE:
+ depth--;
+ break;
+
+  case SHADER_OPCODE_FIND_LIVE_CHANNEL:
+ if (depth == 0) {
+inst->opcode = BRW_OPCODE_MOV;
+inst->src[0] = src_reg(0);
+inst->force_writemask_all = true;
+progress = true;
+ }
+ break;
+
+  default:
+ break;
+  }
+   }
+
+   return progress;
+}
+
+/**
  * Splits virtual GRFs requesting more than one contiguous physical register.
  *
  * We initially create large virtual GRFs for temporary structures, arrays,
@@ -1880,6 +1920,7 @@ vec4_visitor::run()
   OPT(opt_cse);
   OPT(opt_algebraic);
   OPT(opt_register_coalesce);
+  OPT(eliminate_find_live_channel);
} while (progress);
 
pass_num = 0;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index a24f843..7a92d59 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -207,6 +207,7 @@ public:
bool opt_cse();
bool opt_algebraic();
bool opt

[Mesa-dev] [PATCH 6/7] i965: Fix variable indexing of UBO arrays under non-uniform control flow.

2015-02-20 Thread Francisco Jerez
ARB_gpu_shader5 requires UBO array indexing expressions to be
dynamically uniform, this however doesn't have any implications on the
control flow that leads to the evaluation of that expression being
uniform.  Use emit_uniformize() to obtain an arbitrary live value from
the binding table index calculation instead of assuming that the first
channel is always live.

Fixes the following Piglit tests:
  
arb_gpu_shader5/execution/ubo_array_indexing/fs-nonuniform-control-flow.shader_test
  
arb_gpu_shader5/execution/ubo_array_indexing/vs-nonuniform-control-flow.shader_test

part of the series:
  http://lists.freedesktop.org/archives/piglit/2015-February/014616.html
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 8 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 8 
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 5 +++--
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 90eecae..ad77752 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1263,13 +1263,13 @@ fs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
  const_index->u[0]);
   } else {
  /* The block index is not a constant. Evaluate the index expression
-  * per-channel and add the base UBO index; the generator will select
-  * a value from any live channel.
+  * per-channel and add the base UBO index; we have to select a value
+  * from any live channel.
   */
  surf_index = vgrf(glsl_type::uint_type);
  emit(ADD(surf_index, get_nir_src(instr->src[0]),
-  fs_reg(stage_prog_data->binding_table.ubo_start)))
-->force_writemask_all = true;
+  fs_reg(stage_prog_data->binding_table.ubo_start)));
+ emit_uniformize(surf_index, surf_index);
 
  /* Assume this may touch any UBO. It would be nice to provide
   * a tighter bound, but the array information is already lowered away.
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 6f1ff20..dee6a6d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1127,13 +1127,13 @@ fs_visitor::visit(ir_expression *ir)
  const_uniform_block->value.u[0]);
   } else {
  /* The block index is not a constant. Evaluate the index expression
-  * per-channel and add the base UBO index; the generator will select
-  * a value from any live channel.
+  * per-channel and add the base UBO index; we have to select a value
+  * from any live channel.
   */
  surf_index = vgrf(glsl_type::uint_type);
  emit(ADD(surf_index, op[0],
-  fs_reg(stage_prog_data->binding_table.ubo_start)))
-->force_writemask_all = true;
+  fs_reg(stage_prog_data->binding_table.ubo_start)));
+ emit_uniformize(surf_index, surf_index);
 
  /* Assume this may touch any UBO. It would be nice to provide
   * a tighter bound, but the array information is already lowered away.
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 4515fc7..d5848d2 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1750,12 +1750,13 @@ vec4_visitor::visit(ir_expression *ir)
   const_uniform_block->value.u[0]);
   } else {
  /* The block index is not a constant. Evaluate the index expression
-  * per-channel and add the base UBO index; the generator will select
-  * a value from any live channel.
+  * per-channel and add the base UBO index; we have to select a value
+  * from any live channel.
   */
  surf_index = src_reg(this, glsl_type::uint_type);
  emit(ADD(dst_reg(surf_index), op[0],
   src_reg(prog_data->base.binding_table.ubo_start)));
+ emit_uniformize(dst_reg(surf_index), surf_index);
 
  /* Assume this may touch any UBO. It would be nice to provide
   * a tighter bound, but the array information is already lowered away.
-- 
2.1.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/7] i965: Perform basic optimizations on the BROADCAST opcode.

2015-02-20 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.cpp| 15 +++
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp   |  1 +
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp|  1 +
 src/mesa/drivers/dri/i965/brw_ir_fs.h   |  7 +++
 src/mesa/drivers/dri/i965/brw_ir_vec4.h |  7 +++
 src/mesa/drivers/dri/i965/brw_vec4.cpp  | 10 ++
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp  |  1 +
 8 files changed, 43 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a2a5234..d567c2b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2484,6 +2484,21 @@ fs_visitor::opt_algebraic()
  }
  break;
   }
+  case SHADER_OPCODE_BROADCAST:
+ if (is_uniform(inst->src[0])) {
+inst->opcode = BRW_OPCODE_MOV;
+inst->sources = 1;
+progress = true;
+
+ } else if (inst->src[1].file == IMM) {
+inst->opcode = BRW_OPCODE_MOV;
+inst->src[0] = component(inst->src[0],
+ inst->src[1].fixed_hw_reg.dw1.ud);
+inst->sources = 1;
+progress = true;
+ }
+ break;
+
   default:
 break;
   }
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index e265ce0..cc4d9d0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -572,6 +572,7 @@ fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry 
*entry)
  break;
 
   case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD:
+  case SHADER_OPCODE_BROADCAST:
  inst->src[i] = val;
  progress = true;
  break;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index ae069bb..bbe0747 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -89,6 +89,7 @@ is_expression(const fs_inst *const inst)
case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD:
case FS_OPCODE_CINTERP:
case FS_OPCODE_LINTERP:
+   case SHADER_OPCODE_BROADCAST:
   return true;
case SHADER_OPCODE_RCP:
case SHADER_OPCODE_RSQ:
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index 9ef1261..dc1e041 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -166,6 +166,13 @@ component(fs_reg reg, unsigned idx)
return reg;
 }
 
+static inline bool
+is_uniform(const fs_reg ®)
+{
+   return (reg.width == 1 || reg.stride == 0 || reg.is_null()) &&
+  (!reg.reladdr || is_uniform(*reg.reladdr));
+}
+
 /**
  * Get either of the 8-component halves of a 16-component register.
  *
diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
index 941086f..066240e 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
@@ -102,6 +102,13 @@ negate(src_reg reg)
return reg;
 }
 
+static inline bool
+is_uniform(const src_reg ®)
+{
+   return (reg.file == IMM || reg.file == UNIFORM || reg.is_null()) &&
+  (!reg.reladdr || is_uniform(*reg.reladdr));
+}
+
 class dst_reg : public backend_reg
 {
 public:
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 0a68413..601b8c1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -756,6 +756,16 @@ vec4_visitor::opt_algebraic()
  }
  break;
   }
+  case SHADER_OPCODE_BROADCAST:
+ if (is_uniform(inst->src[0]) ||
+ (inst->src[1].file == IMM &&
+  inst->src[1].fixed_hw_reg.dw1.ud == 0)) {
+inst->opcode = BRW_OPCODE_MOV;
+inst->src[1] = src_reg();
+progress = true;
+ }
+ break;
+
   default:
 break;
   }
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 679867c..c65f23d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -152,6 +152,7 @@ try_constant_propagate(struct brw_context *brw, 
vec4_instruction *inst,
 
switch (inst->opcode) {
case BRW_OPCODE_MOV:
+   case SHADER_OPCODE_BROADCAST:
   inst->src[arg] = value;
   return true;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
index 5fb8f31..dbfea92 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
@@ -71,6 +71,7 @@ is_expression(const vec4_instruction *const inst)
case BR

[Mesa-dev] [PATCH 3/7] i965: Introduce the FIND_LIVE_CHANNEL pseudo-opcode.

2015-02-20 Thread Francisco Jerez
This instruction calculates the index of an arbitrary channel enabled
in the current execution mask.  It's expected to be used as input for
the BROADCAST opcode, but it's implemented as a separate instruction
rather than being baked into BROADCAST because FIND_LIVE_CHANNEL has
no dependencies so it can always be CSE'ed with other instances of the
same instruction within a basic block.
---
 src/mesa/drivers/dri/i965/brw_defines.h  |  8 +++
 src/mesa/drivers/dri/i965/brw_eu.h   |  4 ++
 src/mesa/drivers/dri/i965/brw_eu_emit.c  | 70 
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  4 ++
 src/mesa/drivers/dri/i965/brw_shader.cpp |  2 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  4 ++
 6 files changed, 92 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index d4930e3..2b52fb2 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -912,6 +912,14 @@ enum opcode {
SHADER_OPCODE_URB_WRITE_SIMD8,
 
/**
+* Return the index of an arbitrary live channel (i.e. one of the channels
+* enabled in the current execution mask) and assign it to the first
+* component of the destination.  Expected to be used as input for the
+* BROADCAST pseudo-opcode.
+*/
+   SHADER_OPCODE_FIND_LIVE_CHANNEL,
+
+   /**
 * Pick the channel from its first source register given by the index
 * specified as second source.  Useful for variable indexing of surfaces.
 */
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
b/src/mesa/drivers/dri/i965/brw_eu.h
index 2505480..1a8b38c 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -414,6 +414,10 @@ brw_pixel_interpolator_query(struct brw_compile *p,
  unsigned response_length);
 
 void
+brw_find_live_channel(struct brw_compile *p,
+  struct brw_reg dst);
+
+void
 brw_broadcast(struct brw_compile *p,
   struct brw_reg dst,
   struct brw_reg src,
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index d7e3995..7899f83 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2855,6 +2855,76 @@ brw_pixel_interpolator_query(struct brw_compile *p,
 }
 
 void
+brw_find_live_channel(struct brw_compile *p, struct brw_reg dst)
+{
+   const struct brw_context *brw = p->brw;
+   brw_inst *inst;
+
+   assert(brw->gen >= 7);
+
+   brw_push_insn_state(p);
+   brw_set_default_mask_control(p, BRW_MASK_DISABLE);
+
+   if (brw_inst_access_mode(brw, p->current) == BRW_ALIGN_1) {
+  if (brw->gen >= 8) {
+ /* Getting the first active channel index is easy on Gen8: Just find
+  * the first bit set in the mask register.  The same register exists
+  * on HSW already but it reads back as all ones when the current
+  * instruction has execution masking disabled, so it's kind of
+  * useless.
+  */
+ inst = brw_FBL(p, vec1(dst),
+retype(brw_mask_reg(0), BRW_REGISTER_TYPE_UD));
+
+ /* Quarter control has the effect of magically shifting the value of
+  * this register.  Make sure it's set to zero.
+  */
+ brw_inst_set_qtr_control(brw, inst, GEN6_COMPRESSION_1Q);
+
+  } else {
+ const struct brw_reg flag = retype(brw_flag_reg(1, 0),
+BRW_REGISTER_TYPE_UD);
+
+ brw_MOV(p, flag, brw_imm_ud(0));
+
+ /* Run a 16-wide instruction returning zero with execution masking
+  * and a conditional modifier enabled in order to get the current
+  * execution mask in f1.0.
+  */
+ inst = brw_MOV(p, vec16(brw_null_reg()), brw_imm_ud(0));
+ brw_inst_set_mask_control(brw, inst, BRW_MASK_ENABLE);
+ brw_inst_set_cond_modifier(brw, inst, BRW_CONDITIONAL_Z);
+ brw_inst_set_flag_reg_nr(brw, inst, 1);
+
+ brw_FBL(p, vec1(dst), flag);
+  }
+
+   } else {
+  if (brw->gen >= 8) {
+ /* In SIMD4x2 mode the first active channel index is just the
+  * negation of the first bit of the mask register.
+  */
+ inst = brw_AND(p, brw_writemask(dst, WRITEMASK_X),
+negate(retype(brw_mask_reg(0), BRW_REGISTER_TYPE_UD)),
+brw_imm_ud(1));
+
+  } else {
+ /* Overwrite the destination without and with execution masking to
+  * find out which of the channels is active.
+  */
+ brw_MOV(p, brw_writemask(vec4(dst), WRITEMASK_X),
+ brw_imm_ud(1));
+
+ inst = brw_MOV(p, brw_writemask(vec4(dst), WRITEMASK_X),
+brw_imm_ud(0));
+ brw_inst_set_mask_control(brw, inst, BRW_MASK_ENABLE);
+  }
+   }
+
+   brw_pop_insn_state(p);
+}
+
+void
 brw_b

Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Eric Anholt
I wanted patch #1 to land, so I took a look at this one :)

Matt Turner  writes:
> +   if (brw->gen >= 6) {
> +  /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
> +  fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
> +
> +  /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
> +   *
> +   *or(8)  tmp.1<2>W  g0.0<0,1,0>W  0x3f80W
> +   *and(8) dst<1>Dtmp<8,8,1>D   0xbf80D
> +   *
> +   * and negate g0.0<0,1,0>W for (gl_FrontFacing ? -1.0 : 1.0).
> +   */
> +
> +  if (value1->f[0] == -1.0f) {
> + g0.negate = true;
> +  }

Does this do what you want?  If g0.0 happened to be *all* zeroes, you're
still going to get 0 after negation, right?

(I suppose your primitive type is probably going to be triangles if
you're doing facing, so the low bits will be non-zero)


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] nir: Add a couple of simplifications of csel operations.

2015-02-20 Thread Eric Anholt
vc4 was already cleaning these up, but it does shave 4 NIR instructions in
shader-db.
---
 src/glsl/nir/nir_opt_algebraic.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index b0c206a..dd4557f 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -146,6 +146,9 @@ optimizations = [
# next round of opt_algebraic, get picked up by one of the above two.
(('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),
 
+   (('bcsel', a, b, b), b),
+   (('fcsel', a, b, b), b),
+
# Subtracts
(('fsub', 0.0, ('fsub', 0.0, a)), a),
(('isub', 0, ('isub', 0, a)), a),
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] nir: Allow nir_opt_algebraic to see booleanness through &&, ||, !.

2015-02-20 Thread Eric Anholt
We have some useful optimizations to drop things like 'ine a, 0' on a
boolean argument, but if 'a' came from logical operations on bools, it
couldn't tell.  These kinds of constructs appear as a result of TGSI->NIR
quite frequently (at least with if flattening), so being a little more
aggressive in detecting booleans can pay off.

vc4 results:
total instructions in shared programs: 40207 -> 39881 (-0.81%)
instructions in affected programs: 6677 -> 6351 (-4.88%)
---
 src/glsl/nir/nir_search.c | 29 -
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
index 4671931..d12b6ab 100644
--- a/src/glsl/nir/nir_search.c
+++ b/src/glsl/nir/nir_search.c
@@ -39,6 +39,32 @@ match_expression(const nir_search_expression *expr, 
nir_alu_instr *instr,
 
 static const uint8_t identity_swizzle[] = { 0, 1, 2, 3 };
 
+static bool alu_instr_is_bool(nir_alu_instr *instr);
+
+static bool
+src_is_bool(nir_src src)
+{
+   if (!src.is_ssa)
+  return false;
+   if (src.ssa->parent_instr->type != nir_instr_type_alu)
+  return false;
+   return alu_instr_is_bool((nir_alu_instr *)src.ssa->parent_instr);
+}
+
+static bool
+alu_instr_is_bool(nir_alu_instr *instr)
+{
+   switch (instr->op) {
+   case nir_op_iand:
+   case nir_op_ior:
+  return src_is_bool(instr->src[0].src) && src_is_bool(instr->src[1].src);
+   case nir_op_inot:
+  return src_is_bool(instr->src[0].src);
+   default:
+  return nir_op_infos[instr->op].output_type == nir_type_bool;
+   }
+}
+
 static bool
 match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned src,
 unsigned num_components, const uint8_t *swizzle,
@@ -89,7 +115,8 @@ match_value(const nir_search_value *value, nir_alu_instr 
*instr, unsigned src,
 nir_alu_instr *src_alu =
nir_instr_as_alu(instr->src[src].src.ssa->parent_instr);
 
-if (nir_op_infos[src_alu->op].output_type != var->type)
+if (nir_op_infos[src_alu->op].output_type != var->type &&
+!(var->type == nir_type_bool && alu_instr_is_bool(src_alu)))
return false;
  }
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] nir: Generalize the optimization of subs of subs from 0.

2015-02-20 Thread Eric Anholt
I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above,
but we can generalize it and make it more potentially useful.  In the
specific original case of a 0 for our new 'a' argument, it'll get further
algebraic optimization once the 0 is an argument to the new add.

No shader-db effects.
---
 src/glsl/nir/nir_opt_algebraic.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index 8b08e9e..6d09120 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -152,8 +152,8 @@ optimizations = [
(('fcsel', a, b, b), b),
 
# Subtracts
-   (('fsub', 0.0, ('fsub', 0.0, a)), a),
-   (('isub', 0, ('isub', 0, a)), a),
+   (('fsub', a, ('fsub', 0.0, b)), ('fadd', a, b)),
+   (('isub', a, ('isub', 0, b)), ('iadd', a, b)),
(('fneg', a), ('fsub', 0.0, a), 'options->lower_negate'),
(('ineg', a), ('isub', 0, a), 'options->lower_negate'),
(('fadd', a, ('fsub', 0.0, b)), ('fsub', a, b)),
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] nir: When faced with a csel on !condition, just flip the arguments.

2015-02-20 Thread Eric Anholt
total NIR instructions in shared programs: 39426 -> 39411 (-0.04%)
NIR instructions in affected programs: 3748 -> 3733 (-0.40%)
---
 src/glsl/nir/nir_opt_algebraic.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index dd4557f..d4869c7 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -86,6 +86,7 @@ optimizations = [
(('fge', ('fneg', ('fabs', a)), 0.0), ('feq', a, 0.0)),
(('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
(('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
+   (('bcsel', ('inot', 'a@bool'), b, c), ('bcsel', a, c, b)),
(('fmin', ('fmax', a, 0.0), 1.0), ('fsat', a), '!options->lower_fsat'),
(('fsat', a), ('fmin', ('fmax', a, 0.0), 1.0), 'options->lower_fsat'),
(('fsat', ('fsat', a)), ('fsat', a)),
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] nir: Collapse repeated bcsels on the same argument.

2015-02-20 Thread Eric Anholt
vc4 results:
total instructions in shared programs: 39881 -> 39794 (-0.22%)
instructions in affected programs: 6302 -> 6215 (-1.38%)
---
 src/glsl/nir/nir_opt_algebraic.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index d4869c7..8b08e9e 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -87,6 +87,7 @@ optimizations = [
(('bcsel', ('flt', a, b), a, b), ('fmin', a, b)),
(('bcsel', ('flt', a, b), b, a), ('fmax', a, b)),
(('bcsel', ('inot', 'a@bool'), b, c), ('bcsel', a, c, b)),
+   (('bcsel', a, ('bcsel', a, b, c), d), ('bcsel', a, b, d)),
(('fmin', ('fmax', a, 0.0), 1.0), ('fsat', a), '!options->lower_fsat'),
(('fsat', a), ('fmin', ('fmax', a, 0.0), 1.0), 'options->lower_fsat'),
(('fsat', ('fsat', a)), ('fsat', a)),
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] r600g: add doubles support for CAYMAN

2015-02-20 Thread Ilia Mirkin
On Thu, Feb 19, 2015 at 9:59 PM, Glenn Kennard  wrote:
>> +   if ((write_mask & 0x3) != 0x3 &&
>> +   (write_mask & 0xc) != 0xc) {
>> +   fprintf(stderr, "illegal writemask for 64-bit: 0x%x\n",
>> write_mask);
>> +   return -1;
>> +   }

I think I noted this on IRC, but did you mean this to be || instead of
&& ? e.g. right now this will not trigger for a .xyz writemask, since
write_mask & 3 == 3.

BTW, this isn't a complete review, I was just scanning Glenn's
response and this happened to be below one of the lines he commented
on.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] r600g: add doubles support for CAYMAN

2015-02-20 Thread Ilia Mirkin
On Fri, Feb 20, 2015 at 3:07 PM, Ilia Mirkin  wrote:
> On Thu, Feb 19, 2015 at 9:59 PM, Glenn Kennard  
> wrote:
>>> +   if ((write_mask & 0x3) != 0x3 &&
>>> +   (write_mask & 0xc) != 0xc) {
>>> +   fprintf(stderr, "illegal writemask for 64-bit: 0x%x\n",
>>> write_mask);
>>> +   return -1;
>>> +   }
>
> I think I noted this on IRC, but did you mean this to be || instead of
> && ? e.g. right now this will not trigger for a .xyz writemask, since
> write_mask & 3 == 3.

Oh but it can't be || either, since write_mask == 3 should pass this.
I guess you want like

if (write_mask & 3 && write_mask & 3 != 3)
  error
if (writemask & c && writemaks & c != c)
  error

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 11:54 AM, Eric Anholt  wrote:
> I wanted patch #1 to land, so I took a look at this one :)

Thanks! :)

> Matt Turner  writes:
>> +   if (brw->gen >= 6) {
>> +  /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
>> +  fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
>> +
>> +  /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
>> +   *
>> +   *or(8)  tmp.1<2>W  g0.0<0,1,0>W  0x3f80W
>> +   *and(8) dst<1>Dtmp<8,8,1>D   0xbf80D
>> +   *
>> +   * and negate g0.0<0,1,0>W for (gl_FrontFacing ? -1.0 : 1.0).
>> +   */
>> +
>> +  if (value1->f[0] == -1.0f) {
>> + g0.negate = true;
>> +  }
>
> Does this do what you want?  If g0.0 happened to be *all* zeroes, you're
> still going to get 0 after negation, right?

That's a good question. I'm not sure. The bits below it are

13: Source Depth Present to Render Target.
12: oMask to Render Target
11: Source0 Alpha Present to RenderTarget.
8:6: Starting Sample Pair Index

BDW adds some additional fields as well.

The others are "ignored". It's not clear to me that at least one of
the defined bits is guaranteed to be zero. It's no guarantee or
anything, but FWIW without realizing it we were depending on some bit
being non-zero for the frontfacing optimizations that used ASR as well
(commits d1c43ed4, 19c6617a) and haven't seen any problems from it. So
if there's a problem... we're not making it worse in this commit...

The simulator seems to set some bits in the ignored fields, but I
don't have any explanation for that, nor is that evidence that we can
rely on the hardware to do the same.

Or maybe I'm just wrong and some bit is guaranteed to be set?

> (I suppose your primitive type is probably going to be triangles if
> you're doing facing, so the low bits will be non-zero)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/23] main: Add entry point for NamedBufferSubData.

2015-02-20 Thread Laura Ekstrand
That would make the diff easier to read, but it doesn't make sense to lay
out the functions in that order in the file.  GetBufferSubData is a
download function and shouldn't be part of the upload function "group" in
the file.

On Fri, Feb 20, 2015 at 2:28 AM, Martin Peres  wrote:

> On 12/02/2015 04:05, Laura Ekstrand wrote:
>
>> v2: review by Ian Romanick
>> - Remove "_mesa" from name of static software fallback
>> buffer_sub_data.
>> - Remove mappedRange from _mesa_buffer_sub_data.
>> - Removed some cosmetic changes to a separate commit.
>> ---
>>   src/mapi/glapi/gen/ARB_direct_state_access.xml |   7 ++
>>   src/mesa/main/bufferobj.c  | 129
>> +++--
>>   src/mesa/main/bufferobj.h  |   9 ++
>>   src/mesa/main/tests/dispatch_sanity.cpp|   1 +
>>   4 files changed, 97 insertions(+), 49 deletions(-)
>>
>> diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
>> b/src/mapi/glapi/gen/ARB_direct_state_access.xml
>> index 7779262..6d70b8e 100644
>> --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
>> +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
>> @@ -28,6 +28,13 @@
>> 
>>  
>>   +   
>> +  
>> +  
>> +  
>> +  
>> +   
>> +
>>  
>>
>> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
>> index ac8eed1..4f89748 100644
>> --- a/src/mesa/main/bufferobj.c
>> +++ b/src/mesa/main/bufferobj.c
>> @@ -227,67 +227,58 @@ bufferobj_range_mapped(const struct
>> gl_buffer_object *obj,
>>* \c glClearBufferSubData.
>>*
>>* \param ctx GL context.
>> - * \param target  Buffer object target on which to operate.
>> + * \param bufObj  The buffer object.
>>* \param offset  Offset of the first byte of the subdata range.
>>* \param sizeSize, in bytes, of the subdata range.
>>* \param mappedRange  If true, checks if an overlapping range is
>> mapped.
>>* If false, checks if buffer is mapped.
>> - * \param errorNoBuffer  Error code if no buffer is bound to target.
>>* \param caller  Name of calling function for recording errors.
>> - * \return   A pointer to the buffer object bound to \c target in the
>> - *   specified context or \c NULL if any of the parameter or
>> state
>> - *   conditions are invalid.
>> + * \return   false if error, true otherwise
>>*
>>* \sa glBufferSubDataARB, glGetBufferSubDataARB, glClearBufferSubData
>>*/
>> -static struct gl_buffer_object *
>> -buffer_object_subdata_range_good(struct gl_context * ctx, GLenum target,
>> - GLintptrARB offset, GLsizeiptrARB size,
>> - bool mappedRange, GLenum errorNoBuffer,
>> - const char *caller)
>> +static bool
>> +buffer_object_subdata_range_good(struct gl_context *ctx,
>> + struct gl_buffer_object *bufObj,
>> + GLintptr offset, GLsizeiptr size,
>> + bool mappedRange, const char *caller)
>>   {
>> -   struct gl_buffer_object *bufObj;
>> -
>>  if (size < 0) {
>> _mesa_error(ctx, GL_INVALID_VALUE, "%s(size < 0)", caller);
>> -  return NULL;
>> +  return false;
>>  }
>>if (offset < 0) {
>> _mesa_error(ctx, GL_INVALID_VALUE, "%s(offset < 0)", caller);
>> -  return NULL;
>> +  return false;
>>  }
>>   -   bufObj = get_buffer(ctx, caller, target, errorNoBuffer);
>> -   if (!bufObj)
>> -  return NULL;
>> -
>>  if (offset + size > bufObj->Size) {
>> _mesa_error(ctx, GL_INVALID_VALUE,
>> "%s(offset %lu + size %lu > buffer size %lu)", caller,
>> (unsigned long) offset,
>> (unsigned long) size,
>> (unsigned long) bufObj->Size);
>> -  return NULL;
>> +  return false;
>>  }
>>if (bufObj->Mappings[MAP_USER].AccessFlags &
>> GL_MAP_PERSISTENT_BIT)
>> -  return bufObj;
>> +  return true;
>>if (mappedRange) {
>> if (bufferobj_range_mapped(bufObj, offset, size)) {
>>_mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
>> - return NULL;
>> + return false;
>> }
>>  }
>>  else {
>> if (_mesa_bufferobj_mapped(bufObj, MAP_USER)) {
>>_mesa_error(ctx, GL_INVALID_OPERATION, "%s", caller);
>> - return NULL;
>> + return false;
>> }
>>  }
>>   -   return bufObj;
>> +   return true;
>>   }
>> @@ -602,9 +593,9 @@ buffer_data_fallback(struct gl_context *ctx,
>> GLenum target, GLsizeiptr size,
>>* \sa glBufferSubDataARB, dd_function_table::BufferSubData.
>>*/
>>   static void
>> -_mesa_buffer_subdata( struct gl_context *ctx, GLintptrARB offset,
>> - GLsizeiptrARB size, const GLvoid * data,
>> - struct gl_buffer_object * bufObj )
>> +buffer_sub_data_f

[Mesa-dev] [PATCH] drivers/x11: add gallium include dirs to Makefile.am

2015-02-20 Thread Brian Paul
Fixes xlib driver build after e8c5cbfd921680c.
---
 src/mesa/drivers/x11/Makefile.am | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/x11/Makefile.am b/src/mesa/drivers/x11/Makefile.am
index 76a7cd4..c0596f8 100644
--- a/src/mesa/drivers/x11/Makefile.am
+++ b/src/mesa/drivers/x11/Makefile.am
@@ -30,6 +30,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa \
-I$(top_srcdir)/src \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mesa/main \
$(X11_INCLUDES) \
$(DEFINES)
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] nir: Add a couple of simplifications of csel operations.

2015-02-20 Thread Matt Turner
The series is

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir: Fix the Mesa build without -DDEBUG.

2015-02-20 Thread Kenneth Graunke
With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which
doesn't exist, breaking the build:

assert(state->states[index].index < state->states[index].stack_size);

Switch it to ifndef NDEBUG, so the field will exist if the assertion
actually generates code.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_to_ssa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_to_ssa.c b/src/glsl/nir/nir_to_ssa.c
index dbe1699..47cf453 100644
--- a/src/glsl/nir/nir_to_ssa.c
+++ b/src/glsl/nir/nir_to_ssa.c
@@ -134,7 +134,7 @@ typedef struct {
nir_ssa_def **stack;
int index;
unsigned num_defs; /** < used to add indices to debug names */
-#ifdef DEBUG
+#ifndef NDEBUG
unsigned stack_size;
 #endif
 } reg_state;
@@ -489,7 +489,7 @@ init_rewrite_state(nir_function_impl *impl, rewrite_state 
*state)
  state->states[reg->index].stack = ralloc_array(state->states,
 nir_ssa_def *,
 stack_size);
-#ifdef DEBUG
+#ifndef NDEBUG
  state->states[reg->index].stack_size = stack_size;
 #endif
  state->states[reg->index].index = -1;
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Have configure define NDEBUG, not mtypes.h.

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 12:44 PM, Matt Turner  wrote:
> mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
> defined. Confusing and bizarre that you don't get NDEBUG if you don't
> include mtypes.h.
>
> ... which is just what happened in commit bef38f62e.
>
> Let's let configure define this for us if not using --enable-debug.
> ---

Scons will probably need similar treatment.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Have configure define NDEBUG, not mtypes.h.

2015-02-20 Thread Matt Turner
mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
defined. Confusing and bizarre that you don't get NDEBUG if you don't
include mtypes.h.

... which is just what happened in commit bef38f62e.

Let's let configure define this for us if not using --enable-debug.
---
 configure.ac   | 2 ++
 src/mesa/main/mtypes.h | 3 ---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index beb7a7d..5fbb7bc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -370,6 +370,8 @@ if test "x$enable_debug" = xyes; then
 CXXFLAGS="$CXXFLAGS -O0"
 fi
 fi
+else
+   DEFINES="$DEFINES -DNDEBUG"
 fi
 
 dnl
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 05b5a81..6e99773 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4512,9 +4512,6 @@ extern int MESA_DEBUG_FLAGS;
 # define MESA_VERBOSE 0
 # define MESA_DEBUG_FLAGS 0
 # define MESA_FUNCTION "a function"
-# ifndef NDEBUG
-#  define NDEBUG
-# endif
 #endif
 
 
-- 
2.0.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Fix the Mesa build without -DDEBUG.

2015-02-20 Thread Matt Turner
Regardless of where DEBUG/NDEBUG is defined, it seems like we should
do this anyway.

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/23] main: Add entry point for NamedBufferSubData.

2015-02-20 Thread Martin Peres

On 20/02/2015 22:17, Laura Ekstrand wrote:
That would make the diff easier to read, but it doesn't make sense to 
lay out the functions in that order in the file.  GetBufferSubData is 
a download function and shouldn't be part of the upload function 
"group" in the file.


Fair enough! That was a minor nitpick anyway and I did not expect you to 
fix it, hence why I gave my R-b ;)




On Fri, Feb 20, 2015 at 2:28 AM, Martin Peres > wrote:


On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Ian Romanick
- Remove "_mesa" from name of static software fallback
buffer_sub_data.
- Remove mappedRange from _mesa_buffer_sub_data.
- Removed some cosmetic changes to a separate commit.
---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |   7 ++
  src/mesa/main/bufferobj.c  | 129
+++--
  src/mesa/main/bufferobj.h  |   9 ++
  src/mesa/main/tests/dispatch_sanity.cpp|   1 +
  4 files changed, 97 insertions(+), 49 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 7779262..6d70b8e 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -28,6 +28,13 @@

 
  +   
+  
+  
+  
+  
+   
+
 
   
diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index ac8eed1..4f89748 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -227,67 +227,58 @@ bufferobj_range_mapped(const struct
gl_buffer_object *obj,
   * \c glClearBufferSubData.
   *
   * \param ctx GL context.
- * \param target  Buffer object target on which to operate.
+ * \param bufObj  The buffer object.
   * \param offset  Offset of the first byte of the subdata range.
   * \param sizeSize, in bytes, of the subdata range.
   * \param mappedRange  If true, checks if an overlapping
range is mapped.
   * If false, checks if buffer is mapped.
- * \param errorNoBuffer  Error code if no buffer is bound to
target.
   * \param caller  Name of calling function for recording errors.
- * \return   A pointer to the buffer object bound to \c
target in the
- *   specified context or \c NULL if any of the
parameter or state
- *   conditions are invalid.
+ * \return   false if error, true otherwise
   *
   * \sa glBufferSubDataARB, glGetBufferSubDataARB,
glClearBufferSubData
   */
-static struct gl_buffer_object *
-buffer_object_subdata_range_good(struct gl_context * ctx,
GLenum target,
- GLintptrARB offset,
GLsizeiptrARB size,
- bool mappedRange, GLenum
errorNoBuffer,
- const char *caller)
+static bool
+buffer_object_subdata_range_good(struct gl_context *ctx,
+ struct gl_buffer_object *bufObj,
+ GLintptr offset, GLsizeiptr
size,
+ bool mappedRange, const char
*caller)
  {
-   struct gl_buffer_object *bufObj;
-
 if (size < 0) {
_mesa_error(ctx, GL_INVALID_VALUE, "%s(size < 0)",
caller);
-  return NULL;
+  return false;
 }
   if (offset < 0) {
_mesa_error(ctx, GL_INVALID_VALUE, "%s(offset < 0)",
caller);
-  return NULL;
+  return false;
 }
  -   bufObj = get_buffer(ctx, caller, target, errorNoBuffer);
-   if (!bufObj)
-  return NULL;
-
 if (offset + size > bufObj->Size) {
_mesa_error(ctx, GL_INVALID_VALUE,
"%s(offset %lu + size %lu > buffer size
%lu)", caller,
(unsigned long) offset,
(unsigned long) size,
(unsigned long) bufObj->Size);
-  return NULL;
+  return false;
 }
   if (bufObj->Mappings[MAP_USER].AccessFlags &
GL_MAP_PERSISTENT_BIT)
-  return bufObj;
+  return true;
   if (mappedRange) {
if (bufferobj_range_mapped(bufObj, offset, size)) {
   _mesa_error(ctx, GL_INVALID_OPERATION, "%s", c

Re: [Mesa-dev] [PATCH 11/23] main: Refactor ClearBuffer[Sub]Data.

2015-02-20 Thread Martin Peres

On 20/02/2015 20:03, Laura Ekstrand wrote:

This naming convention tries to match the corresponding DD table entry.

There's a thread discussing the naming convention for external 
software fallback functions.  Feel free to add your input to the 
discussion there :)


Ack, no strong opinion here :)



On Fri, Feb 20, 2015 at 6:18 AM, Martin Peres > wrote:


On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Jason Ekstrand
- Split refactor of clear buffer sub data from addition of
DSA entry
  points.
---
  src/mesa/main/bufferobj.c| 125
---
  src/mesa/main/bufferobj.h|  19 ++--
  src/mesa/state_tracker/st_cb_bufferobjects.c |   4 +-
  3 files changed, 69 insertions(+), 79 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 7225b64..b8fa917 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -660,11 +660,11 @@ _mesa_buffer_get_subdata( struct
gl_context *ctx, GLintptrARB offset,
   * dd_function_table::ClearBufferSubData.
   */
  void
-_mesa_buffer_clear_subdata(struct gl_context *ctx,
-   GLintptr offset, GLsizeiptr size,
-   const GLvoid *clearValue,
-   GLsizeiptr clearValueSize,
-   struct gl_buffer_object *bufObj)
+_mesa_ClearBufferSubData_sw(struct gl_context *ctx,


Interesting choice of naming the function as a mix of camel case
and underscores.

Since it is an internal function, shouldn't it only have underscores?

Other than that:

Reviewed-by: Martin Peres mailto:martin.pe...@linux.intel.com>>

+GLintptr offset, GLsizeiptr size,
+const GLvoid *clearValue,
+GLsizeiptr clearValueSize,
+struct gl_buffer_object *bufObj)
  {
 GLsizeiptr i;
 GLubyte *dest;
@@ -1113,7 +1113,7 @@
_mesa_init_buffer_object_functions(struct dd_function_table
*driver)
 driver->UnmapBuffer = _mesa_buffer_unmap;
   /* GL_ARB_clear_buffer_object */
-   driver->ClearBufferSubData = _mesa_buffer_clear_subdata;
+   driver->ClearBufferSubData = _mesa_ClearBufferSubData_sw;
   /* GL_ARB_map_buffer_range */
 driver->MapBufferRange = _mesa_buffer_map_range;
@@ -1671,57 +1671,77 @@ _mesa_GetBufferSubData(GLenum target,
GLintptr offset,
 ctx->Driver.GetBufferSubData( ctx, offset, size, data,
bufObj );
  }
  -
-void GLAPIENTRY
-_mesa_ClearBufferData(GLenum target, GLenum internalformat,
GLenum format,
-  GLenum type, const GLvoid* data)
+/**
+ * \param subdata   true if caller is *SubData, false if *Data
+ */
+void
+_mesa_clear_buffer_sub_data(struct gl_context *ctx,
+struct gl_buffer_object *bufObj,
+GLenum internalformat,
+GLintptr offset, GLsizeiptr size,
+GLenum format, GLenum type,
+const GLvoid *data,
+const char *func, bool subdata)
  {
-   GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object* bufObj;
 mesa_format mesaFormat;
 GLubyte clearValue[MAX_PIXEL_BYTES];
 GLsizeiptr clearValueSize;
  -   bufObj = get_buffer(ctx, "glClearBufferData", target,
GL_INVALID_VALUE);
-   if (!bufObj) {
-  return;
-   }
-
-   if (_mesa_check_disallowed_mapping(bufObj)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  "glClearBufferData(buffer currently mapped)");
+   /* This checks for disallowed mappings. */
+   if (!buffer_object_subdata_range_good(ctx, bufObj, offset,
size,
+ subdata, func)) {
return;
 }
   mesaFormat = validate_clear_buffer_format(ctx,
internalformat,
- format, type,
-  "glClearBufferData");
+ format, type, func);
+
 if (mesaFormat == MESA_FORMAT_NONE) {
return;
 }
   clearValueSize = _mesa_get_format_bytes(mesaFormat);
-   if (bufObj->Size % clearValueSize != 0) {
+   if (offset % cle

Re: [Mesa-dev] [PATCH 23/23] main: Cosmetic changes to GetBufferSubData.

2015-02-20 Thread Martin Peres

On 20/02/2015 19:58, Laura Ekstrand wrote:
Ian has asked that this not be squashed.  In fact, they were the same 
commit last time around on the ML.


Ok. Sorry for the noise then :s



On Fri, Feb 20, 2015 at 12:29 AM, Martin Peres > wrote:


Please squash this commit with the one introducing GetBufferSubData.

It should be pretty easy to do with git rebase -i :)


On 12/02/2015 04:06, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 2 +-
  src/mesa/main/bufferobj.h | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 0272704..38d8b5a 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1668,7 +1668,7 @@ _mesa_GetBufferSubData(GLenum target,
GLintptr offset,
 }
   ASSERT(ctx->Driver.GetBufferSubData);
-   ctx->Driver.GetBufferSubData( ctx, offset, size, data,
bufObj );
+   ctx->Driver.GetBufferSubData(ctx, offset, size, data, bufObj);
  }
void GLAPIENTRY
diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index feeaa8b..b5d73ae 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -230,8 +230,8 @@ _mesa_NamedBufferSubData(GLuint buffer,
GLintptr offset,
   GLsizeiptr size, const GLvoid *data);
void GLAPIENTRY
-_mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
-   GLsizeiptrARB size, void * data);
+_mesa_GetBufferSubData(GLenum target, GLintptr offset,
+   GLsizeiptr size, GLvoid *data);
void GLAPIENTRY
  _mesa_GetNamedBufferSubData(GLuint buffer, GLintptr offset,





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

2015-02-20 Thread Jordan Justen
The brw_imm_ud will yield a HW_REG which then will introduce a barrier
for certain optimization opportunities.

No piglit regressions seen with gen8 (simd8vs).

Suggested-by: Matt Turner 
Signed-off-by: Jordan Justen 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index fa7d32c..b1b75821 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -3016,7 +3016,7 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, 
unsigned surf_index,
*/
   assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
   emit(MOV(component(sources[0], 7),
-   brw_imm_ud(0x)))->force_writemask_all = true;
+   fs_reg(0x)))->force_writemask_all = true;
}
length++;
 
@@ -3079,7 +3079,7 @@ fs_visitor::emit_untyped_surface_read(unsigned 
surf_index, fs_reg dst,
*/
   assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
   emit(MOV(component(sources[0], 7),
-   brw_imm_ud(0x)))->force_writemask_all = true;
+   fs_reg(0x)))->force_writemask_all = true;
}
 
/* Set the surface read offset. */
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

2015-02-20 Thread Matt Turner
Reviewed-by: Matt Turner 

It might be a good idea to make the fs_reg(struct brw_reg) explicit to
prevent this kind of mistake from happening.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] nir: Compilation error in nir/nir_to_ssa.c

2015-02-20 Thread Dieter Nützel

make[3]: Entering directory '/opt/mesa/src/glsl'
  CC   nir/nir_to_ssa.lo
In file included from nir/../glsl_types.h:30:0,
 from nir/nir_types.h:32,
 from nir/nir.h:36,
 from nir/nir_to_ssa.c:28:
nir/nir_to_ssa.c: In function 'rewrite_def_forwards':
nir/nir_to_ssa.c:226:60: error: 'reg_state' has no member named 
'stack_size'
assert(state->states[index].index < 
state->states[index].stack_size);

^
Makefile:1602: recipe for target 'nir/nir_to_ssa.lo' failed
make[3]: *** [nir/nir_to_ssa.lo] Error 1

It's only defined for DEBUG.

-Dieter
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] nir: Compilation error in nir/nir_to_ssa.c

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 1:23 PM, Dieter Nützel  wrote:
> make[3]: Entering directory '/opt/mesa/src/glsl'
>   CC   nir/nir_to_ssa.lo
> In file included from nir/../glsl_types.h:30:0,
>  from nir/nir_types.h:32,
>  from nir/nir.h:36,
>  from nir/nir_to_ssa.c:28:
> nir/nir_to_ssa.c: In function 'rewrite_def_forwards':
> nir/nir_to_ssa.c:226:60: error: 'reg_state' has no member named 'stack_size'
> assert(state->states[index].index < state->states[index].stack_size);
> ^
> Makefile:1602: recipe for target 'nir/nir_to_ssa.lo' failed
> make[3]: *** [nir/nir_to_ssa.lo] Error 1
>
> It's only defined for DEBUG.
>
> -Dieter

Yep. There are two patches on the list to fix this.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] i965: Force miptrees for BOs to have all slices in each lod.

2015-02-20 Thread Laura Ekstrand
Textures made expressly for internal buffer objects shouldn't have extra
padding around them, but should be densely packed.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0e3888f..b46532d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -724,7 +724,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
mt = intel_miptree_create_layout(brw, target, format,
 0, 0,
 width, height, depth,
-true, 0, false);
+true, 0, true);
if (!mt) {
   free(mt);
   return mt;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] common: Fix PBOs for 1D_ARRAY.

2015-02-20 Thread Laura Ekstrand
Corrects the way that _mesa_meta_pbo_TexSubImage and
_mesa_meta_pbo_GetTexSubImage handle 1D_ARRAY textures.  Fixes a failure in
the Piglit arb_direct_state_access/gettextureimage-targets test.
---
 src/mesa/drivers/common/meta_tex_subimage.c | 62 +
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index ee3295b..24a72ba 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -141,7 +141,7 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
struct gl_texture_image *pbo_tex_image;
GLenum status;
bool success = false;
-   int z;
+   int z, iters;
 
/* XXX: This should probably be passed in from somewhere */
const char *where = "_mesa_meta_pbo_TexSubImage";
@@ -189,12 +189,6 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
_mesa_BindFramebuffer(GL_READ_FRAMEBUFFER, fbos[0]);
_mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fbos[1]);
 
-   if (tex_image->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
-  assert(depth == 1);
-  depth = height;
-  height = 1;
-   }
-
_mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
  pbo_tex_image, 0);
/* If this passes on the first layer it should pass on the others */
@@ -218,7 +212,10 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
   GL_COLOR_BUFFER_BIT, GL_NEAREST))
   goto fail;
 
-   for (z = 1; z < depth; z++) {
+   iters = tex_image->TexObject->Target == GL_TEXTURE_1D_ARRAY ?
+   height : depth;
+
+   for (z = 1; z < iters; z++) {
   _mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
 pbo_tex_image, z);
   _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
@@ -226,11 +223,18 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
 
   _mesa_update_state(ctx);
 
-  _mesa_meta_BlitFramebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
- 0, z * height, width, (z + 1) * height,
- xoffset, yoffset,
- xoffset + width, yoffset + height,
- GL_COLOR_BUFFER_BIT, GL_NEAREST);
+  if (tex_image->TexObject->Target == GL_TEXTURE_1D_ARRAY)
+ _mesa_meta_BlitFramebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
+0, z, width, z + 1,
+xoffset, yoffset,
+xoffset + width, yoffset + 1,
+GL_COLOR_BUFFER_BIT, GL_NEAREST);
+  else
+ _mesa_meta_BlitFramebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
+0, z * height, width, (z + 1) * height,
+xoffset, yoffset,
+xoffset + width, yoffset + height,
+GL_COLOR_BUFFER_BIT, GL_NEAREST);
}
 
success = true;
@@ -257,7 +261,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
struct gl_texture_image *pbo_tex_image;
GLenum status;
bool success = false;
-   int z;
+   int z, iters;
 
/* XXX: This should probably be passed in from somewhere */
const char *where = "_mesa_meta_pbo_GetTexSubImage";
@@ -299,12 +303,6 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
 
_mesa_GenFramebuffers(2, fbos);
 
-   if (tex_image && tex_image->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
-  assert(depth == 1);
-  depth = height;
-  height = 1;
-   }
-
/* If we were given a texture, bind it to the read framebuffer.  If not,
 * we're doing a ReadPixels and we should just use whatever framebuffer
 * the client has bound.
@@ -338,7 +336,12 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
   GL_COLOR_BUFFER_BIT, GL_NEAREST))
   goto fail;
 
-   for (z = 1; z < depth; z++) {
+   if (tex_image && tex_image->TexObject->Target == GL_TEXTURE_1D_ARRAY)
+  iters = height;
+   else
+  iters = depth;
+
+   for (z = 1; z < iters; z++) {
   _mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
 tex_image, zoffset + z);
   _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
@@ -346,11 +349,18 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
 
   _mesa_update_state(ctx);
 
-  _mesa_meta_BlitFramebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
- xoffset, yoffset,
- xoffset + width, yoffset + height,
- 0, z * height, width, (z + 1) * height,
-  

[Mesa-dev] [PATCH 2/4] common: Correct PBO 2D_ARRAY handling.

2015-02-20 Thread Laura Ekstrand
Changes PBO uploads and downloads to use a tall (height * depth) 2D texture
for blitting.  This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY
textures are not properly uploaded and downloaded.
---
 src/mesa/drivers/common/meta_tex_subimage.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index f4f7716..ee3295b 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -110,7 +110,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
internal_format = _mesa_get_format_base_format(pbo_format);
 
tex_image = _mesa_get_tex_image(ctx, tex_obj, tex_obj->Target, 0);
-   _mesa_init_teximage_fields(ctx, tex_image, width, height, depth,
+   _mesa_init_teximage_fields(ctx, tex_image, width, height * depth, 1,
   0, internal_format, pbo_format);
 
read_only = pbo_target == GL_PIXEL_UNPACK_BUFFER;
@@ -227,7 +227,7 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
   _mesa_update_state(ctx);
 
   _mesa_meta_BlitFramebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
- 0, 0, width, height,
+ 0, z * height, width, (z + 1) * height,
  xoffset, yoffset,
  xoffset + width, yoffset + height,
  GL_COLOR_BUFFER_BIT, GL_NEAREST);
@@ -349,7 +349,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
   _mesa_meta_BlitFramebuffer(ctx, ctx->ReadBuffer, ctx->DrawBuffer,
  xoffset, yoffset,
  xoffset + width, yoffset + height,
- 0, 0, width, height,
+ 0, z * height, width, (z + 1) * height,
  GL_COLOR_BUFFER_BIT, GL_NEAREST);
}
 
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/4] v2: Fix PBO uploads/downloads.

2015-02-20 Thread Laura Ekstrand
This is a more robust set of patches to fix Meta PBO texture uploads and
downloads.  This fixes bugs related to array PBOs that were found using the
new set of getteximage-targets tests.

Laura Ekstrand (4):
  common: Correct texture initialization in create_texture_for_pbo.
  common: Correct PBO 2D_ARRAY handling.
  common: Fix PBOs for 1D_ARRAY.
  i965: Force miptrees for BOs to have all slices in each lod.

 src/mesa/drivers/common/meta_tex_subimage.c   | 77 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  2 +-
 src/mesa/drivers/dri/i965/intel_tex.c |  3 +-
 src/mesa/main/dd.h|  1 +
 4 files changed, 50 insertions(+), 33 deletions(-)

-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] common: Correct texture initialization in create_texture_for_pbo.

2015-02-20 Thread Laura Ekstrand
Solves bugs related to the driver not setting up the texture miptree
correctly, leading to faulty PBO uploads and downloads.
---
 src/mesa/drivers/common/meta_tex_subimage.c | 13 +
 src/mesa/drivers/dri/i965/intel_tex.c   |  3 ++-
 src/mesa/main/dd.h  |  1 +
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index 68c8273..f4f7716 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -51,7 +51,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
 {
uint32_t pbo_format;
GLenum internal_format;
-   unsigned row_stride;
+   unsigned row_stride, image_stride;
struct gl_buffer_object *buffer_obj;
struct gl_texture_object *tex_obj;
struct gl_texture_image *tex_image;
@@ -74,6 +74,8 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
pixels = _mesa_image_address3d(packing, pixels,
   width, height, format, type, 0, 0, 0);
row_stride = _mesa_image_row_stride(packing, width, format, type);
+   image_stride = _mesa_image_image_stride(packing, width, height, format,
+   type);
 
if (_mesa_is_bufferobj(packing->BufferObj)) {
   *tmp_pbo = 0;
@@ -89,7 +91,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
*/
   _mesa_BindBuffer(pbo_target, *tmp_pbo);
 
-  _mesa_BufferData(pbo_target, row_stride * height, pixels, 
GL_STREAM_DRAW);
+  _mesa_BufferData(pbo_target, image_stride * depth, pixels, 
GL_STREAM_DRAW);
 
   buffer_obj = ctx->Unpack.BufferObj;
   pixels = NULL;
@@ -99,9 +101,11 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
 
_mesa_GenTextures(1, tmp_tex);
tex_obj = _mesa_lookup_texture(ctx, *tmp_tex);
-   tex_obj->Target = depth > 1 ? GL_TEXTURE_2D_ARRAY : GL_TEXTURE_2D;
-   tex_obj->Immutable = GL_TRUE;
_mesa_initialize_texture_object(ctx, tex_obj, *tmp_tex, GL_TEXTURE_2D);
+   /* This must be set after _mesa_initialize_texture_object, not before. */
+   tex_obj->Immutable = GL_TRUE;
+   /* This is required for interactions with ARB_texture_view. */
+   tex_obj->NumLayers = 1;
 
internal_format = _mesa_get_format_base_format(pbo_format);
 
@@ -114,6 +118,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
  buffer_obj,
  (intptr_t)pixels,
  row_stride,
+ image_stride,
  read_only)) {
   _mesa_DeleteTextures(1, tmp_tex);
   _mesa_DeleteBuffers(1, tmp_pbo);
diff --git a/src/mesa/drivers/dri/i965/intel_tex.c 
b/src/mesa/drivers/dri/i965/intel_tex.c
index 2d3009a..3a0c09a 100644
--- a/src/mesa/drivers/dri/i965/intel_tex.c
+++ b/src/mesa/drivers/dri/i965/intel_tex.c
@@ -305,6 +305,7 @@ intel_set_texture_storage_for_buffer_object(struct 
gl_context *ctx,
 struct gl_buffer_object 
*buffer_obj,
 uint32_t buffer_offset,
 uint32_t row_stride,
+uint32_t image_stride,
 bool read_only)
 {
struct brw_context *brw = brw_context(ctx);
@@ -334,7 +335,7 @@ intel_set_texture_storage_for_buffer_object(struct 
gl_context *ctx,
 
drm_intel_bo *bo = intel_bufferobj_buffer(brw, intel_buffer_obj,
  buffer_offset,
- row_stride * image->Height);
+ image_stride * image->Depth);
intel_texobj->mt =
   intel_miptree_create_for_bo(brw, bo,
   image->TexFormat,
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index ec8662b..9de08e2 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -429,6 +429,7 @@ struct dd_function_table {
 struct gl_buffer_object *bufferObj,
 uint32_t buffer_offset,
 uint32_t row_stride,
+uint32_t image_stride,
 bool read_only);
 
/**
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] drivers/x11: add gallium include dirs to Makefile.am

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 12:20 PM, Brian Paul  wrote:
> Fixes xlib driver build after e8c5cbfd921680c.
> ---
>  src/mesa/drivers/x11/Makefile.am | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/mesa/drivers/x11/Makefile.am 
> b/src/mesa/drivers/x11/Makefile.am
> index 76a7cd4..c0596f8 100644
> --- a/src/mesa/drivers/x11/Makefile.am
> +++ b/src/mesa/drivers/x11/Makefile.am
> @@ -30,6 +30,8 @@ AM_CPPFLAGS = \
> -I$(top_srcdir)/src/mapi \
> -I$(top_srcdir)/src/mesa \
> -I$(top_srcdir)/src \
> +   -I$(top_srcdir)/src/gallium/include \
> +   -I$(top_srcdir)/src/gallium/auxiliary \

I really dislike including these directories in a bunch of non-Gallium
places. I should have said so more strongly when the patch went by.

I'm not sure whether I want to spend my time to fix this the right way or what.

I guess just go ahead and commit this until something better comes along.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 89245] [bisected] "Drop dependency on mtypes.h for core NIR" patch broke Gallium builds

2015-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89245

Bug ID: 89245
   Summary: [bisected] "Drop dependency on mtypes.h for core NIR"
patch broke Gallium builds
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: b.bel...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org
CC: e...@anholt.net

The patch "nir: Drop dependency on mtypes.h for core NIR." (commit
bef38f62e026) broke my Gallium (r600g) build.


I build with:
./autogen.sh --with-gallium-drivers=r600 --with-dri-drivers=
--enable-texture-float --disable-dri3 --disable-r600-llvm-compiler
--disable-gallium-llvm CFLAGS="-O2 -march=native" CXXFLAGS="-O2 -march=native"
--libdir=/usr/local/lib64 --prefix=/usr/local


Here is the error:
[...]
  CC   nir/nir_split_var_copies.lo
  CC   nir/nir_to_ssa.lo
  CC   nir/nir_validate.lo
  CC   nir/nir_worklist.lo
In file included from nir/../glsl_types.h:30:0,
 from nir/nir_types.h:32,
 from nir/nir.h:36,
 from nir/nir_to_ssa.c:28:
nir/nir_to_ssa.c: In function 'rewrite_def_forwards':
nir/nir_to_ssa.c:226:60: error: 'reg_state' has no member named 'stack_size'
assert(state->states[index].index < state->states[index].stack_size);
^
Makefile:1602: recipe for target 'nir/nir_to_ssa.lo' failed
make[3]: *** [nir/nir_to_ssa.lo] Error 1
make[3]: *** Attente des tâches non terminées
make[3]: Leaving directory '/home/benjamin/MESA/mesa/src/glsl'
Makefile:1209: recipe for target 'all' failed
make[2]: *** [all] Error 2
make[2]: Leaving directory '/home/benjamin/MESA/mesa/src/glsl'
Makefile:658: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/benjamin/MESA/mesa/src'
Makefile:601: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Eric Anholt
Matt Turner  writes:

> On Fri, Feb 20, 2015 at 11:54 AM, Eric Anholt  wrote:
>> I wanted patch #1 to land, so I took a look at this one :)
>
> Thanks! :)
>
>> Matt Turner  writes:
>>> +   if (brw->gen >= 6) {
>>> +  /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
>>> +  fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
>>> +
>>> +  /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
>>> +   *
>>> +   *or(8)  tmp.1<2>W  g0.0<0,1,0>W  0x3f80W
>>> +   *and(8) dst<1>Dtmp<8,8,1>D   0xbf80D
>>> +   *
>>> +   * and negate g0.0<0,1,0>W for (gl_FrontFacing ? -1.0 : 1.0).
>>> +   */
>>> +
>>> +  if (value1->f[0] == -1.0f) {
>>> + g0.negate = true;
>>> +  }
>>
>> Does this do what you want?  If g0.0 happened to be *all* zeroes, you're
>> still going to get 0 after negation, right?
>
> That's a good question. I'm not sure. The bits below it are
>
> 13: Source Depth Present to Render Target.
> 12: oMask to Render Target
> 11: Source0 Alpha Present to RenderTarget.
> 8:6: Starting Sample Pair Index
>
> BDW adds some additional fields as well.
>
> The others are "ignored". It's not clear to me that at least one of
> the defined bits is guaranteed to be zero. It's no guarantee or
> anything, but FWIW without realizing it we were depending on some bit
> being non-zero for the frontfacing optimizations that used ASR as well
> (commits d1c43ed4, 19c6617a) and haven't seen any problems from it. So
> if there's a problem... we're not making it worse in this commit...
>
> The simulator seems to set some bits in the ignored fields, but I
> don't have any explanation for that, nor is that evidence that we can
> rely on the hardware to do the same.
>
> Or maybe I'm just wrong and some bit is guaranteed to be set?

A "This negation looks like it's safe in practice, because bits 0:4 will
surely be TRIANGLES" comment seems fine with me.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] drivers/x11: add gallium include dirs to Makefile.am

2015-02-20 Thread Eric Anholt
Matt Turner  writes:

> On Fri, Feb 20, 2015 at 12:20 PM, Brian Paul  wrote:
>> Fixes xlib driver build after e8c5cbfd921680c.
>> ---
>>  src/mesa/drivers/x11/Makefile.am | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/src/mesa/drivers/x11/Makefile.am 
>> b/src/mesa/drivers/x11/Makefile.am
>> index 76a7cd4..c0596f8 100644
>> --- a/src/mesa/drivers/x11/Makefile.am
>> +++ b/src/mesa/drivers/x11/Makefile.am
>> @@ -30,6 +30,8 @@ AM_CPPFLAGS = \
>> -I$(top_srcdir)/src/mapi \
>> -I$(top_srcdir)/src/mesa \
>> -I$(top_srcdir)/src \
>> +   -I$(top_srcdir)/src/gallium/include \
>> +   -I$(top_srcdir)/src/gallium/auxiliary \
>
> I really dislike including these directories in a bunch of non-Gallium
> places. I should have said so more strongly when the patch went by.
>
> I'm not sure whether I want to spend my time to fix this the right way or 
> what.
>
> I guess just go ahead and commit this until something better comes along.

I'm sympathetic, it's just the feeling of "if I can't get incremental
changes towards sharing, we'll never get anywhere and we'll be stuck
with this code duplication we've had for the last 5 years".

p_compiler.h looks like something we should be able to get out into
util/, at which point we could drop these include dirs.  It's on my list
still, but I do really want to land my NIR work.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Have configure define NDEBUG, not mtypes.h.

2015-02-20 Thread Kenneth Graunke
On Friday, February 20, 2015 12:44:53 PM Matt Turner wrote:
> mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
> defined. Confusing and bizarre that you don't get NDEBUG if you don't
> include mtypes.h.
> 
> ... which is just what happened in commit bef38f62e.
> 
> Let's let configure define this for us if not using --enable-debug.
> ---
>  configure.ac   | 2 ++
>  src/mesa/main/mtypes.h | 3 ---
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index beb7a7d..5fbb7bc 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -370,6 +370,8 @@ if test "x$enable_debug" = xyes; then
>  CXXFLAGS="$CXXFLAGS -O0"
>  fi
>  fi
> +else
> +   DEFINES="$DEFINES -DNDEBUG"
>  fi
>  
>  dnl
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 05b5a81..6e99773 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -4512,9 +4512,6 @@ extern int MESA_DEBUG_FLAGS;
>  # define MESA_VERBOSE 0
>  # define MESA_DEBUG_FLAGS 0
>  # define MESA_FUNCTION "a function"
> -# ifndef NDEBUG
> -#  define NDEBUG
> -# endif
>  #endif
>  
>  
> 

It looks like scons/gallium.py already defines DEBUG and NDEBUG:

cppdefines = []
if env['build'] in ('debug', 'checked'):
cppdefines += ['DEBUG']
else:
cppdefines += ['NDEBUG']

So this may not even break the SCons build.  If it does, there's a
fairly clear solution there.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 1:41 PM, Eric Anholt  wrote:
>> Or maybe I'm just wrong and some bit is guaranteed to be set?
>
> A "This negation looks like it's safe in practice, because bits 0:4 will
> surely be TRIANGLES" comment seems fine with me.

Thanks, will do. R-b?

I realized I was looking at Vol4/Part1 which described the render
target write message header, which much the same description but not
the actual incoming thread dispatch payload. Vol2/Part1 has the info I
want.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 1:14 PM, Jordan Justen
 wrote:
> The brw_imm_ud will yield a HW_REG which then will introduce a barrier
> for certain optimization opportunities.
>
> No piglit regressions seen with gen8 (simd8vs).
>
> Suggested-by: Matt Turner 
> Signed-off-by: Jordan Justen 
> Cc: Matt Turner 

By the way, there are some more in gen6_gs_visitor.cpp if you'd like
to clean those up as well.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/6] i965/skl: Texture layout fixes

2015-02-20 Thread Neil Roberts
Here is a v2 of my patch series to fix 1D textures on Skylake. It's
now turned into just some general fixes and also helps with 3D
textures. It fixes 110 piglit tests but sadly seems to cause 3
regressions. It might not be worth landing until I can work out what
the regressions are so I guess I'm just posting it now in case
anyone's interested to look at it anyway.

I've uploaded the piglit html summary of the changes here:

http://busydoingnothing.co.uk/skl-qpitch-patches/changes.html

- Neil

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >