Re: [Mesa-dev] [PATCH v5 25/70] glsl: Add std430 related member functions to glsl_type class

2015-09-16 Thread Ilia Mirkin
On Wed, Sep 16, 2015 at 1:14 AM, Samuel Iglesias Gonsálvez
 wrote:
>
>
> On 15/09/15 21:03, Jordan Justen wrote:
>> On 2015-09-10 22:48:55, Samuel Iglesias Gonsálvez wrote:
>>> On 10/09/15 20:13, Jordan Justen wrote:
>>>> On 2015-09-10 06:35:41, Iago Toral Quiroga wrote:
> From: Samuel Iglesias Gonsalvez 
>
> They are used to calculate size, base alignment and array stride values
> for a glsl_type following std430 rules.
>
> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/glsl/glsl_types.cpp | 209 
> 
>  src/glsl/glsl_types.h   |  19 +
>  2 files changed, 228 insertions(+)
>
> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
> index 755618a..d97991a 100644
> --- a/src/glsl/glsl_types.cpp
> +++ b/src/glsl/glsl_types.cpp
> @@ -1357,6 +1357,215 @@ glsl_type::std140_size(bool row_major) const
> return -1;
>  }
>
> +unsigned
> +glsl_type::std430_base_alignment(bool row_major) const
> +{
> +
> +   unsigned N = is_double() ? 8 : 4;
> +
> +   /* (1) If the member is a scalar consuming N basic machine units, the
> +* base alignment is N.
> +*
> +* (2) If the member is a two- or four-component vector with components
> +* consuming N basic machine units, the base alignment is 2N or
> +* 4N, respectively.
> +*
> +* (3) If the member is a three-component vector with components consuming
> +* N basic machine units, the base alignment is 4N.
> +*/
> +   if (this->is_scalar() || this->is_vector()) {
> +  switch (this->vector_elements) {
> +  case 1:
> + return N;
> +  case 2:
> + return 2 * N;
> +  case 3:
> +  case 4:
> + return 4 * N;
> +  }
> +   }
> +
> +   /* OpenGL 4.30 spec, section 7.6.2.2 "Standard Uniform Block Layout":
> +*
> +* "When using the "std430" storage layout, shader storage
> +* blocks will be laid out in buffer storage identically to uniform 
> and
> +* shader storage blocks using the "std140" layout, except that the 
> base
> +* alignment of arrays of scalars and vectors in rule (4) and of 
> structures

>>>> Looking at the 4.3 spec (and 4.5), it actually adds "and stride"
>>>> following "base alignment". The extension spec *does not* have the
>>>> "and stride" text.

>>>
>>> OK. If you agree, I will keep OpenGL 4.3 (and later) spec wording in all
>>> the places where this snippet is pasted.
>>>
>>>> This seems to be an inconsistency between the extension spec and the
>>>> actual spec, but the OpenGL spec form would produce more tightly
>>>> packed arrays.
>>>>
>>>> Maybe we want to confirm what another implementation does?
>>>
>>> Both NVIDIA and ATI proprietary drivers don't round up the stride of
>>> arrays of vectors to a multiple of a vec4 size, i.e., they are following
>>> the OpenGL spec. For example: for an array of vec2, they are returning
>>> a stride value of 8, not 16 as in std140.
>>
>> Well, my concern was that the 'and stride' part might mean that vec3
>> array stride should be 12 rather than 16. But, I tested NVidia, and
>> they seem to use a stride of 16 for a vec3 array. So, I think your
>> interpretation is correct.
>>
>> I still say we could still use an update to idr's ubo-lolz branch to
>> handle ssbo and std430, but this would also involve extending shader
>> runner to better support ssbo.
>>
>
> I have already done that work. I have a ubo-lolz modified branch [0]
> with an initial support of SSBOs and std430.
>
> About ssbo support for shader_runner, I have sent a couple of patches to
> piglit [1] and I plan to send a new version of them today with a generic
> approach (so it is not only for SSBOs but for other interface types
> defined in ARB_program_interface_query extension).
>
> FWIW, I executed [0] with no errors during 15 minutes.

By way of validation, have you tried running your modified script
against any other drivers? They may well have bugs in them as well,
but it should be possible to determine if the bug is in the script or
the other impl, should they not match up.
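
For reference, the array strides discussed in the quoted text can be sketched as follows. This is a minimal, standalone illustration rather than the Mesa implementation: vec_base_alignment, align_up and array_stride are hypothetical helpers, and only arrays of float scalars and vectors are covered.

```cpp
#include <cassert>

// Hypothetical helper (not Mesa code): base alignment in bytes of a float
// scalar or vector with the given number of components, per rules (1)-(3).
static unsigned vec_base_alignment(unsigned components)
{
   const unsigned N = 4; // bytes per float component
   switch (components) {
   case 1:
      return N;
   case 2:
      return 2 * N;
   default:
      return 4 * N; // vec3 and vec4 both align to 4N
   }
}

// Round value up to the next multiple of alignment.
static unsigned align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

// Array stride in bytes for an array of float scalars or vectors. std140
// rounds the element alignment up to a vec4 (16 bytes); std430 does not.
static unsigned array_stride(unsigned components, bool is_std430)
{
   const unsigned stride = vec_base_alignment(components);
   return is_std430 ? stride : align_up(stride, 16);
}
```

Under these assumptions, array_stride(2, true) gives 8 and array_stride(2, false) gives 16, while a vec3 array gets a stride of 16 in both layouts, which matches the values reported for the proprietary drivers in the quoted discussion.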

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v5 25/70] glsl: Add std430 related member functions to glsl_type class

2015-09-16 Thread Samuel Iglesias Gonsálvez


On 16/09/15 09:11, Ilia Mirkin wrote:
> On Wed, Sep 16, 2015 at 1:14 AM, Samuel Iglesias Gonsálvez
>  wrote:
>>
>>
>> On 15/09/15 21:03, Jordan Justen wrote:
>>> On 2015-09-10 22:48:55, Samuel Iglesias Gonsálvez wrote:
>>>> On 10/09/15 20:13, Jordan Justen wrote:
> On 2015-09-10 06:35:41, Iago Toral Quiroga wrote:
>> From: Samuel Iglesias Gonsalvez 
>>
>> They are used to calculate size, base alignment and array stride values
>> for a glsl_type following std430 rules.
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> ---
>>  src/glsl/glsl_types.cpp | 209 
>> 
>>  src/glsl/glsl_types.h   |  19 +
>>  2 files changed, 228 insertions(+)
>>
>> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
>> index 755618a..d97991a 100644
>> --- a/src/glsl/glsl_types.cpp
>> +++ b/src/glsl/glsl_types.cpp
>> @@ -1357,6 +1357,215 @@ glsl_type::std140_size(bool row_major) const
>> return -1;
>>  }
>>
>> +unsigned
>> +glsl_type::std430_base_alignment(bool row_major) const
>> +{
>> +
>> +   unsigned N = is_double() ? 8 : 4;
>> +
>> +   /* (1) If the member is a scalar consuming N basic machine units, the
>> +* base alignment is N.
>> +*
>> +* (2) If the member is a two- or four-component vector with components
>> +* consuming N basic machine units, the base alignment is 2N or
>> +* 4N, respectively.
>> +*
>> +* (3) If the member is a three-component vector with components consuming
>> +* N basic machine units, the base alignment is 4N.
>> +*/
>> +   if (this->is_scalar() || this->is_vector()) {
>> +  switch (this->vector_elements) {
>> +  case 1:
>> + return N;
>> +  case 2:
>> + return 2 * N;
>> +  case 3:
>> +  case 4:
>> + return 4 * N;
>> +  }
>> +   }
>> +
>> +   /* OpenGL 4.30 spec, section 7.6.2.2 "Standard Uniform Block Layout":
>> +*
>> +* "When using the "std430" storage layout, shader storage
>> +* blocks will be laid out in buffer storage identically to uniform 
>> and
>> +* shader storage blocks using the "std140" layout, except that the 
>> base
>> +* alignment of arrays of scalars and vectors in rule (4) and of 
>> structures
>
> Looking at the 4.3 spec (and 4.5), it actually adds "and stride"
> following "base alignment". The extension spec *does not* have the
> "and stride" text.
>

>>>> OK. If you agree, I will keep OpenGL 4.3 (and later) spec wording in all
>>>> the places where this snippet is pasted.

> This seems to be an inconsistency between the extension spec and the
> actual spec, but the OpenGL spec form would produce more tightly
> packed arrays.
>
> Maybe we want to confirm what another implementation does?

>>>> Both NVIDIA and ATI proprietary drivers don't round up the stride of
>>>> arrays of vectors to a multiple of a vec4 size, i.e., they are following
>>>> the OpenGL spec. For example: for an array of vec2, they are returning
>>>> a stride value of 8, not 16 as in std140.
>>>
>>> Well, my concern was that the 'and stride' part might mean that vec3
>>> array stride should be 12 rather than 16. But, I tested NVidia, and
>>> they seem to use a stride of 16 for a vec3 array. So, I think your
>>> interpretation is correct.
>>>
>>> I still say we could still use an update to idr's ubo-lolz branch to
>>> handle ssbo and std430, but this would also involve extending shader
>>> runner to better support ssbo.
>>>
>>
>> I have already done that work. I have a ubo-lolz modified branch [0]
>> with an initial support of SSBOs and std430.
>>
>> About ssbo support for shader_runner, I have sent a couple of patches to
>> piglit [1] and I plan to send a new version of them today with a generic
>> approach (so it is not only for SSBOs but for other interface types
>> defined in ARB_program_interface_query extension).
>>
>> FWIW, I executed [0] with no errors during 15 minutes.
> 
> As way of validation, have you tried running your modified script
> against any other drivers? They may well have bugs in them as well,
> but it should be possible to determine if the bug is in the script or
> the other impl, should they not match up.
> 
>   -ilia
> 

I tested it on NVIDIA proprietary driver version 352.21. It has an issue
when querying shader storage block members that are arrays of structs:
elements with a nonzero index are not reported as active. For example:

struct B {
vec4 a[2];
};

layout(std430) buffer Block {
B[2] s;
vec3 v;
};

NVIDIA marked v, s[0].a[0] and s[0].a[1] as active but s[1].a[0] and
s[1].a[1] as inactive even though they are referenced in main().
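
For reference, here is a minimal sketch of the byte offsets that the std430 rules quoted earlier in the thread would give for every s[i].a[j] element and for v in the block above. It is a hand-written illustration, not Mesa or piglit code: block_offsets, align_up and compute_block_offsets are hypothetical names, and the sizes are hard-coded for this particular block.

```cpp
#include <cassert>

// Byte offsets of the members of Block (hypothetical container, used only
// for this illustration).
struct block_offsets {
   unsigned s_a[2][2]; // s[i].a[j]
   unsigned v;
};

// Round value up to the next multiple of alignment.
static unsigned align_up(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) / alignment * alignment;
}

static block_offsets compute_block_offsets()
{
   const unsigned vec4_size = 16;            // four floats
   const unsigned vec4_align = 16;           // rule (2)
   const unsigned vec3_align = 16;           // rule (3)
   const unsigned a_stride = vec4_size;      // stride of "vec4 a[2]"
   const unsigned struct_align = vec4_align; // largest member alignment in B
   const unsigned struct_size = align_up(2 * a_stride, struct_align);

   block_offsets out;
   for (unsigned i = 0; i < 2; i++)
      for (unsigned j = 0; j < 2; j++)
         out.s_a[i][j] = i * struct_size + j * a_stride;

   // v follows the two array elements, aligned as a three-component vector.
   out.v = align_up(2 * struct_size, vec3_align);
   return out;
}
```

With these assumptions the four vec4 elements land at offsets 0, 16, 32 and 48, and v at offset 64, so every member has a distinct, well-defined location that a query could report.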

I have not checked y

Re: [Mesa-dev] [PATCH v5 28/70] glsl: add std430 interface packing support to ssbo related operations

2015-09-16 Thread Jordan Justen
On 2015-09-10 06:35:44, Iago Toral Quiroga wrote:
> From: Samuel Iglesias Gonsalvez 
> 
> v2:
> - Get interface packing information from interface's type, not the variable 
> type.
> - Simplify is_std430 condition in emit_access() for readability (Jordan)
> - Add a comment explaining why array of three-component vector case is 
> different

Lines a bit long.

>   in std430 than the rest of cases.
> - Add calls to std430_array_stride().
> 
> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/glsl/lower_ubo_reference.cpp | 102 
> ++-
>  1 file changed, 78 insertions(+), 24 deletions(-)
> 
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index 8694383..7e45a26 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -147,7 +147,8 @@ public:
>  ir_rvalue **offset,
>  unsigned *const_offset,
>  bool *row_major,
> -int *matrix_columns);
> +int *matrix_columns,
> +unsigned packing);
> ir_expression *ubo_load(const struct glsl_type *type,
>ir_rvalue *offset);
> ir_call *ssbo_load(const struct glsl_type *type,
> @@ -164,7 +165,7 @@ public:
> void emit_access(bool is_write, ir_dereference *deref,
>  ir_variable *base_offset, unsigned int deref_offset,
>  bool row_major, int matrix_columns,
> -unsigned write_mask);
> +bool is_std430, unsigned write_mask);
>  
> ir_visitor_status visit_enter(class ir_expression *);
> ir_expression *calculate_ssbo_unsized_array_length(ir_expression *expr);
> @@ -176,7 +177,8 @@ public:
>  ir_variable *);
> ir_expression *emit_ssbo_get_buffer_size();
>  
> -   unsigned calculate_unsized_array_stride(ir_dereference *deref);
> +   unsigned calculate_unsized_array_stride(ir_dereference *deref,
> +   unsigned packing);
>  
> void *mem_ctx;
> struct gl_shader *shader;
> @@ -257,7 +259,8 @@ 
> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>   ir_rvalue **offset,
>   unsigned *const_offset,
>   bool *row_major,
> - int *matrix_columns)
> + int *matrix_columns,
> + unsigned packing)
>  {
> /* Determine the name of the interface block */
> ir_rvalue *nonconst_block_index;
> @@ -343,8 +346,15 @@ 
> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>  const bool array_row_major =
> is_dereferenced_thing_row_major(deref_array);
>  
> -array_stride = deref_array->type->std140_size(array_row_major);
> -array_stride = glsl_align(array_stride, 16);
> +/* The array type will give the correct interface packing
> + * information
> + */
> +if (packing == GLSL_INTERFACE_PACKING_STD430) {
> +   array_stride = 
> deref_array->type->std430_array_stride(array_row_major);
> +} else {
> +   array_stride = 
> deref_array->type->std140_size(array_row_major);
> +   array_stride = glsl_align(array_stride, 16);
> +}
>   }
>  
>   ir_rvalue *array_index = deref_array->array_index;
> @@ -380,7 +390,12 @@ 
> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>  
>  ralloc_free(field_deref);
>  
> -unsigned field_align = 
> type->std140_base_alignment(field_row_major);
> +unsigned field_align = 0;
> +
> +if (packing == GLSL_INTERFACE_PACKING_STD430)
> +   field_align = type->std430_base_alignment(field_row_major);
> +else
> +   field_align = type->std140_base_alignment(field_row_major);
>  
>  intra_struct_offset = glsl_align(intra_struct_offset, 
> field_align);
>  
> @@ -388,7 +403,10 @@ 
> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> deref_record->field) == 0)
> break;
>  
> -intra_struct_offset += type->std140_size(field_row_major);
> +if (packing == GLSL_INTERFACE_PACKING_STD430)
> +   intra_struct_offset += type->std430_size(field_row_major);
> +else
> +   intra_struct_offset += type->std140_size(field_row_major);
>  
>  /* If the field just examined was itself a structure, apply rule
>   * #9

Re: [Mesa-dev] [PATCH v5 28/70] glsl: add std430 interface packing support to ssbo related operations

2015-09-16 Thread Samuel Iglesias Gonsálvez


On 16/09/15 09:46, Jordan Justen wrote:
> On 2015-09-10 06:35:44, Iago Toral Quiroga wrote:
>> From: Samuel Iglesias Gonsalvez 
>>
>> v2:
>> - Get interface packing information from interface's type, not the variable 
>> type.
>> - Simplify is_std430 condition in emit_access() for readability (Jordan)
>> - Add a comment explaining why array of three-component vector case is 
>> different
> 
> Lines a bit long.
> 

OK, I will fix it.

>>   in std430 than the rest of cases.
>> - Add calls to std430_array_stride().
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> ---
>>  src/glsl/lower_ubo_reference.cpp | 102 
>> ++-
>>  1 file changed, 78 insertions(+), 24 deletions(-)
>>
>> diff --git a/src/glsl/lower_ubo_reference.cpp 
>> b/src/glsl/lower_ubo_reference.cpp
>> index 8694383..7e45a26 100644
>> --- a/src/glsl/lower_ubo_reference.cpp
>> +++ b/src/glsl/lower_ubo_reference.cpp
>> @@ -147,7 +147,8 @@ public:
>>  ir_rvalue **offset,
>>  unsigned *const_offset,
>>  bool *row_major,
>> -int *matrix_columns);
>> +int *matrix_columns,
>> +unsigned packing);
>> ir_expression *ubo_load(const struct glsl_type *type,
>>ir_rvalue *offset);
>> ir_call *ssbo_load(const struct glsl_type *type,
>> @@ -164,7 +165,7 @@ public:
>> void emit_access(bool is_write, ir_dereference *deref,
>>  ir_variable *base_offset, unsigned int deref_offset,
>>  bool row_major, int matrix_columns,
>> -unsigned write_mask);
>> +bool is_std430, unsigned write_mask);
>>  
>> ir_visitor_status visit_enter(class ir_expression *);
>> ir_expression *calculate_ssbo_unsized_array_length(ir_expression *expr);
>> @@ -176,7 +177,8 @@ public:
>>  ir_variable *);
>> ir_expression *emit_ssbo_get_buffer_size();
>>  
>> -   unsigned calculate_unsized_array_stride(ir_dereference *deref);
>> +   unsigned calculate_unsized_array_stride(ir_dereference *deref,
>> +   unsigned packing);
>>  
>> void *mem_ctx;
>> struct gl_shader *shader;
>> @@ -257,7 +259,8 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>>   ir_rvalue **offset,
>>   unsigned *const_offset,
>>   bool *row_major,
>> - int *matrix_columns)
>> + int *matrix_columns,
>> + unsigned packing)
>>  {
>> /* Determine the name of the interface block */
>> ir_rvalue *nonconst_block_index;
>> @@ -343,8 +346,15 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>>  const bool array_row_major =
>> is_dereferenced_thing_row_major(deref_array);
>>  
>> -array_stride = deref_array->type->std140_size(array_row_major);
>> -array_stride = glsl_align(array_stride, 16);
>> +/* The array type will give the correct interface packing
>> + * information
>> + */
>> +if (packing == GLSL_INTERFACE_PACKING_STD430) {
>> +   array_stride = 
>> deref_array->type->std430_array_stride(array_row_major);
>> +} else {
>> +   array_stride = 
>> deref_array->type->std140_size(array_row_major);
>> +   array_stride = glsl_align(array_stride, 16);
>> +}
>>   }
>>  
>>   ir_rvalue *array_index = deref_array->array_index;
>> @@ -380,7 +390,12 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>>  
>>  ralloc_free(field_deref);
>>  
>> -unsigned field_align = 
>> type->std140_base_alignment(field_row_major);
>> +unsigned field_align = 0;
>> +
>> +if (packing == GLSL_INTERFACE_PACKING_STD430)
>> +   field_align = type->std430_base_alignment(field_row_major);
>> +else
>> +   field_align = type->std140_base_alignment(field_row_major);
>>  
>>  intra_struct_offset = glsl_align(intra_struct_offset, 
>> field_align);
>>  
>> @@ -388,7 +403,10 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>> deref_record->field) == 0)
>> break;
>>  
>> -intra_struct_offset += type->std140_size(field_row_major);
>> +if (packing == GLSL_INTERFACE_PACKING_STD430)
>> +   intra_struct_offset += type->std430_size(field_row_major);
>> +else
>> +

Re: [Mesa-dev] [PATCH v5 25/70] glsl: Add std430 related member functions to glsl_type class

2015-09-16 Thread Ilia Mirkin
On Wed, Sep 16, 2015 at 3:45 AM, Samuel Iglesias Gonsálvez
 wrote:
>
>
> On 16/09/15 09:11, Ilia Mirkin wrote:
>> On Wed, Sep 16, 2015 at 1:14 AM, Samuel Iglesias Gonsálvez
>>  wrote:
>>>
>>>
>>> On 15/09/15 21:03, Jordan Justen wrote:
>>>> On 2015-09-10 22:48:55, Samuel Iglesias Gonsálvez wrote:
> On 10/09/15 20:13, Jordan Justen wrote:
>> On 2015-09-10 06:35:41, Iago Toral Quiroga wrote:
>>> From: Samuel Iglesias Gonsalvez 
>>>
>>> They are used to calculate size, base alignment and array stride values
>>> for a glsl_type following std430 rules.
>>>
>>> Signed-off-by: Samuel Iglesias Gonsalvez 
>>> ---
>>>  src/glsl/glsl_types.cpp | 209 
>>> 
>>>  src/glsl/glsl_types.h   |  19 +
>>>  2 files changed, 228 insertions(+)
>>>
>>> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
>>> index 755618a..d97991a 100644
>>> --- a/src/glsl/glsl_types.cpp
>>> +++ b/src/glsl/glsl_types.cpp
>>> @@ -1357,6 +1357,215 @@ glsl_type::std140_size(bool row_major) const
>>> return -1;
>>>  }
>>>
>>> +unsigned
>>> +glsl_type::std430_base_alignment(bool row_major) const
>>> +{
>>> +
>>> +   unsigned N = is_double() ? 8 : 4;
>>> +
>>> +   /* (1) If the member is a scalar consuming N basic machine units, the
>>> +* base alignment is N.
>>> +*
>>> +* (2) If the member is a two- or four-component vector with components
>>> +* consuming N basic machine units, the base alignment is 2N or
>>> +* 4N, respectively.
>>> +*
>>> +* (3) If the member is a three-component vector with components consuming
>>> +* N basic machine units, the base alignment is 4N.
>>> +*/
>>> +   if (this->is_scalar() || this->is_vector()) {
>>> +  switch (this->vector_elements) {
>>> +  case 1:
>>> + return N;
>>> +  case 2:
>>> + return 2 * N;
>>> +  case 3:
>>> +  case 4:
>>> + return 4 * N;
>>> +  }
>>> +   }
>>> +
>>> +   /* OpenGL 4.30 spec, section 7.6.2.2 "Standard Uniform Block 
>>> Layout":
>>> +*
>>> +* "When using the "std430" storage layout, shader storage
>>> +* blocks will be laid out in buffer storage identically to uniform 
>>> and
>>> +* shader storage blocks using the "std140" layout, except that the 
>>> base
>>> +* alignment of arrays of scalars and vectors in rule (4) and of 
>>> structures
>>
>> Looking at the 4.3 spec (and 4.5), it actually adds "and stride"
>> following "base alignment". The extension spec *does not* have the
>> "and stride" text.
>>
>
> OK. If you agree, I will keep OpenGL 4.3 (and later) spec wording in all
> the places where this snippet is pasted.
>
>> This seems to be an inconsistency between the extension spec and the
>> actual spec, but the OpenGL spec form would produce more tightly
>> packed arrays.
>>
>> Maybe we want to confirm what another implementation does?
>
> Both NVIDIA and ATI proprietary drivers don't round up the stride of
> arrays of vectors to a multiple of a vec4 size, i.e., they are following
> the OpenGL spec. For example: for an array of vec2, they are returning
> an stride value of 8, not 16 as in std140.

>>>> Well, my concern was that the 'and stride' part might mean that vec3
>>>> array stride should be 12 rather than 16. But, I tested NVidia, and
>>>> they seem to use a stride of 16 for a vec3 array. So, I think your
>>>> interpretation is correct.
>>>>
>>>> I still say we could still use an update to idr's ubo-lolz branch to
>>>> handle ssbo and std430, but this would also involve extending shader
>>>> runner to better support ssbo.

>>>
>>> I have already done that work. I have a ubo-lolz modified branch [0]
>>> with an initial support of SSBOs and std430.
>>>
>>> About ssbo support for shader_runner, I have sent a couple of patches to
>>> piglit [1] and I plan to send a new version of them today with a generic
>>> approach (so it is not only for SSBOs but for other interface types
>>> defined in ARB_program_interface_query extension).
>>>
>>> FWIW, I executed [0] with no errors during 15 minutes.
>>
>> As way of validation, have you tried running your modified script
>> against any other drivers? They may well have bugs in them as well,
>> but it should be possible to determine if the bug is in the script or
>> the other impl, should they not match up.
>>
>>   -ilia
>>
>
> I tested it on NVIDIA proprietary driver version 352.21. It has an issue
> when we query shader storage block members when they are arrays of
> structs and the index is different than zero -> it doesn't find them as
> active. For example:
>
> struct B {
> vec4 a[2];
> }
>
> layout(

Re: [Mesa-dev] [PATCH v5 25/70] glsl: Add std430 related member functions to glsl_type class

2015-09-16 Thread Samuel Iglesias Gonsálvez


On 16/09/15 10:39, Ilia Mirkin wrote:
> On Wed, Sep 16, 2015 at 3:45 AM, Samuel Iglesias Gonsálvez
>  wrote:
>>
>>
>> On 16/09/15 09:11, Ilia Mirkin wrote:
>>> On Wed, Sep 16, 2015 at 1:14 AM, Samuel Iglesias Gonsálvez
>>>  wrote:


>>>> On 15/09/15 21:03, Jordan Justen wrote:
> On 2015-09-10 22:48:55, Samuel Iglesias Gonsálvez wrote:
>> On 10/09/15 20:13, Jordan Justen wrote:
>>> On 2015-09-10 06:35:41, Iago Toral Quiroga wrote:
 From: Samuel Iglesias Gonsalvez 

 They are used to calculate size, base alignment and array stride values
 for a glsl_type following std430 rules.

 Signed-off-by: Samuel Iglesias Gonsalvez 
 ---
  src/glsl/glsl_types.cpp | 209 
 
  src/glsl/glsl_types.h   |  19 +
  2 files changed, 228 insertions(+)

 diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
 index 755618a..d97991a 100644
 --- a/src/glsl/glsl_types.cpp
 +++ b/src/glsl/glsl_types.cpp
 @@ -1357,6 +1357,215 @@ glsl_type::std140_size(bool row_major) const
 return -1;
  }

 +unsigned
 +glsl_type::std430_base_alignment(bool row_major) const
 +{
 +
 +   unsigned N = is_double() ? 8 : 4;
 +
 +   /* (1) If the member is a scalar consuming N basic machine units, the
 +* base alignment is N.
 +*
 +* (2) If the member is a two- or four-component vector with components
 +* consuming N basic machine units, the base alignment is 2N or
 +* 4N, respectively.
 +*
 +* (3) If the member is a three-component vector with components consuming
 +* N basic machine units, the base alignment is 4N.
 +*/
 +   if (this->is_scalar() || this->is_vector()) {
 +  switch (this->vector_elements) {
 +  case 1:
 + return N;
 +  case 2:
 + return 2 * N;
 +  case 3:
 +  case 4:
 + return 4 * N;
 +  }
 +   }
 +
 +   /* OpenGL 4.30 spec, section 7.6.2.2 "Standard Uniform Block 
 Layout":
 +*
 +* "When using the "std430" storage layout, shader storage
 +* blocks will be laid out in buffer storage identically to 
 uniform and
 +* shader storage blocks using the "std140" layout, except that 
 the base
 +* alignment of arrays of scalars and vectors in rule (4) and of 
 structures
>>>
>>> Looking at the 4.3 spec (and 4.5), it actually adds "and stride"
>>> following "base alignment". The extension spec *does not* have the
>>> "and stride" text.
>>>
>>
>> OK. If you agree, I will keep OpenGL 4.3 (and later) spec wording in all
>> the places where this snippet is pasted.
>>
>>> This seems to be an inconsistency between the extension spec and the
>>> actual spec, but the OpenGL spec form would produce more tightly
>>> packed arrays.
>>>
>>> Maybe we want to confirm what another implementation does?
>>
>> Both NVIDIA and ATI proprietary drivers don't round up the stride of
>> arrays of vectors to a multiple of a vec4 size, i.e., they are following
>> the OpenGL spec. For example: for an array of vec2, they are returning
>> an stride value of 8, not 16 as in std140.
>
> Well, my concern was that the 'and stride' part might mean that vec3
> array stride should be 12 rather than 16. But, I tested NVidia, and
> they seem to use a stride of 16 for a vec3 array. So, I think your
> interpretation is correct.
>
> I still say we could still use an update to idr's ubo-lolz branch to
> handle ssbo and std430, but this would also involve extending shader
> runner to better support ssbo.
>

>>>> I have already done that work. I have a ubo-lolz modified branch [0]
>>>> with an initial support of SSBOs and std430.
>>>>
>>>> About ssbo support for shader_runner, I have sent a couple of patches to
>>>> piglit [1] and I plan to send a new version of them today with a generic
>>>> approach (so it is not only for SSBOs but for other interface types
>>>> defined in ARB_program_interface_query extension).
>>>>
>>>> FWIW, I executed [0] with no errors during 15 minutes.
>>>
>>> As way of validation, have you tried running your modified script
>>> against any other drivers? They may well have bugs in them as well,
>>> but it should be possible to determine if the bug is in the script or
>>> the other impl, should they not match up.
>>>
>>>   -ilia
>>>
>>
>> I tested it on NVIDIA proprietary driver version 352.21. It has an issue
>> when we query shader storage block members when 

[Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6

2015-09-16 Thread Iago Toral Quiroga
It seems that we have some bugs where we fail to compile shaders in gen6
because we do not have enough MRF registers available (see bugs 86469 and
90631 for example). That triggered some discussion about the fact that SNB
might actually have 24 MRF registers available, but since the docs were not
very clear about this, it was suggested that it would be nice to experiment
and see if that was the case.

This series of patches implements such a test: basically, it turns our fixed
BRW_MAX_MRF into a macro that accepts the hardware generation and then changes
the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for
gen6 (something similar can be done for the vec4 code, I just did not do it
yet).
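
Roughly, the idea can be sketched as below. FIRST_SPILL_MRF follows the macro added in patch 3/3 (with the argument parenthesized here); the BRW_MAX_MRF values (24 on gen6, 16 otherwise) are an assumption based on the description above, and spill_base_mrf is a hypothetical helper written for this illustration, not a function in the driver.

```cpp
#include <cassert>

// Assumed shape of the per-generation limit: 24 MRFs on gen6, 16 elsewhere.
// The actual definition from patch 2/3 may differ.
#define BRW_MAX_MRF(gen) ((gen) == 6 ? 24 : 16)

// First of the three MRFs reserved for spilling: m21 on gen6, m13 otherwise.
#define FIRST_SPILL_MRF(gen) ((gen) == 6 ? 21 : 13)

// Hypothetical helper mirroring the selection done in spill_reg(): SIMD16
// starts at the first spill MRF, SIMD8 at the one after it.
static int spill_base_mrf(int gen, int dispatch_width)
{
   return dispatch_width > 8 ? FIRST_SPILL_MRF(gen)
                             : FIRST_SPILL_MRF(gen) + 1;
}
```

With these values the three spill registers always sit at the very top of the MRF space (m13-m15 or m21-m23), so on gen6 the registers below m21 remain free for regular messages.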

The good news is that this seems to work fine, at least I can do a full piglit
run without issues in SNB. In fact, this seems to help a lot of tests when I
force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs):

Using MRF registers 13-15 for spilling:
crash: 5, fail: 267, pass: 15853, skip: 11679, warn: 3

Using MRF registers 21-23 for spilling:
crash: 5, fail: 140, pass: 15980, skip: 11679, warn: 3

As you can see, the number of failures drops by almost 50% (from 267 to 140)...

The bad news is that, currently, we assert that MRF registers are within the
supported range in brw_reg.h. This works fine now because the limit does not
depend on the hardware generation, but these patches change that, of course.
The natural way to fix this would be to pass a generation argument to
all brw_reg functions that can create a brw_reg, but I imagine that we don't
want to do that only for this, right? In that case, if we want to keep the
asserts (I think we do), we need a way around that limitation. The first
patch in this series tries to move the asserts to the generator, but that won't
cover things like blorp and other modules that can emit code directly, so we
would lose the assert checks for those. Of course, we could add individual
asserts for these as needed, but it is not ideal. Alternatively, we could add
a function wrapper around brw_message_reg that has the assert and use that
version of the function from these places. In that case, this wrapper might not
need to take the generation number as a parameter and could just check
against 16 as the limit, since we really only use MRF registers
beyond 16 for spilling, and we only handle spilling in code paths that end
up going through the generator.
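
A minimal sketch of that wrapper alternative, under stated assumptions: fake_reg stands in for struct brw_reg, message_reg for the unchecked brw_message_reg left by patch 1/3, and checked_message_reg, mrf_in_range and the constants are hypothetical names that do not exist in the driver. The bit-7 mask is copied from the existing asserts.

```cpp
#include <cassert>

// Simplified stand-in for struct brw_reg; only the fields needed here.
struct fake_reg {
   unsigned file;
   unsigned nr;
};

constexpr unsigned MESSAGE_REGISTER_FILE = 2; // placeholder value
constexpr unsigned MRF_FLAG_BIT = 1u << 7;    // bit masked out by the asserts
constexpr unsigned NON_SPILL_MRF_LIMIT = 16;  // fixed, generation-independent

// Unchecked constructor, as brw_message_reg() would be without its assert.
static fake_reg message_reg(unsigned nr)
{
   return fake_reg{MESSAGE_REGISTER_FILE, nr};
}

// True if the register number, ignoring the flag bit, is below the limit.
static bool mrf_in_range(unsigned nr)
{
   return (nr & ~MRF_FLAG_BIT) < NON_SPILL_MRF_LIMIT;
}

// Checked wrapper for callers that bypass the generator (e.g. blorp).
static fake_reg checked_message_reg(unsigned nr)
{
   assert(mrf_in_range(nr));
   return message_reg(nr);
}
```

Callers that emit code directly would use checked_message_reg, while the spilling paths, which may go beyond m15 on gen6, would keep relying on the generation-aware asserts in the generator.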

Or maybe we think this is just not worth it if it only helps gen6...

What do you think?

Iago Toral Quiroga (3):
  i965: Move MRF register asserts to the generator
  i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
  i965/fs: Use MRF registers 21-23 for spilling on gen6

 src/mesa/drivers/dri/i965/brw_eu_emit.c|  2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 --
 src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
 src/mesa/drivers/dri/i965/brw_reg.h|  5 +---
 .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  9 +---
 8 files changed, 37 insertions(+), 30 deletions(-)

-- 
1.9.1



[Mesa-dev] [RFC 3/3] i965/fs: Use MRF registers 21-23 for spilling on gen6

2015-09-16 Thread Iago Toral Quiroga
---
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index 21fb3de..6900cee 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
@@ -30,6 +30,8 @@
 #include "glsl/glsl_types.h"
 #include "glsl/ir_optimization.h"
 
+#define FIRST_SPILL_MRF(gen) (gen == 6 ? 21 : 13)
+
 using namespace brw;
 
 static void
@@ -727,7 +729,7 @@ fs_visitor::emit_unspill(bblock_t *block, fs_inst *inst, 
fs_reg dst,
   unspill_inst->regs_written = reg_size;
 
   if (!gen7_read) {
- unspill_inst->base_mrf = 14;
+ unspill_inst->base_mrf = FIRST_SPILL_MRF(devinfo->gen) + 1;
  unspill_inst->mlen = 1; /* header contains offset */
   }
 
@@ -741,9 +743,9 @@ fs_visitor::emit_spill(bblock_t *block, fs_inst *inst, 
fs_reg src,
uint32_t spill_offset, int count)
 {
int reg_size = 1;
-   int spill_base_mrf = 14;
+   int spill_base_mrf = FIRST_SPILL_MRF(devinfo->gen) + 1;
if (dispatch_width == 16 && count % 2 == 0) {
-  spill_base_mrf = 13;
+  spill_base_mrf = FIRST_SPILL_MRF(devinfo->gen);
   reg_size = 2;
}
 
@@ -843,7 +845,8 @@ fs_visitor::spill_reg(int spill_reg)
int size = alloc.sizes[spill_reg];
unsigned int spill_offset = last_scratch;
assert(ALIGN(spill_offset, 16) == spill_offset); /* oword read/write req. */
-   int spill_base_mrf = dispatch_width > 8 ? 13 : 14;
+   int spill_base_mrf = dispatch_width > 8 ? FIRST_SPILL_MRF(devinfo->gen) :
+ FIRST_SPILL_MRF(devinfo->gen) + 1;
 
/* Spills may use MRFs 13-15 in the SIMD16 case.  Our texturing is done
 * using up to 11 MRFs starting from either m1 or m2, and fb writes can use
-- 
1.9.1



[Mesa-dev] [RFC 1/3] i965: Move MRF register asserts to the generator

2015-09-16 Thread Iago Toral Quiroga
In a later patch we will make BRW_MAX_MRF return a different value depending
on the hardware generation, but it is inconvenient to add a gen parameter
to the brw_reg functions only for the assertions, so move them to the generator
where checking for this is easier.

FIXME: we would still need to add asserts manually in some places that call
brw_message_reg or create message regs with other brw_reg functions.
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 6 +-
 src/mesa/drivers/dri/i965/brw_reg.h  | 3 ---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 3 +++
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 90805e4..d770c42 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -53,8 +53,10 @@ brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg)
struct brw_reg brw_reg;
 
switch (reg->file) {
-   case GRF:
case MRF:
+  assert((reg->reg & ~(1 << 7)) < BRW_MAX_MRF);
+  /* Fallthrough */
+   case GRF:
   if (reg->stride == 0) {
  brw_reg = brw_vec1_reg(brw_file_from_reg(reg), reg->reg, 0);
   } else if (inst->exec_size < 8) {
@@ -1558,6 +1560,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
   brw_set_default_acc_write_control(p, inst->writes_accumulator);
   brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
 
+  assert(inst->base_mrf + inst->mlen < BRW_MAX_MRF);
+
   switch (inst->exec_size) {
   case 1:
   case 2:
diff --git a/src/mesa/drivers/dri/i965/brw_reg.h b/src/mesa/drivers/dri/i965/brw_reg.h
index 31806f7..97aaa5b 100644
--- a/src/mesa/drivers/dri/i965/brw_reg.h
+++ b/src/mesa/drivers/dri/i965/brw_reg.h
@@ -344,8 +344,6 @@ brw_reg(unsigned file,
struct brw_reg reg;
if (file == BRW_GENERAL_REGISTER_FILE)
   assert(nr < BRW_MAX_GRF);
-   else if (file == BRW_MESSAGE_REGISTER_FILE)
-  assert((nr & ~(1 << 7)) < BRW_MAX_MRF);
else if (file == BRW_ARCHITECTURE_REGISTER_FILE)
   assert(nr <= BRW_ARF_TIMESTAMP);
 
@@ -808,7 +806,6 @@ brw_mask_reg(unsigned subnr)
 static inline struct brw_reg
 brw_message_reg(unsigned nr)
 {
-   assert((nr & ~(1 << 7)) < BRW_MAX_MRF);
return brw_vec8_reg(BRW_MESSAGE_REGISTER_FILE, nr, 0);
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index 1950333..73e5b22 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -46,6 +46,7 @@ vec4_instruction::get_dst(void)
   break;
 
case MRF:
+  assert(((dst.reg + dst.reg_offset) & ~(1 << 7)) < BRW_MAX_MRF);
   brw_reg = brw_message_reg(dst.reg + dst.reg_offset);
   brw_reg = retype(brw_reg, dst.type);
   brw_reg.dw1.bits.writemask = dst.writemask;
@@ -1134,6 +1135,8 @@ vec4_generator::generate_code(const cfg_t *cfg)
   brw_set_default_mask_control(p, inst->force_writemask_all);
   brw_set_default_acc_write_control(p, inst->writes_accumulator);
 
+  assert(inst->base_mrf + inst->mlen < BRW_MAX_MRF);
+
   unsigned pre_emit_nr_insn = p->nr_insn;
 
   if (dst.width == BRW_WIDTH_4) {
-- 
1.9.1
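
The register-number checks that the patch above moves into the generators can be sketched in isolation. This is only an illustration under stated assumptions, not the driver's code: SKETCH_MAX_MRF is an assumed limit of 16, bit 7 is assumed to be a flag carried in the register number (the COMPR4 bit, as far as I know), and every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, for illustration only. */
#define SKETCH_MAX_MRF     16          /* hypothetical number of message registers */
#define SKETCH_MRF_FLAG    (1u << 7)   /* flag bit carried in the register number */

/* True when the register number, with the flag bit masked off,
 * names a valid message register. */
static inline bool
sketch_mrf_nr_is_valid(unsigned nr)
{
   return (nr & ~SKETCH_MRF_FLAG) < SKETCH_MAX_MRF;
}

/* True when a message of mlen registers starting at base_mrf passes
 * the same strict bound that the patch asserts in generate_code(). */
static inline bool
sketch_message_fits(unsigned base_mrf, unsigned mlen)
{
   return base_mrf + mlen < SKETCH_MAX_MRF;
}
```

With these assumed values, register 15 is valid with or without the flag bit set, register 16 is not, and a two-register message starting at m13 passes the bound while one starting at m14 does not.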



[Mesa-dev] [RFC 2/3] i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation

2015-09-16 Thread Iago Toral Quiroga
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c |  2 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp|  4 ++--
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp  | 12 ++--
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp   | 16 
 src/mesa/drivers/dri/i965/brw_ir_vec4.h |  2 +-
 src/mesa/drivers/dri/i965/brw_reg.h |  2 +-
 src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp| 10 +-
 8 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 0432efa..f819ed5 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2482,7 +2482,7 @@ void brw_urb_WRITE(struct brw_codegen *p,
 
insn = next_insn(p, BRW_OPCODE_SEND);
 
-   assert(msg_length < BRW_MAX_MRF);
+   assert(msg_length < BRW_MAX_MRF(devinfo->gen));
 
brw_set_dest(p, insn, dest);
brw_set_src0(p, insn, src0);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b9f1051..dc81641 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2794,7 +2794,7 @@ fs_visitor::insert_gen4_pre_send_dependency_workarounds(bblock_t *block,
 {
int write_len = inst->regs_written;
int first_write_grf = inst->dst.reg;
-   bool needs_dep[BRW_MAX_MRF];
+   bool needs_dep[BRW_MAX_MRF(devinfo->gen)];
assert(write_len < (int)sizeof(needs_dep) - 1);
 
memset(needs_dep, false, sizeof(needs_dep));
@@ -2865,7 +2865,7 @@ fs_visitor::insert_gen4_post_send_dependency_workarounds(bblock_t *block, fs_ins
 {
int write_len = inst->regs_written;
int first_write_grf = inst->dst.reg;
-   bool needs_dep[BRW_MAX_MRF];
+   bool needs_dep[BRW_MAX_MRF(devinfo->gen)];
assert(write_len < (int)sizeof(needs_dep) - 1);
 
memset(needs_dep, false, sizeof(needs_dep));
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index d770c42..934e342 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -48,13 +48,13 @@ static uint32_t brw_file_from_reg(fs_reg *reg)
 }
 
 static struct brw_reg
-brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg)
+brw_reg_from_fs_reg(fs_inst *inst, fs_reg *reg, unsigned gen)
 {
struct brw_reg brw_reg;
 
switch (reg->file) {
case MRF:
-  assert((reg->reg & ~(1 << 7)) < BRW_MAX_MRF);
+  assert((reg->reg & ~(1 << 7)) < BRW_MAX_MRF(gen));
   /* Fallthrough */
case GRF:
   if (reg->stride == 0) {
@@ -420,7 +420,7 @@ fs_generator::generate_blorp_fb_write(fs_inst *inst)
brw_fb_WRITE(p,
 16 /* dispatch_width */,
 brw_message_reg(inst->base_mrf),
-brw_reg_from_fs_reg(inst, &inst->src[0]),
+brw_reg_from_fs_reg(inst, &inst->src[0], devinfo->gen),
 BRW_DATAPORT_RENDER_TARGET_WRITE_SIMD16_SINGLE_SOURCE,
 inst->target,
 inst->mlen,
@@ -1538,7 +1538,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
  annotate(p->devinfo, &annotation, cfg, inst, p->next_insn_offset);
 
   for (unsigned int i = 0; i < inst->sources; i++) {
-src[i] = brw_reg_from_fs_reg(inst, &inst->src[i]);
+src[i] = brw_reg_from_fs_reg(inst, &inst->src[i], devinfo->gen);
 
 /* The accumulator result appears to get used for the
  * conditional modifier generation.  When negating a UD
@@ -1550,7 +1550,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
inst->src[i].type != BRW_REGISTER_TYPE_UD ||
!inst->src[i].negate);
   }
-  dst = brw_reg_from_fs_reg(inst, &inst->dst);
+  dst = brw_reg_from_fs_reg(inst, &inst->dst, devinfo->gen);
 
   brw_set_default_predicate_control(p, inst->predicate);
   brw_set_default_predicate_inverse(p, inst->predicate_inverse);
@@ -1560,7 +1560,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
   brw_set_default_acc_write_control(p, inst->writes_accumulator);
   brw_set_default_exec_size(p, cvt(inst->exec_size) - 1);
 
-  assert(inst->base_mrf + inst->mlen < BRW_MAX_MRF);
+  assert(inst->base_mrf + inst->mlen < BRW_MAX_MRF(devinfo->gen));
 
   switch (inst->exec_size) {
   case 1:
diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index 570b4fe..21fb3de 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
@@ -478,7 +478,7 @@ get_used_mrfs(fs_visitor *v, bool *mrf_used)
 {
int reg_width = v->dispatch_width / 8;
 
-   memset(mrf_used, 0, BRW_MAX_MRF * sizeof(bool));
+   memset(mrf_used, 0, BRW_MAX_MRF
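
The idea behind the hunks above, a register-count limit that depends on the hardware generation, can be sketched as follows. The brw_reg.h hunk that defines the real macro is not shown in this message, so the per-generation values here (24 for gen 6, 16 otherwise) are assumptions, and all SKETCH_/sketch_ names are hypothetical. The sketch sizes the array with a compile-time upper bound and uses only the first SKETCH_MAX_MRF(gen) entries, which is one way to avoid a run-time sized array.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed per-generation limit; the real definition is not shown above. */
#define SKETCH_MAX_MRF(gen)         ((gen) == 6 ? 24 : 16)
/* Compile-time upper bound over all generations (also an assumption). */
#define SKETCH_MAX_MRF_UPPER_BOUND  24

/* Clears the flags that are valid for the given generation and returns
 * how many entries were cleared.  mrf_used must have room for at least
 * SKETCH_MAX_MRF_UPPER_BOUND entries. */
static inline unsigned
sketch_clear_mrf_flags(bool *mrf_used, int gen)
{
   unsigned count = SKETCH_MAX_MRF(gen);

   assert(count <= SKETCH_MAX_MRF_UPPER_BOUND);
   memset(mrf_used, 0, count * sizeof(bool));
   return count;
}
```

A caller would declare `bool mrf_used[SKETCH_MAX_MRF_UPPER_BOUND];` once and pass the generation in, so the array size stays a constant expression while the loop and memset bounds follow the hardware.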

[Mesa-dev] [Bug 92020] wglCreatePbufferARB handle attrib error

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92020

Bug ID: 92020
   Summary: wglCreatePbufferARB handle attrib error
   Product: Mesa
   Version: 11.0
  Hardware: x86 (IA32)
OS: Windows (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: EGL
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: 332447...@qq.com
QA Contact: mesa-dev@lists.freedesktop.org

HPBUFFERARB WINAPI
wglCreatePbufferARB(HDC hCurrentDC,
int iPixelFormat,
int iWidth,
int iHeight,
const int *piAttribList)
{
//..

   for (piAttrib = piAttribList; *piAttrib; piAttrib++) {
  switch (*piAttrib) {
  case WGL_PBUFFER_LARGEST_ARB:
 piAttrib++;
 useLargest = *piAttrib;
 break;
  default:
 SetLastError(ERROR_INVALID_DATA);
 return 0;// <-- delete this
  }
   }

//.
}
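
For reference, the zero-terminated attribute list can be walked as name/value pairs. The sketch below is only an illustration, not Mesa's code: the token values are what I believe the WGL_ARB_pbuffer and WGL_ARB_render_texture specifications define, and the struct and function names are hypothetical. It counts unrecognized attributes and leaves the decision about them to the caller. Note that in the loop quoted above, deleting the return alone would leave the pointer advancing by one element, so the value of an unknown attribute would then be read as the next attribute name.

```c
#include <assert.h>
#include <stddef.h>

/* Token values as I understand them from the WGL extension specs
 * (assumptions, not taken from the report above). */
#define SKETCH_WGL_PBUFFER_LARGEST_ARB  0x2033
#define SKETCH_WGL_TEXTURE_FORMAT_ARB   0x2072
#define SKETCH_WGL_TEXTURE_TARGET_ARB   0x2073

/* Hypothetical result of parsing an attribute list. */
struct sketch_pbuffer_attribs {
   int use_largest;        /* value given for WGL_PBUFFER_LARGEST_ARB */
   unsigned num_unknown;   /* attributes this parser does not handle */
};

/* Walks a zero-terminated list of (name, value) pairs.  The list is
 * assumed to be well formed, i.e. every name is followed by a value. */
static inline void
sketch_parse_pbuffer_attribs(const int *attribs,
                             struct sketch_pbuffer_attribs *out)
{
   assert(out != NULL);
   out->use_largest = 0;
   out->num_unknown = 0;

   if (attribs == NULL)
      return;

   for (const int *a = attribs; a[0] != 0; a += 2) {
      switch (a[0]) {
      case SKETCH_WGL_PBUFFER_LARGEST_ARB:
         out->use_largest = a[1];
         break;
      default:
         /* Skip the name and its value together. */
         out->num_unknown++;
         break;
      }
   }
}
```

Whether an unknown attribute should be skipped or should fail with ERROR_INVALID_DATA is a policy question that the sketch deliberately leaves open.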

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] mesa: Fix texture compression on big-endian systems

2015-09-16 Thread Oded Gabbay
On Tue, Sep 15, 2015 at 4:23 PM, Ulrich Weigand  wrote:
>
> Various pieces of code to create compressed textures will first
> generate an uncompressed RGBA texture into a temporary buffer,
> and then read from that buffer while creating the final compressed
> texture in the requested format.
>
> The code reading from the temporary buffer assumes the buffer is
> formatted as an array of bytes in RGBA order.  However, the buffer
> is filled using a _mesa_texstore call with MESA_FORMAT_R8G8B8A8_UNORM
> format -- this is defined as an array of *integers* holding the
> RGBA values in packed format (least-significant to most-significant).
> This means incorrect bytes are accessed on big-endian systems.
>
> This patch fixes this by using the MESA_FORMAT_A8B8G8R8_UNORM format
> instead on big-endian systems when filling the buffer.  This fixes
> about 100 piglit test case failures on s390x for me.
>
> Signed-off-by: Ulrich Weigand 
> ---
>  src/mesa/main/texcompress_bptc.c |3 ++-
>  src/mesa/main/texcompress_fxt1.c |3 ++-
>  src/mesa/main/texcompress_rgtc.c |6 --
>  src/mesa/main/texcompress_s3tc.c |9 ++---
>  4 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/main/texcompress_bptc.c b/src/mesa/main/texcompress_bptc.c
> index a600180..f0f6553 100644
> --- a/src/mesa/main/texcompress_bptc.c
> +++ b/src/mesa/main/texcompress_bptc.c
> @@ -1291,7 +1291,8 @@ _mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS)
>tempImageSlices[0] = (GLubyte *) tempImage;
>_mesa_texstore(ctx, dims,
>   baseInternalFormat,
> - MESA_FORMAT_R8G8B8A8_UNORM,
> + _mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
> +   : MESA_FORMAT_A8B8G8R8_UNORM,
>   rgbaRowStride, tempImageSlices,
>   srcWidth, srcHeight, srcDepth,
>   srcFormat, srcType, srcAddr,
> diff --git a/src/mesa/main/texcompress_fxt1.c b/src/mesa/main/texcompress_fxt1.c
> index d605e25..ae339e1 100644
> --- a/src/mesa/main/texcompress_fxt1.c
> +++ b/src/mesa/main/texcompress_fxt1.c
> @@ -130,7 +130,8 @@ _mesa_texstore_rgba_fxt1(TEXSTORE_PARAMS)
>tempImageSlices[0] = (GLubyte *) tempImage;
>_mesa_texstore(ctx, dims,
>   baseInternalFormat,
> - MESA_FORMAT_R8G8B8A8_UNORM,
> + _mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
> +   : MESA_FORMAT_A8B8G8R8_UNORM,
>   rgbaRowStride, tempImageSlices,
>   srcWidth, srcHeight, srcDepth,
>   srcFormat, srcType, srcAddr,
> diff --git a/src/mesa/main/texcompress_rgtc.c b/src/mesa/main/texcompress_rgtc.c
> index 66de1f1..8cab7a5 100644
> --- a/src/mesa/main/texcompress_rgtc.c
> +++ b/src/mesa/main/texcompress_rgtc.c
> @@ -196,9 +196,11 @@ _mesa_texstore_rg_rgtc2(TEXSTORE_PARAMS)
>dstFormat == MESA_FORMAT_LA_LATC2_UNORM);
>
> if (baseInternalFormat == GL_RG)
> -  tempFormat = MESA_FORMAT_R8G8_UNORM;
> +  tempFormat = _mesa_little_endian() ? MESA_FORMAT_R8G8_UNORM
> + : MESA_FORMAT_G8R8_UNORM;
> else
> -  tempFormat = MESA_FORMAT_L8A8_UNORM;
> +  tempFormat = _mesa_little_endian() ? MESA_FORMAT_L8A8_UNORM
> + : MESA_FORMAT_A8L8_UNORM;
>
> rgRowStride = 2 * srcWidth * sizeof(GLubyte);
> tempImage = malloc(srcWidth * srcHeight * 2 * sizeof(GLubyte));
> diff --git a/src/mesa/main/texcompress_s3tc.c b/src/mesa/main/texcompress_s3tc.c
> index 6cfe06a..7ddb0ed 100644
> --- a/src/mesa/main/texcompress_s3tc.c
> +++ b/src/mesa/main/texcompress_s3tc.c
> @@ -198,7 +198,8 @@ _mesa_texstore_rgba_dxt1(TEXSTORE_PARAMS)
>tempImageSlices[0] = (GLubyte *) tempImage;
>_mesa_texstore(ctx, dims,
>   baseInternalFormat,
> - MESA_FORMAT_R8G8B8A8_UNORM,
> + _mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
> +   : MESA_FORMAT_A8B8G8R8_UNORM,
>   rgbaRowStride, tempImageSlices,
>   srcWidth, srcHeight, srcDepth,
>   srcFormat, srcType, srcAddr,
> @@ -255,7 +256,8 @@ _mesa_texstore_rgba_dxt3(TEXSTORE_PARAMS)
>tempImageSlices[0] = (GLubyte *) tempImage;
>_mesa_texstore(ctx, dims,
>   baseInternalFormat,
> - MESA_FORMAT_R8G8B8A8_UNORM,
> + _mesa_little_endian() ? MESA_FORMAT_R8G8B8A8_UNORM
> +   : MESA_FORMAT_A8B8G8R8_UNORM,
>   rgbaRowStride, tempImageSlices,
>   srcWidth, srcHeight, srcDepth,
>   srcFormat, srcType, srcAddr,
> @@ -311,7 +313,8 @@ _mesa_texstore_rgba_dxt5(TEXSTORE_PARAMS)
>tempI
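
The byte-order issue described in the quoted commit message can be reproduced in a few lines. This is a minimal sketch, not Mesa code: the two pack helpers follow the packed layouts as the commit message describes them (first-named channel in the least-significant byte), and sketch_is_little_endian() is a hypothetical stand-in for _mesa_little_endian().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* R in the least-significant byte, A in the most-significant byte. */
static inline uint32_t
sketch_pack_r8g8b8a8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return (uint32_t)r | ((uint32_t)g << 8) |
          ((uint32_t)b << 16) | ((uint32_t)a << 24);
}

/* A in the least-significant byte, R in the most-significant byte. */
static inline uint32_t
sketch_pack_a8b8g8r8(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   return (uint32_t)a | ((uint32_t)b << 8) |
          ((uint32_t)g << 16) | ((uint32_t)r << 24);
}

/* Hypothetical run-time endianness probe. */
static inline int
sketch_is_little_endian(void)
{
   const uint32_t probe = 1;
   uint8_t first;

   memcpy(&first, &probe, 1);
   return first == 1;
}

/* Stores one texel as a packed integer so that memory ends up holding
 * the bytes in R, G, B, A order on either kind of host. */
static inline void
sketch_store_rgba_bytes(uint8_t *dst,
                        uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
   uint32_t packed = sketch_is_little_endian()
                        ? sketch_pack_r8g8b8a8(r, g, b, a)
                        : sketch_pack_a8b8g8r8(r, g, b, a);

   assert(dst != NULL);
   memcpy(dst, &packed, sizeof(packed));
}
```

On a big-endian host the most-significant byte is written first, so the A8B8G8R8 packing is the one that yields R, G, B, A in memory. That is the same choice the patch makes with its `_mesa_little_endian() ? ... : ...` expressions.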

[Mesa-dev] [Bug 92020] wglCreatePbufferARB handle attrib error

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92020

Emil Velikov  changed:

   What|Removed |Added

  Component|EGL |Other

--- Comment #1 from Emil Velikov  ---
Hi zeif,

This looks like a user error, rather than a mesa bug.

Namely, I suspect that you're passing an attribute defined by some extension
without first checking whether that extension is available. Is that the case?

For example if you want to use WGL_DEPTH_TEXTURE_FORMAT_NV, you should first
check for the WGL_NV_render_depth_texture extension.



Re: [Mesa-dev] [PATCH] nir: add lowering for ffract

2015-09-16 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Tue, 2015-09-15 at 17:40 -0400, Rob Clark wrote:
> From: Rob Clark 
> 
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/nir.h| 3 +++
>  src/glsl/nir/nir_opt_algebraic.py | 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index f0acd75..284fccd 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1440,6 +1440,9 @@ typedef struct nir_shader_compiler_options {
>  */
> bool fdot_replicates;
>  
> +   /** lowers ffract to fsub+ffloor: */
> +   bool lower_ffract;
> +
> /**
>  * Does the driver support real 32-bit integers?  (Otherwise, integers
>  * are simulated by floats.)
> diff --git a/src/glsl/nir/nir_opt_algebraic.py b/src/glsl/nir/nir_opt_algebraic.py
> index acc3b04..43558a5 100644
> --- a/src/glsl/nir/nir_opt_algebraic.py
> +++ b/src/glsl/nir/nir_opt_algebraic.py
> @@ -76,6 +76,7 @@ optimizations = [
> (('flrp', a, a, b), a),
> (('flrp', 0.0, a, b), ('fmul', a, b)),
> (('flrp', a, b, c), ('fadd', ('fmul', c, ('fsub', b, a)), a), 'options->lower_flrp'),
> +   (('ffract', a), ('fsub', a, ('ffloor', a)), 'options->lower_ffract'),
> (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c))), ('fmul', b, c)), ('flrp', a, b, c), '!options->lower_flrp'),
> (('fadd', a, ('fmul', c, ('fadd', b, ('fneg', a)))), ('flrp', a, b, c), '!options->lower_flrp'),
> (('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
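
The rewrite rule added by the quoted patch replaces ffract(a) with a - ffloor(a). A scalar sketch of that identity is below. It is only an illustration: sketch_ffloor() is a hand-written stand-in for the floor operation (valid for the limited range asserted in the code) so that the example does not depend on the math library, and both function names are hypothetical.

```c
#include <assert.h>

/* Stand-in for ffloor: truncate toward zero, then step down by one for
 * negative non-integers.  Only valid for values that fit in a long. */
static inline float
sketch_ffloor(float a)
{
   assert(a > -1.0e9f && a < 1.0e9f);

   float truncated = (float)(long)a;
   return truncated > a ? truncated - 1.0f : truncated;
}

/* The lowering itself: ffract(a) = fsub(a, ffloor(a)). */
static inline float
sketch_lowered_ffract(float a)
{
   return a - sketch_ffloor(a);
}
```

For example, 1.25 maps to 0.25, and -0.25 maps to 0.75 because its floor is -1.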




[Mesa-dev] [PATCH 2/3] nir/print: bit of state refactoring

2015-09-16 Thread Rob Clark
From: Rob Clark 

Rename print_var_state to print_state, and stuff FILE ptr into the state
object.  This avoids passing around an extra parameter everywhere.

v2: even more extensive conversion.. use state *everywhere* instead of
FILE ptr, and convert nir_print_instr() to use state as well

Signed-off-by: Rob Clark 
---
 src/glsl/nir/nir_print.c | 261 +++
 1 file changed, 152 insertions(+), 109 deletions(-)

diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index 69cadba..bdecc3c 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -37,6 +37,7 @@ print_tabs(unsigned num_tabs, FILE *fp)
 }
 
 typedef struct {
+   FILE *fp;
/** map from nir_variable -> printable name */
struct hash_table *ht;
 
@@ -45,11 +46,12 @@ typedef struct {
 
/* an index used to make new non-conflicting names */
unsigned index;
-} print_var_state;
+} print_state;
 
 static void
-print_register(nir_register *reg, FILE *fp)
+print_register(nir_register *reg, print_state *state)
 {
+   FILE *fp = state->fp;
if (reg->name != NULL)
   fprintf(fp, "/* %s */ ", reg->name);
if (reg->is_global)
@@ -61,90 +63,97 @@ print_register(nir_register *reg, FILE *fp)
 static const char *sizes[] = { "error", "vec1", "vec2", "vec3", "vec4" };
 
 static void
-print_register_decl(nir_register *reg, FILE *fp)
+print_register_decl(nir_register *reg, print_state *state)
 {
+   FILE *fp = state->fp;
fprintf(fp, "decl_reg %s ", sizes[reg->num_components]);
if (reg->is_packed)
   fprintf(fp, "(packed) ");
-   print_register(reg, fp);
+   print_register(reg, state);
if (reg->num_array_elems != 0)
   fprintf(fp, "[%u]", reg->num_array_elems);
fprintf(fp, "\n");
 }
 
 static void
-print_ssa_def(nir_ssa_def *def, FILE *fp)
+print_ssa_def(nir_ssa_def *def, print_state *state)
 {
+   FILE *fp = state->fp;
if (def->name != NULL)
   fprintf(fp, "/* %s */ ", def->name);
fprintf(fp, "%s ssa_%u", sizes[def->num_components], def->index);
 }
 
 static void
-print_ssa_use(nir_ssa_def *def, FILE *fp)
+print_ssa_use(nir_ssa_def *def, print_state *state)
 {
+   FILE *fp = state->fp;
if (def->name != NULL)
   fprintf(fp, "/* %s */ ", def->name);
fprintf(fp, "ssa_%u", def->index);
 }
 
-static void print_src(nir_src *src, FILE *fp);
+static void print_src(nir_src *src, print_state *state);
 
 static void
-print_reg_src(nir_reg_src *src, FILE *fp)
+print_reg_src(nir_reg_src *src, print_state *state)
 {
-   print_register(src->reg, fp);
+   FILE *fp = state->fp;
+   print_register(src->reg, state);
if (src->reg->num_array_elems != 0) {
   fprintf(fp, "[%u", src->base_offset);
   if (src->indirect != NULL) {
  fprintf(fp, " + ");
- print_src(src->indirect, fp);
+ print_src(src->indirect, state);
   }
   fprintf(fp, "]");
}
 }
 
 static void
-print_reg_dest(nir_reg_dest *dest, FILE *fp)
+print_reg_dest(nir_reg_dest *dest, print_state *state)
 {
-   print_register(dest->reg, fp);
+   FILE *fp = state->fp;
+   print_register(dest->reg, state);
if (dest->reg->num_array_elems != 0) {
   fprintf(fp, "[%u", dest->base_offset);
   if (dest->indirect != NULL) {
  fprintf(fp, " + ");
- print_src(dest->indirect, fp);
+ print_src(dest->indirect, state);
   }
   fprintf(fp, "]");
}
 }
 
 static void
-print_src(nir_src *src, FILE *fp)
+print_src(nir_src *src, print_state *state)
 {
if (src->is_ssa)
-  print_ssa_use(src->ssa, fp);
+  print_ssa_use(src->ssa, state);
else
-  print_reg_src(&src->reg, fp);
+  print_reg_src(&src->reg, state);
 }
 
 static void
-print_dest(nir_dest *dest, FILE *fp)
+print_dest(nir_dest *dest, print_state *state)
 {
if (dest->is_ssa)
-  print_ssa_def(&dest->ssa, fp);
+  print_ssa_def(&dest->ssa, state);
else
-  print_reg_dest(&dest->reg, fp);
+  print_reg_dest(&dest->reg, state);
 }
 
 static void
-print_alu_src(nir_alu_instr *instr, unsigned src, FILE *fp)
+print_alu_src(nir_alu_instr *instr, unsigned src, print_state *state)
 {
+   FILE *fp = state->fp;
+
if (instr->src[src].negate)
   fprintf(fp, "-");
if (instr->src[src].abs)
   fprintf(fp, "abs(");
 
-   print_src(&instr->src[src].src, fp);
+   print_src(&instr->src[src].src, state);
 
bool print_swizzle = false;
for (unsigned i = 0; i < 4; i++) {
@@ -172,11 +181,12 @@ print_alu_src(nir_alu_instr *instr, unsigned src, FILE *fp)
 }
 
 static void
-print_alu_dest(nir_alu_dest *dest, FILE *fp)
+print_alu_dest(nir_alu_dest *dest, print_state *state)
 {
+   FILE *fp = state->fp;
/* we're going to print the saturate modifier later, after the opcode */
 
-   print_dest(&dest->dest, fp);
+   print_dest(&dest->dest, state);
 
if (!dest->dest.is_ssa &&
dest->write_mask != (1 << dest->dest.reg.reg->num_components) - 1) {
@@ -188,9 +198,11 @@ print_alu_dest(nir_alu_dest *dest, FILE *fp)
 }
 
 stat
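
The pattern used throughout the patch above, storing the FILE pointer in a state object and passing only that object around, can be sketched on its own. This is a minimal illustration and not the nir_print.c code: sketch_print_state mirrors only two of the fields, and the two print functions are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, reduced version of the printer state. */
typedef struct {
   FILE *fp;         /* output stream, stored once instead of passed everywhere */
   unsigned index;   /* counter used to make new non-conflicting names */
} sketch_print_state;

/* Leaf printer: fetches the stream from the state, as the patch does. */
static inline void
sketch_print_temp(sketch_print_state *state)
{
   assert(state != NULL && state->fp != NULL);

   FILE *fp = state->fp;
   fprintf(fp, "tmp_%u", state->index++);
}

/* Caller: forwards the state object instead of a separate FILE pointer. */
static inline void
sketch_print_add(sketch_print_state *state)
{
   FILE *fp = state->fp;

   sketch_print_temp(state);
   fprintf(fp, " = ");
   sketch_print_temp(state);
   fprintf(fp, " + ");
   sketch_print_temp(state);
   fprintf(fp, "\n");
}
```

Because every helper takes the same single argument, adding another piece of shared printer state later does not require touching each signature again.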

[Mesa-dev] [Bug 92020] wglCreatePbufferARB handle attrib error

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92020

--- Comment #2 from zeif <332447...@qq.com> ---
(In reply to Emil Velikov from comment #1)
> Hi zeif,
> 
> This looks like a user error, rather than a mesa bug.
> 
> Namely, I suspect that you're passing an attribute defined by some extension
> without first checking whether that extension is available. Is that the case?
> 
> For example if you want to use WGL_DEPTH_TEXTURE_FORMAT_NV, you should first
> check for the WGL_NV_render_depth_texture extension.

Thanks.

I'm new to OpenGL, but the Android source code does the same thing I do, so
when I put Mesa into the emulator's folder, the Android Emulator does not
work.

Android source code:
http://androidxref.com/4.3_r2.1/xref/sdk/emulator/opengl/host/libs/Translator/EGL/EglWindowsApi.cpp#498

int pbAttribs[] = {
   WGL_TEXTURE_TARGET_ARB, wglTexTarget,
   WGL_TEXTURE_FORMAT_ARB, wglTexFormat,
   0
};

if (!s_wglExtProcs->wglCreatePbufferARB) return NULL;

EGLNativePbufferType pb =
   s_wglExtProcs->wglCreatePbufferARB(dpy, cfg->nativeId(), width, height, pbAttribs);

if (!pb) {
   DWORD err = GetLastError();
   return NULL;
}



[Mesa-dev] [Bug 92022] st/va: add initial support for Video Post Processing and Export/Import of VaSurface

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92022

Bug ID: 92022
   Summary: st/va: add initial support for Video Post Processing
and Export/Import of VaSurface
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: julien.iso...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Just to let you know, I made a first attempt at adding VPP and
VaAcquireBufferHandle(dmabuf) to st/va:

https://github.com/CapOM/mesa/commits/wip_export_import_and_vpp

I'll send patches to the mesa-dev mailing list once it is ready. If you have
any remarks, let me know.



[Mesa-dev] [Bug 92022] st/va: add initial support for Video Post Processing and Export/Import of VaSurface

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92022

Julien Isorce  changed:

   What|Removed |Added

 CC||deathsim...@vodafone.de,
   ||imir...@alum.mit.edu



Re: [Mesa-dev] [PATCH 2/3] nir/print: bit of state refactoring

2015-09-16 Thread Iago Toral
Looks good,

Reviewed-by: Iago Toral Quiroga 

On Wed, 2015-09-16 at 08:25 -0400, Rob Clark wrote:
> From: Rob Clark 
> 
> Rename print_var_state to print_state, and stuff FILE ptr into the state
> object.  This avoids passing around an extra parameter everywhere.
> 
> v2: even more extensive conversion.. use state *everywhere* instead of
> FILE ptr, and convert nir_print_instr() to use state as well
> 
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/nir_print.c | 261 +++
>  1 file changed, 152 insertions(+), 109 deletions(-)
> 
> diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
> index 69cadba..bdecc3c 100644
> --- a/src/glsl/nir/nir_print.c
> +++ b/src/glsl/nir/nir_print.c
> @@ -37,6 +37,7 @@ print_tabs(unsigned num_tabs, FILE *fp)
>  }
>  
>  typedef struct {
> +   FILE *fp;
> /** map from nir_variable -> printable name */
> struct hash_table *ht;
>  
> @@ -45,11 +46,12 @@ typedef struct {
>  
> /* an index used to make new non-conflicting names */
> unsigned index;
> -} print_var_state;
> +} print_state;
>  
>  static void
> -print_register(nir_register *reg, FILE *fp)
> +print_register(nir_register *reg, print_state *state)
>  {
> +   FILE *fp = state->fp;
> if (reg->name != NULL)
>fprintf(fp, "/* %s */ ", reg->name);
> if (reg->is_global)
> @@ -61,90 +63,97 @@ print_register(nir_register *reg, FILE *fp)
>  static const char *sizes[] = { "error", "vec1", "vec2", "vec3", "vec4" };
>  
>  static void
> -print_register_decl(nir_register *reg, FILE *fp)
> +print_register_decl(nir_register *reg, print_state *state)
>  {
> +   FILE *fp = state->fp;
> fprintf(fp, "decl_reg %s ", sizes[reg->num_components]);
> if (reg->is_packed)
>fprintf(fp, "(packed) ");
> -   print_register(reg, fp);
> +   print_register(reg, state);
> if (reg->num_array_elems != 0)
>fprintf(fp, "[%u]", reg->num_array_elems);
> fprintf(fp, "\n");
>  }
>  
>  static void
> -print_ssa_def(nir_ssa_def *def, FILE *fp)
> +print_ssa_def(nir_ssa_def *def, print_state *state)
>  {
> +   FILE *fp = state->fp;
> if (def->name != NULL)
>fprintf(fp, "/* %s */ ", def->name);
> fprintf(fp, "%s ssa_%u", sizes[def->num_components], def->index);
>  }
>  
>  static void
> -print_ssa_use(nir_ssa_def *def, FILE *fp)
> +print_ssa_use(nir_ssa_def *def, print_state *state)
>  {
> +   FILE *fp = state->fp;
> if (def->name != NULL)
>fprintf(fp, "/* %s */ ", def->name);
> fprintf(fp, "ssa_%u", def->index);
>  }
>  
> -static void print_src(nir_src *src, FILE *fp);
> +static void print_src(nir_src *src, print_state *state);
>  
>  static void
> -print_reg_src(nir_reg_src *src, FILE *fp)
> +print_reg_src(nir_reg_src *src, print_state *state)
>  {
> -   print_register(src->reg, fp);
> +   FILE *fp = state->fp;
> +   print_register(src->reg, state);
> if (src->reg->num_array_elems != 0) {
>fprintf(fp, "[%u", src->base_offset);
>if (src->indirect != NULL) {
>   fprintf(fp, " + ");
> - print_src(src->indirect, fp);
> + print_src(src->indirect, state);
>}
>fprintf(fp, "]");
> }
>  }
>  
>  static void
> -print_reg_dest(nir_reg_dest *dest, FILE *fp)
> +print_reg_dest(nir_reg_dest *dest, print_state *state)
>  {
> -   print_register(dest->reg, fp);
> +   FILE *fp = state->fp;
> +   print_register(dest->reg, state);
> if (dest->reg->num_array_elems != 0) {
>fprintf(fp, "[%u", dest->base_offset);
>if (dest->indirect != NULL) {
>   fprintf(fp, " + ");
> - print_src(dest->indirect, fp);
> + print_src(dest->indirect, state);
>}
>fprintf(fp, "]");
> }
>  }
>  
>  static void
> -print_src(nir_src *src, FILE *fp)
> +print_src(nir_src *src, print_state *state)
>  {
> if (src->is_ssa)
> -  print_ssa_use(src->ssa, fp);
> +  print_ssa_use(src->ssa, state);
> else
> -  print_reg_src(&src->reg, fp);
> +  print_reg_src(&src->reg, state);
>  }
>  
>  static void
> -print_dest(nir_dest *dest, FILE *fp)
> +print_dest(nir_dest *dest, print_state *state)
>  {
> if (dest->is_ssa)
> -  print_ssa_def(&dest->ssa, fp);
> +  print_ssa_def(&dest->ssa, state);
> else
> -  print_reg_dest(&dest->reg, fp);
> +  print_reg_dest(&dest->reg, state);
>  }
>  
>  static void
> -print_alu_src(nir_alu_instr *instr, unsigned src, FILE *fp)
> +print_alu_src(nir_alu_instr *instr, unsigned src, print_state *state)
>  {
> +   FILE *fp = state->fp;
> +
> if (instr->src[src].negate)
>fprintf(fp, "-");
> if (instr->src[src].abs)
>fprintf(fp, "abs(");
>  
> -   print_src(&instr->src[src].src, fp);
> +   print_src(&instr->src[src].src, state);
>  
> bool print_swizzle = false;
> for (unsigned i = 0; i < 4; i++) {
> @@ -172,11 +181,12 @@ print_alu_src(nir_alu_instr *instr, unsigned src, FILE *fp)
>  }
>  
>  static void
> -print_alu_dest(nir_alu_dest *dest, 

[Mesa-dev] [PATCH] st/xa: Use PIPE_FORMAT_R8_UNORM when available

2015-09-16 Thread Thomas Hellstrom
XA has been using L8_UNORM for a8 and yuv component surfaces.
This commit instead makes XA prefer R8_UNORM since it's assumed to have a
higher availability.

Also, neither of these formats is suitable as a destination format when
blending with destination alpha, so reject those operations.

Signed-off-by: Thomas Hellstrom 
---
 src/gallium/state_trackers/xa/xa_composite.c | 40 ++--
 src/gallium/state_trackers/xa/xa_tracker.c   | 28 +--
 2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/src/gallium/state_trackers/xa/xa_composite.c b/src/gallium/state_trackers/xa/xa_composite.c
index 7cfd1e1..e81eeba 100644
--- a/src/gallium/state_trackers/xa/xa_composite.c
+++ b/src/gallium/state_trackers/xa/xa_composite.c
@@ -78,26 +78,6 @@ static const struct xa_composite_blend xa_blends[] = {
   0, 0, PIPE_BLENDFACTOR_ONE, PIPE_BLENDFACTOR_ONE},
 };
 
-
-/*
- * The alpha value stored in a luminance texture is read by the
- * hardware as color.
- */
-static unsigned
-xa_convert_blend_for_luminance(unsigned factor)
-{
-switch(factor) {
-case PIPE_BLENDFACTOR_DST_ALPHA:
-   return PIPE_BLENDFACTOR_DST_COLOR;
-case PIPE_BLENDFACTOR_INV_DST_ALPHA:
-   return PIPE_BLENDFACTOR_INV_DST_COLOR;
-default:
-   break;
-}
-return factor;
-}
-
-
 static boolean
 blend_for_op(struct xa_composite_blend *blend,
 enum xa_composite_op op,
@@ -131,10 +111,16 @@ blend_for_op(struct xa_composite_blend *blend,
 if (!dst_pic->srf)
return supported;
 
-if (dst_pic->srf->tex->format == PIPE_FORMAT_L8_UNORM) {
-   blend->rgb_src = xa_convert_blend_for_luminance(blend->rgb_src);
-   blend->rgb_dst = xa_convert_blend_for_luminance(blend->rgb_dst);
-}
+/*
+ * None of the hardware formats we might use for dst A8 are
+ * suitable for dst_alpha blending, since they present the
+ * alpha channel either in all color channels (L8_UNORM) or
+ * in the red channel only (R8_UNORM)
+ */
+if ((dst_pic->srf->tex->format == PIPE_FORMAT_L8_UNORM ||
+ dst_pic->srf->tex->format == PIPE_FORMAT_R8_UNORM) &&
+blend->alpha_dst)
+return FALSE;
 
 /*
  * If there's no dst alpha channel, adjust the blend op so that we'll treat
@@ -298,7 +284,8 @@ picture_format_fixups(struct xa_picture *src_pic,
ret |= mask ? FS_MASK_SET_ALPHA : FS_SRC_SET_ALPHA;
 
 if (src_hw_format == src_pic_format) {
-   if (src->tex->format == PIPE_FORMAT_L8_UNORM)
+   if (src->tex->format == PIPE_FORMAT_L8_UNORM ||
+src->tex->format == PIPE_FORMAT_R8_UNORM)
return ((mask) ? FS_MASK_LUMINANCE : FS_SRC_LUMINANCE);
 
return ret;
@@ -372,7 +359,8 @@ bind_shaders(struct xa_context *ctx, const struct xa_composite *comp)
fs_traits |= picture_format_fixups(mask_pic, 1);
 }
 
-if (ctx->srf->format == PIPE_FORMAT_L8_UNORM)
+if (ctx->srf->format == PIPE_FORMAT_L8_UNORM ||
+ctx->srf->format == PIPE_FORMAT_R8_UNORM)
fs_traits |= FS_DST_LUMINANCE;
 
 shader = xa_shaders_get(ctx->shaders, vs_traits, fs_traits);
diff --git a/src/gallium/state_trackers/xa/xa_tracker.c b/src/gallium/state_trackers/xa/xa_tracker.c
index 2944b16..cd1394a 100644
--- a/src/gallium/state_trackers/xa/xa_tracker.c
+++ b/src/gallium/state_trackers/xa/xa_tracker.c
@@ -82,7 +82,7 @@ static const unsigned int stype_bind[XA_LAST_SURFACE_TYPE] = { 0,
 };
 
 static struct xa_format_descriptor
-xa_get_pipe_format(enum xa_formats xa_format)
+xa_get_pipe_format(struct xa_tracker *xa, enum xa_formats xa_format)
 {
 struct xa_format_descriptor fdesc;
 
@@ -102,7 +102,13 @@ xa_get_pipe_format(enum xa_formats xa_format)
fdesc.format = PIPE_FORMAT_B5G5R5A1_UNORM;
break;
 case xa_format_a8:
-   fdesc.format = PIPE_FORMAT_L8_UNORM;
+if (xa->screen->is_format_supported(xa->screen, PIPE_FORMAT_R8_UNORM,
+PIPE_TEXTURE_2D, 0,
+stype_bind[xa_type_a] |
+PIPE_BIND_RENDER_TARGET))
+fdesc.format = PIPE_FORMAT_R8_UNORM;
+else
+fdesc.format = PIPE_FORMAT_L8_UNORM;
break;
 case xa_format_z24:
fdesc.format = PIPE_FORMAT_Z24X8_UNORM;
@@ -126,7 +132,12 @@ xa_get_pipe_format(enum xa_formats xa_format)
fdesc.format = PIPE_FORMAT_S8_UINT_Z24_UNORM;
break;
 case xa_format_yuv8:
-   fdesc.format = PIPE_FORMAT_L8_UNORM;
+if (xa->screen->is_format_supported(xa->screen, PIPE_FORMAT_R8_UNORM,
+PIPE_TEXTURE_2D, 0,
+stype_bind[xa_type_yuv_component]))
+fdesc.format = PIPE_FORMAT_R8_UNORM;
+else
+fdesc.format = PIPE_FORMAT_L8_UNORM;
break;
 default:
fdesc.xa_format = xa_format_unknown;
@@ -184,7 +195,8 @@ xa_tracker_crea
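
The two decisions made by the patch above, preferring R8_UNORM with a fallback to L8_UNORM and rejecting destination-alpha blending on single-channel destinations, can be sketched without the gallium headers. Everything here is a hypothetical stand-in: the enum replaces the PIPE_FORMAT_* values, and the function-pointer query replaces screen->is_format_supported(), whose real signature takes more arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the pipe formats named in the patch. */
enum sketch_format {
   SKETCH_FORMAT_L8_UNORM,
   SKETCH_FORMAT_R8_UNORM,
   SKETCH_FORMAT_B8G8R8A8_UNORM
};

/* Hypothetical capability query. */
typedef bool (*sketch_format_query)(enum sketch_format format);

/* Two example "screens" used to exercise the fallback. */
static inline bool
sketch_screen_with_r8(enum sketch_format format)
{
   (void)format;
   return true;
}

static inline bool
sketch_screen_without_r8(enum sketch_format format)
{
   return format != SKETCH_FORMAT_R8_UNORM;
}

/* Prefer R8_UNORM for a8 surfaces and fall back to L8_UNORM. */
static inline enum sketch_format
sketch_pick_a8_format(sketch_format_query is_supported)
{
   assert(is_supported);
   return is_supported(SKETCH_FORMAT_R8_UNORM) ? SKETCH_FORMAT_R8_UNORM
                                               : SKETCH_FORMAT_L8_UNORM;
}

/* Mirrors the new check in blend_for_op(): a single-channel destination
 * cannot be blended using destination alpha. */
static inline bool
sketch_blend_supported(enum sketch_format dst_format, bool uses_dst_alpha)
{
   bool single_channel = dst_format == SKETCH_FORMAT_L8_UNORM ||
                         dst_format == SKETCH_FORMAT_R8_UNORM;

   return !(single_channel && uses_dst_alpha);
}
```

Keeping the format choice behind a query means callers never hard-code which of the two formats a given driver ends up with, which is why the patch also has to test for both formats wherever it previously tested for L8_UNORM alone.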

Re: [Mesa-dev] [PATCH] st/mesa: avoid integer overflows with buffers >= 512MB

2015-09-16 Thread Roland Scheidegger
Since there are no formats where block.bits isn't a multiple of 8 (and I
wouldn't expect that to change), you could theoretically fix that by
dividing by block.bits / 8 instead of multiplying base (or size) by 8.
Ought to be faster at least on 32bit systems...
Unless you wanted to support >= 4GB buffers (I think though for that
we're missing way more things).
Either way though,
Reviewed-by: Roland Scheidegger 
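
A minimal sketch of the options discussed above, assuming a simplified
element-count calculation. The function names are hypothetical and are not
taken from st_atom_texture.c. It contrasts a 32-bit multiply that wraps once
the size reaches 512MB with a 64-bit intermediate (the approach the patch
takes) and with the divide-by-bytes-per-block variant, which is only valid
while block.bits stays a multiple of 8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: naive 32-bit version. size_bytes * 8 wraps modulo 2^32
 * once size_bytes reaches 512MB (2^29 bytes). */
uint32_t elements_naive(uint32_t size_bytes, uint32_t block_bits)
{
    assert(block_bits != 0);
    return (size_bytes * 8u) / block_bits;
}

/* Hypothetical: widen to 64 bits before multiplying, so the bit count
 * cannot wrap for any 32-bit byte size. */
uint32_t elements_wide(uint32_t size_bytes, uint32_t block_bits)
{
    uint64_t bits;

    assert(block_bits != 0);
    bits = (uint64_t)size_bytes * 8;
    return (uint32_t)(bits / block_bits);
}

/* Hypothetical: divide by bytes per block instead. Stays in 32 bits, but
 * relies on block_bits being a multiple of 8. */
uint32_t elements_div(uint32_t size_bytes, uint32_t block_bits)
{
    assert(block_bits != 0 && block_bits % 8 == 0);
    return size_bytes / (block_bits / 8);
}
```

With a 512MB buffer and a 32-bit format, the naive version yields 0 elements,
while the other two both yield 128M.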

Am 16.09.2015 um 01:32 schrieb Ilia Mirkin:
> This fixes failures with the newly-submitted max-size texture buffer
> piglit test for GPUs exposing >= 128M max texels.
> 
> Signed-off-by: Ilia Mirkin 
> Cc: "10.6 11.0" 
> ---
>  src/mesa/state_tracker/st_atom_texture.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/state_tracker/st_atom_texture.c 
> b/src/mesa/state_tracker/st_atom_texture.c
> index 31e0f6b..62312af 100644
> --- a/src/mesa/state_tracker/st_atom_texture.c
> +++ b/src/mesa/state_tracker/st_atom_texture.c
> @@ -264,7 +264,7 @@ st_create_texture_sampler_view_from_stobj(struct 
> pipe_context *pipe,
> format);
>  
> if (stObj->pt->target == PIPE_BUFFER) {
> -  unsigned base, size;
> +  uint64_t base, size;
>unsigned f, n;
>const struct util_format_description *desc
>   = util_format_description(templ.format);
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: move GL_APPLE_object_purgeable functions to new file

2015-09-16 Thread Matt Turner
On Tue, Sep 15, 2015 at 8:23 PM, Brian Paul  wrote:
> Move this code out of bufferobj.c since it's not strongly connected to
> buffer objects.
> ---

Seems fine to me, and bonus points for removing now unneeded #includes! :)

Acked-by: Matt Turner 


Re: [Mesa-dev] mesa: Reduce libGL.so binary size by about 15%

2015-09-16 Thread Matt Turner
On Tue, Sep 15, 2015 at 10:38 AM, Arlie Davis  wrote:
>
> Hello!  I noticed an inefficiency in libGL.so, so I thought I'd take a
> stab at fixing it.  This is my first patch submitted to mesa-dev, so
> if I'm doing anything dumb, let me know.  I can't use git send-email,
> but I've formatted the patch using git format-patch, which should
> hopefully produce similar output.  The patch text (below) describes
> the inefficiency and the improvement.
>
>
>
> From 0abde226eed1b9f6052193f36f6cdc060698f621 Mon Sep 17 00:00:00 2001
> From: Arlie Davis 
> Date: Tue, 15 Sep 2015 09:58:34 -0700
> Subject: [PATCH] This patch significantly reduces the size of the libGL.so
>  binary. It does not change the (externally visible) behavior of libGL.so at
>  all.
>
> gl_gentable.py generates a function, _glapi_create_table_from_handle.
> This function allocates a large dispatch table, consisting of 1300 or so
> function pointers, and fills this dispatch table by doing symbol lookups
> on a given shared library.  Previously, gl_gentable.py would generate a
> single, very large _glapi_create_table_from_handle function, with a short
> cluster of lines for each entry point (function).  The idiom it generates
> was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
> and then a store into the dispatch table.  Since this function processes
> a large number of entry points, this code is duplicated many times over.
>
> We can encode the same information much more compactly, by using a lookup
> table.  The previous total size of _glapi_create_table_from_handle on x64
> was 125848 bytes.  By using a lookup table, the size of
> _glapi_create_table_from_handle (and the related lookup tables) is reduced
> to 10840 bytes.  In other words, this enormous function is reduced by 91%.
> The size of the entire libGL.so binary (measured when stripped) itself drops
> by 15%.
>
> So the purpose of this change is to reduce the binary size, which frees up
> disk space, memory, etc.
> ---

Seems like a nice change. size lib/libGL.so.1.2.0 on my system shows

   text    data     bss     dec     hex filename
 604031   11360    2792  618183   96ec7 lib/libGL.so.1.2.0 before
 490751   21920    2792  515463   7dd87 lib/libGL.so.1.2.0 after

Feel free to include that in the commit message.

>  src/mapi/glapi/gen/gl_gentable.py | 56 
> ---
>  1 file changed, 40 insertions(+), 16 deletions(-)
>
> diff --git a/src/mapi/glapi/gen/gl_gentable.py 
> b/src/mapi/glapi/gen/gl_gentable.py
> index 1b3eb72..2563b6b 100644
> --- a/src/mapi/glapi/gen/gl_gentable.py
> +++ b/src/mapi/glapi/gen/gl_gentable.py
> @@ -113,6 +113,9 @@ __glapi_gentable_set_remaining_noop(struct _glapi_table 
> *disp) {
>  dispatch[i] = p.v;
>  }
>
> +"""
> +
> +footer = """
>  struct _glapi_table *
>  _glapi_create_table_from_handle(void *handle, const char *symbol_prefix) {
>  struct _glapi_table *disp = calloc(_glapi_get_dispatch_table_size(), 
> sizeof(_glapi_proc));
> @@ -123,27 +126,27 @@ _glapi_create_table_from_handle(void *handle, const 
> char *symbol_prefix) {
>
>  if(symbol_prefix == NULL)
>  symbol_prefix = "";
> -"""
>
> -footer = """
> -__glapi_gentable_set_remaining_noop(disp);
> -
> -return disp;
> -}
> -"""
> +/* Note: This code relies on _glapi_table_func_names being sorted by the
> +   entry point index of each function. */

Mesa style puts the */ on its own line for multiline comments.

> +for (int func_index = 0; func_index < GLAPI_TABLE_COUNT; ++func_index) {
> +const char* name = _glapi_table_func_names[func_index];

* goes with the var name, not the type. That is, "char* " should be "char *"

> +void ** procp = &((void **)disp)[func_index];
>
> -body_template = """
> -if(!disp->%(name)s) {

We're removing the null check. Is that okay to do?

> -void ** procp = (void **) &disp->%(name)s;
> -snprintf(symboln, sizeof(symboln), "%%s%(entry_point)s", 
> symbol_prefix);
> +snprintf(symboln, sizeof(symboln), \"%s%s\", symbol_prefix, name);
>  #ifdef _WIN32
>  *procp = GetProcAddress(handle, symboln);
>  #else
>  *procp = dlsym(handle, symboln);
>  #endif
>  }
> +__glapi_gentable_set_remaining_noop(disp);
> +
> +return disp;
> +}
>  """
>
> +
>  class PrintCode(gl_XML.gl_print_base):
>
>  def __init__(self):
> @@ -180,12 +183,33 @@ class PrintCode(gl_XML.gl_print_base):
>
>
>  def printBody(self, api):
> -for f in api.functionIterateByOffset():
> -for entry_point in f.entry_points:
> -vars = { 'entry_point' : entry_point,
> - 'name' : f.name }
>
> -print body_template % vars
> +# Determine how many functions have a defined offset.
> +func_count = 0
> +for f in api.functions_by_name.itervalues():
> +if f.offset != -1:
> +func_count += 1
> +
> +# Build the mapping from offset to fun

Re: [Mesa-dev] [PATCH 2/3] nir/print: bit of state refactoring

2015-09-16 Thread Connor Abbott
On Wed, Sep 16, 2015 at 8:25 AM, Rob Clark  wrote:
> From: Rob Clark 
>
> Rename print_var_state to print_state, and stuff FILE ptr into the state
> object.  This avoids passing around an extra parameter everywhere.
>
> v2: even more extensive conversion.. use state *everywhere* instead of
> FILE ptr, and convert nir_print_instr() to use state as well
>
> Signed-off-by: Rob Clark 

Reviewed-by: Connor Abbott 


Re: [Mesa-dev] [PATCH] st/xa: Use PIPE_FORMAT_R8_UNORM when available

2015-09-16 Thread Brian Paul

On 09/16/2015 07:04 AM, Thomas Hellstrom wrote:

XA has been using L8_UNORM for a8 and yuv component surfaces.
This commit instead makes XA prefer R8_UNORM since it's assumed to have a
higher availability.

Also neither of these formats are suitable as destination formats using
destination alpha blending, so reject those operations.

Signed-off-by: Thomas Hellstrom 
---
  src/gallium/state_trackers/xa/xa_composite.c | 40 ++--
  src/gallium/state_trackers/xa/xa_tracker.c   | 28 +--
  2 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/src/gallium/state_trackers/xa/xa_composite.c 
b/src/gallium/state_trackers/xa/xa_composite.c
index 7cfd1e1..e81eeba 100644
--- a/src/gallium/state_trackers/xa/xa_composite.c
+++ b/src/gallium/state_trackers/xa/xa_composite.c
@@ -78,26 +78,6 @@ static const struct xa_composite_blend xa_blends[] = {
0, 0, PIPE_BLENDFACTOR_ONE, PIPE_BLENDFACTOR_ONE},
  };

-
-/*
- * The alpha value stored in a luminance texture is read by the
- * hardware as color.
- */
-static unsigned
-xa_convert_blend_for_luminance(unsigned factor)
-{
-switch(factor) {
-case PIPE_BLENDFACTOR_DST_ALPHA:
-   return PIPE_BLENDFACTOR_DST_COLOR;
-case PIPE_BLENDFACTOR_INV_DST_ALPHA:
-   return PIPE_BLENDFACTOR_INV_DST_COLOR;
-default:
-   break;
-}
-return factor;
-}
-
-
  static boolean
  blend_for_op(struct xa_composite_blend *blend,
 enum xa_composite_op op,
@@ -131,10 +111,16 @@ blend_for_op(struct xa_composite_blend *blend,
  if (!dst_pic->srf)
return supported;

-if (dst_pic->srf->tex->format == PIPE_FORMAT_L8_UNORM) {
-   blend->rgb_src = xa_convert_blend_for_luminance(blend->rgb_src);
-   blend->rgb_dst = xa_convert_blend_for_luminance(blend->rgb_dst);
-}
+/*
+ * None of the hardware formats we might use for dst A8 are
+ * suitable for dst_alpha blending, since they present the
+ * alpha channel either in all color channels (L8_UNORM) or
+ * in the red channel only (R8_UNORM)
+ */
+if ((dst_pic->srf->tex->format == PIPE_FORMAT_L8_UNORM ||
+ dst_pic->srf->tex->format == PIPE_FORMAT_R8_UNORM) &&
+blend->alpha_dst)
+return FALSE;

  /*
   * If there's no dst alpha channel, adjust the blend op so that we'll 
treat
@@ -298,7 +284,8 @@ picture_format_fixups(struct xa_picture *src_pic,
ret |= mask ? FS_MASK_SET_ALPHA : FS_SRC_SET_ALPHA;

  if (src_hw_format == src_pic_format) {
-   if (src->tex->format == PIPE_FORMAT_L8_UNORM)
+   if (src->tex->format == PIPE_FORMAT_L8_UNORM ||
+src->tex->format == PIPE_FORMAT_R8_UNORM)
return ((mask) ? FS_MASK_LUMINANCE : FS_SRC_LUMINANCE);

return ret;
@@ -372,7 +359,8 @@ bind_shaders(struct xa_context *ctx, const struct 
xa_composite *comp)
fs_traits |= picture_format_fixups(mask_pic, 1);
  }

-if (ctx->srf->format == PIPE_FORMAT_L8_UNORM)
+if (ctx->srf->format == PIPE_FORMAT_L8_UNORM ||
+ctx->srf->format == PIPE_FORMAT_R8_UNORM)
fs_traits |= FS_DST_LUMINANCE;

  shader = xa_shaders_get(ctx->shaders, vs_traits, fs_traits);
diff --git a/src/gallium/state_trackers/xa/xa_tracker.c 
b/src/gallium/state_trackers/xa/xa_tracker.c
index 2944b16..cd1394a 100644
--- a/src/gallium/state_trackers/xa/xa_tracker.c
+++ b/src/gallium/state_trackers/xa/xa_tracker.c
@@ -82,7 +82,7 @@ static const unsigned int stype_bind[XA_LAST_SURFACE_TYPE] = 
{ 0,
  };

  static struct xa_format_descriptor
-xa_get_pipe_format(enum xa_formats xa_format)
+xa_get_pipe_format(struct xa_tracker *xa, enum xa_formats xa_format)
  {
  struct xa_format_descriptor fdesc;

@@ -102,7 +102,13 @@ xa_get_pipe_format(enum xa_formats xa_format)
fdesc.format = PIPE_FORMAT_B5G5R5A1_UNORM;
break;
  case xa_format_a8:
-   fdesc.format = PIPE_FORMAT_L8_UNORM;
+if (xa->screen->is_format_supported(xa->screen, PIPE_FORMAT_R8_UNORM,
+PIPE_TEXTURE_2D, 0,
+stype_bind[xa_type_a] |
+PIPE_BIND_RENDER_TARGET))
+fdesc.format = PIPE_FORMAT_R8_UNORM;
+else
+fdesc.format = PIPE_FORMAT_L8_UNORM;
break;
  case xa_format_z24:
fdesc.format = PIPE_FORMAT_Z24X8_UNORM;
@@ -126,7 +132,12 @@ xa_get_pipe_format(enum xa_formats xa_format)
fdesc.format = PIPE_FORMAT_S8_UINT_Z24_UNORM;
break;
  case xa_format_yuv8:
-   fdesc.format = PIPE_FORMAT_L8_UNORM;
+if (xa->screen->is_format_supported(xa->screen, PIPE_FORMAT_R8_UNORM,
+PIPE_TEXTURE_2D, 0,
+stype_bind[xa_type_yuv_component]))
+fdesc.format = PIPE_FORMAT_R8_UNORM;
+else
+fdesc.format = PIPE_FORMAT_L8_UNORM;
break;
  default:
fdesc.xa_

[Mesa-dev] [Bug 92020] wglCreatePbufferARB handle attrib error

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92020

--- Comment #3 from Emil Velikov  ---
Congratulations - seems like you found a bug in the emulator :-)

I would suggest taking a closer look into the emulator code.

To do things properly it should:
1. Create OpenGL context
2. Call wglGetProcAddress("wglGetExtensionsStringARB")
3. If non NULL, call the function to get a list of WGL extensions
4. Search for "WGL_ARB_render_texture" (the provider of WGL_TEXTURE_TARGET_ARB 
 and WGL_TEXTURE_FORMAT_ARB)
5. Act depending on its presence.

Here are a couple of links which should be useful

[1] https://www.opengl.org/wiki/Load_OpenGL_Functions#Windows_2
[2] https://www.opengl.org/registry/specs/ARB/wgl_render_texture.txt


If you want to add support for the extension in mesa that'll be great. Check
the Developer info [3] page for more details.


Cheers,
Emil

[3] http://www.mesa3d.org/devinfo.html

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 92020] wglCreatePbufferARB handle attrib error

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92020

--- Comment #4 from Emil Velikov  ---
While one's in the emulator they could also fix the strstr in
wglGetExtentionsProcAddress.

Currently it will trigger whenever it finds FooBar, even if it's looking for
Foo.
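
A minimal sketch of an exact-match lookup that avoids the strstr issue noted
above, assuming the extension list is a single space-separated string. The
function name has_extension is hypothetical and is not the emulator's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true only if `name` appears in `ext_list` as a complete
 * space-separated token, so looking for "Foo" does not match "FooBar".
 * The caller must pass a non-empty name. */
bool has_extension(const char *ext_list, const char *name)
{
    const char *p = ext_list;
    size_t len;

    assert(name != NULL && *name != '\0');
    if (ext_list == NULL)
        return false;

    len = strlen(name);
    while ((p = strstr(p, name)) != NULL) {
        /* A real match starts at the beginning or right after a space... */
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        /* ...and ends at a space or at the end of the string. */
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');

        if (starts_token && ends_token)
            return true;
        p += len;   /* substring hit only, keep scanning */
    }
    return false;
}
```

The same check would apply to the string returned in step 3 of the earlier
comment when searching for "WGL_ARB_render_texture".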



Re: [Mesa-dev] [PATCH 1/3] glsl: shader-enum to name debug fxns

2015-09-16 Thread Emil Velikov
Hi Rob,

On 16 September 2015 at 00:33, Rob Clark  wrote:

> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
> index ed9848c..2a719a0 100644
> --- a/src/mesa/Makefile.sources
> +++ b/src/mesa/Makefile.sources
> @@ -523,7 +523,9 @@ PROGRAM_FILES = \
> program/sampler.h \
> program/string_to_uint_map.cpp \
> program/symbol_table.c \
> -   program/symbol_table.h
> +   program/symbol_table.h \
> +   ../glsl/shader_enums.c \
> +   ../glsl/shader_enums.h
>
I'm not too sure if this will work with the automake out-of-tree
builds and/or scons. Can you please give them a try ?

Thanks
Emil


Re: [Mesa-dev] [PATCH 1/3] glsl: shader-enum to name debug fxns

2015-09-16 Thread Rob Clark
On Wed, Sep 16, 2015 at 11:22 AM, Emil Velikov  wrote:
> Hi Rob,
>
> On 16 September 2015 at 00:33, Rob Clark  wrote:
>
>> diff --git a/src/mesa/Makefile.sources b/src/mesa/Makefile.sources
>> index ed9848c..2a719a0 100644
>> --- a/src/mesa/Makefile.sources
>> +++ b/src/mesa/Makefile.sources
>> @@ -523,7 +523,9 @@ PROGRAM_FILES = \
>> program/sampler.h \
>> program/string_to_uint_map.cpp \
>> program/symbol_table.c \
>> -   program/symbol_table.h
>> +   program/symbol_table.h \
>> +   ../glsl/shader_enums.c \
>> +   ../glsl/shader_enums.h
>>
> I'm not too sure if this will work with the automake out-of-tree
> builds and/or scons. Can you please give them a try ?

not sure about scons, but I always do out-of-tree automake builds, so
at least that works

BR,
-R

> Thanks
> Emil


[Mesa-dev] [PATCH v2 3/4] main/get: make KHR_debug enums available everywhere

2015-09-16 Thread Emil Velikov
From: Matthew Waters 

Move all the enums but CONTEXT_FLAGS. The spec seems quite explicit
about the latter (wrt OpenGL ES)

"In OpenGL ES versions prior to and including ES 3.1 there is no
CONTEXT_FLAGS state and therefore the CONTEXT_FLAG_DEBUG_BIT cannot
be queried."

v2 [Emil Velikov] Rebase.
v3 [Emil Velikov] Drop the CONTEXT_FLAGS hunk - not applicable for GLES

Signed-off-by: Matthew Waters 
Signed-off-by: Emil Velikov 
---
 src/mesa/main/get_hash_params.py | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index c06835a..02d4bea 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -124,6 +124,15 @@ descriptor=[
 
 # GL_EXT_texture_filter_anisotropic
   [ "MAX_TEXTURE_MAX_ANISOTROPY_EXT", 
"CONTEXT_FLOAT(Const.MaxTextureMaxAnisotropy), 
extra_EXT_texture_filter_anisotropic" ],
+
+# GL_KHR_debug (GL 4.3)/ GL_ARB_debug_output
+  [ "DEBUG_LOGGED_MESSAGES", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
+  [ "DEBUG_NEXT_LOGGED_MESSAGE_LENGTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
+  [ "MAX_DEBUG_LOGGED_MESSAGES", "CONST(MAX_DEBUG_LOGGED_MESSAGES), NO_EXTRA" 
],
+  [ "MAX_DEBUG_MESSAGE_LENGTH", "CONST(MAX_DEBUG_MESSAGE_LENGTH), NO_EXTRA" ],
+  [ "MAX_LABEL_LENGTH", "CONST(MAX_LABEL_LENGTH), NO_EXTRA" ],
+  [ "MAX_DEBUG_GROUP_STACK_DEPTH", "CONST(MAX_DEBUG_GROUP_STACK_DEPTH), 
NO_EXTRA" ],
+  [ "DEBUG_GROUP_STACK_DEPTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
 ]},
 
 # Enums in OpenGL and GLES1
@@ -776,15 +785,6 @@ descriptor=[
 # GL_ARB_robustness
   [ "RESET_NOTIFICATION_STRATEGY_ARB", "CONTEXT_ENUM(Const.ResetStrategy), 
NO_EXTRA" ],
 
-# GL_KHR_debug (GL 4.3)/ GL_ARB_debug_output
-  [ "DEBUG_LOGGED_MESSAGES", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
-  [ "DEBUG_NEXT_LOGGED_MESSAGE_LENGTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
-  [ "MAX_DEBUG_LOGGED_MESSAGES", "CONST(MAX_DEBUG_LOGGED_MESSAGES), NO_EXTRA" 
],
-  [ "MAX_DEBUG_MESSAGE_LENGTH", "CONST(MAX_DEBUG_MESSAGE_LENGTH), NO_EXTRA" ],
-  [ "MAX_LABEL_LENGTH", "CONST(MAX_LABEL_LENGTH), NO_EXTRA" ],
-  [ "MAX_DEBUG_GROUP_STACK_DEPTH", "CONST(MAX_DEBUG_GROUP_STACK_DEPTH), 
NO_EXTRA" ],
-  [ "DEBUG_GROUP_STACK_DEPTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
-
   [ "MAX_DUAL_SOURCE_DRAW_BUFFERS", 
"CONTEXT_INT(Const.MaxDualSourceDrawBuffers), extra_ARB_blend_func_extended" ],
 
 # GL_ARB_uniform_buffer_object
-- 
2.5.0



Re: [Mesa-dev] [PATCH] st/mesa: avoid integer overflows with buffers >= 512MB

2015-09-16 Thread Ilia Mirkin
On Wed, Sep 16, 2015 at 9:21 AM, Roland Scheidegger  wrote:
> Since there are no formats where block.bits isn't a multiple of 8 (and I
> wouldn't expect that to change), you could theoretically fix that by
> dividing by block.bits / 8 instead of multiplying base (or size) by 8.
> Ought to be faster at least on 32bit systems...

Why didn't I think of that... I even tried playing with a few other
expression orderings, and gave up.

> Unless you wanted to support >= 4GB buffers (I think though for that
> we're missing way more things).

No. In fact my piglit test explicitly skips situations that would
cause it to create >= 2G buffers.

> Either way though,
> Reviewed-by: Roland Scheidegger 

Thanks!

>
> Am 16.09.2015 um 01:32 schrieb Ilia Mirkin:
>> This fixes failures with the newly-submitted max-size texture buffer
>> piglit test for GPUs exposing >= 128M max texels.
>>
>> Signed-off-by: Ilia Mirkin 
>> Cc: "10.6 11.0" 
>> ---
>>  src/mesa/state_tracker/st_atom_texture.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/state_tracker/st_atom_texture.c 
>> b/src/mesa/state_tracker/st_atom_texture.c
>> index 31e0f6b..62312af 100644
>> --- a/src/mesa/state_tracker/st_atom_texture.c
>> +++ b/src/mesa/state_tracker/st_atom_texture.c
>> @@ -264,7 +264,7 @@ st_create_texture_sampler_view_from_stobj(struct 
>> pipe_context *pipe,
>> format);
>>
>> if (stObj->pt->target == PIPE_BUFFER) {
>> -  unsigned base, size;
>> +  uint64_t base, size;
>>unsigned f, n;
>>const struct util_format_description *desc
>>   = util_format_description(templ.format);
>>
>


Re: [Mesa-dev] mesa: Reduce libGL.so binary size by about 15%

2015-09-16 Thread Arlie Davis
The null check is safe to remove, for two reasons.  First, we're allocating
with calloc, so we know for sure that the entire structure is zero-filled.
Second, we're assigning every byte of the table, so we don't even need to
rely on zero-filling it.  (If this were a function that was frequently
called, I'd change it to use malloc instead of calloc -- but it isn't.)
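
A rough sketch of the table-driven fill, to illustrate why no per-entry NULL
check is needed. Everything here is a stand-in: the three-entry func_names
table, the proc_t and resolver_t types, and demo_resolve (which takes the
place of dlsym / GetProcAddress) are assumptions for illustration, not the
generated Mesa code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void (*proc_t)(void);
typedef proc_t (*resolver_t)(const char *symbol);

/* Hypothetical name table; it must be sorted by dispatch slot index. */
static const char *const func_names[] = { "NewList", "EndList", "CallList" };
#define FUNC_COUNT (sizeof(func_names) / sizeof(func_names[0]))

/* Stand-in target and resolver used in place of a real symbol lookup. */
void dummy_proc(void) {}

proc_t demo_resolve(const char *symbol)
{
    return strncmp(symbol, "gl", 2) == 0 ? dummy_proc : NULL;
}

/* One loop replaces the per-entry-point code clusters. Every slot is
 * written exactly once, so the result does not depend on calloc's
 * zero fill or on a "slot still empty" check. */
proc_t *create_table(resolver_t resolve, const char *prefix)
{
    char symbol[128];
    proc_t *table;
    size_t i;

    assert(resolve != NULL);
    table = calloc(FUNC_COUNT, sizeof(proc_t));
    if (table == NULL)
        return NULL;
    if (prefix == NULL)
        prefix = "";

    for (i = 0; i < FUNC_COUNT; i++) {
        snprintf(symbol, sizeof(symbol), "%s%s", prefix, func_names[i]);
        table[i] = resolve(symbol);   /* NULL when the lookup fails */
    }
    return table;
}
```

In the patch itself, slots whose lookup failed are then handled by the
existing __glapi_gentable_set_remaining_noop(disp) call after the loop.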

Will move the * and make the other */-related style changes.

Hmm, wasn't aware of xrange().  Will change to use that.

After making those edits, what's the next step to push / commit this?


On Wed, Sep 16, 2015 at 7:07 AM, Matt Turner  wrote:

> On Tue, Sep 15, 2015 at 10:38 AM, Arlie Davis  wrote:
> >
> > Hello!  I noticed an inefficiency in libGL.so, so I thought I'd take a
> > stab at fixing it.  This is my first patch submitted to mesa-dev, so
> > if I'm doing anything dumb, let me know.  I can't use git send-email,
> > but I've formatted the patch using git format-patch, which should
> > hopefully produce similar output.  The patch text (below) describes
> > the inefficiency and the improvement.
> >
> >
> >
> > From 0abde226eed1b9f6052193f36f6cdc060698f621 Mon Sep 17 00:00:00 2001
> > From: Arlie Davis 
> > Date: Tue, 15 Sep 2015 09:58:34 -0700
> > Subject: [PATCH] This patch significantly reduces the size of the
> libGL.so
> >  binary. It does not change the (externally visible) behavior of
> libGL.so at
> >  all.
> >
> > gl_gentable.py generates a function, _glapi_create_table_from_handle.
> > This function allocates a large dispatch table, consisting of 1300 or so
> > function pointers, and fills this dispatch table by doing symbol lookups
> > on a given shared library.  Previously, gl_gentable.py would generate a
> > single, very large _glapi_create_table_from_handle function, with a short
> > cluster of lines for each entry point (function).  The idiom it generates
> > was a NULL check, a call to snprintf, a call to dlsym / GetProcAddress,
> > and then a store into the dispatch table.  Since this function processes
> > a large number of entry points, this code is duplicated many times over.
> >
> > We can encode the same information much more compactly, by using a lookup
> > table.  The previous total size of _glapi_create_table_from_handle on x64
> > was 125848 bytes.  By using a lookup table, the size of
> > _glapi_create_table_from_handle (and the related lookup tables) is
> reduced
> > to 10840 bytes.  In other words, this enormous function is reduced by
> 91%.
> > The size of the entire libGL.so binary (measured when stripped) itself
> drops
> > by 15%.
> >
> > So the purpose of this change is to reduce the binary size, which frees
> up
> > disk space, memory, etc.
> > ---
>
> Seems like a nice change. size lib/libGL.so.1.2.0 on my system shows
>
>    text    data     bss     dec     hex filename
>  604031   11360    2792  618183   96ec7 lib/libGL.so.1.2.0 before
>  490751   21920    2792  515463   7dd87 lib/libGL.so.1.2.0 after
>
> Feel free to include that in the commit message.
>
> >  src/mapi/glapi/gen/gl_gentable.py | 56
> ---
> >  1 file changed, 40 insertions(+), 16 deletions(-)
> >
> > diff --git a/src/mapi/glapi/gen/gl_gentable.py
> b/src/mapi/glapi/gen/gl_gentable.py
> > index 1b3eb72..2563b6b 100644
> > --- a/src/mapi/glapi/gen/gl_gentable.py
> > +++ b/src/mapi/glapi/gen/gl_gentable.py
> > @@ -113,6 +113,9 @@ __glapi_gentable_set_remaining_noop(struct
> _glapi_table *disp) {
> >  dispatch[i] = p.v;
> >  }
> >
> > +"""
> > +
> > +footer = """
> >  struct _glapi_table *
> >  _glapi_create_table_from_handle(void *handle, const char
> *symbol_prefix) {
> >  struct _glapi_table *disp =
> calloc(_glapi_get_dispatch_table_size(), sizeof(_glapi_proc));
> > @@ -123,27 +126,27 @@ _glapi_create_table_from_handle(void *handle,
> const char *symbol_prefix) {
> >
> >  if(symbol_prefix == NULL)
> >  symbol_prefix = "";
> > -"""
> >
> > -footer = """
> > -__glapi_gentable_set_remaining_noop(disp);
> > -
> > -return disp;
> > -}
> > -"""
> > +/* Note: This code relies on _glapi_table_func_names being sorted
> by the
> > +   entry point index of each function. */
>
> Mesa style puts the */ on its own line for multiline comments.
>
> > +for (int func_index = 0; func_index < GLAPI_TABLE_COUNT;
> ++func_index) {
> > +const char* name = _glapi_table_func_names[func_index];
>
> * goes with the var name, not the type. That is, "char* " should be "char
> *"
>
> > +void ** procp = &((void **)disp)[func_index];
> >
> > -body_template = """
> > -if(!disp->%(name)s) {
>
> We're removing the null check. Is that okay to do?
>
> > -void ** procp = (void **) &disp->%(name)s;
> > -snprintf(symboln, sizeof(symboln), "%%s%(entry_point)s",
> symbol_prefix);
> > +snprintf(symboln, sizeof(symboln), \"%s%s\", symbol_prefix,
> name);
> >  #ifdef _WIN32
> >  *procp = GetProcAddress(handle, symboln);
> >  #else
> >  *procp = dlsy

Re: [Mesa-dev] [PATCH 6/6] mesa/teximage: reuse compressed format utility functions for base_format

2015-09-16 Thread Nanley Chery
On Tue, Sep 15, 2015 at 3:01 PM, Anuj Phogat  wrote:

>
>
> On Fri, Aug 28, 2015 at 7:50 AM, Nanley Chery 
> wrote:
>
>> From: Nanley Chery 
>>
>> Reuse utility functions instead of reimplementing the same logic.
>>
>> * _mesa_is_compressed_format() performs the required checking to
>>   determine format support in the current context.
>> * _mesa_gl_compressed_format_base_format() returns the base format.
>>
>> Signed-off-by: Nanley Chery 
>> ---
>>  src/mesa/main/teximage.c | 150
>> ++-
>>  1 file changed, 5 insertions(+), 145 deletions(-)
>>
>> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
>> index 39d1281..8913a72 100644
>> --- a/src/mesa/main/teximage.c
>> +++ b/src/mesa/main/teximage.c
>> @@ -252,44 +252,11 @@ _mesa_base_tex_format( struct gl_context *ctx,
>> GLint internalFormat )
>>; /* fallthrough */
>> }
>>
>> -   if (ctx->Extensions.TDFX_texture_compression_FXT1) {
>> -  switch (internalFormat) {
>> -  case GL_COMPRESSED_RGB_FXT1_3DFX:
>> - return GL_RGB;
>> -  case GL_COMPRESSED_RGBA_FXT1_3DFX:
>> - return GL_RGBA;
>> -  default:
>> - ; /* fallthrough */
>> -  }
>> -   }
>> -
>> -   /* Assume that the ANGLE flag will always be set if the EXT flag is
>> set.
>> -*/
>> -   if (ctx->Extensions.ANGLE_texture_compression_dxt) {
>> -  switch (internalFormat) {
>> -  case GL_COMPRESSED_RGB_S3TC_DXT1_EXT:
>> - return GL_RGB;
>> -  case GL_COMPRESSED_RGBA_S3TC_DXT1_EXT:
>> -  case GL_COMPRESSED_RGBA_S3TC_DXT3_EXT:
>> -  case GL_COMPRESSED_RGBA_S3TC_DXT5_EXT:
>> - return GL_RGBA;
>> -  default:
>> - ; /* fallthrough */
>> -  }
>> -   }
>> -
>> -   if (_mesa_is_desktop_gl(ctx)
>> -   && ctx->Extensions.ANGLE_texture_compression_dxt) {
>> -  switch (internalFormat) {
>> -  case GL_RGB_S3TC:
>> -  case GL_RGB4_S3TC:
>> - return GL_RGB;
>> -  case GL_RGBA_S3TC:
>> -  case GL_RGBA4_S3TC:
>> - return GL_RGBA;
>> -  default:
>> - ; /* fallthrough */
>> -  }
>> +   if (_mesa_is_compressed_format(ctx, internalFormat)) {
>> +  GLenum base_compressed =
>> + _mesa_gl_compressed_format_base_format(internalFormat);
>> +  if (base_compressed)
>> +return base_compressed;
>> }
>>
>> if (ctx->Extensions.MESA_ycbcr_texture) {
>> @@ -367,16 +334,10 @@ _mesa_base_tex_format( struct gl_context *ctx,
>> GLint internalFormat )
>>case GL_SRGB8_EXT:
>>case GL_COMPRESSED_SRGB_EXT:
>>   return GL_RGB;
>> -  case GL_COMPRESSED_SRGB_S3TC_DXT1_EXT:
>> - return ctx->Extensions.EXT_texture_compression_s3tc ? GL_RGB :
>> -1;
>>case GL_SRGB_ALPHA_EXT:
>>case GL_SRGB8_ALPHA8_EXT:
>>case GL_COMPRESSED_SRGB_ALPHA_EXT:
>>   return GL_RGBA;
>> -  case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT:
>> -  case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT3_EXT:
>> -  case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT:
>> - return ctx->Extensions.EXT_texture_compression_s3tc ? GL_RGBA :
>> -1;
>>case GL_SLUMINANCE_ALPHA_EXT:
>>case GL_SLUMINANCE8_ALPHA8_EXT:
>>case GL_COMPRESSED_SLUMINANCE_ALPHA_EXT:
>> @@ -521,111 +482,10 @@ _mesa_base_tex_format( struct gl_context *ctx,
>> GLint internalFormat )
>>}
>> }
>>
>> -   if (ctx->Extensions.ARB_texture_compression_rgtc) {
>> -  switch (internalFormat) {
>> -  case GL_COMPRESSED_RED_RGTC1:
>> -  case GL_COMPRESSED_SIGNED_RED_RGTC1:
>> - return GL_RED;
>> -  case GL_COMPRESSED_RG_RGTC2:
>> -  case GL_COMPRESSED_SIGNED_RG_RGTC2:
>> - return GL_RG;
>> -  default:
>> - ; /* fallthrough */
>> -  }
>> -   }
>> -
>> -   if (_mesa_is_desktop_gl(ctx) &&
>> -  ctx->Extensions.EXT_texture_compression_latc) {
>> -  switch (internalFormat) {
>> -  case GL_COMPRESSED_LUMINANCE_LATC1_EXT:
>> -  case GL_COMPRESSED_SIGNED_LUMINANCE_LATC1_EXT:
>> - return GL_LUMINANCE;
>> -  case GL_COMPRESSED_LUMINANCE_ALPHA_LATC2_EXT:
>> -  case GL_COMPRESSED_SIGNED_LUMINANCE_ALPHA_LATC2_EXT:
>> - return GL_LUMINANCE_ALPHA;
>> -  default:
>> - ; /* fallthrough */
>> -  }
>> -   }
>> -
>> -   if (_mesa_is_desktop_gl(ctx) &&
>> -  ctx->Extensions.ATI_texture_compression_3dc) {
>> -  switch (internalFormat) {
>> -  case GL_COMPRESSED_LUMINANCE_ALPHA_3DC_ATI:
>> - return GL_LUMINANCE_ALPHA;
>> -  default:
>> - ; /* fallthrough */
>> -  }
>> -   }
>> -
>> -   if (_mesa_is_gles(ctx) &&
>> -  ctx->Extensions.OES_compressed_ETC1_RGB8_texture) {
>> -  switch (internalFormat) {
>> -  case GL_ETC1_RGB8_OES:
>> - return GL_RGB;
>> -  default:
>> - ; /* fallthrough */
>> -  }
>> -   }
>> -
>> if (ctx->Extensions.KHR_texture_compression_astc_ldr &&
>> _mesa_is_astc_format(internalFormat))
>>   r

Re: [Mesa-dev] [PATCH 6/6] mesa/teximage: reuse compressed format utility functions for base_format

2015-09-16 Thread Nanley Chery
On Wed, Sep 16, 2015 at 10:15 AM, Nanley Chery 
wrote:

>
>
> On Tue, Sep 15, 2015 at 3:01 PM, Anuj Phogat 
> wrote:
>
>>
>>
>> On Fri, Aug 28, 2015 at 7:50 AM, Nanley Chery 
>> wrote:
>>
>>> From: Nanley Chery 
>>>
>>> Reuse utility functions instead of reimplementing the same logic.
>>>
>>> * _mesa_is_compressed_format() performs the required checking to
>>>   determine format support in the current context.
>>> * _mesa_gl_compressed_format_base_format() returns the base format.
>>>
>>> Signed-off-by: Nanley Chery 
>>> ---
>>>  src/mesa/main/teximage.c | 150
>>> ++-
>>>  1 file changed, 5 insertions(+), 145 deletions(-)
>>>
>>> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
>>> index 39d1281..8913a72 100644
>>> --- a/src/mesa/main/teximage.c
>>> +++ b/src/mesa/main/teximage.c
>>> @@ -252,44 +252,11 @@ _mesa_base_tex_format( struct gl_context *ctx,
>>> GLint internalFormat )
>>>; /* fallthrough */
>>> }
>>>
>>> -   if (ctx->Extensions.TDFX_texture_compression_FXT1) {
>>> -  switch (internalFormat) {
>>> -  case GL_COMPRESSED_RGB_FXT1_3DFX:
>>> - return GL_RGB;
>>> -  case GL_COMPRESSED_RGBA_FXT1_3DFX:
>>> - return GL_RGBA;
>>> -  default:
>>> - ; /* fallthrough */
>>> -  }
>>> -   }
>>> -
>>> -   /* Assume that the ANGLE flag will always be set if the EXT flag is
>>> set.
>>> -*/
>>> -   if (ctx->Extensions.ANGLE_texture_compression_dxt) {
>>> -  switch (internalFormat) {
>>> -  case GL_COMPRESSED_RGB_S3TC_DXT1_EXT:
>>> - return GL_RGB;
>>> -  case GL_COMPRESSED_RGBA_S3TC_DXT1_EXT:
>>> -  case GL_COMPRESSED_RGBA_S3TC_DXT3_EXT:
>>> -  case GL_COMPRESSED_RGBA_S3TC_DXT5_EXT:
>>> - return GL_RGBA;
>>> -  default:
>>> - ; /* fallthrough */
>>> -  }
>>> -   }
>>> -
>>> -   if (_mesa_is_desktop_gl(ctx)
>>> -   && ctx->Extensions.ANGLE_texture_compression_dxt) {
>>> -  switch (internalFormat) {
>>> -  case GL_RGB_S3TC:
>>> -  case GL_RGB4_S3TC:
>>> - return GL_RGB;
>>> -  case GL_RGBA_S3TC:
>>> -  case GL_RGBA4_S3TC:
>>> - return GL_RGBA;
>>> -  default:
>>> - ; /* fallthrough */
>>> -  }
>>> +   if (_mesa_is_compressed_format(ctx, internalFormat)) {
>>> +  GLenum base_compressed =
>>> + _mesa_gl_compressed_format_base_format(internalFormat);
>>> +  if (base_compressed)
>>> +return base_compressed;
>>> }
>>>
>>> if (ctx->Extensions.MESA_ycbcr_texture) {
>>> @@ -367,16 +334,10 @@ _mesa_base_tex_format( struct gl_context *ctx,
>>> GLint internalFormat )
>>>case GL_SRGB8_EXT:
>>>case GL_COMPRESSED_SRGB_EXT:
>>>   return GL_RGB;
>>> -  case GL_COMPRESSED_SRGB_S3TC_DXT1_EXT:
>>> - return ctx->Extensions.EXT_texture_compression_s3tc ? GL_RGB :
>>> -1;
>>>case GL_SRGB_ALPHA_EXT:
>>>case GL_SRGB8_ALPHA8_EXT:
>>>case GL_COMPRESSED_SRGB_ALPHA_EXT:
>>>   return GL_RGBA;
>>> -  case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT:
>>> -  case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT3_EXT:
>>> -  case GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT:
>>> - return ctx->Extensions.EXT_texture_compression_s3tc ? GL_RGBA
>>> : -1;
>>>case GL_SLUMINANCE_ALPHA_EXT:
>>>case GL_SLUMINANCE8_ALPHA8_EXT:
>>>case GL_COMPRESSED_SLUMINANCE_ALPHA_EXT:
>>> @@ -521,111 +482,10 @@ _mesa_base_tex_format( struct gl_context *ctx,
>>> GLint internalFormat )
>>>}
>>> }
>>>
>>> -   if (ctx->Extensions.ARB_texture_compression_rgtc) {
>>> -  switch (internalFormat) {
>>> -  case GL_COMPRESSED_RED_RGTC1:
>>> -  case GL_COMPRESSED_SIGNED_RED_RGTC1:
>>> - return GL_RED;
>>> -  case GL_COMPRESSED_RG_RGTC2:
>>> -  case GL_COMPRESSED_SIGNED_RG_RGTC2:
>>> - return GL_RG;
>>> -  default:
>>> - ; /* fallthrough */
>>> -  }
>>> -   }
>>> -
>>> -   if (_mesa_is_desktop_gl(ctx) &&
>>> -  ctx->Extensions.EXT_texture_compression_latc) {
>>> -  switch (internalFormat) {
>>> -  case GL_COMPRESSED_LUMINANCE_LATC1_EXT:
>>> -  case GL_COMPRESSED_SIGNED_LUMINANCE_LATC1_EXT:
>>> - return GL_LUMINANCE;
>>> -  case GL_COMPRESSED_LUMINANCE_ALPHA_LATC2_EXT:
>>> -  case GL_COMPRESSED_SIGNED_LUMINANCE_ALPHA_LATC2_EXT:
>>> - return GL_LUMINANCE_ALPHA;
>>> -  default:
>>> - ; /* fallthrough */
>>> -  }
>>> -   }
>>> -
>>> -   if (_mesa_is_desktop_gl(ctx) &&
>>> -  ctx->Extensions.ATI_texture_compression_3dc) {
>>> -  switch (internalFormat) {
>>> -  case GL_COMPRESSED_LUMINANCE_ALPHA_3DC_ATI:
>>> - return GL_LUMINANCE_ALPHA;
>>> -  default:
>>> - ; /* fallthrough */
>>> -  }
>>> -   }
>>> -
>>> -   if (_mesa_is_gles(ctx) &&
>>> -  ctx->Extensions.OES_compressed_ETC1_RGB8_texture) {
>>> -  switch (internalFormat) {
>>> -  case GL_ETC1_RGB8_OES:
>>> - return GL_RG

[Mesa-dev] [Bug 92022] st/va: add initial support for Video Post Processing and Export/Import of VaSurface

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92022

Julien Isorce  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gb.de...@gmail.com,
                   |                            |vjaq...@igalia.com

--- Comment #1 from Julien Isorce  ---
I split the patches:
https://github.com/CapOM/mesa/commits/wip_export_import_and_vpp

From newer to older:
st/va: implement dmabuf export
st/va: implement VaDeriveImage
st/va: implement dmabuf import for VaCreateSurfaces2
st/va: add initial Video Post Processing support
st/va: implement VaCreateSurfaces2 and VaQuerySurfaceAttributes
st/va: in VaPutImage only destroy previous buffer if pipe->create_video_buffer succeed
st/va: properly defines VAImageFormat formats and improve VaCreateImage
nvc0: fix crash when nv50_miptree_from_handle

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] nv50, nvc0: detect underlying resource changes and update tic

2015-09-16 Thread Ilia Mirkin
When updating texture buffers, we might end up replacing the whole
buffer. Check that the tic address matches the resource address, and if
not, update the tic and reupload it.

This fixes:
  arb_direct_state_access-texture-buffer
  arb_texture_buffer_object-data-sync

Signed-off-by: Ilia Mirkin 
Cc: "11.0" 
---

This seems like a better version of the previous attempt to fix this,
since it no longer relies on the sampler view being bound. And it
keeps the tic update logic along with the other tic logic.
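
For readers following the address check, here is a minimal standalone
sketch (not part of the patch) of how a buffer address can be split across
two descriptor words and compared against the current resource address.
The struct tic_words type and both function names are hypothetical. Only
the layout is taken from the diff: tic[1] holds the low 32 bits and the low
byte of tic[2] holds bits 32..39. The 40-bit address limit is an assumption
of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two descriptor words the patch touches. */
struct tic_words {
   uint32_t lo;   /* plays the role of tic[1]: address bits 0..31 */
   uint32_t hi;   /* plays the role of tic[2]: low byte = address bits 32..39 */
};

/* True if the descriptor already points at `address`. */
static bool
tic_matches(const struct tic_words *t, uint64_t address)
{
   return t->lo == (uint32_t)address &&
          (t->hi & 0xff) == (uint32_t)(address >> 32);
}

/* Rewrite the address fields, leaving the upper 24 bits of `hi` alone.
 * Returns true if anything changed, in which case a caller would have to
 * re-upload the descriptor.
 */
static bool
tic_set_address(struct tic_words *t, uint64_t address)
{
   /* Assumption for this sketch: addresses fit in 40 bits. */
   assert((address >> 40) == 0);

   if (tic_matches(t, address))
      return false;

   t->lo = (uint32_t)address;
   t->hi &= 0xffffff00u;                 /* clear only the address byte */
   t->hi |= (uint32_t)(address >> 32);   /* then store bits 32..39 */
   return true;
}
```

Returning a "changed" flag keeps the comparison and the update in one
place, so a validation loop only has to invalidate the entry when the flag
is set.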

 src/gallium/drivers/nouveau/nv50/nv50_tex.c | 18 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 19 +++
 2 files changed, 37 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_tex.c 
b/src/gallium/drivers/nouveau/nv50/nv50_tex.c
index fc6374d..70f8928 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_tex.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_tex.c
@@ -221,6 +221,23 @@ nv50_create_texture_view(struct pipe_context *pipe,
return &view->pipe;
 }
 
+static void
+nv50_update_tic(struct nv50_context *nv50, struct nv50_tic_entry *tic,
+struct nv04_resource *res)
+{
+   if (res->base.target != PIPE_BUFFER)
+  return;
+   if (tic->tic[1] == (uint32_t)res->address &&
+   (tic->tic[2] & 0xff) == res->address >> 32)
+  return;
+
+   nv50_screen_tic_unlock(nv50->screen, tic);
+   tic->id = -1;
+   tic->tic[1] = res->address;
+   tic->tic[2] &= 0xffffff00;
+   tic->tic[2] |= res->address >> 32;
+}
+
 static bool
 nv50_validate_tic(struct nv50_context *nv50, int s)
 {
@@ -240,6 +257,7 @@ nv50_validate_tic(struct nv50_context *nv50, int s)
  continue;
   }
   res = &nv50_miptree(tic->pipe.texture)->base;
+  nv50_update_tic(nv50, tic, res);
 
   if (tic->id < 0) {
  tic->id = nv50_screen_tic_alloc(nv50->screen, tic);
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index d19082e..0174407 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
@@ -226,6 +226,23 @@ nvc0_create_texture_view(struct pipe_context *pipe,
return &view->pipe;
 }
 
+static void
+nvc0_update_tic(struct nvc0_context *nvc0, struct nv50_tic_entry *tic,
+struct nv04_resource *res)
+{
+   if (res->base.target != PIPE_BUFFER)
+  return;
+   if (tic->tic[1] == (uint32_t)res->address &&
+   (tic->tic[2] & 0xff) == res->address >> 32)
+  return;
+
+   nvc0_screen_tic_unlock(nvc0->screen, tic);
+   tic->id = -1;
+   tic->tic[1] = res->address;
+   tic->tic[2] &= 0xffffff00;
+   tic->tic[2] |= res->address >> 32;
+}
+
 static bool
 nvc0_validate_tic(struct nvc0_context *nvc0, int s)
 {
@@ -247,6 +264,7 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
  continue;
   }
   res = nv04_resource(tic->pipe.texture);
+  nvc0_update_tic(nvc0, tic, res);
 
   if (tic->id < 0) {
  tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
@@ -313,6 +331,7 @@ nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
  continue;
   }
   res = nv04_resource(tic->pipe.texture);
+  nvc0_update_tic(nvc0, tic, res);
 
   if (tic->id < 0) {
  tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
-- 
2.4.6



[Mesa-dev] [Bug 92022] st/va: add initial support for Video Post Processing and Export/Import of VaSurface

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92022

--- Comment #2 from Ilia Mirkin  ---
(In reply to Julien Isorce from comment #0)
> Just letting you know that I made a first attempt to add VPP and
> VaAcquireBufferHandle(dmabuf) to st/va:
> 
> https://github.com/CapOM/mesa/commits/wip_export_import_and_vpp
> 
> I'll send patches to the mesa-dev mailing list once they are ready. If you
> have any remarks, let me know.

The proper way to do all this is by sending emails, not filing bugs.



[Mesa-dev] [Bug 92022] st/va: add initial support for Video Post Processing and Export/Import of VaSurface

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92022

Ilia Mirkin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|imir...@alum.mit.edu        |



[Mesa-dev] [PATCH 4/4] freedreno/ir3: lower txp/clamp in NIR

2015-09-16 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
---
 .../drivers/freedreno/ir3/ir3_compiler_nir.c   | 53 --
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c 
b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
index d72464f..3738721 100644
--- a/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
+++ b/src/gallium/drivers/freedreno/ir3/ir3_compiler_nir.c
@@ -127,8 +127,8 @@ struct ir3_compile {
 static struct ir3_instruction * create_immed(struct ir3_block *block, uint32_t 
val);
 static struct ir3_block * get_block(struct ir3_compile *ctx, nir_block 
*nblock);
 
-static struct nir_shader *to_nir(const struct tgsi_token *tokens,
-   struct ir3_shader_variant *so)
+static struct nir_shader *to_nir(struct ir3_compile *ctx,
+   const struct tgsi_token *tokens, struct ir3_shader_variant *so)
 {
struct nir_shader_compiler_options options = {
.lower_fpow = true,
@@ -138,8 +138,31 @@ static struct nir_shader *to_nir(const struct tgsi_token 
*tokens,
.lower_ffract = true,
.native_integers = true,
};
+   unsigned lower_txp, saturate_s, saturate_t, saturate_r;
bool progress;
 
+   switch (so->type) {
+   case SHADER_FRAGMENT:
+   case SHADER_COMPUTE:
+   saturate_s = so->key.fsaturate_s;
+   saturate_t = so->key.fsaturate_t;
+   saturate_r = so->key.fsaturate_r;
+   break;
+   case SHADER_VERTEX:
+   saturate_s = so->key.vsaturate_s;
+   saturate_t = so->key.vsaturate_t;
+   saturate_r = so->key.vsaturate_r;
+   break;
+   }
+
+   if (ctx->compiler->gpu_id >= 400) {
+   /* a4xx seems to have *no* sam.p */
+   lower_txp = ~0;  /* lower all txp */
+   } else {
+   /* a3xx just needs to avoid sam.p for 3d tex */
+   lower_txp = (1 << GLSL_SAMPLER_DIM_3D);
+   }
+
struct nir_shader *s = tgsi_to_nir(tokens, &options);
 
if (fd_mesa_debug & FD_DBG_OPTMSGS) {
@@ -155,6 +178,8 @@ static struct nir_shader *to_nir(const struct tgsi_token 
*tokens,
} else if (s->stage == MESA_SHADER_FRAGMENT) {
nir_lower_clip_fs(s, so->key.ucp_enables);
}
+   nir_lower_tex_projector(s, lower_txp, saturate_s,
+   saturate_t, saturate_r);
nir_lower_idiv(s);
nir_lower_load_const_to_scalar(s);
 
@@ -196,28 +221,6 @@ lower_tgsi(struct ir3_compile *ctx, const struct 
tgsi_token *tokens,
.color_two_side = so->key.color_two_side,
};
 
-   switch (so->type) {
-   case SHADER_FRAGMENT:
-   case SHADER_COMPUTE:
-   lconfig.saturate_s = so->key.fsaturate_s;
-   lconfig.saturate_t = so->key.fsaturate_t;
-   lconfig.saturate_r = so->key.fsaturate_r;
-   break;
-   case SHADER_VERTEX:
-   lconfig.saturate_s = so->key.vsaturate_s;
-   lconfig.saturate_t = so->key.vsaturate_t;
-   lconfig.saturate_r = so->key.vsaturate_r;
-   break;
-   }
-
-   if (ctx->compiler->gpu_id >= 400) {
-   /* a4xx seems to have *no* sam.p */
-   lconfig.lower_TXP = ~0;  /* lower all txp */
-   } else {
-   /* a3xx just needs to avoid sam.p for 3d tex */
-   lconfig.lower_TXP = (1 << TGSI_TEXTURE_3D);
-   }
-
return tgsi_transform_lowering(&lconfig, tokens, &info);
 }
 
@@ -257,7 +260,7 @@ compile_init(struct ir3_compiler *compiler,
lowered_tokens = lower_tgsi(ctx, tokens, so);
if (!lowered_tokens)
lowered_tokens = tokens;
-   ctx->s = to_nir(lowered_tokens, so);
+   ctx->s = to_nir(ctx, lowered_tokens, so);
 
if (lowered_tokens != tokens)
free((void *)lowered_tokens);
-- 
2.4.3



[Mesa-dev] [PATCH 1/4] nir/lower_tex_proj: split out project_src() helper

2015-09-16 Thread Rob Clark
From: Rob Clark 

Split this out to reduce noise in later patches.

Signed-off-by: Rob Clark 
---
 src/glsl/nir/nir_lower_tex_projector.c | 146 +
 1 file changed, 77 insertions(+), 69 deletions(-)

diff --git a/src/glsl/nir/nir_lower_tex_projector.c 
b/src/glsl/nir/nir_lower_tex_projector.c
index 9afa42f..11fcd61 100644
--- a/src/glsl/nir/nir_lower_tex_projector.c
+++ b/src/glsl/nir/nir_lower_tex_projector.c
@@ -30,6 +30,82 @@
 #include "nir.h"
 #include "nir_builder.h"
 
+static void
+project_src(nir_builder *b, nir_tex_instr *tex)
+{
+   /* Find the projector in the srcs list, if present. */
+   unsigned proj_index;
+   for (proj_index = 0; proj_index < tex->num_srcs; proj_index++) {
+  if (tex->src[proj_index].src_type == nir_tex_src_projector)
+ break;
+   }
+   if (proj_index == tex->num_srcs)
+  return;
+
+   b->cursor = nir_before_instr(&tex->instr);
+
+   nir_ssa_def *inv_proj =
+  nir_frcp(b, nir_ssa_for_src(b, tex->src[proj_index].src, 1));
+
+   /* Walk through the sources projecting the arguments. */
+   for (unsigned i = 0; i < tex->num_srcs; i++) {
+  switch (tex->src[i].src_type) {
+  case nir_tex_src_coord:
+  case nir_tex_src_comparitor:
+ break;
+  default:
+ continue;
+  }
+  nir_ssa_def *unprojected =
+ nir_ssa_for_src(b, tex->src[i].src, nir_tex_instr_src_size(tex, i));
+  nir_ssa_def *projected = nir_fmul(b, unprojected, inv_proj);
+
+  /* Array indices don't get projected, so make an new vector with the
+   * coordinate's array index untouched.
+   */
+  if (tex->is_array && tex->src[i].src_type == nir_tex_src_coord) {
+ switch (tex->coord_components) {
+ case 4:
+projected = nir_vec4(b,
+ nir_channel(b, projected, 0),
+ nir_channel(b, projected, 1),
+ nir_channel(b, projected, 2),
+ nir_channel(b, unprojected, 3));
+break;
+ case 3:
+projected = nir_vec3(b,
+ nir_channel(b, projected, 0),
+ nir_channel(b, projected, 1),
+ nir_channel(b, unprojected, 2));
+break;
+ case 2:
+projected = nir_vec2(b,
+ nir_channel(b, projected, 0),
+ nir_channel(b, unprojected, 1));
+break;
+ default:
+unreachable("bad texture coord count for array");
+break;
+ }
+  }
+
+  nir_instr_rewrite_src(&tex->instr,
+&tex->src[i].src,
+nir_src_for_ssa(projected));
+   }
+
+   /* Now move the later tex sources down the array so that the projector
+* disappears.
+*/
+   nir_instr_rewrite_src(&tex->instr, &tex->src[proj_index].src,
+ NIR_SRC_INIT);
+   for (unsigned i = proj_index + 1; i < tex->num_srcs; i++) {
+  tex->src[i-1].src_type = tex->src[i].src_type;
+  nir_instr_move_src(&tex->instr, &tex->src[i-1].src, &tex->src[i].src);
+   }
+   tex->num_srcs--;
+}
+
 static bool
 nir_lower_tex_projector_block(nir_block *block, void *void_state)
 {
@@ -40,76 +116,8 @@ nir_lower_tex_projector_block(nir_block *block, void 
*void_state)
  continue;
 
   nir_tex_instr *tex = nir_instr_as_tex(instr);
-  b->cursor = nir_before_instr(&tex->instr);
 
-  /* Find the projector in the srcs list, if present. */
-  unsigned proj_index;
-  for (proj_index = 0; proj_index < tex->num_srcs; proj_index++) {
- if (tex->src[proj_index].src_type == nir_tex_src_projector)
-break;
-  }
-  if (proj_index == tex->num_srcs)
- continue;
-  nir_ssa_def *inv_proj =
- nir_frcp(b, nir_ssa_for_src(b, tex->src[proj_index].src, 1));
-
-  /* Walk through the sources projecting the arguments. */
-  for (unsigned i = 0; i < tex->num_srcs; i++) {
- switch (tex->src[i].src_type) {
- case nir_tex_src_coord:
- case nir_tex_src_comparitor:
-break;
- default:
-continue;
- }
- nir_ssa_def *unprojected =
-nir_ssa_for_src(b, tex->src[i].src, nir_tex_instr_src_size(tex, 
i));
- nir_ssa_def *projected = nir_fmul(b, unprojected, inv_proj);
-
- /* Array indices don't get projected, so make an new vector with the
-  * coordinate's array index untouched.
-  */
- if (tex->is_array && tex->src[i].src_type == nir_tex_src_coord) {
-switch (tex->coord_components) {
-case 4:
-   projected = nir_vec4(b,
-nir_channel(b, projected, 0),
-nir_channel(b, projected, 1),
-nir_channel(b, project

[Mesa-dev] [PATCH 3/4] nir/lower_tex_proj: add support to clamp texture coords

2015-09-16 Thread Rob Clark
From: Rob Clark 

Some hardware needs to clamp texture coordinates to [0.0, 1.0] in the
shader to emulate GL_CLAMP.  This is added to lower_tex_proj since, in
the case of projected coords, the clamping needs to happen *after*
projection.

Signed-off-by: Rob Clark 
---
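
To illustrate why the ordering matters, here is a small scalar sketch (not
part of the patch) that projects a coordinate and then clamps the selected
components. It works on plain floats rather than NIR SSA values. The names
build_sat_mask, saturate01 and project_then_saturate are hypothetical, and
array-index handling is left out for brevity.

```c
#include <assert.h>

/* Turn the per-sampler bitfields into a per-component mask:
 * bit 0 = .x (s), bit 1 = .y (t), bit 2 = .z (r).
 */
static unsigned
build_sat_mask(unsigned sampler_index, unsigned saturate_s,
               unsigned saturate_t, unsigned saturate_r)
{
   unsigned mask = 0;

   assert(sampler_index < 32);
   if (saturate_s & (1u << sampler_index))
      mask |= 1u << 0;
   if (saturate_t & (1u << sampler_index))
      mask |= 1u << 1;
   if (saturate_r & (1u << sampler_index))
      mask |= 1u << 2;
   return mask;
}

/* Clamp a value to [0.0, 1.0]. */
static float
saturate01(float v)
{
   return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Multiply each component by 1/q first, then clamp the masked ones. */
static void
project_then_saturate(float *coord, unsigned ncomp, float q,
                      unsigned sat_mask)
{
   assert(q != 0.0f);
   float inv_q = 1.0f / q;

   for (unsigned i = 0; i < ncomp; i++) {
      coord[i] *= inv_q;
      if (sat_mask & (1u << i))
         coord[i] = saturate01(coord[i]);
   }
}
```

With a coordinate of 3.0 and a projector of 2.0, projecting first gives 1.5
and the clamp then yields 1.0. Clamping first would give 1.0 and the
projection would then yield 0.5, which is a different texel.
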
 src/glsl/nir/nir.h |  4 +-
 src/glsl/nir/nir_lower_tex_projector.c | 98 --
 src/mesa/drivers/dri/i965/brw_nir.c|  2 +-
 3 files changed, 99 insertions(+), 5 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 9d47001..fba28f2 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1830,7 +1830,9 @@ void nir_lower_samplers(nir_shader *shader,
 const struct gl_shader_program *shader_program);
 
 void nir_lower_system_values(nir_shader *shader);
-void nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp);
+void nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp,
+ unsigned saturate_s, unsigned saturate_t,
+ unsigned saturate_r);
 void nir_lower_idiv(nir_shader *shader);
 
 void nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables);
diff --git a/src/glsl/nir/nir_lower_tex_projector.c 
b/src/glsl/nir/nir_lower_tex_projector.c
index ce20956..1a72fd0 100644
--- a/src/glsl/nir/nir_lower_tex_projector.c
+++ b/src/glsl/nir/nir_lower_tex_projector.c
@@ -33,6 +33,9 @@
 typedef struct {
nir_builder b;
unsigned lower_txp;
+   unsigned saturate_s;
+   unsigned saturate_t;
+   unsigned saturate_r;
 } lower_tex_state;
 
 static void
@@ -111,6 +114,62 @@ project_src(nir_builder *b, nir_tex_instr *tex)
tex->num_srcs--;
 }
 
+static void
+saturate_src(nir_builder *b, nir_tex_instr *tex, unsigned sat_mask)
+{
+   b->cursor = nir_before_instr(&tex->instr);
+
+   /* Walk through the sources saturating the requested arguments. */
+   for (unsigned i = 0; i < tex->num_srcs; i++) {
+  switch (tex->src[i].src_type) {
+  case nir_tex_src_coord:
+ break;
+  default:
+ continue;
+  }
+  nir_ssa_def *src =
+ nir_ssa_for_src(b, tex->src[i].src, tex->coord_components);
+
+  /* split src into components: */
+  nir_ssa_def *comp[4];
+
+  for (unsigned j = 0; j < tex->coord_components; j++)
+ comp[j] = nir_channel(b, src, j);
+
+  /* clamp requested components, array index does not get clamped: */
+  unsigned ncomp = tex->coord_components;
+  if (tex->is_array)
+ ncomp--;
+
+  for (unsigned j = 0; j < ncomp; j++)
+ if ((1 << j) & sat_mask)
+comp[j] = nir_fsat(b, comp[j]);
+
+  /* and move the result back into a single vecN: */
+  switch (tex->coord_components) {
+  case 4:
+ src = nir_vec4(b, comp[0], comp[1], comp[2], comp[3]);
+ break;
+  case 3:
+ src = nir_vec3(b, comp[0], comp[1], comp[2]);
+ break;
+  case 2:
+ src = nir_vec2(b, comp[0], comp[1]);
+ break;
+  case 1:
+ src = comp[0];
+ break;
+  default:
+ unreachable("bad texture coord count");
+ break;
+  }
+
+  nir_instr_rewrite_src(&tex->instr,
+&tex->src[i].src,
+nir_src_for_ssa(src));
+   }
+}
+
 static bool
 nir_lower_tex_projector_block(nir_block *block, void *void_state)
 {
@@ -123,10 +182,24 @@ nir_lower_tex_projector_block(nir_block *block, void 
*void_state)
 
   nir_tex_instr *tex = nir_instr_as_tex(instr);
   bool lower_txp = !!(state->lower_txp & (1 << tex->sampler_dim));
-
-  if (lower_txp)
+  /* mask of src coords to saturate (clamp): */
+  unsigned sat_mask = 0;
+
+  if ((1 << tex->sampler_index) & state->saturate_r)
+ sat_mask |= (1 << 2);/* .z */
+  if ((1 << tex->sampler_index) & state->saturate_t)
+ sat_mask |= (1 << 1);/* .y */
+  if ((1 << tex->sampler_index) & state->saturate_s)
+ sat_mask |= (1 << 0);/* .x */
+
+  /* If we are clamping any coords, we must lower projector first
+   * as clamping happens *after* projection:
+   */
+  if (lower_txp || sat_mask)
  project_src(b, tex);
 
+  if (sat_mask)
+ saturate_src(b, tex, sat_mask);
}
 
return true;
@@ -147,12 +220,31 @@ nir_lower_tex_projector_impl(nir_function_impl *impl, 
lower_tex_state *state)
  * lower_txp:
  *bitmask of (1 << GLSL_SAMPLER_DIM_x) to control for which
  *sampler types a texture projector is lowered.
+ *
+ * saturate_s/t/r:
+ *To emulate certain texture wrap modes, this can be used
+ *to saturate the specified tex coord to [0.0, 1.0].  The
+ *bits are according to sampler #, ie. if, for example:
+ *
+ *  (conf->saturate_s & (1 << n))
+ *
+ *is true, then the s coord for sampler n is saturated.
+ *
+ *Note that clamping must happen *after* projector lowering
+ *so any projected texture sample instruction wit

[Mesa-dev] [PATCH 2/4] nir/lower_tex_proj: add support projector lowering per sampler type

2015-09-16 Thread Rob Clark
From: Rob Clark 

Some hardware, such as adreno a3xx, supports txp on some but not all
sampler types.  In this case we want more fine grained control over
which texture projectors get lowered.

Signed-off-by: Rob Clark 
---
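
As a quick illustration of the bitmask convention (not part of the patch),
the sketch below tests one bit per sampler dimension. The enum sampler_dim
values and the should_lower_txp name are hypothetical stand-ins; the real
code indexes the mask with GLSL_SAMPLER_DIM_x values.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sampler dimensions, standing in for GLSL_SAMPLER_DIM_x. */
enum sampler_dim { DIM_1D, DIM_2D, DIM_3D, DIM_CUBE, DIM_COUNT };

/* True if the projector should be lowered for this sampler dimension. */
static bool
should_lower_txp(unsigned lower_txp, enum sampler_dim dim)
{
   assert(dim < DIM_COUNT);
   return (lower_txp & (1u << dim)) != 0;
}
```

A caller that wants every projector lowered passes ~0, while a caller that
only needs to avoid projected 3D lookups passes (1u << DIM_3D).
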
 src/glsl/nir/nir.h |  2 +-
 src/glsl/nir/nir_lower_tex_projector.c | 31 +++
 src/mesa/drivers/dri/i965/brw_nir.c|  2 +-
 3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 284fccd..9d47001 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1830,7 +1830,7 @@ void nir_lower_samplers(nir_shader *shader,
 const struct gl_shader_program *shader_program);
 
 void nir_lower_system_values(nir_shader *shader);
-void nir_lower_tex_projector(nir_shader *shader);
+void nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp);
 void nir_lower_idiv(nir_shader *shader);
 
 void nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables);
diff --git a/src/glsl/nir/nir_lower_tex_projector.c 
b/src/glsl/nir/nir_lower_tex_projector.c
index 11fcd61..ce20956 100644
--- a/src/glsl/nir/nir_lower_tex_projector.c
+++ b/src/glsl/nir/nir_lower_tex_projector.c
@@ -30,6 +30,11 @@
 #include "nir.h"
 #include "nir_builder.h"
 
+typedef struct {
+   nir_builder b;
+   unsigned lower_txp;
+} lower_tex_state;
+
 static void
 project_src(nir_builder *b, nir_tex_instr *tex)
 {
@@ -109,37 +114,47 @@ project_src(nir_builder *b, nir_tex_instr *tex)
 static bool
 nir_lower_tex_projector_block(nir_block *block, void *void_state)
 {
-   nir_builder *b = void_state;
+   lower_tex_state *state = void_state;
+   nir_builder *b = &state->b;
 
nir_foreach_instr_safe(block, instr) {
   if (instr->type != nir_instr_type_tex)
  continue;
 
   nir_tex_instr *tex = nir_instr_as_tex(instr);
+  bool lower_txp = !!(state->lower_txp & (1 << tex->sampler_dim));
+
+  if (lower_txp)
+ project_src(b, tex);
 
-  project_src(b, tex);
}
 
return true;
 }
 
 static void
-nir_lower_tex_projector_impl(nir_function_impl *impl)
+nir_lower_tex_projector_impl(nir_function_impl *impl, lower_tex_state *state)
 {
-   nir_builder b;
-   nir_builder_init(&b, impl);
+   nir_builder_init(&state->b, impl);
 
-   nir_foreach_block(impl, nir_lower_tex_projector_block, &b);
+   nir_foreach_block(impl, nir_lower_tex_projector_block, state);
 
nir_metadata_preserve(impl, nir_metadata_block_index |
nir_metadata_dominance);
 }
 
+/**
+ * lower_txp:
+ *bitmask of (1 << GLSL_SAMPLER_DIM_x) to control for which
+ *sampler types a texture projector is lowered.
+ */
 void
-nir_lower_tex_projector(nir_shader *shader)
+nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp)
 {
+   lower_tex_state state;
+   state.lower_txp = lower_txp;
nir_foreach_overload(shader, overload) {
   if (overload->impl)
- nir_lower_tex_projector_impl(overload->impl);
+ nir_lower_tex_projector_impl(overload->impl, &state);
}
 }
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index f326b23..2a924bb 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -96,7 +96,7 @@ brw_create_nir(struct brw_context *brw,
nir_lower_global_vars_to_local(nir);
nir_validate_shader(nir);
 
-   nir_lower_tex_projector(nir);
+   nir_lower_tex_projector(nir, ~0);
nir_validate_shader(nir);
 
nir_normalize_cubemap_coords(nir);
-- 
2.4.3



Re: [Mesa-dev] [PATCH 2/4] nir/lower_tex_proj: add support projector lowering per sampler type

2015-09-16 Thread Ilia Mirkin
On Wed, Sep 16, 2015 at 2:07 PM, Rob Clark  wrote:
> From: Rob Clark 
>
> Some hardware, such as adreno a3xx, supports txp on some but not all
> sampler types.  In this case we want more fine grained control over
> which texture projectors get lowered.

I mentioned this on IRC, but should probably say it here too -- a3xx
doesn't actually need this. The tex-miplevel-selection test was being
picky; Iago changed it up in commit 181c264956 since Intel was having
similar troubles. As I recall, sam.3d.p worked fine on my a320 with
that change, but it was quite a while ago, and should be re-checked.

  -ilia

>
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/nir.h |  2 +-
>  src/glsl/nir/nir_lower_tex_projector.c | 31 +++
>  src/mesa/drivers/dri/i965/brw_nir.c|  2 +-
>  3 files changed, 25 insertions(+), 10 deletions(-)
>
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 284fccd..9d47001 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1830,7 +1830,7 @@ void nir_lower_samplers(nir_shader *shader,
>  const struct gl_shader_program *shader_program);
>
>  void nir_lower_system_values(nir_shader *shader);
> -void nir_lower_tex_projector(nir_shader *shader);
> +void nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp);
>  void nir_lower_idiv(nir_shader *shader);
>
>  void nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables);
> diff --git a/src/glsl/nir/nir_lower_tex_projector.c 
> b/src/glsl/nir/nir_lower_tex_projector.c
> index 11fcd61..ce20956 100644
> --- a/src/glsl/nir/nir_lower_tex_projector.c
> +++ b/src/glsl/nir/nir_lower_tex_projector.c
> @@ -30,6 +30,11 @@
>  #include "nir.h"
>  #include "nir_builder.h"
>
> +typedef struct {
> +   nir_builder b;
> +   unsigned lower_txp;
> +} lower_tex_state;
> +
>  static void
>  project_src(nir_builder *b, nir_tex_instr *tex)
>  {
> @@ -109,37 +114,47 @@ project_src(nir_builder *b, nir_tex_instr *tex)
>  static bool
>  nir_lower_tex_projector_block(nir_block *block, void *void_state)
>  {
> -   nir_builder *b = void_state;
> +   lower_tex_state *state = void_state;
> +   nir_builder *b = &state->b;
>
> nir_foreach_instr_safe(block, instr) {
>if (instr->type != nir_instr_type_tex)
>   continue;
>
>nir_tex_instr *tex = nir_instr_as_tex(instr);
> +  bool lower_txp = !!(state->lower_txp & (1 << tex->sampler_dim));
> +
> +  if (lower_txp)
> + project_src(b, tex);
>
> -  project_src(b, tex);
> }
>
> return true;
>  }
>
>  static void
> -nir_lower_tex_projector_impl(nir_function_impl *impl)
> +nir_lower_tex_projector_impl(nir_function_impl *impl, lower_tex_state *state)
>  {
> -   nir_builder b;
> -   nir_builder_init(&b, impl);
> +   nir_builder_init(&state->b, impl);
>
> -   nir_foreach_block(impl, nir_lower_tex_projector_block, &b);
> +   nir_foreach_block(impl, nir_lower_tex_projector_block, state);
>
> nir_metadata_preserve(impl, nir_metadata_block_index |
> nir_metadata_dominance);
>  }
>
> +/**
> + * lower_txp:
> + *bitmask of (1 << GLSL_SAMPLER_DIM_x) to control for which
> + *sampler types a texture projector is lowered.
> + */
>  void
> -nir_lower_tex_projector(nir_shader *shader)
> +nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp)
>  {
> +   lower_tex_state state;
> +   state.lower_txp = lower_txp;
> nir_foreach_overload(shader, overload) {
>if (overload->impl)
> - nir_lower_tex_projector_impl(overload->impl);
> + nir_lower_tex_projector_impl(overload->impl, &state);
> }
>  }
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
> b/src/mesa/drivers/dri/i965/brw_nir.c
> index f326b23..2a924bb 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -96,7 +96,7 @@ brw_create_nir(struct brw_context *brw,
> nir_lower_global_vars_to_local(nir);
> nir_validate_shader(nir);
>
> -   nir_lower_tex_projector(nir);
> +   nir_lower_tex_projector(nir, ~0);
> nir_validate_shader(nir);
>
> nir_normalize_cubemap_coords(nir);
> --
> 2.4.3
>


Re: [Mesa-dev] [PATCH 3/4] nir/lower_tex_proj: add support to clamp texture coords

2015-09-16 Thread Rob Clark
On Wed, Sep 16, 2015 at 2:07 PM, Rob Clark  wrote:
> From: Rob Clark 
>
> Some hardware needs to clamp texture coordinates to [0.0, 1.0] in the
> shader to emulate GL_CLAMP.  This is added to lower_tex_proj since, in
> the case of projected coords, the clamping needs to happen *after*
> projection.
>
> Signed-off-by: Rob Clark 
> ---
>  src/glsl/nir/nir.h |  4 +-
>  src/glsl/nir/nir_lower_tex_projector.c | 98 
> --
>  src/mesa/drivers/dri/i965/brw_nir.c|  2 +-
>  3 files changed, 99 insertions(+), 5 deletions(-)
>
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 9d47001..fba28f2 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -1830,7 +1830,9 @@ void nir_lower_samplers(nir_shader *shader,
>  const struct gl_shader_program *shader_program);
>
>  void nir_lower_system_values(nir_shader *shader);
> -void nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp);
> +void nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp,
> + unsigned saturate_s, unsigned saturate_t,
> + unsigned saturate_r);
>  void nir_lower_idiv(nir_shader *shader);
>
>  void nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables);
> diff --git a/src/glsl/nir/nir_lower_tex_projector.c 
> b/src/glsl/nir/nir_lower_tex_projector.c
> index ce20956..1a72fd0 100644
> --- a/src/glsl/nir/nir_lower_tex_projector.c
> +++ b/src/glsl/nir/nir_lower_tex_projector.c
> @@ -33,6 +33,9 @@
>  typedef struct {
> nir_builder b;
> unsigned lower_txp;
> +   unsigned saturate_s;
> +   unsigned saturate_t;
> +   unsigned saturate_r;
>  } lower_tex_state;
>
>  static void
> @@ -111,6 +114,62 @@ project_src(nir_builder *b, nir_tex_instr *tex)
> tex->num_srcs--;
>  }
>
> +static void
> +saturate_src(nir_builder *b, nir_tex_instr *tex, unsigned sat_mask)
> +{
> +   b->cursor = nir_before_instr(&tex->instr);
> +
> +   /* Walk through the sources saturating the requested arguments. */
> +   for (unsigned i = 0; i < tex->num_srcs; i++) {
> +  switch (tex->src[i].src_type) {
> +  case nir_tex_src_coord:
> + break;
> +  default:
> + continue;
> +  }
> +  nir_ssa_def *src =
> + nir_ssa_for_src(b, tex->src[i].src, tex->coord_components);
> +
> +  /* split src into components: */
> +  nir_ssa_def *comp[4];
> +
> +  for (unsigned j = 0; j < tex->coord_components; j++)
> + comp[j] = nir_channel(b, src, j);
> +
> +  /* clamp requested components, array index does not get clamped: */
> +  unsigned ncomp = tex->coord_components;
> +  if (tex->is_array)
> + ncomp--;
> +
> +  for (unsigned j = 0; j < ncomp; j++)
> + if ((1 << j) & sat_mask)
> +comp[j] = nir_fsat(b, comp[j]);
> +
> +  /* and move the result back into a single vecN: */
> +  switch (tex->coord_components) {
> +  case 4:
> + src = nir_vec4(b, comp[0], comp[1], comp[2], comp[3]);
> + break;
> +  case 3:
> + src = nir_vec3(b, comp[0], comp[1], comp[2]);
> + break;
> +  case 2:
> + src = nir_vec2(b, comp[0], comp[1]);
> + break;
> +  case 1:
> + src = comp[0];
> + break;
> +  default:
> + unreachable("bad texture coord count");
> + break;
> +  }
> +
> +  nir_instr_rewrite_src(&tex->instr,
> +&tex->src[i].src,
> +nir_src_for_ssa(src));
> +   }
> +}
> +
>  static bool
>  nir_lower_tex_projector_block(nir_block *block, void *void_state)
>  {
> @@ -123,10 +182,24 @@ nir_lower_tex_projector_block(nir_block *block, void 
> *void_state)
>
>nir_tex_instr *tex = nir_instr_as_tex(instr);
>bool lower_txp = !!(state->lower_txp & (1 << tex->sampler_dim));
> -
> -  if (lower_txp)
> +  /* mask of src coords to saturate (clamp): */
> +  unsigned sat_mask = 0;
> +
> +  if ((1 << tex->sampler_index) & state->saturate_r)
> + sat_mask |= (1 << 2);/* .z */
> +  if ((1 << tex->sampler_index) & state->saturate_t)
> + sat_mask |= (1 << 1);/* .y */
> +  if ((1 << tex->sampler_index) & state->saturate_s)
> + sat_mask |= (1 << 0);/* .x */
> +
> +  /* If we are clamping any coords, we must lower projector first
> +   * as clamping happens *after* projection:
> +   */
> +  if (lower_txp || sat_mask)
>   project_src(b, tex);
>
> +  if (sat_mask)
> + saturate_src(b, tex, sat_mask);
> }
>
> return true;
> @@ -147,12 +220,31 @@ nir_lower_tex_projector_impl(nir_function_impl *impl, 
> lower_tex_state *state)
>   * lower_txp:
>   *bitmask of (1 << GLSL_SAMPLER_DIM_x) to control for which
>   *sampler types a texture projector is lowered.
> + *
> + * saturate_s/t/r:
> + *To emulate certain texture wrap modes, this can be used
> + *to saturate the specifi

Re: [Mesa-dev] [PATCH v5 28/70] glsl: add std430 interface packing support to ssbo related operations

2015-09-16 Thread Jordan Justen
On 2015-09-16 01:01:30, Samuel Iglesias Gonsálvez wrote:
> 
> 
> On 16/09/15 09:46, Jordan Justen wrote:
> > On 2015-09-10 06:35:44, Iago Toral Quiroga wrote:
> >> From: Samuel Iglesias Gonsalvez 
> >>
> >> v2:
> >> - Get interface packing information from interface's type, not the 
> >> variable type.
> >> - Simplify is_std430 condition in emit_access() for readability (Jordan)
> >> - Add a comment explaining why array of three-component vector case is 
> >> different
> > 
> > Lines a bit long.
> > 
> 
> OK, I will fix it.
> 
> >>   in std430 than the rest of cases.
> >> - Add calls to std430_array_stride().
> >>
> >> Signed-off-by: Samuel Iglesias Gonsalvez 
> >> ---
> >>  src/glsl/lower_ubo_reference.cpp | 102 
> >> ++-
> >>  1 file changed, 78 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/src/glsl/lower_ubo_reference.cpp 
> >> b/src/glsl/lower_ubo_reference.cpp
> >> index 8694383..7e45a26 100644
> >> --- a/src/glsl/lower_ubo_reference.cpp
> >> +++ b/src/glsl/lower_ubo_reference.cpp
> >> @@ -147,7 +147,8 @@ public:
> >>  ir_rvalue **offset,
> >>  unsigned *const_offset,
> >>  bool *row_major,
> >> -int *matrix_columns);
> >> +int *matrix_columns,
> >> +unsigned packing);
> >> ir_expression *ubo_load(const struct glsl_type *type,
> >>ir_rvalue *offset);
> >> ir_call *ssbo_load(const struct glsl_type *type,
> >> @@ -164,7 +165,7 @@ public:
> >> void emit_access(bool is_write, ir_dereference *deref,
> >>  ir_variable *base_offset, unsigned int deref_offset,
> >>  bool row_major, int matrix_columns,
> >> -unsigned write_mask);
> >> +bool is_std430, unsigned write_mask);
> >>  
> >> ir_visitor_status visit_enter(class ir_expression *);
> >> ir_expression *calculate_ssbo_unsized_array_length(ir_expression 
> >> *expr);
> >> @@ -176,7 +177,8 @@ public:
> >>  ir_variable *);
> >> ir_expression *emit_ssbo_get_buffer_size();
> >>  
> >> -   unsigned calculate_unsized_array_stride(ir_dereference *deref);
> >> +   unsigned calculate_unsized_array_stride(ir_dereference *deref,
> >> +   unsigned packing);
> >>  
> >> void *mem_ctx;
> >> struct gl_shader *shader;
> >> @@ -257,7 +259,8 @@ 
> >> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> >>   ir_rvalue **offset,
> >>   unsigned 
> >> *const_offset,
> >>   bool *row_major,
> >> - int *matrix_columns)
> >> + int *matrix_columns,
> >> + unsigned packing)
> >>  {
> >> /* Determine the name of the interface block */
> >> ir_rvalue *nonconst_block_index;
> >> @@ -343,8 +346,15 @@ 
> >> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> >>  const bool array_row_major =
> >> is_dereferenced_thing_row_major(deref_array);
> >>  
> >> -array_stride = 
> >> deref_array->type->std140_size(array_row_major);
> >> -array_stride = glsl_align(array_stride, 16);
> >> +/* The array type will give the correct interface packing
> >> + * information
> >> + */
> >> +if (packing == GLSL_INTERFACE_PACKING_STD430) {
> >> +   array_stride = 
> >> deref_array->type->std430_array_stride(array_row_major);
> >> +} else {
> >> +   array_stride = 
> >> deref_array->type->std140_size(array_row_major);
> >> +   array_stride = glsl_align(array_stride, 16);
> >> +}
> >>   }
> >>  
> >>   ir_rvalue *array_index = deref_array->array_index;
> >> @@ -380,7 +390,12 @@ 
> >> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> >>  
> >>  ralloc_free(field_deref);
> >>  
> >> -unsigned field_align = 
> >> type->std140_base_alignment(field_row_major);
> >> +unsigned field_align = 0;
> >> +
> >> +if (packing == GLSL_INTERFACE_PACKING_STD430)
> >> +   field_align = type->std430_base_alignment(field_row_major);
> >> +else
> >> +   field_align = type->std140_base_alignment(field_row_major);
> >>  
> >>  intra_struct_offset = glsl_align(intra_struct_offset, 
> >> field_align);
> >>  
> >> @@ -388,7 +403,10 @@ 
> >> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
> >>  
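
The stride selection in the hunk above can be sketched roughly as follows, restricted to arrays of float scalars and vectors. This is only an illustration: the sketch_ names are hypothetical, and the std430 vec3 special case is assumed from the v2 note in the commit message rather than taken from the actual std430_array_stride() body.

```c
#include <stdbool.h>

/* Round v up to a multiple of a (same idea as glsl_align() in the patch). */
static unsigned
sketch_align(unsigned v, unsigned a)
{
   return (v + a - 1) / a * a;
}

/* Array stride in bytes for an array of float scalars/vectors with
 * `components` elements each (1..4).  Hypothetical helper, not Mesa API.
 */
static unsigned
sketch_array_stride(unsigned components, bool is_std430)
{
   const unsigned N = 4;               /* bytes per float component */
   unsigned size = components * N;

   if (is_std430) {
      /* Assumed std430 behaviour: no rounding up to vec4, except that
       * an array of vec3 still uses a 4 * N stride.
       */
      return components == 3 ? 4 * N : size;
   }

   /* std140 path from the patch: element size aligned to 16 bytes. */
   return sketch_align(size, 16);
}
```

Under these assumptions a float[] has a stride of 16 bytes in std140 but only 4 bytes in std430, which is why the packing has to be passed down to setup_for_load_or_store().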

Re: [Mesa-dev] [RFC 0/3] i965: Enable up to 24 MRF registers in gen6

2015-09-16 Thread Kenneth Graunke
On Wednesday, September 16, 2015 11:17:53 AM Iago Toral Quiroga wrote:
> It seems that we have some bugs where we fail to compile shaders in gen6
> because we do not have enough MRF registers available (see bugs 86469 and
> 90631 for example). That triggered some discussion about the fact that SNB
> might actually have 24 MRF registers available, but since the docs were not
> very clear about this, it was suggested that it would be nice to try and
> experiment if that was the case.
> 
> This series of patches implements such a test: basically, it turns our fixed
> BRW_MAX_MRF into a macro that accepts the hardware generation and then changes
> the spilling code in brw_fs_reg_allocate.cpp to use MRF registers 21-23 for
> gen6 (something similar can be done for the vec4 code, I just did not do it
> yet).
> 
> The good news is that this seems to work fine, at least I can do a full piglit
> run without issues in SNB.

Sweet!

> In fact, this seems to help a lot of tests when I
> force spilling of everything in the FS backend (INTEL_DEBUG=spill_fs):
> 
> Using MRF registers 13-15 for spilling:
> crash: 5, fail 267, pass: 15853, skip: 11679, warn: 3
> 
> Using MRF registers 21-23 for spilling:
> crash: 5, fail 140, pass: 15980, skip: 11679, warn: 3
> 
> As you can see, we drop the fail ratio to almost 50%...

That seems odd - I wouldn't think using m13-15 vs. m21-23 would actually
make a difference.  Perhaps it's papering over a bug where we're failing
to notice that MRFs are in use?  If so, we should probably fix that (in
addition to making this change).

> The bad news is that, currently, we assert that MRF registers are within the
> supported range in brw_reg.h. This works fine now because the limit does not
> depend on the hardware generation, but these patches change that of course.
> The natural way to fix this would be to pass a generation argument to
> all brw_reg functions that can create a brw_reg, but I imagine that we don't
> want to do that only for this, right?

Yeah...it does seem a bit funny to add a generation parameter to brw_reg
functions just for an assert that the register number is in range.

What about adding the asserts in brw_set_src0 and brw_set_dest?  This
would catch BLORP and the Gen4 clip/sf/gs code that emits assembly
directly - it would catch everything.  But, unfortunately, at the last
minute...when it might be harder to debug.  So, I do like adding the
assertions to the generators as well.

> In that case, if we want to keep the
> asserts (I think we do) we need a way around that limitation. The first
> patch in this series tries to move the asserts to the generator, but that
> won't manage things like blorp and other modules that can emit code directly,
> so we would lose the assert checks for those. Of course we could add
> individual asserts for these as needed, but it is not ideal. Alternatively,
> we could add a function wrapper to brw_message_reg that has the assert and
> use that version of the function from these places. In that case, this
> wrapper might not need to take in the generation number as parameter and
> could just check with 16 as the limit, since we really only use MRF registers
> beyond 16 for spilling, and we only handle spilling in code paths that end
> up going through the generator.
> 
> Or maybe we think this is just not worth it if it only helps gen6...

I'd like to do it.

> 
> what do you think? 
> 
> Iago Toral Quiroga (3):
>   i965: Move MRF register asserts to the generator
>   i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
>   i965/fs: Use MRF registers 21-23 for spilling on gen6
> 
>  src/mesa/drivers/dri/i965/brw_eu_emit.c|  2 +-
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 14 +++
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 
> --
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
>  src/mesa/drivers/dri/i965/brw_reg.h|  5 +---
>  .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   |  9 +---
>  8 files changed, 37 insertions(+), 30 deletions(-)
> 
> 
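
A minimal sketch of the generation-aware limit and the late range check discussed in this thread, assuming 24 MRFs on gen6 and 16 elsewhere as described in the cover letter. The SKETCH_/sketch_ names are hypothetical and are not part of the i965 driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a BRW_MAX_MRF macro that takes the hardware
 * generation: 24 registers on gen6, 16 otherwise (the thread's assumption).
 */
#define SKETCH_MAX_MRF(gen) ((gen) == 6 ? 24 : 16)

/* Range check that an emit-time hook such as brw_set_dest() could assert. */
static bool
sketch_mrf_in_range(int gen, unsigned mrf_nr)
{
   return mrf_nr < (unsigned) SKETCH_MAX_MRF(gen);
}

/* First of the three spill MRFs, taken from the top of the register file:
 * m13 with a limit of 16, m21 with a limit of 24.
 */
static unsigned
sketch_first_spill_mrf(int gen)
{
   assert(SKETCH_MAX_MRF(gen) >= 3);
   return SKETCH_MAX_MRF(gen) - 3;
}
```

With these definitions, the spill range matches the two piglit runs quoted above: m13-m15 when the limit is 16 and m21-m23 on gen6.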


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/2] i965/vec4: Change SEL and MOV types as needed to propagate source modifiers

2015-09-16 Thread Alejandro Piñeiro
During review of the patch "i965/nir/vec4: fill the type of the dst
and src when loading an uniform", Jason Ekstrand suggested changing
the optimization pass to allow copy propagation with MOVs even when
there is a type mismatch, as was done on the fs path, instead of
fixing the type for MOV instructions.[1]

So, using commit 472ef9 as a reference, I implemented the equivalent
for the vec4 case. But that only worked when the current instruction
was the MOV with default types. It didn't fix the shader-db
instruction count regression I was working on, which happens when the
MOV with default types is the instruction we are propagating from. In
other words, it didn't cover this case:

   1: mov vgrf1.0:UD, u0.xyzw:UD
   2: add vgrf2.0:F, vgrf0.xyzw:F, -vgrf1.xyzw:F

So I extended the same idea by also checking the from
instruction. In order to do that, I needed to also track
the vec4_instructions on the copy_entry struct.

I am submitting two patches because I think they will be easier
to review this way. But if this solution is approved, I
think it could be better to push them squashed into just
one patch.

Shader-db results for vec4 programs on Haswell:
total instructions in shared programs: 1746280 -> 1732159 (-0.81%)
instructions in affected programs: 760595 -> 746474 (-1.86%)
helped:6132
HURT:  0
GAINED:0
LOST:  0


[1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/094555.html

Alejandro Piñeiro (2):
  i965/vec4: Change types as needed to propagate source modifiers using
current instruction
  i965/vec4: Change types as needed to propagate source modifiers using
from instruction

 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 45 --
 1 file changed, 41 insertions(+), 4 deletions(-)

-- 
2.1.4



[Mesa-dev] [PATCH 1/2] i965/vec4: Change types as needed to propagate source modifiers using current instruction

2015-09-16 Thread Alejandro Piñeiro
SEL and MOV instructions, as long as they don't have source modifiers, are
just copying bits around.  So this kind of instruction can be propagated
even if there are type mismatches. This is needed because NIR generates
integer SEL and MOV instructions whenever it doesn't know what else to
generate.

This commit adds support for copy propagation using the current instruction
as reference.
---

Equivalent to commit 472ef9 but for the vec4 case.

 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 28 --
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 5a15eb8..64e2528 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -249,6 +249,16 @@ try_constant_propagate(const struct brw_device_info 
*devinfo,
 }
 
 static bool
+can_change_source_types(vec4_instruction *inst)
+{
+   return !inst->src[0].abs && !inst->src[0].negate &&
+  (inst->opcode == BRW_OPCODE_MOV ||
+   (inst->opcode == BRW_OPCODE_SEL &&
+inst->predicate != BRW_PREDICATE_NONE &&
+!inst->src[1].abs && !inst->src[1].negate));
+}
+
+static bool
 try_copy_propagate(const struct brw_device_info *devinfo,
vec4_instruction *inst,
int arg, struct copy_entry *entry)
@@ -308,7 +318,9 @@ try_copy_propagate(const struct brw_device_info *devinfo,
 value.swizzle != BRW_SWIZZLE_XYZW) && 
!inst->can_do_source_mods(devinfo))
   return false;
 
-   if (has_source_modifiers && value.type != inst->src[arg].type)
+   if (has_source_modifiers &&
+   value.type != inst->src[arg].type &&
+   !can_change_source_types(inst))
   return false;
 
if (has_source_modifiers &&
@@ -362,7 +374,19 @@ try_copy_propagate(const struct brw_device_info *devinfo,
   }
}
 
-   value.type = inst->src[arg].type;
+   if (has_source_modifiers &&
+   value.type != inst->src[arg].type) {
+  /* We are propagating source modifiers from a MOV with a different
+   * type.  If we got here, then we can just change the source and
+   * destination types of the instruction and keep going.
+   */
+  assert(can_change_source_types(inst));
+  for (int i = 0; i < 3; i++) {
+ inst->src[i].type = value.type;
+  }
+  inst->dst.type = value.type;
+   } else
+  value.type = inst->src[arg].type;
inst->src[arg] = value;
return true;
 }
-- 
2.1.4



[Mesa-dev] [PATCH 2/2] i965/vec4: Change types as needed to propagate source modifiers using from instruction

2015-09-16 Thread Alejandro Piñeiro
SEL and MOV instructions, as long as they don't have source modifiers, are
just copying bits around.  So this kind of instruction can be propagated
even if there are type mismatches. This is needed because NIR generates
integer SEL and MOV instructions whenever it doesn't know what else to
generate.

This commit adds support for copy propagation using the previous instruction
as reference.
---

I was tempted to try to remove copy_entry->value, as with this commit
we are tracking the instructions too, but I think that the code would
be clearer this way.


 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 35 +++---
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 64e2528..f8ecd0b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -39,6 +39,7 @@ namespace brw {
 
 struct copy_entry {
src_reg *value[4];
+   vec4_instruction *inst[4];
int saturatemask;
 };
 
@@ -320,7 +321,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
 
if (has_source_modifiers &&
value.type != inst->src[arg].type &&
-   !can_change_source_types(inst))
+   !can_change_source_types(inst) &&
+   !can_change_source_types(entry->inst[arg]))
   return false;
 
if (has_source_modifiers &&
@@ -338,7 +340,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
 * instead. See also resolve_ud_negate().
 */
if (value.negate &&
-   value.type == BRW_REGISTER_TYPE_UD)
+   value.type == BRW_REGISTER_TYPE_UD &&
+   !can_change_source_types(entry->inst[arg]))
   return false;
 
/* Don't report progress if this is a noop. */
@@ -376,17 +379,25 @@ try_copy_propagate(const struct brw_device_info *devinfo,
 
if (has_source_modifiers &&
value.type != inst->src[arg].type) {
-  /* We are propagating source modifiers from a MOV with a different
-   * type.  If we got here, then we can just change the source and
-   * destination types of the instruction and keep going.
+  /* We are propagating source modifiers from a safe instruction with a
+   * different type. If we got here, then we can just change the source
+   * and destination types of the current instruction or the instruction
+   * from we are propagating.
*/
-  assert(can_change_source_types(inst));
-  for (int i = 0; i < 3; i++) {
- inst->src[i].type = value.type;
+  assert(can_change_source_types(inst) ||
+ can_change_source_types(entry->inst[arg]));
+
+  if (can_change_source_types(inst)) {
+ for (int i = 0; i < 3; i++) {
+inst->src[i].type = value.type;
+ }
+ inst->dst.type = value.type;
+  } else {
+ value.type = inst->src[arg].type;
   }
-  inst->dst.type = value.type;
-   } else
+   } else {
   value.type = inst->src[arg].type;
+   }
inst->src[arg] = value;
return true;
 }
@@ -439,7 +450,7 @@ vec4_visitor::opt_copy_propagation(bool do_constant_prop)
 for (c = 0; c < 4; c++) {
 int channel = BRW_GET_SWZ(inst->src[i].swizzle, c);
 entry.value[c] = entries[reg].value[channel];
-
+entry.inst[c] = entries[reg].inst[channel];
/* If there's no available copy for this channel, bail.
 * We could be more aggressive here -- some channels might
 * not get used based on the destination writemask.
@@ -484,6 +495,7 @@ vec4_visitor::opt_copy_propagation(bool do_constant_prop)
entries[reg].value[i] = direct_copy ? &inst->src[0] : NULL;
entries[reg].saturatemask |=
   inst->saturate && direct_copy ? 1 << i : 0;
+   entries[reg].inst[i] = direct_copy ? inst : NULL;
 }
 }
 
@@ -498,6 +510,7 @@ vec4_visitor::opt_copy_propagation(bool do_constant_prop)
  if (is_channel_updated(inst, entries[i].value, j)) {
 entries[i].value[j] = NULL;
 entries[i].saturatemask &= ~(1 << j);
+ entries[i].inst[j] = NULL;
   }
   }
}
-- 
2.1.4



[Mesa-dev] [Bug 79783] Distorted output in obs-studio where other vendors "work"

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79783

--- Comment #3 from gregory.hain...@gmail.com ---
I have the same issue in my application (PCSX2).

The code in link_varyings.cpp (varying_matches::record) is potentially wrong,
but it isn't the main issue: I tried commenting out the code and the issue is
still there. Nevertheless, the code feels wrong, as you can't know whether a
consumer exists with SSO. So the flat optimization is likely bad.

Anyway, I also tried rendering the texture coordinates to the screen. Normally
they should be interpolated over [0;1]; however, the interpolation is done over
[-1;1] (potentially with a different sign). Indeed, rescaling the coordinate
in the FS with (t + 1.0) / 2.0 seems to render my draw call correctly.

It seems [-1;1] is a default value of the raster unit. The behavior is the same
if the texture coordinate is not written in the vertex shader. Maybe the code
is optimized out in the VS. Unfortunately, I don't know if there is any
possibility to dump VS asm code with Nouveau.
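
For reference, the rescaling mentioned above is just the linear map from [-1;1] to [0;1]. A tiny sketch in C (the function name is made up for illustration; in practice this would be a line of GLSL in the fragment shader):

```c
/* Map a coordinate interpolated over [-1, 1] back into [0, 1]. */
static float
sketch_rescale_coord(float t)
{
   return (t + 1.0f) / 2.0f;
}
```

This maps -1 to 0, 0 to 0.5 and 1 to 1. It is a workaround that hides the symptom, not an explanation of why the varying arrives in the wrong range.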

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 79783] Distorted output in obs-studio where other vendors "work"

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=79783

--- Comment #4 from Ilia Mirkin  ---
(In reply to gregory.hainaut from comment #3)
> Unfortunately I don't know if there is any
> possibility to dump VS asm code with Nouveau.

NV50_PROG_DEBUG=1 (assuming you've built mesa with --enable-debug) should dump
the TGSI, nv50 ir (post optimizations and, separately, post-RA), and the actual
instruction stream.

NV50_PROG_OPTIMIZE=0 will disable all of the nv50 ir optimizations, =1 will
enable some of them, =2 will enable most of them (and =3 should be everything).

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [PATCH] Revert "mesa/extensions: restrict GL_OES_EGL_image to GLES"

2015-09-16 Thread Dave Airlie
This reverts commit 48961fa3ba37999a6f8fd812458b735e39604a95.

glamor/Xwayland use this. What the spec said when it was written, and the
fact that the comment says Mesa relies on it, haven't changed.

I also don't have a copy of this patch in my mail archive, which
seems weird. Did it get posted to mesa-dev?

Signed-off-by: Dave Airlie 
---
 src/mesa/main/extensions.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 767c50e..b2c88c3 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -307,7 +307,8 @@ static const struct extension extension_table[] = {
{ "GL_OES_depth_texture_cube_map",  
o(OES_depth_texture_cube_map), ES2, 2012 },
{ "GL_OES_draw_texture",o(OES_draw_texture),
 ES1,   2004 },
{ "GL_OES_EGL_sync",o(dummy_true),  
 ES1 | ES2, 2010 },
-   { "GL_OES_EGL_image",   o(OES_EGL_image),   
 ES1 | ES2, 2006 },
+   /*  FIXME: Mesa expects GL_OES_EGL_image to be available in OpenGL 
contexts. */
+   { "GL_OES_EGL_image",   o(OES_EGL_image),   
GL | ES1 | ES2, 2006 },
{ "GL_OES_EGL_image_external",  o(OES_EGL_image_external),  
 ES1 | ES2, 2010 },
{ "GL_OES_element_index_uint",  o(dummy_true),  
 ES1 | ES2, 2005 },
{ "GL_OES_fbo_render_mipmap",   o(dummy_true),  
 ES1 | ES2, 2005 },
-- 
2.4.3



[Mesa-dev] [Bug 92022] st/va: add initial support for Video Post Processing and Export/Import of VaSurface

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92022

Julien Isorce  changed:

   What|Removed |Added

   Severity|normal  |enhancement

--- Comment #3 from Julien Isorce  ---
(In reply to Ilia Mirkin from comment #2)
> The proper way to do all this is by sending emails, not filing bugs.

All right. I need to clean them up a bit more first and then I'll send them.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 2/4] nir/lower_tex_proj: add support projector lowering per sampler type

2015-09-16 Thread Rob Clark
On Wed, Sep 16, 2015 at 2:11 PM, Ilia Mirkin  wrote:
> On Wed, Sep 16, 2015 at 2:07 PM, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> Some hardware, such as adreno a3xx, supports txp on some but not all
>> sampler types.  In this case we want more fine grained control over
>> which texture projectors get lowered.
>
> I mentioned this on IRC, but should probably say it here too -- a3xx
> doesn't actually need this. The tex-miplevel-selection test was being
> picky, Iago changed it up in commit 181c264956 since Intel was having
> similar troubles. As I recall, sam.3d.p worked fine on my a320 with
> that change, but it was quite a while ago, and should be re-checked.

fyi, there is an updated version of the patchset here:

https://github.com/freedreno/mesa/commits/wip-tex-lowering

This adds an extra patch to handle RECT lowering to 2D, and now it
actually handles clamping + rect textures properly (unlike
tgsi_lowering) :-)

It also splits options out into a nir_lower_tex_options struct (rather
than ever increasing # of params to lowering fxn).  I did end up
keeping the lower_txp param, since for a3xx I need a way to tell the
pass not to lower txp (unless needed for clamp).  I guess I could
change it to a simple boolean.  Not sure if it is worth changing.
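
To make the "don't lower txp unless needed for clamp" behaviour concrete, here is a rough sketch of how the per-instruction decision could look with an options struct. The struct layout, enum and sketch_ names are hypothetical; only the lower_txp and saturate_s/t/r bitmask semantics are taken from the patches in this series.

```c
#include <stdbool.h>

/* Hypothetical sampler dimensions standing in for GLSL_SAMPLER_DIM_x. */
enum sketch_sampler_dim {
   SKETCH_DIM_1D,
   SKETCH_DIM_2D,
   SKETCH_DIM_3D,
   SKETCH_DIM_CUBE,
   SKETCH_DIM_RECT,
};

/* Hypothetical options struct; the real field set may differ. */
struct sketch_lower_tex_options {
   unsigned lower_txp;    /* bitmask of (1 << sampler dim) */
   unsigned saturate_s;   /* bitmasks of (1 << sampler index) */
   unsigned saturate_t;
   unsigned saturate_r;
};

/* Mask of coordinate components (.x = bit 0, .y = bit 1, .z = bit 2)
 * that must be clamped for the given sampler.
 */
static unsigned
sketch_sat_mask(const struct sketch_lower_tex_options *opts,
                unsigned sampler_index)
{
   unsigned sat_mask = 0;

   if ((1u << sampler_index) & opts->saturate_r)
      sat_mask |= 1u << 2;   /* .z */
   if ((1u << sampler_index) & opts->saturate_t)
      sat_mask |= 1u << 1;   /* .y */
   if ((1u << sampler_index) & opts->saturate_s)
      sat_mask |= 1u << 0;   /* .x */

   return sat_mask;
}

/* Clamping happens after projection, so any clamp forces the projector
 * to be lowered even when lower_txp does not request it for this dim.
 */
static bool
sketch_needs_projection(const struct sketch_lower_tex_options *opts,
                        enum sketch_sampler_dim dim,
                        unsigned sampler_index)
{
   bool lower_txp = (opts->lower_txp & (1u << dim)) != 0;
   return lower_txp || sketch_sat_mask(opts, sampler_index) != 0;
}
```

A driver that handles txp natively would pass lower_txp = 0 and still get the projector lowered for samplers that need coordinate clamping.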

The one thing I did not do yet (and why I'm not resending to list yet)
is rename the pass to nir_lower_tex.

BR,
-R

>   -ilia
>
>>
>> Signed-off-by: Rob Clark 
>> ---
>>  src/glsl/nir/nir.h |  2 +-
>>  src/glsl/nir/nir_lower_tex_projector.c | 31 +++
>>  src/mesa/drivers/dri/i965/brw_nir.c|  2 +-
>>  3 files changed, 25 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
>> index 284fccd..9d47001 100644
>> --- a/src/glsl/nir/nir.h
>> +++ b/src/glsl/nir/nir.h
>> @@ -1830,7 +1830,7 @@ void nir_lower_samplers(nir_shader *shader,
>>  const struct gl_shader_program *shader_program);
>>
>>  void nir_lower_system_values(nir_shader *shader);
>> -void nir_lower_tex_projector(nir_shader *shader);
>> +void nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp);
>>  void nir_lower_idiv(nir_shader *shader);
>>
>>  void nir_lower_clip_vs(nir_shader *shader, unsigned ucp_enables);
>> diff --git a/src/glsl/nir/nir_lower_tex_projector.c 
>> b/src/glsl/nir/nir_lower_tex_projector.c
>> index 11fcd61..ce20956 100644
>> --- a/src/glsl/nir/nir_lower_tex_projector.c
>> +++ b/src/glsl/nir/nir_lower_tex_projector.c
>> @@ -30,6 +30,11 @@
>>  #include "nir.h"
>>  #include "nir_builder.h"
>>
>> +typedef struct {
>> +   nir_builder b;
>> +   unsigned lower_txp;
>> +} lower_tex_state;
>> +
>>  static void
>>  project_src(nir_builder *b, nir_tex_instr *tex)
>>  {
>> @@ -109,37 +114,47 @@ project_src(nir_builder *b, nir_tex_instr *tex)
>>  static bool
>>  nir_lower_tex_projector_block(nir_block *block, void *void_state)
>>  {
>> -   nir_builder *b = void_state;
>> +   lower_tex_state *state = void_state;
>> +   nir_builder *b = &state->b;
>>
>> nir_foreach_instr_safe(block, instr) {
>>if (instr->type != nir_instr_type_tex)
>>   continue;
>>
>>nir_tex_instr *tex = nir_instr_as_tex(instr);
>> +  bool lower_txp = !!(state->lower_txp & (1 << tex->sampler_dim));
>> +
>> +  if (lower_txp)
>> + project_src(b, tex);
>>
>> -  project_src(b, tex);
>> }
>>
>> return true;
>>  }
>>
>>  static void
>> -nir_lower_tex_projector_impl(nir_function_impl *impl)
>> +nir_lower_tex_projector_impl(nir_function_impl *impl, lower_tex_state 
>> *state)
>>  {
>> -   nir_builder b;
>> -   nir_builder_init(&b, impl);
>> +   nir_builder_init(&state->b, impl);
>>
>> -   nir_foreach_block(impl, nir_lower_tex_projector_block, &b);
>> +   nir_foreach_block(impl, nir_lower_tex_projector_block, state);
>>
>> nir_metadata_preserve(impl, nir_metadata_block_index |
>> nir_metadata_dominance);
>>  }
>>
>> +/**
>> + * lower_txp:
>> + *bitmask of (1 << GLSL_SAMPLER_DIM_x) to control for which
>> + *sampler types a texture projector is lowered.
>> + */
>>  void
>> -nir_lower_tex_projector(nir_shader *shader)
>> +nir_lower_tex_projector(nir_shader *shader, unsigned lower_txp)
>>  {
>> +   lower_tex_state state;
>> +   state.lower_txp = lower_txp;
>> nir_foreach_overload(shader, overload) {
>>if (overload->impl)
>> - nir_lower_tex_projector_impl(overload->impl);
>> + nir_lower_tex_projector_impl(overload->impl, &state);
>> }
>>  }
>> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
>> b/src/mesa/drivers/dri/i965/brw_nir.c
>> index f326b23..2a924bb 100644
>> --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> @@ -96,7 +96,7 @@ brw_create_nir(struct brw_context *brw,
>> nir_lower_global_vars_to_local(nir);
>> nir_validate_shader(nir);
>>
>> -   nir_lower_tex_projector(nir);
>> +   nir_lower_tex_projector(nir, ~0);
>> nir_validate_shader

Re: [Mesa-dev] [PATCH 1/2] i965/vec4: Change types as needed to propagate source modifiers using current instruction

2015-09-16 Thread Jason Ekstrand
On Wed, Sep 16, 2015 at 12:47 PM, Alejandro Piñeiro
 wrote:
> SEL and MOV instructions, as long as they don't have source modifiers, are
> just copying bits around.  So this kind of instruction can be propagated
> even if there are type mismatches. This is needed because NIR generates
> integer SEL and MOV instructions whenever it doesn't know what else to
> generate.
>
> This commit adds support for copy propagation using the current instruction
> as reference.
> ---
>
> Equivalent to commit 472ef9 but for the vec4 case.
>
>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 28 
> --
>  1 file changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> index 5a15eb8..64e2528 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> @@ -249,6 +249,16 @@ try_constant_propagate(const struct brw_device_info 
> *devinfo,
>  }
>
>  static bool
> +can_change_source_types(vec4_instruction *inst)
> +{
> +   return !inst->src[0].abs && !inst->src[0].negate &&
> +  (inst->opcode == BRW_OPCODE_MOV ||
> +   (inst->opcode == BRW_OPCODE_SEL &&
> +inst->predicate != BRW_PREDICATE_NONE &&
> +!inst->src[1].abs && !inst->src[1].negate));

You should probably check saturate in here as well.  The only time we
can do saturate propagation is if both the instruction and the MOV are
float type anyway.  I think the FS version has the same theoretical
problem.

With that fixed,

Reviewed-by: Jason Ekstrand 

> +}
> +
> +static bool
>  try_copy_propagate(const struct brw_device_info *devinfo,
> vec4_instruction *inst,
> int arg, struct copy_entry *entry)
> @@ -308,7 +318,9 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>  value.swizzle != BRW_SWIZZLE_XYZW) && 
> !inst->can_do_source_mods(devinfo))
>return false;
>
> -   if (has_source_modifiers && value.type != inst->src[arg].type)
> +   if (has_source_modifiers &&
> +   value.type != inst->src[arg].type &&
> +   !can_change_source_types(inst))
>return false;
>
> if (has_source_modifiers &&
> @@ -362,7 +374,19 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>}
> }
>
> -   value.type = inst->src[arg].type;
> +   if (has_source_modifiers &&
> +   value.type != inst->src[arg].type) {
> +  /* We are propagating source modifiers from a MOV with a different
> +   * type.  If we got here, then we can just change the source and
> +   * destination types of the instruction and keep going.
> +   */
> +  assert(can_change_source_types(inst));
> +  for (int i = 0; i < 3; i++) {
> + inst->src[i].type = value.type;
> +  }
> +  inst->dst.type = value.type;
> +   } else
> +  value.type = inst->src[arg].type;
> inst->src[arg] = value;
> return true;
>  }
> --
> 2.1.4
>
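
The saturate check suggested in the review comment on can_change_source_types() could look roughly like the sketch below. It uses simplified, hypothetical stand-ins for vec4_instruction and its fields so that it is self-contained; it is not the patch that was eventually pushed.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the i965 vec4 IR types. */
enum sketch_opcode {
   SKETCH_OPCODE_MOV,
   SKETCH_OPCODE_SEL,
   SKETCH_OPCODE_ADD,
};

struct sketch_src {
   bool abs;
   bool negate;
};

struct sketch_inst {
   enum sketch_opcode opcode;
   bool predicated;   /* stands in for predicate != BRW_PREDICATE_NONE */
   bool saturate;
   struct sketch_src src[3];
};

/* A MOV or predicated SEL only copies bits as long as it has no source
 * modifiers and does not saturate, so only then can its types be changed.
 */
static bool
sketch_can_change_source_types(const struct sketch_inst *inst)
{
   return !inst->src[0].abs && !inst->src[0].negate &&
          !inst->saturate &&
          (inst->opcode == SKETCH_OPCODE_MOV ||
           (inst->opcode == SKETCH_OPCODE_SEL &&
            inst->predicated &&
            !inst->src[1].abs && !inst->src[1].negate));
}
```

The only change relative to the function in the patch is the extra !inst->saturate term.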


Re: [Mesa-dev] [PATCH 2/2] i965/vec4: Change types as needed to propagate source modifiers using from instruction

2015-09-16 Thread Jason Ekstrand
On Wed, Sep 16, 2015 at 12:47 PM, Alejandro Piñeiro
 wrote:
> SEL and MOV instructions, as long as they don't have source modifiers, are
> just copying bits around.  So this kind of instruction can be propagated
> even if there are type mismatches. This is needed because NIR generates
> integer SEL and MOV instructions whenever it doesn't know what else to
> generate.
>
> This commit adds support for copy propagation using the previous instruction
> as reference.
> ---
>
> I was tempted to try to remove copy_entry->value, as with this commit
> we are tracking the instructions too, but I think that the code would
> be clearer this way.
>
>
>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 35 
> +++---
>  1 file changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> index 64e2528..f8ecd0b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> @@ -39,6 +39,7 @@ namespace brw {
>
>  struct copy_entry {
> src_reg *value[4];
> +   vec4_instruction *inst[4];
> int saturatemask;
>  };
>
> @@ -320,7 +321,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>
> if (has_source_modifiers &&
> value.type != inst->src[arg].type &&
> -   !can_change_source_types(inst))
> +   !can_change_source_types(inst) &&
> +   !can_change_source_types(entry->inst[arg]))

This isn't right.  The entry->inst array is indexed on vec4 component
but arg is the argument of the instruction.  I think what you want to
do is add something to the loop above to loop over 0...3 and check
them all.  Also, how is this different from has_source_modifiers?
Obviously, it is (the shader-db numbers say so) but I'm not seeing it.
Could you please provide an example?
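
To illustrate the per-component check, something along these lines is what I have in mind. This is only a sketch with made-up stand-in types (sketch_from_inst, sketch_copy_entry); the can_change_types flag stands in for calling can_change_source_types() on the real instruction, and treating a NULL entry as "cannot change" is my assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the instruction a channel was copied from. */
struct sketch_from_inst {
   bool can_change_types;   /* result of can_change_source_types() */
};

/* Hypothetical stand-in for copy_entry: one source instruction per
 * vec4 component, not per instruction argument.
 */
struct sketch_copy_entry {
   const struct sketch_from_inst *inst[4];
};

/* True only if every component comes from an instruction whose types
 * may be changed.
 */
static bool
sketch_entry_can_change_types(const struct sketch_copy_entry *entry)
{
   for (int c = 0; c < 4; c++) {
      if (entry->inst[c] == NULL || !entry->inst[c]->can_change_types)
         return false;
   }
   return true;
}
```

The point is that the index runs over the four channels of the entry, rather than using arg to pick a single element.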

>return false;
>
> if (has_source_modifiers &&
> @@ -338,7 +340,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>  * instead. See also resolve_ud_negate().
>  */
> if (value.negate &&
> -   value.type == BRW_REGISTER_TYPE_UD)
> +   value.type == BRW_REGISTER_TYPE_UD &&
> +   !can_change_source_types(entry->inst[arg]))
>return false;
>
> /* Don't report progress if this is a noop. */
> @@ -376,17 +379,25 @@ try_copy_propagate(const struct brw_device_info 
> *devinfo,
>
> if (has_source_modifiers &&
> value.type != inst->src[arg].type) {
> -  /* We are propagating source modifiers from a MOV with a different
> -   * type.  If we got here, then we can just change the source and
> -   * destination types of the instruction and keep going.
> +  /* We are propagating source modifiers from a safe instruction with a
> +   * different type. If we got here, then we can just change the source
> +   * and destination types of the current instruction or the instruction
> +   * from we are propagating.
> */
> -  assert(can_change_source_types(inst));
> -  for (int i = 0; i < 3; i++) {
> - inst->src[i].type = value.type;
> +  assert(can_change_source_types(inst) ||
> + can_change_source_types(entry->inst[arg]));
> +
> +  if (can_change_source_types(inst)) {
> + for (int i = 0; i < 3; i++) {
> +inst->src[i].type = value.type;
> + }
> + inst->dst.type = value.type;
> +  } else {
> + value.type = inst->src[arg].type;
>}
> -  inst->dst.type = value.type;
> -   } else
> +   } else {
>value.type = inst->src[arg].type;
> +   }
> inst->src[arg] = value;
> return true;
>  }
> @@ -439,7 +450,7 @@ vec4_visitor::opt_copy_propagation(bool do_constant_prop)
>  for (c = 0; c < 4; c++) {
>  int channel = BRW_GET_SWZ(inst->src[i].swizzle, c);
>  entry.value[c] = entries[reg].value[channel];
> -
> +entry.inst[c] = entries[reg].inst[channel];
> /* If there's no available copy for this channel, bail.
>  * We could be more aggressive here -- some channels might
>  * not get used based on the destination writemask.
> @@ -484,6 +495,7 @@ vec4_visitor::opt_copy_propagation(bool do_constant_prop)
> entries[reg].value[i] = direct_copy ? &inst->src[0] : NULL;
> entries[reg].saturatemask |=
>inst->saturate && direct_copy ? 1 << i : 0;
> +   entries[reg].inst[i] = direct_copy ? inst : NULL;
>  }
>  }
>
> @@ -498,6 +510,7 @@ vec4_visitor::opt_copy_propagation(bool do_constant_prop)
>   if (is_channel_updated(inst, entries[i].value, j)) {
>  entries[i].value[j] = NULL;
>  entries[i].saturatemask &= ~(1 << j);
> + entries[i].inst[j] = NULL;
>}
>}
> }
> --
> 2.1.4
>
> __

Re: [Mesa-dev] [PATCH 0/2] i965/vec4: Change SEL and MOV types as needed to propagate source modifiers

2015-09-16 Thread Jason Ekstrand
On Wed, Sep 16, 2015 at 12:47 PM, Alejandro Piñeiro
 wrote:
> On the review of the patch "i965/nir/vec4: fill the type of the dst
> and src when loading an uniform" Jason Ekstrand suggested to change
> the optimization pass in order to allow the copy propagation with
> MOVs even if there is a type mismatch, as was done on the fs path,
> instead of fixing the type for MOV instructions.[1]
>
> So using commit 472ef9 as reference I implemented the equivalent
> for the vec4 case. But that only worked when the current
> instruction was the MOV with default types. It didn't fix the shader-db
> instruction count regression I was working on, which happens when the
> from instruction is the MOV with default types. In other words,
> it didn't cover this case:
>
>1: mov vgrf1.0:UD, u0.xyzw:UD
>2: add vgrf2.0:F, vgrf0.xyzw:F, -vgrf1.xyzw:F
>
> So I extended the same idea by checking too against the from
> instruction. In order to do that, I needed to also track
> the vec4_instructions on the copy_entry struct.
>
> Submitting two patches because I think that it will be easier
> to review this way. But if this solution is approved, I
> think that it could be better to push them squashed into just
> one patch.

I don't know that I care that much about squashing them or not.  I
reviewed the first one but I'm confused about what's going on in the
second.
--Jason

> Shader-db results for vec4 programs on Haswell:
> total instructions in shared programs: 1746280 -> 1732159 (-0.81%)
> instructions in affected programs: 760595 -> 746474 (-1.86%)
> helped:6132
> HURT:  0
> GAINED:0
> LOST:  0
>
>
> [1] http://lists.freedesktop.org/archives/mesa-dev/2015-September/094555.html
>
> Alejandro Piñeiro (2):
>   i965/vec4: Change types as needed to propagate source modifiers using
> current instruction
>   i965/vec4: Change types as needed to propagate source modifiers using
> from instruction
>
>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 45 
> --
>  1 file changed, 41 insertions(+), 4 deletions(-)
>
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 1/8] i965: Add a helper function intel_get_tile_dims()

2015-09-16 Thread Chad Versace
On Fri 11 Sep 2015, Ville Syrjälä wrote:

> As it turns out I was just looking at Yf and whatnot from display POV,
> and I came to the conclusion that I'll change the kernel to just have a
> function to return the tile width in bytes based on the cpp passed in,
> and then I can simply compute tile height as 'tile_size / tile_width',
> or tile width in pixels (if needed) as 'tile_width / cpp'.
> 
> And what I understood about Yf (the docs are no good IME, at least the
> part I was looking at) is the following:
> 
> cpp  w_bytes  w_pixels  h   aspect
> 1    64       64        64  1
> 2    128      64        32  2
> 4    128      32        32  1
> 8    256      32        16  2
> 16   256      16        16  1
> 
> So all you really need to know is cpp and w_bytes and the rest can all
> be computed as needed.

That table matches my understanding too.
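
The "cpp and w_bytes are enough" point can be sketched as below. This is only an illustration: the function names are hypothetical, and the 4096-byte tile size is an assumption inferred from the table (w_bytes * h is 4096 in every row), not something stated in the thread.

```c
#include <stdint.h>

/* Assumption: w_bytes * h == 4096 for every row of the table above. */
#define YF_TILE_SIZE 4096u

/* Tile width in bytes, transcribed from the table.
 * Returns 0 for a cpp the table does not list.
 */
static uint32_t
yf_tile_width_bytes(uint32_t cpp)
{
   switch (cpp) {
   case 1:
      return 64;
   case 2:
   case 4:
      return 128;
   case 8:
   case 16:
      return 256;
   default:
      return 0;
   }
}

/* Tile height in rows: tile_size / tile_width. */
static uint32_t
yf_tile_height(uint32_t cpp)
{
   uint32_t w_bytes = yf_tile_width_bytes(cpp);
   return w_bytes ? YF_TILE_SIZE / w_bytes : 0;
}

/* Tile width in pixels: tile_width / cpp. */
static uint32_t
yf_tile_width_pixels(uint32_t cpp)
{
   uint32_t w_bytes = yf_tile_width_bytes(cpp);
   return w_bytes ? w_bytes / cpp : 0;
}
```

Only the width lookup carries table data; height and pixel width fall out of the two divisions quoted above.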

I found the hw doc's discussion of the tiling algorithm (Memory Views
» Address Tiling Function Introduction » Tiling Algorithm) to be
terribly confusing. The bit swizzle table in 2D surface section, though,
is golden (Memory Views » Common Surface Formats » Surface Layout and
Tiling » 2D Surfaces).


Re: [Mesa-dev] [PATCH V2 2/8] i965: Use intel_get_tile_dims() to get tile masks

2015-09-16 Thread Chad Versace
On Wed 19 Aug 2015, Anuj Phogat wrote:
> This will require change in the parameters passed to
> intel_miptree_get_tile_masks().
> 
> V2: Rearrange the order of parameters. (Ben)
> Change the name to intel_get_tile_masks(). (Topi)
> 
> Cc: Ben Widawsky 
> Cc: Topi Pohjolainen 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/brw_blorp.cpp   |  4 +++-
>  src/mesa/drivers/dri/i965/brw_misc_state.c| 20 +++---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 30 
> +++
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  6 +++---
>  4 files changed, 27 insertions(+), 33 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
> b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> index eac1f00..df2969d 100644
> --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
> @@ -144,7 +144,9 @@ brw_blorp_surface_info::compute_tile_offsets(uint32_t 
> *tile_x,
>  {
> uint32_t mask_x, mask_y;
>  
> -   intel_miptree_get_tile_masks(mt, &mask_x, &mask_y, 
> map_stencil_as_y_tiled);
> +   intel_get_tile_masks(mt->tiling, mt->tr_mode, mt->cpp,
> +map_stencil_as_y_tiled,
> +&mask_x, &mask_y);
>  
> *tile_x = x_offset & mask_x;
> *tile_y = y_offset & mask_y;
> diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
> b/src/mesa/drivers/dri/i965/brw_misc_state.c
> index e9d9467..2a3195a 100644
> --- a/src/mesa/drivers/dri/i965/brw_misc_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
> @@ -174,13 +174,17 @@ brw_get_depthstencil_tile_masks(struct 
> intel_mipmap_tree *depth_mt,
> uint32_t tile_mask_x = 0, tile_mask_y = 0;
>  
> if (depth_mt) {
> -  intel_miptree_get_tile_masks(depth_mt, &tile_mask_x, &tile_mask_y, 
> false);
> +  intel_get_tile_masks(depth_mt->tiling, depth_mt->tr_mode,
> +   depth_mt->cpp, false,
> +   &tile_mask_x, &tile_mask_y);
>  
>if (intel_miptree_level_has_hiz(depth_mt, depth_level)) {
>   uint32_t hiz_tile_mask_x, hiz_tile_mask_y;
> - intel_miptree_get_tile_masks(depth_mt->hiz_buf->mt,
> -  &hiz_tile_mask_x, &hiz_tile_mask_y,
> -  false);
> + intel_get_tile_masks(depth_mt->hiz_buf->mt->tiling,
> +  depth_mt->hiz_buf->mt->tr_mode,
> +  depth_mt->hiz_buf->mt->cpp,
> +  false, &hiz_tile_mask_x,
> +  &hiz_tile_mask_y);
>  
>   /* Each HiZ row represents 2 rows of pixels */
>   hiz_tile_mask_y = hiz_tile_mask_y << 1 | 1;
> @@ -200,9 +204,11 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
> *depth_mt,
>   tile_mask_y |= 63;
>} else {
>   uint32_t stencil_tile_mask_x, stencil_tile_mask_y;
> - intel_miptree_get_tile_masks(stencil_mt,
> -  &stencil_tile_mask_x,
> -  &stencil_tile_mask_y, false);
> + intel_get_tile_masks(stencil_mt->tiling,
> +  stencil_mt->tr_mode,
> +  stencil_mt->cpp,
> +  false, &stencil_tile_mask_x,
> +  &stencil_tile_mask_y);
>  
>   tile_mask_x |= stencil_tile_mask_x;
>   tile_mask_y |= stencil_tile_mask_y;
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index c282e94..13a33c6 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1124,31 +1124,17 @@ intel_get_tile_dims(uint32_t tiling, uint32_t 
> tr_mode, uint32_t cpp,
>   * untiled, the masks are set to 0.
>   */
>  void
> -intel_miptree_get_tile_masks(const struct intel_mipmap_tree *mt,
> - uint32_t *mask_x, uint32_t *mask_y,
> - bool map_stencil_as_y_tiled)
> +intel_get_tile_masks(uint32_t tiling, uint32_t tr_mode, uint32_t cpp,
> + bool map_stencil_as_y_tiled,
> + uint32_t *mask_x, uint32_t *mask_y)
>  {
> -   int cpp = mt->cpp;
> -   uint32_t tiling = mt->tiling;
> -
> if (map_stencil_as_y_tiled)
>tiling = I915_TILING_Y;
>  
> -   switch (tiling) {
> -   default:
> -  unreachable("not reached");
> -   case I915_TILING_NONE:
> -  *mask_x = *mask_y = 0;
> -  break;
> -   case I915_TILING_X:
> -  *mask_x = 512 / cpp - 1;
> -  *mask_y = 7;
> -  break;
> -   case I915_TILING_Y:
> -  *mask_x = 128 / cpp - 1;
> -  *mask_y = 31;
> -  break;
> -   }
> +   intel_get_tile_dims(tiling, tr_mode, cpp, mask_x, mask_y);
> +
> +   *mask_x = *mask_x / cpp - 1;
> +   *mask_y = *mask_y / cpp - 1;
>  }

mask_y should be exactly (tile_height - 1) for all tiling modes.
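
To make the point concrete, here is a minimal sketch of how the masks could be derived from tile dimensions. The enum and function names are hypothetical stand-ins, and the X and Y tile sizes (512 bytes x 8 rows, 128 bytes x 32 rows) are read off the switch statement that the patch removes. Only the width is divided by cpp; the height is already a row count.

```c
#include <stdint.h>

/* Hypothetical stand-in for the I915_TILING_* constants. */
enum tiling { TILING_NONE, TILING_X, TILING_Y };

/* Tile width in bytes and height in rows, matching the removed switch. */
static void
get_tile_dims(enum tiling tiling, uint32_t *tile_w_bytes, uint32_t *tile_h)
{
   switch (tiling) {
   case TILING_X:
      *tile_w_bytes = 512;
      *tile_h = 8;
      break;
   case TILING_Y:
      *tile_w_bytes = 128;
      *tile_h = 32;
      break;
   default:
      *tile_w_bytes = 0;
      *tile_h = 0;
      break;
   }
}

static void
get_tile_masks(enum tiling tiling, uint32_t cpp,
               uint32_t *mask_x, uint32_t *mask_y)
{
   uint32_t tile_w_bytes, tile_h;

   /* Untiled surfaces have no intra-tile offset. */
   if (tiling == TILING_NONE) {
      *mask_x = *mask_y = 0;
      return;
   }

   get_tile_dims(tiling, &tile_w_bytes, &tile_h);

   *mask_x = tile_w_bytes / cpp - 1;   /* width converted to pixels */
   *mask_y = tile_h - 1;               /* height is not scaled by cpp */
}
```

For cpp = 4 this gives the same values as the old code: (127, 7) for X tiling and (31, 31) for Y tiling.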

Re: [Mesa-dev] [PATCH V2 3/8] i965: Use helper function intel_get_tile_dims() in surface setup

2015-09-16 Thread Chad Versace
On Wed 19 Aug 2015, Anuj Phogat wrote:
> It takes care of using the correct tile width if we later use other
> tiling patterns for aux miptree.
> 
> V2: Remove the comment about using Yf for aux miptree.
> 
> Cc: Ben Widawsky 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 14 --
>  1 file changed, 12 insertions(+), 2 deletions(-)

Patch 3/8 is
Reviewed-by: Chad Versace 



Re: [Mesa-dev] [PATCH V2 5/8] i965: Move conversion of {src, dst}_pitch to dwords outside if/else

2015-09-16 Thread Chad Versace
On Wed 19 Aug 2015, Anuj Phogat wrote:
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c | 25 +
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
> b/src/mesa/drivers/dri/i965/intel_blit.c
> index c177eec..d15a64d 100644
> --- a/src/mesa/drivers/dri/i965/intel_blit.c
> +++ b/src/mesa/drivers/dri/i965/intel_blit.c

[snip]

> @@ -645,17 +636,19 @@ intelEmitCopyBlit(struct brw_context *brw,
>CMD = xy_blit_cmd(src_tiling, src_tr_mode,
>  dst_tiling, dst_tr_mode,
>  cpp, use_fast_copy_blit);
> +   }
>  
> -  if (dst_tiling != I915_TILING_NONE)
> - dst_pitch /= 4;
> +   /* For tiled source and destination, pitch value should be specified
> +* as a number of Dwords.
> +*/
> +   if (dst_tiling != I915_TILING_NONE)
> +  dst_pitch /= 4;
>  
> -  if (src_tiling != I915_TILING_NONE)
> - src_pitch /= 4;
> -   }
> +   if (src_tiling != I915_TILING_NONE)
> +  src_pitch /= 4;
>  
> -   if (dst_y2 <= dst_y || dst_x2 <= dst_x) {
> +   if (dst_y2 <= dst_y || dst_x2 <= dst_x)
>return true;
> -   }

The diff's last 4 lines add noise to the diff, and I'd like to see that
as a separate mini-patch.

Either way, with or without the separate mini-patch, patch 5/8 is
Reviewed-by: Chad Versace 


Re: [Mesa-dev] [PATCH V2 6/8] i965: Fix {src, dst}_pitch alignment check for XY_SRC_COPY_BLT

2015-09-16 Thread Chad Versace
On Wed 19 Aug 2015, Anuj Phogat wrote:
> Current code checks the alignment restrictions only for Y tiling.
> From Broadwell PRM vol 10:
> 
>  "pitch is of 512Byte granularity for Tile-X: This means the tiled-x
>   surface pitch can be (512, 1024, 1536, 2048...)/4 (in Dwords)."
> 
> This patch adds the restriction for X tiling as well.
> 
> Signed-off-by: Anuj Phogat 
> Reviewed-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)

Patch 6/8 is
Reviewed-by: Chad Versace 
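
As a rough illustration of the restriction quoted from the PRM, the check and the Dword conversion from patch 5/8 might look like the sketch below. The names are hypothetical, the 512-byte granularity for Tile-X comes from the quoted text, and the 128-byte value for Tile-Y is an assumption based on the Y tile width; the actual patch body is not shown in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the I915_TILING_* constants. */
enum tiling { TILING_NONE, TILING_X, TILING_Y };

/* Pitch granularity check, in bytes. */
static bool
pitch_is_aligned(enum tiling tiling, uint32_t pitch_bytes)
{
   switch (tiling) {
   case TILING_X:
      return pitch_bytes % 512 == 0;   /* from the quoted PRM text */
   case TILING_Y:
      return pitch_bytes % 128 == 0;   /* assumption: Y tile width */
   default:
      return true;                     /* no tiling restriction modeled */
   }
}

/* For tiled surfaces the blit command takes the pitch in Dwords. */
static uint32_t
blit_pitch_field(enum tiling tiling, uint32_t pitch_bytes)
{
   return tiling != TILING_NONE ? pitch_bytes / 4 : pitch_bytes;
}
```

A Tile-X pitch of 1536 bytes passes the check and is programmed as 384 Dwords, matching the "(512, 1024, 1536, 2048...)/4" wording.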



Re: [Mesa-dev] [PATCH V2 7/8] i965/gen9: Fix {src, dst}_pitch alignment check for XY_FAST_COPY_BLT

2015-09-16 Thread Chad Versace
On Wed 19 Aug 2015, Anuj Phogat wrote:
> I misinterpreted the alignment restriction in XY_FAST_COPY_BLT earlier.
> Instead of checking pitch for 64KB alignment we need to check it for
> tile width alignment.
> 
> Signed-off-by: Anuj Phogat 
> Cc: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/intel_blit.c | 18 +++---
>  1 file changed, 7 insertions(+), 11 deletions(-)

Patch 7/8 is
Reviewed-by: Chad Versace 



Re: [Mesa-dev] [PATCH V2 8/8] i965: Rename intel_miptree_get_dimensions_for_image()

2015-09-16 Thread Chad Versace
On Wed 19 Aug 2015, Anuj Phogat wrote:
> This function isn't specific to miptrees. So, drop the "miptree"
> from function name.
> 
> Signed-off-by: Anuj Phogat 
> ---
>  src/mesa/drivers/dri/i965/intel_fbo.c  | 2 +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c  | 6 +++---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h  | 4 ++--
>  src/mesa/drivers/dri/i965/intel_tex_image.c| 3 +--
>  src/mesa/drivers/dri/i965/intel_tex_validate.c | 3 +--
>  5 files changed, 8 insertions(+), 10 deletions(-)


> @@ -928,8 +928,8 @@ intel_miptree_release(struct intel_mipmap_tree **mt)
>  }
>  
>  void
> -intel_miptree_get_dimensions_for_image(struct gl_texture_image *image,
> -   int *width, int *height, int *depth)
> +intel_get_image_dims(struct gl_texture_image *image,
> + int *width, int *height, int *depth)
>  {
> switch (image->TexObject->Target) {
> case GL_TEXTURE_1D_ARRAY:

True, the function isn't specific to miptrees. But it *is* specific to
Intel's RENDER_SURFACE_STATE, as it translates the image's (width,
height, depth), from the perspective of the OpenGL API, to the needs of
Intel hardware.

Now that 'miptree' is removed from the function name, the function name
looks like a mere getter. In that case, it's not clear why the caller
cannot just access image->width, image->height, and image->depth
directly.

So that we all don't forget why this function exists next year, please copy
into the function the relevant portions of this comment from
intel_miptree_create_layout():

  /* For a 1D Array texture the OpenGL API will treat the height0
   * parameter as the number of array slices. For Intel hardware, we treat
   * the 1D array as a 2D Array with a height of 1.
   *
   * So, when we first come through this path to create a 1D Array
   * texture, height0 stores the number of slices, and depth0 is 1. In
   * this case, we want to swap height0 and depth0.
   *
   * Since some miptrees will be created based on the base miptree, we may
   * come through this path and see height0 as 1 and depth0 being the
   * number of slices. In this case we don't need to do the swap.
   */

With such a comment, I think this patch will be ok.
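
For reference, the translation the comment describes boils down to something like the following sketch. The struct and enum are hypothetical simplifications (the real function reads the target through image->TexObject->Target), and only the 1D array case discussed here is modeled.

```c
/* Hypothetical, simplified stand-ins for the GL texture image types. */
enum tex_target { TEX_1D_ARRAY, TEX_2D, TEX_3D };

struct tex_image {
   enum tex_target target;
   int width, height, depth;   /* as seen by the OpenGL API */
};

static void
get_image_dims(const struct tex_image *image,
               int *width, int *height, int *depth)
{
   *width = image->width;

   if (image->target == TEX_1D_ARRAY) {
      /* The API stores the slice count in height; the hardware wants a
       * 2D array of height 1 with the slices in depth.
       */
      *height = 1;
      *depth = image->height;
   } else {
      *height = image->height;
      *depth = image->depth;
   }
}
```

This is why the function is more than a getter: for a 1D array with 5 slices the caller gets back a height of 1 and a depth of 5.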

By the way, the height<->depth adjustment that intel_miptree_create_layout()
performs directly beneath that comment duplicates the height<->depth
adjustment done by intel_get_image_dims(). That means you may be able
to eliminate intel_get_image_dims() completely, and rely on
intel_miptree_create_layout() to do the adjustment for you. I say "may" because
I haven't investigated it closely enough to be confident.


Re: [Mesa-dev] [PATCH 06/11] glsl: add SYSTEM_VALUE_VERTEX_CNT

2015-09-16 Thread Rob Clark
On Sun, Sep 13, 2015 at 11:51 AM, Rob Clark  wrote:
> From: Rob Clark 
>
> Used internally in freedreno/ir3 to calc stream-out position.  Seems
> like a generic enough way to implement stream-out (using str instrs),
> plus it avoids compiler warnings by sneaking in a non-enum value in
> switch statements.
>
> Signed-off-by: Rob Clark 

Anyone got any strong opinions about this?  At least a Meh'd-by?  I'd
like to push the freedreno/ir3 conversion to
varying_slot_/frag_result_/etc, but need a way to handle my internal
vtxcnt sysval, but don't want to step on any toes..

I could alternatively do something like SYSTEM_VALUE_DRIVERn which
gets re-#defined in freedreno/ir3?

BR,
-R

> ---
>  src/glsl/shader_enums.h | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/glsl/shader_enums.h b/src/glsl/shader_enums.h
> index d054b87..fb4bcd0 100644
> --- a/src/glsl/shader_enums.h
> +++ b/src/glsl/shader_enums.h
> @@ -402,6 +402,12 @@ typedef enum
> SYSTEM_VALUE_TESS_LEVEL_INNER, /**< TES input */
> /*@}*/
>
> +   /**
> +* Driver internal vertex-count, used (for example) for drivers to
> +* calculate stride for stream-out outputs.  Not externally visible.
> +*/
> +   SYSTEM_VALUE_VERTEX_CNT,
> +
> SYSTEM_VALUE_MAX /**< Number of values */
>  } gl_system_value;
>
> --
> 2.4.3
>


Re: [Mesa-dev] [PATCH 06/11] glsl: add SYSTEM_VALUE_VERTEX_CNT

2015-09-16 Thread Ilia Mirkin
On Wed, Sep 16, 2015 at 7:30 PM, Rob Clark  wrote:
> On Sun, Sep 13, 2015 at 11:51 AM, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> Used internally in freedreno/ir3 to calc stream-out position.  Seems
>> like a generic enough way to implement stream-out (using str instrs),
>> plus it avoids compiler warnings by sneaking in a non-enum value in
>> switch statements.
>>
>> Signed-off-by: Rob Clark 
>
> Anyone got any strong opinions about this?  At least a Meh'd-by?  I'd
> like to push the freedreno/ir3 conversion to
> varying_slot_/frag_result_/etc, but need a way to handle my internal
> vtxcnt sysval, but don't want to step on any toes..
>
> I could alternatively do something like SYSTEM_VALUE_DRIVERn which
> gets re-#defined in freedreno/ir3?

It's definitely a bit weird to define these driver-specific things
that generic things know nothing about. I think it made a lot more
sense when it was completely driver-private with a TGSI semantic. Any
reason you can't just do that here? Start defining private ones
starting at SYSTEM_VALUE_MAX?

>
> BR,
> -R
>
>> ---
>>  src/glsl/shader_enums.h | 6 ++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/src/glsl/shader_enums.h b/src/glsl/shader_enums.h
>> index d054b87..fb4bcd0 100644
>> --- a/src/glsl/shader_enums.h
>> +++ b/src/glsl/shader_enums.h
>> @@ -402,6 +402,12 @@ typedef enum
>> SYSTEM_VALUE_TESS_LEVEL_INNER, /**< TES input */
>> /*@}*/
>>
>> +   /**
>> +* Driver internal vertex-count, used (for example) for drivers to
>> +* calculate stride for stream-out outputs.  Not externally visible.
>> +*/
>> +   SYSTEM_VALUE_VERTEX_CNT,
>> +
>> SYSTEM_VALUE_MAX /**< Number of values */
>>  } gl_system_value;
>>
>> --
>> 2.4.3
>>


Re: [Mesa-dev] [PATCH 06/11] glsl: add SYSTEM_VALUE_VERTEX_CNT

2015-09-16 Thread Rob Clark
On Wed, Sep 16, 2015 at 7:34 PM, Ilia Mirkin  wrote:
> On Wed, Sep 16, 2015 at 7:30 PM, Rob Clark  wrote:
>> On Sun, Sep 13, 2015 at 11:51 AM, Rob Clark  wrote:
>>> From: Rob Clark 
>>>
>>> Used internally in freedreno/ir3 to calc stream-out position.  Seems
>>> like a generic enough way to implement stream-out (using str instrs),
>>> plus it avoids compiler warnings by sneaking in a non-enum value in
>>> switch statements.
>>>
>>> Signed-off-by: Rob Clark 
>>
>> Anyone got any strong opinions about this?  At least a Meh'd-by?  I'd
>> like to push the freedreno/ir3 conversion to
>> varying_slot_/frag_result_/etc, but need a way to handle my internal
>> vtxcnt sysval, but don't want to step on any toes..
>>
>> I could alternatively do something like SYSTEM_VALUE_DRIVERn which
>> gets re-#defined in freedreno/ir3?
>
> It's definitely a bit weird to define these driver-specific things
> that generic things know nothing about. I think it made a lot more
> sense when it was completely driver-private with a TGSI semantic. Any
> reason you can't just do that here? Start defining private ones
> starting at SYSTEM_VALUE_MAX?

mostly because gcc gives a warning about non enum values in an enum
switch statement..

it wasn't an issue w/ tgsi semantic's since everything was #define
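
A small sketch of the situation, using made-up names rather than the real gl_system_value enum: gcc's -Wswitch complains about a case label whose value is not an enumerator when the switch expression has enum type. One possible workaround, shown here purely as an illustration and not as what either poster proposed, is to switch on an integer instead.

```c
/* Hypothetical stand-in for gl_system_value. */
typedef enum {
   SYSVAL_VERTEX_ID,
   SYSVAL_INSTANCE_ID,
   SYSVAL_MAX               /* number of shared values */
} system_value;

/* Driver-private value past the shared range; a #define, not an enumerator. */
#define DRIVER_SYSVAL_VTXCNT (SYSVAL_MAX + 1)

static const char *
sysval_name(system_value sv)
{
   /* With "switch (sv)" gcc -Wall would warn that the private case value
    * is not in the enumerated type. Switching on an integer sidesteps
    * that, at the cost of losing the missing-case check as well.
    */
   switch ((unsigned) sv) {
   case SYSVAL_VERTEX_ID:
      return "vertex_id";
   case SYSVAL_INSTANCE_ID:
      return "instance_id";
   case DRIVER_SYSVAL_VTXCNT:
      return "vtxcnt";
   default:
      return "unknown";
   }
}
```

The trade-off is visible in the comment: the cast silences the warning for private values but also for genuinely forgotten enumerators, which is part of why adding a real enumerator is attractive.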

BR,
-R


>>
>> BR,
>> -R
>>
>>> ---
>>>  src/glsl/shader_enums.h | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/src/glsl/shader_enums.h b/src/glsl/shader_enums.h
>>> index d054b87..fb4bcd0 100644
>>> --- a/src/glsl/shader_enums.h
>>> +++ b/src/glsl/shader_enums.h
>>> @@ -402,6 +402,12 @@ typedef enum
>>> SYSTEM_VALUE_TESS_LEVEL_INNER, /**< TES input */
>>> /*@}*/
>>>
>>> +   /**
>>> +* Driver internal vertex-count, used (for example) for drivers to
>>> +* calculate stride for stream-out outputs.  Not externally visible.
>>> +*/
>>> +   SYSTEM_VALUE_VERTEX_CNT,
>>> +
>>> SYSTEM_VALUE_MAX /**< Number of values */
>>>  } gl_system_value;
>>>
>>> --
>>> 2.4.3
>>>


[Mesa-dev] [Bug 92020] wglCreatePbufferARB handle attrib error

2015-09-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92020

--- Comment #5 from zeif <332447...@qq.com> ---
(In reply to Emil Velikov from comment #4)
> While one's in the emulator they could also fix the strstr in
> wglGetExtentionsProcAddress.
> 
> Currently it will trigger whenever it finds FooBar, even if it's looking for
> Foo.

Sincerely thank you for your help.

I know what I'm going to do.
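
The strstr problem mentioned in comment #4 can be illustrated with a whole-token match. This is a generic sketch, not the Mesa or emulator code: has_extension() is a hypothetical helper, and it assumes a space-separated extension string.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true only if name appears as a complete space-delimited token
 * in ext_list, so looking for "Foo" does not match "FooBar".
 */
static bool
has_extension(const char *ext_list, const char *name)
{
   size_t len = strlen(name);
   const char *p = ext_list;

   if (len == 0)
      return false;

   while ((p = strstr(p, name)) != NULL) {
      bool starts_token = (p == ext_list) || (p[-1] == ' ');
      bool ends_token = (p[len] == ' ') || (p[len] == '\0');

      if (starts_token && ends_token)
         return true;

      p++;   /* keep searching after a partial match */
   }
   return false;
}
```

A bare strstr() call stops at the first substring hit, which is exactly the FooBar-versus-Foo case described above; checking both token boundaries avoids it.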



Re: [Mesa-dev] [PATCH] Revert "mesa/extensions: restrict GL_OES_EGL_image to GLES"

2015-09-16 Thread Dieter Nützel

On 16.09.2015 23:00, Dave Airlie wrote:

This reverts commit 48961fa3ba37999a6f8fd812458b735e39604a95.

glamor/Xwayland use this, the spec saying something when it
was written, and the fact that the comment says Mesa relies on it
hasn't changed.


Thank you Dave!
r600g - NI/Turks works again with glamor, now.

Soo,
Tested-by: Dieter Nützel 


I also don't have a copy of this patch in my mail archive, which
seems weird. Did it get posted to mesa-dev?

Signed-off-by: Dave Airlie 
---
 src/mesa/main/extensions.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 767c50e..b2c88c3 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -307,7 +307,8 @@ static const struct extension extension_table[] = {
{ "GL_OES_depth_texture_cube_map",
o(OES_depth_texture_cube_map), ES2, 2012 },
{ "GL_OES_draw_texture",
o(OES_draw_texture), ES1,   2004 },
{ "GL_OES_EGL_sync",o(dummy_true),
  ES1 | ES2, 2010 },
-   { "GL_OES_EGL_image",   o(OES_EGL_image),
  ES1 | ES2, 2006 },
+   /*  FIXME: Mesa expects GL_OES_EGL_image to be available in OpenGL
contexts. */
+   { "GL_OES_EGL_image",   o(OES_EGL_image),
 GL | ES1 | ES2, 2006 },
{ "GL_OES_EGL_image_external",
o(OES_EGL_image_external),   ES1 | ES2, 2010 },
{ "GL_OES_element_index_uint",  o(dummy_true),
  ES1 | ES2, 2005 },
{ "GL_OES_fbo_render_mipmap",   o(dummy_true),
  ES1 | ES2, 2005 },




[Mesa-dev] [PATCH] nv50, nvc0: flush texture cache in presence of coherent bufs

2015-09-16 Thread Ilia Mirkin
This fixes the newly-added arb_texture_buffer_object-bufferstorage
piglit test.

Signed-off-by: Ilia Mirkin 
Cc: "11.0" 
---
 src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 19 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 20 
 2 files changed, 39 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c 
b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
index e798473..f5f4708 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
@@ -768,6 +768,7 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
 {
struct nv50_context *nv50 = nv50_context(pipe);
struct nouveau_pushbuf *push = nv50->base.pushbuf;
+   bool tex_dirty = false;
int i, s;
 
/* NOTE: caller must ensure that (min_index + index_bias) is >= 0 */
@@ -797,6 +798,9 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
 
push->kick_notify = nv50_draw_vbo_kick_notify;
 
+   /* TODO: Instead of iterating over all the buffer resources looking for
+* coherent buffers, keep track of a context-wide count.
+*/
for (s = 0; s < 3 && !nv50->cb_dirty; ++s) {
   uint32_t valid = nv50->constbuf_valid[s];
 
@@ -824,6 +828,21 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
   nv50->cb_dirty = false;
}
 
+   for (s = 0; s < 3 && !tex_dirty; ++s) {
+  for (i = 0; i < nv50->num_textures[s] && !tex_dirty; ++i) {
+ if (!nv50->textures[s][i] ||
+ nv50->textures[s][i]->texture->target != PIPE_BUFFER)
+continue;
+ if (nv50->textures[s][i]->texture->flags &
+ PIPE_RESOURCE_FLAG_MAP_COHERENT)
+tex_dirty = true;
+  }
+   }
+   if (tex_dirty) {
+  BEGIN_NV04(push, NV50_3D(TEX_CACHE_CTL), 1);
+  PUSH_DATA (push, 0x20);
+   }
+
if (nv50->vbo_fifo) {
   nv50_push_vbo(nv50, info);
   push->kick_notify = nv50_default_kick_notify;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
index 6f9e790..188c7d7 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c
@@ -899,6 +899,9 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
 
push->kick_notify = nvc0_draw_vbo_kick_notify;
 
+   /* TODO: Instead of iterating over all the buffer resources looking for
+* coherent buffers, keep track of a context-wide count.
+*/
for (s = 0; s < 5 && !nvc0->cb_dirty; ++s) {
   uint32_t valid = nvc0->constbuf_valid[s];
 
@@ -924,6 +927,23 @@ nvc0_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
   nvc0->cb_dirty = false;
}
 
+   for (s = 0; s < 5; ++s) {
+  for (int i = 0; i < nvc0->num_textures[s]; ++i) {
+ struct nv50_tic_entry *tic = nv50_tic_entry(nvc0->textures[s][i]);
+ struct pipe_resource *res;
+ if (!tic)
+continue;
+ res = nvc0->textures[s][i]->texture;
+ if (res->target != PIPE_BUFFER ||
+ !(res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
+continue;
+
+ BEGIN_NVC0(push, NVC0_3D(TEX_CACHE_CTL), 1);
+ PUSH_DATA (push, (tic->id << 4) | 1);
+ NOUVEAU_DRV_STAT(&nvc0->screen->base, tex_cache_flush_count, 1);
+  }
+   }
+
if (nvc0->state.vbo_mode) {
   nvc0_push_vbo(nvc0, info);
   push->kick_notify = nvc0_default_kick_notify;
-- 
2.4.6



[Mesa-dev] [PATCH] i965: fix textureGrad for cubemaps

2015-09-16 Thread Tapani Pälli
Fixes regression caused by commit
2b1cdb0eddb73f62e4848d4b64840067f1f70865 in:
   ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_frag

No regressions observed in deqp, CTS or Piglit.

Signed-off-by: Tapani Pälli 
Signed-off-by: Kevin Rogovin 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91114
Cc: "11.0 10.7" 
---
 .../dri/i965/brw_lower_texture_gradients.cpp   | 172 -
 1 file changed, 169 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp 
b/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
index 7a5f983..f8a31b7 100644
--- a/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
+++ b/src/mesa/drivers/dri/i965/brw_lower_texture_gradients.cpp
@@ -48,6 +48,7 @@ public:
 
 private:
void emit(ir_variable *, ir_rvalue *);
+   ir_variable *temp(void *ctx, const glsl_type *type, const char *name);
 };
 
 /**
@@ -60,6 +61,17 @@ lower_texture_grad_visitor::emit(ir_variable *var, ir_rvalue 
*value)
base_ir->insert_before(assign(var, value));
 }
 
+/**
+ * Emit a temporary variable declaration
+ */
+ir_variable *
+lower_texture_grad_visitor::temp(void *ctx, const glsl_type *type, const char 
*name)
+{
+   ir_variable *var = new(ctx) ir_variable(type, name, ir_var_temporary);
+   base_ir->insert_before(var);
+   return var;
+}
+
 static const glsl_type *
 txs_type(const glsl_type *type)
 {
@@ -162,9 +174,163 @@ lower_texture_grad_visitor::visit_leave(ir_texture *ir)
 */
ir->op = ir_txl;
if (ir->sampler->type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE) {
-  ir->lod_info.lod = expr(ir_binop_add,
-  expr(ir_unop_log2, rho),
-  new(mem_ctx) ir_constant(-1.0f));
+  /* Cubemap texture lookups first generate a texture coordinate normalized
+ to [-1, 1] on the appropriate face. The appropriate face is determined
+ by which component has largest magnitude and its sign. The texture
+ coordinate is the quotient of the remaining texture coordinates against
+ the absolute value of the component of largest magnitude. This division
+ requires that the computing of the derivative of the texel coordinate
+ must use the quotient rule. The high level GLSL code is as follows:
+
+ Step 1: selection
+
+ vec3 abs_p, Q, dQdx, dQdy;
+ abs_p = abs(ir->coordinate);
+ if (abs_p.x >= max(abs_p.y, abs_p.z)) {
+Q = ir->coordinate.yzx;
+dQdx = ir->lod_info.grad.dPdx.yzx;
+dQdy = ir->lod_info.grad.dPdy.yzx;
+ }
+ if (abs_p.y >= max(abs_p.x, abs_p.z)) {
+Q = ir->coordinate.xzy;
+dQdx = ir->lod_info.grad.dPdx.xzy;
+dQdy = ir->lod_info.grad.dPdy.xzy;
+ }
+ if (abs_p.z >= max(abs_p.x, abs_p.y)) {
+Q = ir->coordinate;
+dQdx = ir->lod_info.grad.dPdx;
+dQdy = ir->lod_info.grad.dPdy;
+ }
+
+ Step 2: use quotient rule to compute derivative. The normalized to [-1, 1]
+ texel coordinate is given by Q.xy / (sign(Q.z) * Q.z). We are only concerned
+ with the magnitudes of the derivatives whose values are not affected by the
+ sign. We drop the sign from the computation.
+
+ vec2 dx, dy;
+ float recip;
+
+ recip = 1.0 / Q.z;
+ dx = recip * ( dQdx.xy - Q.xy * (dQdx.z * recip) );
+ dy = recip * ( dQdy.xy - Q.xy * (dQdy.z * recip) );
+
+ Step 3: compute LOD. At this point we have the derivatives of the
+ texture coordinates normalized to [-1,1]. We take the LOD to be
+  result = log2(max(sqrt(dot(dx, dx)), sqrt(dot(dy, dy))) * 0.5 * L)
+ = -1.0 + log2(max(sqrt(dot(dx, dx)), sqrt(dot(dy, dy))) * L)
+ = -1.0 + log2(sqrt(max(dot(dx, dx), dot(dy, dy))) * L)
+ = -1.0 + log2(sqrt(L * L * max(dot(dx, dx), dot(dy, dy))))
+ = -1.0 + 0.5 * log2(L * L * max(dot(dx, dx), dot(dy, dy)))
+ where L is the dimension of the cubemap. The code is:
+
+ float m, L, result;
+ m = max(dot(dx, dx), dot(dy, dy));
+ L = textureSize(sampler, 0).x;
+ result = -1.0 + 0.5 * log2(L * L * m);
+   */
+
+/* Helpers to make code more human readable. */
+#define EMIT(instr) base_ir->insert_before(instr)
+#define THEN(irif, instr) irif->then_instructions.push_tail(instr)
+#define CLONE(x) x->clone(mem_ctx, NULL)
+
+  ir_variable *abs_p = temp(mem_ctx, glsl_type::vec3_type, "abs_p");
+
+  EMIT(assign(abs_p, swizzle_for_size(abs(CLONE(ir->coordinate)), 3)));
+
+  ir_variable *Q = temp(mem_ctx, glsl_type::vec3_type, "Q");
+  ir_variable *dQdx = temp(mem_ctx, glsl_type::vec3_type, "dQdx");
+  ir_variable *dQdy = temp(mem_ctx, glsl_type::vec3_type, "dQdy");
+
+  /* unmodified dPdx, dPdy values */
+  ir_rvalue *dPdx = ir->lod_info.grad.dPdx;
+  ir_rvalue *dPdy = ir->lod_info.grad.dPdy
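
For readers following the comment in the patch above, here is a minimal standalone C++ sketch of its three steps (selection, quotient rule, LOD). It is not the IR that the patch emits. Vec2, Vec3, dot2(), select_major(), quotient_rule() and cube_grad_lod() are hypothetical names used only for illustration, and L stands for the cubemap dimension that the comment obtains from textureSize(sampler, 0).x.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

/* Hypothetical plain-data vectors standing in for GLSL vec2/vec3. */
struct Vec2 { float x, y; };
struct Vec3 { float x, y, z; };

static float dot2(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }

/* Step 1: swizzle v so that the component matching the major axis of p
 * lands in .z. The comment's three ifs run in order, so on ties the later
 * test wins (z, then y); testing z first reproduces that behaviour. */
static Vec3 select_major(Vec3 p, Vec3 v)
{
   const float ax = std::fabs(p.x), ay = std::fabs(p.y), az = std::fabs(p.z);
   if (az >= std::max(ax, ay))
      return v;                      /* .xyz */
   if (ay >= std::max(ax, az))
      return Vec3{v.x, v.z, v.y};    /* .xzy */
   return Vec3{v.y, v.z, v.x};       /* .yzx */
}

/* Step 2: derivative of Q.xy / Q.z by the quotient rule (sign dropped). */
static Vec2 quotient_rule(Vec3 Q, Vec3 dQ)
{
   const float recip = 1.0f / Q.z;
   const float s = dQ.z * recip;
   return Vec2{recip * (dQ.x - Q.x * s), recip * (dQ.y - Q.y * s)};
}

/* Step 3: LOD = -1 + 0.5 * log2(L^2 * max(dot(dx, dx), dot(dy, dy))). */
float cube_grad_lod(Vec3 P, Vec3 dPdx, Vec3 dPdy, float L)
{
   assert(L > 0.0f);
   const Vec3 Q    = select_major(P, P);
   const Vec3 dQdx = select_major(P, dPdx);
   const Vec3 dQdy = select_major(P, dPdy);

   const Vec2 dx = quotient_rule(Q, dQdx);
   const Vec2 dy = quotient_rule(Q, dQdy);

   const float m = std::max(dot2(dx, dx), dot2(dy, dy));
   return -1.0f + 0.5f * std::log2(L * L * m);
}
```

With P = (0, 0, 1), dPdx = (1, 0, 0), dPdy = (0, 1, 0) and L = 4, the sketch gives m = 1 and an LOD of -1 + 0.5 * log2(16) = 1.0.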

Re: [Mesa-dev] [PATCH] mesa: fix errors when reading depth with glReadPixels

2015-09-16 Thread Tapani Pälli



On 09/15/2015 07:15 PM, Emil Velikov wrote:
> Hi Tapani,
>
> On 15 September 2015 at 08:13, Tapani Pälli wrote:
>> OpenGL ES 3.0 spec 3.7.2 "Transfer of Pixel Rectangles" specifies
>> DEPTH_COMPONENT, UNSIGNED_INT as a valid couple, validation for
>> internal format is checked by is_float_depth().
>>
>> Fix regression caused by 81d2fd9 in following CTS test:
>
> Please use 8+ symbols for the sha and/or add the mesa-stable tag.
>
> With the former I have a script that will pick the patch, in case the
> latter is missed :-)

OK, will do. BTW I was just about to push this but I noticed a failing
Piglit test "ext_packed_depth_stencil-errors" (all of CTS passes
though). Will need to investigate this a bit more.


// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] freedreno/a3xx: disable filtering for texture buffers and int textures

2015-09-16 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/freedreno/a3xx/fd3_texture.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_texture.c b/src/gallium/drivers/freedreno/a3xx/fd3_texture.c
index 3367f23..6ed5e0c 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_texture.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_texture.c
@@ -229,6 +229,8 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
fd3_tex_swiz(cso->format, cso->swizzle_r, cso->swizzle_g,
cso->swizzle_b, cso->swizzle_a);
 
+   if (prsc->target == PIPE_BUFFER || util_format_is_pure_integer(cso->format))
+   so->texconst0 |= A3XX_TEX_CONST_0_NOCONVERT;
if (util_format_is_srgb(cso->format))
so->texconst0 |= A3XX_TEX_CONST_0_SRGB;
 
-- 
2.4.6



[Mesa-dev] [PATCH 1/2] freedreno/a3xx: fix texture buffers, enable offsets

2015-09-16 Thread Ilia Mirkin
The main issue is that the current logic looked into cso->u.tex, which
is the wrong side of the union to look into for texture buffers. While I
was at it, it was easy enough to add the logic to handle offsets
(first_element).

 - reduce texture buffer size limit (determined experimentally)
 - don't look at first/last levels, instead look at first/last element
 - include the first element offset
 - set offset alignment to 16 (determined experimentally)
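
To make the first/last element handling concrete, here is a minimal C++ sketch of the arithmetic described above. It is only an illustration, not driver code: BufferView, view_byte_offset(), view_width() and offset_is_aligned() are hypothetical names, block_size stands for what util_format_get_blocksize() returns for the view format, and the sketch assumes the offset alignment of 16 is expressed in bytes.

```cpp
#include <cassert>
#include <cstdint>

/* Hypothetical stand-in for the u.buf side of a sampler view. */
struct BufferView {
   uint32_t first_element;
   uint32_t last_element;
   uint32_t block_size;   /* bytes per element of the view format */
};

/* Byte offset into the buffer object at which the view starts. */
uint32_t view_byte_offset(const BufferView &v)
{
   return v.first_element * v.block_size;
}

/* Number of elements visible through the view (inclusive range). */
uint32_t view_width(const BufferView &v)
{
   assert(v.last_element >= v.first_element);
   return v.last_element - v.first_element + 1;
}

/* Assumption: the alignment requirement of 16 is in bytes. */
bool offset_is_aligned(const BufferView &v)
{
   return view_byte_offset(v) % 16 == 0;
}
```

For example, a view over elements 4..11 of a 16-byte-per-element format starts at byte offset 64 and is 8 elements wide.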

Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/freedreno/a3xx/fd3_emit.c| 16 +++-
 src/gallium/drivers/freedreno/a3xx/fd3_texture.c | 21 +
 src/gallium/drivers/freedreno/freedreno_screen.c |  9 -
 3 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
index 4e56a71..e4c618b 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_emit.c
@@ -209,13 +209,19 @@ emit_textures(struct fd_context *ctx, struct fd_ringbuffer *ring,
fd3_pipe_sampler_view(tex->textures[i]) :
&dummy_view;
struct fd_resource *rsc = fd_resource(view->base.texture);
-   unsigned start = fd_sampler_first_level(&view->base);
-   unsigned end   = fd_sampler_last_level(&view->base);;
+   if (rsc && rsc->base.b.target == PIPE_BUFFER) {
+   OUT_RELOC(ring, rsc->bo, view->base.u.buf.first_element *
+ util_format_get_blocksize(view->base.format), 0, 0);
+   j = 1;
+   } else {
+   unsigned start = fd_sampler_first_level(&view->base);
+   unsigned end   = fd_sampler_last_level(&view->base);;
 
-   for (j = 0; j < (end - start + 1); j++) {
-   struct fd_resource_slice *slice =
+   for (j = 0; j < (end - start + 1); j++) {
+   struct fd_resource_slice *slice =
fd_resource_slice(rsc, j + start);
-   OUT_RELOC(ring, rsc->bo, slice->offset, 0, 0);
+   OUT_RELOC(ring, rsc->bo, slice->offset, 0, 0);
+   }
}
 
/* pad the remaining entries w/ null: */
diff --git a/src/gallium/drivers/freedreno/a3xx/fd3_texture.c b/src/gallium/drivers/freedreno/a3xx/fd3_texture.c
index 2d6ecb2..3367f23 100644
--- a/src/gallium/drivers/freedreno/a3xx/fd3_texture.c
+++ b/src/gallium/drivers/freedreno/a3xx/fd3_texture.c
@@ -211,8 +211,7 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
 {
struct fd3_pipe_sampler_view *so = CALLOC_STRUCT(fd3_pipe_sampler_view);
struct fd_resource *rsc = fd_resource(prsc);
-   unsigned lvl = fd_sampler_first_level(cso);
-   unsigned miplevels = fd_sampler_last_level(cso) - lvl;
+   unsigned lvl;
uint32_t sz2 = 0;
 
if (!so)
@@ -227,17 +226,31 @@ fd3_sampler_view_create(struct pipe_context *pctx, struct pipe_resource *prsc,
so->texconst0 =
A3XX_TEX_CONST_0_TYPE(tex_type(prsc->target)) |
A3XX_TEX_CONST_0_FMT(fd3_pipe2tex(cso->format)) |
-   A3XX_TEX_CONST_0_MIPLVLS(miplevels) |
fd3_tex_swiz(cso->format, cso->swizzle_r, cso->swizzle_g,
cso->swizzle_b, cso->swizzle_a);
 
if (util_format_is_srgb(cso->format))
so->texconst0 |= A3XX_TEX_CONST_0_SRGB;
 
-   so->texconst1 =
+   if (prsc->target == PIPE_BUFFER) {
+   lvl = 0;
+   so->texconst1 =
+   A3XX_TEX_CONST_1_FETCHSIZE(fd3_pipe2fetchsize(cso->format)) |
+   A3XX_TEX_CONST_1_WIDTH(cso->u.buf.last_element -
+  cso->u.buf.first_element + 1) |
+   A3XX_TEX_CONST_1_HEIGHT(1);
+   } else {
+   unsigned miplevels;
+
+   lvl = fd_sampler_first_level(cso);
+   miplevels = fd_sampler_last_level(cso) - lvl;
+
+   so->texconst0 |= A3XX_TEX_CONST_0_MIPLVLS(miplevels);
+   so->texconst1 =
A3XX_TEX_CONST_1_FETCHSIZE(fd3_pipe2fetchsize(cso->format)) |
A3XX_TEX_CONST_1_WIDTH(u_minify(prsc->width0, lvl)) |
A3XX_TEX_CONST_1_HEIGHT(u_minify(prsc->height0, lvl));
+   }
/* when emitted, A3XX_TEX_CONST_2_INDX() must be OR'd in: */
so->texconst2 =

A3XX_T