date:20150921

Re: [Mesa-dev] [PATCH v5 26/70] glsl: Add parser/compiler support for std430 interface packing qualifier

2015-09-21 Thread Samuel Iglesias Gonsálvez



On 18/09/15 20:14, Kristian Høgsberg wrote:
> On Thu, Sep 10, 2015 at 03:35:42PM +0200, Iago Toral Quiroga wrote:
>> From: Samuel Iglesias Gonsalvez 
>>
>> v2:
>> - Fix a missing check in has_layout()
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> Reviewed-by: Jordan Justen 
>> ---
>>  src/glsl/ast.h   |  1 +
>>  src/glsl/ast_to_hir.cpp  | 20 
>>  src/glsl/ast_type.cpp|  2 ++
>>  src/glsl/glsl_parser.yy  |  2 ++
>>  src/glsl/glsl_types.h|  3 ++-
>>  src/glsl/link_uniform_blocks.cpp | 15 ---
>>  src/mesa/main/mtypes.h   |  3 ++-
>>  7 files changed, 37 insertions(+), 9 deletions(-)
>>
>> diff --git a/src/glsl/ast.h b/src/glsl/ast.h
>> index cca32b3..4c31436 100644
>> --- a/src/glsl/ast.h
>> +++ b/src/glsl/ast.h
>> @@ -491,6 +491,7 @@ struct ast_type_qualifier {
>>   /** \name Layout qualifiers for GL_ARB_uniform_buffer_object */
>>   /** \{ */
>>   unsigned std140:1;
>> + unsigned std430:1;
>>   unsigned shared:1;
>>   unsigned packed:1;
>>   unsigned column_major:1;
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 92038a6..566cc87 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -2920,11 +2920,12 @@ apply_type_qualifier_to_variable(const struct 
>> ast_type_qualifier *qual,
>> var->data.depth_layout = ir_depth_layout_none;
>>  
>> if (qual->flags.q.std140 ||
>> +   qual->flags.q.std430 ||
>> qual->flags.q.packed ||
>> qual->flags.q.shared) {
>>_mesa_glsl_error(loc, state,
>> -   "uniform block layout qualifiers std140, packed, and 
>> "
>> -   "shared can only be applied to uniform blocks, not "
>> +   "uniform block layout qualifiers std140, std430, 
>> packed, "
>> +   "and shared can only be applied to uniform blocks, 
>> not "
>> "members");
> 
> Update the error to also mention shader storage blocks, not just ubos?
> 

OK, I will do it.

Sam

>> }
>>  
>> @@ -5691,12 +5692,14 @@ ast_process_structure_or_interface_block(exec_list 
>> *instructions,
>>   const struct ast_type_qualifier *const qual =
>>  & decl_list->type->qualifier;
>>   if (qual->flags.q.std140 ||
>> + qual->flags.q.std430 ||
>>   qual->flags.q.packed ||
>>   qual->flags.q.shared) {
>>  _mesa_glsl_error(&loc, state,
>>   "uniform/shader storage block layout 
>> qualifiers "
>> - "std140, packed, and shared can only be 
>> applied "
>> - "to uniform/shader storage blocks, not 
>> members");
>> + "std140, std430, packed, and shared can only 
>> be "
>> + "applied to uniform/shader storage blocks, not 
>> "
>> + "members");
>>   }
>>  
>>   if (qual->flags.q.constant) {
>> @@ -5908,6 +5911,13 @@ ast_interface_block::hir(exec_list *instructions,
>> this->block_name);
>> }
>>  
>> +   if (!this->layout.flags.q.buffer &&
>> +   this->layout.flags.q.std430) {
>> +  _mesa_glsl_error(&loc, state,
>> +   "std430 storage block layout qualifier is supported "
>> +   "only for shader storage blocks");
>> +   }
>> +
>> /* The ast_interface_block has a list of ast_declarator_lists.  We
>>  * need to turn those into ir_variables with an association
>>  * with this uniform block.
>> @@ -5917,6 +5927,8 @@ ast_interface_block::hir(exec_list *instructions,
>>packing = GLSL_INTERFACE_PACKING_SHARED;
>> } else if (this->layout.flags.q.packed) {
>>packing = GLSL_INTERFACE_PACKING_PACKED;
>> +   } else if (this->layout.flags.q.std430) {
>> +  packing = GLSL_INTERFACE_PACKING_STD430;
>> } else {
>>/* The default layout is std140.
>> */
>> diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
>> index a4671e2..08a4504 100644
>> --- a/src/glsl/ast_type.cpp
>> +++ b/src/glsl/ast_type.cpp
>> @@ -65,6 +65,7 @@ ast_type_qualifier::has_layout() const
>>|| this->flags.q.depth_less
>>|| this->flags.q.depth_unchanged
>>|| this->flags.q.std140
>> +  || this->flags.q.std430
>>|| this->flags.q.shared
>>|| this->flags.q.column_major
>>|| this->flags.q.row_major
>> @@ -123,6 +124,7 @@ ast_type_qualifier::merge_qualifier(YYLTYPE *loc,
>> ubo_layout_mask.flags.q.std140 = 1;
>> ubo_layout_mask.flags.q.packed = 1;
>> ubo_layout_mask.flags.q.shared = 1;
>> +   ubo_layout_mask.flags.q.std430 = 1;
>>  
>> ast_type_qualifier ubo_binding_mask;
>> ubo_binding_mask.flags.i = 0;
>> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
>> index

Re: [Mesa-dev] [PATCH v5 29/70] glsl: a shader storage buffer must be smaller than the maximum size allowed

2015-09-21 Thread Samuel Iglesias Gonsálvez



On 18/09/15 20:59, Kristian Høgsberg wrote:
> On Thu, Sep 10, 2015 at 03:35:45PM +0200, Iago Toral Quiroga wrote:
>> From: Samuel Iglesias Gonsalvez 
>>
>> Otherwise, generate a link time error as per the
>> ARB_shader_storage_buffer_object spec.
>>
>> v2:
>> - Fix error message (Jordan)
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> Reviewed-by: Jordan Justen 
>> ---
>>  src/glsl/glsl_types.cpp  |  9 +++--
>>  src/glsl/link_uniform_blocks.cpp | 19 +++
>>  src/glsl/linker.cpp  |  2 +-
>>  src/glsl/linker.h|  1 +
>>  4 files changed, 28 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
>> index d97991a..f8227df 100644
>> --- a/src/glsl/glsl_types.cpp
>> +++ b/src/glsl/glsl_types.cpp
>> @@ -1325,7 +1325,7 @@ glsl_type::std140_size(bool row_major) const
>>  * rounded up to the next multiple of the base alignment of the
>>  * structure.
>>  */
>> -   if (this->is_record()) {
>> +   if (this->is_record() || this->is_interface()) {
>>unsigned size = 0;
>>unsigned max_align = 0;
>>  
>> @@ -1341,7 +1341,12 @@ glsl_type::std140_size(bool row_major) const
>>  
>>   const struct glsl_type *field_type = this->fields.structure[i].type;
>>   unsigned align = field_type->std140_base_alignment(field_row_major);
>> - size = glsl_align(size, align);
>> +
>> + /* Ignore unsized arrays when calculating size */
>> + if (field_type->is_unsized_array())
>> +continue;
>> +
>> + size = glsl_align(size, align);
>>   size += field_type->std140_size(field_row_major);
>>  
>>   max_align = MAX2(align, max_align);
> 
> These two chunks look like they should be their own patch ("Add
> unsized array support to glsl_type::std140_size" or such).
> 

Agreed. I will do that change.

Thanks for reviewing!

Sam

>> diff --git a/src/glsl/link_uniform_blocks.cpp 
>> b/src/glsl/link_uniform_blocks.cpp
>> index 8f65f4a..7ceffee 100644
>> --- a/src/glsl/link_uniform_blocks.cpp
>> +++ b/src/glsl/link_uniform_blocks.cpp
>> @@ -187,6 +187,7 @@ struct block {
>>  
>>  unsigned
>>  link_uniform_blocks(void *mem_ctx,
>> +struct gl_context *ctx,
>>  struct gl_shader_program *prog,
>>  struct gl_shader **shader_list,
>>  unsigned num_shaders,
>> @@ -308,6 +309,15 @@ link_uniform_blocks(void *mem_ctx,
>>  
>>  blocks[i].UniformBufferSize = parcel.buffer_size;
>>  
>> +/* Check SSBO size is lower than maximum supported size for 
>> SSBO */
>> +if (b->is_shader_storage &&
>> +parcel.buffer_size > ctx->Const.MaxShaderStorageBlockSize) {
>> +   linker_error(prog, "shader storage block `%s' has size %d, "
>> +"which is larger than than the maximum allowed 
>> (%d)",
>> +block_type->name,
>> +parcel.buffer_size,
>> +ctx->Const.MaxShaderStorageBlockSize);
>> +}
>>  blocks[i].NumUniforms =
>> (unsigned)(ptrdiff_t)(&variables[parcel.index] - 
>> blocks[i].Uniforms);
>>  
>> @@ -328,6 +338,15 @@ link_uniform_blocks(void *mem_ctx,
>>  
>>   blocks[i].UniformBufferSize = parcel.buffer_size;
>>  
>> + /* Check SSBO size is lower than maximum supported size for SSBO */
>> + if (b->is_shader_storage &&
>> + parcel.buffer_size > ctx->Const.MaxShaderStorageBlockSize) {
>> +linker_error(prog, "shader storage block `%s' has size %d, "
>> + "which is larger than than the maximum allowed 
>> (%d)",
>> + block_type->name,
>> + parcel.buffer_size,
>> + ctx->Const.MaxShaderStorageBlockSize);
>> + }
>>   blocks[i].NumUniforms =
>>  (unsigned)(ptrdiff_t)(&variables[parcel.index] - 
>> blocks[i].Uniforms);
>>  
>> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
>> index e078f86..cf9f1f6 100644
>> --- a/src/glsl/linker.cpp
>> +++ b/src/glsl/linker.cpp
>> @@ -2023,7 +2023,7 @@ link_intrastage_shaders(void *mem_ctx,
>>  
>> /* Link up uniform blocks defined within this stage. */
>> const unsigned num_uniform_blocks =
>> -  link_uniform_blocks(mem_ctx, prog, shader_list, num_shaders,
>> +  link_uniform_blocks(mem_ctx, ctx, prog, shader_list, num_shaders,
>>&uniform_blocks);
>> if (!prog->LinkStatus)
>>return NULL;
>> diff --git a/src/glsl/linker.h b/src/glsl/linker.h
>> index fb8f81e..953b49f 100644
>> --- a/src/glsl/linker.h
>> +++ b/src/glsl/linker.h
>> @@ -56,6 +56,7 @@ link_uniform_blocks_are_compatible(const gl_uniform_block 
>> *a,
>>  
>>  extern unsigned
>>  link_uniform_blocks(void *mem_ctx,
>> +struct gl_context *ctx,
>>

Re: [Mesa-dev] [RFC] nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const

2015-09-21 Thread Eduardo Lima Mitev

On 09/19/2015 09:00 PM, Jason Ekstrand wrote:
> 
> On Sep 18, 2015 2:49 AM, "Eduardo Lima Mitev"  > wrote:
>>
>> When both fadd and fmul instructions have at least one operand that is a
>> constant and it is only used once, the total number of instructions can
>> be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because
>> the constants will be progagated as immediate operands of fmul and fadd.
> 
> Right... I'm pretty sure that I've written this patch once before.  At
> the time we didn't do this because we thought Matt's constant combining
> would take care of most of those issues and we could just emit the MAD. 
> However, that pass doesn't always combine things optimally and doesn't
> exist for vec4.
> 
>> This patch modifies opt_peephole_ffma pass to detect this situation and
>> bails-out fusing fmul+fadd into ffma.
>>
>> As shown in shader-db results below, it seems to help a good bunch.
> However,
>> there are some caveats:
>>
>> * It seems i965 specific, so I'm not sure if modifying the NIR pass
>> directly is desired, as opposed to moving this to the backend.
>>
>> * There are still a high number of HURTs, but these could be reduced
> by being
>> more specific in the conditions to bailout.
>>
>> total instructions in shared programs: 1683959 -> 1677447 (-0.39%)
>> instructions in affected programs: 604918 -> 598406 (-1.08%)
>> helped:4633
>> HURT:  804
>> GAINED:0
>> LOST:  0
> 
> What does that look like if you split that between vec4 and fs?
> 

The above is shader-db results filtered for VS only. Following is the
same results filtered for FS:

total instructions in shared programs: 4593761 -> 4590252 (-0.08%)
instructions in affected programs: 538690 -> 535181 (-0.65%)
helped:2981
HURT:  38
GAINED:4
LOST:  0

And this is the raw, unfiltered report:

total instructions in shared programs: 6277603 -> 6267582 (-0.16%)
instructions in affected programs: 1143656 -> 1133635 (-0.88%)
helped:7614
HURT:  842
GAINED:4
LOST:  0

For FS, the gain is more noticeable because vector instructions tend to
neutralize the benefits (constants cannot be propagated to
multi-component operands).

Eduardo

>> ---
>>  src/glsl/nir/nir_opt_peephole_ffma.c | 31 +++
>>  1 file changed, 31 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c
> b/src/glsl/nir/nir_opt_peephole_ffma.c
>> index 4f0f0da..da47f8f 100644
>> --- a/src/glsl/nir/nir_opt_peephole_ffma.c
>> +++ b/src/glsl/nir/nir_opt_peephole_ffma.c
>> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
>> return alu;
>>  }
>>
>> +/**
>> + * Given a list of (at least two) nir_alu_src's, tells if any of them
> is a
>> + * constant value and is used only once.
>> + */
>> +static bool
>> +any_alu_src_is_a_constant(nir_alu_src srcs[])
>> +{
>> +   for (unsigned i = 0; i < 2; i++) {
>> +  if (srcs[i].src.ssa->parent_instr->type ==
> nir_instr_type_load_const) {
>> + nir_load_const_instr *load_const =
>> +nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
>> +
>> + if (list_length(&load_const->def.uses) == 1 &&
>> + list_length(&load_const->def.if_uses) == 0) {
>> +return true;
>> + }
>> +  }
>> +   }
>> +
>> +   return false;
>> +}
>> +
>>  static bool
>>  nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>>  {
>> @@ -183,6 +205,15 @@ nir_opt_peephole_ffma_block(nir_block *block,
> void *void_state)
>>mul_src[0] = mul->src[0].src.ssa;
>>mul_src[1] = mul->src[1].src.ssa;
>>
>> +  /* If any of the operands of the fmul and any of the fadd is a
> constant,
>> +   * we bypass because it will be more efficient as the constants
> will be
>> +   * propagated as operands, potentially saving two load_const
> instructions.
>> +   */
>> +  if (any_alu_src_is_a_constant(mul->src) &&
>> +  any_alu_src_is_a_constant(add->src)) {
>> + continue;
>> +  }
>> +
>>if (abs) {
>>   for (unsigned i = 0; i < 2; i++) {
>>  nir_alu_instr *abs = nir_alu_instr_create(state->mem_ctx,
>> --
>> 2.4.6
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org 
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v5 28/70] glsl: add std430 interface packing support to ssbo related operations

2015-09-21 Thread Samuel Iglesias Gonsálvez



On 18/09/15 21:04, Kristian Høgsberg wrote:
> On Thu, Sep 10, 2015 at 03:35:44PM +0200, Iago Toral Quiroga wrote:
>> From: Samuel Iglesias Gonsalvez 
>>
>> v2:
>> - Get interface packing information from interface's type, not the variable 
>> type.
>> - Simplify is_std430 condition in emit_access() for readability (Jordan)
>> - Add a commment explaing why array of three-component vector case is 
>> different
>>   in std430 than the rest of cases.
>> - Add calls to std430_array_stride().
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez 
>> ---
>>  src/glsl/lower_ubo_reference.cpp | 102 
>> ++-
>>  1 file changed, 78 insertions(+), 24 deletions(-)
>>
>> diff --git a/src/glsl/lower_ubo_reference.cpp 
>> b/src/glsl/lower_ubo_reference.cpp
>> index 8694383..7e45a26 100644
>> --- a/src/glsl/lower_ubo_reference.cpp
>> +++ b/src/glsl/lower_ubo_reference.cpp
>> @@ -147,7 +147,8 @@ public:
>>  ir_rvalue **offset,
>>  unsigned *const_offset,
>>  bool *row_major,
>> -int *matrix_columns);
>> +int *matrix_columns,
>> +unsigned packing);
>> ir_expression *ubo_load(const struct glsl_type *type,
>> ir_rvalue *offset);
>> ir_call *ssbo_load(const struct glsl_type *type,
>> @@ -164,7 +165,7 @@ public:
>> void emit_access(bool is_write, ir_dereference *deref,
>>  ir_variable *base_offset, unsigned int deref_offset,
>>  bool row_major, int matrix_columns,
>> -unsigned write_mask);
>> +bool is_std430, unsigned write_mask);
>>  
>> ir_visitor_status visit_enter(class ir_expression *);
>> ir_expression *calculate_ssbo_unsized_array_length(ir_expression *expr);
>> @@ -176,7 +177,8 @@ public:
>>  ir_variable *);
>> ir_expression *emit_ssbo_get_buffer_size();
>>  
>> -   unsigned calculate_unsized_array_stride(ir_dereference *deref);
>> +   unsigned calculate_unsized_array_stride(ir_dereference *deref,
>> +   unsigned packing);
>>  
>> void *mem_ctx;
>> struct gl_shader *shader;
>> @@ -257,7 +259,8 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>>   ir_rvalue **offset,
>>   unsigned *const_offset,
>>   bool *row_major,
>> - int *matrix_columns)
>> + int *matrix_columns,
>> + unsigned packing)
>>  {
>> /* Determine the name of the interface block */
>> ir_rvalue *nonconst_block_index;
>> @@ -343,8 +346,15 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>>  const bool array_row_major =
>> is_dereferenced_thing_row_major(deref_array);
>>  
>> -array_stride = deref_array->type->std140_size(array_row_major);
>> -array_stride = glsl_align(array_stride, 16);
>> +/* The array type will give the correct interface packing
>> + * information
>> + */
>> +if (packing == GLSL_INTERFACE_PACKING_STD430) {
>> +   array_stride = 
>> deref_array->type->std430_array_stride(array_row_major);
>> +} else {
>> +   array_stride = 
>> deref_array->type->std140_size(array_row_major);
>> +   array_stride = glsl_align(array_stride, 16);
>> +}
>>   }
>>  
>>   ir_rvalue *array_index = deref_array->array_index;
>> @@ -380,7 +390,12 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>>  
>>  ralloc_free(field_deref);
>>  
>> -unsigned field_align = 
>> type->std140_base_alignment(field_row_major);
>> +unsigned field_align = 0;
>> +
>> +if (packing == GLSL_INTERFACE_PACKING_STD430)
>> +   field_align = type->std430_base_alignment(field_row_major);
>> +else
>> +   field_align = type->std140_base_alignment(field_row_major);
>>  
>>  intra_struct_offset = glsl_align(intra_struct_offset, 
>> field_align);
>>  
>> @@ -388,7 +403,10 @@ 
>> lower_ubo_reference_visitor::setup_for_load_or_store(ir_variable *var,
>> deref_record->field) == 0)
>> break;
>>  
>> -intra_struct_offset += type->std140_size(field_row_major);
>> +if (packing == GLSL_INTERFACE_PACKING_STD430)
>> +   intra_struct_offset += type->std430_size(field_row_major);
>> +else
>> +   intra_struct_

Re: [Mesa-dev] [PATCH v5 70/70] glsl: Mark as active all elements of shared/std140 block arrays

2015-09-21 Thread Tapani Pälli

Seems like a nice fix, takes ES3 CTS failures from 116 to 64 on Haswell 
(most failing tests are with ubos).


Tested-by: Tapani Pälli 

This is individual patch not related to just SSBOs, maybe this could be 
pushed separately from the rest?


// Tapani

On 09/10/2015 04:36 PM, Iago Toral Quiroga wrote:

From: Antia Puentes 

Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' blocks
or block members) considered as active 'shared' and 'std140' uniform
blocks and uniform block arrays, but did not include the block array
elements. Because of that, it was possible to have an active uniform
block array without any elements marked as used, making the assertion
((b->num_array_elements > 0) == b->type->is_array())
in link_uniform_blocks() fail.

Fixes the following 5 dEQP tests:

  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18
  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24
  * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19
  * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49
  * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36

Fixes bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83508
---
  src/glsl/link_uniform_block_active_visitor.cpp | 23 +++
  1 file changed, 23 insertions(+)

diff --git a/src/glsl/link_uniform_block_active_visitor.cpp 
b/src/glsl/link_uniform_block_active_visitor.cpp
index 5102947..fbe79de 100644
--- a/src/glsl/link_uniform_block_active_visitor.cpp
+++ b/src/glsl/link_uniform_block_active_visitor.cpp
@@ -106,6 +106,22 @@ link_uniform_block_active_visitor::visit(ir_variable *var)
 assert(b->num_array_elements == 0);
 assert(b->array_elements == NULL);
 assert(b->type != NULL);
+   assert(!b->type->is_array() || b->has_instance_name);
+
+   /* For uniform block arrays declared with a shared or std140 layout
+* qualifier, mark all its instances as used.
+*/
+   if (b->type->is_array() && b->type->length > 0) {
+  b->num_array_elements = b->type->length;
+  b->array_elements = reralloc(this->mem_ctx,
+   b->array_elements,
+   unsigned,
+   b->num_array_elements);
+
+  for (unsigned i = 0; i < b->num_array_elements; i++) {
+ b->array_elements[i] = i;
+  }
+   }

 return visit_continue;
  }
@@ -147,6 +163,13 @@ 
link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir)
 assert((b->num_array_elements == 0) == (b->array_elements == NULL));
 assert(b->type != NULL);

+   /* If the block array was declared with a shared or
+* std140 layout qualifier, all its instances have been already marked
+* as used in link_uniform_block_active_visitor::visit(ir_variable *).
+*/
+   if (var->type->interface_packing != GLSL_INTERFACE_PACKING_PACKED)
+  return visit_continue_with_parent;
+
 ir_constant *c = ir->array_index->as_constant();

 if (c) {


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v5 40/70] nir: Implement __intrinsic_load_ssbo

2015-09-21 Thread Samuel Iglesias Gonsálvez



On 19/09/15 00:30, Kristian Høgsberg wrote:
> On Thu, Sep 10, 2015 at 03:35:56PM +0200, Iago Toral Quiroga wrote:
>> v2:
>> - Fix ssbo loads with boolean variables.
>>
>> Reviewed-by: Connor Abbott 
>> ---
>>  src/glsl/nir/glsl_to_nir.cpp| 80 
>> -
>>  src/glsl/nir/nir_intrinsics.h   |  2 +-
>>  src/glsl/nir/nir_lower_phis_to_scalar.c |  2 +
>>  3 files changed, 81 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
>> index 6f1e20a..cb7b196 100644
>> --- a/src/glsl/nir/glsl_to_nir.cpp
>> +++ b/src/glsl/nir/glsl_to_nir.cpp
>> @@ -646,11 +646,14 @@ nir_visitor::visit(ir_call *ir)
>>   op = nir_intrinsic_image_size;
>>} else if (strcmp(ir->callee_name(), "__intrinsic_store_ssbo") == 0) {
>>   op = nir_intrinsic_store_ssbo;
>> +  } else if (strcmp(ir->callee_name(), "__intrinsic_load_ssbo") == 0) {
>> + op = nir_intrinsic_load_ssbo;
>>} else {
>>   unreachable("not reached");
>>}
>>  
>>nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader, op);
>> +  nir_alu_instr *load_ssbo_compare;
>>  
>>switch (op) {
>>case nir_intrinsic_atomic_counter_read_var:
>> @@ -776,11 +779,75 @@ nir_visitor::visit(ir_call *ir)
>>   instr->src[1] = evaluate_rvalue(block);
>>   break;
>>}
>> +  case nir_intrinsic_load_ssbo: {
>> + exec_node *param = ir->actual_parameters.get_head();
>> + ir_rvalue *block = ((ir_instruction *)param)->as_rvalue();
>> +
>> + param = param->get_next();
>> + ir_rvalue *offset = ((ir_instruction *)param)->as_rvalue();
>> +
>> + /* Check if we need the indirect version */
>> + ir_constant *const_offset = offset->as_constant();
>> + if (!const_offset) {
>> +op = nir_intrinsic_load_ssbo_indirect;
>> +ralloc_free(instr);
>> +instr = nir_intrinsic_instr_create(shader, op);
>> +instr->src[1] = evaluate_rvalue(offset);
>> +instr->const_index[0] = 0;
>> + } else {
>> +instr->const_index[0] = const_offset->value.u[0];
>> + }
>> +
>> + instr->src[0] = evaluate_rvalue(block);
>> +
>> + const glsl_type *type = ir->return_deref->var->type;
>> + instr->num_components = type->vector_elements;
>> +
>> + /* Setup destination register */
>> + nir_ssa_dest_init(&instr->instr, &instr->dest,
>> +   type->vector_elements, NULL);
>> +
>> + /* Insert the created nir instruction now since in the case of 
>> boolean
>> +  * result we will need to emit another instruction after it
>> +  */
>> + nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
> 
> I'd prefer moving this insert into the GLSL_TYPE_BOOL condition below...
> 
>> + /*
>> +  * In SSBO/UBO's, a true boolean value is any non-zero value, but 
>> we
>> +  * consider a true boolean to be ~0. Fix this up with a != 0
>> +  * comparison.
>> +  */
>> + if (type->base_type == GLSL_TYPE_BOOL) {
> 
> ... that is, here...
> 
>> +nir_load_const_instr *const_zero =
>> +   nir_load_const_instr_create(shader, 1);
>> +const_zero->value.u[0] = 0;
>> +nir_instr_insert_after_cf_list(this->cf_node_list,
>> +   &const_zero->instr);
>> +
>> +load_ssbo_compare = nir_alu_instr_create(shader, nir_op_ine);
> 
> and then, insteed of introducing and using 'load_ssbo_compare' here,
> re-use 'instr' for the compare instruction...
> 

I see why you suggest these changes but I don't like them:

instr is a pointer to nir_instrinsic_instr, while the compare
instruction is a nir_alu_instr*, so although we are lucky because some
fields are at the same offsets for both struct base addresses(*), this
could change in the future.

Even if I introduce load_ssbo_compare and, after setting it up, assign
'instr' with load_ssbo_compare pointer value to save some changes done
later, this would work now but it is not guaranteed to work in the future.

Sam

(*) For example:
load_ssbo_compare->dest.dest.ssa and instr->dest.ssa

>> +load_ssbo_compare->src[0].src.is_ssa = true;
>> +load_ssbo_compare->src[0].src.ssa = &instr->dest.ssa;
>> +load_ssbo_compare->src[1].src.is_ssa = true;
>> +load_ssbo_compare->src[1].src.ssa = &const_zero->def;
>> +for (unsigned i = 0; i < type->vector_elements; i++)
>> +   load_ssbo_compare->src[1].swizzle[i] = 0;
>> +nir_ssa_dest_init(&load_ssbo_compare->instr,
>> +  &load_ssbo_compare->dest.dest,
>> +  type->vector_elements, NULL);
>> +load_ssbo_compare->dest.write_mask = (1 << 
>> type->vector_elements) - 1;
>>

Re: [Mesa-dev] [PATCH v5 23/70] glsl: refactor parser processing of an interface block definition

2015-09-21 Thread Samuel Iglesias Gonsálvez



On 19/09/15 01:44, Kristian Høgsberg wrote:
> On Thu, Sep 10, 2015 at 03:35:39PM +0200, Iago Toral Quiroga wrote:
>> From: Samuel Iglesias Gonsalvez 
> 
> I'd be more specific in the subject and say:
> 
>   'glsl: Move interface block processing to glsl_parser_extras.cpp
> 
> 'Refactor' can mean many different things. Maybe even clarify in the
> commit message that there are no other changes.
> 

OK, I am going to update it.

Thanks,

Sam

>> ---
>>  src/glsl/ast.h  |   5 ++
>>  src/glsl/glsl_parser.yy | 127 
>> +---
>>  src/glsl/glsl_parser_extras.cpp | 122 ++
>>  3 files changed, 128 insertions(+), 126 deletions(-)
>>
>> diff --git a/src/glsl/ast.h b/src/glsl/ast.h
>> index 335f426..cca32b3 100644
>> --- a/src/glsl/ast.h
>> +++ b/src/glsl/ast.h
>> @@ -1172,4 +1172,9 @@ extern void
>>  check_builtin_array_max_size(const char *name, unsigned size,
>>   YYLTYPE loc, struct _mesa_glsl_parse_state 
>> *state);
>>  
>> +extern void _mesa_ast_process_interface_block(YYLTYPE *locp,
>> +  _mesa_glsl_parse_state *state,
>> +  ast_interface_block *const 
>> block,
>> +  const struct 
>> ast_type_qualifier q);
>> +
>>  #endif /* AST_H */
>> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
>> index 42108a3..7f00929 100644
>> --- a/src/glsl/glsl_parser.yy
>> +++ b/src/glsl/glsl_parser.yy
>> @@ -2634,132 +2634,7 @@ basic_interface_block:
>>block->block_name = $2;
>>block->declarations.push_degenerate_list_at_head(& $4->link);
>>  
>> -  if ($1.flags.q.buffer) {
>> - if (!state->has_shader_storage_buffer_objects()) {
>> -_mesa_glsl_error(& @1, state,
>> - "#version 430 / 
>> GL_ARB_shader_storage_buffer_object "
>> - "required for defining shader storage blocks");
>> - } else if (state->ARB_shader_storage_buffer_object_warn) {
>> -_mesa_glsl_warning(& @1, state,
>> -   "#version 430 / 
>> GL_ARB_shader_storage_buffer_object "
>> -   "required for defining shader storage 
>> blocks");
>> - }
>> -  } else if ($1.flags.q.uniform) {
>> - if (!state->has_uniform_buffer_objects()) {
>> -_mesa_glsl_error(& @1, state,
>> - "#version 140 / GL_ARB_uniform_buffer_object "
>> - "required for defining uniform blocks");
>> - } else if (state->ARB_uniform_buffer_object_warn) {
>> -_mesa_glsl_warning(& @1, state,
>> -   "#version 140 / GL_ARB_uniform_buffer_object 
>> "
>> -   "required for defining uniform blocks");
>> - }
>> -  } else {
>> - if (state->es_shader || state->language_version < 150) {
>> -_mesa_glsl_error(& @1, state,
>> - "#version 150 required for using "
>> - "interface blocks");
>> - }
>> -  }
>> -
>> -  /* From the GLSL 1.50.11 spec, section 4.3.7 ("Interface Blocks"):
>> -   * "It is illegal to have an input block in a vertex shader
>> -   *  or an output block in a fragment shader"
>> -   */
>> -  if ((state->stage == MESA_SHADER_VERTEX) && $1.flags.q.in) {
>> - _mesa_glsl_error(& @1, state,
>> -  "`in' interface block is not allowed for "
>> -  "a vertex shader");
>> -  } else if ((state->stage == MESA_SHADER_FRAGMENT) && $1.flags.q.out) {
>> - _mesa_glsl_error(& @1, state,
>> -  "`out' interface block is not allowed for "
>> -  "a fragment shader");
>> -  }
>> -
>> -  /* Since block arrays require names, and both features are added in
>> -   * the same language versions, we don't have to explicitly
>> -   * version-check both things.
>> -   */
>> -  if (block->instance_name != NULL) {
>> - state->check_version(150, 300, & @1, "interface blocks with "
>> -   "an instance name are not allowed");
>> -  }
>> -
>> -  uint64_t interface_type_mask;
>> -  struct ast_type_qualifier temp_type_qualifier;
>> -
>> -  /* Get a bitmask containing only the in/out/uniform/buffer
>> -   * flags, allowing us to ignore other irrelevant flags like
>> -   * interpolation qualifiers.
>> -   */
>> -  temp_type_qualifier.flags.i = 0;
>> -  temp_type_qualifier.flags.q.uniform = true;
>> -  temp_type_qualifier.flags.q.buffer = true;
>> -  temp_type_qualifier.flags.q.in = true;
>> -  temp_type_qualifier.flags.q.out = true;
>> -  interface_type_mask = temp_type_qualifier.flags.i;
>> -
>> -  /* Get t

Re: [Mesa-dev] [PATCH v5 38/70] i965/nir/vec4: Implement nir_intrinsic_store_ssbo

2015-09-21 Thread Iago Toral

On Fri, 2015-09-18 at 13:02 -0700, Kristian Høgsberg wrote:
> On Thu, Sep 10, 2015 at 03:35:54PM +0200, Iago Toral Quiroga wrote:
> > ---
> >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 148 
> > +
> >  1 file changed, 148 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > index f47b029..450441d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > @@ -23,8 +23,13 @@
> >  
> >  #include "brw_nir.h"
> >  #include "brw_vec4.h"
> > +#include "brw_vec4_builder.h"
> > +#include "brw_vec4_surface_builder.h"
> >  #include "glsl/ir_uniform.h"
> >  
> > +using namespace brw;
> > +using namespace brw::surface_access;
> > +
> >  namespace brw {
> >  
> >  void
> > @@ -556,6 +561,149 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> > *instr)
> >break;
> > }
> >  
> > +   case nir_intrinsic_store_ssbo_indirect:
> > +  has_indirect = true;
> > +  /* fallthrough */
> > +   case nir_intrinsic_store_ssbo: {
> > +  assert(devinfo->gen >= 7);
> > +
> > +  /* Block index */
> > +  src_reg surf_index;
> > +  nir_const_value *const_uniform_block =
> > + nir_src_as_const_value(instr->src[1]);
> > +  if (const_uniform_block) {
> > + unsigned index = prog_data->base.binding_table.ubo_start +
> > +  const_uniform_block->u[0];
> > + surf_index = src_reg(index);
> > + brw_mark_surface_used(&prog_data->base, index);
> > +  } else {
> > + surf_index = src_reg(this, glsl_type::uint_type);
> > + emit(ADD(dst_reg(surf_index), get_nir_src(instr->src[1], 1),
> > +  src_reg(prog_data->base.binding_table.ubo_start)));
> > + surf_index = emit_uniformize(surf_index);
> > +
> > + brw_mark_surface_used(&prog_data->base,
> > +   prog_data->base.binding_table.ubo_start +
> > +   shader_prog->NumUniformBlocks - 1);
> > +  }
> > +
> > +  /* Offset */
> > +  src_reg offset_reg = src_reg(this, glsl_type::uint_type);
> > +  unsigned const_offset_bytes = 0;
> > +  if (has_indirect) {
> > + emit(MOV(dst_reg(offset_reg), get_nir_src(instr->src[2], 1)));
> > +  } else {
> > + const_offset_bytes = instr->const_index[0];
> > + emit(MOV(dst_reg(offset_reg), src_reg(const_offset_bytes)));
> > +  }
> > +
> > +  /* Value */
> > +  src_reg val_reg = get_nir_src(instr->src[0], 4);
> > +
> > +  /* Writemask */
> > +  unsigned write_mask = instr->const_index[1];
> > +
> > +  /* IvyBridge does not have a native SIMD4x2 untyped write message so 
> > untyped
> > +   * writes will use SIMD8 mode. In order to hide this and keep 
> > symmetry across
> > +   * typed and untyped messages and across hardware platforms, the
> > +   * current implementation of the untyped messages will transparently 
> > convert
> > +   * the SIMD4x2 payload into an equivalent SIMD8 payload by 
> > transposing it
> > +   * and enabling only channel X on the SEND instruction.
> > +   *
> > +   * The above, works well for full vector writes, but not for partial 
> > writes
> > +   * where we want to write some channels and not others, like when we 
> > have
> > +   * code such as v.xyw = vec3(1,2,4). Because the untyped write 
> > messages are
> > +   * quite restrictive with regards to the channel enables we can 
> > configure in
> > +   * the message descriptor (not all combinations are allowed) we 
> > cannot simply
> > +   * implement these scenarios with a single message while keeping the
> > +   * aforementioned symmetry in the implementation. For now we de 
> > decided that
> > +   * it is better to keep the symmetry to reduce complexity, so in 
> > situations
> > +   * such as the one described we end up emitting two untyped write 
> > messages
> > +   * (one for xy and another for w).
> > +   *
> > +   * The code below packs consecutive channels into a single write 
> > message,
> > +   * detects gaps in the vector write and if needed, sends a second 
> > message
> > +   * with the remaining channels. If in the future we decide that we 
> > want to
> > +   * emit a single message at the expense of losing the symmetry in the
> > +   * implementation we can:
> > +   *
> > +   * 1) For IvyBridge: Only use the red channel of the untyped write 
> > SIMD8
> > +   *message payload. In this mode we can write up to 8 offsets and 
> > dwords
> > +   *to the red channel only (for the two vec4s in the SIMD4x2 
> > execution)
> > +   *and select which of the 8 channels carry data to write by 
> > setting the
> > +   *appropriate writemask in the dst register of the SEND 
> > instruction.
> > +   *It would require to

Re: [Mesa-dev] [PATCH v5 00/70] ARB_shader_storage_buffer_object (mesa, i965)

2015-09-21 Thread Iago Toral

Hi Kristian,

On Fri, 2015-09-18 at 16:56 -0700, Kristian Høgsberg wrote:
> On Thu, Sep 10, 2015 at 03:35:16PM +0200, Iago Toral Quiroga wrote:
> > Hi,
> > 
> > this is the latest version of the ARB_shader_storage_buffer_object
> > implementation. A good part of the frontend bits for this are already in
> > master, but this adds some more missing pieces, specifically std430 and
> > memory qualifiers. Additionally, this provides the i965 implementation.
> 
> I've gone through all patches in the series and I replied to patches
> where I had comments.  Overall, the series look good and with the
> comments addressed, I'm ready to give my Reviewed-by for the series.
> I want to take a closer look at the atomics lowering in patches 49+,
> but I'm done for today.  Base on the quick look-through I did, I don't
> expect to find any showstoppers there.

great, thanks for reviewing this! We will send new versions of the
patches for which you had comments or reply to your comments otherwise.

Iago

> Here's a summary of what I found:
> 
> [PATCH v5 01/70] mesa: set MAX_SHADER_STORAGE_BUFFERS to 15.
> 
>   Update limit to 16 and drop the comment
> 
> [PATCH v5 02/70] i965: Use 16-byte offset alignment for shader storage buffers
> 
>   ctx->Const.ShaderStorageBufferOffsetAlignment should be 64
> 
> [PATCH v5 23/70] glsl: refactor parser processing of an interface block 
> definition
> 
>   Clarify that the commit is just moving code.
> 
> [PATCH v5 26/70] glsl: Add parser/compiler support for std430 interface 
> packing qualifier
> 
>   Update the error to also mention shader storage blocks, not just ubos?
> 
> [PATCH v5 28/70] glsl: add std430 interface packing support to ssbo related 
> operations
> 
>   Why are we passing false for is_std430 here (emit_access in
>   handle_rvalue)?  We use handle_rvalue for both UBO and SSBO loads,
>   right?  Also, for consistency, I'd prefer if we could just pass
>   'packing' around instead of is_std430.
> 
> [PATCH v5 29/70] glsl: a shader storage buffer must be smaller than the 
> maximum size allowed
> 
>   Two chunks look like they should be their own patch ("Add unsized
>   array support to glsl_type::std140_size" or such).
> 
> [PATCH v5 38/70] i965/nir/vec4: Implement nir_intrinsic_store_ssbo
> 
>   Shouldn't this be 'skipped_channels += num_channels;' to handle write mask 
> reg.yw?
> 
> [PATCH v5 40/70] nir: Implement __intrinsic_load_ssbo
> 
>   Refactor handling of cmp instruction for converting to bool
> 
> [PATCH v5 54/70] glsl: First argument to atomic functions must be a buffer 
> variable
> 
>   Nitpick: move check that only looks at first element in list to after loop
> 
> Also, I expect that before we land this series (thought that shouldn't
> be far off), we'll have deleted the vec4 GLSL IR visitor so we can
> drop these patches (I didn't review them):
> 
> [PATCH v5 16/70] i965/vec4: Implement ir_unop_get_buffer_size
> [PATCH v5 39/70] i965/vec4: Implement __intrinsic_store_ssbo
> [PATCH v5 43/70] i965/vec4: Implement __intrinsic_load_ssbo
> [PATCH v5 53/70] i965/vec4: Implement lowered SSBO atomic intrinsics
> 
> I wrote the initial prototype of the SSBO functionality, but I don't
> recall writing:
> 
> [PATCH v5 45/70] glsl: atomic counters can be declared as buffer-qualified 
> variables
> 
> I don't think I did anything for atomics in my patch. Feel free to
> take ownership of that one and add my Reviewed-by.
> 
> thanks,
> Kristian
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v5 70/70] glsl: Mark as active all elements of shared/std140 block arrays

2015-09-21 Thread Samuel Iglesias Gonsálvez

On 21/09/15 09:41, Tapani Pälli wrote:
> Seems like a nice fix, takes ES3 CTS failures from 116 to 64 on Haswell
> (most failing tests are with ubos).
> 
> Tested-by: Tapani Pälli 
>

OK thanks!

> This is individual patch not related to just SSBOs, maybe this could be
> pushed separately from the rest?
> 

Yes, this patch can be pushed separately from the rest of patches of the
series: we just need an R-b, as usual.

We are going to discuss the proper solution with Timothy [0], as he
found that we are not covering one case.

Sam

[0] https://bugs.freedesktop.org/show_bug.cgi?id=83508


> // Tapani
> 
> On 09/10/2015 04:36 PM, Iago Toral Quiroga wrote:
>> From: Antia Puentes 
>>
>> Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' blocks
>> or block members) considered as active 'shared' and 'std140' uniform
>> blocks and uniform block arrays, but did not include the block array
>> elements. Because of that, it was possible to have an active uniform
>> block array without any elements marked as used, making the assertion
>> ((b->num_array_elements > 0) == b->type->is_array())
>> in link_uniform_blocks() fail.
>>
>> Fixes the following 5 dEQP tests:
>>
>>   * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18
>>   * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24
>>   *
>> dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19
>>   * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49
>>   * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36
>>
>> Fixes bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83508
>> ---
>>   src/glsl/link_uniform_block_active_visitor.cpp | 23
>> +++
>>   1 file changed, 23 insertions(+)
>>
>> diff --git a/src/glsl/link_uniform_block_active_visitor.cpp
>> b/src/glsl/link_uniform_block_active_visitor.cpp
>> index 5102947..fbe79de 100644
>> --- a/src/glsl/link_uniform_block_active_visitor.cpp
>> +++ b/src/glsl/link_uniform_block_active_visitor.cpp
>> @@ -106,6 +106,22 @@
>> link_uniform_block_active_visitor::visit(ir_variable *var)
>>  assert(b->num_array_elements == 0);
>>  assert(b->array_elements == NULL);
>>  assert(b->type != NULL);
>> +   assert(!b->type->is_array() || b->has_instance_name);
>> +
>> +   /* For uniform block arrays declared with a shared or std140 layout
>> +* qualifier, mark all its instances as used.
>> +*/
>> +   if (b->type->is_array() && b->type->length > 0) {
>> +  b->num_array_elements = b->type->length;
>> +  b->array_elements = reralloc(this->mem_ctx,
>> +   b->array_elements,
>> +   unsigned,
>> +   b->num_array_elements);
>> +
>> +  for (unsigned i = 0; i < b->num_array_elements; i++) {
>> + b->array_elements[i] = i;
>> +  }
>> +   }
>>
>>  return visit_continue;
>>   }
>> @@ -147,6 +163,13 @@
>> link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir)
>>  assert((b->num_array_elements == 0) == (b->array_elements == NULL));
>>  assert(b->type != NULL);
>>
>> +   /* If the block array was declared with a shared or
>> +* std140 layout qualifier, all its instances have been already
>> marked
>> +* as used in link_uniform_block_active_visitor::visit(ir_variable
>> *).
>> +*/
>> +   if (var->type->interface_packing != GLSL_INTERFACE_PACKING_PACKED)
>> +  return visit_continue_with_parent;
>> +
>>  ir_constant *c = ir->array_index->as_constant();
>>
>>  if (c) {
>>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl: First argument to atomic functions must be a buffer variable

2015-09-21 Thread Iago Toral Quiroga

v2:
  - Add ssbo_in the names of the static functions so it is clear that this
is specific to SSBO atomics.

v3:
  - Move the check after the loop (Kristian Høgsberg)
---
 src/glsl/ast_function.cpp | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index ff5ecb9..26d4c62 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -142,6 +142,31 @@ verify_image_parameter(YYLTYPE *loc, 
_mesa_glsl_parse_state *state,
return true;
 }
 
+static bool
+verify_first_atomic_ssbo_parameter(YYLTYPE *loc, _mesa_glsl_parse_state *state,
+   ir_variable *var)
+{
+   if (!var || !var->is_in_shader_storage_block()) {
+  _mesa_glsl_error(loc, state, "First argument to atomic function "
+   "must be a buffer variable");
+  return false;
+   }
+   return true;
+}
+
+static bool
+is_atomic_ssbo_function(const char *func_name)
+{
+   return !strcmp(func_name, "atomicAdd") ||
+  !strcmp(func_name, "atomicMin") ||
+  !strcmp(func_name, "atomicMax") ||
+  !strcmp(func_name, "atomicAnd") ||
+  !strcmp(func_name, "atomicOr") ||
+  !strcmp(func_name, "atomicXor") ||
+  !strcmp(func_name, "atomicExchange") ||
+  !strcmp(func_name, "atomicCompSwap");
+}
+
 /**
  * Verify that 'out' and 'inout' actual parameters are lvalues.  Also, verify
  * that 'const_in' formal parameters (an extension in our IR) correspond to
@@ -256,6 +281,23 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
   actual_ir_node  = actual_ir_node->next;
   actual_ast_node = actual_ast_node->next;
}
+
+   /* The first parameter of atomic functions must be a buffer variable */
+   const char *func_name = sig->function_name();
+   bool is_atomic_ssbo = is_atomic_ssbo_function(func_name);
+   if (is_atomic_ssbo) {
+  const ir_rvalue *const actual = (ir_rvalue *) actual_ir_parameters.head;
+
+  const ast_expression *const actual_ast =
+ exec_node_data(ast_expression, actual_ast_parameters.head, link);
+  YYLTYPE loc = actual_ast->get_location();
+
+  if (!verify_first_atomic_ssbo_parameter(&loc, state,
+  actual->variable_referenced())) {
+ return false;
+  }
+   }
+
return true;
 }
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 86265] Error while running gazebo in ROS Hydro

2015-09-21 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=86265

--- Comment #1 from Grigori Goronzy  ---
Both Mesa 9.1.7 and ROS Hydro are really old. Gazebo 5.0.1 with Mesa 10.5.9
works fine for me on Intel and Radeon (radeonsi).

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v5 18/70] i965/fs/nir: implement nir_intrinsic_get_buffer_size

2015-09-21 Thread Samuel Iglesias Gonsálvez



On 10/09/15 15:35, Iago Toral Quiroga wrote:
> From: Samuel Iglesias Gonsalvez 
> 
> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 25 +
>  1 file changed, 25 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index a6c6a2f..3c55a12 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1702,6 +1702,31 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
>emit_barrier();
>break;
>  
> +   case nir_intrinsic_get_buffer_size: {
> +  nir_const_value *const_uniform_block = 
> nir_src_as_const_value(instr->src[0]);
> +  unsigned ubo_index = const_uniform_block ? const_uniform_block->u[0] : 
> 0;
> +  int reg_width = dispatch_width / 8;
> +
> +  assert(shader->base.UniformBlocks[ubo_index].IsShaderStorage);
> +
> +  /* Set LOD = 0 */
> +  fs_reg source = fs_reg(0);
> +
> +  int mlen = 1 * reg_width;
> +  fs_reg src_payload = fs_reg(GRF, alloc.allocate(mlen),
> +  BRW_REGISTER_TYPE_UD);
> +  bld.LOAD_PAYLOAD(src_payload, &source, 1, 0);
> +
> +  fs_reg surf_index = fs_reg(prog_data->binding_table.ubo_start + 
> ubo_index);
> +  fs_inst *inst = bld.emit(FS_OPCODE_GET_BUFFER_SIZE, dest,
> +   src_payload, surf_index);
> +  inst->header_size = 0;
> +  inst->mlen = mlen;
> +  inst->regs_written = 4 * reg_width;

This is a mistake. The number of registers written by the instruction is
only one and resinfo returns the buffer size for SURFTYPE_BUFFER in x
component. For other surface types, it could write up to 4 components of
the same destination register.

I will fix it.

Sam

> +  bld.emit(inst);
> +  break;
> +   }
> +
> default:
>unreachable("unknown intrinsic");
> }
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/5] Enable up to 24 MRF registers in gen6

2015-09-21 Thread Mark Janes

This series hits an assertion on ILK and G45:

src/mesa/drivers/dri/i965/brw_eu_emit.c:150: brw_set_dest: Assertion
`dest.nr < (devinfo->gen == 6 ? 24 : 16)' failed.

It triggers about 8k piglit assertions on those platforms.  I'm turning
off testing for G45 and ILK until it is resolved.

https://bugs.freedesktop.org/show_bug.cgi?id=92066

-Mark

Iago Toral Quiroga  writes:

> There are some bug reports about shaders failing to compile in gen6
> because MRF 14 is used when we need to spill. For example:
> https://bugs.freedesktop.org/show_bug.cgi?id=86469
> https://bugs.freedesktop.org/show_bug.cgi?id=90631
>
> Discussion in bugzilla pointed to the fact that gen6 might actually have
> 24 MRF registers available instead of 16, so we could use other MRF
> registers and avoid these conflicts (we still need to investigate why
> some shaders need up to MRF 14 anyway, since this is not expected).
>
> Notice that the hardware docs are not clear about this fact:
>
> SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device
> Hardware" says "Number per Thread" - "24 registers"
>
> However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says:
>
> "Normal threads should construct their messages in m1..m15. (...)
> Regardless of actual hardware implementation, the thread should
> not assume th at MRF addresses above m15 wrap to legal MRF registers."
>
> Therefore experimentation was necessary to evaluate if we had these extra
> MRF registers available or not. This was tested in gen6 using MRF
> registers 21..23 for spilling and doing a full piglit run (all.py) forcing
> spilling of everything on the FS backend. It was also tested by doing
> spilling of everything on both the FS and the VS backends with a piglit run
> of shader.py. In both cases no regressions were observed. In fact, many of
> these tests where helped in the cases where we forced spilling, since that
> triggered the same underlying problem described in the bug reports. Here are
> some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on
> gen6 hardware:
>
> Using MRFs 13..15 for spilling:
> crash: 2, fail: 113, pass: 6621, skip: 5461
> 
> Using MRFs 21..23 for spilling:
> crash: 2, fail: 12, pass: 6722, skip: 5461
>
> We might want to test this further with other instances of gen6 hardware
> though... I am not sure that we can safely conclude that all implementations
> of gen6 hardware have 24 MRF registers from my tests on just one particular
> SandyBridge laptop.
>
> Iago Toral Quiroga (5):
>   i965: Move MRF register asserts out of brw_reg.h
>   i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
>   i965/fs: Use MRF registers 21-23 for spilling in gen6
>   i965/vec4: Use MRF registers 21-23 for spilling in gen6
>   i965: Maximum allowed size of SEND messages is 15 (4 bits)
>
>  src/mesa/drivers/dri/i965/brw_eu_emit.c| 11 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 15 
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 
> --
>  src/mesa/drivers/dri/i965/brw_inst.h   |  3 +++
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
>  src/mesa/drivers/dri/i965/brw_reg.h|  9 
>  .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   | 10 +---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 15 +++-
>  10 files changed, 61 insertions(+), 39 deletions(-)
>
> -- 
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: do not attempt to dump_shader if no shaderobj

2015-09-21 Thread Iago Toral

On Mon, 2015-09-21 at 09:15 +0300, Tapani Pälli wrote:
> Patch fixes a crash in conformance test that tries out different
> invalid arguments for glShaderSource and glGetShaderSource:
> 
>ES2-CTS.gtf.GL.glGetShaderSource.getshadersource_programhandle
> 
> This is a regression from commit:
>04e201d0c02cd30ace5c6fe80e9f021ebb733682
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/shaderapi.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
> index f31980b..7733d02 100644
> --- a/src/mesa/main/shaderapi.c
> +++ b/src/mesa/main/shaderapi.c
> @@ -1699,15 +1699,17 @@ _mesa_ShaderSource(GLhandleARB shaderObj, GLsizei 
> count,
>  #if defined(HAVE_SHA1)
> sh = _mesa_lookup_shader(ctx, shaderObj);

Why not call this earlier in that function (before we even process the
shader string) and return if it is NULL? There is no point in waiting
until this moment to check for that.

Then, when we call shader_source right below this code, we could just
pass the sh object directly instead of having that function call
_mesa_lookup_shader again (we could maybe replace that call with an
assert to make sure that we passed a valid shader object)

Iago

> -   /* Dump original shader source to MESA_SHADER_DUMP_PATH and replace
> -* if corresponding entry found from MESA_SHADER_READ_PATH.
> -*/
> -   dump_shader(sh->Stage, source);
> +   if (sh) {
> +  /* Dump original shader source to MESA_SHADER_DUMP_PATH and replace
> +   * if corresponding entry found from MESA_SHADER_READ_PATH.
> +   */
> +  dump_shader(sh->Stage, source);
>  
> -   replacement = read_shader(sh->Stage, source);
> -   if (replacement) {
> -  free(source);
> -  source = replacement;
> +  replacement = read_shader(sh->Stage, source);
> +  if (replacement) {
> + free(source);
> + source = replacement;
> +  }
> }
>  #endif /* HAVE_SHA1 */
>  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] r600g/sb: Support gs5 sampler indexing

2015-09-21 Thread Glenn Kennard

Signed-off-by: Glenn Kennard 
---
Just UBO support left before gs5 can be enabled.
Could improve how the two index registers are set/used to reduce
the number of clauses, but as is its about as good as what the blob
emits.

 src/gallium/drivers/r600/r600_shader.c   |  12 ++-
 src/gallium/drivers/r600/r600_shader.h   |   4 +-
 src/gallium/drivers/r600/sb/sb_bc.h  |  10 ++-
 src/gallium/drivers/r600/sb/sb_bc_dump.cpp   |  17 +++-
 src/gallium/drivers/r600/sb/sb_bc_parser.cpp |  50 +++-
 src/gallium/drivers/r600/sb/sb_gcm.cpp   |  11 ++-
 src/gallium/drivers/r600/sb/sb_sched.cpp | 118 +--
 src/gallium/drivers/r600/sb/sb_sched.h   |   5 +-
 8 files changed, 201 insertions(+), 26 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 1d90582..24c3d43 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -166,8 +166,8 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
 if (rctx->b.chip_class <= R700) {
use_sb &= (shader->shader.processor_type != 
TGSI_PROCESSOR_GEOMETRY);
 }
-   /* disable SB for shaders using CF_INDEX_0/1 (sampler/ubo array 
indexing) as it doesn't handle those currently */
-   use_sb &= !shader->shader.uses_index_registers;
+   /* disable SB for shaders using ubo array indexing as it doesn't handle 
those currently */
+   use_sb &= !shader->shader.uses_ubo_indexing;
/* disable SB for shaders using doubles */
use_sb &= !shader->shader.uses_doubles;
 
@@ -1251,7 +1251,7 @@ static int tgsi_split_constant(struct r600_shader_ctx 
*ctx)
}
 
if (ctx->src[i].kc_rel)
-   ctx->shader->uses_index_registers = true;
+   ctx->shader->uses_ubo_indexing = true;
 
if (ctx->src[i].rel) {
int chan = inst->Src[i].Indirect.Swizzle;
@@ -1912,7 +1912,7 @@ static int r600_shader_from_tgsi(struct r600_context 
*rctx,
 
shader->uses_doubles = ctx.info.uses_doubles;
 
-   indirect_gprs = ctx.info.indirect_files & ~(1 << TGSI_FILE_CONSTANT);
+   indirect_gprs = ctx.info.indirect_files & ~((1 << TGSI_FILE_CONSTANT) | 
(1 << TGSI_FILE_SAMPLER));
tgsi_parse_init(&ctx.parse, tokens);
ctx.type = ctx.info.processor;
shader->processor_type = ctx.type;
@@ -1936,7 +1936,7 @@ static int r600_shader_from_tgsi(struct r600_context 
*rctx,
ctx.gs_next_vertex = 0;
ctx.gs_stream_output_info = &so;
 
-   shader->uses_index_registers = false;
+   shader->uses_ubo_indexing = false;
ctx.face_gpr = -1;
ctx.fixed_pt_position_gpr = -1;
ctx.fragcoord_input = -1;
@@ -5703,8 +5703,6 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
sampler_src_reg = 3;
 
sampler_index_mode = inst->Src[sampler_src_reg].Indirect.Index == 2 ? 2 
: 0; // CF_INDEX_1 : CF_INDEX_NONE
-   if (sampler_index_mode)
-   ctx->shader->uses_index_registers = true;
 
src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
 
diff --git a/src/gallium/drivers/r600/r600_shader.h 
b/src/gallium/drivers/r600/r600_shader.h
index 48de9cd..8ba32ae 100644
--- a/src/gallium/drivers/r600/r600_shader.h
+++ b/src/gallium/drivers/r600/r600_shader.h
@@ -75,8 +75,8 @@ struct r600_shader {
boolean has_txq_cube_array_z_comp;
boolean uses_tex_buffers;
boolean gs_prim_id_input;
-   /* Temporarily workaround SB not handling CF_INDEX_[01] index registers 
*/
-   boolean uses_index_registers;
+   /* Temporarily workaround SB not handling ubo indexing */
+   boolean uses_ubo_indexing;
 
/* Size in bytes of a data item in the ring(s) (single vertex data).
   Stages with only one ring items 123 will be set to 0. */
diff --git a/src/gallium/drivers/r600/sb/sb_bc.h 
b/src/gallium/drivers/r600/sb/sb_bc.h
index ab988f8..126750d 100644
--- a/src/gallium/drivers/r600/sb/sb_bc.h
+++ b/src/gallium/drivers/r600/sb/sb_bc.h
@@ -48,6 +48,7 @@ class fetch_node;
 class alu_group_node;
 class region_node;
 class shader;
+class value;
 
 class sb_ostream {
 public:
@@ -818,13 +819,16 @@ class bc_parser {
 
bool gpr_reladdr;
 
+   // Note: currently relies on input emitting SET_CF in same basic block 
as uses
+   value *cf_index_value[2];
+   alu_node *mova;
 public:
 
bc_parser(sb_context &sctx, r600_bytecode *bc, r600_shader* pshader) :
ctx(sctx), dec(), bc(bc), pshader(pshader),
dw(), bc_ndw(), max_cf(),
sh(), error(), slots(), cgroup(),
-   cf_map(), loop_stack(), gpr_reladdr() { }
+   cf_map(), loop_stack(), gpr_reladdr(), cf_index_value(), mova() 
{ }
 
int decode();
int prepare();
@@ -852,6 +856,10 @@ private:
i

Re: [Mesa-dev] [PATCH 0/5] Enable up to 24 MRF registers in gen6

2015-09-21 Thread Kenneth Graunke

On Monday, September 21, 2015 09:46:24 AM Mark Janes wrote:
> This series hits an assertion on ILK and G45:
> 
> src/mesa/drivers/dri/i965/brw_eu_emit.c:150: brw_set_dest: Assertion
> `dest.nr < (devinfo->gen == 6 ? 24 : 16)' failed.
> 
> It triggers about 8k piglit assertions on those platforms.  I'm turning
> off testing for G45 and ILK until it is resolved.
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=92066
> 
> -Mark

I've pushed a fix for this:

commit c1070550c289d48ef389aeb8c564d1abd1123ad1
Author: Kenneth Graunke 
Date:   Mon Sep 21 07:42:27 2015 -0700

i965: Fix MRF register number assertions for compr4.

compr4 is represented by setting the high bit on the MRF number.
We need to mask it out before sanity checking the register number.

Fixes ~8000 assert fails on Ironlake and G45.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92066
Signed-off-by: Kenneth Graunke 

Easy mistake...I always forget about compr4.  Hopefully we should be
good now.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/5] Enable up to 24 MRF registers in gen6

2015-09-21 Thread Iago Toral

On Mon, 2015-09-21 at 07:49 -0700, Kenneth Graunke wrote:
> On Monday, September 21, 2015 09:46:24 AM Mark Janes wrote:
> > This series hits an assertion on ILK and G45:
> > 
> > src/mesa/drivers/dri/i965/brw_eu_emit.c:150: brw_set_dest: Assertion
> > `dest.nr < (devinfo->gen == 6 ? 24 : 16)' failed.
> > 
> > It triggers about 8k piglit assertions on those platforms.  I'm turning
> > off testing for G45 and ILK until it is resolved.
> > 
> > https://bugs.freedesktop.org/show_bug.cgi?id=92066
> > 
> > -Mark
> 
> I've pushed a fix for this:
> 
> commit c1070550c289d48ef389aeb8c564d1abd1123ad1
> Author: Kenneth Graunke 
> Date:   Mon Sep 21 07:42:27 2015 -0700
> 
> i965: Fix MRF register number assertions for compr4.
> 
> compr4 is represented by setting the high bit on the MRF number.
> We need to mask it out before sanity checking the register number.
> 
> Fixes ~8000 assert fails on Ironlake and G45.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92066
> Signed-off-by: Kenneth Graunke 
> 
> Easy mistake...I always forget about compr4.  Hopefully we should be
> good now.

that was fast, thanks Ken! I was scratching my head over this...

BTW, I just noticed that the ILK docs also say that they have 24 MRFs...
(volume 4, part 2, 5.3.3 MRF Registers). Assuming that we don't find any
other issues, would we want to extend the fix to ILK too?

Iago

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v2] nir: Move system value -> intrinsic mapping into nir.c

2015-09-21 Thread Jason Ekstrand

This way they're right next to the map going the other direction.
---
 src/glsl/nir/nir.c | 38 +++
 src/glsl/nir/nir.h |  1 +
 src/glsl/nir/nir_lower_system_values.c | 41 +-
 3 files changed, 40 insertions(+), 40 deletions(-)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 23071ef..1206bb4 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -1461,6 +1461,44 @@ nir_index_instrs(nir_function_impl *impl)
return index;
 }
 
+nir_intrinsic_op
+nir_intrinsic_from_system_value(gl_system_value val)
+{
+   switch (val) {
+   case SYSTEM_VALUE_VERTEX_ID:
+  return nir_intrinsic_load_vertex_id;
+   case SYSTEM_VALUE_INSTANCE_ID:
+  return nir_intrinsic_load_instance_id;
+   case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
+  return nir_intrinsic_load_vertex_id_zero_base;
+   case SYSTEM_VALUE_BASE_VERTEX:
+  return nir_intrinsic_load_base_vertex;
+   case SYSTEM_VALUE_INVOCATION_ID:
+  return nir_intrinsic_load_invocation_id;
+   case SYSTEM_VALUE_FRONT_FACE:
+  return nir_intrinsic_load_front_face;
+   case SYSTEM_VALUE_SAMPLE_ID:
+  return nir_intrinsic_load_sample_id;
+   case SYSTEM_VALUE_SAMPLE_POS:
+  return nir_intrinsic_load_sample_pos;
+   case SYSTEM_VALUE_SAMPLE_MASK_IN:
+  return nir_intrinsic_load_sample_mask_in;
+   case SYSTEM_VALUE_LOCAL_INVOCATION_ID:
+  return nir_intrinsic_load_local_invocation_id;
+   case SYSTEM_VALUE_WORK_GROUP_ID:
+  return nir_intrinsic_load_work_group_id;
+   /* FINISHME: Add tessellation intrinsics.
+   case SYSTEM_VALUE_TESS_COORD:
+   case SYSTEM_VALUE_VERTICES_IN:
+   case SYSTEM_VALUE_PRIMITIVE_ID:
+   case SYSTEM_VALUE_TESS_LEVEL_OUTER:
+   case SYSTEM_VALUE_TESS_LEVEL_INNER:
+*/
+   default:
+  unreachable("system value does not directly correspond to intrinsic");
+   }
+}
+
 gl_system_value
 nir_system_value_from_intrinsic(nir_intrinsic_op intrin)
 {
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 63b0b51..666a70f 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1924,6 +1924,7 @@ bool nir_opt_undef(nir_shader *shader);
 
 void nir_sweep(nir_shader *shader);
 
+nir_intrinsic_op nir_intrinsic_from_system_value(gl_system_value val);
 gl_system_value nir_system_value_from_intrinsic(nir_intrinsic_op intrin);
 
 #ifdef __cplusplus
diff --git a/src/glsl/nir/nir_lower_system_values.c 
b/src/glsl/nir/nir_lower_system_values.c
index a656b27..06ee3e6 100644
--- a/src/glsl/nir/nir_lower_system_values.c
+++ b/src/glsl/nir/nir_lower_system_values.c
@@ -40,46 +40,7 @@ convert_instr(nir_intrinsic_instr *instr)
 
void *mem_ctx = ralloc_parent(instr);
 
-   nir_intrinsic_op op;
-
-   switch (var->data.location) {
-   case SYSTEM_VALUE_FRONT_FACE:
-  op = nir_intrinsic_load_front_face;
-  break;
-   case SYSTEM_VALUE_VERTEX_ID:
-  op = nir_intrinsic_load_vertex_id;
-  break;
-   case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
-  op = nir_intrinsic_load_vertex_id_zero_base;
-  break;
-   case SYSTEM_VALUE_BASE_VERTEX:
-  op = nir_intrinsic_load_base_vertex;
-  break;
-   case SYSTEM_VALUE_INSTANCE_ID:
-  op = nir_intrinsic_load_instance_id;
-  break;
-   case SYSTEM_VALUE_SAMPLE_ID:
-  op = nir_intrinsic_load_sample_id;
-  break;
-   case SYSTEM_VALUE_SAMPLE_POS:
-  op = nir_intrinsic_load_sample_pos;
-  break;
-   case SYSTEM_VALUE_SAMPLE_MASK_IN:
-  op = nir_intrinsic_load_sample_mask_in;
-  break;
-   case SYSTEM_VALUE_INVOCATION_ID:
-  op = nir_intrinsic_load_invocation_id;
-  break;
-   case SYSTEM_VALUE_LOCAL_INVOCATION_ID:
-  op = nir_intrinsic_load_local_invocation_id;
-  break;
-   case SYSTEM_VALUE_WORK_GROUP_ID:
-  op = nir_intrinsic_load_work_group_id;
-  break;
-   default:
-  unreachable("not reached");
-   }
-
+   nir_intrinsic_op op = nir_intrinsic_from_system_value(var->data.location);
nir_intrinsic_instr *new_instr = nir_intrinsic_instr_create(mem_ctx, op);
 
if (instr->dest.is_ssa) {
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Don't coalesce registers in gen6 math ops if reswizzling needed

2015-09-21 Thread Kenneth Graunke

On Sunday, September 20, 2015 11:11:15 AM Matt Turner wrote:
> On Sun, Sep 20, 2015 at 8:48 AM, Antia Puentes  wrote:
> > Math operations in SandyBridge do not support source swizzling
> 
> I can't find any documentation to support this claim, but I remember
> that SNB math must be in align1 mode so it can't do swizzles or
> writemasking (see commit e14cc504).
> 
> But, the documentation actually says "The supported regioning modes
> for math instruction are align16, align1 with the following
> restrictions: ..."
> 
> Would be nice if we could actually find a PRM citation.

I can't find a PRM citation that mentions swizzles explicitly.  However,
the "Src Mod" box is unchecked, indicating that the MATH instruction
can't handle source modifiers.  vec4_visitor::fix_math_operand's
comments interpret that as abs, negate, /and/ swizzles.  Which sort of
makes sense.

In the internal docs, I see [reformatting mine]:

"[DevSNB], [DevIVB], [DevHSW]:
 The supported regioning mode for math instruction is align1 with
 the following restrictions:

 * Scalar source is supported.
 * Source and destination horizontal stride must be 1.
 * Width must be the same as execution size.
 * Source and destination offset must be the same, except the case
 * of scalar source.
 * The math instruction does not support indirect addressing modes"

(Those comments are also in the Haswell PRM.)

vs.

"[DevBDW]:
 The supported regioning mode for math instruction is align1 and
 align16.  The following restrictions apply for align1 mode:

 * Scalar source is supported.
 * Source and destination horizontal stride must be the same.
 * Regioning must ensure Src.Vstride = Src.Width * Src.Hstride
 * Source and destination offset must be the same, except the case of
   scalar source."

Unfortunately, I couldn't find the above "it must be align1" comments
in any of the PRMs until Haswell.  I assume it just got lost.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] nir: Move system value -> intrinsic mapping into nir.c

2015-09-21 Thread Kenneth Graunke

On Monday, September 21, 2015 08:14:24 AM Jason Ekstrand wrote:
> This way they're right next to the map going the other direction.
> ---
>  src/glsl/nir/nir.c | 38 +++
>  src/glsl/nir/nir.h |  1 +
>  src/glsl/nir/nir_lower_system_values.c | 41 
> +-
>  3 files changed, 40 insertions(+), 40 deletions(-)

Reviewed-by: Kenneth Graunke 

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir/builder: Don't use designated initializers

2015-09-21 Thread Jason Ekstrand

Designated initializers are not allowed in C++ (not even C++11).  Since
nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in
C++, this breaks the build on some compilers.  Aparently, GCC 5 allows it
in some limited extent because mesa still builds on my system without this
patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052
---
 src/glsl/nir/nir_builder.h | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/glsl/nir/nir_builder.h b/src/glsl/nir/nir_builder.h
index 8db5fcf..8a43130 100644
--- a/src/glsl/nir/nir_builder.h
+++ b/src/glsl/nir/nir_builder.h
@@ -76,21 +76,36 @@ nir_build_imm(nir_builder *build, unsigned num_components, 
nir_const_value value
 static inline nir_ssa_def *
 nir_imm_float(nir_builder *build, float x)
 {
-   nir_const_value v = { { .f = {x, 0, 0, 0} } };
+   nir_const_value v;
+
+   memset(&v, 0, sizeof(&v));
+   v.f[0] = x;
+
return nir_build_imm(build, 1, v);
 }
 
 static inline nir_ssa_def *
 nir_imm_vec4(nir_builder *build, float x, float y, float z, float w)
 {
-   nir_const_value v = { { .f = {x, y, z, w} } };
+   nir_const_value v;
+
+   memset(&v, 0, sizeof(&v));
+   v.f[0] = x;
+   v.f[1] = y;
+   v.f[2] = z;
+   v.f[3] = w;
+
return nir_build_imm(build, 4, v);
 }
 
 static inline nir_ssa_def *
 nir_imm_int(nir_builder *build, int x)
 {
-   nir_const_value v = { { .i = {x, 0, 0, 0} } };
+   nir_const_value v;
+
+   memset(&v, 0, sizeof(&v));
+   v.i[0] = x;
+
return nir_build_imm(build, 1, v);
 }
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 92052] nir/nir_builder.h:79: error: expected primary-expression before ‘.’ token

2015-09-21 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=92052

Jason Ekstrand  changed:

   What|Removed |Added

 Status|NEW |NEEDINFO

--- Comment #2 from Jason Ekstrand  ---
What GCC version are you using?  Mesa still builds fine on my system with GCC
5.1.1 so maybe GCC 5 allows designated initializers in C++?  In any case, I've
sent a patch that should fix the errors listed:

http://patchwork.freedesktop.org/patch/60023/

Let me know if it still won't build.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 12:14 AM, Eduardo Lima Mitev  wrote:
> On 09/19/2015 09:00 PM, Jason Ekstrand wrote:
>>
>> On Sep 18, 2015 2:49 AM, "Eduardo Lima Mitev" > > wrote:
>>>
>>> When both fadd and fmul instructions have at least one operand that is a
>>> constant and it is only used once, the total number of instructions can
>>> be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because
>>> the constants will be progagated as immediate operands of fmul and fadd.
>>
>> Right... I'm pretty sure that I've written this patch once before.  At
>> the time we didn't do this because we thought Matt's constant combining
>> would take care of most of those issues and we could just emit the MAD.
>> However, that pass doesn't always combine things optimally and doesn't
>> exist for vec4.
>>
>>> This patch modifies opt_peephole_ffma pass to detect this situation and
>>> bails-out fusing fmul+fadd into ffma.
>>>
>>> As shown in shader-db results below, it seems to help a good bunch.
>> However,
>>> there are some caveats:
>>>
>>> * It seems i965 specific, so I'm not sure if modifying the NIR pass
>>> directly is desired, as opposed to moving this to the backend.
>>>
>>> * There are still a high number of HURTs, but these could be reduced
>> by being
>>> more specific in the conditions to bailout.
>>>
>>> total instructions in shared programs: 1683959 -> 1677447 (-0.39%)
>>> instructions in affected programs: 604918 -> 598406 (-1.08%)
>>> helped:4633
>>> HURT:  804
>>> GAINED:0
>>> LOST:  0
>>
>> What does that look like if you split that between vec4 and fs?
>>
>
> The above is shader-db results filtered for VS only. Following is the
> same results filtered for FS:
>
> total instructions in shared programs: 4593761 -> 4590252 (-0.08%)
> instructions in affected programs: 538690 -> 535181 (-0.65%)
> helped:2981
> HURT:  38
> GAINED:4
> LOST:  0
>
> And this is the raw, unfiltered report:
>
> total instructions in shared programs: 6277603 -> 6267582 (-0.16%)
> instructions in affected programs: 1143656 -> 1133635 (-0.88%)
> helped:7614
> HURT:  842
> GAINED:4
> LOST:  0
>
> For FS, the gain is more noticeable because vector instructions tend to
> neutralize the benefits (constants cannot be propagated to
> multi-component operands).

In that case, it seems like this might be a good idea for the FS as
well.  I was always a bit skeptical that constant combining would just
fix it entirely.  That said, I would like Matt's take before I tell
you to push anything.

If you want to make it less i965-specific, you could always add an
"allow constants" boolean or someting.  That said, i965 is the only
place that the ffma peephole is currently used, so we can let the next
user of it fix that. :-)
--Jason

> Eduardo
>
>>> ---
>>>  src/glsl/nir/nir_opt_peephole_ffma.c | 31 +++
>>>  1 file changed, 31 insertions(+)
>>>
>>> diff --git a/src/glsl/nir/nir_opt_peephole_ffma.c
>> b/src/glsl/nir/nir_opt_peephole_ffma.c
>>> index 4f0f0da..da47f8f 100644
>>> --- a/src/glsl/nir/nir_opt_peephole_ffma.c
>>> +++ b/src/glsl/nir/nir_opt_peephole_ffma.c
>>> @@ -133,6 +133,28 @@ get_mul_for_src(nir_alu_src *src, int num_components,
>>> return alu;
>>>  }
>>>
>>> +/**
>>> + * Given a list of (at least two) nir_alu_src's, tells if any of them
>> is a
>>> + * constant value and is used only once.
>>> + */
>>> +static bool
>>> +any_alu_src_is_a_constant(nir_alu_src srcs[])
>>> +{
>>> +   for (unsigned i = 0; i < 2; i++) {
>>> +  if (srcs[i].src.ssa->parent_instr->type ==
>> nir_instr_type_load_const) {
>>> + nir_load_const_instr *load_const =
>>> +nir_instr_as_load_const (srcs[i].src.ssa->parent_instr);
>>> +
>>> + if (list_length(&load_const->def.uses) == 1 &&
>>> + list_length(&load_const->def.if_uses) == 0) {
>>> +return true;
>>> + }
>>> +  }
>>> +   }
>>> +
>>> +   return false;
>>> +}
>>> +
>>>  static bool
>>>  nir_opt_peephole_ffma_block(nir_block *block, void *void_state)
>>>  {
>>> @@ -183,6 +205,15 @@ nir_opt_peephole_ffma_block(nir_block *block,
>> void *void_state)
>>>mul_src[0] = mul->src[0].src.ssa;
>>>mul_src[1] = mul->src[1].src.ssa;
>>>
>>> +  /* If any of the operands of the fmul and any of the fadd is a
>> constant,
>>> +   * we bypass because it will be more efficient as the constants
>> will be
>>> +   * propagated as operands, potentially saving two load_const
>> instructions.
>>> +   */
>>> +  if (any_alu_src_is_a_constant(mul->src) &&
>>> +  any_alu_src_is_a_c

Re: [Mesa-dev] mesa-10.6: LLVM/Clang v3.7 fixes (gallivm)

2015-09-21 Thread Sedat Dilek

On Fri, Sep 18, 2015 at 11:34 AM, Emil Velikov  wrote:
> On 14 September 2015 at 10:36, Sedat Dilek  wrote:
>> On Thu, Sep 10, 2015 at 4:18 PM, Emil Velikov  
>> wrote:
>>> On 4 September 2015 at 19:57, Sedat Dilek  wrote:
 Hi,

 I compiled a toolchain based on LLVM/Clang v3.7.0 today and tested it
 with an updated version of my Linux Graphics driver stack (see
 attached logs).

 Here on Ubuntu/precise AMD64 I required two patches (see attached 0001
 and 0002) post-mesa-10.6.5+ to build it successfully.
 Feel free to cherry-pick them to 10.6 Git branch.

>>> Ftr the attached patches are master commits
>>> 147ffd48166d851341cadd12de98895f32ec25a2 gallivm: Do not use
>>> NoFramePointerElim with LLVM 3.7.
>>> 09d6243aed016eed4518435c9885275dbb6d2aa9 gallivm: Workaround LLVM PR23628.
>>>
>>> Both of which seems like trivial and non-evasive changes imho, so
>>> unless there are any objections I'll scoop them.
>>> Iirc on the radeon side, more work is needed. So picking these won't
>>> imply that building mesa 10.6.x with llvm 3.7 is officially supported
>>> and/or tested. But a bit of "try it, if you're crazy enough" :-)
>>>
>>> Thanks Sedat !
>>>
>>
>> Yes, both patches where cherry-picked from Git master for mesa-10.6.
>>
>> What does your statement means in the end...
>> Will those 2 patches go into mesa-10.6 (be included in mesa v10.6.8)?
>>
> It means two things
>  - they will (are) picked, considering the lack of objections.
>  - building with/against llvm 3.7 is entirely at your own risk and not
> officially supported :-)
>
> I can see how my earlier reply was a bit confusing, hopefully thing
> one is a tad clearer.
>

Thanks, both patches landed in mesa v10.6.8.

- Sedat -
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/3] nir: rename nir_lower_samplers.c{pp,}

2015-09-21 Thread Emil Velikov

On 19 September 2015 at 14:53, Jason Ekstrand  wrote:
> On Sat, Sep 19, 2015 at 1:28 AM, Jonathan Gray  wrote:
>> On Fri, Sep 18, 2015 at 10:52:35AM -0700, Jason Ekstrand wrote:
>>> On Sep 18, 2015 12:40 PM, "Emil Velikov"  wrote:
>>> >
>>> > Thanks for the review/testing guys.
>>> >
>>> > Just noticed that some typos/strange wording
>>> >
>>> > On 17 September 2015 at 16:25, Emil Velikov 
>>> wrote:
>>> > > With the only C++ function having it's own wrapper we can 'demote' this
>>> > s/it's/its/
>>> >
>>> > > file to a normal C one. This allows us to get rid of extern C {
>>> > > support C99 designated initializers in CPP code.
>>> > >
>>> > This allows us to get rid of extern C { #include  } 'hacks'.
>>> > Plus some of the headers may use C99 initializers, which are not
>>> > supported by the ISO standard.
>>>
>>> What headers are those?  All NIR headers should be C++ safe.
>>
>> As mentioned in another thread already, nir_builder.h
>>
>> In file included from nir/nir_lower_samplers.cpp:27:
>> nir/nir_builder.h: In function 'nir_ssa_def* nir_imm_float(nir_builder*, 
>> float)':
>> nir/nir_builder.h:79: error: expected primary-expression before '.' token
>> nir/nir_builder.h:79: error: expected primary-expression before '{' token
>> nir/nir_builder.h:79: error: expected `}' before '{' token
>> nir/nir_builder.h:79: error: expected `}' before '{' token
>> nir/nir_builder.h:79: warning: missing braces around initializer for 'float 
>> [4]'
>> nir/nir_builder.h:79: error: expected ',' or ';' before '{' token
>> nir/nir_builder.h:79: warning: unused variable 'v'
>> nir/nir_builder.h:79: warning: no return statement in function returning 
>> non-void
>> nir/nir_builder.h: At global scope:
>> nir/nir_builder.h:79: error: expected declaration before '}' token
>> Makefile:1447: recipe for target 'nir/nir_lower_samplers.lo' failed
>
> Thanks.  We need to fix nir_builder.
As mentioned previously, there is already a fix for that. I'll leave
the decision up-to you whether to pick it or not :)
http://lists.freedesktop.org/archives/mesa-dev/2015-September/094706.html

>  If anything, I'd rather leave it
> as a .cpp for now for the express purpose of checking nir_builder.
If we really want a test shouldn't we wire up a simple one tp make check ?
Something like an empty .cpp file that includes all(?) the nir
headers, unlike nir_lower_samplers.cpp which only has a few.

Cheers,
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/5] Enable up to 24 MRF registers in gen6

2015-09-21 Thread Matt Turner

On Mon, Sep 21, 2015 at 8:00 AM, Iago Toral  wrote:
> On Mon, 2015-09-21 at 07:49 -0700, Kenneth Graunke wrote:
>> On Monday, September 21, 2015 09:46:24 AM Mark Janes wrote:
>> > This series hits an assertion on ILK and G45:
>> >
>> > src/mesa/drivers/dri/i965/brw_eu_emit.c:150: brw_set_dest: Assertion
>> > `dest.nr < (devinfo->gen == 6 ? 24 : 16)' failed.
>> >
>> > It triggers about 8k piglit assertions on those platforms.  I'm turning
>> > off testing for G45 and ILK until it is resolved.
>> >
>> > https://bugs.freedesktop.org/show_bug.cgi?id=92066
>> >
>> > -Mark
>>
>> I've pushed a fix for this:
>>
>> commit c1070550c289d48ef389aeb8c564d1abd1123ad1
>> Author: Kenneth Graunke 
>> Date:   Mon Sep 21 07:42:27 2015 -0700
>>
>> i965: Fix MRF register number assertions for compr4.
>>
>> compr4 is represented by setting the high bit on the MRF number.
>> We need to mask it out before sanity checking the register number.
>>
>> Fixes ~8000 assert fails on Ironlake and G45.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92066
>> Signed-off-by: Kenneth Graunke 
>>
>> Easy mistake...I always forget about compr4.  Hopefully we should be
>> good now.
>
> that was fast, thanks Ken! I was scratching my head over this...
>
> BTW, I just noticed that the ILK docs also say that they have 24 MRFs...
> (volume 4, part 2, 5.3.3 MRF Registers). Assuming that we don't find any
> other issues, would we want to extend the fix to ILK too?

The ILK docs are notoriously bad and often contain more information
about Sandybridge than Ironlake. I suspect that information is
actually about SNB, though I suppose it couldn't hurt to try on ILK,
though I'm doubtful.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/builder: Don't use designated initializers

2015-09-21 Thread Kenneth Graunke

On Monday, September 21, 2015 08:26:04 AM Jason Ekstrand wrote:
> Designated initializers are not allowed in C++ (not even C++11).  Since
> nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in
> C++, this breaks the build on some compilers.  Aparently, GCC 5 allows it
> in some limited extent because mesa still builds on my system without this
> patch.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052
> ---
>  src/glsl/nir/nir_builder.h | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)

I'd rather just take Emil's patch to convert nir_lower_samplers out of
C++, honestly.  NIR is in C99, let's keep it that way.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/builder: Don't use designated initializers

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 9:22 AM, Kenneth Graunke  wrote:
> On Monday, September 21, 2015 08:26:04 AM Jason Ekstrand wrote:
>> Designated initializers are not allowed in C++ (not even C++11).  Since
>> nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in
>> C++, this breaks the build on some compilers.  Aparently, GCC 5 allows it
>> in some limited extent because mesa still builds on my system without this
>> patch.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052
>> ---
>>  src/glsl/nir/nir_builder.h | 21 ++---
>>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> I'd rather just take Emil's patch to convert nir_lower_samplers out of
> C++, honestly.  NIR is in C99, let's keep it that way.

But I want to be able to start using the builder in glsl_to_nir...
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/builder: Don't use designated initializers

2015-09-21 Thread Matt Turner

On Mon, Sep 21, 2015 at 9:22 AM, Kenneth Graunke  wrote:
> On Monday, September 21, 2015 08:26:04 AM Jason Ekstrand wrote:
>> Designated initializers are not allowed in C++ (not even C++11).  Since
>> nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in
>> C++, this breaks the build on some compilers.  Aparently, GCC 5 allows it
>> in some limited extent because mesa still builds on my system without this
>> patch.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052
>> ---
>>  src/glsl/nir/nir_builder.h | 21 ++---
>>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> I'd rather just take Emil's patch to convert nir_lower_samplers out of
> C++, honestly.  NIR is in C99, let's keep it that way.

I think Jason's argument was that we'd have to make nir_builder
compatible with C++ if we plan to use it with glsl_to_nir, but I agree
that we should convert nir_lower_samplers to C regardless.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/builder: Don't use designated initializers

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 9:24 AM, Matt Turner  wrote:
> On Mon, Sep 21, 2015 at 9:22 AM, Kenneth Graunke  
> wrote:
>> On Monday, September 21, 2015 08:26:04 AM Jason Ekstrand wrote:
>>> Designated initializers are not allowed in C++ (not even C++11).  Since
>>> nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in
>>> C++, this breaks the build on some compilers.  Aparently, GCC 5 allows it
>>> in some limited extent because mesa still builds on my system without this
>>> patch.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052
>>> ---
>>>  src/glsl/nir/nir_builder.h | 21 ++---
>>>  1 file changed, 18 insertions(+), 3 deletions(-)
>>
>> I'd rather just take Emil's patch to convert nir_lower_samplers out of
>> C++, honestly.  NIR is in C99, let's keep it that way.
>
> I think Jason's argument was that we'd have to make nir_builder
> compatible with C++ if we plan to use it with glsl_to_nir, but I agree
> that we should convert nir_lower_samplers to C regardless.

Yes, I'd like to see lower_samplers written in C.  However, at the
moment, it's making sure that all of the NIR goodness works in C++.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4] i965/vec4: Change types as needed to propagate source modifiers using from instruction

2015-09-21 Thread Jason Ekstrand

On Sun, Sep 20, 2015 at 3:00 PM, Alejandro Piñeiro  wrote:
> MOV instructions, as long as they don't have source modifiers, are
> just copying bits around.  So those kind of instruction could be
> propagated even if there are type mismatches. This is needed because
> NIR generates integer MOV instructions whenever it doesn't know what
> else to generate.
>
> This commit adds support for copy propagation using previous (from)
> instruction as reference.
>
> Shader-db results for vec4 programs on Haswell:
> total instructions in shared programs: 1683959 -> 1669037 (-0.89%)
> instructions in affected programs: 746238 -> 731316 (-2.00%)
> helped:6264
> HURT:  30
>
> v2: using 'arg' index to get the from inst was wrong, as pointed
> by Jason Ekstrand
> v3: rebased against last change on the previous patch of the series
> v4: don't need to track instructions on struct copy_entry, as we
> only set the source on a direct copy, as pointed by Jason
> ---
>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 36 
> --
>  1 file changed, 26 insertions(+), 10 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> index 1522eea..193c7d0 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> @@ -271,6 +271,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>  * vectors better.
>  */
> src_reg value = *entry->value[0];
> +   bool from_inst_has_source_modifiers = value.negate || value.abs;
> +
> for (int i = 1; i < 4; i++) {
>/* This is equals() except we don't care about the swizzle. */
>if (value.file != entry->value[i]->file ||
> @@ -311,6 +313,10 @@ try_copy_propagate(const struct brw_device_info *devinfo,
> if (inst->src[arg].negate)
>value.negate = !value.negate;
>
> +   /* has_source_modifiers is using the modifiers at the current instruction,
> +* while from_inst_source_modifiers is using the modifiers of the source
> +* at the instruction the source comes from
> +*/
> bool has_source_modifiers = value.negate || value.abs;
>
> /* gen6 math and gen7+ SENDs from GRFs ignore source modifiers on
> @@ -322,7 +328,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>
> if (has_source_modifiers &&
> value.type != inst->src[arg].type &&
> -   !can_change_source_types(inst))
> +   !can_change_source_types(inst) &&
> +   from_inst_has_source_modifiers)

I don't think this quite the condition we want.  Really, what we want is

if (from_inst_has_source_modifiers &&
value.type != inst->src[arg].type &&
!can_change_source_types(inst))

Whether or not the final thing will have source modifiers is
irrelevant.  The only thing that matters is that we have some that
need propagating (from_inst_has_source_modifiers) but we cant (other
two conditions).

>return false;
>
> if (has_source_modifiers &&
> @@ -340,7 +347,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>  * instead. See also resolve_ud_negate().
>  */
> if (value.negate &&
> -   value.type == BRW_REGISTER_TYPE_UD)
> +   value.type == BRW_REGISTER_TYPE_UD &&
> +   from_inst_has_source_modifiers)

Why is this here?  I think this check should just be moved up to
before we modify value.  Maybe as its own patch.

>return false;
>
> /* Don't report progress if this is a noop. */
> @@ -378,17 +386,25 @@ try_copy_propagate(const struct brw_device_info 
> *devinfo,
>
> if (has_source_modifiers &&

from_inst_has_source_modifiers

Sorry to make this go through yet another re-spin, but having looked
through vec4 copy propagation yet again and finally understanding what
it's doing, I think we need to refactor it a bit.  In the FS copy
propagation pass, we wait until the end to build the "final value" and
we should probably do that in vec4 as well.  In particular, it should
look something like this in the end:

1) Build up the value we are propagating as if it were the source of a
single MOV:
  - The first for loop to ensure that all values touch the same register
  - The loop to figure out what the value swizzle looks like but
without composing it with inst->src[arg].swizzle

2) Check that we can propagate that value
 - Check the register file
 - Check for the can_change_source_mods condition
 - etc...

(Really, all of the checks can be done without the final value.  Some
of them make a little more sense with the final value such as the
can_do_source_mods() check, but if you can't do source modifiers you
can assume that there aren't any there already so has_source_modifiers
is the same as from_inst_has_source_modifiers.)

3) Build the final value;
 - Compose swizzlese
 - Compute final source modifiers

Whether

Re: [Mesa-dev] [Mesa-users] glClearColor is broken in a weird way if compiled with mangling enabled

2015-09-21 Thread Yuzhu Lu

With this patch: https://bugs.freedesktop.org/show_bug.cgi?id=91724, osmesa 
builds fine, but OpenGL2 shader related functions like glCreateShader does not 
work (It always return 0). Pre 2 fixed pipeline functions all work fine. Does 
anyone have any clue about this? 

Yuzhu 

- On Aug 24, 2015, at 10:21 AM, Emil Velikov  
wrote: 

> On 24 August 2015 at 16:10, Brian Paul  wrote:
> > Sounds like the file is being compiled without having proper prototypes for
> > all the mgl functions. Do you see any compiler warnings when you build?
> > What gcc compiler options are you using?

> Ftr I haven't been able to build mangled GL for as far as I started
> building mesa (2+ years).
> All cases, that I recall, ended up as Yuzhu reported - undefined
> reference to "glFoo"

> Would be great to have it fixed (patches welcome), but I'd like to
> remove the shared-glapi option first ;-)

> Cheers,
> Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v4] i965/vec4: Change types as needed to propagate source modifiers using from instruction

2015-09-21 Thread Alejandro Piñeiro

On 21/09/15 18:27, Jason Ekstrand wrote:
> On Sun, Sep 20, 2015 at 3:00 PM, Alejandro Piñeiro  
> wrote:
>> MOV instructions, as long as they don't have source modifiers, are
>> just copying bits around.  So those kind of instruction could be
>> propagated even if there are type mismatches. This is needed because
>> NIR generates integer MOV instructions whenever it doesn't know what
>> else to generate.
>>
>> This commit adds support for copy propagation using previous (from)
>> instruction as reference.
>>
>> Shader-db results for vec4 programs on Haswell:
>> total instructions in shared programs: 1683959 -> 1669037 (-0.89%)
>> instructions in affected programs: 746238 -> 731316 (-2.00%)
>> helped:6264
>> HURT:  30
>>
>> v2: using 'arg' index to get the from inst was wrong, as pointed
>> by Jason Ekstrand
>> v3: rebased against last change on the previous patch of the series
>> v4: don't need to track instructions on struct copy_entry, as we
>> only set the source on a direct copy, as pointed by Jason
>> ---
>>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 36 
>> --
>>  1 file changed, 26 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> index 1522eea..193c7d0 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
>> @@ -271,6 +271,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>>  * vectors better.
>>  */
>> src_reg value = *entry->value[0];
>> +   bool from_inst_has_source_modifiers = value.negate || value.abs;
>> +
>> for (int i = 1; i < 4; i++) {
>>/* This is equals() except we don't care about the swizzle. */
>>if (value.file != entry->value[i]->file ||
>> @@ -311,6 +313,10 @@ try_copy_propagate(const struct brw_device_info 
>> *devinfo,
>> if (inst->src[arg].negate)
>>value.negate = !value.negate;
>>
>> +   /* has_source_modifiers is using the modifiers at the current 
>> instruction,
>> +* while from_inst_source_modifiers is using the modifiers of the source
>> +* at the instruction the source comes from
>> +*/
>> bool has_source_modifiers = value.negate || value.abs;
>>
>> /* gen6 math and gen7+ SENDs from GRFs ignore source modifiers on
>> @@ -322,7 +328,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>>
>> if (has_source_modifiers &&
>> value.type != inst->src[arg].type &&
>> -   !can_change_source_types(inst))
>> +   !can_change_source_types(inst) &&
>> +   from_inst_has_source_modifiers)
> I don't think this quite the condition we want.  Really, what we want is
>
> if (from_inst_has_source_modifiers &&
> value.type != inst->src[arg].type &&
> !can_change_source_types(inst))
>
> Whether or not the final thing will have source modifiers is
> irrelevant.  The only thing that matters is that we have some that
> need propagating (from_inst_has_source_modifiers) but we cant (other
> two conditions).

Yes, true, you can simplify the condition that way. Make things clearer.

>
>>return false;
>>
>> if (has_source_modifiers &&
>> @@ -340,7 +347,8 @@ try_copy_propagate(const struct brw_device_info *devinfo,
>>  * instead. See also resolve_ud_negate().
>>  */
>> if (value.negate &&
>> -   value.type == BRW_REGISTER_TYPE_UD)
>> +   value.type == BRW_REGISTER_TYPE_UD &&
>> +   from_inst_has_source_modifiers)
> Why is this here?  I think this check should just be moved up to
> before we modify value.  Maybe as its own patch.

Because it is really usual that the default types are UD. So on an
assembly code like this:
   0: mov vgrf1.0:UD, u0.xyzw:UD
   1: add m3:F, attr0.xyzw:F, -vgrf1.xyzw:F

if we don't modify that condition to take into account
from_inst_has_source_modifiers, it makes to bail out the previous scenario.


>>return false;
>>
>> /* Don't report progress if this is a noop. */
>> @@ -378,17 +386,25 @@ try_copy_propagate(const struct brw_device_info 
>> *devinfo,
>>
>> if (has_source_modifiers &&
> from_inst_has_source_modifiers
>
> Sorry to make this go through yet another re-spin, but having looked
> through vec4 copy propagation yet again and finally understanding what
> it's doing, I think we need to refactor it a bit.  In the FS copy
> propagation pass, we wait until the end to build the "final value" and
> we should probably do that in vec4 as well. 

Not sure if it is fair the comparison with FS right now. For the
equivalent shader, FS have the correct types on those MOVs, that was how
the the original patch (this [1]) tried to solve the situation. This
patch is trying to solve this issue on copy-propagation, something FS
doesn't need to.

>  In particular, it should
> look something like this in the end:
>
>

Re: [Mesa-dev] [PATCH] configure.ac: Add support to enable read-only text segment on x86.

2015-09-21 Thread Matt Turner

On Thu, Sep 3, 2015 at 9:36 PM, Matt Turner  wrote:
> From: Jeremy Huddleston 
>
> Cc: "10.6 11.0" 
> Bugzilla: https://bugs.gentoo.org/240956
> ---
> After talking with Ian today, we determined that this disables an
> optimization.
>
> And FWIW, NVIDIA's fork of glapi (libglvnd) contains this kind of
> writable-text optimization for x86-64.
>
>  configure.ac | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/configure.ac b/configure.ac
> index 90ba4fe..259f770 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1289,6 +1289,16 @@ AC_SUBST(GLX_TLS, ${GLX_USE_TLS})
>  AS_IF([test "x$GLX_USE_TLS" = xyes -a "x$ax_pthread_ok" = xyes],
>[DEFINES="${DEFINES} -DGLX_USE_TLS"])
>
> +dnl Read-only text section on x86 hardened platforms
> +AC_ARG_ENABLE([glx-read-only-text],
> +[AS_HELP_STRING([--enable-glx-read-only-text],
> +[Disable writable .text section on x86 (decreases performance) 
> @<:@default=disabled@:>@])],
> +[enable_glx_read_only_text="$enableval"],
> +[enable_glx_read_only_text=no])
> +if test "x$enable_glx_read_only_text" = xyes; then
> +DEFINES="$DEFINES -DGLX_X86_READONLY_TEXT"
> +fi
> +
>  dnl
>  dnl More DRI setup
>  dnl
> --
> 2.4.6
>

Planning to push this today.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Revert "mesa/extensions: restrict GL_OES_EGL_image to GLES"

2015-09-21 Thread Ian Romanick

On 09/16/2015 05:00 PM, Dave Airlie wrote:
> This reverts commit 48961fa3ba37999a6f8fd812458b735e39604a95.
> 
> glamor/Xwayland use this, the spec saying something when it
> was written, and the fact that the comment says Mesa relies on it
> hasn't changed.
> 
> I also don't have a copy of this patch in my mail archive, which
> seems wierd, did it get posted to mesa-dev?
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/mesa/main/extensions.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> index 767c50e..b2c88c3 100644
> --- a/src/mesa/main/extensions.c
> +++ b/src/mesa/main/extensions.c
> @@ -307,7 +307,8 @@ static const struct extension extension_table[] = {
> { "GL_OES_depth_texture_cube_map",  
> o(OES_depth_texture_cube_map), ES2, 2012 },
> { "GL_OES_draw_texture",o(OES_draw_texture),  
>ES1,   2004 },
> { "GL_OES_EGL_sync",o(dummy_true),
>ES1 | ES2, 2010 },
> -   { "GL_OES_EGL_image",   o(OES_EGL_image), 
>ES1 | ES2, 2006 },
> +   /*  FIXME: Mesa expects GL_OES_EGL_image to be available in OpenGL 
> contexts. */
> +   { "GL_OES_EGL_image",   o(OES_EGL_image), 
>   GL | ES1 | ES2, 2006 },

I saw this patch, and I thought there was a reason it was necessary.  I
did a bit of GIT history searching, and I couldn't find anything... so I
didn't object.   To prevent this happening again, we should change the
comment from a FIXME to actually document the external requirement.

> { "GL_OES_EGL_image_external",  
> o(OES_EGL_image_external),   ES1 | ES2, 2010 },
> { "GL_OES_element_index_uint",  o(dummy_true),
>ES1 | ES2, 2005 },
> { "GL_OES_fbo_render_mipmap",   o(dummy_true),
>ES1 | ES2, 2005 },

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir/builder: Don't use designated initializers

2015-09-21 Thread Kenneth Graunke

On Monday, September 21, 2015 09:24:17 AM Jason Ekstrand wrote:
> On Mon, Sep 21, 2015 at 9:22 AM, Kenneth Graunke  
> wrote:
> > On Monday, September 21, 2015 08:26:04 AM Jason Ekstrand wrote:
> >> Designated initializers are not allowed in C++ (not even C++11).  Since
> >> nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in
> >> C++, this breaks the build on some compilers.  Aparently, GCC 5 allows it
> >> in some limited extent because mesa still builds on my system without this
> >> patch.
> >>
> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052
> >> ---
> >>  src/glsl/nir/nir_builder.h | 21 ++---
> >>  1 file changed, 18 insertions(+), 3 deletions(-)
> >
> > I'd rather just take Emil's patch to convert nir_lower_samplers out of
> > C++, honestly.  NIR is in C99, let's keep it that way.
> 
> But I want to be able to start using the builder in glsl_to_nir...
> --Jason

Ah, right...that makes more sense.  I'm still not sure whether it makes
sense to have all of this in inline functions in a .h file - if it were
in a .c file we wouldn't need to worry about this.

But this is fairly harmless, so for now,
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: Add support to enable read-only text segment on x86.

2015-09-21 Thread Ian Romanick

Reviewed-by: Ian Romanick 

On 09/03/2015 09:36 PM, Matt Turner wrote:
> From: Jeremy Huddleston 
> 
> Cc: "10.6 11.0" 
> Bugzilla: https://bugs.gentoo.org/240956
> ---
> After talking with Ian today, we determined that this disables an
> optimization.
> 
> And FWIW, NVIDIA's fork of glapi (libglvnd) contains this kind of
> writable-text optimization for x86-64.
> 
>  configure.ac | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/configure.ac b/configure.ac
> index 90ba4fe..259f770 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1289,6 +1289,16 @@ AC_SUBST(GLX_TLS, ${GLX_USE_TLS})
>  AS_IF([test "x$GLX_USE_TLS" = xyes -a "x$ax_pthread_ok" = xyes],
>[DEFINES="${DEFINES} -DGLX_USE_TLS"])
>  
> +dnl Read-only text section on x86 hardened platforms
> +AC_ARG_ENABLE([glx-read-only-text],
> +[AS_HELP_STRING([--enable-glx-read-only-text],
> +[Disable writable .text section on x86 (decreases performance) 
> @<:@default=disabled@:>@])],
> +[enable_glx_read_only_text="$enableval"],
> +[enable_glx_read_only_text=no])
> +if test "x$enable_glx_read_only_text" = xyes; then
> +DEFINES="$DEFINES -DGLX_X86_READONLY_TEXT"
> +fi
> +
>  dnl
>  dnl More DRI setup
>  dnl
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965/compiler: Clean up GLSL compiler option setup

2015-09-21 Thread Jason Ekstrand

The only functional change here is that we now set EmitNoIndirectOutput and
EmitNoIndirectTemp for compute shaders.  Compute shaders don't have outputs
per-se and we should have been setting EmitNoIndirectTemp all along.
---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 46 ++--
 1 file changed, 20 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index eed73fb..c311a03 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -115,41 +115,35 @@ brw_compiler_create(void *mem_ctx, const struct 
brw_device_info *devinfo)
   compiler->glsl_compiler_options[i].EmitNoNoise = true;
   compiler->glsl_compiler_options[i].EmitNoMainReturn = true;
   compiler->glsl_compiler_options[i].EmitNoIndirectInput = true;
-  compiler->glsl_compiler_options[i].EmitNoIndirectOutput =
-(i == MESA_SHADER_FRAGMENT);
-  compiler->glsl_compiler_options[i].EmitNoIndirectTemp =
-(i == MESA_SHADER_FRAGMENT);
   compiler->glsl_compiler_options[i].EmitNoIndirectUniform = false;
   compiler->glsl_compiler_options[i].LowerClipDistance = true;
 
+  bool is_scalar;
+  switch (i) {
+  case MESA_SHADER_FRAGMENT:
+  case MESA_SHADER_COMPUTE:
+ is_scalar = true;
+ break;
+  case MESA_SHADER_VERTEX:
+ is_scalar = compiler->scalar_vs;
+ break;
+  default:
+ is_scalar = false;
+ break;
+  }
+
+  compiler->glsl_compiler_options[i].EmitNoIndirectOutput = is_scalar;
+  compiler->glsl_compiler_options[i].EmitNoIndirectTemp = is_scalar;
+  compiler->glsl_compiler_options[i].OptimizeForAOS = !is_scalar;
+
   /* !ARB_gpu_shader5 */
   if (devinfo->gen < 7)
  compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
-   }
 
-   compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = true;
-   compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].OptimizeForAOS = true;
-
-   if (compiler->scalar_vs || brw_env_var_as_boolean("INTEL_USE_NIR", true)) {
-  if (compiler->scalar_vs) {
- /* If we're using the scalar backend for vertex shaders, we need to
-  * configure these accordingly.
-  */
- 
compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = true;
- 
compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
- compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = 
false;
-  }
-
-  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions = 
nir_options;
-   }
-
-   if (brw_env_var_as_boolean("INTEL_USE_NIR", true)) {
-  compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].NirOptions = 
nir_options;
+  if (is_scalar || brw_env_var_as_boolean("INTEL_USE_NIR", true))
+ compiler->glsl_compiler_options[i].NirOptions = nir_options;
}
 
-   compiler->glsl_compiler_options[MESA_SHADER_FRAGMENT].NirOptions = 
nir_options;
-   compiler->glsl_compiler_options[MESA_SHADER_COMPUTE].NirOptions = 
nir_options;
-
return compiler;
 }
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/compiler: Clean up GLSL compiler option setup

2015-09-21 Thread Jordan Justen

Reviewed-by: Jordan Justen 

On 2015-09-21 11:24:59, Jason Ekstrand wrote:
> The only functional change here is that we now set EmitNoIndirectOutput and
> EmitNoIndirectTemp for compute shaders.  Compute shaders don't have outputs
> per-se and we should have been setting EmitNoIndirectTemp all along.
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 46 
> ++--
>  1 file changed, 20 insertions(+), 26 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index eed73fb..c311a03 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -115,41 +115,35 @@ brw_compiler_create(void *mem_ctx, const struct 
> brw_device_info *devinfo)
>compiler->glsl_compiler_options[i].EmitNoNoise = true;
>compiler->glsl_compiler_options[i].EmitNoMainReturn = true;
>compiler->glsl_compiler_options[i].EmitNoIndirectInput = true;
> -  compiler->glsl_compiler_options[i].EmitNoIndirectOutput =
> -(i == MESA_SHADER_FRAGMENT);
> -  compiler->glsl_compiler_options[i].EmitNoIndirectTemp =
> -(i == MESA_SHADER_FRAGMENT);
>compiler->glsl_compiler_options[i].EmitNoIndirectUniform = false;
>compiler->glsl_compiler_options[i].LowerClipDistance = true;
>  
> +  bool is_scalar;
> +  switch (i) {
> +  case MESA_SHADER_FRAGMENT:
> +  case MESA_SHADER_COMPUTE:
> + is_scalar = true;
> + break;
> +  case MESA_SHADER_VERTEX:
> + is_scalar = compiler->scalar_vs;
> + break;
> +  default:
> + is_scalar = false;
> + break;
> +  }
> +
> +  compiler->glsl_compiler_options[i].EmitNoIndirectOutput = is_scalar;
> +  compiler->glsl_compiler_options[i].EmitNoIndirectTemp = is_scalar;
> +  compiler->glsl_compiler_options[i].OptimizeForAOS = !is_scalar;
> +
>/* !ARB_gpu_shader5 */
>if (devinfo->gen < 7)
>   compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
> -   }
>  
> -   compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS = true;
> -   compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].OptimizeForAOS = 
> true;
> -
> -   if (compiler->scalar_vs || brw_env_var_as_boolean("INTEL_USE_NIR", true)) 
> {
> -  if (compiler->scalar_vs) {
> - /* If we're using the scalar backend for vertex shaders, we need to
> -  * configure these accordingly.
> -  */
> - 
> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectOutput = 
> true;
> - 
> compiler->glsl_compiler_options[MESA_SHADER_VERTEX].EmitNoIndirectTemp = true;
> - compiler->glsl_compiler_options[MESA_SHADER_VERTEX].OptimizeForAOS 
> = false;
> -  }
> -
> -  compiler->glsl_compiler_options[MESA_SHADER_VERTEX].NirOptions = 
> nir_options;
> -   }
> -
> -   if (brw_env_var_as_boolean("INTEL_USE_NIR", true)) {
> -  compiler->glsl_compiler_options[MESA_SHADER_GEOMETRY].NirOptions = 
> nir_options;
> +  if (is_scalar || brw_env_var_as_boolean("INTEL_USE_NIR", true))
> + compiler->glsl_compiler_options[i].NirOptions = nir_options;
> }
>  
> -   compiler->glsl_compiler_options[MESA_SHADER_FRAGMENT].NirOptions = 
> nir_options;
> -   compiler->glsl_compiler_options[MESA_SHADER_COMPUTE].NirOptions = 
> nir_options;
> -
> return compiler;
>  }
>  
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/compiler: Clean up GLSL compiler option setup

2015-09-21 Thread Matt Turner

On Mon, Sep 21, 2015 at 11:24 AM, Jason Ekstrand  wrote:
> The only functional change here is that we now set EmitNoIndirectOutput and
> EmitNoIndirectTemp for compute shaders.  Compute shaders don't have outputs
> per-se and we should have been setting EmitNoIndirectTemp all along.
> ---

No one has ever used a "i965/compiler: " prefix. Other changes to
brw_shader.cpp just say "i965: "

The patch itself looks nice. With the prefix changed,

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/compiler: Clean up GLSL compiler option setup

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 12:40 PM, Matt Turner  wrote:
> On Mon, Sep 21, 2015 at 11:24 AM, Jason Ekstrand  wrote:
>> The only functional change here is that we now set EmitNoIndirectOutput and
>> EmitNoIndirectTemp for compute shaders.  Compute shaders don't have outputs
>> per-se and we should have been setting EmitNoIndirectTemp all along.
>> ---
>
> No one has ever used a "i965/compiler: " prefix. Other changes to
> brw_shader.cpp just say "i965: "
>
> The patch itself looks nice. With the prefix changed,

Done.

> Reviewed-by: Matt Turner 

Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Matt Turner

With NIR:

instructions in affected programs: 111508 -> 109193 (-2.08%)
helped:507

Without NIR:

instructions in affected programs: 28763 -> 28474 (-1.00%)
helped:186
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +
 1 file changed, 21 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index ed49cd3..d09a8dd 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1021,6 +1021,27 @@ vec4_visitor::opt_register_coalesce()
  inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
 continue;
 
+  if (inst->dst.file == inst->src[0].file &&
+  inst->dst.reg == inst->src[0].reg &&
+  inst->dst.reg_offset == inst->src[0].reg_offset) {
+ bool is_nop_mov = true;
+
+ for (unsigned c = 0; c < 4; c++) {
+if ((inst->dst.writemask & (1 << c)) == 0)
+   continue;
+
+if (BRW_GET_SWZ(inst->src[0].swizzle, c) != c) {
+   is_nop_mov = false;
+   break;
+}
+ }
+
+ if (is_nop_mov) {
+inst->remove(block);
+continue;
+ }
+  }
+
   bool to_mrf = (inst->dst.file == MRF);
 
   /* Can't coalesce this GRF if someone else was going to
-- 
2.4.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 2:04 PM, Matt Turner  wrote:
> With NIR:
>
> instructions in affected programs: 111508 -> 109193 (-2.08%)
> helped:507

*sigh*  Any idea who's emitting these MOV's?  If it's vec_to_movs, we
should probably fix it one of these days.  In any case,

> Without NIR:
>
> instructions in affected programs: 28763 -> 28474 (-1.00%)
> helped:186
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +
>  1 file changed, 21 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index ed49cd3..d09a8dd 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1021,6 +1021,27 @@ vec4_visitor::opt_register_coalesce()
>   inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
>  continue;
>

Could we add a quick comment here that we're detecting a no-op MOV?

With that,

Reviewed-by: Jason Ekstrand 

Thanks fir fixing this!

--Jason

> +  if (inst->dst.file == inst->src[0].file &&
> +  inst->dst.reg == inst->src[0].reg &&
> +  inst->dst.reg_offset == inst->src[0].reg_offset) {
> + bool is_nop_mov = true;
> +
> + for (unsigned c = 0; c < 4; c++) {
> +if ((inst->dst.writemask & (1 << c)) == 0)
> +   continue;
> +
> +if (BRW_GET_SWZ(inst->src[0].swizzle, c) != c) {
> +   is_nop_mov = false;
> +   break;
> +}
> + }
> +
> + if (is_nop_mov) {
> +inst->remove(block);
> +continue;
> + }
> +  }
> +
>bool to_mrf = (inst->dst.file == MRF);
>
>/* Can't coalesce this GRF if someone else was going to
> --
> 2.4.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Kenneth Graunke

On Monday, September 21, 2015 02:04:00 PM Matt Turner wrote:
> With NIR:
> 
> instructions in affected programs: 111508 -> 109193 (-2.08%)
> helped:507
> 
> Without NIR:
> 
> instructions in affected programs: 28763 -> 28474 (-1.00%)
> helped:186
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +
>  1 file changed, 21 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index ed49cd3..d09a8dd 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1021,6 +1021,27 @@ vec4_visitor::opt_register_coalesce()
> inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
>continue;
>  
> +  if (inst->dst.file == inst->src[0].file &&
> +  inst->dst.reg == inst->src[0].reg &&
> +  inst->dst.reg_offset == inst->src[0].reg_offset) {
> + bool is_nop_mov = true;
> +
> + for (unsigned c = 0; c < 4; c++) {
> +if ((inst->dst.writemask & (1 << c)) == 0)
> +   continue;
> +
> +if (BRW_GET_SWZ(inst->src[0].swizzle, c) != c) {
> +   is_nop_mov = false;
> +   break;
> +}
> + }
> +
> + if (is_nop_mov) {
> +inst->remove(block);
> +continue;
> + }
> +  }
> +
>bool to_mrf = (inst->dst.file == MRF);
>  
>/* Can't coalesce this GRF if someone else was going to
> 

Seems like this would break for:

   mov(8) g2<1>F g2.xyzw<4,4,1>D

or

   mov(8) g2<1>F -|g2.xyzw|<4,4,1>F

Did I miss something?


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 2:16 PM, Kenneth Graunke  wrote:
> On Monday, September 21, 2015 02:04:00 PM Matt Turner wrote:
>> With NIR:
>>
>> instructions in affected programs: 111508 -> 109193 (-2.08%)
>> helped:507
>>
>> Without NIR:
>>
>> instructions in affected programs: 28763 -> 28474 (-1.00%)
>> helped:186
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index ed49cd3..d09a8dd 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -1021,6 +1021,27 @@ vec4_visitor::opt_register_coalesce()
>> inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
>>continue;
>>
>> +  if (inst->dst.file == inst->src[0].file &&
>> +  inst->dst.reg == inst->src[0].reg &&
>> +  inst->dst.reg_offset == inst->src[0].reg_offset) {
>> + bool is_nop_mov = true;
>> +
>> + for (unsigned c = 0; c < 4; c++) {
>> +if ((inst->dst.writemask & (1 << c)) == 0)
>> +   continue;
>> +
>> +if (BRW_GET_SWZ(inst->src[0].swizzle, c) != c) {
>> +   is_nop_mov = false;
>> +   break;
>> +}
>> + }
>> +
>> + if (is_nop_mov) {
>> +inst->remove(block);
>> +continue;
>> + }
>> +  }
>> +
>>bool to_mrf = (inst->dst.file == MRF);
>>
>>/* Can't coalesce this GRF if someone else was going to
>>
>
> Seems like this would break for:
>
>mov(8) g2<1>F g2.xyzw<4,4,1>D
>
> or
>
>mov(8) g2<1>F -|g2.xyzw|<4,4,1>F
>
> Did I miss something?

Right above there, we bail if inst->dst.type != inst->src[0].type

--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 2:17 PM, Jason Ekstrand  wrote:
> On Mon, Sep 21, 2015 at 2:16 PM, Kenneth Graunke  
> wrote:
>> On Monday, September 21, 2015 02:04:00 PM Matt Turner wrote:
>>> With NIR:
>>>
>>> instructions in affected programs: 111508 -> 109193 (-2.08%)
>>> helped:507
>>>
>>> Without NIR:
>>>
>>> instructions in affected programs: 28763 -> 28474 (-1.00%)
>>> helped:186
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +
>>>  1 file changed, 21 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>> index ed49cd3..d09a8dd 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>>> @@ -1021,6 +1021,27 @@ vec4_visitor::opt_register_coalesce()
>>> inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
>>>continue;
>>>
>>> +  if (inst->dst.file == inst->src[0].file &&
>>> +  inst->dst.reg == inst->src[0].reg &&
>>> +  inst->dst.reg_offset == inst->src[0].reg_offset) {
>>> + bool is_nop_mov = true;
>>> +
>>> + for (unsigned c = 0; c < 4; c++) {
>>> +if ((inst->dst.writemask & (1 << c)) == 0)
>>> +   continue;
>>> +
>>> +if (BRW_GET_SWZ(inst->src[0].swizzle, c) != c) {
>>> +   is_nop_mov = false;
>>> +   break;
>>> +}
>>> + }
>>> +
>>> + if (is_nop_mov) {
>>> +inst->remove(block);
>>> +continue;
>>> + }
>>> +  }
>>> +
>>>bool to_mrf = (inst->dst.file == MRF);
>>>
>>>/* Can't coalesce this GRF if someone else was going to
>>>
>>
>> Seems like this would break for:
>>
>>mov(8) g2<1>F g2.xyzw<4,4,1>D
>>
>> or
>>
>>mov(8) g2<1>F -|g2.xyzw|<4,4,1>F
>>
>> Did I miss something?
>
> Right above there, we bail if inst->dst.type != inst->src[0].type

And source modifiers and a few other conditions.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Matt Turner

On Mon, Sep 21, 2015 at 2:16 PM, Kenneth Graunke  wrote:
> On Monday, September 21, 2015 02:04:00 PM Matt Turner wrote:
>> With NIR:
>>
>> instructions in affected programs: 111508 -> 109193 (-2.08%)
>> helped:507
>>
>> Without NIR:
>>
>> instructions in affected programs: 28763 -> 28474 (-1.00%)
>> helped:186
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index ed49cd3..d09a8dd 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -1021,6 +1021,27 @@ vec4_visitor::opt_register_coalesce()
>> inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
>>continue;
>>
>> +  if (inst->dst.file == inst->src[0].file &&
>> +  inst->dst.reg == inst->src[0].reg &&
>> +  inst->dst.reg_offset == inst->src[0].reg_offset) {
>> + bool is_nop_mov = true;
>> +
>> + for (unsigned c = 0; c < 4; c++) {
>> +if ((inst->dst.writemask & (1 << c)) == 0)
>> +   continue;
>> +
>> +if (BRW_GET_SWZ(inst->src[0].swizzle, c) != c) {
>> +   is_nop_mov = false;
>> +   break;
>> +}
>> + }
>> +
>> + if (is_nop_mov) {
>> +inst->remove(block);
>> +continue;
>> + }
>> +  }
>> +
>>bool to_mrf = (inst->dst.file == MRF);
>>
>>/* Can't coalesce this GRF if someone else was going to
>>
>
> Seems like this would break for:
>
>mov(8) g2<1>F g2.xyzw<4,4,1>D
>
> or
>
>mov(8) g2<1>F -|g2.xyzw|<4,4,1>F
>
> Did I miss something?

Yeah, the block immediately above handles conditions like this:

if (inst->opcode != BRW_OPCODE_MOV ||
(inst->dst.file != GRF && inst->dst.file != MRF) ||
inst->predicate ||
inst->src[0].file != GRF ||
inst->dst.type != inst->src[0].type ||
inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
   continue;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Matt Turner

On Mon, Sep 21, 2015 at 2:16 PM, Jason Ekstrand  wrote:
> On Mon, Sep 21, 2015 at 2:04 PM, Matt Turner  wrote:
>> With NIR:
>>
>> instructions in affected programs: 111508 -> 109193 (-2.08%)
>> helped:507
>
> *sigh*  Any idea who's emitting these MOV's?  If it's vec_to_movs, we
> should probably fix it one of these days.  In any case,

No idea.

I found it by looking at a shader like
team-fortress-2/2200.shader_test -- which contains a series of no-op
moves inside an if/endif.

>> Without NIR:
>>
>> instructions in affected programs: 28763 -> 28474 (-1.00%)
>> helped:186
>> ---
>>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 21 +
>>  1 file changed, 21 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
>> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> index ed49cd3..d09a8dd 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
>> @@ -1021,6 +1021,27 @@ vec4_visitor::opt_register_coalesce()
>>   inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
>>  continue;
>>
>
> Could we add a quick comment here that we're detecting a no-op MOV?

Sure. I added

  /* Remove no-op MOVs */

above the if statement.

> With that,
>
> Reviewed-by: Jason Ekstrand 

Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/3] i965/vec4: Delete the old ir_visitor code

2015-09-21 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_vec4.h  |   73 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   30 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |3 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 2106 ++---
 src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp |   15 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h   |2 -
 6 files changed, 123 insertions(+), 2106 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index de74ec9..e173f78 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -65,7 +65,7 @@ class vec4_live_variables;
  * Translates either GLSL IR or Mesa IR (for ARB_vertex_program and
  * fixed-function) into VS IR.
  */
-class vec4_visitor : public backend_shader, public ir_visitor
+class vec4_visitor : public backend_shader
 {
 public:
vec4_visitor(const struct brw_compiler *compiler,
@@ -116,44 +116,8 @@ public:
brw::vec4_live_variables *live_intervals;
dst_reg userplane[MAX_CLIP_PLANES];
 
-   dst_reg *variable_storage(ir_variable *var);
-
-   void reladdr_to_temp(ir_instruction *ir, src_reg *reg, int *num_reladdr);
-
bool need_all_constants_in_pull_buffer;
 
-   /**
-* \name Visit methods
-*
-* As typical for the visitor pattern, there must be one \c visit method for
-* each concrete subclass of \c ir_instruction.  Virtual base classes within
-* the hierarchy should not have \c visit methods.
-*/
-   /*@{*/
-   virtual void visit(ir_variable *);
-   virtual void visit(ir_loop *);
-   virtual void visit(ir_loop_jump *);
-   virtual void visit(ir_function_signature *);
-   virtual void visit(ir_function *);
-   virtual void visit(ir_expression *);
-   virtual void visit(ir_swizzle *);
-   virtual void visit(ir_dereference_variable  *);
-   virtual void visit(ir_dereference_array *);
-   virtual void visit(ir_dereference_record *);
-   virtual void visit(ir_assignment *);
-   virtual void visit(ir_constant *);
-   virtual void visit(ir_call *);
-   virtual void visit(ir_return *);
-   virtual void visit(ir_discard *);
-   virtual void visit(ir_texture *);
-   virtual void visit(ir_if *);
-   virtual void visit(ir_emit_vertex *);
-   virtual void visit(ir_end_primitive *);
-   virtual void visit(ir_barrier *);
-   /*@}*/
-
-   src_reg result;
-
/* Regs for vertex results.  Generated at ir_variable visiting time
 * for the ir->location's used.
 */
@@ -166,16 +130,12 @@ public:
 
src_reg shader_start_time;
 
-   struct hash_table *variable_ht;
-
bool run();
void fail(const char *msg, ...);
 
virtual void setup_vec4_uniform_value(unsigned param_offset,
  const gl_constant_value *values,
  unsigned n);
-   void setup_uniform_values(ir_variable *ir);
-   void setup_builtin_uniform_values(ir_variable *ir);
int setup_uniforms(int payload_reg);
 
bool reg_allocate_trivial();
@@ -271,21 +231,9 @@ public:
 
int implied_mrf_writes(vec4_instruction *inst);
 
-   bool try_rewrite_rhs_to_dst(ir_assignment *ir,
-  dst_reg dst,
-  src_reg src,
-  vec4_instruction *pre_rhs_inst,
-  vec4_instruction *last_rhs_inst);
-
-   /** Walks an exec_list of ir_instruction and sends it through this visitor. 
*/
-   void visit_instructions(const exec_list *list);
-
void emit_vp_sop(enum brw_conditional_mod condmod, dst_reg dst,
 src_reg src0, src_reg src1, src_reg one);
 
-   void emit_bool_to_cond_code(ir_rvalue *ir, enum brw_predicate *predicate);
-   void emit_if_gen6(ir_if *ir);
-
vec4_instruction *emit_minmax(enum brw_conditional_mod conditionalmod, 
dst_reg dst,
  src_reg src0, src_reg src1);
 
@@ -298,22 +246,11 @@ public:
 */
src_reg emit_uniformize(const src_reg &src);
 
-   void emit_block_move(dst_reg *dst, src_reg *src,
-const struct glsl_type *type, brw_predicate predicate);
-
-   void emit_constant_values(dst_reg *dst, ir_constant *value);
-
/**
 * Emit the correct dot-product instruction for the type of arguments
 */
void emit_dp(dst_reg dst, src_reg src0, src_reg src1, unsigned elements);
 
-   void emit_scalar(ir_instruction *ir, enum prog_opcode op,
-   dst_reg dst, src_reg src0);
-
-   void emit_scalar(ir_instruction *ir, enum prog_opcode op,
-   dst_reg dst, src_reg src0, src_reg src1);
-
src_reg fix_3src_operand(const src_reg &src);
src_reg resolve_source_modifiers(const src_reg &src);
 
@@ -389,20 +326,13 @@ public:
src_reg emit_resolve_reladdr(int scratch_loc[], bblock_t *block,
 vec4_instruction *inst, src_reg src);
 
-   bool try_emit_mad(ir_expression *ir);
-   bool try_emit_b2f_of_compare(ir_ex

[Mesa-dev] [PATCH 3/3] i965/vec4: Delete the old vec4_vp code

2015-09-21 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/Makefile.sources |   1 -
 src/mesa/drivers/dri/i965/brw_vec4.h   |   1 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |   9 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h|   1 -
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp  | 649 -
 src/mesa/drivers/dri/i965/brw_vs.h |   1 -
 .../dri/i965/test_vec4_copy_propagation.cpp|   5 -
 .../dri/i965/test_vec4_register_coalesce.cpp   |   5 -
 8 files changed, 672 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 2ef392a..31eeecd 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -130,7 +130,6 @@ i965_FILES = \
brw_vec4_gs_nir.cpp \
brw_vec4_reg_allocate.cpp \
brw_vec4_visitor.cpp \
-   brw_vec4_vp.cpp \
brw_vec4_vs_visitor.cpp \
brw_vs.c \
brw_vs.h \
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index e173f78..abaae14 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -382,7 +382,6 @@ protected:
virtual void assign_binding_table_offsets();
virtual void setup_payload() = 0;
virtual void emit_prolog() = 0;
-   virtual void emit_program_code() = 0;
virtual void emit_thread_end() = 0;
virtual void emit_urb_write_header(int mrf) = 0;
virtual vec4_instruction *emit_urb_write_opcode(bool complete) = 0;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index 8164ef5..4322f2c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -207,15 +207,6 @@ vec4_gs_visitor::emit_prolog()
this->current_annotation = NULL;
 }
 
-
-void
-vec4_gs_visitor::emit_program_code()
-{
-   /* We don't support NV_geometry_program4. */
-   unreachable("Unreached");
-}
-
-
 void
 vec4_gs_visitor::emit_thread_end()
 {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
index d6ff777..05453b5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h
@@ -83,7 +83,6 @@ protected:
   const glsl_type *type);
virtual void setup_payload();
virtual void emit_prolog();
-   virtual void emit_program_code();
virtual void emit_thread_end();
virtual void emit_urb_write_header(int mrf);
virtual vec4_instruction *emit_urb_write_opcode(bool complete);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
deleted file mode 100644
index d1a72d7..000
--- a/src/mesa/drivers/dri/i965/brw_vec4_vp.cpp
+++ /dev/null
@@ -1,649 +0,0 @@
-/*
- * Copyright © 2012 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
- * IN THE SOFTWARE.
- */
-
-/** @file brw_vec4_vp.cpp
- *
- * A translator from Mesa IR to the i965 driver's Vec4 IR, used to implement
- * ARB_vertex_program and fixed-function vertex processing.
- */
-
-#include "brw_context.h"
-#include "brw_vec4.h"
-#include "brw_vs.h"
-extern "C" {
-#include "program/prog_parameter.h"
-#include "program/prog_print.h"
-}
-using namespace brw;
-
-void
-vec4_visitor::emit_vp_sop(enum brw_conditional_mod conditional_mod,
-  dst_reg dst, src_reg src0, src_reg src1,
-  src_reg one)
-{
-   vec4_instruction *inst;
-
-   inst = emit(CMP(dst_null_f(), src0, src1, conditional_mod));
-
-   inst = emit(BRW_OPCODE_SEL, dst, one, src_reg(0.0f));
-   inst->predicate = BRW_PREDICATE_NORMAL;
-}
-
-void
-vec4_vs_visitor::emit_program_code()
-{
-   this->need_all_constants_

[Mesa-dev] [PATCH 1/3] i965/vec4: Always use NIR

2015-09-21 Thread Jason Ekstrand

GLSL IR vs. NIR shader-db results for vec4 programs on i965:

   total instructions in shared programs: 1499395 -> 1408109 (-6.09%)
   instructions in affected programs: 1309372 -> 1218086 (-6.97%)
   helped:6359
   HURT:  4528

GLSL IR vs. NIR shader-db results for vec4 programs on G4x:

   total instructions in shared programs: 1436866 -> 1345580 (-6.35%)
   instructions in affected programs: 1265868 -> 1174582 (-7.21%)
   helped:6359
   HURT:  4528

GLSL IR vs. NIR shader-db results for vec4 programs on Iron Lake:

   total instructions in shared programs: 1436856 -> 1345570 (-6.35%)
   instructions in affected programs: 1265858 -> 1174572 (-7.21%)
   helped:6359
   HURT:  4528

GLSL IR vs. NIR shader-db results for vec4 programs on Sandy Bridge:

   total instructions in shared programs: 2020573 -> 1822601 (-9.80%)
   instructions in affected programs: 1883334 -> 1685362 (-10.51%)
   helped:13328
   HURT:  3594

GLSL IR vs. NIR shader-db results for vec4 programs on Ivy Bridge:

   total instructions in shared programs: 1852351 -> 1683842 (-9.10%)
   instructions in affected programs: 1686970 -> 1518461 (-9.99%)
   helped:12677
   HURT:  3940

GLSL IR vs. NIR shader-db results for vec4 programs on Bay Trail:

   total instructions in shared programs: 1852351 -> 1683842 (-9.10%)
   instructions in affected programs: 1686970 -> 1518461 (-9.99%)
   helped:12677
   HURT:  3940

GLSL IR vs. NIR shader-db results for vec4 programs on Haswell:

   total instructions in shared programs: 1852351 -> 1683842 (-9.10%)
   instructions in affected programs: 1686970 -> 1518461 (-9.99%)
   helped:12677
   HURT:  3940
---
 src/mesa/drivers/dri/i965/brw_shader.cpp |  3 +--
 src/mesa/drivers/dri/i965/brw_vec4.cpp   | 35 +++-
 2 files changed, 8 insertions(+), 30 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index c311a03..bfb7287 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -140,8 +140,7 @@ brw_compiler_create(void *mem_ctx, const struct 
brw_device_info *devinfo)
   if (devinfo->gen < 7)
  compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
 
-  if (is_scalar || brw_env_var_as_boolean("INTEL_USE_NIR", true))
- compiler->glsl_compiler_options[i].NirOptions = nir_options;
+  compiler->glsl_compiler_options[i].NirOptions = nir_options;
}
 
return compiler;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index ed49cd3..8f8f03c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1755,9 +1755,6 @@ vec4_visitor::emit_shader_time_write(int 
shader_time_subindex, src_reg value)
 bool
 vec4_visitor::run()
 {
-   bool use_vec4_nir =
-  compiler->glsl_compiler_options[stage].NirOptions != NULL;
-
sanity_param_count = prog->Parameters->NumParameters;
 
if (shader_time_index >= 0)
@@ -1767,19 +1764,10 @@ vec4_visitor::run()
 
emit_prolog();
 
-   if (use_vec4_nir) {
-  assert(prog->nir != NULL);
-  emit_nir_code();
-  if (failed)
- return false;
-   } else if (shader) {
-  /* Generate VS IR for main().  (the visitor only descends into
-   * functions called "main").
-   */
-  visit_instructions(shader->base.ir);
-   } else {
-  emit_program_code();
-   }
+   assert(prog->nir != NULL);
+   emit_nir_code();
+   if (failed)
+  return false;
base_ir = NULL;
 
emit_thread_end();
@@ -1792,18 +1780,9 @@ vec4_visitor::run()
 * that we have reladdr computations available for CSE, since we'll
 * often do repeated subexpressions for those.
 */
-   if (shader || use_vec4_nir) {
-  move_grf_array_access_to_scratch();
-  move_uniform_array_access_to_pull_constants();
-   } else {
-  /* The ARB_vertex_program frontend emits pull constant loads directly
-   * rather than using reladdr, so we don't need to walk through all the
-   * instructions looking for things to move.  There isn't anything.
-   *
-   * We do still need to split things to vec4 size.
-   */
-  split_uniform_registers();
-   }
+   move_grf_array_access_to_scratch();
+   move_uniform_array_access_to_pull_constants();
+
pack_uniform_registers();
move_push_constants_to_pull_constants();
split_virtual_grfs();
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list

[Mesa-dev] [PATCH 0/3] i965: Delete all of the non-NIR vec4 code

2015-09-21 Thread Jason Ekstrand

At this point, piglit is the same as for GLSL and the shader-db numbers are
looking pretty good.  On SNB, GLSL vs. NIR for vec4 programs is:

   total instructions in shared programs: 2020573 -> 1822601 (-9.80%)
   instructions in affected programs: 1883334 -> 1685362 (-10.51%)
   helped:13328
   HURT:  3594

and there are patches on the list that improve this to

   total instructions in shared programs: 2020283 -> 1805487 (-10.63%)
   instructions in affected programs: 1855759 -> 1640963 (-11.57%)
   helped:14142
   HURT:  2346

More shader-db numbers can be found in the first real patch.

Jason Ekstrand (3):
  i965/vec4: Always use NIR
  i965/vec4: Delete the old ir_visitor code
  i965/vec4: Delete the old vec4_vp code

 src/mesa/drivers/dri/i965/Makefile.sources |1 -
 src/mesa/drivers/dri/i965/brw_shader.cpp   |3 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp |   35 +-
 src/mesa/drivers/dri/i965/brw_vec4.h   |   74 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |   39 -
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h|4 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2106 ++--
 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp  |  649 --
 src/mesa/drivers/dri/i965/brw_vs.h |1 -
 src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp  |   15 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h|2 -
 .../dri/i965/test_vec4_copy_propagation.cpp|5 -
 .../dri/i965/test_vec4_register_coalesce.cpp   |5 -
 13 files changed, 131 insertions(+), 2808 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/brw_vec4_vp.cpp

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH V3 1/8] i965: Add a helper function intel_get_tile_dims()

2015-09-21 Thread Anuj Phogat

V2:
- Do the tile width/height computations in the new helper
  function and use it later in intel_miptree_get_tile_masks().
- Change the name to intel_get_tile_dims().

V3: Return the tile_h in number of rows in place of bytes.
Document the units of tile_w, tile_h parameters.

Cc: Chad Versace 
Cc: Topi Pohjolainen 
Signed-off-by: Anuj Phogat 
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 82 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  4 ++
 2 files changed, 64 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 2150708..ee5904d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -557,35 +557,15 @@ static unsigned long
 intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
 unsigned long *pitch)
 {
-   const uint32_t bpp = mt->cpp * 8;
-   const uint32_t aspect_ratio = (bpp == 16 || bpp == 64) ? 2 : 1;
uint32_t tile_width, tile_height;
unsigned long stride, size, aligned_y;
 
assert(mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE);
-
-   switch (bpp) {
-   case 8:
-  tile_height = 64;
-  break;
-   case 16:
-   case 32:
-  tile_height = 32;
-  break;
-   case 64:
-   case 128:
-  tile_height = 16;
-  break;
-   default:
-  unreachable("not reached");
-   }
-
-   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YS)
-  tile_height *= 4;
+   intel_get_tile_dims(mt->tiling, mt->tr_mode, mt->cpp,
+   &tile_width, &tile_height);
 
aligned_y = ALIGN(mt->total_height, tile_height);
stride = mt->total_width * mt->cpp;
-   tile_width = tile_height * mt->cpp * aspect_ratio;
stride = ALIGN(stride, tile_width);
size = stride * aligned_y;
 
@@ -1075,6 +1055,64 @@ intel_miptree_get_image_offset(const struct 
intel_mipmap_tree *mt,
*y = mt->level[level].slice[slice].y_offset;
 }
 
+
+/**
+ * This function computes the tile_w (in bytes) and tile_h (in rows) of
+ * different tiling patterns. If the BO is untiled, tile_w is set to cpp
+ * and tile_h is set to 1.
+ */
+void
+intel_get_tile_dims(uint32_t tiling, uint32_t tr_mode, uint32_t cpp,
+uint32_t *tile_w, uint32_t *tile_h)
+{
+   if (tr_mode == INTEL_MIPTREE_TRMODE_NONE) {
+  switch (tiling) {
+  case I915_TILING_X:
+ *tile_w = 512;
+ *tile_h = 8;
+ break;
+  case I915_TILING_Y:
+ *tile_w = 128;
+ *tile_h = 32;
+ break;
+  case I915_TILING_NONE:
+ *tile_w = cpp;
+ *tile_h = 1;
+ break;
+  default:
+ unreachable("not reached");
+  }
+   } else {
+  uint32_t aspect_ratio = 1;
+  assert(_mesa_is_pow_two(cpp));
+
+  switch (cpp) {
+  case 1:
+ *tile_h = 64;
+ break;
+  case 2:
+  case 4:
+ *tile_h = 32;
+ break;
+  case 8:
+  case 16:
+ *tile_h = 16;
+ break;
+  default:
+ unreachable("not reached");
+  }
+
+  if (cpp == 2 || cpp == 8)
+ aspect_ratio = 2;
+
+  if (tr_mode == INTEL_MIPTREE_TRMODE_YS)
+ *tile_h *= 4;
+
+  *tile_w = *tile_h * aspect_ratio * cpp;
+   }
+}
+
+
 /**
  * This function computes masks that may be used to select the bits of the X
  * and Y coordinates that indicate the offset within a tile.  If the BO is
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index bcf6d00..294e301 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -627,6 +627,10 @@ intel_miptree_get_tile_masks(const struct 
intel_mipmap_tree *mt,
  uint32_t *mask_x, uint32_t *mask_y,
  bool map_stencil_as_y_tiled);
 
+void
+intel_get_tile_dims(uint32_t tiling, uint32_t tr_mode, uint32_t cpp,
+uint32_t *tile_w, uint32_t *tile_h);
+
 uint32_t
 intel_miptree_get_tile_offsets(const struct intel_mipmap_tree *mt,
GLuint level, GLuint slice,
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH V3 2/8] i965: Use intel_get_tile_dims() to get tile masks

2015-09-21 Thread Anuj Phogat

This will require change in the parameters passed to
intel_miptree_get_tile_masks().

V2: Rearrange the order of parameters. (Ben)
Change the name to intel_get_tile_masks(). (Topi)

V3: Use temporary variables in intel_get_tile_masks()
for clarity. Fix mask_y computation.

Cc: Chad Versace 
Cc: Topi Pohjolainen 
Signed-off-by: Anuj Phogat 
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp   |  4 +++-
 src/mesa/drivers/dri/i965/brw_misc_state.c| 20 +++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 31 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  6 +++---
 4 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index eac1f00..df2969d 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -144,7 +144,9 @@ brw_blorp_surface_info::compute_tile_offsets(uint32_t 
*tile_x,
 {
uint32_t mask_x, mask_y;
 
-   intel_miptree_get_tile_masks(mt, &mask_x, &mask_y, map_stencil_as_y_tiled);
+   intel_get_tile_masks(mt->tiling, mt->tr_mode, mt->cpp,
+map_stencil_as_y_tiled,
+&mask_x, &mask_y);
 
*tile_x = x_offset & mask_x;
*tile_y = y_offset & mask_y;
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 2751152..7d17edb 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -174,13 +174,17 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
*depth_mt,
uint32_t tile_mask_x = 0, tile_mask_y = 0;
 
if (depth_mt) {
-  intel_miptree_get_tile_masks(depth_mt, &tile_mask_x, &tile_mask_y, 
false);
+  intel_get_tile_masks(depth_mt->tiling, depth_mt->tr_mode,
+   depth_mt->cpp, false,
+   &tile_mask_x, &tile_mask_y);
 
   if (intel_miptree_level_has_hiz(depth_mt, depth_level)) {
  uint32_t hiz_tile_mask_x, hiz_tile_mask_y;
- intel_miptree_get_tile_masks(depth_mt->hiz_buf->mt,
-  &hiz_tile_mask_x, &hiz_tile_mask_y,
-  false);
+ intel_get_tile_masks(depth_mt->hiz_buf->mt->tiling,
+  depth_mt->hiz_buf->mt->tr_mode,
+  depth_mt->hiz_buf->mt->cpp,
+  false, &hiz_tile_mask_x,
+  &hiz_tile_mask_y);
 
  /* Each HiZ row represents 2 rows of pixels */
  hiz_tile_mask_y = hiz_tile_mask_y << 1 | 1;
@@ -200,9 +204,11 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
*depth_mt,
  tile_mask_y |= 63;
   } else {
  uint32_t stencil_tile_mask_x, stencil_tile_mask_y;
- intel_miptree_get_tile_masks(stencil_mt,
-  &stencil_tile_mask_x,
-  &stencil_tile_mask_y, false);
+ intel_get_tile_masks(stencil_mt->tiling,
+  stencil_mt->tr_mode,
+  stencil_mt->cpp,
+  false, &stencil_tile_mask_x,
+  &stencil_tile_mask_y);
 
  tile_mask_x |= stencil_tile_mask_x;
  tile_mask_y |= stencil_tile_mask_y;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index ee5904d..28d5031 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1119,31 +1119,18 @@ intel_get_tile_dims(uint32_t tiling, uint32_t tr_mode, 
uint32_t cpp,
  * untiled, the masks are set to 0.
  */
 void
-intel_miptree_get_tile_masks(const struct intel_mipmap_tree *mt,
- uint32_t *mask_x, uint32_t *mask_y,
- bool map_stencil_as_y_tiled)
+intel_get_tile_masks(uint32_t tiling, uint32_t tr_mode, uint32_t cpp,
+ bool map_stencil_as_y_tiled,
+ uint32_t *mask_x, uint32_t *mask_y)
 {
-   int cpp = mt->cpp;
-   uint32_t tiling = mt->tiling;
-
+   uint32_t tile_w_bytes, tile_h;
if (map_stencil_as_y_tiled)
   tiling = I915_TILING_Y;
 
-   switch (tiling) {
-   default:
-  unreachable("not reached");
-   case I915_TILING_NONE:
-  *mask_x = *mask_y = 0;
-  break;
-   case I915_TILING_X:
-  *mask_x = 512 / cpp - 1;
-  *mask_y = 7;
-  break;
-   case I915_TILING_Y:
-  *mask_x = 128 / cpp - 1;
-  *mask_y = 31;
-  break;
-   }
+   intel_get_tile_dims(tiling, tr_mode, cpp, &tile_w_bytes, &tile_h);
+
+   *mask_x = tile_w_bytes / cpp - 1;
+   *mask_y = tile_h - 1;
 }
 
 /**
@@ -1208,7 +1195,7 @@ intel_miptree_get_tile_offsets(const struct 
intel_mipmap_tree *mt,
uint32_t x, y;
uint32_t mask_x, mask_y;
 
-   intel_miptree_get_tile_masks(mt, &m

Re: [Mesa-dev] [PATCH rework] mesa: rework Driver.CopyImageSubData() and related code

2015-09-21 Thread Brian Paul


On 09/04/2015 12:39 PM, Brian Paul wrote:

On 08/28/2015 11:03 AM, Jason Ekstrand wrote:

On Fri, Aug 28, 2015 at 7:43 AM, Brian Paul  wrote:

On 08/28/2015 12:43 AM, Jason Ekstrand wrote:


On Thu, Aug 27, 2015 at 11:42 PM, Jason Ekstrand 
wrote:


From: Brian Paul 

Previously, core Mesa's _mesa_CopyImageSubData() created temporary
textures
to wrap renderbuffer sources/destinations.  This caused a bit of a
mess
in
the Mesa/gallium state tracker because we had to basically undo that
wrapping.

Instead, change ctx->Driver.CopyImageSubData() to take both
gl_renderbuffer
and gl_texture_image src/dst pointers (one being null, the other
non-null)
so the driver can handle renderbuffer vs. texture as needed.

For the i965 driver, we basically moved the code that wrapped textures
around renderbuffers from copyimage.c down into the driver.  So that
approach is still used there as before.

The old code in copyimage.c also made some questionable calls to
_mesa_BindTexture(), etc. which weren't undone at the end.

v2 (Jason Ekstrand): Rework the intel bits
---

TBH, I haven't actually reviewed the rest of the patch yet but I
figured
I'd save Brian the time of guess-and-check implementing it on
Intel.  I
completely reworked the intel bits so we now use the tex/rb
directly and
pun the renderbuffer wrapping all the way down to the meta layer
where it
belongs.



I ran a few quick piglit tests on my Broadwell and it seems to work.
I also sent it off to our CI system and I'll have those results in the
morning.



Thanks, Jason!  I appreciate you taking the time to do this.

-Brian


I looked through the patch for real this time, and everything looks
pretty good.  thanks for cleaning things up and adding spec comments.
The non-Intel bits are

Reviewed-by: Jason Ekstrand 

Someone else should review the intel bits since I wrote them.  Also,
FWIW, I ran this patch through our CI system and there are no piglit
regressions on any Intel platform.


The intel changes looked ok to me, FWIW.  Can we get this committed so I
can commit the rest of my series?


Ping, Jason.

-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH rework] mesa: rework Driver.CopyImageSubData() and related code

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 4:41 PM, Brian Paul  wrote:
> On 09/04/2015 12:39 PM, Brian Paul wrote:
>>
>> On 08/28/2015 11:03 AM, Jason Ekstrand wrote:
>>>
>>> On Fri, Aug 28, 2015 at 7:43 AM, Brian Paul  wrote:

 On 08/28/2015 12:43 AM, Jason Ekstrand wrote:
>
>
> On Thu, Aug 27, 2015 at 11:42 PM, Jason Ekstrand 
> wrote:
>>
>>
>> From: Brian Paul 
>>
>> Previously, core Mesa's _mesa_CopyImageSubData() created temporary
>> textures
>> to wrap renderbuffer sources/destinations.  This caused a bit of a
>> mess
>> in
>> the Mesa/gallium state tracker because we had to basically undo that
>> wrapping.
>>
>> Instead, change ctx->Driver.CopyImageSubData() to take both
>> gl_renderbuffer
>> and gl_texture_image src/dst pointers (one being null, the other
>> non-null)
>> so the driver can handle renderbuffer vs. texture as needed.
>>
>> For the i965 driver, we basically moved the code that wrapped textures
>> around renderbuffers from copyimage.c down into the driver.  So that
>> approach is still used there as before.
>>
>> The old code in copyimage.c also made some questionable calls to
>> _mesa_BindTexture(), etc. which weren't undone at the end.
>>
>> v2 (Jason Ekstrand): Rework the intel bits
>> ---
>>
>> TBH, I haven't actually reviewed the rest of the patch yet but I
>> figured
>> I'd save Brian the time of guess-and-check implementing it on
>> Intel.  I
>> completely reworked the intel bits so we now use the tex/rb
>> directly and
>> pun the renderbuffer wrapping all the way down to the meta layer
>> where it
>> belongs.
>
>
>
> I ran a few quick piglit tests on my Broadwell and it seems to work.
> I also sent it off to our CI system and I'll have those results in the
> morning.



 Thanks, Jason!  I appreciate you taking the time to do this.

 -Brian
>>>
>>>
>>> I looked through the patch for real this time, and everything looks
>>> pretty good.  thanks for cleaning things up and adding spec comments.
>>> The non-Intel bits are
>>>
>>> Reviewed-by: Jason Ekstrand 
>>>
>>> Someone else should review the intel bits since I wrote them.  Also,
>>> FWIW, I ran this patch through our CI system and there are no piglit
>>> regressions on any Intel platform.
>>
>>
>> The intel changes looked ok to me, FWIW.  Can we get this committed so I
>> can commit the rest of my series?
>
>
> Ping, Jason.

You got another review from Topi.  That's good enough for me.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] r600g/sb: Support gs5 sampler indexing

2015-09-21 Thread Edward O'Callaghan

Reviewed-by: Edward O'Callaghan 

-- 
  Edward O'Callaghan
  edward.ocallag...@koparo.com

On Tue, Sep 22, 2015, at 12:21 AM, Glenn Kennard wrote:
> Signed-off-by: Glenn Kennard 
> ---
> Just UBO support left before gs5 can be enabled.
> Could improve how the two index registers are set/used to reduce
> the number of clauses, but as is its about as good as what the blob
> emits.
> 
>  src/gallium/drivers/r600/r600_shader.c   |  12 ++-
>  src/gallium/drivers/r600/r600_shader.h   |   4 +-
>  src/gallium/drivers/r600/sb/sb_bc.h  |  10 ++-
>  src/gallium/drivers/r600/sb/sb_bc_dump.cpp   |  17 +++-
>  src/gallium/drivers/r600/sb/sb_bc_parser.cpp |  50 +++-
>  src/gallium/drivers/r600/sb/sb_gcm.cpp   |  11 ++-
>  src/gallium/drivers/r600/sb/sb_sched.cpp | 118
>  +--
>  src/gallium/drivers/r600/sb/sb_sched.h   |   5 +-
>  8 files changed, 201 insertions(+), 26 deletions(-)
> 
> diff --git a/src/gallium/drivers/r600/r600_shader.c
> b/src/gallium/drivers/r600/r600_shader.c
> index 1d90582..24c3d43 100644
> --- a/src/gallium/drivers/r600/r600_shader.c
> +++ b/src/gallium/drivers/r600/r600_shader.c
> @@ -166,8 +166,8 @@ int r600_pipe_shader_create(struct pipe_context *ctx,
>  if (rctx->b.chip_class <= R700) {
>   use_sb &= (shader->shader.processor_type != 
> TGSI_PROCESSOR_GEOMETRY);
>  }
> -   /* disable SB for shaders using CF_INDEX_0/1 (sampler/ubo array
> indexing) as it doesn't handle those currently */
> -   use_sb &= !shader->shader.uses_index_registers;
> +   /* disable SB for shaders using ubo array indexing as it doesn't
> handle those currently */
> +   use_sb &= !shader->shader.uses_ubo_indexing;
>   /* disable SB for shaders using doubles */
>   use_sb &= !shader->shader.uses_doubles;
>  
> @@ -1251,7 +1251,7 @@ static int tgsi_split_constant(struct
> r600_shader_ctx *ctx)
>   }
>  
>   if (ctx->src[i].kc_rel)
> -   ctx->shader->uses_index_registers = true;
> +   ctx->shader->uses_ubo_indexing = true;
>  
>   if (ctx->src[i].rel) {
>   int chan = inst->Src[i].Indirect.Swizzle;
> @@ -1912,7 +1912,7 @@ static int r600_shader_from_tgsi(struct
> r600_context *rctx,
>  
>   shader->uses_doubles = ctx.info.uses_doubles;
>  
> -   indirect_gprs = ctx.info.indirect_files & ~(1 <<
> TGSI_FILE_CONSTANT);
> +   indirect_gprs = ctx.info.indirect_files & ~((1 <<
> TGSI_FILE_CONSTANT) | (1 << TGSI_FILE_SAMPLER));
>   tgsi_parse_init(&ctx.parse, tokens);
>   ctx.type = ctx.info.processor;
>   shader->processor_type = ctx.type;
> @@ -1936,7 +1936,7 @@ static int r600_shader_from_tgsi(struct
> r600_context *rctx,
>   ctx.gs_next_vertex = 0;
>   ctx.gs_stream_output_info = &so;
>  
> -   shader->uses_index_registers = false;
> +   shader->uses_ubo_indexing = false;
>   ctx.face_gpr = -1;
>   ctx.fixed_pt_position_gpr = -1;
>   ctx.fragcoord_input = -1;
> @@ -5703,8 +5703,6 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
>   sampler_src_reg = 3;
>  
>   sampler_index_mode = inst->Src[sampler_src_reg].Indirect.Index == 2 ? 2 
> : 0; // CF_INDEX_1 : CF_INDEX_NONE
> -   if (sampler_index_mode)
> -   ctx->shader->uses_index_registers = true;
>  
>   src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
>  
> diff --git a/src/gallium/drivers/r600/r600_shader.h
> b/src/gallium/drivers/r600/r600_shader.h
> index 48de9cd..8ba32ae 100644
> --- a/src/gallium/drivers/r600/r600_shader.h
> +++ b/src/gallium/drivers/r600/r600_shader.h
> @@ -75,8 +75,8 @@ struct r600_shader {
>   boolean has_txq_cube_array_z_comp;
>   boolean uses_tex_buffers;
>   boolean gs_prim_id_input;
> -   /* Temporarily workaround SB not handling CF_INDEX_[01] index
> registers */
> -   boolean uses_index_registers;
> +   /* Temporarily workaround SB not handling ubo indexing */
> +   boolean uses_ubo_indexing;
>  
>   /* Size in bytes of a data item in the ring(s) (single vertex data).
>  Stages with only one ring items 123 will be set to 0. */
> diff --git a/src/gallium/drivers/r600/sb/sb_bc.h
> b/src/gallium/drivers/r600/sb/sb_bc.h
> index ab988f8..126750d 100644
> --- a/src/gallium/drivers/r600/sb/sb_bc.h
> +++ b/src/gallium/drivers/r600/sb/sb_bc.h
> @@ -48,6 +48,7 @@ class fetch_node;
>  class alu_group_node;
>  class region_node;
>  class shader;
> +class value;
>  
>  class sb_ostream {
>  public:
> @@ -818,13 +819,16 @@ class bc_parser {
>  
>   bool gpr_reladdr;
>  
> +   // Note: currently relies on input emitting SET_CF in same basic
> block as uses
> +   value *cf_index_value[2];
> +   alu_node *mova;
>  public:
>  
>   bc_parser(sb_context &sctx, r600_bytecode *bc, r600_shader* pshader) :
>   ctx(sctx), dec(), bc(bc), pshader(pshade

Re: [Mesa-dev] [PATCH 0/5] Enable up to 24 MRF registers in gen6

2015-09-21 Thread Mark Janes

Hi Iago,

According to my tests, this patch series fixes the gles2/gles3
"functional.uniform_api.random.23" tests in dEQP, on sandybridge.

Do you see the same results?  Should this patch series be applied to the
stable branch?

thanks,

Mark

Iago Toral Quiroga  writes:

> There are some bug reports about shaders failing to compile in gen6
> because MRF 14 is used when we need to spill. For example:
> https://bugs.freedesktop.org/show_bug.cgi?id=86469
> https://bugs.freedesktop.org/show_bug.cgi?id=90631
>
> Discussion in bugzilla pointed to the fact that gen6 might actually have
> 24 MRF registers available instead of 16, so we could use other MRF
> registers and avoid these conflicts (we still need to investigate why
> some shaders need up to MRF 14 anyway, since this is not expected).
>
> Notice that the hardware docs are not clear about this fact:
>
> SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device
> Hardware" says "Number per Thread" - "24 registers"
>
> However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says:
>
> "Normal threads should construct their messages in m1..m15. (...)
> Regardless of actual hardware implementation, the thread should
> not assume th at MRF addresses above m15 wrap to legal MRF registers."
>
> Therefore experimentation was necessary to evaluate if we had these extra
> MRF registers available or not. This was tested in gen6 using MRF
> registers 21..23 for spilling and doing a full piglit run (all.py) forcing
> spilling of everything on the FS backend. It was also tested by doing
> spilling of everything on both the FS and the VS backends with a piglit run
> of shader.py. In both cases no regressions were observed. In fact, many of
> these tests where helped in the cases where we forced spilling, since that
> triggered the same underlying problem described in the bug reports. Here are
> some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on
> gen6 hardware:
>
> Using MRFs 13..15 for spilling:
> crash: 2, fail: 113, pass: 6621, skip: 5461
> 
> Using MRFs 21..23 for spilling:
> crash: 2, fail: 12, pass: 6722, skip: 5461
>
> We might want to test this further with other instances of gen6 hardware
> though... I am not sure that we can safely conclude that all implementations
> of gen6 hardware have 24 MRF registers from my tests on just one particular
> SandyBridge laptop.
>
> Iago Toral Quiroga (5):
>   i965: Move MRF register asserts out of brw_reg.h
>   i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
>   i965/fs: Use MRF registers 21-23 for spilling in gen6
>   i965/vec4: Use MRF registers 21-23 for spilling in gen6
>   i965: Maximum allowed size of SEND messages is 15 (4 bits)
>
>  src/mesa/drivers/dri/i965/brw_eu_emit.c| 11 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 15 
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 
> --
>  src/mesa/drivers/dri/i965/brw_inst.h   |  3 +++
>  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
>  src/mesa/drivers/dri/i965/brw_reg.h|  9 
>  .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   | 10 +---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 15 +++-
>  10 files changed, 61 insertions(+), 39 deletions(-)
>
> -- 
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/3] i965: Delete all of the non-NIR vec4 code

2015-09-21 Thread Matt Turner

On Mon, Sep 21, 2015 at 3:18 PM, Jason Ekstrand  wrote:
> At this point, piglit is the same as for GLSL and the shader-db numbers are
> looking pretty good.  On SNB, GLSL vs. NIR for vec4 programs is:
>
>total instructions in shared programs: 2020573 -> 1822601 (-9.80%)
>instructions in affected programs: 1883334 -> 1685362 (-10.51%)
>helped:13328
>HURT:  3594
>
> and there are patches on the list that improve this to
>
>total instructions in shared programs: 2020283 -> 1805487 (-10.63%)
>instructions in affected programs: 1855759 -> 1640963 (-11.57%)
>helped:14142
>HURT:  2346

Wow, that's great. I didn't realize we were that close.

That said, I don't feel like we're /quite/ ready for this (especially
with outstanding optimization patches on the list). I'm not sure what
patches are pending.

Some things I've seen in digging through hurt programs today:

portal-2/high/5134 emits:

vec1 ssa_53 = flog2 ssa_52
vec1 ssa_54 = flog2 ssa_52.y
vec1 ssa_55 = flog2 ssa_52.z
vec4 ssa_56 = vec4 ssa_53, ssa_54, ssa_55, ssa_42.w
vec3 ssa_57 = fmul ssa_56, ssa_3
vec1 ssa_58 = fexp2 ssa_57
vec1 ssa_59 = fexp2 ssa_57.y
vec1 ssa_60 = fexp2 ssa_57.z
vec4 ssa_61 = vec4 ssa_58, ssa_59, ssa_60, ssa_42.w

which we didn't transform into a vec3 pow with or without NIR but we
really should. Why isn't NIR able to handle this? (also, why isn't
".x" printed when the use of an ssa value scalar, e.g., in the
assignment of ssa_58 the RHS should use ssa_57.x).

We generate worse code for all_equal/any_nequal/any.

book-of-unwritten-tales/original/vp-33 (a vertex program) emits uses
DPH and NIR doesn't have DPH. NIR should probably grow a DPH
instruction even if we don't have an optimization to recognize
open-coded DPH.

Lots of things hurt because of lack of global copy/constant
propagation. I think NIR often emits the constant loads in blocks
earlier than their uses and the backend optimizations aren't able to
cope. See team-fortress-2/2197 for example (search for 953267991D, the
hex value for 0.0001F).

I remember this issue from the FS/NIR backend as well, but dota-2/504
(and others) emit:

mad(8)  g16<1>.xF  g11<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g19<1>.xF  g10<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g22<1>.xF  g9<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g25<1>.xF  g8<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g28<1>.xF  g7<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
mad(8)  g31<1>.xF  g6<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF

where the multiplication is duplicated. I can't remember what we decided.

Some instruction sequences are improved with MAD, some are hurt. I
have seen Eduardo's patch and will have to think some more about it.
As far as I can tell, this one has affects the most shaders.

A good number of dashboard2 shaders are hurt significantly (+20%), but
it's not obvious to me why. I'll keep looking.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965/vec4: Detect and delete useless MOVs.

2015-09-21 Thread Kenneth Graunke

On Monday, September 21, 2015 02:27:00 PM Matt Turner wrote:
[snip]
> Yeah, the block immediately above handles conditions like this:
> 
> if (inst->opcode != BRW_OPCODE_MOV ||
> (inst->dst.file != GRF && inst->dst.file != MRF) ||
> inst->predicate ||
> inst->src[0].file != GRF ||
> inst->dst.type != inst->src[0].type ||
> inst->src[0].abs || inst->src[0].negate || inst->src[0].reladdr)
>continue;

D'oh, thanks.  I was thinking copy propagation for some reason...

Reviewed-by: Kenneth Graunke 

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/3] i965: Delete all of the non-NIR vec4 code

2015-09-21 Thread Jason Ekstrand

On Sep 21, 2015 5:45 PM, "Matt Turner"  wrote:
>
> On Mon, Sep 21, 2015 at 3:18 PM, Jason Ekstrand 
wrote:
> > At this point, piglit is the same as for GLSL and the shader-db numbers
are
> > looking pretty good.  On SNB, GLSL vs. NIR for vec4 programs is:
> >
> >total instructions in shared programs: 2020573 -> 1822601 (-9.80%)
> >instructions in affected programs: 1883334 -> 1685362 (-10.51%)
> >helped:13328
> >HURT:  3594
> >
> > and there are patches on the list that improve this to
> >
> >total instructions in shared programs: 2020283 -> 1805487 (-10.63%)
> >instructions in affected programs: 1855759 -> 1640963 (-11.57%)
> >helped:14142
> >HURT:  2346
>
> Wow, that's great. I didn't realize we were that close.
>
> That said, I don't feel like we're /quite/ ready for this (especially
> with outstanding optimization patches on the list). I'm not sure what
> patches are pending.

Only two: the one you sent today and Alejandro's patch to make copy
propagation less type-sensitive.

> Some things I've seen in digging through hurt programs today:
>
> portal-2/high/5134 emits:
>
> vec1 ssa_53 = flog2 ssa_52
> vec1 ssa_54 = flog2 ssa_52.y
> vec1 ssa_55 = flog2 ssa_52.z
> vec4 ssa_56 = vec4 ssa_53, ssa_54, ssa_55, ssa_42.w
> vec3 ssa_57 = fmul ssa_56, ssa_3
> vec1 ssa_58 = fexp2 ssa_57
> vec1 ssa_59 = fexp2 ssa_57.y
> vec1 ssa_60 = fexp2 ssa_57.z
> vec4 ssa_61 = vec4 ssa_58, ssa_59, ssa_60, ssa_42.w
>
> which we didn't transform into a vec3 pow with or without NIR but we
> really should. Why isn't NIR able to handle this? (also, why isn't
> ".x" printed when the use of an ssa value scalar, e.g., in the
> assignment of ssa_58 the RHS should use ssa_57.x).
>
>
> We generate worse code for all_equal/any_nequal/any.
>
>
> book-of-unwritten-tales/original/vp-33 (a vertex program) emits uses
> DPH and NIR doesn't have DPH. NIR should probably grow a DPH
> instruction even if we don't have an optimization to recognize
> open-coded DPH.
>
>
> Lots of things hurt because of lack of global copy/constant
> propagation. I think NIR often emits the constant loads in blocks
> earlier than their uses and the backend optimizations aren't able to
> cope. See team-fortress-2/2197 for example (search for 953267991D, the
> hex value for 0.0001F).
>
>
> I remember this issue from the FS/NIR backend as well, but dota-2/504
> (and others) emit:
>
> mad(8)  g16<1>.xF  g11<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
> mad(8)  g19<1>.xF  g10<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
> mad(8)  g22<1>.xF  g9<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
> mad(8)  g25<1>.xF  g8<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
> mad(8)  g28<1>.xF  g7<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
> mad(8)  g31<1>.xF  g6<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
>
> where the multiplication is duplicated. I can't remember what we decided.
>
>
> Some instruction sequences are improved with MAD, some are hurt. I
> have seen Eduardo's patch and will have to think some more about it.
> As far as I can tell, this one has affects the most shaders.
>
>
> A good number of dashboard2 shaders are hurt significantly (+20%), but
> it's not obvious to me why. I'll keep looking.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH shader-db] report.py: Report loop changes separately.

2015-09-21 Thread Matt Turner

Previously, if the number of loops changed (and the instruction count
changed wildly) we'd happily consider those differences in the totals,
even though they're not meaningful.

Based on code by Abdiel Janulgue .
---
 report.py | 43 ---
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/report.py b/report.py
index 1c1e5ac..4c06714 100755
--- a/report.py
+++ b/report.py
@@ -10,7 +10,7 @@ def get_results(filename):
 
 results = {}
 
-re_match = re.compile(r"(\S+) - (.S \S+) shader: (\S*)")
+re_match = re.compile(r"(\S+) - (.S \S+) shader: (\S*) inst, (\S*) loops")
 for line in lines:
 match = re.search(re_match, line)
 if match is None:
@@ -18,8 +18,9 @@ def get_results(filename):
 
 groups = match.groups()
 count = int(groups[2])
+loop = int(groups[3])
 if count != 0:
-results[(groups[0], groups[1])] = count
+results[(groups[0], groups[1])] = count, loop
 
 return results
 
@@ -51,6 +52,8 @@ def main():
 
 total_before = 0
 total_after = 0
+total_before_loop = 0
+total_after_loop = 0
 affected_before = 0
 affected_after = 0
 
@@ -58,22 +61,31 @@ def main():
 hurt = []
 lost = []
 gained = []
+loop_change = []
 for p in args.before:
 (name, type) = p
 namestr = name + " " + type
-before_count = args.before[p]
+before_count = args.before[p][0]
+before_loop = args.before[p][1]
 
 if args.after.get(p) is not None:
-after_count = args.after[p]
+after_count = args.after[p][0]
+after_loop = args.after[p][1]
 
-total_before += before_count
-total_after += after_count
+total_before_loop += before_loop
+total_after_loop += after_loop
+
+if before_loop == after_loop:
+total_before += before_count
+total_after += after_count
 
 if before_count != after_count:
 affected_before += before_count
 affected_after += after_count
 
-if after_count > before_count:
+if after_loop != before_loop:
+loop_change.append(p);
+elif after_count > before_count:
 hurt.append(p)
 else:
 helped.append(p)
@@ -85,23 +97,30 @@ def main():
 gained.append(p[0] + " " + p[1])
 
 helped.sort(
-key=lambda k: float(args.before[k] - args.after[k]) / args.before[k])
+key=lambda k: float(args.before[k][0] - args.after[k][0]) / 
args.before[k][0])
 for p in helped:
 namestr = p[0] + " " + p[1]
 print("helped:   " + get_result_string(
-namestr, args.before[p], args.after[p]))
+namestr, args.before[p][0], args.after[p][0]))
 if len(helped) > 0:
 print("")
 
 hurt.sort(
-key=lambda k: float(args.after[k] - args.before[k]) / args.before[k])
+key=lambda k: float(args.after[k][0] - args.before[k][0]) / 
args.before[k][0])
 for p in hurt:
 namestr = p[0] + " " + p[1]
 print("HURT:   " + get_result_string(
-namestr, args.before[p], args.after[p]))
+namestr, args.before[p][0], args.after[p][0]))
 if len(hurt) > 0:
 print("")
 
+for p in loop_change:
+namestr = p[0] + " " + p[1]
+print("LOOP CHANGE (" + str(args.before[p][1]) + " -> " + 
str(args.after[p][1]) +
+"): " + get_result_string(namestr, args.before[p][0], 
args.after[p][0]))
+if len(loop_change) > 0:
+print("")
+
 lost.sort()
 for p in lost:
 print("LOST:   " + p)
@@ -116,12 +135,14 @@ def main():
 
 print("total instructions in shared programs: {}\n"
   "instructions in affected programs: {}\n"
+  "total loops in shared programs:{}\n"
   "helped:{}\n"
   "HURT:  {}\n"
   "GAINED:{}\n"
   "LOST:  {}".format(
   change(total_before, total_after),
   change(affected_before, affected_after),
+  change(total_before_loop, total_after_loop),
   len(helped),
   len(hurt),
   len(gained),
-- 
2.4.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/3] i965: Delete all of the non-NIR vec4 code

2015-09-21 Thread Jason Ekstrand

On Mon, Sep 21, 2015 at 6:15 PM, Jason Ekstrand  wrote:
>
> On Sep 21, 2015 5:45 PM, "Matt Turner"  wrote:
>>
>> On Mon, Sep 21, 2015 at 3:18 PM, Jason Ekstrand 
>> wrote:
>> > At this point, piglit is the same as for GLSL and the shader-db numbers
>> > are
>> > looking pretty good.  On SNB, GLSL vs. NIR for vec4 programs is:
>> >
>> >total instructions in shared programs: 2020573 -> 1822601 (-9.80%)
>> >instructions in affected programs: 1883334 -> 1685362 (-10.51%)
>> >helped:13328
>> >HURT:  3594
>> >
>> > and there are patches on the list that improve this to
>> >
>> >total instructions in shared programs: 2020283 -> 1805487 (-10.63%)
>> >instructions in affected programs: 1855759 -> 1640963 (-11.57%)
>> >helped:14142
>> >HURT:  2346
>>
>> Wow, that's great. I didn't realize we were that close.
>>
>> That said, I don't feel like we're /quite/ ready for this (especially
>> with outstanding optimization patches on the list). I'm not sure what
>> patches are pending.
>
> Only two: the one you sent today and Alejandro's patch to make copy
> propagation less type-sensitive.
>
>> Some things I've seen in digging through hurt programs today:
>>
>> portal-2/high/5134 emits:
>>
>> vec1 ssa_53 = flog2 ssa_52
>> vec1 ssa_54 = flog2 ssa_52.y
>> vec1 ssa_55 = flog2 ssa_52.z
>> vec4 ssa_56 = vec4 ssa_53, ssa_54, ssa_55, ssa_42.w
>> vec3 ssa_57 = fmul ssa_56, ssa_3
>> vec1 ssa_58 = fexp2 ssa_57
>> vec1 ssa_59 = fexp2 ssa_57.y
>> vec1 ssa_60 = fexp2 ssa_57.z
>> vec4 ssa_61 = vec4 ssa_58, ssa_59, ssa_60, ssa_42.w
>>
>> which we didn't transform into a vec3 pow with or without NIR but we
>> really should. Why isn't NIR able to handle this?

Ken and I were talking about this today.  What it comes down to is
that no one has written the pass yet.  We haven't done that many
vector optimizations to date.

>> (also, why isn't
>> ".x" printed when the use of an ssa value scalar, e.g., in the
>> assignment of ssa_58 the RHS should use ssa_57.x).

It doesn't print the identity swizzle.

>> We generate worse code for all_equal/any_nequal/any.

Yes, we should fix that.  Suggestions/patches welcome, I don't have
any hot ideas at the moment.

>> book-of-unwritten-tales/original/vp-33 (a vertex program) emits uses
>> DPH and NIR doesn't have DPH. NIR should probably grow a DPH
>> instruction even if we don't have an optimization to recognize
>> open-coded DPH.

We could detect fdot(vec4(a.x, a.y, a.z, 1)) in the backend if we
really wanted to.  The long-term solution is probably to add swizzle
support to nir_search but that's going to be a real bear.  How would
having the nir_op_dph instruction help if we can't recognize it?

>> Lots of things hurt because of lack of global copy/constant
>> propagation. I think NIR often emits the constant loads in blocks
>> earlier than their uses and the backend optimizations aren't able to
>> cope. See team-fortress-2/2197 for example (search for 953267991D, the
>> hex value for 0.0001F).

Hrm...  One option would be to copy-prop load_const in emit_alu.  This
should be easy enough to do if we detect that it's a 2-src and one
source is an immediate.  We could also do global copy-prop but I don't
know how hard that is.

>> I remember this issue from the FS/NIR backend as well, but dota-2/504
>> (and others) emit:
>>
>> mad(8)  g16<1>.xF  g11<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
>> mad(8)  g19<1>.xF  g10<4,4,1>.xF  g12<4,4,1>.xF  g2<4,4,1>.xF
>> mad(8)  g22<1>.xF  g9<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
>> mad(8)  g25<1>.xF  g8<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
>> mad(8)  g28<1>.xF  g7<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
>> mad(8)  g31<1>.xF  g6<4,4,1>.xF   g12<4,4,1>.xF  g2<4,4,1>.xF
>>
>> where the multiplication is duplicated. I can't remember what we decided.

If I remember correctly, it came down to "optimization is hard" and we
said "good enough" about our current heuristics.

>> Some instruction sequences are improved with MAD, some are hurt. I
>> have seen Eduardo's patch and will have to think some more about it.
>> As far as I can tell, this one has affects the most shaders.
>>
>>
>> A good number of dashboard2 shaders are hurt significantly (+20%), but
>> it's not obvious to me why. I'll keep looking.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nir: Always print non-xyzw swizzles.

2015-09-21 Thread Matt Turner

Previously we would not print a swizzle on ssa_52 when only its .x
component is used (as seen in the definition of ssa_53):

   vec3 ssa_52 = fadd ssa_51, ssa_51
   vec1 ssa_53 = flog2 ssa_52
   vec1 ssa_54 = flog2 ssa_52.y
   vec1 ssa_55 = flog2 ssa_52.z

But this makes the interpretation of the RHS of the definition difficult
to understand and dependent on the size of the LHS. Just print swizzles
when they are not .xyzw (which is only possible on vec4 uses), so the
previous example is now printed as:

   vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz
   vec1 ssa_53 = flog2 ssa_52.x
   vec1 ssa_54 = flog2 ssa_52.y
   vec1 ssa_55 = flog2 ssa_52.z
---
 src/glsl/nir/nir_print.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index a19aa8b..f72c062 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -158,17 +158,20 @@ print_alu_src(nir_alu_instr *instr, unsigned src, 
print_state *state)
print_src(&instr->src[src].src, state);
 
bool print_swizzle = false;
+   unsigned used_channels = 0;
for (unsigned i = 0; i < 4; i++) {
   if (!nir_alu_instr_channel_used(instr, src, i))
  continue;
 
+  used_channels++;
+
   if (instr->src[src].swizzle[i] != i) {
  print_swizzle = true;
  break;
   }
}
 
-   if (print_swizzle) {
+   if (print_swizzle || used_channels != 4) {
   fprintf(fp, ".");
   for (unsigned i = 0; i < 4; i++) {
  if (!nir_alu_instr_channel_used(instr, src, i))
-- 
2.4.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: Always print non-xyzw swizzles.

2015-09-21 Thread Ilia Mirkin

On Tue, Sep 22, 2015 at 12:31 AM, Matt Turner  wrote:
> Previously we would not print a swizzle on ssa_52 when only its .x
> component is used (as seen in the definition of ssa_53):
>
>vec3 ssa_52 = fadd ssa_51, ssa_51
>vec1 ssa_53 = flog2 ssa_52
>vec1 ssa_54 = flog2 ssa_52.y
>vec1 ssa_55 = flog2 ssa_52.z
>
> But this makes the interpretation of the RHS of the definition difficult
> to understand and dependent on the size of the LHS. Just print swizzles
> when they are not .xyzw (which is only possible on vec4 uses), so the
> previous example is now printed as:
>
>vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz

IMHO if ssa_51 is a vec3, this makes sense without the .xyz. I'd
change the condition to print the identity swizzle only if source size
!= output size, not sure if that's easy to do though.

>vec1 ssa_53 = flog2 ssa_52.x
>vec1 ssa_54 = flog2 ssa_52.y
>vec1 ssa_55 = flog2 ssa_52.z
> ---
>  src/glsl/nir/nir_print.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
> index a19aa8b..f72c062 100644
> --- a/src/glsl/nir/nir_print.c
> +++ b/src/glsl/nir/nir_print.c
> @@ -158,17 +158,20 @@ print_alu_src(nir_alu_instr *instr, unsigned src, 
> print_state *state)
> print_src(&instr->src[src].src, state);
>
> bool print_swizzle = false;
> +   unsigned used_channels = 0;
> for (unsigned i = 0; i < 4; i++) {
>if (!nir_alu_instr_channel_used(instr, src, i))
>   continue;
>
> +  used_channels++;
> +
>if (instr->src[src].swizzle[i] != i) {
>   print_swizzle = true;
>   break;
>}
> }
>
> -   if (print_swizzle) {
> +   if (print_swizzle || used_channels != 4) {
>fprintf(fp, ".");
>for (unsigned i = 0; i < 4; i++) {
>   if (!nir_alu_instr_channel_used(instr, src, i))
> --
> 2.4.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: Always print non-xyzw swizzles.

2015-09-21 Thread Matt Turner

On Mon, Sep 21, 2015 at 9:37 PM, Ilia Mirkin  wrote:
> On Tue, Sep 22, 2015 at 12:31 AM, Matt Turner  wrote:
>> Previously we would not print a swizzle on ssa_52 when only its .x
>> component is used (as seen in the definition of ssa_53):
>>
>>vec3 ssa_52 = fadd ssa_51, ssa_51
>>vec1 ssa_53 = flog2 ssa_52
>>vec1 ssa_54 = flog2 ssa_52.y
>>vec1 ssa_55 = flog2 ssa_52.z
>>
>> But this makes the interpretation of the RHS of the definition difficult
>> to understand and dependent on the size of the LHS. Just print swizzles
>> when they are not .xyzw (which is only possible on vec4 uses), so the
>> previous example is now printed as:
>>
>>vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz
>
> IMHO if ssa_51 is a vec3, this makes sense without the .xyz. I'd
> change the condition to print the identity swizzle only if source size
> != output size, not sure if that's easy to do though.

I actually like it better as is for the same reason as stated -- you
don't have to know the type of the ssa value to know what it is.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: Always print non-xyzw swizzles.

2015-09-21 Thread Ilia Mirkin

On Tue, Sep 22, 2015 at 12:45 AM, Matt Turner  wrote:
> On Mon, Sep 21, 2015 at 9:37 PM, Ilia Mirkin  wrote:
>> On Tue, Sep 22, 2015 at 12:31 AM, Matt Turner  wrote:
>>> Previously we would not print a swizzle on ssa_52 when only its .x
>>> component is used (as seen in the definition of ssa_53):
>>>
>>>vec3 ssa_52 = fadd ssa_51, ssa_51
>>>vec1 ssa_53 = flog2 ssa_52
>>>vec1 ssa_54 = flog2 ssa_52.y
>>>vec1 ssa_55 = flog2 ssa_52.z
>>>
>>> But this makes the interpretation of the RHS of the definition difficult
>>> to understand and dependent on the size of the LHS. Just print swizzles
>>> when they are not .xyzw (which is only possible on vec4 uses), so the
>>> previous example is now printed as:
>>>
>>>vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz
>>
>> IMHO if ssa_51 is a vec3, this makes sense without the .xyz. I'd
>> change the condition to print the identity swizzle only if source size
>> != output size, not sure if that's easy to do though.
>
> I actually like it better as is for the same reason as stated -- you
> don't have to know the type of the ssa value to know what it is.

Well, you'd know based on whether the swizzle is printed or not. If no
swizzle, then it's the same size as the dest. Anyways, your way is an
argument to always just print the swizzle, for vec4's as well.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] nir: Always print non-xyzw swizzles.

2015-09-21 Thread Jason Ekstrand

On Sep 21, 2015 9:48 PM, "Ilia Mirkin"  wrote:
>
> On Tue, Sep 22, 2015 at 12:45 AM, Matt Turner  wrote:
> > On Mon, Sep 21, 2015 at 9:37 PM, Ilia Mirkin 
wrote:
> >> On Tue, Sep 22, 2015 at 12:31 AM, Matt Turner 
wrote:
> >>> Previously we would not print a swizzle on ssa_52 when only its .x
> >>> component is used (as seen in the definition of ssa_53):
> >>>
> >>>vec3 ssa_52 = fadd ssa_51, ssa_51
> >>>vec1 ssa_53 = flog2 ssa_52
> >>>vec1 ssa_54 = flog2 ssa_52.y
> >>>vec1 ssa_55 = flog2 ssa_52.z
> >>>
> >>> But this makes the interpretation of the RHS of the definition
difficult
> >>> to understand and dependent on the size of the LHS. Just print
swizzles
> >>> when they are not .xyzw (which is only possible on vec4 uses), so the
> >>> previous example is now printed as:
> >>>
> >>>vec3 ssa_52 = fadd ssa_51.xyz, ssa_51.xyz
> >>
> >> IMHO if ssa_51 is a vec3, this makes sense without the .xyz. I'd
> >> change the condition to print the identity swizzle only if source size
> >> != output size, not sure if that's easy to do though.
> >
> > I actually like it better as is for the same reason as stated -- you
> > don't have to know the type of the ssa value to know what it is.
>
> Well, you'd know based on whether the swizzle is printed or not. If no
> swizzle, then it's the same size as the dest. Anyways, your way is an
> argument to always just print the swizzle, for vec4's as well.

I think I'd be inclined to agree with ilia. I fully agree that "vec1 ssa_25
= fadd ssa_4, ssa_7.y" is silly when ssa_4 is a vec3. However, I would like
to drop the swizzles whenever it's "clear from the context".  That probably
means source is the same size as destination, identity swizzle, and all
channels written.

--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] glsl: do not attempt to dump_shader if no shaderobj

2015-09-21 Thread Tapani Pälli




On 09/21/2015 05:04 PM, Iago Toral wrote:

On Mon, 2015-09-21 at 09:15 +0300, Tapani Pälli wrote:

Patch fixes a crash in conformance test that tries out different
invalid arguments for glShaderSource and glGetShaderSource:

ES2-CTS.gtf.GL.glGetShaderSource.getshadersource_programhandle

This is a regression from commit:
04e201d0c02cd30ace5c6fe80e9f021ebb733682

Signed-off-by: Tapani Pälli 
---
  src/mesa/main/shaderapi.c | 18 ++
  1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index f31980b..7733d02 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -1699,15 +1699,17 @@ _mesa_ShaderSource(GLhandleARB shaderObj, GLsizei count,
  #if defined(HAVE_SHA1)
 sh = _mesa_lookup_shader(ctx, shaderObj);


Why not call this earlier in that function (before we even process the
shader string) and return if it is NULL? There is no point in waiting
until this moment to check for that.

Then, when we call shader_source right below this code, we could just
pass the sh object directly instead of having that function call
_mesa_lookup_shader again (we could maybe replace that call with an
assert to make sure that we passed a valid shader object)


True, it can be simplified. I did not want to touch original code 
because it works but while being here I can give it a shot.



Iago


-   /* Dump original shader source to MESA_SHADER_DUMP_PATH and replace
-* if corresponding entry found from MESA_SHADER_READ_PATH.
-*/
-   dump_shader(sh->Stage, source);
+   if (sh) {
+  /* Dump original shader source to MESA_SHADER_DUMP_PATH and replace
+   * if corresponding entry found from MESA_SHADER_READ_PATH.
+   */
+  dump_shader(sh->Stage, source);

-   replacement = read_shader(sh->Stage, source);
-   if (replacement) {
-  free(source);
-  source = replacement;
+  replacement = read_shader(sh->Stage, source);
+  if (replacement) {
+ free(source);
+ source = replacement;
+  }
 }
  #endif /* HAVE_SHA1 */





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 92052] nir/nir_builder.h:79: error: expected primary-expression before ‘.’ token

2015-09-21 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=92052

Vinson Lee  changed:

   What|Removed |Added

 Status|NEEDINFO|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Vinson Lee  ---
commit 46362db4a6bb6db64727d3adcb16ca8f32aa70fb
Author: Jason Ekstrand 
Date:   Mon Sep 21 08:22:12 2015 -0700

nir/builder: Don't use designated initializers

Designated initializers are not allowed in C++ (not even C++11).  Since
nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in
C++, this breaks the build on some compilers.  Aparently, GCC 5 allows it
in some limited extent because mesa still builds on my system without this
patch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052
Reviewed-by: Kenneth Graunke 

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/5] Enable up to 24 MRF registers in gen6

2015-09-21 Thread Iago Toral

Hi Mark,

On Mon, 2015-09-21 at 17:45 -0700, Mark Janes wrote:
> Hi Iago,
> 
> According to my tests, this patch series fixes the gles2/gles3
> "functional.uniform_api.random.23" tests in dEQP, on sandybridge.
> 
> Do you see the same results?  Should this patch series be applied to the
> stable branch?

I can try to verify this, but I would not be surprised if that was the
case. This seems to fix a couple of bugs in bugzilla and it fixes a lot
of piglit tests in SNB when we force spilling, so if that test is
triggering spilling, there is a good chance that it had the same problem
and this patch fixed it. I'll try to confirm this today and let you
know.

As for whether this should go to stable, I suppose it depends on whether
we are confident that this won't break anything else... if you have not
seen any more issues with piglit/deqp on any other platform I guess
that's a good sign, but I'll let Ken make that call.

Iago

> thanks,
> 
> Mark
> 
> Iago Toral Quiroga  writes:
> 
> > There are some bug reports about shaders failing to compile in gen6
> > because MRF 14 is used when we need to spill. For example:
> > https://bugs.freedesktop.org/show_bug.cgi?id=86469
> > https://bugs.freedesktop.org/show_bug.cgi?id=90631
> >
> > Discussion in bugzilla pointed to the fact that gen6 might actually have
> > 24 MRF registers available instead of 16, so we could use other MRF
> > registers and avoid these conflicts (we still need to investigate why
> > some shaders need up to MRF 14 anyway, since this is not expected).
> >
> > Notice that the hardware docs are not clear about this fact:
> >
> > SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device
> > Hardware" says "Number per Thread" - "24 registers"
> >
> > However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says:
> >
> > "Normal threads should construct their messages in m1..m15. (...)
> > Regardless of actual hardware implementation, the thread should
> > not assume th at MRF addresses above m15 wrap to legal MRF registers."
> >
> > Therefore experimentation was necessary to evaluate if we had these extra
> > MRF registers available or not. This was tested in gen6 using MRF
> > registers 21..23 for spilling and doing a full piglit run (all.py) forcing
> > spilling of everything on the FS backend. It was also tested by doing
> > spilling of everything on both the FS and the VS backends with a piglit run
> > of shader.py. In both cases no regressions were observed. In fact, many of
> > these tests where helped in the cases where we forced spilling, since that
> > triggered the same underlying problem described in the bug reports. Here are
> > some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on
> > gen6 hardware:
> >
> > Using MRFs 13..15 for spilling:
> > crash: 2, fail: 113, pass: 6621, skip: 5461
> > 
> > Using MRFs 21..23 for spilling:
> > crash: 2, fail: 12, pass: 6722, skip: 5461
> >
> > We might want to test this further with other instances of gen6 hardware
> > though... I am not sure that we can safely conclude that all implementations
> > of gen6 hardware have 24 MRF registers from my tests on just one particular
> > SandyBridge laptop.
> >
> > Iago Toral Quiroga (5):
> >   i965: Move MRF register asserts out of brw_reg.h
> >   i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation
> >   i965/fs: Use MRF registers 21-23 for spilling in gen6
> >   i965/vec4: Use MRF registers 21-23 for spilling in gen6
> >   i965: Maximum allowed size of SEND messages is 15 (4 bits)
> >
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c| 11 +
> >  src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++--
> >  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 15 
> >  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  | 27 
> > --
> >  src/mesa/drivers/dri/i965/brw_inst.h   |  3 +++
> >  src/mesa/drivers/dri/i965/brw_ir_vec4.h|  2 +-
> >  src/mesa/drivers/dri/i965/brw_reg.h|  9 
> >  .../drivers/dri/i965/brw_schedule_instructions.cpp |  4 ++--
> >  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp   | 10 +---
> >  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 15 +++-
> >  10 files changed, 61 insertions(+), 39 deletions(-)
> >
> > -- 
> > 1.9.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 0/2] i965/vec4: Change SEL and MOV types as needed to propagate source modifiers

2015-09-21 Thread Thomas Helland

16. sep. 2015 21.47 skrev "Alejandro Piñeiro" :
>
> On the review of the patch "i965/nir/vec4: fill the type of the dst
> and src when loading an uniform" Jason Ekstrand suggested to change
> the optimization pass in order to allow the copy propagation with
> MOVs even if there is a type mismatch, as was done on the fs path,
> instead of fixing the type for MOV instructions.[1]
>
> So using commit 472ef9 as reference I implemented the equivalent
> for the vec4 case. But that only worked if it was the current
> instruction the MOV with default types. It didn't fixed the shader-db
> instruction count regression I was working on, that was when it was
> the from instruction the MOV with default types. Or in other words,
> it didn't cover this case:
>
>1: mov vgrf1.0:UD, u0.xyzw:UD
>2: add vgrf2.0:F, vgrf0.xyzw:F, -vgrf1.xyzw:F
>
> So I extended the same idea by checking too against the from
> instruction. In order to do that, I needed to also track
> the vec4_instructions on the copy_entry struct.
>
> Submitting two patches because I think that it will be easier
> to review in this way. But if this solutions is approved, I
> think that it could be better to push them squashed on just
> one patch.
>
> Shader-db results for vec4 programs on Haswell:
> total instructions in shared programs: 1746280 -> 1732159 (-0.81%)
> instructions in affected programs: 760595 -> 746474 (-1.86%)
> helped:6132
> HURT:  0
> GAINED:0
> LOST:  0
>
>
> [1]
http://lists.freedesktop.org/archives/mesa-dev/2015-September/094555.html
>
> Alejandro Piñeiro (2):
>   i965/vec4: Change types as needed to propagate source modifiers using
> current instruction
>   i965/vec4: Change types as needed to propagate source modifiers using
> from instruction
>
>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp | 45
--
>  1 file changed, 41 insertions(+), 4 deletions(-)
>

I don't know if it is of much use, but I implemented a function along these
lines
for my value range propagation pass, but that works in NIR.
It tries to infer the type based on the users or the sources.
It lives in [1] as get_alu_type_from_sources / dest.
Just thought I'd mention it. It's not perfect, but I think it should
be quite clear how I wanted it to work. Maybe it's off use, maybe not,
just thought I'd point you to it.

[1]
https://github.com/thohel/mesa/blob/gsoc-final-range-prop/src/glsl/nir/nir_opt_value_range.c
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

73 matches

Mail list logo