[Mesa-dev] [PATCH] vc4: make vc4_begin_query() returns a boolean

2015-05-22 Thread Samuel Pitoiset
I forgot to make the change in 96f164f6f047833091eb98a73aa80c31dc94f962.
This fixes a warning with GCC and probably an error with Clang.

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/vc4/vc4_query.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/vc4/vc4_query.c 
b/src/gallium/drivers/vc4/vc4_query.c
index 1792bec..270832e 100644
--- a/src/gallium/drivers/vc4/vc4_query.c
+++ b/src/gallium/drivers/vc4/vc4_query.c
@@ -50,9 +50,10 @@ vc4_destroy_query(struct pipe_context *ctx, struct 
pipe_query *query)
 free(query);
 }
 
-static void
+static boolean
 vc4_begin_query(struct pipe_context *ctx, struct pipe_query *query)
 {
+return true;
 }
 
 static void
-- 
2.4.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/20] i915: Enable intel_render path for points

2015-05-22 Thread Ville Syrjälä
On Thu, May 21, 2015 at 09:14:03PM +0300, Ville Syrjälä wrote:
> On Fri, May 15, 2015 at 12:18:11PM -0700, Ian Romanick wrote:
> > There are some really twitchy tests in ES1 (and possibly ES2)
> > conformance related to this.  Do any of those tests change with this commit?
> 
> I did run some ES1 conformnce tests, but the branches in the repo
> were not very clear so I'm not sure if I ran the right thing (looks
> like I used the "gles1" branch and managed to build something that
> at least runs).
> 
> No changes in the results on 855 or PNV between my branch and the
> baseline AFAICS.
> 
> I'll see if I can get the ES2 tests built as well, and run them on
> PNV.

OK, I managed to run the ES2 tests on PNV, and there are no changes
in the results.

> 
> > 
> > On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote:
> > > From: Ville Syrjälä 
> > > 
> > > The sub-pixel adjustment for points was killed off in
> > >  commit 60d762aa625095a8c1f9597d8530bb5a6fa61b4c
> > >  Author: Xiang, Haihao 
> > >  Date:   Wed Jan 2 11:38:51 2008 +0800
> > > 
> > > i915: Needn't adjust pixel centers. fix #12944
> > > 
> > > so if we don't need it in intel_tris.c we don't need it in
> > > intel_render.c either, which means we can allow
> > > intel_render.c to render points.
> > > 
> > > Signed-off-by: Ville Syrjälä 
> > > ---
> > >  src/mesa/drivers/dri/i915/intel_render.c | 8 +++-
> > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i915/intel_render.c 
> > > b/src/mesa/drivers/dri/i915/intel_render.c
> > > index 65ecd05..ef1c718 100644
> > > --- a/src/mesa/drivers/dri/i915/intel_render.c
> > > +++ b/src/mesa/drivers/dri/i915/intel_render.c
> > > @@ -54,9 +54,7 @@
> > >   * dma buffers.  Use strip/fan hardware primitives where possible.
> > >   * Try to simulate missing primitives with indexed vertices.
> > >   */
> > > -#define HAVE_POINTS  0  /* Has it, but can't use because 
> > > subpixel has to
> > > - * be adjusted for points on the 
> > > INTEL/I845G
> > > - */
> > > +#define HAVE_POINTS  1
> > >  #define HAVE_LINES   1
> > >  #define HAVE_LINE_STRIPS 1
> > >  #define HAVE_TRIANGLES   1
> > > @@ -70,7 +68,7 @@
> > >  #define HAVE_ELTS0
> > >  
> > >  static const uint32_t hw_prim[GL_POLYGON + 1] = {
> > > -   [GL_POINTS] = 0,
> > > +   [GL_POINTS] = PRIM3D_POINTLIST,
> > > [GL_LINES ] = PRIM3D_LINELIST,
> > > [GL_LINE_LOOP] = PRIM3D_LINESTRIP,
> > > [GL_LINE_STRIP] = PRIM3D_LINESTRIP,
> > > @@ -96,7 +94,7 @@ static const GLenum reduced_prim[GL_POLYGON + 1] = {
> > >  };
> > >  
> > >  static const int scale_prim[GL_POLYGON + 1] = {
> > > -   [GL_POINTS] = 0, /* fallback case */
> > > +   [GL_POINTS] = 1,
> > > [GL_LINES] = 1,
> > > [GL_LINE_LOOP] = 2,
> > > [GL_LINE_STRIP] = 2,
> > > 
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/20] t_dd_dmatmp: Kill the paths rendering quads/quad strips via indexed vertices

2015-05-22 Thread Ville Syrjälä
On Fri, May 15, 2015 at 12:06:54PM -0700, Ian Romanick wrote:
> On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä 
> > 
> > No driver supports elts currently, and these make the validate_render
> > code a bit hard to follow. Just kill them.
> 
> It looks like both r200_tcl.c and radeon_tcl.c define HAVE_TRI_STRIPS
> and HAVE_ELTS, but I guess they're using t_dd_dmatmp2.h.  If you add a
> 
> #if defined HAVE_ELTS
> #error "HAVE_ELTS support is removed"
> #endif
> 
> do radeon and r200 still build?

Yes, they only emit elts via t_dd_dmatmp2.h as you said.

And do note that I didn't remove elt support entirely, I just killed
off the more tricky code that emits elts from the vert paths. So the
code can still handle elts just fine if VB->Elts is present.

> 
> > Signed-off-by: Ville Syrjälä 
> > ---
> >  src/mesa/tnl_dd/t_dd_dmatmp.h | 133 
> > ++
> >  1 file changed, 5 insertions(+), 128 deletions(-)
> > 
> > diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h
> > index 52ea2bf..5ea2d31 100644
> > --- a/src/mesa/tnl_dd/t_dd_dmatmp.h
> > +++ b/src/mesa/tnl_dd/t_dd_dmatmp.h
> > @@ -36,7 +36,7 @@
> >   * Produces code for both inline triangles and indexed triangles.
> >   * Where various primitive types are unaccelerated by hardware, the
> >   * code attempts to fallback to other primitive types (quadstrips to
> > - * tristrips, lineloops to linestrips), or to indexed vertices.
> > + * tristrips, lineloops to linestrips).
> >   */
> >  
> >  #if !defined(HAVE_TRIANGLES)
> > @@ -444,65 +444,6 @@ static void TAG(render_quad_strip_verts)( struct 
> > gl_context *ctx,
> >}
> >  
> >FLUSH();
> > -
> > -   } else if (HAVE_TRI_STRIPS && 
> > - ctx->Light.ShadeModel == GL_FLAT &&
> > - TNL_CONTEXT(ctx)->vb.AttribPtr[_TNL_ATTRIB_COLOR0]->stride) {
> 
> I don't think removing this whole block is right because !HAVE_ELTS is
> an error case... but I guess we've never hit that error case since
> HAVE_ELTS is always zero.

There are no real error cases here, just dead code. validate_render() is
supposed to make sure we never call these functions if the code can't
actually render the primitives. The fprintf()+return branches should
really just contain assert(0) or equivalent.

> 
> > -  if (HAVE_ELTS) {
> > -LOCAL_VARS;
> > -int dmasz = GET_SUBSEQUENT_VB_MAX_ELTS();
> > -int currentsz;
> > -GLuint j, nr;
> > -
> > - EMIT_INDEXED_VERTS( ctx, start, count );
> > -
> > -/* Simulate flat-shaded quadstrips using indexed vertices:
> > - */
> > -ELT_INIT( GL_TRIANGLES );
> > -
> > -currentsz = GET_CURRENT_VB_MAX_ELTS();
> > -
> > -/* Emit whole number of quads in total, and in each buffer.
> > - */
> > -dmasz -= dmasz & 1;
> > -count -= (count-start) & 1;
> > -currentsz -= currentsz & 1;
> > -
> > -if (currentsz < 12)
> > -   currentsz = dmasz;
> > -
> > -currentsz = currentsz/6*2;
> > -dmasz = dmasz/6*2;
> > -
> > -for (j = start; j + 3 < count; j += nr - 2 ) {
> > -   nr = MIN2( currentsz, count - j );
> > -   if (nr >= 4) {
> > -  GLint quads = (nr/2)-1;
> > -  GLint i;
> > -  ELTS_VARS( ALLOC_ELTS( quads*6 ) );
> > -
> > -  for ( i = j-start ; i < j-start+quads*2 ; i+=2 ) {
> > - EMIT_TWO_ELTS( 0, (i+0), (i+1) );
> > - EMIT_TWO_ELTS( 2, (i+2), (i+1) );
> > - EMIT_TWO_ELTS( 4, (i+3), (i+2) );
> > - INCR_ELTS( 6 );
> > -  }
> > -
> > -  FLUSH();
> > -   }
> > -   currentsz = dmasz;
> > -}
> > -
> > -RELEASE_ELT_VERTS();
> > -FLUSH();
> > -  }
> > -  else {
> > -/* Vertices won't fit in a single buffer or elts not
> > - * available - should never happen.
> > - */
> > -fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__);
> > -return;
> > -  }
> > }
> > else if (HAVE_TRI_STRIPS) {
> >LOCAL_VARS;
> > @@ -568,57 +509,6 @@ static void TAG(render_quads_verts)( struct gl_context 
> > *ctx,
> >   currentsz = dmasz;
> >}
> > }
> > -   else if (HAVE_ELTS) {
> > -  /* Hardware doesn't have a quad primitive type -- try to
> > -   * simulate it using indexed vertices and the triangle
> > -   * primitive:
> > -   */
> > -  LOCAL_VARS;
> > -  int dmasz = GET_SUBSEQUENT_VB_MAX_ELTS();
> > -  int currentsz;
> > -  GLuint j, nr;
> > -
> > -  EMIT_INDEXED_VERTS( ctx, start, count );
> > -
> > -  FLUSH();
> > -  ELT_INIT( GL_TRIANGLES );
> > -  currentsz = GET_CURRENT_VB_MAX_ELTS();
> > -
> > -  /* Emit whole number of quads in total, and in each buffer.
> > -   */
> > -  dmasz -= dmasz & 3;
> > -  count -= (count-start) & 3;
> > -  currentsz -= currentsz & 3;
> > -
> > -  /* Adjust for rendering as triangles:
> > -   */
> > -  currentsz = currentsz/6*4;
> > -  d

Re: [Mesa-dev] [PATCH v2 03/14] nir: cleanup cf nodes earlier in nir_cf_node_remove()

2015-05-22 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand 

On Thu, May 21, 2015 at 9:40 AM, Connor Abbott  wrote:
> Before, when we were deleting a cf node that was a block, we were first
> removing all the instructions and then calling cleanup_cf_node(), at
> which point cleanup_cf_node() couldn't do its job. Just move it before
> everything else, which should be ok for the non-block case too.
>
> v2: split out from previous commit
> Signed-off-by: Connor Abbott 
> ---
>  src/glsl/nir/nir.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 704553f..a2b5e7c 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -1253,6 +1253,8 @@ nir_cf_node_remove(nir_cf_node *node)
> nir_function_impl *impl = nir_cf_node_get_function(node);
> nir_metadata_preserve(impl, nir_metadata_none);
>
> +   cleanup_cf_node(node, impl);
> +
> if (node->type == nir_cf_node_block) {
>/*
> * Basic blocks can't really be removed by themselves, since they act 
> as
> @@ -1274,8 +1276,6 @@ nir_cf_node_remove(nir_cf_node *node)
>exec_node_remove(&node->node);
>stitch_blocks(before_block, after_block);
> }
> -
> -   cleanup_cf_node(node);
>  }
>
>  static bool
> --
> 2.1.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 04/14] nir: properly clean up jumps when removing cf nodes

2015-05-22 Thread Jason Ekstrand
On Thu, May 21, 2015 at 9:55 AM, Matt Turner  wrote:
> On Thu, May 21, 2015 at 9:40 AM, Connor Abbott  wrote:
>> Before, we might have left dangling predecessors from jumps that were
>> going to be removed.
>>
>> v2: split out from "nir: insert ssa_undef instructions when cleaning up
>> defs/uses"
>>
>> Signed-off-by: Connor Abbott 
>> ---
>>  src/glsl/nir/nir.c | 10 --
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
>> index a2b5e7c..dc6d63f 100644
>> --- a/src/glsl/nir/nir.c
>> +++ b/src/glsl/nir/nir.c
>> @@ -1214,9 +1214,15 @@ cleanup_cf_node(nir_cf_node *node, nir_function_impl 
>> *impl)
>> switch (node->type) {
>> case nir_cf_node_block: {
>>nir_block *block = nir_cf_node_as_block(node);
>> -  /* We need to walk the instructions and clean up defs/uses */
>> -  nir_foreach_instr(block, instr)
>> +  /* We need to walk the instructions and clean up defs/uses,
>> +   * as well as clean up any jumps to control flow that may not be 
>> getting
>> +   * deleted.
>
> I think you can line wrap this block a little better.

Good idea.  With that,

Reviewed-by: Jason Ekstrand 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] nir/cse: use the instr_hash helper

2015-05-22 Thread Connor Abbott
This replaces an O(n^2) algorithm with an O(n) one, while allowing us to
import most of the infrastructure required for GVN. The idea is to walk
the dominance tree depth-first, similar when converting to SSA, and
remove the instructions from the set when we're done visiting the
sub-tree of the dominance tree so that the only instructions in the set
are the instructions that dominate the current block.

No piglit regressions. No changes in the result of the public shader-db.

Difference at 95.0% confidence
-0.998 +/- 0.312663
-5.96961% +/- 1.87022%
(Student's t, pooled s = 0.332763)

Signed-off-by: Connor Abbott 
---
 src/glsl/nir/nir_opt_cse.c | 134 -
 1 file changed, 24 insertions(+), 110 deletions(-)

diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
index a6a4a21..7894147 100644
--- a/src/glsl/nir/nir_opt_cse.c
+++ b/src/glsl/nir/nir_opt_cse.c
@@ -22,10 +22,11 @@
  *
  * Authors:
  *Jason Ekstrand (ja...@jlekstrand.net)
+ *Connor Abbott (cwabbo...@gmail.com)
  *
  */
 
-#include "nir.h"
+#include "nir_instr_hash.h"
 
 /*
  * Implements common subexpression elimination
@@ -33,141 +34,54 @@
 
 struct cse_state {
void *mem_ctx;
+   struct set *instr_set;
bool progress;
 };
 
-static bool
-src_is_ssa(nir_src *src, void *data)
-{
-   (void) data;
-   return src->is_ssa;
-}
-
-static bool
-dest_is_ssa(nir_dest *dest, void *data)
-{
-   (void) data;
-   return dest->is_ssa;
-}
+/*
+ * Visits and CSE's the given block and all its descendants in the dominance
+ * tree recursively. Note that the instr_set is guaranteed to only ever
+ * contain instructions that dominate the current block.
+ */
 
 static bool
-nir_instr_can_cse(nir_instr *instr)
+cse_block(nir_block *block, struct set *instr_set)
 {
-   /* We only handle SSA. */
-   if (!nir_foreach_dest(instr, dest_is_ssa, NULL) ||
-   !nir_foreach_src(instr, src_is_ssa, NULL))
-  return false;
-
-   switch (instr->type) {
-   case nir_instr_type_alu:
-   case nir_instr_type_load_const:
-   case nir_instr_type_phi:
-  return true;
-   case nir_instr_type_tex:
-  return false; /* TODO */
-   case nir_instr_type_intrinsic: {
-  const nir_intrinsic_info *info =
- &nir_intrinsic_infos[nir_instr_as_intrinsic(instr)->intrinsic];
-  return (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) &&
- (info->flags & NIR_INTRINSIC_CAN_REORDER) &&
- info->num_variables == 0; /* not implemented yet */
-   }
-   case nir_instr_type_call:
-   case nir_instr_type_jump:
-   case nir_instr_type_ssa_undef:
-  return false;
-   case nir_instr_type_parallel_copy:
-   default:
-  unreachable("Invalid instruction type");
-   }
-
-   return false;
-}
-
-static nir_ssa_def *
-nir_instr_get_dest_ssa_def(nir_instr *instr)
-{
-   switch (instr->type) {
-   case nir_instr_type_alu:
-  assert(nir_instr_as_alu(instr)->dest.dest.is_ssa);
-  return &nir_instr_as_alu(instr)->dest.dest.ssa;
-   case nir_instr_type_load_const:
-  return &nir_instr_as_load_const(instr)->def;
-   case nir_instr_type_phi:
-  assert(nir_instr_as_phi(instr)->dest.is_ssa);
-  return &nir_instr_as_phi(instr)->dest.ssa;
-   case nir_instr_type_intrinsic:
-  assert(nir_instr_as_intrinsic(instr)->dest.is_ssa);
-  return &nir_instr_as_intrinsic(instr)->dest.ssa;
-   default:
-  unreachable("We never ask for any of these");
-   }
-}
+   bool progress = false;
 
-static void
-nir_opt_cse_instr(nir_instr *instr, struct cse_state *state)
-{
-   if (!nir_instr_can_cse(instr))
-  return;
-
-   for (struct exec_node *node = instr->node.prev;
-!exec_node_is_head_sentinel(node); node = node->prev) {
-  nir_instr *other = exec_node_data(nir_instr, node, node);
-  if (nir_instrs_equal(instr, other)) {
- nir_ssa_def *other_def = nir_instr_get_dest_ssa_def(other);
- nir_ssa_def_rewrite_uses(nir_instr_get_dest_ssa_def(instr),
-  nir_src_for_ssa(other_def),
-  state->mem_ctx);
+   nir_foreach_instr_safe(block, instr) {
+  if (nir_instr_set_add(instr_set, instr)) {
+ progress = true;
  nir_instr_remove(instr);
- state->progress = true;
- return;
   }
}
 
-   for (nir_block *block = instr->block->imm_dom;
-block != NULL; block = block->imm_dom) {
-  nir_foreach_instr_reverse(block, other) {
- if (nir_instrs_equal(instr, other)) {
-nir_ssa_def *other_def = nir_instr_get_dest_ssa_def(other);
-nir_ssa_def_rewrite_uses(nir_instr_get_dest_ssa_def(instr),
- nir_src_for_ssa(other_def),
- state->mem_ctx);
-nir_instr_remove(instr);
-state->progress = true;
-return;
- }
-  }
+   for (unsigned i = 0; i < block->num_dom_children; i++) {
+  nir_block *child = block

[Mesa-dev] [PATCH 1/5] nir: split out instruction comparison functions

2015-05-22 Thread Connor Abbott
We'll want to use these outside of nir_opt_cse.c. Rather than adding yet
another thing to the mess that is nir.c, create a new file and move
nir_srcs_equal() there too.

Signed-off-by: Connor Abbott 
---
 src/glsl/Makefile.sources|   1 +
 src/glsl/nir/nir.c   |  27 --
 src/glsl/nir/nir.h   |   1 +
 src/glsl/nir/nir_instr_compare.c | 179 +++
 src/glsl/nir/nir_opt_cse.c   | 121 --
 5 files changed, 181 insertions(+), 148 deletions(-)
 create mode 100644 src/glsl/nir/nir_instr_compare.c

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index d784a81..75e5377 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -27,6 +27,7 @@ NIR_FILES = \
nir/nir_constant_expressions.h \
nir/nir_dominance.c \
nir/nir_from_ssa.c \
+   nir/nir_instr_compare.c \
nir/nir_intrinsics.c \
nir/nir_intrinsics.h \
nir/nir_live_variables.c \
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index f03e80a..170807e 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -1785,33 +1785,6 @@ nir_src_as_const_value(nir_src src)
return &load->value;
 }
 
-bool
-nir_srcs_equal(nir_src src1, nir_src src2)
-{
-   if (src1.is_ssa) {
-  if (src2.is_ssa) {
- return src1.ssa == src2.ssa;
-  } else {
- return false;
-  }
-   } else {
-  if (src2.is_ssa) {
- return false;
-  } else {
- if ((src1.reg.indirect == NULL) != (src2.reg.indirect == NULL))
-return false;
-
- if (src1.reg.indirect) {
-if (!nir_srcs_equal(*src1.reg.indirect, *src2.reg.indirect))
-   return false;
- }
-
- return src1.reg.reg == src2.reg.reg &&
-src1.reg.base_offset == src2.reg.base_offset;
-  }
-   }
-}
-
 static bool
 src_is_valid(const nir_src *src)
 {
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 697d37e..f04195f 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1579,6 +1579,7 @@ bool nir_foreach_src(nir_instr *instr, nir_foreach_src_cb 
cb, void *state);
 
 nir_const_value *nir_src_as_const_value(nir_src src);
 bool nir_srcs_equal(nir_src src1, nir_src src2);
+bool nir_instrs_equal(nir_instr *instr1, nir_instr *instr2);
 void nir_instr_rewrite_src(nir_instr *instr, nir_src *src, nir_src new_src);
 void nir_instr_move_src(nir_instr *dest_instr, nir_src *dest, nir_src *src);
 void nir_if_rewrite_condition(nir_if *if_stmt, nir_src new_src);
diff --git a/src/glsl/nir/nir_instr_compare.c b/src/glsl/nir/nir_instr_compare.c
new file mode 100644
index 000..89b576c
--- /dev/null
+++ b/src/glsl/nir/nir_instr_compare.c
@@ -0,0 +1,179 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ * Copyright © 2015 Connor Abbott
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Jason Ekstrand (ja...@jlekstrand.net)
+ *Connor Abbott (cwabbo...@gmail.com)
+ *
+ */
+
+#include "nir.h"
+
+bool
+nir_srcs_equal(nir_src src1, nir_src src2)
+{
+   if (src1.is_ssa) {
+  if (src2.is_ssa) {
+ return src1.ssa == src2.ssa;
+  } else {
+ return false;
+  }
+   } else {
+  if (src2.is_ssa) {
+ return false;
+  } else {
+ if ((src1.reg.indirect == NULL) != (src2.reg.indirect == NULL))
+return false;
+
+ if (src1.reg.indirect) {
+if (!nir_srcs_equal(*src1.reg.indirect, *src2.reg.indirect))
+   return false;
+ }
+
+ return src1.reg.reg == src2.reg.reg &&
+src1.reg.base_offset == src2.reg.base_offset;
+  }
+   }
+}
+
+
+static bool
+nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1,
+   unsigned src2)
+{
+   if (alu1->src[src1].abs != alu2->src[src2].abs ||
+   alu1->src[src1].negate != alu2->src[src2].neg

[Mesa-dev] [PATCH 4/5] nir: add a helper for finding duplicate instructions

2015-05-22 Thread Connor Abbott
This can be used for both CSE and value numbering.

Signed-off-by: Connor Abbott 
---
 src/glsl/Makefile.sources |   2 +
 src/glsl/nir/nir_instr_hash.c | 255 ++
 src/glsl/nir/nir_instr_hash.h |  36 ++
 3 files changed, 293 insertions(+)
 create mode 100644 src/glsl/nir/nir_instr_hash.c
 create mode 100644 src/glsl/nir/nir_instr_hash.h

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 75e5377..fd10cdd 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -28,6 +28,8 @@ NIR_FILES = \
nir/nir_dominance.c \
nir/nir_from_ssa.c \
nir/nir_instr_compare.c \
+   nir/nir_instr_hash.c \
+   nir/nir_instr_hash.h \
nir/nir_intrinsics.c \
nir/nir_intrinsics.h \
nir/nir_live_variables.c \
diff --git a/src/glsl/nir/nir_instr_hash.c b/src/glsl/nir/nir_instr_hash.c
new file mode 100644
index 000..d900b66
--- /dev/null
+++ b/src/glsl/nir/nir_instr_hash.c
@@ -0,0 +1,255 @@
+#include "nir_instr_hash.h"
+
+#define HASH(data) hash = _mesa_fnv32_1a_accumulate(hash, (data))
+
+static uint32_t
+hash_src(uint32_t hash, const nir_src *src)
+{
+   assert(src->is_ssa);
+   HASH(src->ssa);
+   return hash;
+}
+
+static uint32_t
+hash_alu_src(uint32_t hash, const nir_alu_src *src, unsigned num_components)
+{
+   HASH(src->abs);
+   HASH(src->negate);
+
+   for (unsigned i = 0; i < num_components; i++)
+  HASH(src->swizzle[i]);
+
+   hash = hash_src(hash, &src->src);
+   return hash;
+}
+
+static uint32_t
+hash_alu(uint32_t hash, const nir_alu_instr *instr)
+{
+   HASH(instr->op);
+   HASH(instr->dest.dest.ssa.num_components);
+
+   if (nir_op_infos[instr->op].algebraic_properties & NIR_OP_IS_COMMUTATIVE) {
+  assert(nir_op_infos[instr->op].num_inputs == 2);
+  uint32_t hash0 = hash_alu_src(hash, &instr->src[0],
+nir_ssa_alu_instr_src_components(instr, 
0));
+  uint32_t hash1 = hash_alu_src(hash, &instr->src[1],
+nir_ssa_alu_instr_src_components(instr, 
1));
+  /* For commutative operations, we need some commutative way of
+   * combining the hashes.  One option would be to XOR them but that
+   * means that anything with two identical sources will hash to 0 and
+   * that's common enough we probably don't want the guaranteed
+   * collision.  Either addition or multiplication will also work.
+   */
+  hash = hash0 * hash1;
+   } else {
+  for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) {
+ hash = hash_alu_src(hash, &instr->src[i],
+ nir_ssa_alu_instr_src_components(instr, i));
+  }
+   }
+
+   return hash;
+}
+
+static uint32_t
+hash_load_const(uint32_t hash, const nir_load_const_instr *instr)
+{
+   HASH(instr->def.num_components);
+
+   hash = _mesa_fnv32_1a_accumulate_block(hash, instr->value.f,
+  instr->def.num_components
+ * sizeof(instr->value.f[0]));
+
+   return hash;
+}
+
+static int
+cmp_phi_src(const void *data1, const void *data2)
+{
+   const nir_phi_src *src1 = data1, *src2 = data2;
+   return src1->pred->index - src2->pred->index;
+}
+
+static uint32_t
+hash_phi(uint32_t hash, const nir_phi_instr *instr)
+{
+   HASH(instr->instr.block);
+
+   /* sort sources by predecessor index, since the order shouldn't matter */
+   unsigned num_preds = instr->instr.block->predecessors->entries;
+   nir_phi_src *srcs = malloc(num_preds * sizeof(nir_phi_src));
+   unsigned i = 0;
+   nir_foreach_phi_src(instr, src) {
+  srcs[i++] = *src;
+   }
+   qsort(srcs, num_preds, sizeof(nir_phi_src), cmp_phi_src);
+   for (i = 0; i < num_preds; i++) {
+  hash = hash_src(hash, &srcs[i].src);
+  HASH(srcs[i].pred);
+   }
+   free(srcs);
+
+   return hash;
+}
+
+static uint32_t
+hash_intrinsic(uint32_t hash, const nir_intrinsic_instr *instr)
+{
+   const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic];
+   HASH(instr->intrinsic);
+
+   if (info->has_dest)
+  HASH(instr->dest.ssa.num_components);
+
+   assert(info->num_variables == 0);
+
+   hash = _mesa_fnv32_1a_accumulate_block(hash, instr->const_index,
+  info->num_indices
+ * sizeof(instr->const_index[0]));
+   return hash;
+}
+
+static uint32_t
+hash_instr(const void *data)
+{
+   const nir_instr *instr = data;
+   uint32_t hash = _mesa_fnv32_1a_offset_bias;
+
+   switch (instr->type) {
+   case nir_instr_type_alu:
+  hash = hash_alu(hash, nir_instr_as_alu(instr));
+  break;
+   case nir_instr_type_load_const:
+  hash = hash_load_const(hash, nir_instr_as_load_const(instr));
+  break;
+   case nir_instr_type_phi:
+  hash = hash_phi(hash, nir_instr_as_phi(instr));
+  break;
+   case nir_instr_type_intrinsic:
+  hash = hash_intrinsic(hash, nir_instr

[Mesa-dev] [PATCH 2/5] nir: constify nir_ssa_alu_instr_src_components()

2015-05-22 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/glsl/nir/nir.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index f04195f..e3b7b17 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -722,7 +722,7 @@ nir_alu_instr_channel_used(nir_alu_instr *instr, unsigned 
src, unsigned channel)
  * used for a source
  */
 static inline unsigned
-nir_ssa_alu_instr_src_components(nir_alu_instr *instr, unsigned src)
+nir_ssa_alu_instr_src_components(const nir_alu_instr *instr, unsigned src)
 {
assert(instr->dest.dest.is_ssa);
 
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] nir: constify instruction comparison functions

2015-05-22 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/glsl/nir/nir.h   | 4 ++--
 src/glsl/nir/nir_instr_compare.c | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index e3b7b17..19b1e18 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1578,8 +1578,8 @@ bool nir_foreach_dest(nir_instr *instr, 
nir_foreach_dest_cb cb, void *state);
 bool nir_foreach_src(nir_instr *instr, nir_foreach_src_cb cb, void *state);
 
 nir_const_value *nir_src_as_const_value(nir_src src);
-bool nir_srcs_equal(nir_src src1, nir_src src2);
-bool nir_instrs_equal(nir_instr *instr1, nir_instr *instr2);
+bool nir_srcs_equal(const nir_src src1, const nir_src src2);
+bool nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2);
 void nir_instr_rewrite_src(nir_instr *instr, nir_src *src, nir_src new_src);
 void nir_instr_move_src(nir_instr *dest_instr, nir_src *dest, nir_src *src);
 void nir_if_rewrite_condition(nir_if *if_stmt, nir_src new_src);
diff --git a/src/glsl/nir/nir_instr_compare.c b/src/glsl/nir/nir_instr_compare.c
index 89b576c..07d0031 100644
--- a/src/glsl/nir/nir_instr_compare.c
+++ b/src/glsl/nir/nir_instr_compare.c
@@ -30,7 +30,7 @@
 #include "nir.h"
 
 bool
-nir_srcs_equal(nir_src src1, nir_src src2)
+nir_srcs_equal(const nir_src src1, const nir_src src2)
 {
if (src1.is_ssa) {
   if (src2.is_ssa) {
@@ -58,8 +58,8 @@ nir_srcs_equal(nir_src src1, nir_src src2)
 
 
 static bool
-nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1,
-   unsigned src2)
+nir_alu_srcs_equal(const nir_alu_instr *alu1, const nir_alu_instr *alu2,
+   unsigned src1, unsigned src2)
 {
if (alu1->src[src1].abs != alu2->src[src2].abs ||
alu1->src[src1].negate != alu2->src[src2].negate)
@@ -74,7 +74,7 @@ nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, 
unsigned src1,
 }
 
 bool
-nir_instrs_equal(nir_instr *instr1, nir_instr *instr2)
+nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2)
 {
if (instr1->type != instr2->type)
   return false;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/5] NIR CSE performance improvements

2015-05-22 Thread Connor Abbott
This series implements an API that lets us use a hash set in the CSE
pass, which fixes the performance problem while getting most of the code
for Global Value Numbering (GVN) on top of GCM into the tree and
actively used. The first three patches are taken from Jason's rebase of
my GVN series, and patch 4 is modified from the original patch 4 to
expose a slightly lower-level API (details in the patch itself). Patch
5 then rewrites the CSE pass to use the API introduced in patch 4. Note
that the API could be just as easily adapted to implement
nir_value_number() from the original patch 4.

Perf numbers are in patch 5, but for those that prefer pretty pictures,
I've uploaded a few graphs:

http://people.freedesktop.org/~cwabbott0/cse-series/perf-before.svg
http://people.freedesktop.org/~cwabbott0/cse-series/perf-after.svg

where you can clearly see how CSE goes from taking up almost all the
time spent in nir_optimize() to less than half (but still a significant
amount).

This series is also available at

git://people.freedesktop.org/~cwabbott0/mesa nir-cse-hash

Connor Abbott (5):
  nir: split out instruction comparison functions
  nir: constify nir_ssa_alu_instr_src_components()
  nir: constify instruction comparison functions
  nir: add a helper for finding duplicate instructions
  nir/cse: use the instr_hash helper

 src/glsl/Makefile.sources|   3 +
 src/glsl/nir/nir.c   |  27 -
 src/glsl/nir/nir.h   |   5 +-
 src/glsl/nir/nir_instr_compare.c | 179 +++
 src/glsl/nir/nir_instr_hash.c| 255 +++
 src/glsl/nir/nir_instr_hash.h|  36 ++
 src/glsl/nir/nir_opt_cse.c   | 255 ---
 7 files changed, 500 insertions(+), 260 deletions(-)
 create mode 100644 src/glsl/nir/nir_instr_compare.c
 create mode 100644 src/glsl/nir/nir_instr_hash.c
 create mode 100644 src/glsl/nir/nir_instr_hash.h

-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glx: fix Scons build

2015-05-22 Thread Brian Paul
Replace -h with --header-tag as was done for the Makefile build.
---
 src/glx/SConscript | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/SConscript b/src/glx/SConscript
index b91c0bd..619e4c3 100644
--- a/src/glx/SConscript
+++ b/src/glx/SConscript
@@ -125,7 +125,7 @@ env.CodeGenerate(
 target = 'indirect_size.h',
 script = GLAPI + 'gen/glX_proto_size.py',
 source = sources,
-command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set -h 
_INDIRECT_SIZE_H > $TARGET'
+command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set 
--header-tag _INDIRECT_SIZE_H > $TARGET'
 )
 
 env.CodeGenerate(
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glapi: fix scons build

2015-05-22 Thread Dylan Baker
The arguments for glX_proto_size.py changed slightly, the '-h' short
option was removed, because argparse reserves that for help messages.
The auto-tools based build was already updated to account for this
change, but the scons build was not.

Signed-off-by: Dylan Baker 
---

The Scons build is broken for me before I get here, someone could test
this that would be great.

I don't have push access.

 src/glx/SConscript | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/SConscript b/src/glx/SConscript
index b91c0bd..619e4c3 100644
--- a/src/glx/SConscript
+++ b/src/glx/SConscript
@@ -125,7 +125,7 @@ env.CodeGenerate(
 target = 'indirect_size.h',
 script = GLAPI + 'gen/glX_proto_size.py',
 source = sources,
-command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set -h 
_INDIRECT_SIZE_H > $TARGET'
+command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set 
--header-tag _INDIRECT_SIZE_H > $TARGET'
 )
 
 env.CodeGenerate(
-- 
2.4.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glx: fix Scons build

2015-05-22 Thread Dylan Baker
Well, I wrote the exact same patch and sent it a few seconds ago before
I noticed this.

Sorry to break things.

Reviewed-by: Dylan Baker 

On Fri, May 22, 2015 at 01:23:52PM -0700, Brian Paul wrote:
> Replace -h with --header-tag as was done for the Makefile build.
> ---
>  src/glx/SConscript | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/glx/SConscript b/src/glx/SConscript
> index b91c0bd..619e4c3 100644
> --- a/src/glx/SConscript
> +++ b/src/glx/SConscript
> @@ -125,7 +125,7 @@ env.CodeGenerate(
>  target = 'indirect_size.h',
>  script = GLAPI + 'gen/glX_proto_size.py',
>  source = sources,
> -command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set -h 
> _INDIRECT_SIZE_H > $TARGET'
> +command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set 
> --header-tag _INDIRECT_SIZE_H > $TARGET'
>  )
>  
>  env.CodeGenerate(
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] xlib: fix X_GLXCreateContextAtrribs/Attribs typo

2015-05-22 Thread Brian Paul
In case the glproto.h file isn't up to date, we provide the #define
for X_GLXCreateContextAttribsARB.
---
 src/gallium/state_trackers/glx/xlib/glx_api.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c 
b/src/gallium/state_trackers/glx/xlib/glx_api.c
index 0508255..ca86aea 100644
--- a/src/gallium/state_trackers/glx/xlib/glx_api.c
+++ b/src/gallium/state_trackers/glx/xlib/glx_api.c
@@ -40,6 +40,12 @@
 
 #include "xm_api.h"
 
+/* An "Atrribs/Attribs" typo was fixed in glxproto.h in Nov 2014.
+ * This is in case we don't have the updated header.
+ */
+#ifndef X_GLXCreateContextAttribsARB
+#define X_GLXCreateContextAttribsARB 34
+#endif
 
 /* This indicates the client-side GLX API and GLX encoder version. */
 #define CLIENT_MAJOR_VERSION 1
@@ -2168,7 +2174,7 @@ glXQueryDrawable(Display *dpy, GLXDrawable draw, int 
attribute,
 #endif
 
   default:
- generate_error(dpy, BadValue, 0, X_GLXCreateContextAtrribsARB, true);
+ generate_error(dpy, BadValue, 0, X_GLXCreateContextAttribsARB, true);
  return;
}
 }
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 21/22] i965/gen9: Plugin the code for selecting YF/YS tiling on skl+

2015-05-22 Thread Anuj Phogat
On Thu, May 21, 2015 at 4:33 PM, Ben Widawsky  wrote:
> On Fri, Apr 17, 2015 at 04:51:42PM -0700, Anuj Phogat wrote:
>> Note: Yf/Ys tiling stays disabled to avoid any piglit regressions. I'm
>> working on fixing the remaining piglit failures.
>>
>> We need to do some benchmarking to come up with conditions to choose
>> Ys (64 KB) over Yf (4 KB). Any thoughts on how big a texture should
>> be so that 64 KB tiling is preferred over 4KB?
>>
>> Signed-off-by: Anuj Phogat 
>> ---
>>  src/mesa/drivers/dri/i965/brw_tex_layout.c | 81 
>> ++
>>  1 file changed, 81 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
>> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> index 28927e9..b2a2dac 100644
>> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
>> @@ -684,6 +684,65 @@ intel_miptree_total_width_height(struct brw_context 
>> *brw,
>> }
>>  }
>>
>> +static bool
>> +intel_miptree_choose_tr_mode(struct brw_context *brw,
>
> Either rename this function, or make it actually choose the tr_mode.
>
>> + mesa_format format,
>> + uint32_t width0,
>> + uint32_t num_samples,
>> + enum intel_miptree_tiling_mode requested,
>> + struct intel_mipmap_tree *mt,
>> + uint32_t tr_mode)
>
> As I state above and below, I don't think tr_mode should be passed in here. 
> This
> function is supposed to "choose" the mode.
>
Got it.

>> +{
>> +   const unsigned bpp = mt->cpp * 8;
>> +
>> +   /* bpp must be power of 2. */
>> +   if (!mt->compressed &&
>> +   _mesa_is_format_color_format(mt->format) &&
>
> I am not finding the reason for the color format restriction. I believe it, 
> just
> not seeing it...
>
I couldn't find it either. Found this in my notes:
Enable YF/YS tiling only for color buffers because depth and
stencil buffers are not supported in XY_FAST_COPY_BLT
and meta_pbo_{TexSubImage, GetTexSubImage} paths.
These are the only paths enabled by this series to handle
Yf/Ys surfaces.

I think we can enable them for depth and stencil if we
implement the tiled_memcpy() paths to do the software
decoding of Yf/Ys tiled surfaces. I'll add a comment here
to explain the restriction.

>> +   (requested == INTEL_MIPTREE_TILING_Y ||
>> +requested == INTEL_MIPTREE_TILING_ANY) &&
>> +   (bpp && (bpp & (bpp - 1)) == 0)) {
>> +
>
> is_power_of_2
>
yes.
>
> How about something like...
>
> /* Low bits are the highest priority modes */
> modes = INTEL_MIPTREE_TRMODE_YS | INTEL_MIPTREE_TRMODE_YF;
> while ((modes >>= 1) & 1) {
> ... try to allocate
> }
>
Yes. This looks better. We can later add conditions here to set the
priority of Yf/Ys based on buffer size, alignment etc.

>> +  mt->tr_mode = tr_mode;
>
> you're leaking tr_mode state here if you fail below.
>
I'll reset it to TRMODE_NONE at the bottom of function. This patch
does it outside this function in the fallback path.

>> +  mt->align_w = intel_horizontal_texture_alignment_unit(brw, mt);
>> +  mt->align_h = intel_vertical_texture_alignment_unit(brw, mt);
>> +
>> +  intel_miptree_total_width_height(brw, mt);
>> +
>> +  /* pitch == 0 || height == 0  indicates the null texture */
>
> What is the null texture?
Copy-pasted comment. I'll get rid of it.
>
>> +  if (!mt || !mt->total_width || !mt->total_height) {
>> + intel_miptree_release(&mt);
>> + return false;
>> +  }
>> +
>> +  mt->tiling = intel_miptree_choose_tiling(brw, format, width0,
>> +   num_samples,
>> +   requested, mt);
>> +
>> +  if (mt->tiling == I915_TILING_Y ||
>> +  mt->tiling == (I915_TILING_Y | I915_TILING_X)) {
>> +
>> + /* Don't allow YS tiling at the moment. Using 64KB tiling for small
>> +  * textures might result in to wastage of memory.
>> +  * FIXME: Revisit this condition when we have more information 
>> about
>> +  * the specific cases where using YS over YF will be useful.
>> +  */
>> + if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF)
>> +return true;
>> +  }
>> +
>> +  /* Can't use requested tr_mode. Free up the memory allocated for
>> +   * miptree levels in intel_miptree_total_width_height().
>> +   */
>> +  unsigned level;
>> +  for (level = mt->first_level; level <= mt->last_level; level++) {
>> + free(mt->level[level].slice);
>> + mt->level[level].slice = NULL;
>> +  }
>
> intel_miptree_release(&mt); ? Either that or remove it above. Seems like
> returning false should always do the same thing.
>
intel_miptree_release() does more than just freeing up the allocated
slices, which we don't want to do here. We still want to keep the
allocated miptree and retry with differen

Re: [Mesa-dev] [PATCH V2 22/22] i965/gen9: Disable Mip Tail for YF/YS tiled surfaces

2015-05-22 Thread Anuj Phogat
On Thu, May 21, 2015 at 4:55 PM, Ben Widawsky  wrote:
> On Fri, Apr 17, 2015 at 04:51:43PM -0700, Anuj Phogat wrote:
>> This fixed the buffer corruption happening in a FBO which use YF/YS
>> tiled renderbuffer or texture as color attachment.
>>
>> Spec recommends disabling mip tails for non-mip-mapped surfaces.
>> But, with this enabled I couldn't get correct data out of YF/YS
>> tiled surface. I get the expected data with this disabled.
>
> The docs do say to disable "to disable the Mip Tail this field must be set to 
> a
> mip that larger than those present in the surface" So I don't see why you 
> said,
> "But". The docs also say you must set this field when trmode isn't None.
>
> In other words, if you don't want to use mip tails, you are using the correct
> mechanism to disable it.
>
I agree I'm using the correct way of disabling it. But my point in above
comment is that there is no restriction in the docs saying "always
disable the miptails". There might be some advantage of enabling them
"by setting value < 15" which I don't understand currently. Maybe I need
to bump up my comment.

>>
>> I haven't spent any time trying to understand miptails. So, I'm
>> not sure if this patch is the right thing to do. But, It helps
>> move things forward at the moment.
>
> I don't understand them either, but I think your code is fine. Maybe add an
> assert that max level is < 15?
>
> So with some explanation in the commit message that this disables mip tails,
> this is:
> Reviewed-by: Ben Widawsky 
>
>>
>> Signed-off-by: Anuj Phogat 
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h|  3 +++
>>  src/mesa/drivers/dri/i965/gen8_surface_state.c | 10 --
>>  2 files changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
>> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index c62c09b..1c50172 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -594,6 +594,9 @@
>>  #define GEN9_SURFACE_TRMODE_TILEYF 1
>>  #define GEN9_SURFACE_TRMODE_TILEYS 2
>>
>> +#define GEN9_SURFACE_MIP_TAIL_START_LOD_SHIFT  8
>> +#define GEN9_SURFACE_MIP_TAIL_START_LOD_MASK   INTEL_MASK(11, 8)
>> +
>>  /* Surface state DW6 */
>>  #define GEN7_SURFACE_MCS_ENABLE (1 << 0)
>>  #define GEN7_SURFACE_MCS_PITCH_SHIFT3
>> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
>> b/src/mesa/drivers/dri/i965/gen8_surface_state.c
>> index 189f1db..155563f 100644
>> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
>> @@ -258,8 +258,11 @@ gen8_update_texture_surface(struct gl_context *ctx,
>> GEN7_SURFACE_MIN_LOD) |
>>   (intelObj->_MaxLevel - tObj->BaseLevel); /* mip count */
>>
>> -   if (brw->gen >= 9)
>> +   if (brw->gen >= 9) {
>>surf[5] |= SET_FIELD(tr_mode, GEN9_SURFACE_TRMODE);
>> +  /* Disable Mip Tail by setting a large value. */
>> +  surf[5] |= SET_FIELD(15, GEN9_SURFACE_MIP_TAIL_START_LOD);
>> +   }
>>
>> if (aux_mt) {
>>surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
>> @@ -446,8 +449,11 @@ gen8_update_renderbuffer_surface(struct brw_context 
>> *brw,
>>
>> surf[5] = irb->mt_level - irb->mt->first_level;
>>
>> -   if (brw->gen >= 9)
>> +   if (brw->gen >= 9) {
>>surf[5] |= SET_FIELD(tr_mode, GEN9_SURFACE_TRMODE);
>> +  /* Disable Mip Tail by setting a large value. */
>> +  surf[5] |= SET_FIELD(15, GEN9_SURFACE_MIP_TAIL_START_LOD);
>> +   }
>>
>> if (aux_mt) {
>>surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
>> --
>> 2.3.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
> --
> Ben Widawsky, Intel Open Source Technology Center
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'

2015-05-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90600

Bug ID: 90600
   Summary: IOError: [Errno 2] No such file or directory:
'gl_API.xml'
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: baker.dyla...@gmail.com

mesa: 491adb61d25eef8afe2615e0fd842dda20b17004 (master 10.7.0-devel)

CentOS 6 build error

  Generating build/linux-x86_64-debug/mesa/main/dispatch.h ...
Traceback (most recent call last):
  File "src/mapi/glapi/gen/gl_table.py", line 244, in 
main()
  File "src/mapi/glapi/gen/gl_table.py", line 230, in main
args = _parser()
  File "src/mapi/glapi/gen/gl_table.py", line 225, in _parser
return parser.parse_args()
  File "/usr/lib/python2.6/site-packages/argparse.py", line 1703, in parse_args
args, argv = self.parse_known_args(args, namespace)
  File "/usr/lib/python2.6/site-packages/argparse.py", line 1725, in
parse_known_args
default = self._get_value(action, default)
  File "/usr/lib/python2.6/site-packages/argparse.py", line 2248, in _get_value
result = type_func(arg_string)
  File "src/mapi/glapi/gen/gl_XML.py", line 42, in parse_GL_API
api.parse_file( file_name )
  File "src/mapi/glapi/gen/gl_XML.py", line 899, in parse_file
doc = ET.parse( file_name )
  File "/usr/lib64/python2.6/xml/etree/ElementTree.py", line 862, in parse
tree.parse(source, parser)
  File "/usr/lib64/python2.6/xml/etree/ElementTree.py", line 579, in parse
source = open(source, "rb")
IOError: [Errno 2] No such file or directory: 'gl_API.xml'
scons: *** [build/linux-x86_64-debug/mesa/main/dispatch.h] Error 1

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'

2015-05-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90600

--- Comment #1 from Brian Paul  ---
Vinson, what does your scons command look like?
I haven't been able to reproduce this here.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'

2015-05-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90600

--- Comment #2 from Vinson Lee  ---
(In reply to Brian Paul from comment #1)
> Vinson, what does your scons command look like?
> I haven't been able to reproduce this here.

It is just 'scons', no arguments.

This is the final build command that fails.

/usr/bin/python src/mapi/glapi/gen/gl_table.py -m remap_table -f
src/mapi/glapi/gen/gl_and_es_API.xml

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'

2015-05-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90600

--- Comment #3 from Ilia Mirkin  ---
IIRC Dylan even mentioned this would break on python2.6 (without the extra
argument parser module).

I'm not so eager to drop support for py2.6, not sure why this was done =/

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'

2015-05-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90600

--- Comment #4 from Vinson Lee  ---
Build is also broken on Ubuntu 12.04 and it has Python 2.7.

  Generating build/linux-x86_64-debug/mapi/glapi/glapitable.h ...
Traceback (most recent call last):
  File "src/mapi/glapi/gen/gl_table.py", line 244, in 
main()
  File "src/mapi/glapi/gen/gl_table.py", line 230, in main
args = _parser()
  File "src/mapi/glapi/gen/gl_table.py", line 225, in _parser
return parser.parse_args()
  File "/usr/lib/python2.7/argparse.py", line 1688, in parse_args
args, argv = self.parse_known_args(args, namespace)
  File "/usr/lib/python2.7/argparse.py", line 1710, in parse_known_args
default = self._get_value(action, default)
  File "/usr/lib/python2.7/argparse.py", line 2233, in _get_value
result = type_func(arg_string)
  File "src/mapi/glapi/gen/gl_XML.py", line 42, in parse_GL_API
api.parse_file( file_name )
  File "src/mapi/glapi/gen/gl_XML.py", line 899, in parse_file
doc = ET.parse( file_name )
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1183, in parse
tree.parse(source, parser)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 647, in parse
source = open(source, "rb")
IOError: [Errno 2] No such file or directory: 'gl_API.xml'

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'

2015-05-22 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90600

--- Comment #5 from Ilia Mirkin  ---
(In reply to Ilia Mirkin from comment #3)
> IIRC Dylan even mentioned this would break on python2.6 (without the extra
> argument parser module).
> 
> I'm not so eager to drop support for py2.6, not sure why this was done =/

While everything above is true, it's also a bit misleading as I just realized
(sorry). Looks like your CentOS box has the separate argparse module as well.
If you didn't have it, the errors would be different.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 17/22] i965/gen9: Add XY_FAST_COPY_BLT support to intelEmitCopyBlit()

2015-05-22 Thread Anuj Phogat
On Thu, May 21, 2015 at 5:46 PM, Anuj Phogat  wrote:
> On Thu, May 21, 2015 at 12:26 PM, Ben Widawsky  wrote:
>> A lot of opinion stuff is below, feel free to ignore them if you don't think
>> there are improvements.
>>
>> On Fri, Apr 17, 2015 at 04:51:38PM -0700, Anuj Phogat wrote:
>>> This patch enables using XY_FAST_COPY_BLT only for Yf/Ys tiled buffers.
>>> Later It can be turned on for other tiling patterns (X,Y).
>>>
>>> Signed-off-by: Anuj Phogat 
>>> ---
>>>  src/mesa/drivers/dri/i965/intel_blit.c   | 292 
>>> +++
>>>  src/mesa/drivers/dri/i965/intel_blit.h   |   3 +
>>>  src/mesa/drivers/dri/i965/intel_copy_image.c |   3 +
>>>  src/mesa/drivers/dri/i965/intel_reg.h|  33 +++
>>>  4 files changed, 291 insertions(+), 40 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
>>> b/src/mesa/drivers/dri/i965/intel_blit.c
>>> index 9500bd7..36746c4 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_blit.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_blit.c
>>> @@ -43,6 +43,23 @@
>>>
>>>  #define FILE_DEBUG_FLAG DEBUG_BLIT
>>>
>>> +#define SET_TILING_XY_FAST_COPY_BLT(tiling, tr_mode, type)   \
>>> +({   \
>>> +   switch (tiling) { \
>>> +   case I915_TILING_X:   \
>>> +  CMD |= type ## _TILED_X;   \
>>> +  break; \
>>> +   case I915_TILING_Y:   \
>>
>> assert(tr_mode != INTEL_MIPTREE_TRMODE_YF); ?
>>
> INTEL_MIPTREE_TRMODE_{YF, NONE} are allowed and covered in else case.
>
>>> +  if (tr_mode == INTEL_MIPTREE_TRMODE_YS)\
>>> + CMD |= type ## _TILED_64K;  \
>>> +  else   \
>>> + CMD |= type ## _TILED_Y;\
>>> +  break; \
>>> +   default:  \
>>> +  unreachable("not reached");\
>>> +   } \
>>> +})
>>> +
>>>  static void
>>>  intel_miptree_set_alpha_to_one(struct brw_context *brw,
>>> struct intel_mipmap_tree *mt,
>>> @@ -75,6 +92,12 @@ static uint32_t
>>>  br13_for_cpp(int cpp)
>>>  {
>>> switch (cpp) {
>>> +   case 16:nn
>>> +  return BR13_32323232;
>>> +  break;
>>> +   case 8:
>>> +  return BR13_16161616;
>>> +  break;
>>> case 4:
>>>return BR13_;
>>>break;
>>> @@ -89,6 +112,132 @@ br13_for_cpp(int cpp)
>>> }
>>>  }
>>>
>>> +static uint32_t
>>> +get_tr_horizontal_align(uint32_t tr_mode, uint32_t cpp, bool is_src) {
>>> +   /*Alignment tables for YF/YS tiled surfaces. */
>>> +   const uint32_t align_2d_yf[] = {64, 64, 32, 32, 16};
>>> +   const uint32_t align_2d_ys[] = {256, 256, 128, 128, 64};
>>> +   const uint32_t bpp = cpp * 8;
>>> +   uint32_t align;
>>> +   int i = 0;
>>> +
>>> +   if (tr_mode == INTEL_MIPTREE_TRMODE_NONE)
>>> +  return 0;
>>> +
>>> +   /* Compute array index. */
>>> +   assert (bpp >= 8 && bpp <= 128 && (bpp & (bpp - 1)) == 0);
>>
>> assert(bpp >= 8 && bpp <= 128 && _mesa_bitcount(bpp) == 1);
>>
>> (I couldn't find a is_pow2, but one must exist).
>>
> There is a is_power_of_two() in main/macros.h. I'll use that here.
>
>>> +   while (bpp >> (i + 4))
>>> +  i++;
>>> +
>> Since you just asserted this was a power of 2 above, isn't this just:
>> ffs(bpp/8) - 1;
>>
> Yes, ffs() will work too. I'll use it.
>
>>> +   if (tr_mode == INTEL_MIPTREE_TRMODE_YF)
>>> +  align = align_2d_yf[i];
>>> +   else
>>> +  align = align_2d_ys[i];
>>> +
>>> +   switch(align) {
>>> +   case 512:
>>> +  return is_src ? XY_SRC_H_ALIGN_512 : XY_DST_H_ALIGN_512;
>>> +   case 256:
>>> +  return is_src ? XY_SRC_H_ALIGN_256 : XY_DST_H_ALIGN_256;
>>> +   case 128:
>>> +  return is_src ? XY_SRC_H_ALIGN_128 : XY_DST_H_ALIGN_128;
>>> +   case 64:
>>> +  return is_src ? XY_SRC_H_ALIGN_64 : XY_DST_H_ALIGN_64;
>>> +   case 32:
>>> +   /* XY_FAST_COPY_BLT doesn't support horizontal alignment of 16. */
>>> +   case 16:
>>> +  return is_src ? XY_SRC_H_ALIGN_32 : XY_DST_H_ALIGN_32;
>>> +   default:
>>> +  unreachable("not reached");
>>> +   }
>>> +}
>>> +
>>> +static uint32_t
>>> +get_tr_vertical_align(uint32_t tr_mode, uint32_t cpp, bool is_src) {
>>> +   /* Vertical alignment tables for YF/YS tiled surfaces. */
>>> +   const unsigned align_2d_yf[] = {64, 32, 32, 16, 16};
>>> +   const unsigned align_2d_ys[] = {256, 128, 128, 64, 64};
>>> +   const uint32_t bpp = cpp * 8;
>>> +   uint32_t align;
>>> +   int i = 0;
>>> +
>>> +   if (tr_mode == INTEL_MIPTREE_TRMODE_NONE)

[Mesa-dev] [PATCH] nv50/ir: avoid messing up arg1 of PFETCH

2015-05-22 Thread Ilia Mirkin
There can be scenarios where the "indirect" arg of a PFETCH becomes
known, and so the code will attempt to propagate it. Use this
opportunity to just fold it into the first argument, and prevent the
load propagation pass from touching PFETCH further.

This fixes gs-input-array-vec4-index-rd.shader_test and
vs-output-array-vec4-index-wr-before-gs.shader_test on nvc0 at least.

Signed-off-by: Ilia Mirkin 
Cc: "10.5 10.6" 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 72dd31e..98e3d1f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -236,6 +236,9 @@ LoadPropagation::visit(BasicBlock *bb)
   if (i->op == OP_CALL) // calls have args as sources, they must be in regs
  continue;
 
+  if (i->op == OP_PFETCH) // pfetch expects arg1 to be a reg
+ continue;
+
   if (i->srcExists(1))
  checkSwapSrc01(i);
 
@@ -581,6 +584,11 @@ ConstantFolding::expr(Instruction *i,
case OP_POPCNT:
   res.data.u32 = util_bitcount(a->data.u32 & b->data.u32);
   break;
+   case OP_PFETCH:
+  // The two arguments to pfetch are logically added together. Normally
+  // the second argument will not be constant, but that can happen.
+  res.data.u32 = a->data.u32 + b->data.u32;
+  break;
default:
   return;
}
@@ -610,6 +618,8 @@ ConstantFolding::expr(Instruction *i,
  bld.setPosition(i, false);
  i->setSrc(1, bld.loadImm(NULL, res.data.u32));
   }
+   } else if (i->op == OP_PFETCH) {
+  // Leave PFETCH alone... we just folded its 2 args into 1.
} else {
   i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */
}
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev