[Mesa-dev] [PATCH] vc4: make vc4_begin_query() returns a boolean
I forgot to make the change in 96f164f6f047833091eb98a73aa80c31dc94f962. This fixes a warning with GCC and probably an error with Clang. Signed-off-by: Samuel Pitoiset --- src/gallium/drivers/vc4/vc4_query.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/vc4/vc4_query.c b/src/gallium/drivers/vc4/vc4_query.c index 1792bec..270832e 100644 --- a/src/gallium/drivers/vc4/vc4_query.c +++ b/src/gallium/drivers/vc4/vc4_query.c @@ -50,9 +50,10 @@ vc4_destroy_query(struct pipe_context *ctx, struct pipe_query *query) free(query); } -static void +static boolean vc4_begin_query(struct pipe_context *ctx, struct pipe_query *query) { +return true; } static void -- 2.4.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/20] i915: Enable intel_render path for points
On Thu, May 21, 2015 at 09:14:03PM +0300, Ville Syrjälä wrote: > On Fri, May 15, 2015 at 12:18:11PM -0700, Ian Romanick wrote: > > There are some really twitchy tests in ES1 (and possibly ES2) > > conformance related to this. Do any of those tests change with this commit? > > I did run some ES1 conformnce tests, but the branches in the repo > were not very clear so I'm not sure if I ran the right thing (looks > like I used the "gles1" branch and managed to build something that > at least runs). > > No changes in the results on 855 or PNV between my branch and the > baseline AFAICS. > > I'll see if I can get the ES2 tests built as well, and run them on > PNV. OK, I managed to run the ES2 tests on PNV, and there are no changes in the results. > > > > > On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote: > > > From: Ville Syrjälä > > > > > > The sub-pixel adjustment for points was killed off in > > > commit 60d762aa625095a8c1f9597d8530bb5a6fa61b4c > > > Author: Xiang, Haihao > > > Date: Wed Jan 2 11:38:51 2008 +0800 > > > > > > i915: Needn't adjust pixel centers. fix #12944 > > > > > > so if we don't need it in intel_tris.c we don't need it in > > > intel_render.c either, which means we can allow > > > intel_render.c to render points. > > > > > > Signed-off-by: Ville Syrjälä > > > --- > > > src/mesa/drivers/dri/i915/intel_render.c | 8 +++- > > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i915/intel_render.c > > > b/src/mesa/drivers/dri/i915/intel_render.c > > > index 65ecd05..ef1c718 100644 > > > --- a/src/mesa/drivers/dri/i915/intel_render.c > > > +++ b/src/mesa/drivers/dri/i915/intel_render.c > > > @@ -54,9 +54,7 @@ > > > * dma buffers. Use strip/fan hardware primitives where possible. > > > * Try to simulate missing primitives with indexed vertices. > > > */ > > > -#define HAVE_POINTS 0 /* Has it, but can't use because > > > subpixel has to > > > - * be adjusted for points on the > > > INTEL/I845G > > > - */ > > > +#define HAVE_POINTS 1 > > > #define HAVE_LINES 1 > > > #define HAVE_LINE_STRIPS 1 > > > #define HAVE_TRIANGLES 1 > > > @@ -70,7 +68,7 @@ > > > #define HAVE_ELTS0 > > > > > > static const uint32_t hw_prim[GL_POLYGON + 1] = { > > > - [GL_POINTS] = 0, > > > + [GL_POINTS] = PRIM3D_POINTLIST, > > > [GL_LINES ] = PRIM3D_LINELIST, > > > [GL_LINE_LOOP] = PRIM3D_LINESTRIP, > > > [GL_LINE_STRIP] = PRIM3D_LINESTRIP, > > > @@ -96,7 +94,7 @@ static const GLenum reduced_prim[GL_POLYGON + 1] = { > > > }; > > > > > > static const int scale_prim[GL_POLYGON + 1] = { > > > - [GL_POINTS] = 0, /* fallback case */ > > > + [GL_POINTS] = 1, > > > [GL_LINES] = 1, > > > [GL_LINE_LOOP] = 2, > > > [GL_LINE_STRIP] = 2, > > > > > -- > Ville Syrjälä > Intel OTC -- Ville Syrjälä Intel OTC ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/20] t_dd_dmatmp: Kill the paths rendering quads/quad strips via indexed vertices
On Fri, May 15, 2015 at 12:06:54PM -0700, Ian Romanick wrote: > On 03/23/2015 05:47 AM, ville.syrj...@linux.intel.com wrote: > > From: Ville Syrjälä > > > > No driver supports elts currently, and these make the validate_render > > code a bit hard to follow. Just kill them. > > It looks like both r200_tcl.c and radeon_tcl.c define HAVE_TRI_STRIPS > and HAVE_ELTS, but I guess they're using t_dd_dmatmp2.h. If you add a > > #if defined HAVE_ELTS > #error "HAVE_ELTS support is removed" > #endif > > do radeon and r200 still build? Yes, they only emit elts via t_dd_dmatmp2.h as you said. And do note that I didn't remove elt support entirely, I just killed off the more tricky code that emits elts from the vert paths. So the code can still handle elts just fine if VB->Elts is present. > > > Signed-off-by: Ville Syrjälä > > --- > > src/mesa/tnl_dd/t_dd_dmatmp.h | 133 > > ++ > > 1 file changed, 5 insertions(+), 128 deletions(-) > > > > diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h > > index 52ea2bf..5ea2d31 100644 > > --- a/src/mesa/tnl_dd/t_dd_dmatmp.h > > +++ b/src/mesa/tnl_dd/t_dd_dmatmp.h > > @@ -36,7 +36,7 @@ > > * Produces code for both inline triangles and indexed triangles. > > * Where various primitive types are unaccelerated by hardware, the > > * code attempts to fallback to other primitive types (quadstrips to > > - * tristrips, lineloops to linestrips), or to indexed vertices. > > + * tristrips, lineloops to linestrips). > > */ > > > > #if !defined(HAVE_TRIANGLES) > > @@ -444,65 +444,6 @@ static void TAG(render_quad_strip_verts)( struct > > gl_context *ctx, > >} > > > >FLUSH(); > > - > > - } else if (HAVE_TRI_STRIPS && > > - ctx->Light.ShadeModel == GL_FLAT && > > - TNL_CONTEXT(ctx)->vb.AttribPtr[_TNL_ATTRIB_COLOR0]->stride) { > > I don't think removing this whole block is right because !HAVE_ELTS is > an error case... but I guess we've never hit that error case since > HAVE_ELTS is always zero. There are no real error cases here, just dead code. validate_render() is supposed to make sure we never call these functions if the code can't actually render the primitives. The fprintf()+return branches should really just contain assert(0) or equivalent. > > > - if (HAVE_ELTS) { > > -LOCAL_VARS; > > -int dmasz = GET_SUBSEQUENT_VB_MAX_ELTS(); > > -int currentsz; > > -GLuint j, nr; > > - > > - EMIT_INDEXED_VERTS( ctx, start, count ); > > - > > -/* Simulate flat-shaded quadstrips using indexed vertices: > > - */ > > -ELT_INIT( GL_TRIANGLES ); > > - > > -currentsz = GET_CURRENT_VB_MAX_ELTS(); > > - > > -/* Emit whole number of quads in total, and in each buffer. > > - */ > > -dmasz -= dmasz & 1; > > -count -= (count-start) & 1; > > -currentsz -= currentsz & 1; > > - > > -if (currentsz < 12) > > - currentsz = dmasz; > > - > > -currentsz = currentsz/6*2; > > -dmasz = dmasz/6*2; > > - > > -for (j = start; j + 3 < count; j += nr - 2 ) { > > - nr = MIN2( currentsz, count - j ); > > - if (nr >= 4) { > > - GLint quads = (nr/2)-1; > > - GLint i; > > - ELTS_VARS( ALLOC_ELTS( quads*6 ) ); > > - > > - for ( i = j-start ; i < j-start+quads*2 ; i+=2 ) { > > - EMIT_TWO_ELTS( 0, (i+0), (i+1) ); > > - EMIT_TWO_ELTS( 2, (i+2), (i+1) ); > > - EMIT_TWO_ELTS( 4, (i+3), (i+2) ); > > - INCR_ELTS( 6 ); > > - } > > - > > - FLUSH(); > > - } > > - currentsz = dmasz; > > -} > > - > > -RELEASE_ELT_VERTS(); > > -FLUSH(); > > - } > > - else { > > -/* Vertices won't fit in a single buffer or elts not > > - * available - should never happen. > > - */ > > -fprintf(stderr, "%s - cannot draw primitive\n", __FUNCTION__); > > -return; > > - } > > } > > else if (HAVE_TRI_STRIPS) { > >LOCAL_VARS; > > @@ -568,57 +509,6 @@ static void TAG(render_quads_verts)( struct gl_context > > *ctx, > > currentsz = dmasz; > >} > > } > > - else if (HAVE_ELTS) { > > - /* Hardware doesn't have a quad primitive type -- try to > > - * simulate it using indexed vertices and the triangle > > - * primitive: > > - */ > > - LOCAL_VARS; > > - int dmasz = GET_SUBSEQUENT_VB_MAX_ELTS(); > > - int currentsz; > > - GLuint j, nr; > > - > > - EMIT_INDEXED_VERTS( ctx, start, count ); > > - > > - FLUSH(); > > - ELT_INIT( GL_TRIANGLES ); > > - currentsz = GET_CURRENT_VB_MAX_ELTS(); > > - > > - /* Emit whole number of quads in total, and in each buffer. > > - */ > > - dmasz -= dmasz & 3; > > - count -= (count-start) & 3; > > - currentsz -= currentsz & 3; > > - > > - /* Adjust for rendering as triangles: > > - */ > > - currentsz = currentsz/6*4; > > - d
Re: [Mesa-dev] [PATCH v2 03/14] nir: cleanup cf nodes earlier in nir_cf_node_remove()
Reviewed-by: Jason Ekstrand On Thu, May 21, 2015 at 9:40 AM, Connor Abbott wrote: > Before, when we were deleting a cf node that was a block, we were first > removing all the instructions and then calling cleanup_cf_node(), at > which point cleanup_cf_node() couldn't do its job. Just move it before > everything else, which should be ok for the non-block case too. > > v2: split out from previous commit > Signed-off-by: Connor Abbott > --- > src/glsl/nir/nir.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c > index 704553f..a2b5e7c 100644 > --- a/src/glsl/nir/nir.c > +++ b/src/glsl/nir/nir.c > @@ -1253,6 +1253,8 @@ nir_cf_node_remove(nir_cf_node *node) > nir_function_impl *impl = nir_cf_node_get_function(node); > nir_metadata_preserve(impl, nir_metadata_none); > > + cleanup_cf_node(node, impl); > + > if (node->type == nir_cf_node_block) { >/* > * Basic blocks can't really be removed by themselves, since they act > as > @@ -1274,8 +1276,6 @@ nir_cf_node_remove(nir_cf_node *node) >exec_node_remove(&node->node); >stitch_blocks(before_block, after_block); > } > - > - cleanup_cf_node(node); > } > > static bool > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 04/14] nir: properly clean up jumps when removing cf nodes
On Thu, May 21, 2015 at 9:55 AM, Matt Turner wrote: > On Thu, May 21, 2015 at 9:40 AM, Connor Abbott wrote: >> Before, we might have left dangling predecessors from jumps that were >> going to be removed. >> >> v2: split out from "nir: insert ssa_undef instructions when cleaning up >> defs/uses" >> >> Signed-off-by: Connor Abbott >> --- >> src/glsl/nir/nir.c | 10 -- >> 1 file changed, 8 insertions(+), 2 deletions(-) >> >> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c >> index a2b5e7c..dc6d63f 100644 >> --- a/src/glsl/nir/nir.c >> +++ b/src/glsl/nir/nir.c >> @@ -1214,9 +1214,15 @@ cleanup_cf_node(nir_cf_node *node, nir_function_impl >> *impl) >> switch (node->type) { >> case nir_cf_node_block: { >>nir_block *block = nir_cf_node_as_block(node); >> - /* We need to walk the instructions and clean up defs/uses */ >> - nir_foreach_instr(block, instr) >> + /* We need to walk the instructions and clean up defs/uses, >> + * as well as clean up any jumps to control flow that may not be >> getting >> + * deleted. > > I think you can line wrap this block a little better. Good idea. With that, Reviewed-by: Jason Ekstrand ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 5/5] nir/cse: use the instr_hash helper
This replaces an O(n^2) algorithm with an O(n) one, while allowing us to import most of the infrastructure required for GVN. The idea is to walk the dominance tree depth-first, similar when converting to SSA, and remove the instructions from the set when we're done visiting the sub-tree of the dominance tree so that the only instructions in the set are the instructions that dominate the current block. No piglit regressions. No changes in the result of the public shader-db. Difference at 95.0% confidence -0.998 +/- 0.312663 -5.96961% +/- 1.87022% (Student's t, pooled s = 0.332763) Signed-off-by: Connor Abbott --- src/glsl/nir/nir_opt_cse.c | 134 - 1 file changed, 24 insertions(+), 110 deletions(-) diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c index a6a4a21..7894147 100644 --- a/src/glsl/nir/nir_opt_cse.c +++ b/src/glsl/nir/nir_opt_cse.c @@ -22,10 +22,11 @@ * * Authors: *Jason Ekstrand (ja...@jlekstrand.net) + *Connor Abbott (cwabbo...@gmail.com) * */ -#include "nir.h" +#include "nir_instr_hash.h" /* * Implements common subexpression elimination @@ -33,141 +34,54 @@ struct cse_state { void *mem_ctx; + struct set *instr_set; bool progress; }; -static bool -src_is_ssa(nir_src *src, void *data) -{ - (void) data; - return src->is_ssa; -} - -static bool -dest_is_ssa(nir_dest *dest, void *data) -{ - (void) data; - return dest->is_ssa; -} +/* + * Visits and CSE's the given block and all its descendants in the dominance + * tree recursively. Note that the instr_set is guaranteed to only ever + * contain instructions that dominate the current block. + */ static bool -nir_instr_can_cse(nir_instr *instr) +cse_block(nir_block *block, struct set *instr_set) { - /* We only handle SSA. */ - if (!nir_foreach_dest(instr, dest_is_ssa, NULL) || - !nir_foreach_src(instr, src_is_ssa, NULL)) - return false; - - switch (instr->type) { - case nir_instr_type_alu: - case nir_instr_type_load_const: - case nir_instr_type_phi: - return true; - case nir_instr_type_tex: - return false; /* TODO */ - case nir_instr_type_intrinsic: { - const nir_intrinsic_info *info = - &nir_intrinsic_infos[nir_instr_as_intrinsic(instr)->intrinsic]; - return (info->flags & NIR_INTRINSIC_CAN_ELIMINATE) && - (info->flags & NIR_INTRINSIC_CAN_REORDER) && - info->num_variables == 0; /* not implemented yet */ - } - case nir_instr_type_call: - case nir_instr_type_jump: - case nir_instr_type_ssa_undef: - return false; - case nir_instr_type_parallel_copy: - default: - unreachable("Invalid instruction type"); - } - - return false; -} - -static nir_ssa_def * -nir_instr_get_dest_ssa_def(nir_instr *instr) -{ - switch (instr->type) { - case nir_instr_type_alu: - assert(nir_instr_as_alu(instr)->dest.dest.is_ssa); - return &nir_instr_as_alu(instr)->dest.dest.ssa; - case nir_instr_type_load_const: - return &nir_instr_as_load_const(instr)->def; - case nir_instr_type_phi: - assert(nir_instr_as_phi(instr)->dest.is_ssa); - return &nir_instr_as_phi(instr)->dest.ssa; - case nir_instr_type_intrinsic: - assert(nir_instr_as_intrinsic(instr)->dest.is_ssa); - return &nir_instr_as_intrinsic(instr)->dest.ssa; - default: - unreachable("We never ask for any of these"); - } -} + bool progress = false; -static void -nir_opt_cse_instr(nir_instr *instr, struct cse_state *state) -{ - if (!nir_instr_can_cse(instr)) - return; - - for (struct exec_node *node = instr->node.prev; -!exec_node_is_head_sentinel(node); node = node->prev) { - nir_instr *other = exec_node_data(nir_instr, node, node); - if (nir_instrs_equal(instr, other)) { - nir_ssa_def *other_def = nir_instr_get_dest_ssa_def(other); - nir_ssa_def_rewrite_uses(nir_instr_get_dest_ssa_def(instr), - nir_src_for_ssa(other_def), - state->mem_ctx); + nir_foreach_instr_safe(block, instr) { + if (nir_instr_set_add(instr_set, instr)) { + progress = true; nir_instr_remove(instr); - state->progress = true; - return; } } - for (nir_block *block = instr->block->imm_dom; -block != NULL; block = block->imm_dom) { - nir_foreach_instr_reverse(block, other) { - if (nir_instrs_equal(instr, other)) { -nir_ssa_def *other_def = nir_instr_get_dest_ssa_def(other); -nir_ssa_def_rewrite_uses(nir_instr_get_dest_ssa_def(instr), - nir_src_for_ssa(other_def), - state->mem_ctx); -nir_instr_remove(instr); -state->progress = true; -return; - } - } + for (unsigned i = 0; i < block->num_dom_children; i++) { + nir_block *child = block
[Mesa-dev] [PATCH 1/5] nir: split out instruction comparison functions
We'll want to use these outside of nir_opt_cse.c. Rather than adding yet another thing to the mess that is nir.c, create a new file and move nir_srcs_equal() there too. Signed-off-by: Connor Abbott --- src/glsl/Makefile.sources| 1 + src/glsl/nir/nir.c | 27 -- src/glsl/nir/nir.h | 1 + src/glsl/nir/nir_instr_compare.c | 179 +++ src/glsl/nir/nir_opt_cse.c | 121 -- 5 files changed, 181 insertions(+), 148 deletions(-) create mode 100644 src/glsl/nir/nir_instr_compare.c diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources index d784a81..75e5377 100644 --- a/src/glsl/Makefile.sources +++ b/src/glsl/Makefile.sources @@ -27,6 +27,7 @@ NIR_FILES = \ nir/nir_constant_expressions.h \ nir/nir_dominance.c \ nir/nir_from_ssa.c \ + nir/nir_instr_compare.c \ nir/nir_intrinsics.c \ nir/nir_intrinsics.h \ nir/nir_live_variables.c \ diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c index f03e80a..170807e 100644 --- a/src/glsl/nir/nir.c +++ b/src/glsl/nir/nir.c @@ -1785,33 +1785,6 @@ nir_src_as_const_value(nir_src src) return &load->value; } -bool -nir_srcs_equal(nir_src src1, nir_src src2) -{ - if (src1.is_ssa) { - if (src2.is_ssa) { - return src1.ssa == src2.ssa; - } else { - return false; - } - } else { - if (src2.is_ssa) { - return false; - } else { - if ((src1.reg.indirect == NULL) != (src2.reg.indirect == NULL)) -return false; - - if (src1.reg.indirect) { -if (!nir_srcs_equal(*src1.reg.indirect, *src2.reg.indirect)) - return false; - } - - return src1.reg.reg == src2.reg.reg && -src1.reg.base_offset == src2.reg.base_offset; - } - } -} - static bool src_is_valid(const nir_src *src) { diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 697d37e..f04195f 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1579,6 +1579,7 @@ bool nir_foreach_src(nir_instr *instr, nir_foreach_src_cb cb, void *state); nir_const_value *nir_src_as_const_value(nir_src src); bool nir_srcs_equal(nir_src src1, nir_src src2); +bool nir_instrs_equal(nir_instr *instr1, nir_instr *instr2); void nir_instr_rewrite_src(nir_instr *instr, nir_src *src, nir_src new_src); void nir_instr_move_src(nir_instr *dest_instr, nir_src *dest, nir_src *src); void nir_if_rewrite_condition(nir_if *if_stmt, nir_src new_src); diff --git a/src/glsl/nir/nir_instr_compare.c b/src/glsl/nir/nir_instr_compare.c new file mode 100644 index 000..89b576c --- /dev/null +++ b/src/glsl/nir/nir_instr_compare.c @@ -0,0 +1,179 @@ +/* + * Copyright © 2014 Intel Corporation + * Copyright © 2015 Connor Abbott + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + *Jason Ekstrand (ja...@jlekstrand.net) + *Connor Abbott (cwabbo...@gmail.com) + * + */ + +#include "nir.h" + +bool +nir_srcs_equal(nir_src src1, nir_src src2) +{ + if (src1.is_ssa) { + if (src2.is_ssa) { + return src1.ssa == src2.ssa; + } else { + return false; + } + } else { + if (src2.is_ssa) { + return false; + } else { + if ((src1.reg.indirect == NULL) != (src2.reg.indirect == NULL)) +return false; + + if (src1.reg.indirect) { +if (!nir_srcs_equal(*src1.reg.indirect, *src2.reg.indirect)) + return false; + } + + return src1.reg.reg == src2.reg.reg && +src1.reg.base_offset == src2.reg.base_offset; + } + } +} + + +static bool +nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1, + unsigned src2) +{ + if (alu1->src[src1].abs != alu2->src[src2].abs || + alu1->src[src1].negate != alu2->src[src2].neg
[Mesa-dev] [PATCH 4/5] nir: add a helper for finding duplicate instructions
This can be used for both CSE and value numbering. Signed-off-by: Connor Abbott --- src/glsl/Makefile.sources | 2 + src/glsl/nir/nir_instr_hash.c | 255 ++ src/glsl/nir/nir_instr_hash.h | 36 ++ 3 files changed, 293 insertions(+) create mode 100644 src/glsl/nir/nir_instr_hash.c create mode 100644 src/glsl/nir/nir_instr_hash.h diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources index 75e5377..fd10cdd 100644 --- a/src/glsl/Makefile.sources +++ b/src/glsl/Makefile.sources @@ -28,6 +28,8 @@ NIR_FILES = \ nir/nir_dominance.c \ nir/nir_from_ssa.c \ nir/nir_instr_compare.c \ + nir/nir_instr_hash.c \ + nir/nir_instr_hash.h \ nir/nir_intrinsics.c \ nir/nir_intrinsics.h \ nir/nir_live_variables.c \ diff --git a/src/glsl/nir/nir_instr_hash.c b/src/glsl/nir/nir_instr_hash.c new file mode 100644 index 000..d900b66 --- /dev/null +++ b/src/glsl/nir/nir_instr_hash.c @@ -0,0 +1,255 @@ +#include "nir_instr_hash.h" + +#define HASH(data) hash = _mesa_fnv32_1a_accumulate(hash, (data)) + +static uint32_t +hash_src(uint32_t hash, const nir_src *src) +{ + assert(src->is_ssa); + HASH(src->ssa); + return hash; +} + +static uint32_t +hash_alu_src(uint32_t hash, const nir_alu_src *src, unsigned num_components) +{ + HASH(src->abs); + HASH(src->negate); + + for (unsigned i = 0; i < num_components; i++) + HASH(src->swizzle[i]); + + hash = hash_src(hash, &src->src); + return hash; +} + +static uint32_t +hash_alu(uint32_t hash, const nir_alu_instr *instr) +{ + HASH(instr->op); + HASH(instr->dest.dest.ssa.num_components); + + if (nir_op_infos[instr->op].algebraic_properties & NIR_OP_IS_COMMUTATIVE) { + assert(nir_op_infos[instr->op].num_inputs == 2); + uint32_t hash0 = hash_alu_src(hash, &instr->src[0], +nir_ssa_alu_instr_src_components(instr, 0)); + uint32_t hash1 = hash_alu_src(hash, &instr->src[1], +nir_ssa_alu_instr_src_components(instr, 1)); + /* For commutative operations, we need some commutative way of + * combining the hashes. One option would be to XOR them but that + * means that anything with two identical sources will hash to 0 and + * that's common enough we probably don't want the guaranteed + * collision. Either addition or multiplication will also work. + */ + hash = hash0 * hash1; + } else { + for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) { + hash = hash_alu_src(hash, &instr->src[i], + nir_ssa_alu_instr_src_components(instr, i)); + } + } + + return hash; +} + +static uint32_t +hash_load_const(uint32_t hash, const nir_load_const_instr *instr) +{ + HASH(instr->def.num_components); + + hash = _mesa_fnv32_1a_accumulate_block(hash, instr->value.f, + instr->def.num_components + * sizeof(instr->value.f[0])); + + return hash; +} + +static int +cmp_phi_src(const void *data1, const void *data2) +{ + const nir_phi_src *src1 = data1, *src2 = data2; + return src1->pred->index - src2->pred->index; +} + +static uint32_t +hash_phi(uint32_t hash, const nir_phi_instr *instr) +{ + HASH(instr->instr.block); + + /* sort sources by predecessor index, since the order shouldn't matter */ + unsigned num_preds = instr->instr.block->predecessors->entries; + nir_phi_src *srcs = malloc(num_preds * sizeof(nir_phi_src)); + unsigned i = 0; + nir_foreach_phi_src(instr, src) { + srcs[i++] = *src; + } + qsort(srcs, num_preds, sizeof(nir_phi_src), cmp_phi_src); + for (i = 0; i < num_preds; i++) { + hash = hash_src(hash, &srcs[i].src); + HASH(srcs[i].pred); + } + free(srcs); + + return hash; +} + +static uint32_t +hash_intrinsic(uint32_t hash, const nir_intrinsic_instr *instr) +{ + const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic]; + HASH(instr->intrinsic); + + if (info->has_dest) + HASH(instr->dest.ssa.num_components); + + assert(info->num_variables == 0); + + hash = _mesa_fnv32_1a_accumulate_block(hash, instr->const_index, + info->num_indices + * sizeof(instr->const_index[0])); + return hash; +} + +static uint32_t +hash_instr(const void *data) +{ + const nir_instr *instr = data; + uint32_t hash = _mesa_fnv32_1a_offset_bias; + + switch (instr->type) { + case nir_instr_type_alu: + hash = hash_alu(hash, nir_instr_as_alu(instr)); + break; + case nir_instr_type_load_const: + hash = hash_load_const(hash, nir_instr_as_load_const(instr)); + break; + case nir_instr_type_phi: + hash = hash_phi(hash, nir_instr_as_phi(instr)); + break; + case nir_instr_type_intrinsic: + hash = hash_intrinsic(hash, nir_instr
[Mesa-dev] [PATCH 2/5] nir: constify nir_ssa_alu_instr_src_components()
Signed-off-by: Connor Abbott --- src/glsl/nir/nir.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index f04195f..e3b7b17 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -722,7 +722,7 @@ nir_alu_instr_channel_used(nir_alu_instr *instr, unsigned src, unsigned channel) * used for a source */ static inline unsigned -nir_ssa_alu_instr_src_components(nir_alu_instr *instr, unsigned src) +nir_ssa_alu_instr_src_components(const nir_alu_instr *instr, unsigned src) { assert(instr->dest.dest.is_ssa); -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/5] nir: constify instruction comparison functions
Signed-off-by: Connor Abbott --- src/glsl/nir/nir.h | 4 ++-- src/glsl/nir/nir_instr_compare.c | 8 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index e3b7b17..19b1e18 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1578,8 +1578,8 @@ bool nir_foreach_dest(nir_instr *instr, nir_foreach_dest_cb cb, void *state); bool nir_foreach_src(nir_instr *instr, nir_foreach_src_cb cb, void *state); nir_const_value *nir_src_as_const_value(nir_src src); -bool nir_srcs_equal(nir_src src1, nir_src src2); -bool nir_instrs_equal(nir_instr *instr1, nir_instr *instr2); +bool nir_srcs_equal(const nir_src src1, const nir_src src2); +bool nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2); void nir_instr_rewrite_src(nir_instr *instr, nir_src *src, nir_src new_src); void nir_instr_move_src(nir_instr *dest_instr, nir_src *dest, nir_src *src); void nir_if_rewrite_condition(nir_if *if_stmt, nir_src new_src); diff --git a/src/glsl/nir/nir_instr_compare.c b/src/glsl/nir/nir_instr_compare.c index 89b576c..07d0031 100644 --- a/src/glsl/nir/nir_instr_compare.c +++ b/src/glsl/nir/nir_instr_compare.c @@ -30,7 +30,7 @@ #include "nir.h" bool -nir_srcs_equal(nir_src src1, nir_src src2) +nir_srcs_equal(const nir_src src1, const nir_src src2) { if (src1.is_ssa) { if (src2.is_ssa) { @@ -58,8 +58,8 @@ nir_srcs_equal(nir_src src1, nir_src src2) static bool -nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1, - unsigned src2) +nir_alu_srcs_equal(const nir_alu_instr *alu1, const nir_alu_instr *alu2, + unsigned src1, unsigned src2) { if (alu1->src[src1].abs != alu2->src[src2].abs || alu1->src[src1].negate != alu2->src[src2].negate) @@ -74,7 +74,7 @@ nir_alu_srcs_equal(nir_alu_instr *alu1, nir_alu_instr *alu2, unsigned src1, } bool -nir_instrs_equal(nir_instr *instr1, nir_instr *instr2) +nir_instrs_equal(const nir_instr *instr1, const nir_instr *instr2) { if (instr1->type != instr2->type) return false; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/5] NIR CSE performance improvements
This series implements an API that lets us use a hash set in the CSE pass, which fixes the performance problem while getting most of the code for Global Value Numbering (GVN) on top of GCM into the tree and actively used. The first three patches are taken from Jason's rebase of my GVN series, and patch 4 is modified from the original patch 4 to expose a slightly lower-level API (details in the patch itself). Patch 5 then rewrites the CSE pass to use the API introduced in patch 4. Note that the API could be just as easily adapted to implement nir_value_number() from the original patch 4. Perf numbers are in patch 5, but for those that prefer pretty pictures, I've uploaded a few graphs: http://people.freedesktop.org/~cwabbott0/cse-series/perf-before.svg http://people.freedesktop.org/~cwabbott0/cse-series/perf-after.svg where you can clearly see how CSE goes from taking up almost all the time spent in nir_optimize() to less than half (but still a significant amount). This series is also available at git://people.freedesktop.org/~cwabbott0/mesa nir-cse-hash Connor Abbott (5): nir: split out instruction comparison functions nir: constify nir_ssa_alu_instr_src_components() nir: constify instruction comparison functions nir: add a helper for finding duplicate instructions nir/cse: use the instr_hash helper src/glsl/Makefile.sources| 3 + src/glsl/nir/nir.c | 27 - src/glsl/nir/nir.h | 5 +- src/glsl/nir/nir_instr_compare.c | 179 +++ src/glsl/nir/nir_instr_hash.c| 255 +++ src/glsl/nir/nir_instr_hash.h| 36 ++ src/glsl/nir/nir_opt_cse.c | 255 --- 7 files changed, 500 insertions(+), 260 deletions(-) create mode 100644 src/glsl/nir/nir_instr_compare.c create mode 100644 src/glsl/nir/nir_instr_hash.c create mode 100644 src/glsl/nir/nir_instr_hash.h -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glx: fix Scons build
Replace -h with --header-tag as was done for the Makefile build. --- src/glx/SConscript | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glx/SConscript b/src/glx/SConscript index b91c0bd..619e4c3 100644 --- a/src/glx/SConscript +++ b/src/glx/SConscript @@ -125,7 +125,7 @@ env.CodeGenerate( target = 'indirect_size.h', script = GLAPI + 'gen/glX_proto_size.py', source = sources, -command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set -h _INDIRECT_SIZE_H > $TARGET' +command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set --header-tag _INDIRECT_SIZE_H > $TARGET' ) env.CodeGenerate( -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glapi: fix scons build
The arguments for glX_proto_size.py changed slightly, the '-h' short option was removed, because argparse reserves that for help messages. The auto-tools based build was already updated to account for this change, but the scons build was not. Signed-off-by: Dylan Baker --- The Scons build is broken for me before I get here, someone could test this that would be great. I don't have push access. src/glx/SConscript | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glx/SConscript b/src/glx/SConscript index b91c0bd..619e4c3 100644 --- a/src/glx/SConscript +++ b/src/glx/SConscript @@ -125,7 +125,7 @@ env.CodeGenerate( target = 'indirect_size.h', script = GLAPI + 'gen/glX_proto_size.py', source = sources, -command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set -h _INDIRECT_SIZE_H > $TARGET' +command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set --header-tag _INDIRECT_SIZE_H > $TARGET' ) env.CodeGenerate( -- 2.4.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glx: fix Scons build
Well, I wrote the exact same patch and sent it a few seconds ago before I noticed this. Sorry to break things. Reviewed-by: Dylan Baker On Fri, May 22, 2015 at 01:23:52PM -0700, Brian Paul wrote: > Replace -h with --header-tag as was done for the Makefile build. > --- > src/glx/SConscript | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/glx/SConscript b/src/glx/SConscript > index b91c0bd..619e4c3 100644 > --- a/src/glx/SConscript > +++ b/src/glx/SConscript > @@ -125,7 +125,7 @@ env.CodeGenerate( > target = 'indirect_size.h', > script = GLAPI + 'gen/glX_proto_size.py', > source = sources, > -command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set -h > _INDIRECT_SIZE_H > $TARGET' > +command = python_cmd + ' $SCRIPT -f $SOURCE -m size_h --only-set > --header-tag _INDIRECT_SIZE_H > $TARGET' > ) > > env.CodeGenerate( > -- > 1.9.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: Digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] xlib: fix X_GLXCreateContextAtrribs/Attribs typo
In case the glproto.h file isn't up to date, we provide the #define for X_GLXCreateContextAttribsARB. --- src/gallium/state_trackers/glx/xlib/glx_api.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c b/src/gallium/state_trackers/glx/xlib/glx_api.c index 0508255..ca86aea 100644 --- a/src/gallium/state_trackers/glx/xlib/glx_api.c +++ b/src/gallium/state_trackers/glx/xlib/glx_api.c @@ -40,6 +40,12 @@ #include "xm_api.h" +/* An "Atrribs/Attribs" typo was fixed in glxproto.h in Nov 2014. + * This is in case we don't have the updated header. + */ +#ifndef X_GLXCreateContextAttribsARB +#define X_GLXCreateContextAttribsARB 34 +#endif /* This indicates the client-side GLX API and GLX encoder version. */ #define CLIENT_MAJOR_VERSION 1 @@ -2168,7 +2174,7 @@ glXQueryDrawable(Display *dpy, GLXDrawable draw, int attribute, #endif default: - generate_error(dpy, BadValue, 0, X_GLXCreateContextAtrribsARB, true); + generate_error(dpy, BadValue, 0, X_GLXCreateContextAttribsARB, true); return; } } -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 21/22] i965/gen9: Plugin the code for selecting YF/YS tiling on skl+
On Thu, May 21, 2015 at 4:33 PM, Ben Widawsky wrote: > On Fri, Apr 17, 2015 at 04:51:42PM -0700, Anuj Phogat wrote: >> Note: Yf/Ys tiling stays disabled to avoid any piglit regressions. I'm >> working on fixing the remaining piglit failures. >> >> We need to do some benchmarking to come up with conditions to choose >> Ys (64 KB) over Yf (4 KB). Any thoughts on how big a texture should >> be so that 64 KB tiling is preferred over 4KB? >> >> Signed-off-by: Anuj Phogat >> --- >> src/mesa/drivers/dri/i965/brw_tex_layout.c | 81 >> ++ >> 1 file changed, 81 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> index 28927e9..b2a2dac 100644 >> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c >> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c >> @@ -684,6 +684,65 @@ intel_miptree_total_width_height(struct brw_context >> *brw, >> } >> } >> >> +static bool >> +intel_miptree_choose_tr_mode(struct brw_context *brw, > > Either rename this function, or make it actually choose the tr_mode. > >> + mesa_format format, >> + uint32_t width0, >> + uint32_t num_samples, >> + enum intel_miptree_tiling_mode requested, >> + struct intel_mipmap_tree *mt, >> + uint32_t tr_mode) > > As I state above and below, I don't think tr_mode should be passed in here. > This > function is supposed to "choose" the mode. > Got it. >> +{ >> + const unsigned bpp = mt->cpp * 8; >> + >> + /* bpp must be power of 2. */ >> + if (!mt->compressed && >> + _mesa_is_format_color_format(mt->format) && > > I am not finding the reason for the color format restriction. I believe it, > just > not seeing it... > I couldn't find it either. Found this in my notes: Enable YF/YS tiling only for color buffers because depth and stencil buffers are not supported in XY_FAST_COPY_BLT and meta_pbo_{TexSubImage, GetTexSubImage} paths. These are the only paths enabled by this series to handle Yf/Ys surfaces. I think we can enable them for depth and stencil if we implement the tiled_memcpy() paths to do the software decoding of Yf/Ys tiled surfaces. I'll add a comment here to explain the restriction. >> + (requested == INTEL_MIPTREE_TILING_Y || >> +requested == INTEL_MIPTREE_TILING_ANY) && >> + (bpp && (bpp & (bpp - 1)) == 0)) { >> + > > is_power_of_2 > yes. > > How about something like... > > /* Low bits are the highest priority modes */ > modes = INTEL_MIPTREE_TRMODE_YS | INTEL_MIPTREE_TRMODE_YF; > while ((modes >>= 1) & 1) { > ... try to allocate > } > Yes. This looks better. We can later add conditions here to set the priority of Yf/Ys based on buffer size, alignment etc. >> + mt->tr_mode = tr_mode; > > you're leaking tr_mode state here if you fail below. > I'll reset it to TRMODE_NONE at the bottom of function. This patch does it outside this function in the fallback path. >> + mt->align_w = intel_horizontal_texture_alignment_unit(brw, mt); >> + mt->align_h = intel_vertical_texture_alignment_unit(brw, mt); >> + >> + intel_miptree_total_width_height(brw, mt); >> + >> + /* pitch == 0 || height == 0 indicates the null texture */ > > What is the null texture? Copy-pasted comment. I'll get rid of it. > >> + if (!mt || !mt->total_width || !mt->total_height) { >> + intel_miptree_release(&mt); >> + return false; >> + } >> + >> + mt->tiling = intel_miptree_choose_tiling(brw, format, width0, >> + num_samples, >> + requested, mt); >> + >> + if (mt->tiling == I915_TILING_Y || >> + mt->tiling == (I915_TILING_Y | I915_TILING_X)) { >> + >> + /* Don't allow YS tiling at the moment. Using 64KB tiling for small >> + * textures might result in to wastage of memory. >> + * FIXME: Revisit this condition when we have more information >> about >> + * the specific cases where using YS over YF will be useful. >> + */ >> + if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YF) >> +return true; >> + } >> + >> + /* Can't use requested tr_mode. Free up the memory allocated for >> + * miptree levels in intel_miptree_total_width_height(). >> + */ >> + unsigned level; >> + for (level = mt->first_level; level <= mt->last_level; level++) { >> + free(mt->level[level].slice); >> + mt->level[level].slice = NULL; >> + } > > intel_miptree_release(&mt); ? Either that or remove it above. Seems like > returning false should always do the same thing. > intel_miptree_release() does more than just freeing up the allocated slices, which we don't want to do here. We still want to keep the allocated miptree and retry with differen
Re: [Mesa-dev] [PATCH V2 22/22] i965/gen9: Disable Mip Tail for YF/YS tiled surfaces
On Thu, May 21, 2015 at 4:55 PM, Ben Widawsky wrote: > On Fri, Apr 17, 2015 at 04:51:43PM -0700, Anuj Phogat wrote: >> This fixed the buffer corruption happening in a FBO which use YF/YS >> tiled renderbuffer or texture as color attachment. >> >> Spec recommends disabling mip tails for non-mip-mapped surfaces. >> But, with this enabled I couldn't get correct data out of YF/YS >> tiled surface. I get the expected data with this disabled. > > The docs do say to disable "to disable the Mip Tail this field must be set to > a > mip that larger than those present in the surface" So I don't see why you > said, > "But". The docs also say you must set this field when trmode isn't None. > > In other words, if you don't want to use mip tails, you are using the correct > mechanism to disable it. > I agree I'm using the correct way of disabling it. But my point in above comment is that there is no restriction in the docs saying "always disable the miptails". There might be some advantage of enabling them "by setting value < 15" which I don't understand currently. Maybe I need to bump up my comment. >> >> I haven't spent any time trying to understand miptails. So, I'm >> not sure if this patch is the right thing to do. But, It helps >> move things forward at the moment. > > I don't understand them either, but I think your code is fine. Maybe add an > assert that max level is < 15? > > So with some explanation in the commit message that this disables mip tails, > this is: > Reviewed-by: Ben Widawsky > >> >> Signed-off-by: Anuj Phogat >> --- >> src/mesa/drivers/dri/i965/brw_defines.h| 3 +++ >> src/mesa/drivers/dri/i965/gen8_surface_state.c | 10 -- >> 2 files changed, 11 insertions(+), 2 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h >> b/src/mesa/drivers/dri/i965/brw_defines.h >> index c62c09b..1c50172 100644 >> --- a/src/mesa/drivers/dri/i965/brw_defines.h >> +++ b/src/mesa/drivers/dri/i965/brw_defines.h >> @@ -594,6 +594,9 @@ >> #define GEN9_SURFACE_TRMODE_TILEYF 1 >> #define GEN9_SURFACE_TRMODE_TILEYS 2 >> >> +#define GEN9_SURFACE_MIP_TAIL_START_LOD_SHIFT 8 >> +#define GEN9_SURFACE_MIP_TAIL_START_LOD_MASK INTEL_MASK(11, 8) >> + >> /* Surface state DW6 */ >> #define GEN7_SURFACE_MCS_ENABLE (1 << 0) >> #define GEN7_SURFACE_MCS_PITCH_SHIFT3 >> diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c >> b/src/mesa/drivers/dri/i965/gen8_surface_state.c >> index 189f1db..155563f 100644 >> --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c >> +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c >> @@ -258,8 +258,11 @@ gen8_update_texture_surface(struct gl_context *ctx, >> GEN7_SURFACE_MIN_LOD) | >> (intelObj->_MaxLevel - tObj->BaseLevel); /* mip count */ >> >> - if (brw->gen >= 9) >> + if (brw->gen >= 9) { >>surf[5] |= SET_FIELD(tr_mode, GEN9_SURFACE_TRMODE); >> + /* Disable Mip Tail by setting a large value. */ >> + surf[5] |= SET_FIELD(15, GEN9_SURFACE_MIP_TAIL_START_LOD); >> + } >> >> if (aux_mt) { >>surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) | >> @@ -446,8 +449,11 @@ gen8_update_renderbuffer_surface(struct brw_context >> *brw, >> >> surf[5] = irb->mt_level - irb->mt->first_level; >> >> - if (brw->gen >= 9) >> + if (brw->gen >= 9) { >>surf[5] |= SET_FIELD(tr_mode, GEN9_SURFACE_TRMODE); >> + /* Disable Mip Tail by setting a large value. */ >> + surf[5] |= SET_FIELD(15, GEN9_SURFACE_MIP_TAIL_START_LOD); >> + } >> >> if (aux_mt) { >>surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) | >> -- >> 2.3.4 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > -- > Ben Widawsky, Intel Open Source Technology Center ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'
https://bugs.freedesktop.org/show_bug.cgi?id=90600 Bug ID: 90600 Summary: IOError: [Errno 2] No such file or directory: 'gl_API.xml' Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Keywords: regression Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: v...@freedesktop.org QA Contact: mesa-dev@lists.freedesktop.org CC: baker.dyla...@gmail.com mesa: 491adb61d25eef8afe2615e0fd842dda20b17004 (master 10.7.0-devel) CentOS 6 build error Generating build/linux-x86_64-debug/mesa/main/dispatch.h ... Traceback (most recent call last): File "src/mapi/glapi/gen/gl_table.py", line 244, in main() File "src/mapi/glapi/gen/gl_table.py", line 230, in main args = _parser() File "src/mapi/glapi/gen/gl_table.py", line 225, in _parser return parser.parse_args() File "/usr/lib/python2.6/site-packages/argparse.py", line 1703, in parse_args args, argv = self.parse_known_args(args, namespace) File "/usr/lib/python2.6/site-packages/argparse.py", line 1725, in parse_known_args default = self._get_value(action, default) File "/usr/lib/python2.6/site-packages/argparse.py", line 2248, in _get_value result = type_func(arg_string) File "src/mapi/glapi/gen/gl_XML.py", line 42, in parse_GL_API api.parse_file( file_name ) File "src/mapi/glapi/gen/gl_XML.py", line 899, in parse_file doc = ET.parse( file_name ) File "/usr/lib64/python2.6/xml/etree/ElementTree.py", line 862, in parse tree.parse(source, parser) File "/usr/lib64/python2.6/xml/etree/ElementTree.py", line 579, in parse source = open(source, "rb") IOError: [Errno 2] No such file or directory: 'gl_API.xml' scons: *** [build/linux-x86_64-debug/mesa/main/dispatch.h] Error 1 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'
https://bugs.freedesktop.org/show_bug.cgi?id=90600 --- Comment #1 from Brian Paul --- Vinson, what does your scons command look like? I haven't been able to reproduce this here. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'
https://bugs.freedesktop.org/show_bug.cgi?id=90600 --- Comment #2 from Vinson Lee --- (In reply to Brian Paul from comment #1) > Vinson, what does your scons command look like? > I haven't been able to reproduce this here. It is just 'scons', no arguments. This is the final build command that fails. /usr/bin/python src/mapi/glapi/gen/gl_table.py -m remap_table -f src/mapi/glapi/gen/gl_and_es_API.xml -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'
https://bugs.freedesktop.org/show_bug.cgi?id=90600 --- Comment #3 from Ilia Mirkin --- IIRC Dylan even mentioned this would break on python2.6 (without the extra argument parser module). I'm not so eager to drop support for py2.6, not sure why this was done =/ -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'
https://bugs.freedesktop.org/show_bug.cgi?id=90600 --- Comment #4 from Vinson Lee --- Build is also broken on Ubuntu 12.04 and it has Python 2.7. Generating build/linux-x86_64-debug/mapi/glapi/glapitable.h ... Traceback (most recent call last): File "src/mapi/glapi/gen/gl_table.py", line 244, in main() File "src/mapi/glapi/gen/gl_table.py", line 230, in main args = _parser() File "src/mapi/glapi/gen/gl_table.py", line 225, in _parser return parser.parse_args() File "/usr/lib/python2.7/argparse.py", line 1688, in parse_args args, argv = self.parse_known_args(args, namespace) File "/usr/lib/python2.7/argparse.py", line 1710, in parse_known_args default = self._get_value(action, default) File "/usr/lib/python2.7/argparse.py", line 2233, in _get_value result = type_func(arg_string) File "src/mapi/glapi/gen/gl_XML.py", line 42, in parse_GL_API api.parse_file( file_name ) File "src/mapi/glapi/gen/gl_XML.py", line 899, in parse_file doc = ET.parse( file_name ) File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1183, in parse tree.parse(source, parser) File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 647, in parse source = open(source, "rb") IOError: [Errno 2] No such file or directory: 'gl_API.xml' -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 90600] IOError: [Errno 2] No such file or directory: 'gl_API.xml'
https://bugs.freedesktop.org/show_bug.cgi?id=90600 --- Comment #5 from Ilia Mirkin --- (In reply to Ilia Mirkin from comment #3) > IIRC Dylan even mentioned this would break on python2.6 (without the extra > argument parser module). > > I'm not so eager to drop support for py2.6, not sure why this was done =/ While everything above is true, it's also a bit misleading as I just realized (sorry). Looks like your CentOS box has the separate argparse module as well. If you didn't have it, the errors would be different. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 17/22] i965/gen9: Add XY_FAST_COPY_BLT support to intelEmitCopyBlit()
On Thu, May 21, 2015 at 5:46 PM, Anuj Phogat wrote: > On Thu, May 21, 2015 at 12:26 PM, Ben Widawsky wrote: >> A lot of opinion stuff is below, feel free to ignore them if you don't think >> there are improvements. >> >> On Fri, Apr 17, 2015 at 04:51:38PM -0700, Anuj Phogat wrote: >>> This patch enables using XY_FAST_COPY_BLT only for Yf/Ys tiled buffers. >>> Later It can be turned on for other tiling patterns (X,Y). >>> >>> Signed-off-by: Anuj Phogat >>> --- >>> src/mesa/drivers/dri/i965/intel_blit.c | 292 >>> +++ >>> src/mesa/drivers/dri/i965/intel_blit.h | 3 + >>> src/mesa/drivers/dri/i965/intel_copy_image.c | 3 + >>> src/mesa/drivers/dri/i965/intel_reg.h| 33 +++ >>> 4 files changed, 291 insertions(+), 40 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/intel_blit.c >>> b/src/mesa/drivers/dri/i965/intel_blit.c >>> index 9500bd7..36746c4 100644 >>> --- a/src/mesa/drivers/dri/i965/intel_blit.c >>> +++ b/src/mesa/drivers/dri/i965/intel_blit.c >>> @@ -43,6 +43,23 @@ >>> >>> #define FILE_DEBUG_FLAG DEBUG_BLIT >>> >>> +#define SET_TILING_XY_FAST_COPY_BLT(tiling, tr_mode, type) \ >>> +({ \ >>> + switch (tiling) { \ >>> + case I915_TILING_X: \ >>> + CMD |= type ## _TILED_X; \ >>> + break; \ >>> + case I915_TILING_Y: \ >> >> assert(tr_mode != INTEL_MIPTREE_TRMODE_YF); ? >> > INTEL_MIPTREE_TRMODE_{YF, NONE} are allowed and covered in else case. > >>> + if (tr_mode == INTEL_MIPTREE_TRMODE_YS)\ >>> + CMD |= type ## _TILED_64K; \ >>> + else \ >>> + CMD |= type ## _TILED_Y;\ >>> + break; \ >>> + default: \ >>> + unreachable("not reached");\ >>> + } \ >>> +}) >>> + >>> static void >>> intel_miptree_set_alpha_to_one(struct brw_context *brw, >>> struct intel_mipmap_tree *mt, >>> @@ -75,6 +92,12 @@ static uint32_t >>> br13_for_cpp(int cpp) >>> { >>> switch (cpp) { >>> + case 16:nn >>> + return BR13_32323232; >>> + break; >>> + case 8: >>> + return BR13_16161616; >>> + break; >>> case 4: >>>return BR13_; >>>break; >>> @@ -89,6 +112,132 @@ br13_for_cpp(int cpp) >>> } >>> } >>> >>> +static uint32_t >>> +get_tr_horizontal_align(uint32_t tr_mode, uint32_t cpp, bool is_src) { >>> + /*Alignment tables for YF/YS tiled surfaces. */ >>> + const uint32_t align_2d_yf[] = {64, 64, 32, 32, 16}; >>> + const uint32_t align_2d_ys[] = {256, 256, 128, 128, 64}; >>> + const uint32_t bpp = cpp * 8; >>> + uint32_t align; >>> + int i = 0; >>> + >>> + if (tr_mode == INTEL_MIPTREE_TRMODE_NONE) >>> + return 0; >>> + >>> + /* Compute array index. */ >>> + assert (bpp >= 8 && bpp <= 128 && (bpp & (bpp - 1)) == 0); >> >> assert(bpp >= 8 && bpp <= 128 && _mesa_bitcount(bpp) == 1); >> >> (I couldn't find a is_pow2, but one must exist). >> > There is a is_power_of_two() in main/macros.h. I'll use that here. > >>> + while (bpp >> (i + 4)) >>> + i++; >>> + >> Since you just asserted this was a power of 2 above, isn't this just: >> ffs(bpp/8) - 1; >> > Yes, ffs() will work too. I'll use it. > >>> + if (tr_mode == INTEL_MIPTREE_TRMODE_YF) >>> + align = align_2d_yf[i]; >>> + else >>> + align = align_2d_ys[i]; >>> + >>> + switch(align) { >>> + case 512: >>> + return is_src ? XY_SRC_H_ALIGN_512 : XY_DST_H_ALIGN_512; >>> + case 256: >>> + return is_src ? XY_SRC_H_ALIGN_256 : XY_DST_H_ALIGN_256; >>> + case 128: >>> + return is_src ? XY_SRC_H_ALIGN_128 : XY_DST_H_ALIGN_128; >>> + case 64: >>> + return is_src ? XY_SRC_H_ALIGN_64 : XY_DST_H_ALIGN_64; >>> + case 32: >>> + /* XY_FAST_COPY_BLT doesn't support horizontal alignment of 16. */ >>> + case 16: >>> + return is_src ? XY_SRC_H_ALIGN_32 : XY_DST_H_ALIGN_32; >>> + default: >>> + unreachable("not reached"); >>> + } >>> +} >>> + >>> +static uint32_t >>> +get_tr_vertical_align(uint32_t tr_mode, uint32_t cpp, bool is_src) { >>> + /* Vertical alignment tables for YF/YS tiled surfaces. */ >>> + const unsigned align_2d_yf[] = {64, 32, 32, 16, 16}; >>> + const unsigned align_2d_ys[] = {256, 128, 128, 64, 64}; >>> + const uint32_t bpp = cpp * 8; >>> + uint32_t align; >>> + int i = 0; >>> + >>> + if (tr_mode == INTEL_MIPTREE_TRMODE_NONE)
[Mesa-dev] [PATCH] nv50/ir: avoid messing up arg1 of PFETCH
There can be scenarios where the "indirect" arg of a PFETCH becomes known, and so the code will attempt to propagate it. Use this opportunity to just fold it into the first argument, and prevent the load propagation pass from touching PFETCH further. This fixes gs-input-array-vec4-index-rd.shader_test and vs-output-array-vec4-index-wr-before-gs.shader_test on nvc0 at least. Signed-off-by: Ilia Mirkin Cc: "10.5 10.6" --- src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp index 72dd31e..98e3d1f 100644 --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp @@ -236,6 +236,9 @@ LoadPropagation::visit(BasicBlock *bb) if (i->op == OP_CALL) // calls have args as sources, they must be in regs continue; + if (i->op == OP_PFETCH) // pfetch expects arg1 to be a reg + continue; + if (i->srcExists(1)) checkSwapSrc01(i); @@ -581,6 +584,11 @@ ConstantFolding::expr(Instruction *i, case OP_POPCNT: res.data.u32 = util_bitcount(a->data.u32 & b->data.u32); break; + case OP_PFETCH: + // The two arguments to pfetch are logically added together. Normally + // the second argument will not be constant, but that can happen. + res.data.u32 = a->data.u32 + b->data.u32; + break; default: return; } @@ -610,6 +618,8 @@ ConstantFolding::expr(Instruction *i, bld.setPosition(i, false); i->setSrc(1, bld.loadImm(NULL, res.data.u32)); } + } else if (i->op == OP_PFETCH) { + // Leave PFETCH alone... we just folded its 2 args into 1. } else { i->op = i->saturate ? OP_SAT : OP_MOV; /* SAT handled by unary() */ } -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev