Re: [Mesa-dev] [PATCH 4/9] i965: remove unused members in blorp clear program
On Thu, Nov 28, 2013 at 11:50:24PM -0800, Kenneth Graunke wrote:
> On 11/28/2013 11:44 PM, Kenneth Graunke wrote:
> > On 11/27/2013 01:13 PM, Topi Pohjolainen wrote:
> >> Documentation for R0 and R1 is taken from
> >> fs_visitor::setup_payload_gen6().
> >>
> >> Signed-off-by: Topi Pohjolainen
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_blorp_clear.cpp | 15 +++
> >>  1 file changed, 3 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> >> b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> >> index 2fa0b50..a937edb 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp
> >> @@ -104,12 +104,6 @@ private:
> >>     const brw_blorp_const_color_prog_key *key;
> >>     struct brw_compile func;
> >>
> >> -   /* Thread dispatch header */
> >> -   struct brw_reg R0;
> >> -
> >> -   /* Pixel X/Y coordinates (always in R1). */
> >> -   struct brw_reg R1;
> >> -
> >
> > I like the fact that you're getting rid of these.  They're trivially
> > reconstructable as brw_vec8_grf(0, 0) and brw_vec8_grf(1, 0), so storing
> > them isn't really worthwhile.
> >
> >>     /* Register with push constants (a single vec4) */
> >>     struct brw_reg clear_rgba;
> >>
> >> @@ -123,8 +117,6 @@
> >> brw_blorp_const_color_program::brw_blorp_const_color_program(
> >>     : mem_ctx(ralloc_context(NULL)),
> >>       brw(brw),
> >>       key(key),
> >> -     R0(),
> >> -     R1(),
> >>       clear_rgba(),
> >>       base_mrf(0)
> >>  {
> >> @@ -363,11 +355,8 @@ brw_blorp_const_color_params::get_wm_prog(struct
> >> brw_context *brw,
> >>  void
> >>  brw_blorp_const_color_program::alloc_regs()
> >>  {
> >> -   int reg = 0;
> >> -   this->R0 = retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW);
> >> -   this->R1 = retype(brw_vec8_grf(reg++, 0), BRW_REGISTER_TYPE_UW);
> >> +   int reg = prog_data.first_curbe_grf;
> >
> > This line doesn't make sense to me.  This function is what sets
> > prog_data.first_curbe_grf.  Presumably you want:
> >
> >    /* Reserve space for g0 and g1, which contain the thread payload. */
> >    int reg = 2;
> >
> > Otherwise, I think the line below will change clear_rgba to g0 rather
> > than g2, which is probably not what you want.  Maybe I'm misreading.
>
> Indeed, I can't read, sorry.  You changed it so that compile() sets
> first_curbe_grf, and alloc_regs() uses it.  So your patch is correct.
>
> Still, I think it would probably be nicer to keep it all contained in
> alloc_regs().  It's such a tiny function though...

And thrown away when the replicated path gets converted. I thought it would be
simpler to move the assignment to the final location already here.

>
> >
> >> -   prog_data.first_curbe_grf = reg;
> >>     clear_rgba = retype(brw_vec4_grf(reg++, 0), BRW_REGISTER_TYPE_F);
> >>     reg += BRW_BLORP_NUM_PUSH_CONST_REGS;
> >>
> >> @@ -384,6 +373,8 @@ brw_blorp_const_color_program::compile(struct
> >> brw_context *brw,
> >>     /* Set up prog_data */
> >>     memset(&prog_data, 0, sizeof(prog_data));
> >>     prog_data.persample_msaa_dispatch = false;
> >> +   /* R0-1: masks, pixel X/Y coordinates. */
> >> +   prog_data.first_curbe_grf = 2;
> >>
> >>     alloc_regs();
> >
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
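For anyone following the register numbering in this thread, here is a minimal standalone sketch in C of what alloc_regs() ends up doing once compile() has set first_curbe_grf. The struct and function names are hypothetical, the real code works on struct brw_reg, and the push constant register count is passed in as a parameter because the value of BRW_BLORP_NUM_PUSH_CONST_REGS is not shown in the quoted patch.

```c
/* Hypothetical, simplified model of the register numbering discussed in
 * this thread; it does not use the real brw_reg helpers. */
struct blorp_clear_regs_sketch {
   int first_curbe_grf;  /* first register holding push constants */
   int clear_rgba;       /* register assigned to the clear color */
   int next_free;        /* first register after the push constants */
};

static struct blorp_clear_regs_sketch
alloc_regs_sketch(int first_curbe_grf, int num_push_const_regs)
{
   struct blorp_clear_regs_sketch regs;
   int reg = first_curbe_grf;

   regs.first_curbe_grf = first_curbe_grf;
   /* Same order as the quoted patch: take one register for clear_rgba,
    * then skip past the push constant registers. */
   regs.clear_rgba = reg++;
   reg += num_push_const_regs;
   regs.next_free = reg;
   return regs;
}
```

With first_curbe_grf = 2, as compile() sets it in the patch, clear_rgba lands in g2 and g0/g1 stay reserved for the thread payload. Starting from 0 instead would put clear_rgba in g0, which is the concern raised in the review.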
[Mesa-dev] [PATCH 0/6] Add support for 'sample' qualifier
This series adds the mesa and glsl compiler support for the new 'sample in' and 'sample out' qualifiers from GLSL 4.0 / ARB_gpu_shader5. Driver support (beyond triggering per-sample fragment shader evaluation) is not yet implemented.
[Mesa-dev] [PATCH 4/6] mesa: add IsSample bitfield to gl_fragment_program
Drivers will need to look at this to decide if they need to do
per-sample fragment shader dispatch.

Signed-off-by: Chris Forbes
---
 src/mesa/main/mtypes.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b4b432f..4698700 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2132,6 +2132,12 @@ struct gl_fragment_program
     * uses centroid interpolation, 0 otherwise.  Unused inputs are 0.
     */
    GLbitfield64 IsCentroid;
+
+   /**
+    * Bitfield indicating, for each fragment shader input, 1 if that input
+    * uses sample interpolation, 0 otherwise.  Unused inputs are 0.
+    */
+   GLbitfield64 IsSample;
 };

-- 
1.8.4.2
[Mesa-dev] [PATCH 3/6] glsl: Put `sample`-qualified varyings in their own packing classes
Signed-off-by: Chris Forbes
---
 src/glsl/link_varyings.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index d2a4fc8..097cee5 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -887,7 +887,7 @@ varying_matches::compute_packing_class(ir_variable *var)
     *
     * Therefore, the packing class depends only on the interpolation type.
     */
-   unsigned packing_class = var->centroid ? 1 : 0;
+   unsigned packing_class = var->centroid | (var->sample << 1);
    packing_class *= 4;
    packing_class += var->interpolation;
    return packing_class;
-- 
1.8.4.2
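To see how the new expression keeps centroid- and sample-qualified varyings in separate packing classes, here is a minimal standalone sketch in C. The struct and function names are hypothetical stand-ins for ir_variable and varying_matches::compute_packing_class(); only the arithmetic is taken from the patch, and the 2-bit width of the interpolation field is an assumption that matches the multiply by 4.

```c
/* Hypothetical stand-in for the ir_variable fields used by the patch. */
struct varying_sketch {
   unsigned centroid:1;
   unsigned sample:1;
   unsigned interpolation:2;   /* assumed to fit in two bits */
};

static unsigned
compute_packing_class_sketch(const struct varying_sketch *var)
{
   /* Bit 0: centroid, bit 1: sample.  The two low bits of the result
    * are then left free for the interpolation qualifier. */
   unsigned packing_class = var->centroid | (var->sample << 1);
   packing_class *= 4;
   packing_class += var->interpolation;
   return packing_class;
}
```

Under these assumptions a plain varying, a centroid varying and a sample varying with the same interpolation qualifier get three different class numbers, so they are never packed together.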
[Mesa-dev] [PATCH 5/6] glsl: Populate gl_fragment_program::IsSample bitfield
Signed-off-by: Chris Forbes
---
 src/glsl/ir_set_program_inouts.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ir_set_program_inouts.cpp b/src/glsl/ir_set_program_inouts.cpp
index ab23538..1a36527 100644
--- a/src/glsl/ir_set_program_inouts.cpp
+++ b/src/glsl/ir_set_program_inouts.cpp
@@ -27,7 +27,7 @@
  * Sets the InputsRead and OutputsWritten of Mesa programs.
  *
  * Additionally, for fragment shaders, sets the InterpQualifier array, the
- * IsCentroid bitfield, and the UsesDFdy flag.
+ * IsCentroid and IsSample bitfields, and the UsesDFdy flag.
  *
  * Mesa programs (gl_program, not gl_shader_program) have a set of
  * flags indicating which varyings are read and written.  Computing
@@ -102,6 +102,8 @@ mark(struct gl_program *prog, ir_variable *var, int offset, int len,
                (glsl_interp_qualifier) var->interpolation;
             if (var->centroid)
                fprog->IsCentroid |= bitfield;
+            if (var->sample)
+               fprog->IsSample |= bitfield;
          }
       } else if (var->mode == ir_var_system_value) {
          prog->SystemValuesRead |= bitfield;
@@ -341,6 +343,7 @@ do_set_program_inouts(exec_list *instructions, struct gl_program *prog,
       gl_fragment_program *fprog = (gl_fragment_program *) prog;
       memset(fprog->InterpQualifier, 0, sizeof(fprog->InterpQualifier));
       fprog->IsCentroid = 0;
+      fprog->IsSample = 0;
       fprog->UsesDFdy = false;
       fprog->UsesKill = false;
    }
-- 
1.8.4.2
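As an illustration of the per-input bitfield bookkeeping, here is a rough standalone sketch in C. All names are hypothetical, and the way the slot mask is built from a first slot and a length is an assumption, since the patch only shows the mask being OR-ed into IsCentroid and IsSample.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two gl_fragment_program bitfields. */
struct frag_prog_sketch {
   uint64_t IsCentroid;
   uint64_t IsSample;
};

static void
mark_input_sketch(struct frag_prog_sketch *fprog, unsigned first_slot,
                  unsigned len, int centroid, int sample)
{
   /* One bit per input slot; assumes len > 0 and first_slot + len <= 64. */
   uint64_t bitfield = (len >= 64 ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << len) - 1)) << first_slot;

   if (centroid)
      fprog->IsCentroid |= bitfield;
   if (sample)
      fprog->IsSample |= bitfield;
}
```

Both fields need to start from zero before the inputs are walked, which is what the added fprog->IsSample = 0 line in do_set_program_inouts() takes care of.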
[Mesa-dev] [PATCH 1/6] glsl: Add frontend support for `sample` auxiliary storage qualifier
Signed-off-by: Chris Forbes
---
 src/glsl/ast.h                  | 1 +
 src/glsl/ast_type.cpp           | 3 ++-
 src/glsl/glsl_lexer.ll          | 2 +-
 src/glsl/glsl_parser.yy         | 9 +++--
 src/glsl/glsl_parser_extras.cpp | 2 ++
 5 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ast.h b/src/glsl/ast.h
index 5c214b6..76911f0 100644
--- a/src/glsl/ast.h
+++ b/src/glsl/ast.h
@@ -357,6 +357,7 @@ struct ast_type_qualifier {
          unsigned in:1;
          unsigned out:1;
          unsigned centroid:1;
+         unsigned sample:1;
          unsigned uniform:1;
          unsigned smooth:1;
          unsigned flat:1;
diff --git a/src/glsl/ast_type.cpp b/src/glsl/ast_type.cpp
index 2b088bf..d758bfa 100644
--- a/src/glsl/ast_type.cpp
+++ b/src/glsl/ast_type.cpp
@@ -90,7 +90,8 @@ ast_type_qualifier::has_storage() const
 bool
 ast_type_qualifier::has_auxiliary_storage() const
 {
-   return this->flags.q.centroid;
+   return this->flags.q.centroid
+          || this->flags.q.sample;
 }

 const char*
diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index 822d70d..50875bf 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -520,7 +520,7 @@ readonly KEYWORD(0, 300, 0, 0, READONLY);
 writeonly KEYWORD(0, 300, 0, 0, WRITEONLY);
 resource KEYWORD(0, 300, 0, 0, RESOURCE);
 patch KEYWORD(0, 300, 0, 0, PATCH);
-sample KEYWORD(0, 300, 0, 0, SAMPLE);
+sample KEYWORD_WITH_ALT(400, 300, 400, 0, yyextra->ARB_gpu_shader5_enable, SAMPLE);
 subroutine KEYWORD(0, 300, 0, 0, SUBROUTINE);

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index ada3690..1016554 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -1521,7 +1521,7 @@ type_qualifier:
    {
       if ($2.has_auxiliary_storage()) {
          _mesa_glsl_error(&@1, state,
-                          "duplicate auxiliary storage qualifier (centroid)");
+                          "duplicate auxiliary storage qualifier (centroid or sample)");
       }

       if (!state->ARB_shading_language_420pack_enable &&
@@ -1571,7 +1571,12 @@ auxiliary_storage_qualifier:
       memset(& $$, 0, sizeof($$));
       $$.flags.q.centroid = 1;
    }
-   /* TODO: "sample" and "patch" also go here someday. */
+   | SAMPLE
+   {
+      memset(& $$, 0, sizeof($$));
+      $$.flags.q.sample = 1;
+   }
+   /* TODO: "patch" also goes here someday. */

 storage_qualifier:
    CONST_TOK
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index d76d94b..cdb8652 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -877,6 +877,8 @@ _mesa_ast_type_qualifier_print(const struct ast_type_qualifier *q)

    if (q->flags.q.centroid)
       printf("centroid ");
+   if (q->flags.q.sample)
+      printf("sample ");
    if (q->flags.q.uniform)
       printf("uniform ");
    if (q->flags.q.smooth)
-- 
1.8.4.2
[Mesa-dev] [PATCH 6/6] mesa: Require per-sample shading if the `sample` qualifier is used.
Signed-off-by: Chris Forbes
---
 src/mesa/program/program.c | 8
 1 file changed, 8 insertions(+)

diff --git a/src/mesa/program/program.c b/src/mesa/program/program.c
index 01f8c6f..cdf1c03 100644
--- a/src/mesa/program/program.c
+++ b/src/mesa/program/program.c
@@ -1049,6 +1049,14 @@ _mesa_get_min_invocations_per_fragment(struct gl_context *ctx,
     *  has no effect."
     */
    if (ctx->Multisample.Enabled) {
+      /* The ARB_gpu_shader5 specification says:
+       *
+       * "Use of the "sample" qualifier on a fragment shader input
+       *  forces per-sample shading"
+       */
+      if (prog->IsSample)
+         return MAX2(ctx->DrawBuffer->Visual.samples, 1);
+
       if (prog->Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
                                          SYSTEM_BIT_SAMPLE_POS))
          return MAX2(ctx->DrawBuffer->Visual.samples, 1);
-- 
1.8.4.2
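The decision this hunk adds can be sketched as a small standalone C function. This is only a partial model: the names are hypothetical, the context and program structs are flattened into plain parameters, and whatever else _mesa_get_min_invocations_per_fragment() does outside the quoted hunk is left out and replaced by a plain "return 1".

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX2_SKETCH(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical, partial model of the quoted hunk only. */
static int
min_invocations_sketch(bool multisample_enabled, uint64_t is_sample,
                       bool reads_sample_id_or_pos, int visual_samples)
{
   if (multisample_enabled) {
      /* Any input qualified with "sample" forces per-sample shading. */
      if (is_sample)
         return MAX2_SKETCH(visual_samples, 1);

      /* Reading gl_SampleID or gl_SamplePosition does the same. */
      if (reads_sample_id_or_pos)
         return MAX2_SKETCH(visual_samples, 1);
   }
   return 1;
}
```

The MAX2 against 1 matters for single-sampled draw buffers, where the visual reports 0 samples but the shader still has to run once per fragment.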
[Mesa-dev] [PATCH 2/6] glsl: Add ir support for `sample` qualifier; adjust compiler and linker
Signed-off-by: Chris Forbes --- src/glsl/ast_to_hir.cpp | 15 +++ src/glsl/builtin_variables.cpp| 2 ++ src/glsl/glsl_types.cpp | 5 + src/glsl/glsl_types.h | 6 ++ src/glsl/ir.cpp | 5 +++-- src/glsl/ir.h | 1 + src/glsl/ir_clone.cpp | 1 + src/glsl/ir_print_visitor.cpp | 5 +++-- src/glsl/ir_reader.cpp| 2 ++ src/glsl/link_varyings.cpp| 14 ++ src/glsl/linker.cpp | 6 ++ src/glsl/lower_named_interface_blocks.cpp | 1 + src/glsl/lower_packed_varyings.cpp| 1 + 13 files changed, 60 insertions(+), 4 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 43cf497..d525096 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -2180,6 +2180,9 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual, if (qual->flags.q.centroid) var->centroid = 1; + if (qual->flags.q.sample) + var->sample = 1; + if (qual->flags.q.attribute && state->target != vertex_shader) { var->type = glsl_type::error_type; _mesa_glsl_error(loc, state, @@ -3277,6 +3280,14 @@ ast_declarator_list::hir(exec_list *instructions, "'centroid in' cannot be used in a vertex shader"); } + if (state->target == vertex_shader + && this->type->qualifier.flags.q.sample + && this->type->qualifier.flags.q.in) { + + _mesa_glsl_error(&loc, state, +"'sample in' cannot be used in a vertex shader"); + } + /* Section 4.3.6 of the GLSL 1.30 specification states: * "It is an error to use centroid out in a fragment shader." * @@ -4662,6 +4673,7 @@ ast_process_structure_or_interface_block(exec_list *instructions, fields[i].interpolation = interpret_interpolation_qualifier(qual, var_mode, state, &loc); fields[i].centroid = qual->flags.q.centroid ? 1 : 0; + fields[i].sample = qual->flags.q.sample ? 
1 : 0; if (qual->flags.q.row_major || qual->flags.q.column_major) { if (!qual->flags.q.uniform) { @@ -4930,6 +4942,8 @@ ast_interface_block::hir(exec_list *instructions, earlier_per_vertex->fields.structure[j].interpolation; fields[i].centroid = earlier_per_vertex->fields.structure[j].centroid; +fields[i].sample = + earlier_per_vertex->fields.structure[j].sample; } } @@ -5084,6 +5098,7 @@ ast_interface_block::hir(exec_list *instructions, var_mode); var->interpolation = fields[i].interpolation; var->centroid = fields[i].centroid; + var->sample = fields[i].sample; var->init_interface_type(block_type); if (redeclaring_per_vertex) { diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp index d57324c..d0e76e3 100644 --- a/src/glsl/builtin_variables.cpp +++ b/src/glsl/builtin_variables.cpp @@ -332,6 +332,7 @@ per_vertex_accumulator::add_field(int slot, const glsl_type *type, this->fields[this->num_fields].location = slot; this->fields[this->num_fields].interpolation = INTERP_QUALIFIER_NONE; this->fields[this->num_fields].centroid = 0; + this->fields[this->num_fields].sample = 0; this->num_fields++; } @@ -937,6 +938,7 @@ builtin_variable_generator::generate_varyings() fields[i].location); var->interpolation = fields[i].interpolation; var->centroid = fields[i].centroid; + var->sample = fields[i].sample; var->init_interface_type(per_vertex_out_type); } } diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp index f740130..12d4ac0 100644 --- a/src/glsl/glsl_types.cpp +++ b/src/glsl/glsl_types.cpp @@ -103,6 +103,7 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields, this->fields.structure[i].location = fields[i].location; this->fields.structure[i].interpolation = fields[i].interpolation; this->fields.structure[i].centroid = fields[i].centroid; + this->fields.structure[i].sample = fields[i].sample; this->fields.structure[i].row_major = fields[i].row_major; } } @@ -130,6 +131,7 @@ glsl_type::glsl_type(const 
glsl_struct_field *fields, unsigned num_fields, this->fields.structure[i].location = fields[i].location; this->fields.structure[i].interpolation = fields[i].interpolation; this->fields.structure[i].centroid = fields[i].centroid; + this->fields.structure[i].sample = fields[i].sample; this->fields.structure[i].row_major = fields[i].row_major; } } @@ -483,6 +485,9 @@ glsl_type::record_key_compare(const void *a, const void *b) if (key1->fie
[Mesa-dev] [PATCH] i965: Don't flag gather quirks for Gen8+
My understanding is that Broadwell retains the same SCS mechanism that
Haswell has, so even if the underlying issue with this format is not
fixed, the w/a will be applied in SCS rather than needing shader code.

Signed-off-by: Chris Forbes
Cc: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_wm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index bc1480c..8a106c7 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -352,7 +352,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
          /* gather4's channel select for green from RG32F is broken;
           * requires a shader w/a on IVB; fixable with just SCS on HSW.
           */
-         if (brw->gen >= 7 && !brw->is_haswell && prog->UsesGather) {
+         if (brw->gen == 7 && !brw->is_haswell && prog->UsesGather) {
             if (img->InternalFormat == GL_RG32F)
                key->gather_channel_quirk_mask |= 1 << s;
          }
-- 
1.8.4.2
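The effect of changing ">= 7" to "== 7" can be sketched as a standalone predicate in C. The function name and the flattened boolean parameters are hypothetical; the condition itself is the one from the patch.

```c
#include <stdbool.h>

/* Hypothetical predicate mirroring the patched condition: the shader
 * workaround is flagged only on Gen7 parts that are not Haswell. */
static bool
needs_gather_channel_quirk_sketch(int gen, bool is_haswell,
                                  bool uses_gather, bool format_is_rg32f)
{
   return gen == 7 && !is_haswell && uses_gather && format_is_rg32f;
}
```

With the old ">= 7" test a Gen8 part would also have set the quirk bit; with "== 7" only Ivybridge-class hardware does, on the assumption stated in the commit message that Gen8 handles it through SCS as Haswell does.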
[Mesa-dev] [PATCH 1/2] nv50: fix a small leak on context destroy
Signed-off-by: Ilia Mirkin
Cc: "9.2 10.0"
---
Found with valgrind.

 src/gallium/drivers/nouveau/nv50/nv50_context.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c
index b6bdf79..2b90bc2 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c
@@ -114,6 +114,8 @@ nv50_destroy(struct pipe_context *pipe)
    draw_destroy(nv50->draw);
 #endif

+   free(nv50->blit);
+
    nouveau_context_destroy(&nv50->base);
 }

-- 
1.8.3.2
[Mesa-dev] [PATCH 2/2] mesa: don't leak performance monitors on context destroy
Signed-off-by: Ilia Mirkin
Cc: "10.0"
---
Found with valgrind. Don't have the hardware to test a real
implementation, but with nv50 it seemed to work in that valgrind was no
longer marking the hash table as leaked.

 src/mesa/main/context.c             |  1 +
 src/mesa/main/performance_monitor.c | 19 +++
 src/mesa/main/performance_monitor.h |  3 +++
 3 files changed, 23 insertions(+)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 87a4a35..658499f 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1194,6 +1194,7 @@ _mesa_free_context_data( struct gl_context *ctx )
    _mesa_free_sync_data(ctx);
    _mesa_free_varray_data(ctx);
    _mesa_free_transform_feedback(ctx);
+   _mesa_free_performance_monitors(ctx);

    _mesa_reference_buffer_object(ctx, &ctx->Pack.BufferObj, NULL);
    _mesa_reference_buffer_object(ctx, &ctx->Unpack.BufferObj, NULL);
diff --git a/src/mesa/main/performance_monitor.c b/src/mesa/main/performance_monitor.c
index 4981e6f..e62f770 100644
--- a/src/mesa/main/performance_monitor.c
+++ b/src/mesa/main/performance_monitor.c
@@ -93,6 +93,25 @@ fail:
    return NULL;
 }

+static void
+free_performance_monitor(GLuint key, void *data, void *user)
+{
+   struct gl_perf_monitor_object *m = data;
+   struct gl_context *ctx = user;
+
+   ralloc_free(m->ActiveGroups);
+   ralloc_free(m->ActiveCounters);
+   ctx->Driver.DeletePerfMonitor(ctx, m);
+}
+
+void
+_mesa_free_performance_monitors(struct gl_context *ctx)
+{
+   _mesa_HashDeleteAll(ctx->PerfMonitor.Monitors,
+                       free_performance_monitor, ctx);
+   _mesa_DeleteHashTable(ctx->PerfMonitor.Monitors);
+}
+
 static inline struct gl_perf_monitor_object *
 lookup_monitor(struct gl_context *ctx, GLuint id)
 {
diff --git a/src/mesa/main/performance_monitor.h b/src/mesa/main/performance_monitor.h
index a852a41..76234e5 100644
--- a/src/mesa/main/performance_monitor.h
+++ b/src/mesa/main/performance_monitor.h
@@ -35,6 +35,9 @@
 extern void
 _mesa_init_performance_monitors(struct gl_context *ctx);

+extern void
+_mesa_free_performance_monitors(struct gl_context *ctx);
+
 extern void GLAPIENTRY
 _mesa_GetPerfMonitorGroupsAMD(GLint *numGroups, GLsizei groupsSize,
                               GLuint *groups);
-- 
1.8.3.2
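The cleanup pattern in this patch (walk every entry, free what each entry owns through a callback, then drop the container) can be sketched in plain C without Mesa's hash table. Everything below is a hypothetical stand-in: a fixed-size array replaces the hash table, free() replaces ralloc_free() and the driver's DeletePerfMonitor hook, and the "deleted" counter exists only so the behaviour can be observed.

```c
#include <stdlib.h>

#define MAX_MONITORS_SKETCH 4

/* Hypothetical stand-ins for gl_perf_monitor_object and gl_context. */
struct monitor_sketch {
   int *active_groups;
   int *active_counters;
};

struct context_sketch {
   struct monitor_sketch *monitors[MAX_MONITORS_SKETCH];
   int deleted;
};

/* Per-entry callback: release what the monitor owns, then the monitor. */
static void
free_monitor_sketch(unsigned key, void *data, void *user)
{
   struct monitor_sketch *m = data;
   struct context_sketch *ctx = user;

   (void) key;
   free(m->active_groups);
   free(m->active_counters);
   free(m);
   ctx->deleted++;
}

/* Context teardown: visit every live entry once and clear its slot. */
static void
free_all_monitors_sketch(struct context_sketch *ctx)
{
   for (unsigned i = 0; i < MAX_MONITORS_SKETCH; i++) {
      if (ctx->monitors[i]) {
         free_monitor_sketch(i, ctx->monitors[i], ctx);
         ctx->monitors[i] = NULL;
      }
   }
}
```

Passing the context through the opaque "user" pointer is what lets the callback reach per-context state, which in the patch is the driver hook used to delete each monitor.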
Re: [Mesa-dev] Fw: Taking part in MESA development - Dissertation Project
On Fri, 2013-11-29 at 09:36 +0200, Petri Latvala wrote:
> On 11/28/2013 11:15 PM, Timothy Arceri wrote:
> > Hi guys,
> >
> > I received the following submitted as an Issue on my github account.
> > Maybe someone here has a project they can suggest.
>
> Is NewbieProjects too "newbie" for this?
>
> http://wiki.freedesktop.org/dri/NewbieProjects/
>

Most of the remaining work seems to be partly done already so I don't
think what remains qualifies as "non-trivial".

Maybe there are some left over google summer of code projects that were
not undertaken that would be of interest:
http://www.x.org/wiki/SummerOfCodeIdeas/
[Mesa-dev] [PATCH] egl: add HAVE_LIBDRM define, fix EGL X11 platform
Commit a594cec broke the EGL X11 backend by adding a dependency between
the X11 and DRM backends, requiring HAVE_EGL_PLATFORM_DRM to be defined
for X11.

This patch fixes the issue by adding an additional define for libdrm
detection that is independent of which backend is being compiled. Tested
by compiling Mesa with '--with-egl-platforms=x11' and running
es2gears_x11 + glbenchmark2.7 successfully.

v2: return true for dri2_auth if running without libdrm (Samuel)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72062

Signed-off-by: Tapani Pälli
Cc: Samuel Thibault
Cc: mesa-sta...@lists.freedesktop.org
---
 configure.ac                        | 3 +++
 src/egl/drivers/dri2/Makefile.am    | 1 +
 src/egl/drivers/dri2/platform_x11.c | 9 +++--
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/configure.ac b/configure.ac
index 8c52535..fdab621 100644
--- a/configure.ac
+++ b/configure.ac
@@ -761,6 +761,9 @@ AC_SUBST([MESA_LLVM])
 # Check for libdrm
 PKG_CHECK_MODULES([LIBDRM], [libdrm >= $LIBDRM_REQUIRED],
                   [have_libdrm=yes], [have_libdrm=no])
+if test "x$have_libdrm" = xyes; then
+    DEFINES="$DEFINES -DHAVE_LIBDRM"
+fi

 PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED],
                   have_libudev=yes, have_libudev=no)
diff --git a/src/egl/drivers/dri2/Makefile.am b/src/egl/drivers/dri2/Makefile.am
index 823ef5e..a6a81df 100644
--- a/src/egl/drivers/dri2/Makefile.am
+++ b/src/egl/drivers/dri2/Makefile.am
@@ -29,6 +29,7 @@ AM_CFLAGS = \
     -I$(top_builddir)/src/egl/wayland/wayland-drm \
     $(DEFINES) \
     $(VISIBILITY_CFLAGS) \
+    $(HAVE_LIBDRM) \
     $(LIBDRM_CFLAGS) \
     $(LIBUDEV_CFLAGS) \
     $(LIBKMS_CFLAGS) \
diff --git a/src/egl/drivers/dri2/platform_x11.c b/src/egl/drivers/dri2/platform_x11.c
index c56a413..04cb62b 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -33,7 +33,7 @@
 #include
 #include
 #include
-#ifdef HAVE_DRM_PLATFORM
+#ifdef HAVE_LIBDRM
 #include
 #endif
 #include
@@ -608,7 +608,7 @@ dri2_x11_authenticate(_EGLDisplay *disp, uint32_t id)
 static EGLBoolean
 dri2_authenticate(_EGLDisplay *disp)
 {
-#ifdef HAVE_DRM_PLATFORM
+#ifdef HAVE_LIBDRM
    struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
    drm_magic_t magic;

@@ -621,11 +621,8 @@ dri2_authenticate(_EGLDisplay *disp)
       _eglLog(_EGL_WARNING, "DRI2: failed to authenticate");
       return EGL_FALSE;
    }
-
-   return EGL_TRUE;
-#else
-   return EGL_FALSE;
 #endif
+   return EGL_TRUE;
 }

 static EGLBoolean
-- 
1.8.1.4
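The control flow that v2 gives dri2_authenticate() can be sketched as a standalone C function. All names are hypothetical: the two callbacks stand in for fetching the DRM magic and for the X11 authentication request, and SKETCH_HAVE_LIBDRM stands in for the HAVE_LIBDRM define. The point is only that a build without libdrm now falls through to success instead of failing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types standing in for the libdrm and X11 calls. */
typedef bool (*get_magic_fn_sketch)(unsigned *magic);
typedef bool (*x11_auth_fn_sketch)(unsigned magic);

static bool
authenticate_sketch(get_magic_fn_sketch get_magic,
                    x11_auth_fn_sketch x11_auth)
{
#ifdef SKETCH_HAVE_LIBDRM
   unsigned magic;

   /* With libdrm available, either step failing is a hard failure. */
   if (!get_magic(&magic))
      return false;
   if (!x11_auth(magic))
      return false;
#else
   /* Without libdrm there is nothing to authenticate against. */
   (void) get_magic;
   (void) x11_auth;
#endif
   return true;
}
```

Built without SKETCH_HAVE_LIBDRM, authenticate_sketch(NULL, NULL) returns true, which corresponds to the "return true for dri2_auth if running without libdrm" note in the v2 changelog.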
Re: [Mesa-dev] Fw: Taking part in MESA development - Dissertation Project
The tessellation shaders and the compute shader are definitely not trivial.

Marek

On Fri, Nov 29, 2013 at 12:01 PM, Timothy Arceri wrote:
> On Fri, 2013-11-29 at 09:36 +0200, Petri Latvala wrote:
>> On 11/28/2013 11:15 PM, Timothy Arceri wrote:
>> > Hi guys,
>> >
>> > I received the following submitted as an Issue on my github account.
>> > Maybe someone here has a project they can suggest.
>>
>> Is NewbieProjects too "newbie" for this?
>>
>> http://wiki.freedesktop.org/dri/NewbieProjects/
>>
>
> Most of the remaining work seems to be partly done already so I don't
> think what remains qualifies as "non-trivial".
>
> Maybe there are some left over google summer of code projects that were
> not undertaken that would be of interest:
> http://www.x.org/wiki/SummerOfCodeIdeas/
Re: [Mesa-dev] Fw: Taking part in MESA development - Dissertation Project
They are not a newbie task either.

On Fri, Nov 29, 2013 at 3:34 PM, Marek Olšák wrote:
> The tessellation shaders and the compute shader are definitely not trivial.
>
> Marek
>
> On Fri, Nov 29, 2013 at 12:01 PM, Timothy Arceri wrote:
> > On Fri, 2013-11-29 at 09:36 +0200, Petri Latvala wrote:
> >> On 11/28/2013 11:15 PM, Timothy Arceri wrote:
> >> > Hi guys,
> >> >
> >> > I received the following submitted as an Issue on my github account.
> >> > Maybe someone here has a project they can suggest.
> >>
> >> Is NewbieProjects too "newbie" for this?
> >>
> >> http://wiki.freedesktop.org/dri/NewbieProjects/
> >>
> >
> > Most of the remaining work seems to be partly done already so I don't
> > think what remains qualifies as "non-trivial".
> >
> > Maybe there are some left over google summer of code projects that were
> > not undertaken that would be of interest:
> > http://www.x.org/wiki/SummerOfCodeIdeas/
[Mesa-dev] [PATCH][9.2] st/xorg: Handle new DamageUnregister API which has only one argument
This fixes building against the new API in X server 1.15.

Taken from xf86-video-modesetting beca4dfb0e4d11d3729214967a1fe56ee5669831
from Keith Packard
---
 src/gallium/state_trackers/xorg/xorg_driver.c | 4
 1 file changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/xorg/xorg_driver.c b/src/gallium/state_trackers/xorg/xorg_driver.c
index 9d7713c..4671ba7 100644
--- a/src/gallium/state_trackers/xorg/xorg_driver.c
+++ b/src/gallium/state_trackers/xorg/xorg_driver.c
@@ -62,6 +62,10 @@
 #include "libkms/libkms.h"
 #endif

+#if XORG_VERSION_CURRENT >= XORG_VERSION_NUMERIC(1,14,99,2,0)
+#define DamageUnregister(d, dd) DamageUnregister(dd)
+#endif
+
 /*
  * Functions and symbols exported to Xorg via pointers.
  */
-- 
1.8.3.2
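The macro trick used here can be tried in isolation with a small C sketch. The function below is a hypothetical stand-in for the new one-argument server call, not the real DamageUnregister(); what carries over is the preprocessor behaviour: a function-like macro is not expanded again inside its own replacement, so the inner name refers to the real function.

```c
#include <stddef.h>

/* Hypothetical stand-in for the new one-argument API.  It must be
 * declared before the macro below, just as the server header is
 * included before the #define in the patch. */
static int unregister_calls_sketch;

static void
DamageUnregisterSketch(int *damage)
{
   *damage = 0;
   unregister_calls_sketch++;
}

/* Compatibility shim: callers keep the old two-argument spelling and the
 * first argument is dropped at preprocessing time. */
#define DamageUnregisterSketch(drawable, damage) DamageUnregisterSketch(damage)

/* A caller written against the old two-argument form. */
static void
teardown_sketch(void *drawable, int *damage)
{
   (void) drawable;
   DamageUnregisterSketch(drawable, damage);
}
```

The version guard in the patch then confines the shim to servers that already have the one-argument prototype, so older servers keep seeing the unmodified two-argument call.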
[Mesa-dev] [PATCH 01/10] radeon: move some functions to r600_buffer.c
From: Marek Olšák --- src/gallium/drivers/radeon/Makefile.sources | 1 + src/gallium/drivers/radeon/r600_buffer.c | 133 ++ src/gallium/drivers/radeon/r600_pipe_common.c | 106 3 files changed, 134 insertions(+), 106 deletions(-) create mode 100644 src/gallium/drivers/radeon/r600_buffer.c diff --git a/src/gallium/drivers/radeon/Makefile.sources b/src/gallium/drivers/radeon/Makefile.sources index 894f22a..bd06ed8 100644 --- a/src/gallium/drivers/radeon/Makefile.sources +++ b/src/gallium/drivers/radeon/Makefile.sources @@ -1,4 +1,5 @@ C_SOURCES := \ + r600_buffer.c \ r600_pipe_common.c \ r600_streamout.c \ r600_texture.c \ diff --git a/src/gallium/drivers/radeon/r600_buffer.c b/src/gallium/drivers/radeon/r600_buffer.c new file mode 100644 index 000..13d11bd --- /dev/null +++ b/src/gallium/drivers/radeon/r600_buffer.c @@ -0,0 +1,133 @@ +/* + * Copyright 2013 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * on the rights to use, copy, modify, merge, publish, distribute, sub + * license, and/or sell copies of the Software, and to permit persons to whom + * the Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE + * USE OR OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: + * Marek Olšák + */ + +#include "r600_cs.h" + +void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx, + struct r600_resource *resource, + unsigned usage) +{ + enum radeon_bo_usage rusage = RADEON_USAGE_READWRITE; + + if (usage & PIPE_TRANSFER_UNSYNCHRONIZED) { + return ctx->ws->buffer_map(resource->cs_buf, NULL, usage); + } + + if (!(usage & PIPE_TRANSFER_WRITE)) { + /* have to wait for the last write */ + rusage = RADEON_USAGE_WRITE; + } + + if (ctx->rings.gfx.cs->cdw && + ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs, +resource->cs_buf, rusage)) { + if (usage & PIPE_TRANSFER_DONTBLOCK) { + ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC); + return NULL; + } else { + ctx->rings.gfx.flush(ctx, 0); + } + } + if (ctx->rings.dma.cs && + ctx->rings.dma.cs->cdw && + ctx->ws->cs_is_buffer_referenced(ctx->rings.dma.cs, +resource->cs_buf, rusage)) { + if (usage & PIPE_TRANSFER_DONTBLOCK) { + ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC); + return NULL; + } else { + ctx->rings.dma.flush(ctx, 0); + } + } + + if (ctx->ws->buffer_is_busy(resource->buf, rusage)) { + if (usage & PIPE_TRANSFER_DONTBLOCK) { + return NULL; + } else { + /* We will be wait for the GPU. Wait for any offloaded +* CS flush to complete to avoid busy-waiting in the winsys. 
*/ + ctx->ws->cs_sync_flush(ctx->rings.gfx.cs); + if (ctx->rings.dma.cs) + ctx->ws->cs_sync_flush(ctx->rings.dma.cs); + } + } + + return ctx->ws->buffer_map(resource->cs_buf, NULL, usage); +} + +bool r600_init_resource(struct r600_common_screen *rscreen, + struct r600_resource *res, + unsigned size, unsigned alignment, + bool use_reusable_pool, unsigned usage) +{ + uint32_t initial_domain, domains; + + switch(usage) { + case PIPE_USAGE_STAGING: + /* Staging resources participate in transfers, i.e. are used +* for uploads and downloads from regular resources. +* We generate them internally for some transfers. +*/ + initial_domain = RADEON_DOMAIN_GTT; + domains = RADEON_DOMAIN_GTT; +
[Mesa-dev] [PATCH 00/10] Sharing r600g glMapBuffer optimizations with radeonsi
This series moves the r600_buffer.c files from both drivers to the shared directory gallium/drivers/radeon, and implements what's missing for radeonsi to make sharing the code possible.

This improves Valve's Team Fortress 2 performance by 75%.
Before: 20 fps
After: 35 fps

Please review.

Marek
[Mesa-dev] [PATCH 03/10] radeonsi: implement accelerated buffer copying
From: Marek Olšák --- src/gallium/drivers/radeonsi/r600_blit.c | 7 ++-- src/gallium/drivers/radeonsi/si_descriptors.c | 58 +++ src/gallium/drivers/radeonsi/si_state.h | 3 ++ 3 files changed, 64 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/radeonsi/r600_blit.c b/src/gallium/drivers/radeonsi/r600_blit.c index e525f79..3adbb81 100644 --- a/src/gallium/drivers/radeonsi/r600_blit.c +++ b/src/gallium/drivers/radeonsi/r600_blit.c @@ -496,15 +496,14 @@ static void r600_resource_copy_region(struct pipe_context *ctx, const struct pipe_box *psbox = src_box; boolean restore_orig[2]; - memset(orig_info, 0, sizeof(orig_info)); - /* Fallback for buffers. */ if (dst->target == PIPE_BUFFER && src->target == PIPE_BUFFER) { - util_resource_copy_region(ctx, dst, dst_level, dstx, dsty, dstz, - src, src_level, src_box); + si_copy_buffer(rctx, dst, src, dstx, src_box->x, src_box->width); return; } + memset(orig_info, 0, sizeof(orig_info)); + /* The driver doesn't decompress resources automatically while * u_blitter is rendering. */ r600_decompress_subresource(ctx, src, src_level, diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index c491584..c591352 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -605,6 +605,64 @@ static void si_clear_buffer(struct pipe_context *ctx, struct pipe_resource *dst, offset + size); } +void si_copy_buffer(struct r600_context *rctx, + struct pipe_resource *dst, struct pipe_resource *src, + uint64_t dst_offset, uint64_t src_offset, unsigned size) +{ + if (!size) + return; + + dst_offset += r600_resource_va(&rctx->screen->b.b, dst); + src_offset += r600_resource_va(&rctx->screen->b.b, src); + + /* Flush the caches where the resource is bound. 
*/ + rctx->b.flags |= R600_CONTEXT_INV_TEX_CACHE | +R600_CONTEXT_INV_CONST_CACHE | +R600_CONTEXT_FLUSH_AND_INV_CB | +R600_CONTEXT_FLUSH_AND_INV_DB | +R600_CONTEXT_FLUSH_AND_INV_CB_META | +R600_CONTEXT_FLUSH_AND_INV_DB_META | +R600_CONTEXT_WAIT_3D_IDLE; + + while (size) { + unsigned sync_flags = 0; + unsigned byte_count = MIN2(size, CP_DMA_MAX_BYTE_COUNT); + + si_need_cs_space(rctx, 7 + (rctx->b.flags ? rctx->cache_flush.num_dw : 0), FALSE); + + /* Flush the caches for the first copy only. Also wait for old CP DMA packets to complete. */ + if (rctx->b.flags) { + si_emit_cache_flush(&rctx->b, NULL); + sync_flags |= SI_CP_DMA_RAW_WAIT; + } + + /* Do the synchronization after the last copy, so that all data is written to memory. */ + if (size == byte_count) { + sync_flags |= R600_CP_DMA_SYNC; + } + + /* This must be done after r600_need_cs_space. */ + r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, (struct r600_resource*)src, RADEON_USAGE_READ); + r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, (struct r600_resource*)dst, RADEON_USAGE_WRITE); + + si_emit_cp_dma_copy_buffer(rctx, dst_offset, src_offset, byte_count, sync_flags); + + size -= byte_count; + src_offset += byte_count; + dst_offset += byte_count; + } + + rctx->b.flags |= R600_CONTEXT_INV_TEX_CACHE | +R600_CONTEXT_INV_CONST_CACHE | +R600_CONTEXT_FLUSH_AND_INV_CB | +R600_CONTEXT_FLUSH_AND_INV_DB | +R600_CONTEXT_FLUSH_AND_INV_CB_META | +R600_CONTEXT_FLUSH_AND_INV_DB_META; + + util_range_add(&r600_resource(dst)->valid_buffer_range, dst_offset, + dst_offset + size); +} + /* INIT/DEINIT */ void si_init_all_descriptors(struct r600_context *rctx) diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index f3d4023..6774e57 100644 --- a/src/gallium/drivers/radeonsi/si_state.h +++ b/src/gallium/drivers/radeonsi/si_state.h @@ -197,6 +197,9 @@ void si_set_sampler_view(struct r600_context *rctx, unsigned shader, void si_init_all_descriptors(struct r600_context *rctx); void 
si_release_all_descriptors(struct r600_context *rctx); void si_all_descriptors_begin_new_cs(struct r600_context *rctx); +void si_copy_buffer(struct r600_context *rctx, + struct pipe_resource *dst, struct pipe_resource *src, +
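For readers following the copy loop in si_copy_buffer, here is a minimal standalone sketch of the chunking pattern only: one large copy is split into pieces no larger than a hardware limit, the first piece waits for earlier work, and the last piece requests the final synchronization. The names sketch_split_copy, struct sketch_chunk, SKETCH_MAX_BYTE_COUNT, SKETCH_RAW_WAIT and SKETCH_SYNC are made up for illustration and are not driver identifiers; the limit of 1024 bytes is an arbitrary stand-in for CP_DMA_MAX_BYTE_COUNT. The patch decides "first copy" by checking whether cache-flush flags are still pending, which the sketch simplifies to a local flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration only. The real limit and flag
 * encodings come from the driver headers, which the patch does not show. */
#define SKETCH_MAX_BYTE_COUNT 1024u
#define SKETCH_RAW_WAIT (1u << 0) /* wait for earlier packets (first chunk) */
#define SKETCH_SYNC     (1u << 1) /* synchronize after the last chunk */

struct sketch_chunk {
    uint64_t dst_offset;
    uint64_t src_offset;
    unsigned byte_count;
    unsigned sync_flags;
};

/* Split a copy of `size` bytes into chunks of at most SKETCH_MAX_BYTE_COUNT.
 * Writes up to `max_chunks` entries into `out` and returns how many were
 * written. A zero-sized copy produces no chunks. */
static unsigned sketch_split_copy(uint64_t dst_offset, uint64_t src_offset,
                                  unsigned size,
                                  struct sketch_chunk *out, unsigned max_chunks)
{
    unsigned n = 0;
    int first = 1;

    assert(out != 0 || max_chunks == 0);

    while (size && n < max_chunks) {
        unsigned byte_count =
            size < SKETCH_MAX_BYTE_COUNT ? size : SKETCH_MAX_BYTE_COUNT;
        unsigned sync_flags = 0;

        /* Only the first chunk has to wait for previously submitted work. */
        if (first) {
            sync_flags |= SKETCH_RAW_WAIT;
            first = 0;
        }

        /* Only the last chunk asks for the final synchronization. */
        if (size == byte_count)
            sync_flags |= SKETCH_SYNC;

        out[n].dst_offset = dst_offset;
        out[n].src_offset = src_offset;
        out[n].byte_count = byte_count;
        out[n].sync_flags = sync_flags;
        n++;

        /* The loop consumes `size` and advances both offsets. */
        size -= byte_count;
        src_offset += byte_count;
        dst_offset += byte_count;
    }
    return n;
}
```

One thing the sketch makes visible: the loop consumes size and advances both offsets, so anything that needs the original destination range (for example recording which bytes are now valid) has to read it before the loop starts.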
[Mesa-dev] [PATCH 05/10] radeonsi: handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
From: Marek Olšák which can come from glBufferData and glMapBufferRange. --- src/gallium/drivers/radeonsi/r600_buffer.c| 11 +++ src/gallium/drivers/radeonsi/si_descriptors.c | 123 ++ src/gallium/drivers/radeonsi/si_state.h | 1 + 3 files changed, 135 insertions(+) diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c b/src/gallium/drivers/radeonsi/r600_buffer.c index 4c95130..560e7da 100644 --- a/src/gallium/drivers/radeonsi/r600_buffer.c +++ b/src/gallium/drivers/radeonsi/r600_buffer.c @@ -56,6 +56,17 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx, struct r600_resource *rbuffer = r600_resource(resource); uint8_t *data; + if (usage & PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE && + !(usage & PIPE_TRANSFER_UNSYNCHRONIZED)) { + assert(usage & PIPE_TRANSFER_WRITE); + + /* Check if mapping this buffer would cause waiting for the GPU. */ + if (r600_rings_is_buffer_referenced(&rctx->b, rbuffer->cs_buf, RADEON_USAGE_READWRITE) || + rctx->b.ws->buffer_is_busy(rbuffer->buf, RADEON_USAGE_READWRITE)) { + si_invalidate_buffer(&rctx->b.b, &rbuffer->b.b); + } + } + data = rctx->b.ws->buffer_map(rbuffer->cs_buf, rctx->b.rings.gfx.cs, usage); if (!data) { return NULL; diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index c591352..62c354f 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -523,6 +523,129 @@ static void si_set_streamout_targets(struct pipe_context *ctx, si_update_descriptors(rctx, &buffers->desc); } +static void si_desc_reset_buffer_offset(struct pipe_context *ctx, + uint32_t *desc, uint64_t old_buf_va, + struct pipe_resource *new_buf) +{ + /* Retrieve the buffer offset from the descriptor. */ + uint64_t old_desc_va = + desc[0] | ((uint64_t)G_008F04_BASE_ADDRESS_HI(desc[1]) << 32); + + assert(old_buf_va <= old_desc_va); + uint64_t offset_within_buffer = old_desc_va - old_buf_va; + + /* Update the descriptor. 
*/ + uint64_t va = r600_resource_va(ctx->screen, new_buf) + offset_within_buffer; + + desc[0] = va; + desc[1] = (desc[1] & C_008F04_BASE_ADDRESS_HI) | + S_008F04_BASE_ADDRESS_HI(va >> 32); +} + +/* BUFFER DISCARD/INVALIDATION */ + +/* Reallocate a buffer a update all resource bindings where the buffer is + * bound. + * + * This is used to avoid CPU-GPU synchronizations, because it makes the buffer + * idle by discarding its contents. Apps usually tell us when to do this using + * map_buffer flags, for example. + */ +void si_invalidate_buffer(struct pipe_context *ctx, struct pipe_resource *buf) +{ + struct r600_context *rctx = (struct r600_context*)ctx; + struct r600_resource *rbuffer = r600_resource(buf); + unsigned i, shader, alignment = rbuffer->buf->alignment; + uint64_t old_va = r600_resource_va(ctx->screen, buf); + + /* Discard the buffer. */ + pb_reference(&rbuffer->buf, NULL); + + /* Create a new one in the same pipe_resource. */ + r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0, alignment, + TRUE, rbuffer->b.b.usage); + + /* We changed the buffer, now we need to bind it where the old one +* was bound. This consists of 2 things: +* 1) Updating the resource descriptor and dirtying it. +* 2) Adding a relocation to the CS, so that it's usable. +*/ + + /* Vertex buffers. */ + /* Nothing to do. Vertex buffer bindings are updated before every draw call. */ + + /* Streamout buffers. */ + for (i = 0; i < rctx->streamout_buffers.num_buffers; i++) { + if (rctx->streamout_buffers.buffers[i] == buf) { + /* Update the descriptor. */ + si_desc_reset_buffer_offset(ctx, rctx->streamout_buffers.desc_data[i], + old_va, buf); + + r600_context_bo_reloc(&rctx->b, &rctx->b.rings.gfx, + (struct r600_resource*)buf, + rctx->streamout_buffers.shader_usage); + rctx->streamout_buffers.desc.dirty_mask |= 1 << i; + si_update_descriptors(rctx, &rctx->streamout_buffers.desc); + + /* Update the streamout state. 
*/ + if (rctx->b.streamout.begin_emitted) { + r600_emit_streamout_end(&rctx->b); + } + rctx->b.streamout.append_bitmask = rctx->b.streamout.enabled_mask; +
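The descriptor update in si_desc_reset_buffer_offset boils down to: read the old address out of the descriptor, work out how far into the old buffer it pointed, and write back the same offset relative to the new buffer. A minimal sketch of just that arithmetic follows. It assumes the high address bits occupy the low 16 bits of desc[1]; that layout, the SKETCH_HI_MASK macro and the function name sketch_rebase_descriptor are assumptions made for the example, standing in for the G_/S_/C_008F04_BASE_ADDRESS_HI macros used by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout for illustration: the high address bits live in the
 * low 16 bits of desc[1]. The driver uses register-header macros instead. */
#define SKETCH_HI_MASK 0xFFFFu

/* Re-point a two-dword buffer descriptor at a new buffer while keeping the
 * offset it had within the old buffer. Bits of desc[1] outside the address
 * field are preserved. */
static void sketch_rebase_descriptor(uint32_t desc[2],
                                     uint64_t old_buf_va, uint64_t new_buf_va)
{
    /* Retrieve the address currently stored in the descriptor. */
    uint64_t old_desc_va =
        desc[0] | ((uint64_t)(desc[1] & SKETCH_HI_MASK) << 32);

    /* The descriptor must point at or after the start of the old buffer. */
    assert(old_buf_va <= old_desc_va);
    uint64_t offset_within_buffer = old_desc_va - old_buf_va;

    /* Apply the same offset to the new buffer and store it back. */
    uint64_t va = new_buf_va + offset_within_buffer;
    desc[0] = (uint32_t)va;
    desc[1] = (desc[1] & ~SKETCH_HI_MASK) |
              ((uint32_t)(va >> 32) & SKETCH_HI_MASK);
}
```

Passing the old buffer address in explicitly matters because, once the buffer has been reallocated, the old address can no longer be queried from the resource.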
[Mesa-dev] [PATCH 06/10] r600g, radeonsi: share flags has_cp_dma and has_streamout
From: Marek Olšák --- src/gallium/drivers/r600/evergreen_hw_context.c | 2 +- src/gallium/drivers/r600/evergreen_state.c | 4 ++-- src/gallium/drivers/r600/r600_blit.c| 8 src/gallium/drivers/r600/r600_buffer.c | 4 ++-- src/gallium/drivers/r600/r600_hw_context.c | 2 +- src/gallium/drivers/r600/r600_pipe.c| 16 src/gallium/drivers/r600/r600_pipe.h| 2 -- src/gallium/drivers/r600/r600_state.c | 4 ++-- src/gallium/drivers/radeon/r600_pipe_common.h | 2 ++ src/gallium/drivers/radeonsi/radeonsi_pipe.c| 10 ++ 10 files changed, 28 insertions(+), 26 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_hw_context.c b/src/gallium/drivers/r600/evergreen_hw_context.c index c4fcaa0..e5d6249 100644 --- a/src/gallium/drivers/r600/evergreen_hw_context.c +++ b/src/gallium/drivers/r600/evergreen_hw_context.c @@ -86,7 +86,7 @@ void evergreen_cp_dma_clear_buffer(struct r600_context *rctx, struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs; assert(size); - assert(rctx->screen->has_cp_dma); + assert(rctx->screen->b.has_cp_dma); offset += r600_resource_va(&rctx->screen->b.b, dst); diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 065ac6f..6faa538 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -2880,7 +2880,7 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx) r600_store_value(cb, 0); r600_store_value(cb, 0); - if (rctx->screen->has_streamout) { + if (rctx->screen->b.has_streamout) { r600_store_context_reg(cb, R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET, 0); } @@ -3337,7 +3337,7 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx) r600_store_value(cb, 0); /* R_028B94_VGT_STRMOUT_CONFIG */ r600_store_value(cb, 0); /* R_028B98_VGT_STRMOUT_BUFFER_CONFIG */ - if (rctx->screen->has_streamout) { + if (rctx->screen->b.has_streamout) { r600_store_context_reg(cb, R_028B28_VGT_STRMOUT_DRAW_OPAQUE_OFFSET, 0); } diff --git 
a/src/gallium/drivers/r600/r600_blit.c b/src/gallium/drivers/r600/r600_blit.c index 8680f79..d3e9ec9 100644 --- a/src/gallium/drivers/r600/r600_blit.c +++ b/src/gallium/drivers/r600/r600_blit.c @@ -604,10 +604,10 @@ static void r600_copy_buffer(struct pipe_context *ctx, struct pipe_resource *dst { struct r600_context *rctx = (struct r600_context*)ctx; - if (rctx->screen->has_cp_dma) { + if (rctx->screen->b.has_cp_dma) { r600_cp_dma_copy_buffer(rctx, dst, dstx, src, src_box->x, src_box->width); } - else if (rctx->screen->has_streamout && + else if (rctx->screen->b.has_streamout && /* Require 4-byte alignment. */ dstx % 4 == 0 && src_box->x % 4 == 0 && src_box->width % 4 == 0) { @@ -654,11 +654,11 @@ static void r600_clear_buffer(struct pipe_context *ctx, struct pipe_resource *ds { struct r600_context *rctx = (struct r600_context*)ctx; - if (rctx->screen->has_cp_dma && + if (rctx->screen->b.has_cp_dma && rctx->b.chip_class >= EVERGREEN && offset % 4 == 0 && size % 4 == 0) { evergreen_cp_dma_clear_buffer(rctx, dst, offset, size, value); - } else if (rctx->screen->has_streamout && offset % 4 == 0 && size % 4 == 0) { + } else if (rctx->screen->b.has_streamout && offset % 4 == 0 && size % 4 == 0) { union pipe_color_union clear_value; clear_value.ui[0] = value; diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c index 8d5c255..6c892c0 100644 --- a/src/gallium/drivers/r600/r600_buffer.c +++ b/src/gallium/drivers/r600/r600_buffer.c @@ -151,8 +151,8 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx, else if ((usage & PIPE_TRANSFER_DISCARD_RANGE) && !(usage & PIPE_TRANSFER_UNSYNCHRONIZED) && !(rctx->screen->b.debug_flags & DBG_NO_DISCARD_RANGE) && -(rctx->screen->has_cp_dma || - (rctx->screen->has_streamout && +(rctx->screen->b.has_cp_dma || + (rctx->screen->b.has_streamout && /* The buffer range must be aligned to 4 with streamout. 
*/ box->x % 4 == 0 && box->width % 4 == 0))) { assert(usage & PIPE_TRANSFER_WRITE); diff --git a/src/gallium/drivers/r600/r600_hw_context.c b/src/gallium/drivers/r600/r600_hw_context.c index 191a81d..11414cb 100644 --- a/src/gallium/drivers/r600/r600_hw_context.c +++ b/src/gallium/drivers/r600/r600_hw_context.c @@ -445,7 +445,7 @@ void r600_cp_dma_copy_buffer(struct r600_context *rctx, struct radeon_winsys_cs *cs = rctx->b.rings.gfx.cs; assert
[Mesa-dev] [PATCH 07/10] r600g: refactor out code for buffer invalidation
From: Marek Olšák --- src/gallium/drivers/r600/r600_buffer.c | 56 +--- src/gallium/drivers/r600/r600_pipe.h | 1 + src/gallium/drivers/r600/r600_state_common.c | 55 +++ 3 files changed, 57 insertions(+), 55 deletions(-) diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c index 6c892c0..7239e5a 100644 --- a/src/gallium/drivers/r600/r600_buffer.c +++ b/src/gallium/drivers/r600/r600_buffer.c @@ -39,29 +39,6 @@ static void r600_buffer_destroy(struct pipe_screen *screen, FREE(rbuffer); } -static void r600_set_constants_dirty_if_bound(struct r600_context *rctx, - struct r600_resource *rbuffer) -{ - unsigned shader; - - for (shader = 0; shader < PIPE_SHADER_TYPES; shader++) { - struct r600_constbuf_state *state = &rctx->constbuf_state[shader]; - bool found = false; - uint32_t mask = state->enabled_mask; - - while (mask) { - unsigned i = u_bit_scan(&mask); - if (state->cb[i].buffer == &rbuffer->b.b) { - found = true; - state->dirty_mask |= 1 << i; - } - } - if (found) { - r600_constant_buffers_dirty(rctx, state); - } - } -} - static void *r600_buffer_get_transfer(struct pipe_context *ctx, struct pipe_resource *resource, unsigned level, @@ -114,38 +91,7 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx, /* Check if mapping this buffer would cause waiting for the GPU. */ if (r600_rings_is_buffer_referenced(&rctx->b, rbuffer->cs_buf, RADEON_USAGE_READWRITE) || rctx->b.ws->buffer_is_busy(rbuffer->buf, RADEON_USAGE_READWRITE)) { - unsigned i, mask; - - /* Discard the buffer. */ - pb_reference(&rbuffer->buf, NULL); - - /* Create a new one in the same pipe_resource. */ - /* XXX We probably want a different alignment for buffers and textures. */ - r600_init_resource(&rctx->screen->b, rbuffer, rbuffer->b.b.width0, 4096, - TRUE, rbuffer->b.b.usage); - - /* We changed the buffer, now we need to bind it where the old one was bound. */ - /* Vertex buffers. 
*/ - mask = rctx->vertex_buffer_state.enabled_mask; - while (mask) { - i = u_bit_scan(&mask); - if (rctx->vertex_buffer_state.vb[i].buffer == &rbuffer->b.b) { - rctx->vertex_buffer_state.dirty_mask |= 1 << i; - r600_vertex_buffers_dirty(rctx); - } - } - /* Streamout buffers. */ - for (i = 0; i < rctx->b.streamout.num_targets; i++) { - if (rctx->b.streamout.targets[i]->b.buffer == &rbuffer->b.b) { - if (rctx->b.streamout.begin_emitted) { - r600_emit_streamout_end(&rctx->b); - } - rctx->b.streamout.append_bitmask = rctx->b.streamout.enabled_mask; - r600_streamout_buffers_dirty(&rctx->b); - } - } - /* Constant buffers. */ - r600_set_constants_dirty_if_bound(rctx, rbuffer); + r600_invalidate_buffer(&rctx->b.b, &rbuffer->b.b); } } else if ((usage & PIPE_TRANSFER_DISCARD_RANGE) && diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h index b3eb70c..4b4d095 100644 --- a/src/gallium/drivers/r600/r600_pipe.h +++ b/src/gallium/drivers/r600/r600_pipe.h @@ -718,6 +718,7 @@ unsigned r600_get_swizzle_combined(const unsigned char *swizzle_format, uint32_t r600_translate_texformat(struct pipe_screen *screen, enum pipe_format format, const unsigned char *swizzle_view, uint32_t *word4_p, uint32_t *yuv_format_p); +void r600_invalidate_buffer(struct pipe_context *ctx, struct pipe_resource *buf); /* r600_uvd.c */ struct pipe_video_codec *r600_uvd_create_decoder(struct pipe_context *context, diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c index 7d3c5bc..718a173 100644 --- a/src/gallium/drivers/r600
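The code being moved here walks an enabled-slot bitmask and dirties every slot whose binding is the invalidated buffer. A minimal sketch of that pattern is below, under the assumption that slots fit in one 32-bit mask. struct sketch_binding_state, sketch_bit_scan and sketch_mark_dirty_if_bound are illustrative names; sketch_bit_scan only imitates what the u_bit_scan helper is used for in the patch and is not its implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_SLOTS 32

/* Hypothetical stand-in for a per-shader binding table. */
struct sketch_binding_state {
    const void *buffer[SKETCH_MAX_SLOTS];
    uint32_t enabled_mask; /* bit i set: slot i has a binding */
    uint32_t dirty_mask;   /* bit i set: slot i must be re-emitted */
};

/* Return the index of the lowest set bit and clear it in *mask.
 * The caller must pass a non-zero mask. */
static unsigned sketch_bit_scan(uint32_t *mask)
{
    unsigned i = 0;

    assert(*mask != 0);
    while (!(*mask & (1u << i)))
        i++;
    *mask &= ~(1u << i);
    return i;
}

/* Mark every enabled slot bound to `buf` as dirty.
 * Returns 1 if at least one binding was found, 0 otherwise. */
static int sketch_mark_dirty_if_bound(struct sketch_binding_state *state,
                                      const void *buf)
{
    uint32_t mask = state->enabled_mask;
    int found = 0;

    while (mask) {
        unsigned i = sketch_bit_scan(&mask);

        if (state->buffer[i] == buf) {
            found = 1;
            state->dirty_mask |= 1u << i;
        }
    }
    return found;
}
```

Scanning only the enabled bits keeps the cost proportional to the number of live bindings rather than the table size, and the return value lets the caller skip the state re-emit when nothing referenced the buffer.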
[Mesa-dev] [PATCH 09/10] r600g, radeonsi: add common interface for buffer invalidation
From: Marek Olšák This will be used by common code in the next commit. --- src/gallium/drivers/r600/r600_buffer.c| 2 +- src/gallium/drivers/r600/r600_pipe.h | 1 - src/gallium/drivers/r600/r600_state_common.c | 3 ++- src/gallium/drivers/radeon/r600_pipe_common.h | 4 src/gallium/drivers/radeonsi/r600_buffer.c| 2 +- src/gallium/drivers/radeonsi/si_descriptors.c | 3 ++- src/gallium/drivers/radeonsi/si_state.h | 1 - 7 files changed, 10 insertions(+), 6 deletions(-) diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c index 7239e5a..969803f 100644 --- a/src/gallium/drivers/r600/r600_buffer.c +++ b/src/gallium/drivers/r600/r600_buffer.c @@ -91,7 +91,7 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx, /* Check if mapping this buffer would cause waiting for the GPU. */ if (r600_rings_is_buffer_referenced(&rctx->b, rbuffer->cs_buf, RADEON_USAGE_READWRITE) || rctx->b.ws->buffer_is_busy(rbuffer->buf, RADEON_USAGE_READWRITE)) { - r600_invalidate_buffer(&rctx->b.b, &rbuffer->b.b); + rctx->b.invalidate_buffer(&rctx->b.b, &rbuffer->b.b); } } else if ((usage & PIPE_TRANSFER_DISCARD_RANGE) && diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h index 4551263..15e89a0 100644 --- a/src/gallium/drivers/r600/r600_pipe.h +++ b/src/gallium/drivers/r600/r600_pipe.h @@ -717,7 +717,6 @@ unsigned r600_get_swizzle_combined(const unsigned char *swizzle_format, uint32_t r600_translate_texformat(struct pipe_screen *screen, enum pipe_format format, const unsigned char *swizzle_view, uint32_t *word4_p, uint32_t *yuv_format_p); -void r600_invalidate_buffer(struct pipe_context *ctx, struct pipe_resource *buf); /* r600_uvd.c */ struct pipe_video_codec *r600_uvd_create_decoder(struct pipe_context *context, diff --git a/src/gallium/drivers/r600/r600_state_common.c b/src/gallium/drivers/r600/r600_state_common.c index 718a173..3c7bfe9 100644 --- a/src/gallium/drivers/r600/r600_state_common.c +++ 
b/src/gallium/drivers/r600/r600_state_common.c @@ -2072,7 +2072,7 @@ out_unknown: return ~0; } -void r600_invalidate_buffer(struct pipe_context *ctx, struct pipe_resource *buf) +static void r600_invalidate_buffer(struct pipe_context *ctx, struct pipe_resource *buf) { struct r600_context *rctx = (struct r600_context*)ctx; struct r600_resource *rbuffer = r600_resource(buf); @@ -2162,6 +2162,7 @@ void r600_init_common_state_functions(struct r600_context *rctx) rctx->b.b.create_surface = r600_create_surface; rctx->b.b.surface_destroy = r600_surface_destroy; rctx->b.b.draw_vbo = r600_draw_vbo; + rctx->b.invalidate_buffer = r600_invalidate_buffer; } void r600_trace_emit(struct r600_context *rctx) diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h index e830360..172dd93 100644 --- a/src/gallium/drivers/radeon/r600_pipe_common.h +++ b/src/gallium/drivers/radeon/r600_pipe_common.h @@ -256,6 +256,10 @@ struct r600_common_context { unsigned first_level, unsigned last_level, unsigned first_layer, unsigned last_layer, unsigned first_sample, unsigned last_sample); + + /* Reallocate the buffer and update all resource bindings where +* the buffer is bound, including all resource descriptors. */ + void (*invalidate_buffer)(struct pipe_context *ctx, struct pipe_resource *buf); }; /* r600_buffer.c */ diff --git a/src/gallium/drivers/radeonsi/r600_buffer.c b/src/gallium/drivers/radeonsi/r600_buffer.c index 560e7da..c952fe0 100644 --- a/src/gallium/drivers/radeonsi/r600_buffer.c +++ b/src/gallium/drivers/radeonsi/r600_buffer.c @@ -63,7 +63,7 @@ static void *r600_buffer_transfer_map(struct pipe_context *ctx, /* Check if mapping this buffer would cause waiting for the GPU. 
*/ if (r600_rings_is_buffer_referenced(&rctx->b, rbuffer->cs_buf, RADEON_USAGE_READWRITE) || rctx->b.ws->buffer_is_busy(rbuffer->buf, RADEON_USAGE_READWRITE)) { - si_invalidate_buffer(&rctx->b.b, &rbuffer->b.b); + rctx->b.invalidate_buffer(&rctx->b.b, &rbuffer->b.b); } } diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index 62c354f..e6d566d 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -551,7 +551,7 @@ static void si_desc_reset_buffer_offset(struct pipe_context *ctx, * idle by discarding its contents. Apps usually tell us when to do this using * map_buffer
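The point of this patch is that shared code can call rctx->b.invalidate_buffer without knowing which driver it is running in. A minimal sketch of that hook pattern follows, with toy types in place of pipe_context and pipe_resource. Every name starting with sketch_ is hypothetical, and the two "driver" functions only record that they ran; they do not model what r600g or radeonsi actually rebind.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the resource and the common context. */
struct sketch_buffer {
    unsigned storage_id; /* changes whenever the storage is reallocated */
};

struct sketch_common_context {
    unsigned next_storage_id;
    unsigned rebinds;
    /* Driver hook: reallocate the buffer and update all of its bindings. */
    void (*invalidate_buffer)(struct sketch_common_context *ctx,
                              struct sketch_buffer *buf);
};

/* Each driver supplies its own implementation and installs it at init. */
static void sketch_r600_invalidate_buffer(struct sketch_common_context *ctx,
                                          struct sketch_buffer *buf)
{
    buf->storage_id = ctx->next_storage_id++;
    ctx->rebinds += 1; /* placeholder for dirtying state atoms */
}

static void sketch_si_invalidate_buffer(struct sketch_common_context *ctx,
                                        struct sketch_buffer *buf)
{
    buf->storage_id = ctx->next_storage_id++;
    ctx->rebinds += 2; /* placeholder for updating descriptors and relocs */
}

/* Shared code: on a whole-resource discard of a busy buffer, swap the
 * storage through the hook instead of waiting for the GPU. */
static void sketch_map_discard_whole(struct sketch_common_context *ctx,
                                     struct sketch_buffer *buf, int busy)
{
    assert(ctx->invalidate_buffer != NULL);
    if (busy)
        ctx->invalidate_buffer(ctx, buf);
}
```

With the hook in the common context, the prototypes can drop out of the driver headers and both implementations can become static, which is what the hunks above do.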
[Mesa-dev] [PATCH 10/10] r600g, radeonsi: consolidate buffer code, add handling of DISCARD_RANGE for SI
From: Marek Olšák This adds 2 optimizations for radeonsi: - handling of DISCARD_RANGE - mapping an uninitialized buffer range is automatically UNSYNCHRONIZED --- src/gallium/drivers/r600/Makefile.sources | 1 - src/gallium/drivers/r600/r600_buffer.c| 202 -- src/gallium/drivers/r600/r600_pipe.c | 14 -- src/gallium/drivers/r600/r600_pipe.h | 10 -- src/gallium/drivers/r600/r600_state_common.c | 10 +- src/gallium/drivers/radeon/r600_buffer.c | 174 ++ src/gallium/drivers/radeon/r600_pipe_common.c | 17 +++ src/gallium/drivers/radeon/r600_pipe_common.h | 8 + src/gallium/drivers/radeonsi/r600_buffer.c| 101 + src/gallium/drivers/radeonsi/r600_resource.c | 4 +- src/gallium/drivers/radeonsi/r600_translate.c | 2 +- src/gallium/drivers/radeonsi/radeonsi_pipe.c | 15 -- src/gallium/drivers/radeonsi/radeonsi_pipe.h | 5 - 13 files changed, 210 insertions(+), 353 deletions(-) delete mode 100644 src/gallium/drivers/r600/r600_buffer.c diff --git a/src/gallium/drivers/r600/Makefile.sources b/src/gallium/drivers/r600/Makefile.sources index 76fd164..d96d98b 100644 --- a/src/gallium/drivers/r600/Makefile.sources +++ b/src/gallium/drivers/r600/Makefile.sources @@ -1,7 +1,6 @@ C_SOURCES = \ r600_asm.c \ r600_blit.c \ - r600_buffer.c \ r600_hw_context.c \ r600_isa.c \ r600_pipe.c \ diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c deleted file mode 100644 index 969803f..000 --- a/src/gallium/drivers/r600/r600_buffer.c +++ /dev/null @@ -1,202 +0,0 @@ -/* - * Copyright 2010 Jerome Glisse - * - * Permission is hereby granted, free of charge, to any person obtaining a - * copy of this software and associated documentation files (the "Software"), - * to deal in the Software without restriction, including without limitation - * on the rights to use, copy, modify, merge, publish, distribute, sub - * license, and/or sell copies of the Software, and to permit persons to whom - * the Software is furnished to do so, subject to the following conditions: - * - * 
The above copyright notice and this permission notice (including the next - * paragraph) shall be included in all copies or substantial portions of the - * Software. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL - * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, - * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR - * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE - * USE OR OTHER DEALINGS IN THE SOFTWARE. - * - * Authors: - * Jerome Glisse - * Corbin Simpson - */ -#include "r600_pipe.h" -#include "util/u_upload_mgr.h" -#include "util/u_memory.h" -#include "util/u_surface.h" - -static void r600_buffer_destroy(struct pipe_screen *screen, - struct pipe_resource *buf) -{ - struct r600_resource *rbuffer = r600_resource(buf); - - util_range_destroy(&rbuffer->valid_buffer_range); - pb_reference(&rbuffer->buf, NULL); - FREE(rbuffer); -} - -static void *r600_buffer_get_transfer(struct pipe_context *ctx, - struct pipe_resource *resource, - unsigned level, - unsigned usage, - const struct pipe_box *box, - struct pipe_transfer **ptransfer, - void *data, struct r600_resource *staging, - unsigned offset) -{ - struct r600_context *rctx = (struct r600_context*)ctx; - struct r600_transfer *transfer = util_slab_alloc(&rctx->pool_transfers); - - transfer->transfer.resource = resource; - transfer->transfer.level = level; - transfer->transfer.usage = usage; - transfer->transfer.box = *box; - transfer->transfer.stride = 0; - transfer->transfer.layer_stride = 0; - transfer->offset = offset; - transfer->staging = staging; - *ptransfer = &transfer->transfer; - return data; -} - -static void *r600_buffer_transfer_map(struct pipe_context *ctx, - struct pipe_resource *resource, - unsigned level, - unsigned usage, - const struct pipe_box *box, 
- struct pipe_transfer **ptransfer) -{ - struct r600_context *rctx = (struct r600_context*)ctx; - struct r600_resource *rbuffer = r600_resource(resource); - uint8_t *data; - - assert(box->x + box->width <= re
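The commit message mentions that mapping an uninitialized buffer range is automatically UNSYNCHRONIZED. The shared code that does this is not visible in the excerpt above, so what follows is only a sketch of the idea, assuming the buffer tracks a single half-open valid range that grows as data is written: a write mapping that does not overlap the valid range cannot race with the GPU and can skip synchronization. struct sketch_range, the sketch_range_* helpers and the SKETCH_MAP_* bits are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical usage bits, not the PIPE_TRANSFER_* values. */
#define SKETCH_MAP_READ           (1u << 0)
#define SKETCH_MAP_WRITE          (1u << 1)
#define SKETCH_MAP_UNSYNCHRONIZED (1u << 2)

/* Half-open byte range [start, end) that has ever been written. */
struct sketch_range {
    unsigned start;
    unsigned end;
};

static void sketch_range_set_empty(struct sketch_range *r)
{
    r->start = ~0u;
    r->end = 0;
}

/* Grow the valid range so that it also covers [start, end). */
static void sketch_range_add(struct sketch_range *r,
                             unsigned start, unsigned end)
{
    assert(start <= end);
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

static bool sketch_ranges_intersect(const struct sketch_range *r,
                                    unsigned start, unsigned end)
{
    unsigned lo = start > r->start ? start : r->start;
    unsigned hi = end < r->end ? end : r->end;

    return lo < hi;
}

/* Promote a write mapping of never-initialized bytes to unsynchronized. */
static unsigned sketch_adjust_map_flags(const struct sketch_range *valid,
                                        unsigned usage,
                                        unsigned x, unsigned width)
{
    if ((usage & SKETCH_MAP_WRITE) &&
        !sketch_ranges_intersect(valid, x, x + width))
        usage |= SKETCH_MAP_UNSYNCHRONIZED;
    return usage;
}
```

Tracking one conservative range instead of a list of intervals keeps the check cheap; the cost is that a gap between two written regions is treated as initialized.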
[Mesa-dev] [PATCH 04/10] radeon: squash with buffer.c
From: Marek Olšák

---
 src/gallium/drivers/radeon/r600_buffer.c      | 14 ++++++++++++++
 src/gallium/drivers/radeon/r600_pipe_common.c | 14 --------------
 src/gallium/drivers/radeon/r600_pipe_common.h | 22 ++++++++++++----------
 3 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer.c b/src/gallium/drivers/radeon/r600_buffer.c
index 13d11bd..8158234 100644
--- a/src/gallium/drivers/radeon/r600_buffer.c
+++ b/src/gallium/drivers/radeon/r600_buffer.c
@@ -26,6 +26,20 @@
 
 #include "r600_cs.h"
 
+boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx,
+					struct radeon_winsys_cs_handle *buf,
+					enum radeon_bo_usage usage)
+{
+	if (ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs, buf, usage)) {
+		return TRUE;
+	}
+	if (ctx->rings.dma.cs &&
+	    ctx->ws->cs_is_buffer_referenced(ctx->rings.dma.cs, buf, usage)) {
+		return TRUE;
+	}
+	return FALSE;
+}
+
 void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx,
                                       struct r600_resource *resource,
                                       unsigned usage)
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
index 2cdca77..4c95159 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -305,17 +305,3 @@ void r600_screen_clear_buffer(struct r600_common_screen *rscreen, struct pipe_re
 	rscreen->aux_context->flush(rscreen->aux_context, NULL, 0);
 	pipe_mutex_unlock(rscreen->aux_context_lock);
 }
-
-boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx,
-					struct radeon_winsys_cs_handle *buf,
-					enum radeon_bo_usage usage)
-{
-	if (ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs, buf, usage)) {
-		return TRUE;
-	}
-	if (ctx->rings.dma.cs &&
-	    ctx->ws->cs_is_buffer_referenced(ctx->rings.dma.cs, buf, usage)) {
-		return TRUE;
-	}
-	return FALSE;
-}
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index f0fcaac..eb54b2a 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -255,6 +255,18 @@ struct r600_common_context {
 				unsigned first_sample, unsigned last_sample);
 };
 
+/* r600_buffer.c */
+boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx,
+					struct radeon_winsys_cs_handle *buf,
+					enum radeon_bo_usage usage);
+void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx,
+                                      struct r600_resource *resource,
+                                      unsigned usage);
+bool r600_init_resource(struct r600_common_screen *rscreen,
+			struct r600_resource *res,
+			unsigned size, unsigned alignment,
+			bool use_reusable_pool, unsigned usage);
+
 /* r600_common_pipe.c */
 bool r600_common_screen_init(struct r600_common_screen *rscreen,
 			     struct radeon_winsys *ws);
@@ -267,16 +279,6 @@ bool r600_can_dump_shader(struct r600_common_screen *rscreen,
 			  const struct tgsi_token *tokens);
 void r600_screen_clear_buffer(struct r600_common_screen *rscreen, struct pipe_resource *dst,
 			      unsigned offset, unsigned size, unsigned value);
-boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx,
-					struct radeon_winsys_cs_handle *buf,
-					enum radeon_bo_usage usage);
-void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx,
-                                      struct r600_resource *resource,
-                                      unsigned usage);
-bool r600_init_resource(struct r600_common_screen *rscreen,
-			struct r600_resource *res,
-			unsigned size, unsigned alignment,
-			bool use_reusable_pool, unsigned usage);
 
 /* r600_streamout.c */
 void r600_streamout_buffers_dirty(struct r600_common_context *rctx);
-- 
1.8.3.2
[Mesa-dev] [PATCH 08/10] r600g, radeonsi: consolidate some debug flags
From: Marek Olšák

---
 src/gallium/drivers/r600/r600_pipe.c          | 3 ---
 src/gallium/drivers/r600/r600_pipe.h          | 1 -
 src/gallium/drivers/radeon/r600_pipe_common.c | 4 ++++
 src/gallium/drivers/radeon/r600_pipe_common.h | 1 +
 4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c
index 0075ae6..296d466 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -43,14 +43,11 @@
 
 static const struct debug_named_value r600_debug_options[] = {
 	/* features */
-	{ "nohyperz", DBG_NO_HYPERZ, "Disable Hyper-Z" },
 #if defined(R600_USE_LLVM)
 	{ "nollvm", DBG_NO_LLVM, "Disable the LLVM shader compiler" },
 #endif
 	{ "nocpdma", DBG_NO_CP_DMA, "Disable CP DMA" },
 	{ "nodma", DBG_NO_ASYNC_DMA, "Disable asynchronous DMA" },
-	/* GL uses the word INVALIDATE, gallium uses the word DISCARD */
-	{ "noinvalrange", DBG_NO_DISCARD_RANGE, "Disable handling of INVALIDATE_RANGE map flags" },
 
 	/* shader backend */
 	{ "nosb", DBG_NO_SB, "Disable sb backend for graphics shaders" },
diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h
index 4b4d095..4551263 100644
--- a/src/gallium/drivers/r600/r600_pipe.h
+++ b/src/gallium/drivers/r600/r600_pipe.h
@@ -192,7 +192,6 @@ struct r600_viewport_state {
 #define DBG_NO_LLVM		(1 << 17)
 #define DBG_NO_CP_DMA		(1 << 18)
 #define DBG_NO_ASYNC_DMA	(1 << 19)
-#define DBG_NO_DISCARD_RANGE	(1 << 20)
 /* shader backend */
 #define DBG_NO_SB		(1 << 21)
 #define DBG_SB_CS		(1 << 22)
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
index 4c95159..3a96aea 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -44,6 +44,10 @@ static const struct debug_named_value common_debug_options[] = {
 	{ "ps", DBG_PS, "Print pixel shaders" },
 	{ "cs", DBG_CS, "Print compute shaders" },
 
+	{ "nohyperz", DBG_NO_HYPERZ, "Disable Hyper-Z" },
+	/* GL uses the word INVALIDATE, gallium uses the word DISCARD */
+	{ "noinvalrange", DBG_NO_DISCARD_RANGE, "Disable handling of INVALIDATE_RANGE map flags" },
+
 	DEBUG_NAMED_VALUE_END /* must be last */
 };
 
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index 4a5476e..e830360 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -74,6 +74,7 @@
 #define DBG_CS			(1 << 12)
 /* features */
 #define DBG_NO_HYPERZ		(1 << 13)
+#define DBG_NO_DISCARD_RANGE	(1 << 14)
 /* The maximum allowed bit is 15. */
 
 struct r600_common_context;
-- 
1.8.3.2
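To illustrate why moving an entry between the two option tables is enough to make a flag available to both drivers, here is a small sketch of a name-to-bit table and a parser for a comma-separated option string. The bit positions for cs, nohyperz and noinvalrange are taken from the r600_pipe_common.h hunk; everything else (struct sketch_named_value, sketch_parse_flags, the SKETCH_ prefix) is made up, and Gallium's own option parser, which the patch does not show, may behave differently.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Common flags use the low bits, as in the r600_pipe_common.h hunk. */
#define SKETCH_DBG_CS               (1u << 12)
#define SKETCH_DBG_NO_HYPERZ        (1u << 13)
#define SKETCH_DBG_NO_DISCARD_RANGE (1u << 14)

/* Hypothetical, simplified counterpart of a named debug option. */
struct sketch_named_value {
    const char *name;
    uint32_t value;
};

static const struct sketch_named_value sketch_common_options[] = {
    { "cs", SKETCH_DBG_CS },
    { "nohyperz", SKETCH_DBG_NO_HYPERZ },
    { "noinvalrange", SKETCH_DBG_NO_DISCARD_RANGE },
    { NULL, 0 } /* must be last */
};

/* Turn "a,b,c" into the OR of the matching table values.
 * Unknown names and empty items are ignored. */
static uint32_t sketch_parse_flags(const char *str,
                                   const struct sketch_named_value *table)
{
    uint32_t flags = 0;

    assert(str != NULL && table != NULL);

    while (*str) {
        size_t len = strcspn(str, ",");

        for (const struct sketch_named_value *v = table; v->name; v++) {
            if (strlen(v->name) == len && strncmp(v->name, str, len) == 0)
                flags |= v->value;
        }

        str += len;
        if (*str == ',')
            str++;
    }
    return flags;
}
```

Because the common table owns bits up to 15 and the driver-specific flags start above that, the two sets can be combined in one word without collisions, which is why DBG_NO_DISCARD_RANGE moves from bit 20 to bit 14 in this patch.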
[Mesa-dev] [PATCH 02/10] r600g: use common interfaces in buffer_transfer_unmap
From: Marek Olšák i.e. dma_copy and resource_copy_region. --- src/gallium/drivers/r600/evergreen_state.c | 6 ++ src/gallium/drivers/r600/r600_blit.c | 4 ++-- src/gallium/drivers/r600/r600_buffer.c | 18 -- src/gallium/drivers/r600/r600_pipe.h | 2 -- src/gallium/drivers/r600/r600_state.c | 6 ++ 5 files changed, 22 insertions(+), 14 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index a4a4e3e..065ac6f 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -3769,6 +3769,12 @@ static boolean evergreen_dma_blit(struct pipe_context *ctx, if (rctx->b.rings.dma.cs == NULL) { return FALSE; } + + if (dst->target == PIPE_BUFFER && src->target == PIPE_BUFFER) { + evergreen_dma_copy(rctx, dst, src, dst_x, src_box->x, src_box->width); + return TRUE; + } + if (src->format != dst->format) { return FALSE; } diff --git a/src/gallium/drivers/r600/r600_blit.c b/src/gallium/drivers/r600/r600_blit.c index f33bb43..8680f79 100644 --- a/src/gallium/drivers/r600/r600_blit.c +++ b/src/gallium/drivers/r600/r600_blit.c @@ -599,8 +599,8 @@ static void r600_clear_depth_stencil(struct pipe_context *ctx, r600_blitter_end(ctx); } -void r600_copy_buffer(struct pipe_context *ctx, struct pipe_resource *dst, unsigned dstx, - struct pipe_resource *src, const struct pipe_box *src_box) +static void r600_copy_buffer(struct pipe_context *ctx, struct pipe_resource *dst, unsigned dstx, +struct pipe_resource *src, const struct pipe_box *src_box) { struct r600_context *rctx = (struct r600_context*)ctx; diff --git a/src/gallium/drivers/r600/r600_buffer.c b/src/gallium/drivers/r600/r600_buffer.c index 107538a..8d5c255 100644 --- a/src/gallium/drivers/r600/r600_buffer.c +++ b/src/gallium/drivers/r600/r600_buffer.c @@ -196,24 +196,22 @@ static void r600_buffer_transfer_unmap(struct pipe_context *pipe, if (rtransfer->staging) { struct pipe_resource *dst, *src; unsigned soffset, doffset, size; + 
struct pipe_box box; dst = transfer->resource; src = &rtransfer->staging->b.b; size = transfer->box.width; doffset = transfer->box.x; soffset = rtransfer->offset + transfer->box.x % R600_MAP_BUFFER_ALIGNMENT; + + u_box_1d(soffset, size, &box); + /* Copy the staging buffer into the original one. */ - if (rctx->b.rings.dma.cs && !(size % 4) && !(doffset % 4) && !(soffset % 4)) { - if (rctx->screen->b.chip_class >= EVERGREEN) { - evergreen_dma_copy(rctx, dst, src, doffset, soffset, size); - } else { - r600_dma_copy(rctx, dst, src, doffset, soffset, size); - } + if (!(size % 4) && !(doffset % 4) && !(soffset % 4) && + rctx->b.dma_copy(pipe, dst, 0, doffset, 0, 0, src, 0, &box)) { + /* DONE. */ } else { - struct pipe_box box; - - u_box_1d(soffset, size, &box); - r600_copy_buffer(pipe, dst, doffset, src, &box); + pipe->resource_copy_region(pipe, dst, 0, doffset, 0, 0, src, 0, &box); } pipe_resource_reference((struct pipe_resource**)&rtransfer->staging, NULL); } diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h index f0d4be4..d58cd2e 100644 --- a/src/gallium/drivers/r600/r600_pipe.h +++ b/src/gallium/drivers/r600/r600_pipe.h @@ -598,8 +598,6 @@ void evergreen_init_color_surface_rat(struct r600_context *rctx, void evergreen_update_db_shader_control(struct r600_context * rctx); /* r600_blit.c */ -void r600_copy_buffer(struct pipe_context *ctx, struct pipe_resource *dst, unsigned dstx, - struct pipe_resource *src, const struct pipe_box *src_box); void r600_init_blit_functions(struct r600_context *rctx); void r600_decompress_depth_textures(struct r600_context *rctx, struct r600_samplerview_state *textures); diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c index 41e9c5d..b938c33 100644 --- a/src/gallium/drivers/r600/r600_state.c +++ b/src/gallium/drivers/r600/r600_state.c @@ -3149,6 +3149,12 @@ static boolean r600_dma_blit(struct pipe_context *ctx, if (rctx->b.rings.dma.cs == NULL) { return FALSE; 
} + + if (dst->target == PIPE_BUFFER && src->target == PIPE_BUFFER) { + r600_dma_copy(rctx, dst, src, dst_x, src_box->x, src_box->
Re: [Mesa-dev] [PATCH 00/10] Sharing r600g glMapBuffer optimizations with radeonsi
BTW the improvement only applies to the radeonsi driver.

Marek

On Fri, Nov 29, 2013 at 6:55 PM, Marek Olšák wrote:
> This series moves the r600_buffer.c files from both drivers to the shared
> directory gallium/drivers/radeon, and implements what's missing for radeonsi
> to make sharing the code possible.
>
> This improves Valve's Team Fortress 2 performance by 75%.
>
> Before: 20 fps
> After: 35 fps
>
> Please review.
>
> Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 04/10] radeon: squash with buffer.c
Sorry, I forgot to merge this patch with the first one (which is what the commit message says). I'll do so before committing. Marek On Fri, Nov 29, 2013 at 6:55 PM, Marek Olšák wrote: > From: Marek Olšák > > --- > src/gallium/drivers/radeon/r600_buffer.c | 14 ++ > src/gallium/drivers/radeon/r600_pipe_common.c | 14 -- > src/gallium/drivers/radeon/r600_pipe_common.h | 22 -- > 3 files changed, 26 insertions(+), 24 deletions(-) > > diff --git a/src/gallium/drivers/radeon/r600_buffer.c > b/src/gallium/drivers/radeon/r600_buffer.c > index 13d11bd..8158234 100644 > --- a/src/gallium/drivers/radeon/r600_buffer.c > +++ b/src/gallium/drivers/radeon/r600_buffer.c > @@ -26,6 +26,20 @@ > > #include "r600_cs.h" > > +boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx, > + struct radeon_winsys_cs_handle *buf, > + enum radeon_bo_usage usage) > +{ > + if (ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs, buf, usage)) { > + return TRUE; > + } > + if (ctx->rings.dma.cs && > + ctx->ws->cs_is_buffer_referenced(ctx->rings.dma.cs, buf, usage)) { > + return TRUE; > + } > + return FALSE; > +} > + > void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx, >struct r600_resource *resource, >unsigned usage) > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c > b/src/gallium/drivers/radeon/r600_pipe_common.c > index 2cdca77..4c95159 100644 > --- a/src/gallium/drivers/radeon/r600_pipe_common.c > +++ b/src/gallium/drivers/radeon/r600_pipe_common.c > @@ -305,17 +305,3 @@ void r600_screen_clear_buffer(struct r600_common_screen > *rscreen, struct pipe_re > rscreen->aux_context->flush(rscreen->aux_context, NULL, 0); > pipe_mutex_unlock(rscreen->aux_context_lock); > } > - > -boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx, > - struct radeon_winsys_cs_handle *buf, > - enum radeon_bo_usage usage) > -{ > - if (ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs, buf, usage)) { > - return TRUE; > - } > - if (ctx->rings.dma.cs && > - 
ctx->ws->cs_is_buffer_referenced(ctx->rings.dma.cs, buf, usage)) { > - return TRUE; > - } > - return FALSE; > -} > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h > b/src/gallium/drivers/radeon/r600_pipe_common.h > index f0fcaac..eb54b2a 100644 > --- a/src/gallium/drivers/radeon/r600_pipe_common.h > +++ b/src/gallium/drivers/radeon/r600_pipe_common.h > @@ -255,6 +255,18 @@ struct r600_common_context { > unsigned first_sample, unsigned > last_sample); > }; > > +/* r600_buffer.c */ > +boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx, > + struct radeon_winsys_cs_handle *buf, > + enum radeon_bo_usage usage); > +void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx, > + struct r600_resource *resource, > + unsigned usage); > +bool r600_init_resource(struct r600_common_screen *rscreen, > + struct r600_resource *res, > + unsigned size, unsigned alignment, > + bool use_reusable_pool, unsigned usage); > + > /* r600_common_pipe.c */ > bool r600_common_screen_init(struct r600_common_screen *rscreen, > struct radeon_winsys *ws); > @@ -267,16 +279,6 @@ bool r600_can_dump_shader(struct r600_common_screen > *rscreen, > const struct tgsi_token *tokens); > void r600_screen_clear_buffer(struct r600_common_screen *rscreen, struct > pipe_resource *dst, > unsigned offset, unsigned size, unsigned value); > -boolean r600_rings_is_buffer_referenced(struct r600_common_context *ctx, > - struct radeon_winsys_cs_handle *buf, > - enum radeon_bo_usage usage); > -void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx, > - struct r600_resource *resource, > - unsigned usage); > -bool r600_init_resource(struct r600_common_screen *rscreen, > - struct r600_resource *res, > - unsigned size, unsigned alignment, > - bool use_reusable_pool, unsigned usage); > > /* r600_streamout.c */ > void r600_streamout_buffers_dirty(struct r600_common_context *rctx); > -- > 1.8.3.2 > ___ mesa-dev mailing l
[Mesa-dev] [PATCH] winsys/radeon: set/get the scanout flag with the tiling ioctls
From: Marek Olšák If we assume that all buffers allocated by the DDX are scanout, a new flag that says "this is not scanout" has to be added to support the non-scanout buffers and maintain backward compatibility. This fixes bad rendering on Wayland. The flag is defined as: #define RADEON_TILING_R600_NO_SCANOUT RADEON_TILING_SWAP_16BIT AFAIK, RADEON_TILING_SWAP_16BIT is not used on SI. --- src/gallium/drivers/r300/r300_state.c | 2 +- src/gallium/drivers/r300/r300_texture.c | 5 +++-- src/gallium/drivers/radeon/r600_texture.c | 9 + src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 12 ++-- src/gallium/winsys/radeon/drm/radeon_winsys.h | 6 -- 5 files changed, 23 insertions(+), 11 deletions(-) diff --git a/src/gallium/drivers/r300/r300_state.c b/src/gallium/drivers/r300/r300_state.c index 6840e8b..048672c 100644 --- a/src/gallium/drivers/r300/r300_state.c +++ b/src/gallium/drivers/r300/r300_state.c @@ -844,7 +844,7 @@ static void r300_tex_set_tiling_flags(struct r300_context *r300, r300->rws->buffer_set_tiling(tex->buf, r300->cs, tex->tex.microtile, tex->tex.macrotile[level], 0, 0, 0, 0, 0, -tex->tex.stride_in_bytes[0]); +tex->tex.stride_in_bytes[0], false); tex->surface_level = level; } diff --git a/src/gallium/drivers/r300/r300_texture.c b/src/gallium/drivers/r300/r300_texture.c index b7fb081..4ea69dc 100644 --- a/src/gallium/drivers/r300/r300_texture.c +++ b/src/gallium/drivers/r300/r300_texture.c @@ -1060,7 +1060,7 @@ r300_texture_create_object(struct r300_screen *rscreen, rws->buffer_set_tiling(tex->buf, NULL, tex->tex.microtile, tex->tex.macrotile[0], 0, 0, 0, 0, 0, -tex->tex.stride_in_bytes[0]); +tex->tex.stride_in_bytes[0], false); return tex; @@ -1115,7 +1115,8 @@ struct pipe_resource *r300_texture_from_handle(struct pipe_screen *screen, if (!buffer) return NULL; -rws->buffer_get_tiling(buffer, &microtile, &macrotile, NULL, NULL, NULL, NULL, NULL); +rws->buffer_get_tiling(buffer, &microtile, &macrotile, NULL, NULL, NULL, + NULL, NULL, NULL); /* Enforce a microtiled zbuffer. 
*/ if (util_format_is_depth_or_stencil(base->format) && diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index 12f412d..bd1d4c8 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -253,7 +253,8 @@ static boolean r600_texture_get_handle(struct pipe_screen* screen, surface->tile_split, surface->stencil_tile_split, surface->mtilea, - surface->level[0].pitch_bytes); + surface->level[0].pitch_bytes, + (surface->flags & RADEON_SURF_SCANOUT) != 0); return rscreen->ws->buffer_get_handle(resource->buf, surface->level[0].pitch_bytes, whandle); @@ -760,6 +761,7 @@ struct pipe_resource *r600_texture_from_handle(struct pipe_screen *screen, unsigned array_mode; enum radeon_bo_layout micro, macro; struct radeon_surface surface; + bool scanout; int r; /* Support only 2D textures without mipmaps */ @@ -775,7 +777,7 @@ struct pipe_resource *r600_texture_from_handle(struct pipe_screen *screen, &surface.bankw, &surface.bankh, &surface.tile_split, &surface.stencil_tile_split, - &surface.mtilea); + &surface.mtilea, &scanout); if (macro == RADEON_LAYOUT_TILED) array_mode = RADEON_SURF_MODE_2D; @@ -789,8 +791,7 @@ struct pipe_resource *r600_texture_from_handle(struct pipe_screen *screen, return NULL; } - /* always set the scanout flags on SI */ - if (rscreen->chip_class >= SI) + if (scanout) surface.flags |= RADEON_SURF_SCANOUT; return (struct pipe_resource *)r600_texture_create_object(screen, templ, diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c index 3019a52..a99d754 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c @@ -49,6 +49,7 @@ #define RADEON_BO_FLAGS_MACRO_TILE 1 #define RADEON_BO_FLAGS_MICRO_TILE 2 #define RADEON_BO_FLAGS_MICRO_TILE_SQUARE 0x20 +#define RADEON_TILING_R600_NO_SCANOUT RADEON_TILING_SWAP_16BIT #ifndef DRM_RADEON_GEM_WAIT #define 
DRM_RADEON_GEM_WAIT 0x2b @@ -738,7 +739,8 @@ static void radeon_bo_get_tiling(struct pb_buffer *_buf,
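The backward-compatibility argument in the commit message above can be illustrated in isolation. Flags written by a DDX that predates the new bit never have it set, so those buffers keep decoding as scanout; only a producer that knows about the bit can mark a buffer as non-scanout. This is a minimal sketch under assumptions: the bit position and the helper names encode_tiling_flags and decode_is_scanout are hypothetical, not the radeon winsys or kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the patch itself reuses RADEON_TILING_SWAP_16BIT. */
#define TILING_NO_SCANOUT (1u << 5)

/* Producer side: store the inverted property so that "bit clear" means scanout. */
static uint32_t encode_tiling_flags(uint32_t flags, bool scanout)
{
    if (scanout)
        return flags & ~TILING_NO_SCANOUT;
    return flags | TILING_NO_SCANOUT;
}

/* Consumer side: flags from code that predates the bit decode as scanout. */
static bool decode_is_scanout(uint32_t flags)
{
    return (flags & TILING_NO_SCANOUT) == 0;
}

/* Small self-check showing the three cases discussed above. */
static void scanout_flag_selfcheck(void)
{
    assert(decode_is_scanout(0u));                               /* legacy producer */
    assert(decode_is_scanout(encode_tiling_flags(0u, true)));    /* new, scanout */
    assert(!decode_is_scanout(encode_tiling_flags(0u, false)));  /* new, non-scanout */
}
```

A positive "is scanout" bit would not work the same way, because flags from an old producer would then read as non-scanout.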
Re: [Mesa-dev] [PATCH 01/10] radeon: move some functions to r600_buffer.c
Reviewed-by: Christoph Brill 2013/11/29 Marek Olšák > From: Marek Olšák > > --- > src/gallium/drivers/radeon/Makefile.sources | 1 + > src/gallium/drivers/radeon/r600_buffer.c | 133 > ++ > src/gallium/drivers/radeon/r600_pipe_common.c | 106 > 3 files changed, 134 insertions(+), 106 deletions(-) > create mode 100644 src/gallium/drivers/radeon/r600_buffer.c > > diff --git a/src/gallium/drivers/radeon/Makefile.sources > b/src/gallium/drivers/radeon/Makefile.sources > index 894f22a..bd06ed8 100644 > --- a/src/gallium/drivers/radeon/Makefile.sources > +++ b/src/gallium/drivers/radeon/Makefile.sources > @@ -1,4 +1,5 @@ > C_SOURCES := \ > + r600_buffer.c \ > r600_pipe_common.c \ > r600_streamout.c \ > r600_texture.c \ > diff --git a/src/gallium/drivers/radeon/r600_buffer.c > b/src/gallium/drivers/radeon/r600_buffer.c > new file mode 100644 > index 000..13d11bd > --- /dev/null > +++ b/src/gallium/drivers/radeon/r600_buffer.c > @@ -0,0 +1,133 @@ > +/* > + * Copyright 2013 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without > limitation > + * on the rights to use, copy, modify, merge, publish, distribute, sub > + * license, and/or sell copies of the Software, and to permit persons to > whom > + * the Software is furnished to do so, subject to the following > conditions: > + * > + * The above copyright notice and this permission notice (including the > next > + * paragraph) shall be included in all copies or substantial portions of > the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. 
IN NO EVENT > SHALL > + * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM, > + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR > + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR > THE > + * USE OR OTHER DEALINGS IN THE SOFTWARE. > + * > + * Authors: > + * Marek Olšák > + */ > + > +#include "r600_cs.h" > + > +void *r600_buffer_map_sync_with_rings(struct r600_common_context *ctx, > + struct r600_resource *resource, > + unsigned usage) > +{ > + enum radeon_bo_usage rusage = RADEON_USAGE_READWRITE; > + > + if (usage & PIPE_TRANSFER_UNSYNCHRONIZED) { > + return ctx->ws->buffer_map(resource->cs_buf, NULL, usage); > + } > + > + if (!(usage & PIPE_TRANSFER_WRITE)) { > + /* have to wait for the last write */ > + rusage = RADEON_USAGE_WRITE; > + } > + > + if (ctx->rings.gfx.cs->cdw && > + ctx->ws->cs_is_buffer_referenced(ctx->rings.gfx.cs, > +resource->cs_buf, rusage)) { > + if (usage & PIPE_TRANSFER_DONTBLOCK) { > + ctx->rings.gfx.flush(ctx, RADEON_FLUSH_ASYNC); > + return NULL; > + } else { > + ctx->rings.gfx.flush(ctx, 0); > + } > + } > + if (ctx->rings.dma.cs && > + ctx->rings.dma.cs->cdw && > + ctx->ws->cs_is_buffer_referenced(ctx->rings.dma.cs, > +resource->cs_buf, rusage)) { > + if (usage & PIPE_TRANSFER_DONTBLOCK) { > + ctx->rings.dma.flush(ctx, RADEON_FLUSH_ASYNC); > + return NULL; > + } else { > + ctx->rings.dma.flush(ctx, 0); > + } > + } > + > + if (ctx->ws->buffer_is_busy(resource->buf, rusage)) { > + if (usage & PIPE_TRANSFER_DONTBLOCK) { > + return NULL; > + } else { > + /* We will be wait for the GPU. Wait for any > offloaded > +* CS flush to complete to avoid busy-waiting in > the winsys. 
*/ > + ctx->ws->cs_sync_flush(ctx->rings.gfx.cs); > + if (ctx->rings.dma.cs) > + ctx->ws->cs_sync_flush(ctx->rings.dma.cs); > + } > + } > + > + return ctx->ws->buffer_map(resource->cs_buf, NULL, usage); > +} > + > +bool r600_init_resource(struct r600_common_screen *rscreen, > + struct r600_resource *res, > + unsigned size, unsigned alignment, > + bool use_reusable_pool, unsigned usage) > +{ > + uint32_t initial_domain, domains; > + > + switch(usage) { > + case PIPE_USAGE_STAGING: > + /* Sta
[Mesa-dev] [PATCH 1/2 v2] nv50: fix a small leak on context destroy
Signed-off-by: Ilia Mirkin
Cc: "9.2 10.0"
---
v2: use FREE instead of free.

 src/gallium/drivers/nouveau/nv50/nv50_context.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c b/src/gallium/drivers/nouveau/nv50/nv50_context.c
index b6bdf79..11afc48 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c
@@ -114,6 +114,8 @@ nv50_destroy(struct pipe_context *pipe)
    draw_destroy(nv50->draw);
 #endif
 
+   FREE(nv50->blit);
+
    nouveau_context_destroy(&nv50->base);
 }
 
--
1.8.3.2
[Mesa-dev] [PATCH 5/6] i965/vs: Sample from MCS surface when required
Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_vec4.h | 1 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 41 +- 2 files changed, 35 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..d4029d8 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -477,6 +477,7 @@ public: void emit_unpack_half_2x16(dst_reg dst, src_reg src0); uint32_t gather_channel(ir_texture *ir, int sampler); + src_reg emit_mcs_fetch(ir_texture *ir, src_reg coordinate, int sampler); void swizzle_result(ir_texture *ir, src_reg orig_val, int sampler); void emit_ndc_computation(); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index a13eafb..3088838 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -2215,6 +2215,31 @@ vec4_visitor::visit(ir_call *ir) } } +src_reg +vec4_visitor::emit_mcs_fetch(ir_texture *ir, src_reg coordinate, int sampler) +{ + vec4_instruction *inst = new(mem_ctx) vec4_instruction(this, SHADER_OPCODE_TXF_MCS); + inst->base_mrf = 2; + inst->mlen = 1; + inst->sampler = sampler; + inst->dst = dst_reg(this, glsl_type::uvec4_type); + inst->dst.writemask = WRITEMASK_XYZW; + + /* parameters are: u, v, r, lod; lod will always be zero due to api restrictions */ + int param_base = inst->base_mrf; + int coord_mask = (1 << ir->coordinate->type->vector_elements) - 1; + int zero_mask = 0xf & ~coord_mask; + + emit(MOV(dst_reg(MRF, param_base, ir->coordinate->type, coord_mask), +coordinate)); + + emit(MOV(dst_reg(MRF, param_base, ir->coordinate->type, zero_mask), +src_reg(0))); + + emit(inst); + return src_reg(inst->dst); +} + void vec4_visitor::visit(ir_texture *ir) { @@ -2265,7 +2290,7 @@ vec4_visitor::visit(ir_texture *ir) } const glsl_type *lod_type = NULL, *sample_index_type = NULL; - src_reg lod, dPdx, dPdy, sample_index; + 
src_reg lod, dPdx, dPdy, sample_index, mcs; switch (ir->op) { case ir_tex: lod = src_reg(0.0f); @@ -2286,6 +2311,11 @@ vec4_visitor::visit(ir_texture *ir) ir->lod_info.sample_index->accept(this); sample_index = this->result; sample_index_type = ir->lod_info.sample_index->type; + + if (brw->gen >= 7 && key->tex.compressed_multisample_layout_mask & (1accept(this); @@ -2406,13 +2436,10 @@ vec4_visitor::visit(ir_texture *ir) } else if (ir->op == ir_txf_ms) { emit(MOV(dst_reg(MRF, param_base + 1, sample_index_type, WRITEMASK_X), sample_index)); + if (brw->gen >= 7) +emit(MOV(dst_reg(MRF, param_base + 1, glsl_type::uint_type, WRITEMASK_Y), + mcs)); inst->mlen++; - - /* on Gen7, there is an additional MCS parameter here after SI, - * but we don't bother to emit it since it's always zero. If - * we start supporting texturing from CMS surfaces, this will have - * to change - */ } else if (ir->op == ir_txd) { const glsl_type *type = lod_type; -- 1.8.4.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
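The write-mask arithmetic in emit_mcs_fetch() above is easy to check by hand. This is a minimal sketch, assuming the usual convention that bits 0..3 of a write mask select x, y, z and w; the function names coord_mask and zero_mask mirror the local variables in the patch but are otherwise hypothetical helpers.

```c
#include <assert.h>

/* Write mask covering the first n components (x, y, z, w are bits 0..3). */
static int coord_mask(int vector_elements)
{
    assert(vector_elements >= 1 && vector_elements <= 4);
    return (1 << vector_elements) - 1;
}

/* Remaining components of the vec4, which the patch fills with zero. */
static int zero_mask(int vector_elements)
{
    return 0xf & ~coord_mask(vector_elements);
}
```

For a two-component coordinate this gives coord_mask = 0x3 (xy) and zero_mask = 0xc (zw), so the two MOVs together write every channel of the message register exactly once.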
[Mesa-dev] [PATCH 1/6] i965: make compute_msaa_layout() nonstatic, add intel_ prefix
We're about to need this in the computation of the sampler key.

Signed-off-by: Chris Forbes
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 6 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 3 +++
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 3889803..c7db0ad 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -69,8 +69,8 @@ target_to_target(GLenum target)
  * Determine which MSAA layout should be used by the MSAA surface being
  * created, based on the chip generation and the surface type.
  */
-static enum intel_msaa_layout
-compute_msaa_layout(struct brw_context *brw, gl_format format, GLenum target)
+enum intel_msaa_layout
+intel_compute_msaa_layout(struct brw_context *brw, gl_format format, GLenum target)
 {
    /* Prior to Gen7, all MSAA surfaces used IMS layout. */
    if (brw->gen < 7)
@@ -283,7 +283,7 @@ intel_miptree_create_layout(struct brw_context *brw,
 
    if (num_samples > 1) {
       /* Adjust width/height/depth for MSAA */
-      mt->msaa_layout = compute_msaa_layout(brw, format, mt->target);
+      mt->msaa_layout = intel_compute_msaa_layout(brw, format, mt->target);
       if (mt->msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
          /* In the Sandy Bridge PRM, volume 4, part 1, page 31, it says:
           *
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 8777a8c..2e2f178 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -476,6 +476,9 @@ enum intel_miptree_tiling_mode {
    INTEL_MIPTREE_TILING_NONE,
 };
 
+enum intel_msaa_layout
+intel_compute_msaa_layout(struct brw_context *brw, gl_format format, GLenum target);
+
 bool
 intel_is_non_msrt_mcs_buffer_supported(struct brw_context *brw,
                                        struct intel_mipmap_tree *mt);
--
1.8.4.2
[Mesa-dev] [PATCH 0/6] i965: Enable CMS layout for multisample textures
This series enables the compressed multisample layout for multisample
textures. Previously we would only use CMS for renderbuffers, since our
texelFetch() implementation didn't understand it.

Known to break the sample-mask-execution -tex test, but not sure why yet --
it doesn't use texelFetch().
[Mesa-dev] [PATCH 6/6] i965/Gen7: Allow CMS layout for multisample textures
Now that all the pieces are in place, this should provide a nice
performance boost for apps using multisample textures.

Signed-off-by: Chris Forbes
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 18 +-----------------
 1 file changed, 1 insertion(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index c7db0ad..631c93c 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -99,23 +99,7 @@ intel_compute_msaa_layout(struct brw_context *brw, gl_format format, GLenum targ
          assert(brw->gen == 7);
          return INTEL_MSAA_LAYOUT_UMS;
       } else {
-         /* For now, if we're going to be texturing from this surface,
-          * force UMS, so that the shader doesn't have to do different things
-          * based on whether there's a multisample control surface needing sampled first.
-          * We can't just blindly read the MCS surface in all cases because:
-          *
-          * From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"):
-          *
-          *   If this field is disabled and the sampling engine message
-          *   is issued on this surface, the MCS surface may be accessed. Software
-          *   must ensure that the surface is defined to avoid GTT errors.
-          */
-         if (target == GL_TEXTURE_2D_MULTISAMPLE ||
-             target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY) {
-            return INTEL_MSAA_LAYOUT_UMS;
-         } else {
-            return INTEL_MSAA_LAYOUT_CMS;
-         }
+         return INTEL_MSAA_LAYOUT_CMS;
       }
    }
 }
--
1.8.4.2
[Mesa-dev] [PATCH 3/6] i965: Add shader opcode for sampling MCS surface
Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_defines.h | 1 + src/mesa/drivers/dri/i965/brw_fs.cpp | 1 + src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 5 + src/mesa/drivers/dri/i965/brw_shader.cpp | 3 +++ src/mesa/drivers/dri/i965/brw_vec4.cpp | 1 + src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 5 + 6 files changed, 16 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 597d3b2..e0dfe52 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -770,6 +770,7 @@ enum opcode { SHADER_OPCODE_TXS, FS_OPCODE_TXB, SHADER_OPCODE_TXF_MS, + SHADER_OPCODE_TXF_MCS, SHADER_OPCODE_LOD, SHADER_OPCODE_TG4, SHADER_OPCODE_TG4_OFFSET, diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index eecde62..354d3ed 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -766,6 +766,7 @@ fs_visitor::implied_mrf_writes(fs_inst *inst) case SHADER_OPCODE_TXD: case SHADER_OPCODE_TXF: case SHADER_OPCODE_TXF_MS: + case SHADER_OPCODE_TXF_MCS: case SHADER_OPCODE_TG4: case SHADER_OPCODE_TG4_OFFSET: case SHADER_OPCODE_TXL: diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 6626a8c..5a11eef 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -431,6 +431,10 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src else msg_type = GEN5_SAMPLER_MESSAGE_SAMPLE_LD; break; + case SHADER_OPCODE_TXF_MCS: + assert(brw->gen >= 7); + msg_type = GEN7_SAMPLER_MESSAGE_SAMPLE_LD_MCS; + break; case SHADER_OPCODE_LOD: msg_type = GEN5_SAMPLER_MESSAGE_LOD; break; @@ -1651,6 +1655,7 @@ fs_generator::generate_code(exec_list *instructions) case SHADER_OPCODE_TXD: case SHADER_OPCODE_TXF: case SHADER_OPCODE_TXF_MS: + case SHADER_OPCODE_TXF_MCS: case SHADER_OPCODE_TXL: case 
SHADER_OPCODE_TXS: case SHADER_OPCODE_LOD: diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index ddb4524..88aa169 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -449,6 +449,8 @@ brw_instruction_name(enum opcode op) return "txb"; case SHADER_OPCODE_TXF_MS: return "txf_ms"; + case SHADER_OPCODE_TXF_MCS: + return "txf_mcs"; case SHADER_OPCODE_TG4: return "tg4"; case SHADER_OPCODE_TG4_OFFSET: @@ -544,6 +546,7 @@ backend_instruction::is_tex() opcode == SHADER_OPCODE_TXD || opcode == SHADER_OPCODE_TXF || opcode == SHADER_OPCODE_TXF_MS || + opcode == SHADER_OPCODE_TXF_MCS || opcode == SHADER_OPCODE_TXL || opcode == SHADER_OPCODE_TXS || opcode == SHADER_OPCODE_LOD || diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 73f91a0..73ec811 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -273,6 +273,7 @@ vec4_visitor::implied_mrf_writes(vec4_instruction *inst) case SHADER_OPCODE_TXD: case SHADER_OPCODE_TXF: case SHADER_OPCODE_TXF_MS: + case SHADER_OPCODE_TXF_MCS: case SHADER_OPCODE_TXS: case SHADER_OPCODE_TG4: case SHADER_OPCODE_TG4_OFFSET: diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 30c2ca2..c1ef81d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -305,6 +305,10 @@ vec4_generator::generate_tex(vec4_instruction *inst, else msg_type = GEN5_SAMPLER_MESSAGE_SAMPLE_LD; break; + case SHADER_OPCODE_TXF_MCS: + assert(brw->gen >= 7); + msg_type = GEN7_SAMPLER_MESSAGE_SAMPLE_LD_MCS; + break; case SHADER_OPCODE_TXS: msg_type = GEN5_SAMPLER_MESSAGE_SAMPLE_RESINFO; break; @@ -1138,6 +1142,7 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction, case SHADER_OPCODE_TXD: case SHADER_OPCODE_TXF: case SHADER_OPCODE_TXF_MS: + case 
SHADER_OPCODE_TXF_MCS: case SHADER_OPCODE_TXL: case SHADER_OPCODE_TXS: case SHADER_OPCODE_TG4: -- 1.8.4.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/6] i965/fs: Sample from MCS surface when required
Signed-off-by: Chris Forbes --- src/mesa/drivers/dri/i965/brw_fs.h | 3 +- src/mesa/drivers/dri/i965/brw_fs_fp.cpp | 2 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 46 +++- 3 files changed, 41 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 7991b87..7a70b1f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -353,7 +353,8 @@ public: fs_reg sample_index); fs_inst *emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate, fs_reg shadow_comp, fs_reg lod, fs_reg lod2, - fs_reg sample_index); + fs_reg sample_index, fs_reg mcs); + fs_reg emit_mcs_fetch(ir_texture *ir, fs_reg coordinate, int sampler); fs_reg fix_math_operand(fs_reg src); fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0); fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0, fs_reg src1); diff --git a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp index 1ebaa4f..5aec757 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_fp.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_fp.cpp @@ -501,7 +501,7 @@ fs_visitor::emit_fragment_program_code() fs_inst *inst; if (brw->gen >= 7) { -inst = emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy, sample_index); +inst = emit_texture_gen7(ir, dst, coordinate, shadow_c, lod, dpdy, sample_index, fs_reg(0u)); } else if (brw->gen >= 5) { inst = emit_texture_gen5(ir, dst, coordinate, shadow_c, lod, dpdy, sample_index); } else { diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 9eb9a9d..8c3d19c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -1223,7 +1223,7 @@ fs_visitor::emit_texture_gen5(ir_texture *ir, fs_reg dst, fs_reg coordinate, fs_inst * fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate, fs_reg shadow_c, fs_reg lod, fs_reg lod2, - fs_reg sample_index) + 
fs_reg sample_index, fs_reg mcs) { int reg_width = dispatch_width / 8; bool header_present = false; @@ -1322,11 +1322,8 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg dst, fs_reg coordinate, emit(MOV(next.retype(BRW_REGISTER_TYPE_UD), sample_index)); next.reg_offset++; - /* constant zero MCS; we arrange to never actually have a compressed - * multisample surface here for now. TODO: issue ld_mcs to get this first, - * if we ever support texturing from compressed multisample surfaces - */ - emit(MOV(next.retype(BRW_REGISTER_TYPE_UD), fs_reg(0u))); + /* data from the multisample control surface */ + emit(MOV(next.retype(BRW_REGISTER_TYPE_UD), mcs)); next.reg_offset++; /* there is no offsetting for this message; just copy in the integer @@ -1517,6 +1514,34 @@ fs_visitor::rescale_texcoord(ir_texture *ir, fs_reg coordinate, return coordinate; } +/* Sample from the MCS surface attached to this multisample texture. */ +fs_reg +fs_visitor::emit_mcs_fetch(ir_texture *ir, fs_reg coordinate, int sampler) +{ + int reg_width = dispatch_width / 8; + fs_reg payload = fs_reg(this, glsl_type::float_type); + fs_reg dest = fs_reg(this, glsl_type::uvec4_type); + fs_reg next = payload; + + /* parameters are: u, v, r, lod; missing parameters are treated as zero */ + for (int i = 0; i < ir->coordinate->type->vector_elements; i++) { + emit(MOV(next.retype(BRW_REGISTER_TYPE_D), coordinate)); + coordinate.reg_offset++; + next.reg_offset++; + } + + fs_inst *inst = emit(SHADER_OPCODE_TXF_MCS, dest, payload); + inst->base_mrf = -1; + inst->mlen = next.reg_offset * reg_width; + inst->header_present = false; + inst->regs_written = 4 * reg_width; /* we only care about one reg of response, +* but the sampler always writes 4/8 +*/ + inst->sampler = sampler; + + return dest; +} + void fs_visitor::visit(ir_texture *ir) { @@ -1575,7 +1600,7 @@ fs_visitor::visit(ir_texture *ir) shadow_comparitor = this->result; } - fs_reg lod, lod2, sample_index; + fs_reg lod, lod2, sample_index, mcs; switch 
(ir->op) { case ir_tex: case ir_lod: @@ -1602,6 +1627,11 @@ fs_visitor::visit(ir_texture *ir) case ir_txf_ms: ir->lod_info.sample_index->accept(this); sample_index = this->result; + + if (brw->gen >= 7 && c->key.tex.compressed_multisample_layout_mask & (1= 7) { inst = emit_texture_gen7(ir, dst, coordinate, shadow_comparitor, -
[Mesa-dev] [PATCH 2/6] i965/Gen7: Include bitfield in the sampler key for CMS layout
We need to emit extra shader code in this case to sample the MCS surface
first; we can't just blindly do this all the time since IVB will sometimes
try to access the MCS surface even if disabled.

Signed-off-by: Chris Forbes
---
 src/mesa/drivers/dri/i965/brw_program.h | 5 +++++
 src/mesa/drivers/dri/i965/brw_wm.c      | 9 +++++++++
 2 files changed, 14 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_program.h b/src/mesa/drivers/dri/i965/brw_program.h
index 07be4a0..51182ea 100644
--- a/src/mesa/drivers/dri/i965/brw_program.h
+++ b/src/mesa/drivers/dri/i965/brw_program.h
@@ -45,6 +45,11 @@ struct brw_sampler_prog_key_data {
     * For RG32F, gather4's channel select is broken.
     */
    uint16_t gather_channel_quirk_mask;
+
+   /**
+    * Whether this sampler uses the compressed multisample surface layout.
+    */
+   uint16_t compressed_multisample_layout_mask;
 };
 
 #ifdef __cplusplus
diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c
index bc1480c..414481b 100644
--- a/src/mesa/drivers/dri/i965/brw_wm.c
+++ b/src/mesa/drivers/dri/i965/brw_wm.c
@@ -38,6 +38,7 @@
 #include "main/samplerobj.h"
 #include "program/prog_parameter.h"
 #include "program/program.h"
+#include "intel_mipmap_tree.h"
 
 #include "glsl/ralloc.h"
 
@@ -356,6 +357,14 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx,
             if (img->InternalFormat == GL_RG32F)
                key->gather_channel_quirk_mask |= 1 << s;
          }
+
+         /* If this is a multisample sampler, and uses the CMS MSAA layout, then
+          * we need to emit slightly different code to first sample the MCS surface.
+          */
+         if (brw->gen >= 7 && img->NumSamples &&
+             intel_compute_msaa_layout(brw, img->TexFormat, t->Target) == INTEL_MSAA_LAYOUT_CMS) {
+            key->compressed_multisample_layout_mask |= 1 << s;
+         }
       }
    }
 }
--
1.8.4.2
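The per-sampler bitfield added to the program key above follows a common pattern: set bit s while building the key, test bit s while compiling. This is a minimal sketch of that pattern only. The struct sampler_key and the helpers key_mark_cms and key_needs_mcs_fetch are hypothetical, and the 16-sampler limit is an assumption derived from the uint16_t field width.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SAMPLERS 16 /* assumption: one bit per sampler in a uint16_t */

/* Stand-in for the relevant part of the sampler program key. */
struct sampler_key {
    uint16_t compressed_multisample_layout_mask;
};

/* Key construction: record that sampler s reads a CMS-layout texture. */
static void key_mark_cms(struct sampler_key *key, unsigned s)
{
    assert(s < MAX_SAMPLERS);
    key->compressed_multisample_layout_mask |= (uint16_t)(1u << s);
}

/* Compile time: only samplers whose bit is set need the extra MCS fetch. */
static bool key_needs_mcs_fetch(const struct sampler_key *key, unsigned s)
{
    assert(s < MAX_SAMPLERS);
    return (key->compressed_multisample_layout_mask & (1u << s)) != 0;
}
```

Because the mask is part of the key, two draws that differ only in which samplers use CMS get different compiled programs, which is what lets the shader skip the MCS fetch for surfaces that have no MCS.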
[Mesa-dev] [PATCH 2/2] nv50: wait on the buf's fence before sticking it into pushbuf
This resolves some rendering issues in Source games. See
https://bugs.freedesktop.org/show_bug.cgi?id=64323

Signed-off-by: Ilia Mirkin
Cc: "9.2 10.0"
---

Doing a nouveau_bo_wait works as well, but I got a slightly higher
framerate from glretrace doing it this way. I tried to get an actual
Source game running, but was unsuccessful... (something about a missing
filesystem_steam.so which indeed was absent)

This is clearly not optimal, but neither is having broken Source games.
The other workaround simply went the other path which would have to do
a wait and a double-copy (vram -> pushbuf, pushbuf xfer from gart),
which is worse than just the wait.

 src/gallium/drivers/nouveau/nouveau_buffer.c | 3 +++
 src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 9 +
 2 files changed, 12 insertions(+)

diff --git a/src/gallium/drivers/nouveau/nouveau_buffer.c b/src/gallium/drivers/nouveau/nouveau_buffer.c
index 3e04049..95905a8 100644
--- a/src/gallium/drivers/nouveau/nouveau_buffer.c
+++ b/src/gallium/drivers/nouveau/nouveau_buffer.c
@@ -205,6 +205,9 @@ nouveau_transfer_write(struct nouveau_context *nv, struct nouveau_transfer *tx,
                   base, size / 4, (const uint32_t *)data);
    else
       nv->push_data(nv, buf->bo, buf->offset + base, buf->domain, size, data);
+
+   nouveau_fence_ref(nv->screen->fence.current, &buf->fence);
+   nouveau_fence_ref(nv->screen->fence.current, &buf->fence_wr);
 }
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
index c6162b5..947c67d 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
@@ -597,6 +597,15 @@ nv50_draw_elements(struct nv50_context *nv50, boolean shorten,
    assert(nouveau_resource_mapped_by_gpu(nv50->idxbuf.buffer));
+   /* This shouldn't have to be here. The going theory is that the buffer
+    * is being filled in by PGRAPH, and it's not done yet by the time it
+    * gets submitted to PFIFO, which in turn starts immediately prefetching
+    * the not-yet-written data. Ideally this wait would only happen on
+    * pushbuf submit, but it's probably not a big performance difference.
+    */
+   if (buf->fence_wr && !nouveau_fence_signalled(buf->fence_wr))
+      nouveau_fence_wait(buf->fence_wr);
+
    while (instance_count--) {
       BEGIN_NV04(push, NV50_3D(VERTEX_BEGIN_GL), 1);
       PUSH_DATA (push, prim);
--
1.8.3.2
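The guard added to nv50_draw_elements follows a common "stall only if a write is still pending" pattern. A minimal sketch of that pattern, assuming toy types rather than the real nouveau ones, could look like this. `sketch_fence`, `sketch_buffer`, `sketch_fence_signalled`, `sketch_fence_wait` and `wait_for_pending_write` are all hypothetical names, and the "wait" here just flips a flag instead of blocking on the GPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the nouveau structures. */
struct sketch_fence {
   bool signalled;   /* true once the pending write has completed */
   unsigned waits;   /* how many times a caller had to stall on it */
};

struct sketch_buffer {
   struct sketch_fence *fence_wr;   /* last pending write, or NULL */
};

static bool
sketch_fence_signalled(const struct sketch_fence *fence)
{
   return fence->signalled;
}

/* A real driver would block here until the GPU passes the fence;
 * the sketch only records that a wait happened. */
static void
sketch_fence_wait(struct sketch_fence *fence)
{
   assert(fence != NULL);
   fence->waits++;
   fence->signalled = true;
}

/* Same shape as the check in the patch: skip the wait when there is no
 * write fence or it has already signalled. Returns true if we stalled. */
static bool
wait_for_pending_write(struct sketch_buffer *buf)
{
   if (buf->fence_wr && !sketch_fence_signalled(buf->fence_wr)) {
      sketch_fence_wait(buf->fence_wr);
      return true;
   }
   return false;
}
```

The early-out matters for the framerate concern raised in the patch notes: buffers with no outstanding write pay only for a pointer check and a flag test.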
[Mesa-dev] [PATCH 1/2] nouveau: avoid leaking fences while waiting
This fixes a memory leak in some situations. Also avoids emitting an
extra fence if the kick handler does the call to nouveau_fence_next
itself.

Signed-off-by: Ilia Mirkin
Cc: "9.2 10.0"
---

TBH I'm pretty confused by the whole fence refcounting logic and its
interaction with emits, updates, etc. However valgrind was happy with
this. But it wasn't happy when I was doing nouveau_fence_wait from
nv50_draw_elements, saying that the fence allocated by nouveau_fence_new
was leaked.

(Note that the kick handler when doing vbo stuff does NOT do
nouveau_fence_next on its own... but adding that there still didn't fix
all my issues, nor is it likely desirable.)

 src/gallium/drivers/nouveau/nouveau_fence.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nouveau_fence.c b/src/gallium/drivers/nouveau/nouveau_fence.c
index dea146c..c686710 100644
--- a/src/gallium/drivers/nouveau/nouveau_fence.c
+++ b/src/gallium/drivers/nouveau/nouveau_fence.c
@@ -189,16 +189,15 @@ nouveau_fence_wait(struct nouveau_fence *fence)
    /* wtf, someone is waiting on a fence in flush_notify handler? */
    assert(fence->state != NOUVEAU_FENCE_STATE_EMITTING);
-   if (fence->state < NOUVEAU_FENCE_STATE_EMITTED) {
+   if (fence->state < NOUVEAU_FENCE_STATE_EMITTED)
       nouveau_fence_emit(fence);
-      if (fence == screen->fence.current)
-         nouveau_fence_new(screen, &screen->fence.current, FALSE);
-   }
-   if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED) {
+   if (fence->state < NOUVEAU_FENCE_STATE_FLUSHED)
       if (nouveau_pushbuf_kick(screen->pushbuf, screen->pushbuf->channel))
          return FALSE;
-   }
+
+   if (fence == screen->fence.current)
+      nouveau_fence_next(screen);
    do {
       nouveau_fence_update(screen, FALSE);
--
1.8.3.2
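One plausible reading of the leak valgrind reported is the generic refcounting hazard of overwriting a counted pointer without first dropping the reference it held. The sketch below demonstrates only that generic hazard with toy types; it is an assumption about the failure mode, not a description of what nouveau_fence_new or nouveau_fence_next actually do. All names (`sketch_fence`, `live_fences`, `replace_current_leaky`, `replace_current_safe`) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy refcounted object; not the nouveau fence. */
struct sketch_fence {
   int ref;
};

/* Number of fences allocated and not yet freed (for demonstration). */
static int live_fences;

static struct sketch_fence *
sketch_fence_new(void)
{
   struct sketch_fence *fence = calloc(1, sizeof(*fence));
   if (fence) {
      fence->ref = 1;
      live_fences++;
   }
   return fence;
}

/* Drop the reference held in *slot and clear the slot. */
static void
sketch_fence_unref(struct sketch_fence **slot)
{
   if (*slot) {
      assert((*slot)->ref > 0);
      if (--(*slot)->ref == 0) {
         free(*slot);
         live_fences--;
      }
   }
   *slot = NULL;
}

/* Hazard: the old fence's reference is never dropped, so it leaks. */
static void
replace_current_leaky(struct sketch_fence **current)
{
   *current = sketch_fence_new();
}

/* Safer shape: release the old reference before installing a new fence. */
static void
replace_current_safe(struct sketch_fence **current)
{
   sketch_fence_unref(current);
   *current = sketch_fence_new();
}
```

Routing every replacement of the "current" slot through one helper keeps the unref and the allocation together, which is the kind of centralisation the patch moves towards by calling a single function instead of allocating inline.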
Re: [Mesa-dev] [PATCH 2/4] glsl: Fix loop analysis of nested loops.
On 28 November 2013 11:45, Chris Forbes wrote:

> +      /* The assignmnet to the variable in the loop must be unconditional and
> +       * not inside a nested loop.
>         */
>
> s/assignmnet/assignment/
>

Fixed, thanks.