Re: [Mesa-dev] [PATCH 1/7] mesa: clean up #includes in ff_fragment_shader.cpp
There seems to be some 'extra includes', not sure if you want to start going through them at this same time? As example sampler.cpp does not need to include glsl/ir_visitor.h since it comes with glsl/ir.h. Anyway nice cleanup, series is Reviewed-by: Tapani Pälli On 10/01/2015 12:35 AM, Brian Paul wrote: Get rid of "../glsl/" paths. Sort alphabetically. --- src/mesa/main/ff_fragment_shader.cpp | 22 ++ 1 file changed, 10 insertions(+), 12 deletions(-) diff --git a/src/mesa/main/ff_fragment_shader.cpp b/src/mesa/main/ff_fragment_shader.cpp index c682892..c0030bc 100644 --- a/src/mesa/main/ff_fragment_shader.cpp +++ b/src/mesa/main/ff_fragment_shader.cpp @@ -27,29 +27,27 @@ * **/ -#include "glheader.h" -#include "imports.h" -#include "mtypes.h" +#include "main/glheader.h" #include "main/context.h" +#include "main/imports.h" #include "main/macros.h" #include "main/samplerobj.h" #include "main/texenvprogram.h" #include "main/texobj.h" #include "main/uniforms.h" +#include "glsl/ir_builder.h" +#include "glsl/ir_optimization.h" +#include "glsl/glsl_parser_extras.h" +#include "glsl/glsl_symbol_table.h" +#include "glsl/glsl_types.h" +#include "program/ir_to_mesa.h" #include "program/program.h" -#include "program/prog_parameter.h" +#include "program/programopt.h" #include "program/prog_cache.h" #include "program/prog_instruction.h" +#include "program/prog_parameter.h" #include "program/prog_print.h" #include "program/prog_statevars.h" -#include "program/programopt.h" -#include "../glsl/glsl_types.h" -#include "../glsl/ir.h" -#include "../glsl/ir_builder.h" -#include "../glsl/glsl_symbol_table.h" -#include "../glsl/glsl_parser_extras.h" -#include "../glsl/ir_optimization.h" -#include "../program/ir_to_mesa.h" using namespace ir_builder; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92183] linker.cpp:3187:46: error: ‘strtok_r’ was not declared in this scope
https://bugs.freedesktop.org/show_bug.cgi?id=92183 Vinson Lee changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED |--- --- Comment #5 from Vinson Lee --- mesa: ca2e16d26ec46e604c76ab72d3ec14a7a8643cc8 (master 11.1.0-devel) Build is still failing on older versions on MinGW. This error occurs with mingw-w64 2.0.1-1 on Ubuntu 12.04. Compiling src/glsl/linker.cpp ... src/glsl/linker.cpp: In function ‘bool included_in_packed_varying(ir_variable*, const char*)’: src/glsl/linker.cpp:3241:46: error: ‘strtok_s’ was not declared in this scope Mesa might decide instead to raise the minimum required version of mingw-w64. -- You are receiving this mail because: You are the QA Contact for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2
Hi, The series is: Reviewed-by: Mathias Fröhlich Thanks for taking more care of multithread behavior! Mathias On Wednesday, September 30, 2015 18:59:44 Tom Stellard wrote: > This fixes a race condition in the glx-multithreaded-shader-compile > test. > > v2: > - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets(). > > CC: "10.6 11.0" > --- > src/gallium/drivers/radeon/radeon_llvm_emit.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c > b/src/gallium/drivers/radeon/radeon_llvm_emit.c > index 5d93a4d..1a66a55 100644 > --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c > +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c > @@ -26,6 +26,7 @@ > #include "radeon_llvm_emit.h" > #include "radeon_elf_util.h" > #include "c11/threads.h" > +#include "gallivm/lp_bld_misc.h" > #include "util/u_memory.h" > #include "pipe/p_shader_tokens.h" > > @@ -87,6 +88,7 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type) > > static void init_r600_target() > { > + gallivm_init_llvm_targets(); > #if HAVE_LLVM < 0x0307 > LLVMInitializeR600TargetInfo(); > LLVMInitializeR600Target(); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] List of unsupported extensions per driver
2015-10-01 3:33 GMT+02:00 Ian Romanick : > The two lines that just have () are only supported on i965, nvc0, and > radeonsi. It's not really that clear. > Yep, this part I understood, once you understand it's hierarchical, everything becomes clearer ;) Also, a guy on GitHub helped me to understand some subtleties in the way the drivers were supporting the hardwares, it's starting here: https://github.com/MightyCreak/mesamatrix/issues/12#issuecomment-144483615 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: emit row_major matrix's SSBO stores only for components in writemask
When writing to a column of a row-major matrix, each component of the vector is stored to non-consecutive memory addresses, so we generate one instruction per component. This patch skips the disabled components in the writemask, saving some store instructions plus avoid storing wrong data on each disabled component. Signed-off-by: Samuel Iglesias Gonsalvez --- src/glsl/lower_ubo_reference.cpp | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/glsl/lower_ubo_reference.cpp b/src/glsl/lower_ubo_reference.cpp index e581306..247620e 100644 --- a/src/glsl/lower_ubo_reference.cpp +++ b/src/glsl/lower_ubo_reference.cpp @@ -754,6 +754,12 @@ lower_ubo_reference_visitor::emit_access(bool is_write, add(base_offset, new(mem_ctx) ir_constant(deref_offset + i * matrix_stride)); if (is_write) { +/* If the component is not in the writemask, then don't + * store any value. + */ +if (!((1 << i) & write_mask)) + continue; + base_ir->insert_after(ssbo_store(swizzle(deref, i, 1), chan_offset, 1)); } else { if (!this->is_shader_storage) { -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] dri/common: drop loading /etc/drirc
Hi, On 10/01/2015 06:07 AM, Michel Dänzer wrote: On 01.10.2015 04:35, Marek Olšák wrote: [...] For your issue, the only way is to remove loading ~/.drirc from Mesa, I think that would be the proverbial "throwing out the baby with the bathwater". :) making driconf useless. Do we really care about driconf? I don't think so. ~/.drirc isn't only for driconf, it's also for allowing users (who may not have write access to .../etc/drirc) to have their own persistent DRI configuration. One of the original problem was users not telling / knowing that they have bogus drirc values when providing data to Mesa developers. What if Mesa would store drirc settings it loads/uses & from which files it read them, provide API for requesting that info and "glxinfo" would include that info to its output? Then one needs only to ask for "glxinfo" output to see relevant Mesa version and settings it uses. This would allow you to see also bad settings from the global drirc. Same thing should be output also by Mesa debug options and Mesa could output a warning about potentially stale user drirc file, if loaded user drirc is older than global drirc file. - Eero ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: don't set DATA_FORMAT if ADD_TID_ENABLE is set on VI (v2)
On 01.10.2015 05:44, Michel Dänzer wrote: On 01.10.2015 04:11, Marek Olšák wrote: From: Marek Olšák This can cause incorrect address calculations and hangs. v2: do it properly Cc: mesa-sta...@lists.freedesktop.org Tested-and-Reviewed-by: Christian König --- src/gallium/drivers/radeonsi/si_descriptors.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index b07ab3b..b56219a 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -619,11 +619,18 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot, S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) | S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) | S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) | - S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32) | S_008F0C_ELEMENT_SIZE(element_size) | S_008F0C_INDEX_STRIDE(index_stride) | S_008F0C_ADD_TID_ENABLE(add_tid); + /* If ADD_TID_ENABLE is set on VI, DATA_FORMAT specifies +* STRIDE bits [14:17] +*/ + if (sctx->b.chip_class >= VI && add_tid) + desc[3] |= S_008F0C_DATA_FORMAT(stride >> 14); + else + desc[3] |= S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32); The beginning of the function has: /* The stride field in the resource descriptor has 14 bits */ assert(stride < (1 << 14)); So the if-branch is dead code in a non-release build. Would be nice if that could be reconciled somehow, but I'm fine with doing it in a follow-up change. Either way, this change is Reviewed-by: Michel Dänzer Yeah, agree. Might be nice if someone can come up with a test for this, but I don't think this is absolutely necessary. For now the patch is Reviewed-by: Christian König Regards, Christian. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] glx: Fix build errors with --enable-mangling (v2)
On 28 September 2015 at 18:59, Kyle Brenneman wrote: > Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up > the renames from glx_mangle.h. > > Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is > defined. > > v2: Add a comment clarifying why GLX_ALIAS needs two macros. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=2 > Signed-off-by: Kyle Brenneman > Cc: "10.6 11.0" > --- > src/glx/glxcmds.c | 4 > src/glx/glxextensions.h | 10 -- > 2 files changed, 12 insertions(+), 2 deletions(-) > > diff --git a/src/glx/glxcmds.c b/src/glx/glxcmds.c > index 26ff804..93e8db5 100644 > --- a/src/glx/glxcmds.c > +++ b/src/glx/glxcmds.c > @@ -2646,7 +2646,11 @@ _X_EXPORT void (*glXGetProcAddressARB(const GLubyte * > procName)) (void) > */ > _X_EXPORT void (*glXGetProcAddress(const GLubyte * procName)) (void) > #if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED) > +# if defined(USE_MGL_NAMESPACE) > + __attribute__ ((alias("mglXGetProcAddressARB"))); > +# else > __attribute__ ((alias("glXGetProcAddressARB"))); > +# endif > #else > { > return glXGetProcAddressARB(procName); > diff --git a/src/glx/glxextensions.h b/src/glx/glxextensions.h > index 90b173f..3a9bc82 100644 > --- a/src/glx/glxextensions.h > +++ b/src/glx/glxextensions.h > @@ -281,11 +281,17 @@ typedef void (*PFNGLXDISABLEEXTENSIONPROC) (const char > *name); > # define GLX_ALIAS_VOID(real_func, proto_args, args, aliased_func) > #else > # if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED) > -# define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \ > +/* GLX_ALIAS and GLX_ALIAS_VOID both expand to the macro GLX_ALIAS2. Using > the > + * extra expansion means that the name mangling macros in glx_mangle.h will > + * apply before stringification, so the alias attribute will have a string > like > + * "mglXFoo" instead of "glXFoo". */ Hmm things are happening in quite Interesting order. On the good side, we've been using the two layer macros in our dispatch (glapi) for a while so things should be fully fixed now. Reviewed-by: Emil Velikov Thank you for the fixes Kyle. I'll push the three patches tomorrow, barring any objections of course. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] New/final stable-branch 10.6 candidate pushed
Hello list, The candidate for the Mesa 10.6.9 is now available. Currently we have: - 16 queued - 18 nominated (outstanding) - and 2 rejected/obsolete patches The present queue covers KDE crashes with i965, regression in Unreal ournament with gallium drivers, fixes for Redway3D Flat (Demo) and Astromenace. Take a look at section "Mesa stable queue" for more information. Testing --- The following results are against piglit 246791c51ec. Changes - classic i965(snb) --- None. Changes - swrast classic None. Changes - gallium softpipe -- None. Changes - gallium llvmpipe (LLVM 3.6.2) --- None. Testing reports/general approval Any testing reports (or general approval of the state of the branch) will be greatly appreciated. Trivial merge conflicts --- None. The plan is to have 10.6.9 this Friday (2nd of October) or shortly after. If you have any questions or comments that you would like to share before the release, please go ahead. Cheers, Emil Mesa stable queue - Nominated (18) == Ben Widawsky (1): i965/skl: Use larger URB size where available. Boyan Ding (1): i915: Add XRGB format to intel_screen_make_configs Brian Paul (1): configure: don't try to build gallium DRI drivers if --disable-dri is set Emil Velikov (1): i965: store reference to the context within struct brw_fence Ian Romanick (1): meta: Handle array textures in scaled MSAA blits Kyle Brenneman (3): glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3) glx: Fix build errors with --enable-mangling (v2) mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix. Matthew Waters (1): egl: rework handling EGL_CONTEXT_FLAGS Rob Clark (1): xa: add xa_surface_from_handle2 Tapani Pälli (1): i965: fix textureGrad for cubemaps Tom Stellard (7): clover: Call clBuildProgram() notification function when build completes v2 gallium/drivers: Add threadsafe wrappers for pipe_context v2 clover: Use threadsafe wrappers for pipe_context v2 clover: Properly initialize LLVM targets when linking with component libs gallium/radeon: Use call_once() when initailizing LLVM targets gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets v2 radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2 Queued (16) === Brian Paul (1): st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats Chris Wilson (1): i965: Remove early release of DRI2 miptree Emil Velikov (3): docs: add sha256 checksums for 10.6.8 cherry-ignore: add commit non applicable for 10.6 cherry-ignore: add commit non applicable for 10.6 Iago Toral Quiroga (1): mesa: Fix GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for default framebuffer. Ian Romanick (5): t_dd_dmatmp: Make "count" actually be the count t_dd_dmatmp: Clean up improper code formatting from previous patch t_dd_dmatmp: Use '& 3' instead of '% 4' everywhere t_dd_dmatmp: Pull out common 'count -= count & 3' code t_dd_dmatmp: Use addition instead of subtraction in loop bounds Jeremy Huddleston (1): configure.ac: Add support to enable read-only text segment on x86. Kristian Høgsberg Kristensen (1): i965: Respect stride and subreg_offset for ATTR registers Leo Liu (1): radeon/vce: fix vui time_scale zero error Marek Olšák (1): st/mesa: fix front buffer regression after dropping st_validate_state in Blit Roland Scheidegger (1): mesa: fix mipmap generation for immutable, compressed textures Obsolete (2) Tom Stellard (2): gallivm: Allow drivers and state trackers to initialize gallivm LLVM targets radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 2/2] mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program
These arrays provide backends with separate index spaces for UBOS and SSBOs. --- src/glsl/linker.cpp | 35 +++ src/glsl/standalone_scaffolding.cpp | 9 + src/mesa/main/mtypes.h | 6 ++ 3 files changed, 50 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index e6eba94..3da773d 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -4107,6 +4107,41 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog) } } + /* Split prog->BufferInterfaceBlocks into prog->UniformBlocks and +* prog->ShaderStorageBlocks, so that drivers that need separate index +* spaces for each set can have that. +*/ + unsigned num_ubo_blocks; + unsigned num_ssbo_blocks; + num_ubo_blocks = 0; + num_ssbo_blocks = 0; + for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { + if (prog->BufferInterfaceBlocks[i].IsShaderStorage) + num_ssbo_blocks++; + else + num_ubo_blocks++; + } + + prog->UniformBlocks = + ralloc_array(mem_ctx, gl_uniform_block *, num_ubo_blocks); + prog->NumUniformBlocks = 0; + + prog->ShaderStorageBlocks = + ralloc_array(mem_ctx, gl_uniform_block *, num_ssbo_blocks); + prog->NumShaderStorageBlocks = 0; + + for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { + if (prog->BufferInterfaceBlocks[i].IsShaderStorage) + prog->ShaderStorageBlocks[prog->NumShaderStorageBlocks++] = +&prog->BufferInterfaceBlocks[i]; + else + prog->UniformBlocks[prog->NumUniformBlocks++] = +&prog->BufferInterfaceBlocks[i]; + } + + assert(prog->NumUniformBlocks + prog->NumShaderStorageBlocks == + prog->NumBufferInterfaceBlocks); + /* FINISHME: Assign fragment shader output locations. */ done: diff --git a/src/glsl/standalone_scaffolding.cpp b/src/glsl/standalone_scaffolding.cpp index 0c53589..658245f 100644 --- a/src/glsl/standalone_scaffolding.cpp +++ b/src/glsl/standalone_scaffolding.cpp @@ -102,6 +102,15 @@ _mesa_clear_shader_program_data(struct gl_shader_program *shProg) ralloc_free(shProg->BufferInterfaceBlocks); shProg->BufferInterfaceBlocks = NULL; shProg->NumBufferInterfaceBlocks = 0; + + ralloc_free(shProg->UniformBlocks); + shProg->UniformBlocks = NULL; + shProg->NumUniformBlocks = 0; + + ralloc_free(shProg->ShaderStorageBlocks); + shProg->ShaderStorageBlocks = NULL; + shProg->NumShaderStorageBlocks = 0; + for (i = 0; i < MESA_SHADER_STAGES; i++) { ralloc_free(shProg->UniformBlockStageIndex[i]); shProg->UniformBlockStageIndex[i] = NULL; diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 347da14..2362f54 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -2692,6 +2692,12 @@ struct gl_shader_program unsigned NumBufferInterfaceBlocks; struct gl_uniform_block *BufferInterfaceBlocks; + unsigned NumUniformBlocks; + struct gl_uniform_block **UniformBlocks; + + unsigned NumShaderStorageBlocks; + struct gl_uniform_block **ShaderStorageBlocks; + /** * Indices into the _LinkedShaders's UniformBlocks[] array for each stage * they're used in, or -1. -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 1/2] mesa: Rename {Num}UniformBlocks to {Num}BufferInterfaceBlocks
Currently, these arrays in gl_shader and gl_shader_program hold both UBOs and SSBOs, so this looks like a better name. We were already using NumBufferInterfaceBlocks in gl_shader_program, so this makes things more consistent as well. In a later patch we will add {Num}UniformBlocks and {Num}ShaderStorageBlocks which will contain only references to UBOs and SSBOs respectively that will provide backends with a separate index space for both types of objects. --- src/glsl/link_uniform_initializers.cpp | 4 +-- src/glsl/link_uniforms.cpp | 22 ++-- src/glsl/linker.cpp | 44 src/glsl/lower_ubo_reference.cpp | 8 ++--- src/glsl/standalone_scaffolding.cpp | 4 +-- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 +- src/mesa/drivers/dri/i965/brw_shader.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 +- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 10 +++--- src/mesa/main/mtypes.h | 6 ++-- src/mesa/main/shader_query.cpp | 4 +-- src/mesa/main/shaderapi.c| 4 +-- src/mesa/main/shaderobj.c| 4 +-- src/mesa/main/uniforms.c | 12 +++ src/mesa/state_tracker/st_atom_constbuf.c| 4 +-- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 4 +-- 16 files changed, 68 insertions(+), 68 deletions(-) diff --git a/src/glsl/link_uniform_initializers.cpp b/src/glsl/link_uniform_initializers.cpp index 3483082..ab7a150 100644 --- a/src/glsl/link_uniform_initializers.cpp +++ b/src/glsl/link_uniform_initializers.cpp @@ -49,7 +49,7 @@ get_uniform_block_index(const gl_shader_program *shProg, const char *uniformBlockName) { for (unsigned i = 0; i < shProg->NumBufferInterfaceBlocks; i++) { - if (!strcmp(shProg->UniformBlocks[i].Name, uniformBlockName)) + if (!strcmp(shProg->BufferInterfaceBlocks[i].Name, uniformBlockName)) return i; } @@ -169,7 +169,7 @@ set_block_binding(gl_shader_program *prog, const char *block_name, int binding) if (stage_index != -1) { struct gl_shader *sh = prog->_LinkedShaders[i]; -sh->UniformBlocks[stage_index].Binding = binding; +sh->BufferInterfaceBlocks[stage_index].Binding = binding; } } } diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp index 47d49c8..53b9b45 100644 --- a/src/glsl/link_uniforms.cpp +++ b/src/glsl/link_uniforms.cpp @@ -502,9 +502,9 @@ public: for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { if (strncmp(var->get_interface_type()->name, - prog->UniformBlocks[i].Name, + prog->BufferInterfaceBlocks[i].Name, l) == 0 - && prog->UniformBlocks[i].Name[l] == '[') { + && prog->BufferInterfaceBlocks[i].Name[l] == '[') { ubo_block_index = i; break; } @@ -512,7 +512,7 @@ public: } else { for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { if (strcmp(var->get_interface_type()->name, - prog->UniformBlocks[i].Name) == 0) { + prog->BufferInterfaceBlocks[i].Name) == 0) { ubo_block_index = i; break; } @@ -530,7 +530,7 @@ public: ubo_byte_offset = 0; } else { const struct gl_uniform_block *const block = - &prog->UniformBlocks[ubo_block_index]; + &prog->BufferInterfaceBlocks[ubo_block_index]; assert(var->data.location != -1); @@ -975,10 +975,10 @@ link_update_uniform_buffer_variables(struct gl_shader *shader) } const unsigned l = strlen(var->name); - for (unsigned i = 0; i < shader->NumUniformBlocks; i++) { -for (unsigned j = 0; j < shader->UniformBlocks[i].NumUniforms; j++) { + for (unsigned i = 0; i < shader->NumBufferInterfaceBlocks; i++) { +for (unsigned j = 0; j < shader->BufferInterfaceBlocks[i].NumUniforms; j++) { if (sentinel) { - const char *begin = shader->UniformBlocks[i].Uniforms[j].Name; + const char *begin = shader->BufferInterfaceBlocks[i].Uniforms[j].Name; const char *end = strchr(begin, sentinel); if (end == NULL) @@ -993,7 +993,7 @@ link_update_uniform_buffer_variables(struct gl_shader *shader) break; } } else if (!strcmp(var->name, - shader->UniformBlocks[i].Uniforms[j].Name)) { + shader->BufferInterfaceBlocks[i].Uniforms[j].Name)) { found = true; var->data.loc
[Mesa-dev] [RFC 0/2] Provide separate index spaces for UBOs and SSBOs
The compiler has a lot of code to implement UBOs that uses UniformBlocks intensively in multiples places. When we implemented SSBOs, which need to go pretty much through the same process, we had two options: 1) Rework all this code (and other parts of the frontend that use this array) to accomodate a second array of objects and have all the pieces access/index into one or the other (or both) depending on the kind of object we are manipulating or what the code is trying to achieve. 2) Simply merge SSBOs into UniformBlocks. This lets the compiler do all the things it needs to do for SSBOs without significant changes to the code. This is what we implemented, since it it was less invasive and it would not increase the complexity of code that is already complex enough as it is. It should also make maintenance easy, since as far the compiler is involved, we don't really have to care if we are manipulating SSBOs or UBOs, so fixes or changes to the code will apply naturally to both, which is probably what we want most of the time. However, based on recent discussions with Ilia and Curro this creates an inconvenience for drivers that expect different index spaces for both kinds of objects, which is probably the majority since in the end the GL API defines both as separate entities. As it is now, drivers that want separate index spaces for UBOs and SSBOs require to remap the indices, which can be done, but is not nice. This 2-patch series implements a solution for this that tries to keep the compiler frontend untouched while providing backends with separate index spaces. The idea is that we can have the compiler work with a single list of blocks and do all the linking process like that. No changes here. Then, after linking, we can create two separate arrays for UBOs and SSBOs. Drivers that need a separate index space only need to reference these arrays instead of the one that mixes both kinds. I think this gives everyone what they need without major changes so hopefully it is a good way to go for now, even if we still want to do the bigger change in the future (not sure that we want to do that though). Would this be sufficient? Patch1: Renames {Num}UniformBlocks to {Num}BufferInterfaceBlocks. This is more consistent with the current implementation. Patch2: Adds separate UniformBlocks and ShaderStorageBlocks arrays. Iago Toral Quiroga (2): mesa: Rename {Num}UniformBlocks to {Num}BufferInterfaceBlocks mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program src/glsl/link_uniform_initializers.cpp | 4 +- src/glsl/link_uniforms.cpp | 22 +++ src/glsl/linker.cpp | 79 +--- src/glsl/lower_ubo_reference.cpp | 8 +-- src/glsl/standalone_scaffolding.cpp | 11 +++- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 +- src/mesa/drivers/dri/i965/brw_shader.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 2 +- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 10 +-- src/mesa/main/mtypes.h | 12 +++- src/mesa/main/shader_query.cpp | 4 +- src/mesa/main/shaderapi.c| 4 +- src/mesa/main/shaderobj.c| 4 +- src/mesa/main/uniforms.c | 12 ++-- src/mesa/state_tracker/st_atom_constbuf.c| 4 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 4 +- 16 files changed, 117 insertions(+), 67 deletions(-) -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: emit row_major matrix's SSBO stores only for components in writemask
On Thu, 2015-10-01 at 09:41 +0200, Samuel Iglesias Gonsalvez wrote: > When writing to a column of a row-major matrix, each component of the > vector is stored to non-consecutive memory addresses, so we generate > one instruction per component. > > This patch skips the disabled components in the writemask, saving some > store instructions plus avoid storing wrong data on each disabled > component. Reviewed-by: Iago Toral Quiroga > Signed-off-by: Samuel Iglesias Gonsalvez > --- > src/glsl/lower_ubo_reference.cpp | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/src/glsl/lower_ubo_reference.cpp > b/src/glsl/lower_ubo_reference.cpp > index e581306..247620e 100644 > --- a/src/glsl/lower_ubo_reference.cpp > +++ b/src/glsl/lower_ubo_reference.cpp > @@ -754,6 +754,12 @@ lower_ubo_reference_visitor::emit_access(bool is_write, > add(base_offset, > new(mem_ctx) ir_constant(deref_offset + i * matrix_stride)); > if (is_write) { > +/* If the component is not in the writemask, then don't > + * store any value. > + */ > +if (!((1 << i) & write_mask)) > + continue; > + > base_ir->insert_after(ssbo_store(swizzle(deref, i, 1), > chan_offset, 1)); > } else { > if (!this->is_shader_storage) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: error out if non-constant indexing of SSBO arrays with GLSL ES
Fixes Piglit compiler test: shader-storage-block-array-dynamic-indexing.frag Signed-off-by: Tapani Pälli --- src/glsl/ast_array_index.cpp | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp index dfb3107..5e8f49d 100644 --- a/src/glsl/ast_array_index.cpp +++ b/src/glsl/ast_array_index.cpp @@ -231,15 +231,17 @@ _mesa_ast_array_index_to_hir(void *mem_ctx, _mesa_glsl_error(&loc, state, "unsized array index must be constant"); } } else if (array->type->fields.array->is_interface() - && array->variable_referenced()->data.mode == ir_var_uniform + && (array->variable_referenced()->data.mode == ir_var_uniform || + array->variable_referenced()->data.mode == ir_var_shader_storage) && !state->is_version(400, 0) && !state->ARB_gpu_shader5_enable) { -/* Page 46 in section 4.3.7 of the OpenGL ES 3.00 spec says: +/* Page 50 in section 4.3.9 of the OpenGL ES 3.10 spec says: * - * "All indexes used to index a uniform block array must be - * constant integral expressions." + * "All indices used to index a uniform or shader storage block + * array must be constant integral expressions." */ -_mesa_glsl_error(&loc, state, - "uniform block array index must be constant"); +_mesa_glsl_error(&loc, state, "%s block array index must be constant", + array->variable_referenced()->data.mode + == ir_var_uniform ? "uniform" : "shader storage"); } else { /* whole_variable_referenced can return NULL if the array is a * member of a structure. In this case it is safe to not update -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: error out if non-constant indexing of SSBO arrays with GLSL ES
Reviewed-by: Samuel Iglesias Gonsálvez On 01/10/15 13:54, Tapani Pälli wrote: > Fixes Piglit compiler test: > shader-storage-block-array-dynamic-indexing.frag > > Signed-off-by: Tapani Pälli > --- > src/glsl/ast_array_index.cpp | 14 -- > 1 file changed, 8 insertions(+), 6 deletions(-) > > diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp > index dfb3107..5e8f49d 100644 > --- a/src/glsl/ast_array_index.cpp > +++ b/src/glsl/ast_array_index.cpp > @@ -231,15 +231,17 @@ _mesa_ast_array_index_to_hir(void *mem_ctx, > _mesa_glsl_error(&loc, state, "unsized array index must be > constant"); > } >} else if (array->type->fields.array->is_interface() > - && array->variable_referenced()->data.mode == ir_var_uniform > + && (array->variable_referenced()->data.mode == > ir_var_uniform || > + array->variable_referenced()->data.mode == > ir_var_shader_storage) > && !state->is_version(400, 0) && > !state->ARB_gpu_shader5_enable) { > - /* Page 46 in section 4.3.7 of the OpenGL ES 3.00 spec says: > + /* Page 50 in section 4.3.9 of the OpenGL ES 3.10 spec says: > * > - * "All indexes used to index a uniform block array must be > - * constant integral expressions." > + * "All indices used to index a uniform or shader storage block > + * array must be constant integral expressions." > */ > - _mesa_glsl_error(&loc, state, > - "uniform block array index must be constant"); > + _mesa_glsl_error(&loc, state, "%s block array index must be constant", > + array->variable_referenced()->data.mode > + == ir_var_uniform ? "uniform" : "shader storage"); >} else { >/* whole_variable_referenced can return NULL if the array is a > * member of a structure. In this case it is safe to not update > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span
On Thu, 2015-10-01 at 08:28 +0300, Tapani Pälli wrote: > Patch adds missing type (used with NV_read_depth) so that it gets > handled correctly. Also add type to _mesa_problem output to aid > debugging. > > Signed-off-by: Tapani Pälli > --- > src/mesa/main/pack.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/main/pack.c b/src/mesa/main/pack.c > index 7147fd6..54a0c42 100644 > --- a/src/mesa/main/pack.c > +++ b/src/mesa/main/pack.c > @@ -1074,6 +1074,7 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint > n, GLvoid *dest, > } >} >break; > + case GL_UNSIGNED_INT_24_8: Is it okay to store 32-bit integers in this case? that's what the code below does. The spec says that the 8 stencil bits are undefined, but don't we need to convert the depth value to a 24-bit integer scale? (i.e. make 1.0 translate to 2^24-1 not 2^32-1). Iago > case GL_UNSIGNED_INT: >{ > GLuint *dst = (GLuint *) dest; > @@ -1124,7 +1125,8 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint > n, GLvoid *dest, >} >break; > default: > - _mesa_problem(ctx, "bad type in _mesa_pack_depth_span"); > + _mesa_problem(ctx, "bad type in _mesa_pack_depth_span (%s)", > +_mesa_enum_to_string(dstType)); > } > > free(depthCopy); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: Add abs input modifier to base for POW in ffvertex_prog
The result of POW for a negative base is undefined. Even when the result is multiplied by zero (which is the case here whenever the base is negative), the Inf and NaNs can propagate past that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342 Signed-off-by: Daniel Scharrer Cc: "10.6 11.0" --- src/mesa/main/ffvertex_prog.c | 17 ++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c index 95b428d..a6183b4 100644 --- a/src/mesa/main/ffvertex_prog.c +++ b/src/mesa/main/ffvertex_prog.c @@ -293,9 +293,10 @@ struct ureg { GLuint file:4; GLint idx:9; /* relative addressing may be negative */ /* sizeof(idx) should == sizeof(prog_src_reg::Index) */ + GLuint abs:1; GLuint negate:1; GLuint swz:12; - GLuint pad:6; + GLuint pad:5; }; @@ -324,6 +325,7 @@ static const struct ureg undef = { 0, 0, 0, + 0, 0 }; @@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx) struct ureg reg; reg.file = file; reg.idx = idx; + reg.abs = 0; reg.negate = 0; reg.swz = SWIZZLE_NOOP; reg.pad = 0; @@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx) +static struct ureg absolute( struct ureg reg ) +{ + reg.abs = 1; + reg.negate = 0; + return reg; +} + + static struct ureg negate( struct ureg reg ) { reg.negate ^= 1; @@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src, src->File = reg.file; src->Index = reg.idx; src->Swizzle = reg.swz; + src->Abs = reg.abs; src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE; - src->Abs = 0; src->RelAddr = 0; /* Check that bitfield sizes aren't exceeded */ assert(src->Index == reg.idx); @@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct tnl_program *p, emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm); emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot); - emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W)); + emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), swizzle1(attenuation, W)); emit_op2(p, OPCODE_MUL, att, 0, slt, spot); release_temp(p, spot); -- 2.6.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] mesa: clean up #includes in ff_fragment_shader.cpp
On 10/01/2015 01:03 AM, Tapani Pälli wrote: There seems to be some 'extra includes', not sure if you want to start going through them at this same time? Right, I'm not too concerned with that now. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API
On Wed, Sep 30, 2015 at 7:37 PM, Marek Olšák wrote: > On Wed, Sep 30, 2015 at 10:43 PM, Rob Clark wrote: >> From: Rob Clark >> >> Not actually working yet, ie. doesn't even compile yet, but an idea. >> >> Initial motivation was for drm_gralloc/pipe, which is essentially a sort >> of mini state-tracker, that needs to be able to share pipe_screen with >> libGL linked into the same process (to ensure we didn't end up with >> duplicate pipe_resource's, etc). I think same situation happens with >> vdpau+gl interop. >> >> Currently drm_gralloc/pipe (and other state trackers??) statically link >> the winsys code, which completely defeats the purpose magic in the >> winsys code to detect when the same device is opened multiple times and >> return the same pipe_screen (since you end up w/ multiple copies of the >> hashtable in the same process). See for example: >> >> 5bb41d9094b3c9bdf0669fd55418981ed83347e3 >> fee0686c21c631d96d6042741267a3c218c23ffc >> >> Rough idea is that we should have something like libgallium.so which >> contains the pipe-loader API, and then optionally (depending on >> GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers. The various >> different state trackers would link against (or dlopen()) libgallium.so >> and use the pipe-loader API to get themselves a pipe_screen (and config >> params, etc). And then you end up with: >> >> +---+ >> | l | >> clover --> | i | >> | b | >> mesa-st --> | g | >> | a |--> pipe driver >> vdapau --> | l | >> | l | >> gralloc --> | i | >> | u | >> xa --> | m | >> | | >> +---+ >> or: >> +---+-+ >> | l | | >> clover --> | i | | >> | b | | >> mesa-st --> | g | | >> | a | pipe driver | >> vdapau --> | l | | >> | l | | >> gralloc --> | i | | >> | u | | >> xa --> | m | | >> | | | >> +---+-+ >> >> depending on GALLIUM_STATIC_TARGETS. Either way, all the state trackers >> in the same process share a single copy of the hashtable in the winsys >> code which allows them to share the same pipe_screen. >> >> I think that ends up being an extra level of library indirection vs >> current state w/ pipe drivers, ie. with mesa dri loader stuff directly >> loading gallium_dri.so which contains all the drivers plus mesa state >> tracker. If this was concerning to someone, what I'd suggest would be >> to, for all the state trackers that already have some sort of loader >> sitting between them and the user, just pull them directly into the >> mega-mega libgallium.so, ie. something like: >> >> +-+---+-+ >> | | | | >> | clover | l | | >> | | i | | >> | mesa-st | b | | >> | | g | pipe driver | >> | vdapau | a | | >> | | l | | >> +-| l | | >> | i | | >> gralloc --> | u | | >> | m | | >> xa --> | | | >> | | | >> +---+-+ >> >> Anyways, I'm far from an expert in the build architecture of things so >> I might have some facts wrong or be a bit confused. And I'll probably >> regret bringing the subject up. But somehow or another it seems like >> it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS >> ifdeffery spread throughout all the different state trackers, and (b) >> have the pipe driver[*] only exist once per process rather than once per >> state-tracker. Especially for android where each process using GL will >> have both gralloc and mesa-st.. and perhaps even clover/omx/etc. > > Radeon and amdgpu already have one pipe_screen per process per fd. It > works like this: > > - The GL, VDPAU, OMX etc. libs each ship its own copy of the whole > driver, so we have multiple screens and multiple winsyses. > > - All libs export "radeon_drm_winsys_create", which is this function: > struct radeon_winsys *radeon_drm_winsys_create( >int fd, radeon_screen_create_t screen_create); > > - The first lib that is loaded registers "radeon_drm_winsys_create", > all other libs loaded later can't override it. Therefore, if any lib > calls radeon_drm_winsys_create, the version from the first loaded lib > is called. > > - The first caller passes the screen_create function to > radeon_drm_winsys_create, the screen is created only once for each > input fd in that function along with the winsys. When the winsys is > returned, radeon_winsys::screen is the returned screen. > > - Any oth
Re: [Mesa-dev] [PATCH 18/18] t_dd_dmatmp: Use 'X &= ~N' instead of 'X -= X & N'
On 09/30/2015 02:58 PM, Ian Romanick wrote: From: Ian Romanick Signed-off-by: Ian Romanick Suggested-by: Brian Paul --- src/mesa/tnl_dd/t_dd_dmatmp.h | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h index 4eff883..3a30142 100644 --- a/src/mesa/tnl_dd/t_dd_dmatmp.h +++ b/src/mesa/tnl_dd/t_dd_dmatmp.h @@ -96,7 +96,7 @@ static void TAG(render_lines_verts)(struct gl_context *ctx, /* Emit whole number of lines in total and in each buffer: */ - count -= count & 1; + count &= ~1; unsigned currentsz = GET_CURRENT_VB_MAX_VERTS() & ~1; if (currentsz < 8) @@ -229,7 +229,7 @@ static void TAG(render_tri_strip_verts)(struct gl_context *ctx, /* From here on emit even numbers of tris when wrapping over buffers: */ - currentsz -= (currentsz & 1); + currentsz &= ~1; unsigned nr; for (unsigned j = 0; j + 2 < count; j += nr - 2) { @@ -322,7 +322,7 @@ static void TAG(render_quad_strip_verts)(struct gl_context *ctx, /* Emit whole number of quads in total, and in each buffer. */ - count -= count & 1; + count &= ~1; unsigned currentsz = GET_CURRENT_VB_MAX_VERTS() & ~1; if (currentsz < 8) @@ -352,7 +352,7 @@ static void TAG(render_quads_verts)(struct gl_context *ctx, LOCAL_VARS; /* Emit whole number of quads in total. */ - count -= count & 3; + count &= ~3; /* Hardware doesn't have a quad primitive type -- try to simulate it using * triangle primitive. This is a win for gears, but is it useful in the Reviewed-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API
Hi Rob, I believe that Marek managed to nicely explain how things work. Ideally you'll be able to get on the same boat (i.e. the Android C runtime/linker won't give you the finger). Regardless some comments inline. On 30 September 2015 at 21:43, Rob Clark wrote: > From: Rob Clark > > Not actually working yet, ie. doesn't even compile yet, but an idea. > > Initial motivation was for drm_gralloc/pipe, which is essentially a sort > of mini state-tracker, that needs to be able to share pipe_screen with > libGL linked into the same process (to ensure we didn't end up with > duplicate pipe_resource's, etc). I think same situation happens with > vdpau+gl interop. > > Currently drm_gralloc/pipe (and other state trackers??) statically link > the winsys code, which completely defeats the purpose magic in the > winsys code to detect when the same device is opened multiple times and > return the same pipe_screen (since you end up w/ multiple copies of the > hashtable in the same process). See for example: > > 5bb41d9094b3c9bdf0669fd55418981ed83347e3 > fee0686c21c631d96d6042741267a3c218c23ffc > > Rough idea is that we should have something like libgallium.so which > contains the pipe-loader API, and then optionally (depending on > GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers. The various > different state trackers would link against (or dlopen()) libgallium.so > and use the pipe-loader API to get themselves a pipe_screen (and config > params, etc). And then you end up with: > > +---+ > | l | > clover --> | i | > | b | > mesa-st --> | g | > | a |--> pipe driver > vdapau --> | l | > | l | > gralloc --> | i | > | u | > xa --> | m | > | | > +---+ > or: > +---+-+ > | l | | > clover --> | i | | > | b | | > mesa-st --> | g | | > | a | pipe driver | > vdapau --> | l | | > | l | | > gralloc --> | i | | > | u | | > xa --> | m | | > | | | > +---+-+ > > depending on GALLIUM_STATIC_TARGETS. Either way, all the state trackers > in the same process share a single copy of the hashtable in the winsys > code which allows them to share the same pipe_screen. > As others I'm not too excited about adding new library/ies. Amongst others, It boils down to people mix'n'matching things, causing all sorts of problems. > I think that ends up being an extra level of library indirection vs > current state w/ pipe drivers, ie. with mesa dri loader stuff directly > loading gallium_dri.so which contains all the drivers plus mesa state > tracker. If this was concerning to someone, what I'd suggest would be > to, for all the state trackers that already have some sort of loader > sitting between them and the user, just pull them directly into the > mega-mega libgallium.so, ie. something like: > > +-+---+-+ > | | | | > | clover | l | | > | | i | | > | mesa-st | b | | > | | g | pipe driver | > | vdapau | a | | > | | l | | > +-| l | | > | i | | > gralloc --> | u | | > | m | | > xa --> | | | > | | | > +---+-+ > This won't work due to funny dependencies. I.e. you don't want to force xcb and friends onto the dri module, clover etc, just because vdpau needs it. > Anyways, I'm far from an expert in the build architecture of things so > I might have some facts wrong or be a bit confused. And I'll probably > regret bringing the subject up. But somehow or another it seems like > it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS > ifdeffery spread throughout all the different state trackers, Sure that's always great, yet it is somewhat orthogonal to the original issue ;-) > and (b) > have the pipe driver[*] only exist once per process rather than once per > state-tracker. Especially for android where each process using GL will > have both gralloc and mesa-st.. and perhaps even clover/omx/etc. > Speaking of Android, have you tried the dynamic-list approach ? Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API
On Thu, Oct 1, 2015 at 3:53 PM, Rob Clark wrote: > On Wed, Sep 30, 2015 at 7:37 PM, Marek Olšák wrote: >> On Wed, Sep 30, 2015 at 10:43 PM, Rob Clark wrote: >>> From: Rob Clark >>> >>> Not actually working yet, ie. doesn't even compile yet, but an idea. >>> >>> Initial motivation was for drm_gralloc/pipe, which is essentially a sort >>> of mini state-tracker, that needs to be able to share pipe_screen with >>> libGL linked into the same process (to ensure we didn't end up with >>> duplicate pipe_resource's, etc). I think same situation happens with >>> vdpau+gl interop. >>> >>> Currently drm_gralloc/pipe (and other state trackers??) statically link >>> the winsys code, which completely defeats the purpose magic in the >>> winsys code to detect when the same device is opened multiple times and >>> return the same pipe_screen (since you end up w/ multiple copies of the >>> hashtable in the same process). See for example: >>> >>> 5bb41d9094b3c9bdf0669fd55418981ed83347e3 >>> fee0686c21c631d96d6042741267a3c218c23ffc >>> >>> Rough idea is that we should have something like libgallium.so which >>> contains the pipe-loader API, and then optionally (depending on >>> GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers. The various >>> different state trackers would link against (or dlopen()) libgallium.so >>> and use the pipe-loader API to get themselves a pipe_screen (and config >>> params, etc). And then you end up with: >>> >>> +---+ >>> | l | >>> clover --> | i | >>> | b | >>> mesa-st --> | g | >>> | a |--> pipe driver >>> vdapau --> | l | >>> | l | >>> gralloc --> | i | >>> | u | >>> xa --> | m | >>> | | >>> +---+ >>> or: >>> +---+-+ >>> | l | | >>> clover --> | i | | >>> | b | | >>> mesa-st --> | g | | >>> | a | pipe driver | >>> vdapau --> | l | | >>> | l | | >>> gralloc --> | i | | >>> | u | | >>> xa --> | m | | >>> | | | >>> +---+-+ >>> >>> depending on GALLIUM_STATIC_TARGETS. Either way, all the state trackers >>> in the same process share a single copy of the hashtable in the winsys >>> code which allows them to share the same pipe_screen. >>> >>> I think that ends up being an extra level of library indirection vs >>> current state w/ pipe drivers, ie. with mesa dri loader stuff directly >>> loading gallium_dri.so which contains all the drivers plus mesa state >>> tracker. If this was concerning to someone, what I'd suggest would be >>> to, for all the state trackers that already have some sort of loader >>> sitting between them and the user, just pull them directly into the >>> mega-mega libgallium.so, ie. something like: >>> >>> +-+---+-+ >>> | | | | >>> | clover | l | | >>> | | i | | >>> | mesa-st | b | | >>> | | g | pipe driver | >>> | vdapau | a | | >>> | | l | | >>> +-| l | | >>> | i | | >>> gralloc --> | u | | >>> | m | | >>> xa --> | | | >>> | | | >>> +---+-+ >>> >>> Anyways, I'm far from an expert in the build architecture of things so >>> I might have some facts wrong or be a bit confused. And I'll probably >>> regret bringing the subject up. But somehow or another it seems like >>> it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS >>> ifdeffery spread throughout all the different state trackers, and (b) >>> have the pipe driver[*] only exist once per process rather than once per >>> state-tracker. Especially for android where each process using GL will >>> have both gralloc and mesa-st.. and perhaps even clover/omx/etc. >> >> Radeon and amdgpu already have one pipe_screen per process per fd. It >> works like this: >> >> - The GL, VDPAU, OMX etc. libs each ship its own copy of the whole >> driver, so we have multiple screens and multiple winsyses. >> >> - All libs export "radeon_drm_winsys_create", which is this function: >> struct radeon_winsys *radeon_drm_winsys_create( >>int fd, radeon_screen_create_t screen_create); >> >> - The first lib that is loaded registers "radeon_drm_winsys_create", >> all other libs loaded later can't override it. Therefore, if any lib >> calls radeon_drm_winsys_create, the version from the first loaded lib >> is called. >> >> - The first caller passes the screen_create function to >> radeon_drm_winsys_create, the screen is created
Re: [Mesa-dev] [PATCH 3/3] radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2
On 30 September 2015 at 19:59, Tom Stellard wrote: > This fixes a race condition in the glx-multithreaded-shader-compile > test. > > v2: > - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets(). > Thanks for sticking with my suggestion Tom. Fwiw the series is: Reviewed-by: Emil Velikov Can you please confirm if we need the threadsafe wrappers, with this series in place ? If we do, humble ping :) -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/11] i965: Pull stage_prog_data.nr_params out of the NIR shader
On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote: > Previously, we had a bunch of code in each stage to figure out how many > slots we needed in stage_prog_data.param. This code was mostly identical > across the stages and had been copied and pasted around. Unfortunately, > this meant that any time you did something special, you had to add code for > it to each of these places. In particular, none of the stages took > subroutines into account; they were working entirely by accident. By > taking this data from the NIR shader, we know the exact number of entries > we need and everything goes a bit smoother. > --- > src/mesa/drivers/dri/i965/brw_cs.c | 4 ++-- > src/mesa/drivers/dri/i965/brw_gs.c | 5 ++--- > src/mesa/drivers/dri/i965/brw_vs.c | 16 > src/mesa/drivers/dri/i965/brw_wm.c | 10 +++--- > 4 files changed, 11 insertions(+), 24 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_cs.c > b/src/mesa/drivers/dri/i965/brw_cs.c > index 02eeeda..24120fb 100644 > --- a/src/mesa/drivers/dri/i965/brw_cs.c > +++ b/src/mesa/drivers/dri/i965/brw_cs.c > @@ -30,6 +30,7 @@ > #include "intel_mipmap_tree.h" > #include "brw_state.h" > #include "intel_batchbuffer.h" > +#include "glsl/nir/nir.h" > > static bool > brw_codegen_cs_prog(struct brw_context *brw, > @@ -55,8 +56,7 @@ brw_codegen_cs_prog(struct brw_context *brw, > * prog_data associated with the compiled program, and which will be freed > * by the state cache. > */ > - int param_count = cs->base.num_uniform_components + > - cs->base.NumImages * BRW_IMAGE_PARAM_SIZE; > + int param_count = cp->program.Base.nir->num_uniforms; > > /* The backend also sometimes adds params for texture size. */ > param_count += 2 * > ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits; > diff --git a/src/mesa/drivers/dri/i965/brw_gs.c > b/src/mesa/drivers/dri/i965/brw_gs.c > index 61e7b2a..0cf7ec8 100644 > --- a/src/mesa/drivers/dri/i965/brw_gs.c > +++ b/src/mesa/drivers/dri/i965/brw_gs.c > @@ -32,6 +32,7 @@ > #include "brw_vec4_gs_visitor.h" > #include "brw_state.h" > #include "brw_ff_gs.h" > +#include "glsl/nir/nir.h" > > > bool > @@ -60,9 +61,7 @@ brw_codegen_gs_prog(struct brw_context *brw, > * every uniform is a float which gets padded to the size of a vec4. > */ > struct gl_shader *gs = prog->_LinkedShaders[MESA_SHADER_GEOMETRY]; > - int param_count = gs->num_uniform_components * 4; > - > - param_count += gs->NumImages * BRW_IMAGE_PARAM_SIZE; > + int param_count = gp->program.Base.nir->num_uniforms * 4; I think the vec4 nir backend does not handle image uniforms at the moment, does it? At least I see that the FS backend has code specifically for that in fs_visitor::nir_setup_uniform. Not sure if we support images in geometry stages though, but the code you remove seems to account for that... > c.prog_data.base.base.param = >rzalloc_array(NULL, const gl_constant_value *, param_count); > diff --git a/src/mesa/drivers/dri/i965/brw_vs.c > b/src/mesa/drivers/dri/i965/brw_vs.c > index e1a0d9c..391411c 100644 > --- a/src/mesa/drivers/dri/i965/brw_vs.c > +++ b/src/mesa/drivers/dri/i965/brw_vs.c > @@ -109,18 +109,10 @@ brw_codegen_vs_prog(struct brw_context *brw, > * prog_data associated with the compiled program, and which will be freed > * by the state cache. > */ > - int param_count; > - if (vs) { > - /* We add padding around uniform values below vec4 size, with the worst > - * case being a float value that gets blown up to a vec4, so be > - * conservative here. > - */ > - param_count = vs->base.num_uniform_components * 4 + > -vs->base.NumImages * BRW_IMAGE_PARAM_SIZE; > - stage_prog_data->nr_image_params = vs->base.NumImages; > - } else { > - param_count = vp->program.Base.Parameters->NumParameters * 4; > - } > + int param_count = vp->program.Base.nir->num_uniforms; > + if (!brw->intelScreen->compiler->scalar_vs) > + param_count *= 4; Same thing here. Also, I guess this also means that for the scalar_vs cases we were computing a larger param_count than we really should, right? > /* vec4_visitor::setup_uniform_clipplane_values() also uploads user clip > * planes as uniforms. > */ > diff --git a/src/mesa/drivers/dri/i965/brw_wm.c > b/src/mesa/drivers/dri/i965/brw_wm.c > index cc97d6a..08f2416 100644 > --- a/src/mesa/drivers/dri/i965/brw_wm.c > +++ b/src/mesa/drivers/dri/i965/brw_wm.c > @@ -35,6 +35,7 @@ > #include "program/prog_parameter.h" > #include "program/program.h" > #include "intel_mipmap_tree.h" > +#include "glsl/nir/nir.h" > > #include "util/ralloc.h" > > @@ -173,14 +174,9 @@ brw_codegen_wm_prog(struct brw_context *brw, > * prog_data associated with the compiled program, and which will be freed > * by the state cache. > */ > - int param_count; > - if (fs) { > - param_count = fs->base.num
Re: [Mesa-dev] [PATCH 01/11] glsl/types: Make subroutine types have a single matrix column
Reviewed-by: Iago Toral Quiroga On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote: > That way, if we do the usual thing of multiplying vector_elements by > matrix_columns we get the actual number of components in the type as per > component_slots(). > > While we're at it, we also switch to using the actual C++ field > initializers for vector_elements and matrix_columns. > --- > src/glsl/glsl_types.cpp | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp > index 8586b2e..25927f6 100644 > --- a/src/glsl/glsl_types.cpp > +++ b/src/glsl/glsl_types.cpp > @@ -172,7 +172,7 @@ glsl_type::glsl_type(const char *subroutine_name) : > base_type(GLSL_TYPE_SUBROUTINE), > sampler_dimensionality(0), sampler_shadow(0), sampler_array(0), > sampler_type(0), interface_packing(0), > - vector_elements(0), matrix_columns(0), > + vector_elements(1), matrix_columns(1), > length(0) > { > mtx_lock(&glsl_type::mutex); > @@ -180,7 +180,6 @@ glsl_type::glsl_type(const char *subroutine_name) : > init_ralloc_type_ctx(); > assert(subroutine_name != NULL); > this->name = ralloc_strdup(this->mem_ctx, subroutine_name); > - this->vector_elements = 1; > mtx_unlock(&glsl_type::mutex); > } > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/11] i965/vs: Move lazy NIR creation to codegen_vs_prog
Reviewed-by: Iago Toral Quiroga On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote: > The next commit will add code to codegen_vs_prog that requires the NIR > shader to be there in all cases. It doesn't hurt anything to just move it > from brw_vs_emit to its only caller. > --- > src/mesa/drivers/dri/i965/brw_vec4.cpp | 12 > src/mesa/drivers/dri/i965/brw_vs.c | 13 + > 2 files changed, 13 insertions(+), 12 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > index 056ce39..407698f 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > @@ -1953,18 +1953,6 @@ brw_vs_emit(struct brw_context *brw, > if (unlikely(INTEL_DEBUG & DEBUG_VS)) >brw_dump_ir("vertex", prog, &shader->base, &vp->Base); > > - if (!vp->Base.nir) { > - /* Normally we generate NIR in LinkShader() or > - * ProgramStringNotify(), but Mesa's fixed-function vertex program > - * handling doesn't notify the driver at all. Just do it here, at > - * the last minute, even though it's lame. > - */ > - assert(vp->Base.Id == 0 && prog == NULL); > - vp->Base.nir = > - brw_create_nir(brw, NULL, &vp->Base, MESA_SHADER_VERTEX, > -brw->intelScreen->compiler->scalar_vs); > - } > - > if (brw->intelScreen->compiler->scalar_vs) { >prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8; > > diff --git a/src/mesa/drivers/dri/i965/brw_vs.c > b/src/mesa/drivers/dri/i965/brw_vs.c > index 0a348a5..e1a0d9c 100644 > --- a/src/mesa/drivers/dri/i965/brw_vs.c > +++ b/src/mesa/drivers/dri/i965/brw_vs.c > @@ -37,6 +37,7 @@ > #include "brw_state.h" > #include "program/prog_print.h" > #include "program/prog_parameter.h" > +#include "brw_nir.h" > > #include "util/ralloc.h" > > @@ -81,6 +82,18 @@ brw_codegen_vs_prog(struct brw_context *brw, > bool start_busy = false; > double start_time = 0; > > + if (!vp->program.Base.nir) { > + /* Normally we generate NIR in LinkShader() or > + * ProgramStringNotify(), but Mesa's fixed-function vertex program > + * handling doesn't notify the driver at all. Just do it here, at > + * the last minute, even though it's lame. > + */ > + assert(vp->program.Base.Id == 0 && prog == NULL); > + vp->program.Base.nir = > + brw_create_nir(brw, NULL, &vp->program.Base, MESA_SHADER_VERTEX, > +brw->intelScreen->compiler->scalar_vs); > + } > + > if (prog) >vs = (struct brw_shader *) prog->_LinkedShaders[MESA_SHADER_VERTEX]; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/11] i965: Pull stage_prog_data.nr_params out of the NIR shader
On Thu, Oct 1, 2015 at 7:52 AM, Iago Toral wrote: > On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote: >> Previously, we had a bunch of code in each stage to figure out how many >> slots we needed in stage_prog_data.param. This code was mostly identical >> across the stages and had been copied and pasted around. Unfortunately, >> this meant that any time you did something special, you had to add code for >> it to each of these places. In particular, none of the stages took >> subroutines into account; they were working entirely by accident. By >> taking this data from the NIR shader, we know the exact number of entries >> we need and everything goes a bit smoother. >> --- >> src/mesa/drivers/dri/i965/brw_cs.c | 4 ++-- >> src/mesa/drivers/dri/i965/brw_gs.c | 5 ++--- >> src/mesa/drivers/dri/i965/brw_vs.c | 16 >> src/mesa/drivers/dri/i965/brw_wm.c | 10 +++--- >> 4 files changed, 11 insertions(+), 24 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_cs.c >> b/src/mesa/drivers/dri/i965/brw_cs.c >> index 02eeeda..24120fb 100644 >> --- a/src/mesa/drivers/dri/i965/brw_cs.c >> +++ b/src/mesa/drivers/dri/i965/brw_cs.c >> @@ -30,6 +30,7 @@ >> #include "intel_mipmap_tree.h" >> #include "brw_state.h" >> #include "intel_batchbuffer.h" >> +#include "glsl/nir/nir.h" >> >> static bool >> brw_codegen_cs_prog(struct brw_context *brw, >> @@ -55,8 +56,7 @@ brw_codegen_cs_prog(struct brw_context *brw, >> * prog_data associated with the compiled program, and which will be >> freed >> * by the state cache. >> */ >> - int param_count = cs->base.num_uniform_components + >> - cs->base.NumImages * BRW_IMAGE_PARAM_SIZE; >> + int param_count = cp->program.Base.nir->num_uniforms; >> >> /* The backend also sometimes adds params for texture size. */ >> param_count += 2 * >> ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits; >> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c >> b/src/mesa/drivers/dri/i965/brw_gs.c >> index 61e7b2a..0cf7ec8 100644 >> --- a/src/mesa/drivers/dri/i965/brw_gs.c >> +++ b/src/mesa/drivers/dri/i965/brw_gs.c >> @@ -32,6 +32,7 @@ >> #include "brw_vec4_gs_visitor.h" >> #include "brw_state.h" >> #include "brw_ff_gs.h" >> +#include "glsl/nir/nir.h" >> >> >> bool >> @@ -60,9 +61,7 @@ brw_codegen_gs_prog(struct brw_context *brw, >> * every uniform is a float which gets padded to the size of a vec4. >> */ >> struct gl_shader *gs = prog->_LinkedShaders[MESA_SHADER_GEOMETRY]; >> - int param_count = gs->num_uniform_components * 4; >> - >> - param_count += gs->NumImages * BRW_IMAGE_PARAM_SIZE; >> + int param_count = gp->program.Base.nir->num_uniforms * 4; > > I think the vec4 nir backend does not handle image uniforms at the > moment, does it? At least I see that the FS backend has code > specifically for that in fs_visitor::nir_setup_uniform. Not sure if we > support images in geometry stages though, but the code you remove seems > to account for that... We don't. When Curro initially implemented it, he did have vec4 support. However, that was dropped in favor of doing things differently in the fs compiler. It may yet come back though. This code was probably merged a bit speculatively but doesn't really hurt anything. >> c.prog_data.base.base.param = >>rzalloc_array(NULL, const gl_constant_value *, param_count); >> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c >> b/src/mesa/drivers/dri/i965/brw_vs.c >> index e1a0d9c..391411c 100644 >> --- a/src/mesa/drivers/dri/i965/brw_vs.c >> +++ b/src/mesa/drivers/dri/i965/brw_vs.c >> @@ -109,18 +109,10 @@ brw_codegen_vs_prog(struct brw_context *brw, >> * prog_data associated with the compiled program, and which will be >> freed >> * by the state cache. >> */ >> - int param_count; >> - if (vs) { >> - /* We add padding around uniform values below vec4 size, with the >> worst >> - * case being a float value that gets blown up to a vec4, so be >> - * conservative here. >> - */ >> - param_count = vs->base.num_uniform_components * 4 + >> -vs->base.NumImages * BRW_IMAGE_PARAM_SIZE; >> - stage_prog_data->nr_image_params = vs->base.NumImages; >> - } else { >> - param_count = vp->program.Base.Parameters->NumParameters * 4; >> - } >> + int param_count = vp->program.Base.nir->num_uniforms; >> + if (!brw->intelScreen->compiler->scalar_vs) >> + param_count *= 4; > > Same thing here. > > Also, I guess this also means that for the scalar_vs cases we were > computing a larger param_count than we really should, right? They were, but it had nothing to do with images. It only had to do with the fact that we were multiplying num_uniform_components by 4 which means that if your uniforms were all a bunch of floats, it would be 4x the size needed. >> /* vec4_visitor::setup_uniform_clipplane_values() also uploads user clip >> * planes as uniforms. >>
Re: [Mesa-dev] [PATCH 03/11] i965: Pull stage_prog_data.nr_params out of the NIR shader
On Thu, 2015-10-01 at 07:58 -0700, Jason Ekstrand wrote: > On Thu, Oct 1, 2015 at 7:52 AM, Iago Toral wrote: > > On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote: > >> Previously, we had a bunch of code in each stage to figure out how many > >> slots we needed in stage_prog_data.param. This code was mostly identical > >> across the stages and had been copied and pasted around. Unfortunately, > >> this meant that any time you did something special, you had to add code for > >> it to each of these places. In particular, none of the stages took > >> subroutines into account; they were working entirely by accident. By > >> taking this data from the NIR shader, we know the exact number of entries > >> we need and everything goes a bit smoother. > >> --- > >> src/mesa/drivers/dri/i965/brw_cs.c | 4 ++-- > >> src/mesa/drivers/dri/i965/brw_gs.c | 5 ++--- > >> src/mesa/drivers/dri/i965/brw_vs.c | 16 > >> src/mesa/drivers/dri/i965/brw_wm.c | 10 +++--- > >> 4 files changed, 11 insertions(+), 24 deletions(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/brw_cs.c > >> b/src/mesa/drivers/dri/i965/brw_cs.c > >> index 02eeeda..24120fb 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_cs.c > >> +++ b/src/mesa/drivers/dri/i965/brw_cs.c > >> @@ -30,6 +30,7 @@ > >> #include "intel_mipmap_tree.h" > >> #include "brw_state.h" > >> #include "intel_batchbuffer.h" > >> +#include "glsl/nir/nir.h" > >> > >> static bool > >> brw_codegen_cs_prog(struct brw_context *brw, > >> @@ -55,8 +56,7 @@ brw_codegen_cs_prog(struct brw_context *brw, > >> * prog_data associated with the compiled program, and which will be > >> freed > >> * by the state cache. > >> */ > >> - int param_count = cs->base.num_uniform_components + > >> - cs->base.NumImages * BRW_IMAGE_PARAM_SIZE; > >> + int param_count = cp->program.Base.nir->num_uniforms; > >> > >> /* The backend also sometimes adds params for texture size. */ > >> param_count += 2 * > >> ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits; > >> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c > >> b/src/mesa/drivers/dri/i965/brw_gs.c > >> index 61e7b2a..0cf7ec8 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_gs.c > >> +++ b/src/mesa/drivers/dri/i965/brw_gs.c > >> @@ -32,6 +32,7 @@ > >> #include "brw_vec4_gs_visitor.h" > >> #include "brw_state.h" > >> #include "brw_ff_gs.h" > >> +#include "glsl/nir/nir.h" > >> > >> > >> bool > >> @@ -60,9 +61,7 @@ brw_codegen_gs_prog(struct brw_context *brw, > >> * every uniform is a float which gets padded to the size of a vec4. > >> */ > >> struct gl_shader *gs = prog->_LinkedShaders[MESA_SHADER_GEOMETRY]; > >> - int param_count = gs->num_uniform_components * 4; > >> - > >> - param_count += gs->NumImages * BRW_IMAGE_PARAM_SIZE; > >> + int param_count = gp->program.Base.nir->num_uniforms * 4; > > > > I think the vec4 nir backend does not handle image uniforms at the > > moment, does it? At least I see that the FS backend has code > > specifically for that in fs_visitor::nir_setup_uniform. Not sure if we > > support images in geometry stages though, but the code you remove seems > > to account for that... > > We don't. When Curro initially implemented it, he did have vec4 > support. However, that was dropped in favor of doing things > differently in the fs compiler. It may yet come back though. This > code was probably merged a bit speculatively but doesn't really hurt > anything. Yeah, that's what I recall too, I was surprised to find image stuff here and thought that maybe I missed something. > >> c.prog_data.base.base.param = > >>rzalloc_array(NULL, const gl_constant_value *, param_count); > >> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c > >> b/src/mesa/drivers/dri/i965/brw_vs.c > >> index e1a0d9c..391411c 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_vs.c > >> +++ b/src/mesa/drivers/dri/i965/brw_vs.c > >> @@ -109,18 +109,10 @@ brw_codegen_vs_prog(struct brw_context *brw, > >> * prog_data associated with the compiled program, and which will be > >> freed > >> * by the state cache. > >> */ > >> - int param_count; > >> - if (vs) { > >> - /* We add padding around uniform values below vec4 size, with the > >> worst > >> - * case being a float value that gets blown up to a vec4, so be > >> - * conservative here. > >> - */ > >> - param_count = vs->base.num_uniform_components * 4 + > >> -vs->base.NumImages * BRW_IMAGE_PARAM_SIZE; > >> - stage_prog_data->nr_image_params = vs->base.NumImages; > >> - } else { > >> - param_count = vp->program.Base.Parameters->NumParameters * 4; > >> - } > >> + int param_count = vp->program.Base.nir->num_uniforms; > >> + if (!brw->intelScreen->compiler->scalar_vs) > >> + param_count *= 4; > > > > Same thing here. > > > > Also, I guess this also means that for the scalar_vs cases we were > > computing a la
Re: [Mesa-dev] [PATCH v2 4/7] nir: add an instruction set API
On Wed, Sep 30, 2015 at 12:35 PM, Jason Ekstrand wrote: > On Wed, Sep 30, 2015 at 8:11 AM, Connor Abbott wrote: >> This will replace direct usage of nir_instrs_equal() in the CSE pass, >> which reduces an O(n^2) algorithm with an effectively O(n) one. It'll >> also be useful for implementing GVN on top of GCM. >> >> v2: >> - Add texture support. >> - Add more comments. >> - Rename instr_can_hash() to instr_can_rewrite() since it's really more >> about whether its uses can be rewritten, and it's implicitly used by >> nir_instrs_equal() as well. >> - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason). >> - Make the HASH() macro less magical (Topi). >> - Rewrite the commit message. >> >> Signed-off-by: Connor Abbott >> --- >> src/glsl/nir/nir_instr_set.c | 311 >> +++ >> src/glsl/nir/nir_instr_set.h | 35 + >> 2 files changed, 346 insertions(+) >> >> diff --git a/src/glsl/nir/nir_instr_set.c b/src/glsl/nir/nir_instr_set.c >> index 72ab048..a6f2226 100644 >> --- a/src/glsl/nir/nir_instr_set.c >> +++ b/src/glsl/nir/nir_instr_set.c >> @@ -23,6 +23,178 @@ >> >> #include "nir_instr_set.h" >> >> +#define HASH(hash, data) _mesa_fnv32_1a_accumulate((hash), (data)) >> + >> +static uint32_t >> +hash_src(uint32_t hash, const nir_src *src) >> +{ >> + assert(src->is_ssa); >> + hash = HASH(hash, src->ssa); >> + return hash; >> +} >> + >> +static uint32_t >> +hash_alu_src(uint32_t hash, const nir_alu_src *src, unsigned num_components) >> +{ >> + hash = HASH(hash, src->abs); >> + hash = HASH(hash, src->negate); >> + >> + for (unsigned i = 0; i < num_components; i++) >> + hash = HASH(hash, src->swizzle[i]); >> + >> + hash = hash_src(hash, &src->src); >> + return hash; >> +} >> + >> +static uint32_t >> +hash_alu(uint32_t hash, const nir_alu_instr *instr) >> +{ >> + hash = HASH(hash, instr->op); >> + hash = HASH(hash, instr->dest.dest.ssa.num_components); > > Why are you hasing the dest.ssa.num_components instead of the > writemask? They should be the same in any case, but this seems a bit > out-of-place. No particular reason. I can change it if you want. > >> + >> + if (nir_op_infos[instr->op].algebraic_properties & >> NIR_OP_IS_COMMUTATIVE) { >> + assert(nir_op_infos[instr->op].num_inputs == 2); >> + uint32_t hash0 = hash_alu_src(hash, &instr->src[0], >> +nir_ssa_alu_instr_src_components(instr, >> 0)); >> + uint32_t hash1 = hash_alu_src(hash, &instr->src[1], >> +nir_ssa_alu_instr_src_components(instr, >> 1)); >> + /* For commutative operations, we need some commutative way of >> + * combining the hashes. One option would be to XOR them but that >> + * means that anything with two identical sources will hash to 0 and >> + * that's common enough we probably don't want the guaranteed >> + * collision. Either addition or multiplication will also work. >> + */ >> + hash = hash0 * hash1; >> + } else { >> + for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) { >> + hash = hash_alu_src(hash, &instr->src[i], >> + nir_ssa_alu_instr_src_components(instr, i)); >> + } >> + } >> + >> + return hash; >> +} >> + >> +static uint32_t >> +hash_load_const(uint32_t hash, const nir_load_const_instr *instr) >> +{ >> + hash = HASH(hash, instr->def.num_components); >> + >> + hash = _mesa_fnv32_1a_accumulate_block(hash, instr->value.f, >> + instr->def.num_components >> + * sizeof(instr->value.f[0])); >> + >> + return hash; >> +} >> + >> +static int >> +cmp_phi_src(const void *data1, const void *data2) >> +{ >> + const nir_phi_src *src1 = data1, *src2 = data2; >> + return src1->pred->index - src2->pred->index; >> +} >> + >> +static uint32_t >> +hash_phi(uint32_t hash, const nir_phi_instr *instr) >> +{ >> + hash = HASH(hash, instr->instr.block); >> + >> + /* sort sources by predecessor index, since the order shouldn't matter */ >> + unsigned num_preds = instr->instr.block->predecessors->entries; >> + nir_phi_src *srcs = malloc(num_preds * sizeof(nir_phi_src)); > > You could use NIR_VLA and put it on the stack. That'd probably be more > efficient. Also, why are you copying the entire source and not just > making a list of pointers? Good point. It does prevent a pointer dereference when comparing, but that's probably less expensive than copying the entire thing when swapping. > >> + unsigned i = 0; >> + nir_foreach_phi_src(instr, src) { >> + srcs[i++] = *src; >> + } >> + qsort(srcs, num_preds, sizeof(nir_phi_src), cmp_phi_src); > > Sorting seems like as good a way as any. Another option would be to > do like we do for commutative ALU ops and multiply. Might be more > efficient but it probably doesn't matter too much. Perhaps, although sorting se
[Mesa-dev] [PATCH 1/2] i965/mt: Declare some functions as static
intel_tiling_supports_non_msrt_mcs() and intel_miptree_is_fast_clear_capable() are not used outside of intel_mipmap_tree.c. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++-- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 6 +- 2 files changed, 3 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index ffc356c..05dc291 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -160,7 +160,7 @@ intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree *mt, } } -bool +static bool intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling) { /* From the Ivy Bridge PRM, Vol2 Part1 11.7 "MCS Buffer for Render @@ -193,7 +193,7 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling) * - MCS buffer for non-MSRT is supported only for RT formats 32bpp, * 64bpp, and 128bpp. */ -bool +static bool intel_miptree_is_fast_clear_capable(struct brw_context *brw, struct intel_mipmap_tree *mt) { diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h index 486e5c6..805cd71 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h @@ -658,11 +658,7 @@ struct intel_mipmap_tree void intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree *mt, unsigned *width_px, unsigned *height); -bool -intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling); -bool -intel_miptree_is_fast_clear_capable(struct brw_context *brw, -struct intel_mipmap_tree *mt); + bool intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw, struct intel_mipmap_tree *mt); -- 2.5.0.342.g44e0223 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965: Fix intel_miptree_is_fast_clear_capable()
There are three types of fast clears: a. fast depth clears b. fast singlesample color clears c. fast multisample color clears Function intel_miptree_is_fast_clear_capable() checks if a miptree supports fast clears of type (b). Rename the function to disambiguate what it does: old: intel_miptree_is_fast_clear_capable new: intel_miptree_supports_non_msrt_fast_clear The functionally *accidentally* rejected multisampled color surfaces because it thought they were singlesample array surfaces. Fix that by explicitly rejecting surfaces with samples > 1. This fix would have been needed before we enabled layered fast singlesample color clears (introduced in gen8), which we want to do eventually. For now, though, this patch changes no behavior; it just fixes how the driver chooses its behavior. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 24 +++- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 05dc291..a169c41 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -194,8 +194,8 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling) * 64bpp, and 128bpp. */ static bool -intel_miptree_is_fast_clear_capable(struct brw_context *brw, -struct intel_mipmap_tree *mt) +intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw, + struct intel_mipmap_tree *mt) { /* MCS support does not exist prior to Gen7 */ if (brw->gen < 7) @@ -204,6 +204,10 @@ intel_miptree_is_fast_clear_capable(struct brw_context *brw, if (mt->disable_aux_buffers) return false; + /* This function applies only to non-multisampled render targets. */ + if (mt->num_samples > 1) + return false; + /* MCS is only supported for color buffers */ switch (_mesa_get_format_base_format(mt->format)) { case GL_DEPTH_COMPONENT: @@ -222,7 +226,16 @@ intel_miptree_is_fast_clear_capable(struct brw_context *brw, return false; } + + /* Check for layered surfaces. */ if (mt->physical_depth0 != 1) { + /* Multisample surfaces with the CMS layout are not layered surfaces, +* yet still have physical_depth0 > 1. Assert that we don't +* accidentally reject a multisampled surface here. We should have +* rejected it earlier by explicitly checking the sample count. +*/ + assert(mt->num_samples <= 1); + if (brw->gen >= 8) { perf_debug("Layered fast clear - giving up. (%dx%d%d)\n", mt->logical_width0, mt->logical_height0, @@ -494,7 +507,7 @@ intel_miptree_create_layout(struct brw_context *brw, * 7 | ? |? * 6 | ? |? */ - if (intel_miptree_is_fast_clear_capable(brw, mt)) { + if (intel_miptree_supports_non_msrt_fast_clear(brw, mt)) { if (brw->gen >= 9 || (brw->gen == 8 && num_samples <= 1)) layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16; } else if (brw->gen >= 9 && num_samples > 1) { @@ -692,7 +705,7 @@ intel_miptree_create(struct brw_context *brw, * clear actually occurs. */ if (intel_tiling_supports_non_msrt_mcs(brw, mt->tiling) && - intel_miptree_is_fast_clear_capable(brw, mt)) { + intel_miptree_supports_non_msrt_fast_clear(brw, mt)) { mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED; assert(brw->gen < 8 || mt->halign == 16 || num_samples <= 1); } @@ -800,8 +813,9 @@ intel_update_winsys_renderbuffer_miptree(struct brw_context *intel, * clear actually occurs. */ if (intel_tiling_supports_non_msrt_mcs(intel, singlesample_mt->tiling) && - intel_miptree_is_fast_clear_capable(intel, singlesample_mt)) + intel_miptree_supports_non_msrt_fast_clear(intel, singlesample_mt)) { singlesample_mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED; + } if (num_samples == 0) { intel_miptree_release(&irb->mt); -- 2.5.0.342.g44e0223 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/2] i965: A few miptree fixes
Chad Versace (2): i965/mt: Declare some functions as static i965: Fix intel_miptree_is_fast_clear_capable() src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28 --- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 6 +- 2 files changed, 22 insertions(+), 12 deletions(-) -- 2.5.0.342.g44e0223 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 4/7] nir: add an instruction set API
On Thu, Oct 1, 2015 at 8:14 AM, Connor Abbott wrote: > On Wed, Sep 30, 2015 at 12:35 PM, Jason Ekstrand wrote: >> On Wed, Sep 30, 2015 at 8:11 AM, Connor Abbott wrote: >>> This will replace direct usage of nir_instrs_equal() in the CSE pass, >>> which reduces an O(n^2) algorithm with an effectively O(n) one. It'll >>> also be useful for implementing GVN on top of GCM. >>> >>> v2: >>> - Add texture support. >>> - Add more comments. >>> - Rename instr_can_hash() to instr_can_rewrite() since it's really more >>> about whether its uses can be rewritten, and it's implicitly used by >>> nir_instrs_equal() as well. >>> - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason). >>> - Make the HASH() macro less magical (Topi). >>> - Rewrite the commit message. >>> >>> Signed-off-by: Connor Abbott >>> --- >>> src/glsl/nir/nir_instr_set.c | 311 >>> +++ >>> src/glsl/nir/nir_instr_set.h | 35 + >>> 2 files changed, 346 insertions(+) >>> >>> diff --git a/src/glsl/nir/nir_instr_set.c b/src/glsl/nir/nir_instr_set.c >>> index 72ab048..a6f2226 100644 >>> --- a/src/glsl/nir/nir_instr_set.c >>> +++ b/src/glsl/nir/nir_instr_set.c >>> @@ -23,6 +23,178 @@ >>> >>> #include "nir_instr_set.h" >>> >>> +#define HASH(hash, data) _mesa_fnv32_1a_accumulate((hash), (data)) >>> + >>> +static uint32_t >>> +hash_src(uint32_t hash, const nir_src *src) >>> +{ >>> + assert(src->is_ssa); >>> + hash = HASH(hash, src->ssa); >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_alu_src(uint32_t hash, const nir_alu_src *src, unsigned >>> num_components) >>> +{ >>> + hash = HASH(hash, src->abs); >>> + hash = HASH(hash, src->negate); >>> + >>> + for (unsigned i = 0; i < num_components; i++) >>> + hash = HASH(hash, src->swizzle[i]); >>> + >>> + hash = hash_src(hash, &src->src); >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_alu(uint32_t hash, const nir_alu_instr *instr) >>> +{ >>> + hash = HASH(hash, instr->op); >>> + hash = HASH(hash, instr->dest.dest.ssa.num_components); >> >> Why are you hasing the dest.ssa.num_components instead of the >> writemask? They should be the same in any case, but this seems a bit >> out-of-place. > > No particular reason. I can change it if you want. > >> >>> + >>> + if (nir_op_infos[instr->op].algebraic_properties & >>> NIR_OP_IS_COMMUTATIVE) { >>> + assert(nir_op_infos[instr->op].num_inputs == 2); >>> + uint32_t hash0 = hash_alu_src(hash, &instr->src[0], >>> + >>> nir_ssa_alu_instr_src_components(instr, 0)); >>> + uint32_t hash1 = hash_alu_src(hash, &instr->src[1], >>> + >>> nir_ssa_alu_instr_src_components(instr, 1)); >>> + /* For commutative operations, we need some commutative way of >>> + * combining the hashes. One option would be to XOR them but that >>> + * means that anything with two identical sources will hash to 0 and >>> + * that's common enough we probably don't want the guaranteed >>> + * collision. Either addition or multiplication will also work. >>> + */ >>> + hash = hash0 * hash1; >>> + } else { >>> + for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) { >>> + hash = hash_alu_src(hash, &instr->src[i], >>> + nir_ssa_alu_instr_src_components(instr, i)); >>> + } >>> + } >>> + >>> + return hash; >>> +} >>> + >>> +static uint32_t >>> +hash_load_const(uint32_t hash, const nir_load_const_instr *instr) >>> +{ >>> + hash = HASH(hash, instr->def.num_components); >>> + >>> + hash = _mesa_fnv32_1a_accumulate_block(hash, instr->value.f, >>> + instr->def.num_components >>> + * sizeof(instr->value.f[0])); >>> + >>> + return hash; >>> +} >>> + >>> +static int >>> +cmp_phi_src(const void *data1, const void *data2) >>> +{ >>> + const nir_phi_src *src1 = data1, *src2 = data2; >>> + return src1->pred->index - src2->pred->index; >>> +} >>> + >>> +static uint32_t >>> +hash_phi(uint32_t hash, const nir_phi_instr *instr) >>> +{ >>> + hash = HASH(hash, instr->instr.block); >>> + >>> + /* sort sources by predecessor index, since the order shouldn't matter >>> */ >>> + unsigned num_preds = instr->instr.block->predecessors->entries; >>> + nir_phi_src *srcs = malloc(num_preds * sizeof(nir_phi_src)); >> >> You could use NIR_VLA and put it on the stack. That'd probably be more >> efficient. Also, why are you copying the entire source and not just >> making a list of pointers? > > Good point. It does prevent a pointer dereference when comparing, but > that's probably less expensive than copying the entire thing when > swapping. > >> >>> + unsigned i = 0; >>> + nir_foreach_phi_src(instr, src) { >>> + srcs[i++] = *src; >>> + } >>> + qsort(srcs, num_preds, sizeof(nir_phi_src), cmp_phi_src); >> >> Sorting seems
Re: [Mesa-dev] [PATCH 3/3] radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2
On Thu, Oct 01, 2015 at 03:40:45PM +0100, Emil Velikov wrote: > On 30 September 2015 at 19:59, Tom Stellard wrote: > > This fixes a race condition in the glx-multithreaded-shader-compile > > test. > > > > v2: > > - Replace gallivm_init_llvm_{begin,end}() with > > gallivm_init_llvm_targets(). > > > Thanks for sticking with my suggestion Tom. > > Fwiw the series is: > Reviewed-by: Emil Velikov > > Can you please confirm if we need the threadsafe wrappers, with this > series in place ? If we do, humble ping :) > We do still need the threadsafe wrappers. -Tom > -Emil > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/mt: Declare some functions as static
On Thu, Oct 1, 2015 at 8:20 AM, Chad Versace wrote: > intel_tiling_supports_non_msrt_mcs() and > intel_miptree_is_fast_clear_capable() are not used outside of > intel_mipmap_tree.c. > --- > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++-- > src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 6 +- > 2 files changed, 3 insertions(+), 7 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > index ffc356c..05dc291 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > @@ -160,7 +160,7 @@ intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree > *mt, > } > } > > -bool > +static bool > intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling) > { > /* From the Ivy Bridge PRM, Vol2 Part1 11.7 "MCS Buffer for Render > @@ -193,7 +193,7 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context > *brw, unsigned tiling) > * - MCS buffer for non-MSRT is supported only for RT formats 32bpp, > * 64bpp, and 128bpp. > */ > -bool > +static bool > intel_miptree_is_fast_clear_capable(struct brw_context *brw, > struct intel_mipmap_tree *mt) > { > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > index 486e5c6..805cd71 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h > @@ -658,11 +658,7 @@ struct intel_mipmap_tree > void > intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree *mt, > unsigned *width_px, unsigned *height); > -bool > -intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling); > -bool > -intel_miptree_is_fast_clear_capable(struct brw_context *brw, > -struct intel_mipmap_tree *mt); > + > bool > intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw, > struct intel_mipmap_tree *mt); > -- > 2.5.0.342.g44e0223 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev Both patches are: Reviewed-by: Anuj Phogat ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span
On 10/01/2015 03:17 PM, Iago Toral wrote: On Thu, 2015-10-01 at 08:28 +0300, Tapani Pälli wrote: Patch adds missing type (used with NV_read_depth) so that it gets handled correctly. Also add type to _mesa_problem output to aid debugging. Signed-off-by: Tapani Pälli --- src/mesa/main/pack.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/pack.c b/src/mesa/main/pack.c index 7147fd6..54a0c42 100644 --- a/src/mesa/main/pack.c +++ b/src/mesa/main/pack.c @@ -1074,6 +1074,7 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint n, GLvoid *dest, } } break; + case GL_UNSIGNED_INT_24_8: Is it okay to store 32-bit integers in this case? that's what the code below does. The spec says that the 8 stencil bits are undefined, but don't we need to convert the depth value to a 24-bit integer scale? (i.e. make 1.0 translate to 2^24-1 not 2^32-1). Oh, I thought stencil was included, I'll check if there's any existing app/test that would actually use this. Iago case GL_UNSIGNED_INT: { GLuint *dst = (GLuint *) dest; @@ -1124,7 +1125,8 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint n, GLvoid *dest, } break; default: - _mesa_problem(ctx, "bad type in _mesa_pack_depth_span"); + _mesa_problem(ctx, "bad type in _mesa_pack_depth_span (%s)", +_mesa_enum_to_string(dstType)); } free(depthCopy); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API
On Thu, Oct 1, 2015 at 10:30 AM, Emil Velikov wrote: > Hi Rob, > > I believe that Marek managed to nicely explain how things work. > Ideally you'll be able to get on the same boat (i.e. the Android C > runtime/linker won't give you the finger). Regardless some comments > inline. > > On 30 September 2015 at 21:43, Rob Clark wrote: >> From: Rob Clark >> >> Not actually working yet, ie. doesn't even compile yet, but an idea. >> >> Initial motivation was for drm_gralloc/pipe, which is essentially a sort >> of mini state-tracker, that needs to be able to share pipe_screen with >> libGL linked into the same process (to ensure we didn't end up with >> duplicate pipe_resource's, etc). I think same situation happens with >> vdpau+gl interop. >> >> Currently drm_gralloc/pipe (and other state trackers??) statically link >> the winsys code, which completely defeats the purpose magic in the >> winsys code to detect when the same device is opened multiple times and >> return the same pipe_screen (since you end up w/ multiple copies of the >> hashtable in the same process). See for example: >> >> 5bb41d9094b3c9bdf0669fd55418981ed83347e3 >> fee0686c21c631d96d6042741267a3c218c23ffc >> >> Rough idea is that we should have something like libgallium.so which >> contains the pipe-loader API, and then optionally (depending on >> GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers. The various >> different state trackers would link against (or dlopen()) libgallium.so >> and use the pipe-loader API to get themselves a pipe_screen (and config >> params, etc). And then you end up with: >> >> +---+ >> | l | >> clover --> | i | >> | b | >> mesa-st --> | g | >> | a |--> pipe driver >> vdapau --> | l | >> | l | >> gralloc --> | i | >> | u | >> xa --> | m | >> | | >> +---+ >> or: >> +---+-+ >> | l | | >> clover --> | i | | >> | b | | >> mesa-st --> | g | | >> | a | pipe driver | >> vdapau --> | l | | >> | l | | >> gralloc --> | i | | >> | u | | >> xa --> | m | | >> | | | >> +---+-+ >> >> depending on GALLIUM_STATIC_TARGETS. Either way, all the state trackers >> in the same process share a single copy of the hashtable in the winsys >> code which allows them to share the same pipe_screen. >> > As others I'm not too excited about adding new library/ies. Amongst > others, It boils down to people mix'n'matching things, causing all > sorts of problems. hmm, sounds like a distro issue, doesn't it? I mean in non-GALLIUM_STATIC_TARGETS build configuration (which, afaict doesn't even work these days) you couldn't mix and match state tracker .so's and pipe-driver .so's.. If really needed we could stuff module version symbols all around and refuse to load with nice error msg if mismatch.. seems like a relatively solvable problem.. >> I think that ends up being an extra level of library indirection vs >> current state w/ pipe drivers, ie. with mesa dri loader stuff directly >> loading gallium_dri.so which contains all the drivers plus mesa state >> tracker. If this was concerning to someone, what I'd suggest would be >> to, for all the state trackers that already have some sort of loader >> sitting between them and the user, just pull them directly into the >> mega-mega libgallium.so, ie. something like: >> >> +-+---+-+ >> | | | | >> | clover | l | | >> | | i | | >> | mesa-st | b | | >> | | g | pipe driver | >> | vdapau | a | | >> | | l | | >> +-| l | | >> | i | | >> gralloc --> | u | | >> | m | | >> xa --> | | | >> | | | >> +---+-+ >> > This won't work due to funny dependencies. I.e. you don't want to > force xcb and friends onto the dri module, clover etc, just because > vdpau needs it. Hmm, perhaps.. although I thought dri already was converted to use xcb.. otherwise I'd guess vdpau and mesa-st would have roughly the same sorts of dependencies. Although perhaps it would be funny for clover to have those dep's.. >> Anyways, I'm far from an expert in the build architecture of things so >> I might have some facts wrong or be a bit confused. And I'll probably >> regret bringing the subject up. But somehow or another it seems like >> it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS >> ifdeffery spread throughou
[Mesa-dev] [PATCH] mesa: Add missing _mm_mfence() before streaming loads.
According to the Intel Software Development Manual: Streaming loads may be weakly ordered and may appear to software to execute out of order with respect to other memory operations. Software must explicitly use fences (e.g. MFENCE) if it needs to preserve order among streaming loads or between streaming loads and other memory operations. That is, a memory fence is needed to preserve the order between the GPU writing the buffer and the streaming loads reading it back. Reported-by: Joseph Nuzman --- src/mesa/main/streaming-load-memcpy.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/mesa/main/streaming-load-memcpy.c b/src/mesa/main/streaming-load-memcpy.c index d7147af..32854b6 100644 --- a/src/mesa/main/streaming-load-memcpy.c +++ b/src/mesa/main/streaming-load-memcpy.c @@ -54,16 +54,19 @@ _mesa_streaming_load_memcpy(void *restrict dst, void *restrict src, size_t len) memcpy(d, s, MIN2(bytes_before_alignment_boundary, len)); d = (char *)ALIGN((uintptr_t)d, 16); s = (char *)ALIGN((uintptr_t)s, 16); len -= MIN2(bytes_before_alignment_boundary, len); } + if (len >= 64) + _mm_mfence(); + while (len >= 64) { __m128i *dst_cacheline = (__m128i *)d; __m128i *src_cacheline = (__m128i *)s; __m128i temp1 = _mm_stream_load_si128(src_cacheline + 0); __m128i temp2 = _mm_stream_load_si128(src_cacheline + 1); __m128i temp3 = _mm_stream_load_si128(src_cacheline + 2); __m128i temp4 = _mm_stream_load_si128(src_cacheline + 3); -- 2.4.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/23] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.
On Wednesday, September 30, 2015 12:58:09 AM Kenneth Graunke wrote: > Discuss :) NAK on patches 3-5. It turns out they break if the outermost array index is non-constant. I've got two new patches that replace them. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 2/2] mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program
On Thu, Oct 1, 2015 at 7:09 AM, Iago Toral Quiroga wrote: > These arrays provide backends with separate index spaces for UBOS and SSBOs. > --- > src/glsl/linker.cpp | 35 +++ > src/glsl/standalone_scaffolding.cpp | 9 + > src/mesa/main/mtypes.h | 6 ++ > 3 files changed, 50 insertions(+) > > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp > index e6eba94..3da773d 100644 > --- a/src/glsl/linker.cpp > +++ b/src/glsl/linker.cpp > @@ -4107,6 +4107,41 @@ link_shaders(struct gl_context *ctx, struct > gl_shader_program *prog) >} > } > > + /* Split prog->BufferInterfaceBlocks into prog->UniformBlocks and > +* prog->ShaderStorageBlocks, so that drivers that need separate index > +* spaces for each set can have that. > +*/ > + unsigned num_ubo_blocks; > + unsigned num_ssbo_blocks; > + num_ubo_blocks = 0; > + num_ssbo_blocks = 0; > + for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { > + if (prog->BufferInterfaceBlocks[i].IsShaderStorage) > + num_ssbo_blocks++; > + else > + num_ubo_blocks++; > + } > + > + prog->UniformBlocks = > + ralloc_array(mem_ctx, gl_uniform_block *, num_ubo_blocks); > + prog->NumUniformBlocks = 0; > + > + prog->ShaderStorageBlocks = > + ralloc_array(mem_ctx, gl_uniform_block *, num_ssbo_blocks); > + prog->NumShaderStorageBlocks = 0; > + > + for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { > + if (prog->BufferInterfaceBlocks[i].IsShaderStorage) > + prog->ShaderStorageBlocks[prog->NumShaderStorageBlocks++] = > +&prog->BufferInterfaceBlocks[i]; > + else > + prog->UniformBlocks[prog->NumUniformBlocks++] = > +&prog->BufferInterfaceBlocks[i]; > + } Shouldn't this go through and also adjust the indices of the linked programs? Or... something along those lines? With this, I still need a remapping table to go from the index passed to a load_ssbo/load_ubo instruction to a ssbo/ubo index. Perhaps a few well-placed helper functions can alleviate this? Also this should erase the need for some of the O(n) iterations that have sprung up as a result of this combined list. IMHO ideally the BufferInterfaceBlocks list would get freed at the end of this function. But I understand that this will require work, and the onus is probably on me (or anyone wanting to add ssbo support to other drivers) to do it, or work around it. -ilia > + > + assert(prog->NumUniformBlocks + prog->NumShaderStorageBlocks == > + prog->NumBufferInterfaceBlocks); > + > /* FINISHME: Assign fragment shader output locations. */ > > done: > diff --git a/src/glsl/standalone_scaffolding.cpp > b/src/glsl/standalone_scaffolding.cpp > index 0c53589..658245f 100644 > --- a/src/glsl/standalone_scaffolding.cpp > +++ b/src/glsl/standalone_scaffolding.cpp > @@ -102,6 +102,15 @@ _mesa_clear_shader_program_data(struct gl_shader_program > *shProg) > ralloc_free(shProg->BufferInterfaceBlocks); > shProg->BufferInterfaceBlocks = NULL; > shProg->NumBufferInterfaceBlocks = 0; > + > + ralloc_free(shProg->UniformBlocks); > + shProg->UniformBlocks = NULL; > + shProg->NumUniformBlocks = 0; > + > + ralloc_free(shProg->ShaderStorageBlocks); > + shProg->ShaderStorageBlocks = NULL; > + shProg->NumShaderStorageBlocks = 0; > + > for (i = 0; i < MESA_SHADER_STAGES; i++) { >ralloc_free(shProg->UniformBlockStageIndex[i]); >shProg->UniformBlockStageIndex[i] = NULL; > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index 347da14..2362f54 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -2692,6 +2692,12 @@ struct gl_shader_program > unsigned NumBufferInterfaceBlocks; > struct gl_uniform_block *BufferInterfaceBlocks; > > + unsigned NumUniformBlocks; > + struct gl_uniform_block **UniformBlocks; > + > + unsigned NumShaderStorageBlocks; > + struct gl_uniform_block **ShaderStorageBlocks; > + > /** > * Indices into the _LinkedShaders's UniformBlocks[] array for each stage > * they're used in, or -1. > -- > 1.9.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/2] nir/lower_io: Return has_indirect flag from get_io_offset().
get_io_offset() already walks the dereference chain and discovers whether or not we have an indirect; we can just return that rather than computing it a second time. This means moving the call a bit earlier. More importantly, I'm about to introduce special handling for the outermost array index, which would require changing the dereference we pass to deref_has_indirect(). Or we can just eliminate that. Signed-off-by: Kenneth Graunke --- src/glsl/nir/nir_lower_io.c | 41 ++--- 1 file changed, 14 insertions(+), 27 deletions(-) This replaces patches 3-5 of my recent 23 patch series: [PATCH 03/23] nir/lower_io: Switch on shader stage for inputs in load_op(). [PATCH 04/23] nir/lower_io: Make get_io_offset take a nir_deref, not a nir_deref_var. [PATCH 05/23] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics. diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c index f32c09d..80967fc 100644 --- a/src/glsl/nir/nir_lower_io.c +++ b/src/glsl/nir/nir_lower_io.c @@ -63,22 +63,9 @@ nir_assign_var_locations(struct exec_list *var_list, unsigned *size, *size = location; } -static bool -deref_has_indirect(nir_deref_var *deref) -{ - for (nir_deref *tail = deref->deref.child; tail; tail = tail->child) { - if (tail->deref_type == nir_deref_type_array) { - nir_deref_array *arr = nir_deref_as_array(tail); - if (arr->deref_array_type == nir_deref_array_type_indirect) -return true; - } - } - - return false; -} - static unsigned -get_io_offset(nir_deref_var *deref, nir_instr *instr, nir_src *indirect, +get_io_offset(nir_deref_var *deref, nir_instr *instr, + nir_src *indirect, bool *has_indirect, struct lower_io_state *state) { bool found_indirect = false; @@ -122,6 +109,7 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr, nir_src *indirect, } } + *has_indirect = found_indirect; return base_offset; } @@ -169,17 +157,17 @@ nir_lower_io_block(nir_block *block, void *void_state) if (mode != nir_var_shader_in && mode != nir_var_uniform) continue; - bool has_indirect = deref_has_indirect(intrin->variables[0]); + nir_src indirect; + bool has_indirect; + + unsigned offset = get_io_offset(intrin->variables[0], &intrin->instr, + &indirect, &has_indirect, state); nir_intrinsic_instr *load = nir_intrinsic_instr_create(state->mem_ctx, load_op(mode, has_indirect)); load->num_components = intrin->num_components; - nir_src indirect; - unsigned offset = get_io_offset(intrin->variables[0], - &intrin->instr, &indirect, state); - unsigned location = intrin->variables[0]->var->data.driver_location; if (mode == nir_var_uniform) { load->const_index[0] = location; @@ -209,7 +197,12 @@ nir_lower_io_block(nir_block *block, void *void_state) if (intrin->variables[0]->var->data.mode != nir_var_shader_out) continue; - bool has_indirect = deref_has_indirect(intrin->variables[0]); + nir_src indirect; + bool has_indirect; + + unsigned offset = get_io_offset(intrin->variables[0], &intrin->instr, + &indirect, &has_indirect, state); + offset += intrin->variables[0]->var->data.driver_location; nir_intrinsic_op store_op; if (has_indirect) { @@ -221,12 +214,6 @@ nir_lower_io_block(nir_block *block, void *void_state) nir_intrinsic_instr *store = nir_intrinsic_instr_create(state->mem_ctx, store_op); store->num_components = intrin->num_components; - - nir_src indirect; - unsigned offset = get_io_offset(intrin->variables[0], - &intrin->instr, &indirect, state); - offset += intrin->variables[0]->var->data.driver_location; - store->const_index[0] = offset; nir_src_copy(&store->src[0], &intrin->src[0], store); -- 2.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/2] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.
Geometry and tessellation shaders process multiple vertices; their inputs are arrays indexed by the vertex number. While GLSL makes this look like a normal array, it can be very different behind the scenes. On Intel hardware, all inputs for a particular vertex are stored together - as if they were grouped into a single struct. This means that consecutive elements of these top-level arrays are not contiguous. In fact, they may sometimes be in completely disjoint memory segments. NIR's existing load_input intrinsics are awkward for this case, as they distill everything down to a single offset. We'd much rather keep the vertex ID separate, but build up an offset as normal beyond that. This patch introduces new nir_intrinsic_load_per_vertex_input intrinsics to handle this case. They work like ordinary load_input intrinsics, but have an extra source (src[0]) which represents the outermost array index. Signed-off-by: Kenneth Graunke --- src/glsl/nir/nir_intrinsics.h | 1 + src/glsl/nir/nir_lower_io.c | 54 ++--- src/glsl/nir/nir_print.c | 2 + src/mesa/drivers/dri/i965/brw_nir.c | 13 +- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 58 +++ 5 files changed, 86 insertions(+), 42 deletions(-) diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h index ac4c2ba..263d8c1 100644 --- a/src/glsl/nir/nir_intrinsics.h +++ b/src/glsl/nir/nir_intrinsics.h @@ -228,6 +228,7 @@ SYSTEM_VALUE(num_work_groups, 3, 0) LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE) /* diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c index 80967fc..d066ca6 100644 --- a/src/glsl/nir/nir_lower_io.c +++ b/src/glsl/nir/nir_lower_io.c @@ -63,8 +63,20 @@ nir_assign_var_locations(struct exec_list *var_list, unsigned *size, *size = location; } +/** + * Returns true if we're processing a stage whose inputs are arrays indexed + * by a vertex number (such as geometry shader inputs). + */ +static bool +stage_uses_per_vertex_inputs(struct lower_io_state *state) +{ + gl_shader_stage stage = state->builder.shader->stage; + return stage == MESA_SHADER_GEOMETRY; +} + static unsigned get_io_offset(nir_deref_var *deref, nir_instr *instr, + nir_src *vertex_index, nir_src *indirect, bool *has_indirect, struct lower_io_state *state) { @@ -75,6 +87,22 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr, b->cursor = nir_before_instr(instr); nir_deref *tail = &deref->deref; + + /* For per-vertex input arrays (i.e. geometry shader inputs), keep the +* outermost array index separate. Process the rest normally. +*/ + if (vertex_index != NULL) { + tail = tail->child; + assert(tail->deref_type == nir_deref_type_array); + nir_deref_array *deref_array = nir_deref_as_array(tail); + + nir_ssa_def *vtx = nir_imm_int(b, deref_array->base_offset); + if (deref_array->deref_array_type == nir_deref_array_type_indirect) { + vtx = nir_iadd(b, vtx, nir_ssa_for_src(b, deref_array->indirect, 1)); + } + *vertex_index = nir_src_for_ssa(vtx); + } + while (tail->child != NULL) { const struct glsl_type *parent_type = tail->type; tail = tail->child; @@ -114,13 +142,19 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr, } static nir_intrinsic_op -load_op(nir_variable_mode mode, bool has_indirect) +load_op(struct lower_io_state *state, +nir_variable_mode mode, bool per_vertex, bool has_indirect) { nir_intrinsic_op op; switch (mode) { case nir_var_shader_in: - op = has_indirect ? nir_intrinsic_load_input_indirect : - nir_intrinsic_load_input; + if (per_vertex) { + op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect : + nir_intrinsic_load_per_vertex_input; + } else { + op = has_indirect ? nir_intrinsic_load_input_indirect : + nir_intrinsic_load_input; + } break; case nir_var_uniform: op = has_indirect ? nir_intrinsic_load_uniform_indirect : @@ -157,15 +191,21 @@ nir_lower_io_block(nir_block *block, void *void_state) if (mode != nir_var_shader_in && mode != nir_var_uniform) continue; + bool per_vertex = mode == nir_var_shader_in && + stage_uses_per_vertex_inputs(state); + + nir_src vertex_index; nir_src indirect; bool has_indirect; unsigned offset = get_io_offset(intrin->variables[0], &intrin->instr, +
Re: [Mesa-dev] [PATCH] mesa: Add missing _mm_mfence() before streaming loads.
On 2015-10-01 10:11:33, Matt Turner wrote: > According to the Intel Software Development Manual: How about a more specific doc location? According to the Intel Software Development Manual (Volume 1: Basic Architecture, 12.10.3 Streaming Load Hint Instruction): >Streaming loads may be weakly ordered and may appear to software to >execute out of order with respect to other memory operations. >Software must explicitly use fences (e.g. MFENCE) if it needs to >preserve order among streaming loads or between streaming loads and >other memory operations. Does this mean we need a mfence following the load as well? -Jordan > That is, a memory fence is needed to preserve the order between the GPU > writing the buffer and the streaming loads reading it back. > > Reported-by: Joseph Nuzman > --- > src/mesa/main/streaming-load-memcpy.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/src/mesa/main/streaming-load-memcpy.c > b/src/mesa/main/streaming-load-memcpy.c > index d7147af..32854b6 100644 > --- a/src/mesa/main/streaming-load-memcpy.c > +++ b/src/mesa/main/streaming-load-memcpy.c > @@ -54,16 +54,19 @@ _mesa_streaming_load_memcpy(void *restrict dst, void > *restrict src, size_t len) > >memcpy(d, s, MIN2(bytes_before_alignment_boundary, len)); > >d = (char *)ALIGN((uintptr_t)d, 16); >s = (char *)ALIGN((uintptr_t)s, 16); >len -= MIN2(bytes_before_alignment_boundary, len); > } > > + if (len >= 64) > + _mm_mfence(); > + > while (len >= 64) { >__m128i *dst_cacheline = (__m128i *)d; >__m128i *src_cacheline = (__m128i *)s; > >__m128i temp1 = _mm_stream_load_si128(src_cacheline + 0); >__m128i temp2 = _mm_stream_load_si128(src_cacheline + 1); >__m128i temp3 = _mm_stream_load_si128(src_cacheline + 2); >__m128i temp4 = _mm_stream_load_si128(src_cacheline + 3); > -- > 2.4.9 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span
On 10/01/2015 09:50 AM, Tapani Pälli wrote: > On 10/01/2015 03:17 PM, Iago Toral wrote: >> On Thu, 2015-10-01 at 08:28 +0300, Tapani Pälli wrote: >>> Patch adds missing type (used with NV_read_depth) so that it gets NV_read_depth_stencil? >>> handled correctly. Also add type to _mesa_problem output to aid >>> debugging. >>> >>> Signed-off-by: Tapani Pälli >>> --- >>> src/mesa/main/pack.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/src/mesa/main/pack.c b/src/mesa/main/pack.c >>> index 7147fd6..54a0c42 100644 >>> --- a/src/mesa/main/pack.c >>> +++ b/src/mesa/main/pack.c >>> @@ -1074,6 +1074,7 @@ _mesa_pack_depth_span( struct gl_context *ctx, >>> GLuint n, GLvoid *dest, >>>} >>> } >>> break; >>> + case GL_UNSIGNED_INT_24_8: >> Is it okay to store 32-bit integers in this case? that's what the code >> below does. The spec says that the 8 stencil bits are undefined, but >> don't we need to convert the depth value to a 24-bit integer scale? >> (i.e. make 1.0 translate to 2^24-1 not 2^32-1). I was going to make the same comment. > Oh, I thought stencil was included, I'll check if there's any existing > app/test that would actually use this. The incoming depthSpan is a float*, so there's no stencil there. I think we'll need to cobble together a piglit test for this case. Can this only be hit of NV_read_depth_stencil is exported? I'm just wondering because Mesa doesn't support that extension. How is this even being hit? >> Iago >> >>> case GL_UNSIGNED_INT: >>> { >>>GLuint *dst = (GLuint *) dest; >>> @@ -1124,7 +1125,8 @@ _mesa_pack_depth_span( struct gl_context *ctx, >>> GLuint n, GLvoid *dest, >>> } >>> break; >>> default: >>> - _mesa_problem(ctx, "bad type in _mesa_pack_depth_span"); >>> + _mesa_problem(ctx, "bad type in _mesa_pack_depth_span (%s)", >>> +_mesa_enum_to_string(dstType)); >>> } >>>free(depthCopy); I definitely approve of this part. If you want to split that into its own patch, that's Reviewed-by: Ian Romanick > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span
On Thu, Oct 1, 2015 at 3:12 PM, Ian Romanick wrote: > I'm just > wondering because Mesa doesn't support that extension. How is this even > being hit? See 81d2fd91a90 (mesa: add NV_read_{depth,stencil,depth_stencil} extensions) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] nir: Add a function to determine if a source is dynamically uniform
On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts wrote: > Adds nir_src_is_dynamically_uniform which returns true if the source > is known to be dynamically uniform. This will be used in a later patch > to add a workaround for cases that only work with dynamically uniform > sources. Note that the function is not definitive, it can return false > negatives (but not false positives). Currently it only detects > constants and uniform accesses. It could easily be extended to include > more cases. > --- > src/glsl/nir/nir.c | 29 + > src/glsl/nir/nir.h | 1 + > 2 files changed, 30 insertions(+) > > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c > index 78ff886..242f0b4 100644 > --- a/src/glsl/nir/nir.c > +++ b/src/glsl/nir/nir.c > @@ -1784,6 +1784,35 @@ nir_src_as_const_value(nir_src src) > return &load->value; > } > > +/** > + * Returns true if the source is known to be dynamically uniform. Otherwise > it > + * returns false which means it may or may not be dynamically uniform but it > + * can't be determined. > + */ > +bool > +nir_src_is_dynamically_uniform(nir_src src) > +{ > + if (!src.is_ssa) > + return false; > + > + /* Constants are trivially dynamically uniform */ > + if (src.ssa->parent_instr->type == nir_instr_type_load_const) > + return true; > + > + /* As are uniform variables */ > + if (src.ssa->parent_instr->type == nir_instr_type_intrinsic) { > + nir_intrinsic_instr *intr = > nir_instr_as_intrinsic(src.ssa->parent_instr); > + > + if (intr->intrinsic == nir_intrinsic_load_uniform) > + return true; > + } > + > + /* XXX: this could have many more tests, such as when a sampler function > is > +* called with dynamically uniform arguments. > +*/ > + return false; > +} This functions seems correct as-is, so it gets a Reviewed-by: Matt Turner On top of being useful for fixing the nonconst/nonuniform piglit tests, knowing which values are uniform can allow better optimization and knowing which branches are uniform would allow us to enable Single Program Flow. Cc'ing Jason for a discussion about how we'd like to handle this kind of information in NIR. Add a bool is_uniform or something to nir_ssa_def and hook into the metadata system? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API
From: Rob Clark Signed-off-by: Rob Clark --- Drop the idea of more ambitious re-arrangement if libs, but keep the pipe-loader refactoring. With this at least drm_gralloc could still dlopen() gallium_dri.so and then use the pipe-loader API to figure out which pipe driver to load and hand back a screen. The nine st is not updated.. but I don't claim to understand the whole screen + sw-screen thing. So I figured I'd let someone who knew what they were doing update nine. Once that happens, we should change to not expose the dd_xyz fxns outside of pipe-loader, imho.. configure.ac| 6 -- src/gallium/auxiliary/pipe-loader/Makefile.am | 14 ++ src/gallium/auxiliary/pipe-loader/pipe_loader.c | 2 -- src/gallium/auxiliary/pipe-loader/pipe_loader.h | 4 src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 12 src/gallium/auxiliary/vl/vl_winsys_dri.c| 8 src/gallium/state_trackers/dri/dri2.c | 13 - src/gallium/state_trackers/dri/dri_screen.c | 2 -- src/gallium/state_trackers/xa/xa_tracker.c | 10 +- src/gallium/targets/dri/Makefile.am | 1 + src/gallium/targets/dri/dri.sym | 3 +++ src/gallium/targets/omx/Makefile.am | 4 +--- src/gallium/targets/va/Makefile.am | 4 +--- src/gallium/targets/vdpau/Makefile.am | 4 +--- src/gallium/targets/xa/Makefile.am | 4 +--- src/gallium/targets/xvmc/Makefile.am| 5 ++--- 16 files changed, 25 insertions(+), 71 deletions(-) diff --git a/configure.ac b/configure.ac index 1ef5fbc..7d0f9d9 100644 --- a/configure.ac +++ b/configure.ac @@ -2044,7 +2044,6 @@ gallium_require_drm_loader() { if test "x$need_pci_id$have_pci_id" = xyesno; then AC_MSG_ERROR([Gallium drm loader requires libudev >= $LIBUDEV_REQUIRED or sysfs]) fi -enable_gallium_drm_loader=yes fi if test "x$enable_va" = xyes && test "x$7" != x; then GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS $7" @@ -2248,10 +2247,6 @@ if test "x$enable_gallium_loader" = xyes; then GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES -DHAVE_PIPE_LOADER_DRI" fi -if test "x$enable_gallium_drm_loader" = xyes; then -GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES -DHAVE_PIPE_LOADER_DRM" -fi - AC_SUBST([GALLIUM_PIPE_LOADER_DEFINES]) fi @@ -2269,7 +2264,6 @@ AM_CONDITIONAL(NEED_WINSYS_XLIB, test "x$NEED_WINSYS_XLIB" = xyes) AM_CONDITIONAL(NEED_RADEON_LLVM, test x$NEED_RADEON_LLVM = xyes) AM_CONDITIONAL(USE_R600_LLVM_COMPILER, test x$USE_R600_LLVM_COMPILER = xyes) AM_CONDITIONAL(HAVE_LOADER_GALLIUM, test x$enable_gallium_loader = xyes) -AM_CONDITIONAL(HAVE_DRM_LOADER_GALLIUM, test x$enable_gallium_drm_loader = xyes) AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes) AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1) AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes) diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am b/src/gallium/auxiliary/pipe-loader/Makefile.am index 8c83799..0df5d4a 100644 --- a/src/gallium/auxiliary/pipe-loader/Makefile.am +++ b/src/gallium/auxiliary/pipe-loader/Makefile.am @@ -5,6 +5,7 @@ include $(top_srcdir)/src/gallium/Automake.inc AM_CFLAGS = \ -I$(top_srcdir)/src/loader \ -I$(top_srcdir)/src/gallium/winsys \ + $(LIBDRM_CFLAGS) \ $(GALLIUM_PIPE_LOADER_DEFINES) \ $(GALLIUM_CFLAGS) \ $(VISIBILITY_CFLAGS) @@ -12,17 +13,6 @@ AM_CFLAGS = \ noinst_LTLIBRARIES = libpipe_loader.la libpipe_loader_la_SOURCES = \ - $(COMMON_SOURCES) - -if HAVE_DRM_LOADER_GALLIUM -AM_CFLAGS += \ - $(LIBDRM_CFLAGS) - -libpipe_loader_la_SOURCES += \ + $(COMMON_SOURCES) \ $(DRM_SOURCES) -libpipe_loader_la_LIBADD = \ - $(top_builddir)/src/loader/libloader.la - -endif - diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.c b/src/gallium/auxiliary/pipe-loader/pipe_loader.c index 8e79f85..535c2ce 100644 --- a/src/gallium/auxiliary/pipe-loader/pipe_loader.c +++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.c @@ -35,9 +35,7 @@ #define MODULE_PREFIX "pipe_" static int (*backends[])(struct pipe_loader_device **, int) = { -#ifdef HAVE_PIPE_LOADER_DRM &pipe_loader_drm_probe, -#endif &pipe_loader_sw_probe }; diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.h b/src/gallium/auxiliary/pipe-loader/pipe_loader.h index 9b87126..cf0c51b 100644 --- a/src/gallium/auxiliary/pipe-loader/pipe_loader.h +++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.h @@ -158,8 +158,6 @@ boolean pipe_loader_sw_probe_wrapped(struct pipe_loader_device **dev, struct pipe_screen *screen); -#ifdef HAVE_PIPE_LOADER_DRM - /** * Get a list of known DRM devices. * @@ -180,8 +178,6 @@ pipe_l
Re: [Mesa-dev] [PATCH] mesa: Add abs input modifier to base for POW in ffvertex_prog
On Thu, Oct 1, 2015 at 8:36 AM, Daniel Scharrer wrote: > The result of POW for a negative base is undefined. Even when the result > is multiplied by zero (which is the case here whenever the base is > negative), the Inf and NaNs can propagate past that. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342 > Signed-off-by: Daniel Scharrer > Cc: "10.6 11.0" Reviewed-by: Ilia Mirkin This also fixes the trace in question on nouveau (tested with a GK208). Nice find! I did some digging in the ARB_vertex_program spec, and it explicitly says that LG2(negative) is undefined, and that POW can be expressed as EX2 + LG2, so the compilers are well within their right to compile the program as they have been. Let me know if you need me to push this out for you. -ilia > --- > src/mesa/main/ffvertex_prog.c | 17 ++--- > 1 file changed, 14 insertions(+), 3 deletions(-) > > diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c > index 95b428d..a6183b4 100644 > --- a/src/mesa/main/ffvertex_prog.c > +++ b/src/mesa/main/ffvertex_prog.c > @@ -293,9 +293,10 @@ struct ureg { > GLuint file:4; > GLint idx:9; /* relative addressing may be negative */ > /* sizeof(idx) should == sizeof(prog_src_reg::Index) */ > + GLuint abs:1; > GLuint negate:1; > GLuint swz:12; > - GLuint pad:6; > + GLuint pad:5; > }; > > > @@ -324,6 +325,7 @@ static const struct ureg undef = { > 0, > 0, > 0, > + 0, > 0 > }; > > @@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx) > struct ureg reg; > reg.file = file; > reg.idx = idx; > + reg.abs = 0; > reg.negate = 0; > reg.swz = SWIZZLE_NOOP; > reg.pad = 0; > @@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx) > > > > +static struct ureg absolute( struct ureg reg ) > +{ > + reg.abs = 1; > + reg.negate = 0; > + return reg; > +} > + > + > static struct ureg negate( struct ureg reg ) > { > reg.negate ^= 1; > @@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src, > src->File = reg.file; > src->Index = reg.idx; > src->Swizzle = reg.swz; > + src->Abs = reg.abs; > src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE; > - src->Abs = 0; > src->RelAddr = 0; > /* Check that bitfield sizes aren't exceeded */ > assert(src->Index == reg.idx); > @@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct > tnl_program *p, > >emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm); >emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot); > - emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W)); > + emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), swizzle1(attenuation, > W)); >emit_op2(p, OPCODE_MUL, att, 0, slt, spot); > >release_temp(p, spot); > -- > 2.6.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit
https://bugs.freedesktop.org/show_bug.cgi?id=92221 Bug ID: 92221 Summary: Unintended code changes in _mesa_base_tex_format commit Product: Mesa Version: git Hardware: All OS: All Status: NEW Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: nanleych...@gmail.com QA Contact: mesa-dev@lists.freedesktop.org Commit c6bf1cd1467ea5d5370394ba99366dd8a59a385c was supposed to move _mesa_base_tex_format() from one file to another, but it also adds extra code into the function and removes a case which handles ASTC formats. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit
https://bugs.freedesktop.org/show_bug.cgi?id=92221 Nanley Chery changed: What|Removed |Added CC||el...@igalia.com, ||ja...@jlekstrand.net, ||mark.a.ja...@intel.com -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Add abs input modifier to base for POW in ffvertex_prog
On 2015-10-01 21:45, Ilia Mirkin wrote: > On Thu, Oct 1, 2015 at 8:36 AM, Daniel Scharrer wrote: >> The result of POW for a negative base is undefined. Even when the result >> is multiplied by zero (which is the case here whenever the base is >> negative), the Inf and NaNs can propagate past that. >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342 >> Signed-off-by: Daniel Scharrer >> Cc: "10.6 11.0" > > Reviewed-by: Ilia Mirkin > > This also fixes the trace in question on nouveau (tested with a > GK208). Nice find! I did some digging in the ARB_vertex_program spec, > and it explicitly says that LG2(negative) is undefined, and that POW > can be expressed as EX2 + LG2, so the compilers are well within their > right to compile the program as they have been. > > Let me know if you need me to push this out for you. Yeah, I don't have commit access. > > -ilia > >> --- >> src/mesa/main/ffvertex_prog.c | 17 ++--- >> 1 file changed, 14 insertions(+), 3 deletions(-) >> >> diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c >> index 95b428d..a6183b4 100644 >> --- a/src/mesa/main/ffvertex_prog.c >> +++ b/src/mesa/main/ffvertex_prog.c >> @@ -293,9 +293,10 @@ struct ureg { >> GLuint file:4; >> GLint idx:9; /* relative addressing may be negative */ >> /* sizeof(idx) should == sizeof(prog_src_reg::Index) */ >> + GLuint abs:1; >> GLuint negate:1; >> GLuint swz:12; >> - GLuint pad:6; >> + GLuint pad:5; >> }; >> >> >> @@ -324,6 +325,7 @@ static const struct ureg undef = { >> 0, >> 0, >> 0, >> + 0, >> 0 >> }; >> >> @@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx) >> struct ureg reg; >> reg.file = file; >> reg.idx = idx; >> + reg.abs = 0; >> reg.negate = 0; >> reg.swz = SWIZZLE_NOOP; >> reg.pad = 0; >> @@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx) >> >> >> >> +static struct ureg absolute( struct ureg reg ) >> +{ >> + reg.abs = 1; >> + reg.negate = 0; >> + return reg; >> +} >> + >> + >> static struct ureg negate( struct ureg reg ) >> { >> reg.negate ^= 1; >> @@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src, >> src->File = reg.file; >> src->Index = reg.idx; >> src->Swizzle = reg.swz; >> + src->Abs = reg.abs; >> src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE; >> - src->Abs = 0; >> src->RelAddr = 0; >> /* Check that bitfield sizes aren't exceeded */ >> assert(src->Index == reg.idx); >> @@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct >> tnl_program *p, >> >>emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm); >>emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot); >> - emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W)); >> + emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), >> swizzle1(attenuation, W)); >>emit_op2(p, OPCODE_MUL, att, 0, slt, spot); >> >>release_temp(p, spot); >> -- >> 2.6.0 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev -- Daniel Scharrer http://constexpr.org/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit
https://bugs.freedesktop.org/show_bug.cgi?id=92221 Mark Janes changed: What|Removed |Added CC||nanleych...@gmail.com -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit
https://bugs.freedesktop.org/show_bug.cgi?id=92221 --- Comment #1 from Mark Janes --- Nanley, should there have been a piglit failure associated with ASTC functionality that was disabled? -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit
https://bugs.freedesktop.org/show_bug.cgi?id=92221 --- Comment #2 from Nanley Chery --- (In reply to Mark Janes from comment #1) > Nanley, should there have been a piglit failure associated with ASTC > functionality that was disabled? Not at the moment. The tests are still on the list and haven't been merged to master. Some others developers and I are working to get them integrated. -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Make brw_varying_to_offset take a const pointer to the VUE map.
It doesn't modify it. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_context.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 8ae3666..c46dbe2 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -581,8 +581,8 @@ static inline GLuint brw_vue_slot_to_offset(GLuint slot) * Convert a vertex output (brw_varying_slot) into a byte offset within the * VUE. */ -static inline GLuint brw_varying_to_offset(struct brw_vue_map *vue_map, - GLuint varying) +static inline GLuint +brw_varying_to_offset(const struct brw_vue_map *vue_map, GLuint varying) { return brw_vue_slot_to_offset(vue_map->varying_to_slot[varying]); } -- 2.5.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI
https://bugs.freedesktop.org/show_bug.cgi?id=91596 --- Comment #8 from Mauro Rossi --- Hi Ilia, Sorry I noticed my previous sentence was not clear, I have performed the test by disabling EGL_KHR_gl_colorspace, independently frome Chih-Wei attempt with MESA_EXTENSION_OVERRIDE, by applying the following changes that were suggested to me by Emil: --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -586,7 +586,7 @@ dri2_setup_screen(_EGLDisplay *disp) if (dri2_renderer_query_integer(dri2_dpy, __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB)) - disp->Extensions.KHR_gl_colorspace = EGL_TRUE; + disp->Extensions.KHR_gl_colorspace = EGL_FALSE; if (dri2_dpy->dri2 && dri2_dpy->dri2->base.version >= 3) { disp->Extensions.KHR_create_context = EGL_TRUE; The result was still flashing/phantom elements in android GUI. So there may be some other effect induced by c2c2e9a commit. Are all the dri_single/dri_double srgb/non srgb correctly set? What is strange is that all kind of chipsets are affected. Mauro -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI
https://bugs.freedesktop.org/show_bug.cgi?id=91596 --- Comment #9 from Mauro Rossi --- >> Hi Ilia, My greeting also to Ilia, but you are Marek Sorry Mauro -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Make brw_varying_to_offset take a const pointer to the VUE map.
Reviewed-by: Matt Turner ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state
Ping On Tue, Sep 29, 2015 at 2:37 AM, Marek Olšák wrote: > From: Marek Olšák > > Required by ARB_sample_shading for drivers that don't want a shader variant > in st/mesa. > --- ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state
To clarify, the effect on drivers not exposing this cap is nil right? On Thu, Oct 1, 2015 at 6:24 PM, Marek Olšák wrote: > Ping > > On Tue, Sep 29, 2015 at 2:37 AM, Marek Olšák wrote: >> From: Marek Olšák >> >> Required by ARB_sample_shading for drivers that don't want a shader variant >> in st/mesa. >> --- > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state
On Fri, Oct 2, 2015 at 12:26 AM, Ilia Mirkin wrote: > To clarify, the effect on drivers not exposing this cap is nil right? Yes. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/3] st/mesa: set force_persample_interp if ARB_sample_shading is used
On Mon, Sep 28, 2015 at 8:38 PM, Marek Olšák wrote: > From: Marek Olšák > > This is only a half of the work. The next patch will handle > gl_SampleID/SamplePos, which is the other half of ARB_sample_shading. > --- > src/mesa/state_tracker/st_atom_rasterizer.c | 8 > src/mesa/state_tracker/st_atom_shader.c | 1 + > src/mesa/state_tracker/st_context.c | 2 ++ > src/mesa/state_tracker/st_context.h | 1 + > 4 files changed, 12 insertions(+) > > diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c > b/src/mesa/state_tracker/st_atom_rasterizer.c > index cceed42..3a9c461 100644 > --- a/src/mesa/state_tracker/st_atom_rasterizer.c > +++ b/src/mesa/state_tracker/st_atom_rasterizer.c > @@ -237,6 +237,14 @@ static void update_raster_state( struct st_context *st ) > /* _NEW_MULTISAMPLE */ > raster->multisample = ctx->Multisample._Enabled; > > + /* _NEW_MULTISAMPLE | _NEW_BUFFERS */ > + raster->force_persample_interp = > + st->can_force_persample_interp && > + ctx->Multisample._Enabled && > + ctx->Multisample.SampleShading && > + MAX2(ceil(ctx->Multisample.MinSampleShadingValue * > + ctx->DrawBuffer->Visual.samples), 1) > 1; The max seems unnecessary... as does the ceil. You should just be able to do MinSampleShadingValue * samples > 1. With that fixed, Reviewed-by: Ilia Mirkin > + > /* _NEW_SCISSOR */ > raster->scissor = ctx->Scissor.EnableFlags; > > diff --git a/src/mesa/state_tracker/st_atom_shader.c > b/src/mesa/state_tracker/st_atom_shader.c > index fee15a9..dc03156 100644 > --- a/src/mesa/state_tracker/st_atom_shader.c > +++ b/src/mesa/state_tracker/st_atom_shader.c > @@ -72,6 +72,7 @@ update_fp( struct st_context *st ) > > /* Ignore sample qualifier while computing this flag. */ > key.persample_shading = > + !st->can_force_persample_interp && >_mesa_get_min_invocations_per_fragment(st->ctx, &stfp->Base, true) > 1; > > st->fp_variant = st_get_fp_variant(st, stfp, &key); > diff --git a/src/mesa/state_tracker/st_context.c > b/src/mesa/state_tracker/st_context.c > index 72c23ca..7061191 100644 > --- a/src/mesa/state_tracker/st_context.c > +++ b/src/mesa/state_tracker/st_context.c > @@ -237,6 +237,8 @@ st_create_context_priv( struct gl_context *ctx, struct > pipe_context *pipe, >PIPE_BIND_SAMPLER_VIEW); > st->prefer_blit_based_texture_transfer = screen->get_param(screen, >PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER); > + st->can_force_persample_interp = screen->get_param(screen, > + PIPE_CAP_FORCE_PERSAMPLE_INTERP); > > st->needs_texcoord_semantic = >screen->get_param(screen, PIPE_CAP_TGSI_TEXCOORD); > diff --git a/src/mesa/state_tracker/st_context.h > b/src/mesa/state_tracker/st_context.h > index 81d5480..a4cda29 100644 > --- a/src/mesa/state_tracker/st_context.h > +++ b/src/mesa/state_tracker/st_context.h > @@ -98,6 +98,7 @@ struct st_context > boolean has_etc1; > boolean has_etc2; > boolean prefer_blit_based_texture_transfer; > + boolean can_force_persample_interp; > > boolean needs_texcoord_semantic; > boolean apply_texture_swizzle_to_border_color; > -- > 2.1.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] st/mesa: automatically set per-sample interpolation if using SampleID/Pos
Reviewed-by: Ilia Mirkin On Mon, Sep 28, 2015 at 8:38 PM, Marek Olšák wrote: > From: Marek Olšák > > --- > src/mesa/state_tracker/st_atom_shader.c | 8 +++- > src/mesa/state_tracker/st_program.c | 4 +++- > 2 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/state_tracker/st_atom_shader.c > b/src/mesa/state_tracker/st_atom_shader.c > index dc03156..1e880a1 100644 > --- a/src/mesa/state_tracker/st_atom_shader.c > +++ b/src/mesa/state_tracker/st_atom_shader.c > @@ -70,9 +70,15 @@ update_fp( struct st_context *st ) > key.clamp_color = st->clamp_frag_color_in_shader && > st->ctx->Color._ClampFragmentColor; > > - /* Ignore sample qualifier while computing this flag. */ > + /* Don't set it if the driver can force the interpolation by itself. > +* If SAMPLE_ID or SAMPLE_POS are used, the interpolation is set > +* automatically. > +* Ignore sample qualifier while computing this flag. > +*/ > key.persample_shading = >!st->can_force_persample_interp && > + !(stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID | > +SYSTEM_BIT_SAMPLE_POS)) && >_mesa_get_min_invocations_per_fragment(st->ctx, &stfp->Base, true) > 1; > > st->fp_variant = st_get_fp_variant(st, stfp, &key); > diff --git a/src/mesa/state_tracker/st_program.c > b/src/mesa/state_tracker/st_program.c > index 9c27147..a07f8fe 100644 > --- a/src/mesa/state_tracker/st_program.c > +++ b/src/mesa/state_tracker/st_program.c > @@ -619,7 +619,9 @@ st_translate_fragment_program(struct st_context *st, > else > interpLocation[slot] = TGSI_INTERPOLATE_LOC_CENTER; > > - if (key->persample_shading) > + if (stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID | > + SYSTEM_BIT_SAMPLE_POS) || > + key->persample_shading) > interpLocation[slot] = TGSI_INTERPOLATE_LOC_SAMPLE; > > switch (attr) { > -- > 2.1.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state
Reviewed-by: Ilia Mirkin On Mon, Sep 28, 2015 at 8:37 PM, Marek Olšák wrote: > From: Marek Olšák > > Required by ARB_sample_shading for drivers that don't want a shader variant > in st/mesa. > --- > src/gallium/docs/source/screen.rst | 9 + > src/gallium/drivers/freedreno/freedreno_screen.c | 1 + > src/gallium/drivers/i915/i915_screen.c | 1 + > src/gallium/drivers/ilo/ilo_screen.c | 1 + > src/gallium/drivers/llvmpipe/lp_screen.c | 1 + > src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + > src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + > src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + > src/gallium/drivers/r300/r300_screen.c | 1 + > src/gallium/drivers/r600/r600_pipe.c | 1 + > src/gallium/drivers/radeonsi/si_pipe.c | 1 + > src/gallium/drivers/softpipe/sp_screen.c | 1 + > src/gallium/drivers/svga/svga_screen.c | 1 + > src/gallium/drivers/vc4/vc4_screen.c | 1 + > src/gallium/include/pipe/p_defines.h | 1 + > src/gallium/include/pipe/p_state.h | 1 + > 16 files changed, 24 insertions(+) > > diff --git a/src/gallium/docs/source/screen.rst > b/src/gallium/docs/source/screen.rst > index e780047..e08844b 100644 > --- a/src/gallium/docs/source/screen.rst > +++ b/src/gallium/docs/source/screen.rst > @@ -268,6 +268,15 @@ The integer capabilities: >bounds_max states of pipe_depth_stencil_alpha_state behave according >to the GL_EXT_depth_bounds_test specification. > * ``PIPE_CAP_TGSI_TXQS``: Whether the `TXQS` opcode is supported > +* ``PIPE_CAP_FORCE_PERSAMPLE_INTERP``: If the driver can force per-sample > + interpolation for all fragment shader inputs if > + pipe_rasterizer_state::force_persample_interp is set. This is only used > + by GL3-level sample shading (ARB_sample_shading). GL4-level sample shading > + (ARB_gpu_shader5) doesn't use this. While GL3 hardware has a state for it, > + GL4 hardware will likely need to emulate it with a shader variant, or by > + selecting the interpolation weights with a conditional assignment > + in the shader. > + > > > .. _pipe_capf: > diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c > b/src/gallium/drivers/freedreno/freedreno_screen.c > index 9a684d4..0d01005 100644 > --- a/src/gallium/drivers/freedreno/freedreno_screen.c > +++ b/src/gallium/drivers/freedreno/freedreno_screen.c > @@ -235,6 +235,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum > pipe_cap param) > case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: > case PIPE_CAP_DEPTH_BOUNDS_TEST: > case PIPE_CAP_TGSI_TXQS: > + case PIPE_CAP_FORCE_PERSAMPLE_INTERP: > return 0; > > case PIPE_CAP_MAX_VIEWPORTS: > diff --git a/src/gallium/drivers/i915/i915_screen.c > b/src/gallium/drivers/i915/i915_screen.c > index 51c64ed..9d6b3d3 100644 > --- a/src/gallium/drivers/i915/i915_screen.c > +++ b/src/gallium/drivers/i915/i915_screen.c > @@ -248,6 +248,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap > cap) > case PIPE_CAP_TEXTURE_HALF_FLOAT_LINEAR: > case PIPE_CAP_DEPTH_BOUNDS_TEST: > case PIPE_CAP_TGSI_TXQS: > + case PIPE_CAP_FORCE_PERSAMPLE_INTERP: >return 0; > > case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: > diff --git a/src/gallium/drivers/ilo/ilo_screen.c > b/src/gallium/drivers/ilo/ilo_screen.c > index 9e37e24..76812a6 100644 > --- a/src/gallium/drivers/ilo/ilo_screen.c > +++ b/src/gallium/drivers/ilo/ilo_screen.c > @@ -470,6 +470,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap > param) > case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: > case PIPE_CAP_DEPTH_BOUNDS_TEST: > case PIPE_CAP_TGSI_TXQS: > + case PIPE_CAP_FORCE_PERSAMPLE_INTERP: >return 0; > > case PIPE_CAP_VENDOR_ID: > diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c > b/src/gallium/drivers/llvmpipe/lp_screen.c > index 697e3d9..50c3781 100644 > --- a/src/gallium/drivers/llvmpipe/lp_screen.c > +++ b/src/gallium/drivers/llvmpipe/lp_screen.c > @@ -297,6 +297,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum > pipe_cap param) > case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: > case PIPE_CAP_DEPTH_BOUNDS_TEST: > case PIPE_CAP_TGSI_TXQS: > + case PIPE_CAP_FORCE_PERSAMPLE_INTERP: >return 0; > } > /* should only get here on unhandled cases */ > diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c > b/src/gallium/drivers/nouveau/nv30/nv30_screen.c > index 806d4e6..39267b3 100644 > --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c > +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c > @@ -170,6 +170,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum > pipe_cap param) > case PIPE_CAP_TEXTURE_FLOAT_LINEAR: > case PIPE_CAP_TEXTURE_HALF_FLOAT_LINEAR: > case PIPE_CAP_TGSI_TXQS: > + case PIPE_CAP_FORCE_PERSAMPLE_INTERP: >return 0; > > case PIPE_CAP_VENDOR_ID: > diff -
[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI
https://bugs.freedesktop.org/show_bug.cgi?id=91596 --- Comment #10 from Emil Velikov --- In the few moments that android-x86 was up I've noticed a very interesting commit [1]. In case you don't have a local copy, it enables a workaround #10194508, as seen here [2]. Not sure if the android-x86 version has the extra qcom 'hacks' so it might be worth experimenting on the topic - adding/reverting the qcom commit, reverting the original patch [1], all that whist toggling on/off the extension ;-) [1] http://git.android-x86.org/?p=platform/frameworks/native.git;a=commit;h=39aa8695464e496ec4f6d22e9af54457bcfdf4ae [2] https://github.com/omnirom/android_frameworks_native/blob/HEAD/opengl/libs/EGL/eglApi.cpp#L431 -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample
On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts wrote: > If a non-const sample number is given to interpolateAtSample it will > now generate an indirect send message with the sample ID similar to > how non-const sampler array indexing works. Previously non-const > values were ignored and instead it ended up using a constant 0 value. > > The generator will try to determine if the sample ID is dynamically > uniform via nir_src_is_dynamically_uniform. If not it will query the > pixel interpolator in a loop, once for each possible sample number. > This is necessary because the indirect send message doesn't seem to > have a way to specify a different value for each fragment. Heh, I think this was the solution Curro had in mind a couple of years ago. :) I've Cc'd him. It'd be good for him to review this. > The range of possible sample numbers is determined using > STATE_NUM_SAMPLES. When linking the shader it will now add a reference > to this state if any dynamically non-uniform calls to > interpolateAtSample are found. > > This fixes the following two Piglit tests: > > arb_gpu_shader5-interpolateAtSample-nonconst > arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform > > v2: Handle dynamically non-uniform sample ids. > --- > src/mesa/drivers/dri/i965/brw_eu.h | 2 +- > src/mesa/drivers/dri/i965/brw_eu_emit.c| 34 --- > src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 5 +- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 119 > + > src/mesa/drivers/dri/i965/brw_program.c| 54 +++ > src/mesa/drivers/dri/i965/brw_program.h| 1 + > src/mesa/drivers/dri/i965/brw_shader.cpp | 2 + > 7 files changed, 185 insertions(+), 32 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_eu.h > b/src/mesa/drivers/dri/i965/brw_eu.h > index 761aa0e..0ac1ad9 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu.h > +++ b/src/mesa/drivers/dri/i965/brw_eu.h > @@ -461,7 +461,7 @@ brw_pixel_interpolator_query(struct brw_codegen *p, > struct brw_reg mrf, > bool noperspective, > unsigned mode, > - unsigned data, > + struct brw_reg data, > unsigned msg_length, > unsigned response_length); > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > index 4d39762..25524d4 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > @@ -3192,26 +3192,38 @@ brw_pixel_interpolator_query(struct brw_codegen *p, > struct brw_reg mrf, > bool noperspective, > unsigned mode, > - unsigned data, > + struct brw_reg data, > unsigned msg_length, > unsigned response_length) > { > const struct brw_device_info *devinfo = p->devinfo; > - struct brw_inst *insn = next_insn(p, BRW_OPCODE_SEND); > + struct brw_inst *insn; > + uint16_t exec_size; > > - brw_set_dest(p, insn, dest); > - brw_set_src0(p, insn, mrf); > - brw_set_message_descriptor(p, insn, GEN7_SFID_PIXEL_INTERPOLATOR, > - msg_length, response_length, > - false /* header is never present for PI */, > - false); > + if (data.file == BRW_IMMEDIATE_VALUE) { > + insn = next_insn(p, BRW_OPCODE_SEND); > + brw_set_dest(p, insn, dest); > + brw_set_src0(p, insn, mrf); > + brw_set_message_descriptor(p, insn, GEN7_SFID_PIXEL_INTERPOLATOR, > + msg_length, response_length, > + false /* header is never present for PI */, > + false); > + brw_inst_set_pi_message_data(devinfo, insn, data.dw1.ud); > + } else { > + insn = brw_send_indirect_message(p, > + GEN7_SFID_PIXEL_INTERPOLATOR, > + dest, > + mrf, > + vec1(data)); > + brw_inst_set_mlen(devinfo, insn, msg_length); > + brw_inst_set_rlen(devinfo, insn, response_length); > + } > > - brw_inst_set_pi_simd_mode( > - devinfo, insn, brw_inst_exec_size(devinfo, insn) == BRW_EXECUTE_16); > + exec_size = brw_inst_exec_size(devinfo, p->current); > + brw_inst_set_pi_simd_mode(devinfo, insn, exec_size == BRW_EXECUTE_16); > brw_inst_set_pi_slot_group(devinfo, insn, 0); /* zero unless 32/64px > dispatch */ > brw_inst_set_pi_nopersp(devinfo, insn, noperspective); > brw_inst_set_pi_message_type(devinfo, insn, mode); > - brw_inst_set_pi_message_data(devinfo, insn, data); > } >
[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit
https://bugs.freedesktop.org/show_bug.cgi?id=92221 --- Comment #3 from Nanley Chery --- The code that was added was previously removed with this refactoring commit: 99b1f4751f97631011b64fabcb57acf6beae01ac -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: avoid leaking closure when iterating over a string_to_uint_map
Signed-off-by: Ilia Mirkin --- src/mesa/program/hash_table.h | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h index e85a836..d0a2abf 100644 --- a/src/mesa/program/hash_table.h +++ b/src/mesa/program/hash_table.h @@ -249,6 +249,7 @@ public: wrapper->closure = closure; hash_table_call_foreach(this->ht, subtract_one_wrapper, wrapper); + free(wrapper); } /** -- 2.4.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: avoid leaking hiddenUniforms map when there are no uniforms
Signed-off-by: Ilia Mirkin --- src/glsl/link_uniforms.cpp | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp index 47d49c8..740b0a4 100644 --- a/src/glsl/link_uniforms.cpp +++ b/src/glsl/link_uniforms.cpp @@ -1131,15 +1131,15 @@ link_assign_uniform_locations(struct gl_shader_program *prog, const unsigned num_data_slots = uniform_size.num_values; const unsigned hidden_uniforms = uniform_size.num_hidden_uniforms; + /* assign hidden uniforms a slot id */ + hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size); + delete hiddenUniforms; + /* On the outside chance that there were no uniforms, bail out. */ if (num_uniforms == 0) return; - /* assign hidden uniforms a slot id */ - hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size); - delete hiddenUniforms; - struct gl_uniform_storage *uniforms = rzalloc_array(prog, struct gl_uniform_storage, num_uniforms); union gl_constant_value *data = -- 2.4.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: don't forget to free image_param on prog_data free
Signed-off-by: Ilia Mirkin --- src/mesa/drivers/dri/i965/brw_program.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c index fee96a8..0e4b823 100644 --- a/src/mesa/drivers/dri/i965/brw_program.c +++ b/src/mesa/drivers/dri/i965/brw_program.c @@ -551,6 +551,7 @@ brw_stage_prog_data_free(const void *p) ralloc_free(prog_data->param); ralloc_free(prog_data->pull_param); + ralloc_free(prog_data->image_param); } void -- 2.4.9 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state
Can't say I'm a big fan of doing essentially the same thing with different methods, but well it's coming from GL, and if it helps some drivers... Acked-by: Roland Scheidegger Am 02.10.2015 um 00:24 schrieb Marek Olšák: > Ping > > On Tue, Sep 29, 2015 at 2:37 AM, Marek Olšák wrote: >> From: Marek Olšák >> >> Required by ARB_sample_shading for drivers that don't want a shader variant >> in st/mesa. >> --- > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=BQIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_I&m=GgTDbmCyFPdfRfqaUOiQKcxwA2fX_L2ryeWXZyYNZP4&s=G7zIQcl0J49zTLP3pzsgtFiRZCCYkqUpno7eiJdOXAg&e= > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/13] i965/backend_shader: Add a field to store the NIR shader
--- src/mesa/drivers/dri/i965/brw_fs.h | 8 +++ src/mesa/drivers/dri/i965/brw_fs_nir.cpp| 32 - src/mesa/drivers/dri/i965/brw_shader.cpp| 1 + src/mesa/drivers/dri/i965/brw_shader.h | 1 + src/mesa/drivers/dri/i965/brw_vec4.h| 6 ++--- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 6 ++--- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 2 +- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 24 +-- 8 files changed, 39 insertions(+), 41 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 3ab01c7..b932ed2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -236,10 +236,10 @@ public: uint32_t spill_offset, int count); void emit_nir_code(); - void nir_setup_inputs(nir_shader *shader); - void nir_setup_outputs(nir_shader *shader); - void nir_setup_uniforms(nir_shader *shader); - void nir_emit_system_values(nir_shader *shader); + void nir_setup_inputs(); + void nir_setup_outputs(); + void nir_setup_uniforms(); + void nir_emit_system_values(); void nir_emit_impl(nir_function_impl *impl); void nir_emit_cf_list(exec_list *list); void nir_emit_if(nir_if *if_stmt); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index d70c672..3793791 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -37,15 +37,13 @@ using namespace brw::surface_access; void fs_visitor::emit_nir_code() { - nir_shader *nir = prog->nir; - /* emit the arrays used for inputs and outputs - load/store intrinsics will * be converted to reads/writes of these arrays */ - nir_setup_inputs(nir); - nir_setup_outputs(nir); - nir_setup_uniforms(nir); - nir_emit_system_values(nir); + nir_setup_inputs(); + nir_setup_outputs(); + nir_setup_uniforms(); + nir_emit_system_values(); /* get the main function and emit it */ nir_foreach_overload(nir, overload) { @@ -56,11 +54,11 @@ fs_visitor::emit_nir_code() } void -fs_visitor::nir_setup_inputs(nir_shader *shader) +fs_visitor::nir_setup_inputs() { - nir_inputs = bld.vgrf(BRW_REGISTER_TYPE_F, shader->num_inputs); + nir_inputs = bld.vgrf(BRW_REGISTER_TYPE_F, nir->num_inputs); - foreach_list_typed(nir_variable, var, node, &shader->inputs) { + foreach_list_typed(nir_variable, var, node, &nir->inputs) { enum brw_reg_type type = brw_type_for_base_type(var->type); fs_reg input = offset(nir_inputs, bld, var->data.driver_location); @@ -118,13 +116,13 @@ fs_visitor::nir_setup_inputs(nir_shader *shader) } void -fs_visitor::nir_setup_outputs(nir_shader *shader) +fs_visitor::nir_setup_outputs() { brw_wm_prog_key *key = (brw_wm_prog_key*) this->key; - nir_outputs = bld.vgrf(BRW_REGISTER_TYPE_F, shader->num_outputs); + nir_outputs = bld.vgrf(BRW_REGISTER_TYPE_F, nir->num_outputs); - foreach_list_typed(nir_variable, var, node, &shader->outputs) { + foreach_list_typed(nir_variable, var, node, &nir->outputs) { fs_reg reg = offset(nir_outputs, bld, var->data.driver_location); int vector_elements = @@ -175,14 +173,14 @@ fs_visitor::nir_setup_outputs(nir_shader *shader) } void -fs_visitor::nir_setup_uniforms(nir_shader *shader) +fs_visitor::nir_setup_uniforms() { if (dispatch_width != 8) return; - uniforms = shader->num_uniforms; + uniforms = nir->num_uniforms; - foreach_list_typed(nir_variable, var, node, &shader->uniforms) { + foreach_list_typed(nir_variable, var, node, &nir->uniforms) { /* UBO's and atomics don't take up space in the uniform file */ if (var->interface_type != NULL || var->type->contains_atomic()) continue; @@ -274,10 +272,10 @@ emit_system_values_block(nir_block *block, void *void_visitor) } void -fs_visitor::nir_emit_system_values(nir_shader *shader) +fs_visitor::nir_emit_system_values() { nir_system_values = ralloc_array(mem_ctx, fs_reg, SYSTEM_VALUE_MAX); - nir_foreach_overload(shader, overload) { + nir_foreach_overload(nir, overload) { assert(strcmp(overload->function->name, "main") == 0); assert(overload->impl); nir_foreach_block(overload->impl, emit_system_values_block, this); diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 7c6043d..09a9859 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -913,6 +913,7 @@ backend_shader::backend_shader(const struct brw_compiler *compiler, : compiler(compiler), log_data(log_data), devinfo(compiler->devinfo), + nir(prog->nir), shader(shader_prog ? (struct brw_shader *)shader_prog->_LinkedShaders[stage] : NULL), shader_prog(shader_prog), diff --git a/src/mesa/drivers/dri/i965/brw_shader.h
[Mesa-dev] [PATCH 08/13] i965/cs: Remove the prog argument from local_id_payload_dwords
--- src/mesa/drivers/dri/i965/brw_cs.h| 3 +-- src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +- src/mesa/drivers/dri/i965/gen7_cs_state.c | 7 +++ 3 files changed, 5 insertions(+), 7 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_cs.h b/src/mesa/drivers/dri/i965/brw_cs.h index 0018c04..0c0ed2b 100644 --- a/src/mesa/drivers/dri/i965/brw_cs.h +++ b/src/mesa/drivers/dri/i965/brw_cs.h @@ -49,8 +49,7 @@ brw_cs_emit(struct brw_context *brw, unsigned *final_assembly_size); unsigned -brw_cs_prog_local_id_payload_dwords(const struct gl_program *prog, -unsigned dispatch_width); +brw_cs_prog_local_id_payload_dwords(unsigned dispatch_width); #ifdef __cplusplus } diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index a4865b2..699d3d0 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -4727,7 +4727,7 @@ fs_visitor::setup_cs_payload() if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) { const unsigned local_id_dwords = - brw_cs_prog_local_id_payload_dwords(prog, dispatch_width); + brw_cs_prog_local_id_payload_dwords(dispatch_width); assert((local_id_dwords & 0x7) == 0); const unsigned local_id_regs = local_id_dwords / 8; payload.local_invocation_id_reg = payload.num_regs; diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c b/src/mesa/drivers/dri/i965/gen7_cs_state.c index 0b88b2c..5edc4fc 100644 --- a/src/mesa/drivers/dri/i965/gen7_cs_state.c +++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c @@ -72,7 +72,7 @@ brw_upload_cs_state(struct brw_context *brw) if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) { local_id_dwords = - brw_cs_prog_local_id_payload_dwords(prog, cs_prog_data->simd_size); + brw_cs_prog_local_id_payload_dwords(cs_prog_data->simd_size); } unsigned push_constant_data_size = @@ -216,8 +216,7 @@ const struct brw_tracked_state brw_cs_state = { * */ unsigned -brw_cs_prog_local_id_payload_dwords(const struct gl_program *prog, -unsigned dispatch_width) +brw_cs_prog_local_id_payload_dwords(unsigned dispatch_width) { return 3 * dispatch_width; } @@ -272,7 +271,7 @@ brw_upload_cs_push_constants(struct brw_context *brw, if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) { local_id_dwords = - brw_cs_prog_local_id_payload_dwords(prog, cs_prog_data->simd_size); + brw_cs_prog_local_id_payload_dwords(cs_prog_data->simd_size); } /* Updates the ParamaterValues[i] pointers for all parameters of the -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/13] i965: Move binding table setup to codegen time.
Setting up binding tables really has little to do with the actual process of turning shaders into instructions; it's more part of setting up prog_data. This commit moves it out of the visitors and with the rest of the prog_data setup stuff. --- src/mesa/drivers/dri/i965/brw_cs.c| 20 + src/mesa/drivers/dri/i965/brw_fs.cpp | 42 --- src/mesa/drivers/dri/i965/brw_fs.h| 2 -- src/mesa/drivers/dri/i965/brw_gs.c| 19 src/mesa/drivers/dri/i965/brw_vec4.cpp| 9 -- src/mesa/drivers/dri/i965/brw_vec4.h | 1 - src/mesa/drivers/dri/i965/brw_vs.c| 5 src/mesa/drivers/dri/i965/brw_wm.c| 23 +++ src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp | 11 --- src/mesa/drivers/dri/i965/gen6_gs_visitor.h | 1 - 10 files changed, 67 insertions(+), 66 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_cs.c b/src/mesa/drivers/dri/i965/brw_cs.c index 24120fb..1857962 100644 --- a/src/mesa/drivers/dri/i965/brw_cs.c +++ b/src/mesa/drivers/dri/i965/brw_cs.c @@ -32,6 +32,23 @@ #include "intel_batchbuffer.h" #include "glsl/nir/nir.h" +static void +assign_cs_binding_table_offsets(const struct brw_device_info *devinfo, +const struct gl_shader_program *shader_prog, +const struct gl_program *prog, +struct brw_cs_prog_data *prog_data) +{ + uint32_t next_binding_table_offset = 0; + + /* May not be used if the gl_NumWorkGroups variable is not accessed. */ + prog_data->binding_table.work_groups_start = next_binding_table_offset; + next_binding_table_offset++; + + brw_assign_common_binding_table_offsets(MESA_SHADER_COMPUTE, devinfo, + shader_prog, prog, &prog_data->base, + next_binding_table_offset); +} + static bool brw_codegen_cs_prog(struct brw_context *brw, struct gl_shader_program *prog, @@ -52,6 +69,9 @@ brw_codegen_cs_prog(struct brw_context *brw, memset(&prog_data, 0, sizeof(prog_data)); + assign_cs_binding_table_offsets(brw->intelScreen->devinfo, prog, + &cp->program.Base, &prog_data); + /* Allocate the references to the uniforms that will end up in the * prog_data associated with the compiled program, and which will be freed * by the state cache. diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index e90151b..a4865b2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -4736,41 +4736,6 @@ fs_visitor::setup_cs_payload() } void -fs_visitor::assign_fs_binding_table_offsets() -{ - assert(stage == MESA_SHADER_FRAGMENT); - brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this->prog_data; - brw_wm_prog_key *key = (brw_wm_prog_key*) this->key; - uint32_t next_binding_table_offset = 0; - - /* If there are no color regions, we still perform an FB write to a null -* renderbuffer, which we place at surface index 0. -*/ - prog_data->binding_table.render_target_start = next_binding_table_offset; - next_binding_table_offset += MAX2(key->nr_color_regions, 1); - - brw_assign_common_binding_table_offsets(MESA_SHADER_FRAGMENT, devinfo, - shader_prog, prog, stage_prog_data, - next_binding_table_offset); -} - -void -fs_visitor::assign_cs_binding_table_offsets() -{ - assert(stage == MESA_SHADER_COMPUTE); - brw_cs_prog_data *prog_data = (brw_cs_prog_data*) this->prog_data; - uint32_t next_binding_table_offset = 0; - - /* May not be used if the gl_NumWorkGroups variable is not accessed. */ - prog_data->binding_table.work_groups_start = next_binding_table_offset; - next_binding_table_offset++; - - brw_assign_common_binding_table_offsets(MESA_SHADER_COMPUTE, devinfo, - shader_prog, prog, stage_prog_data, - next_binding_table_offset); -} - -void fs_visitor::calculate_register_pressure() { invalidate_live_intervals(); @@ -4983,9 +4948,6 @@ fs_visitor::run_vs(gl_clip_plane *clip_planes) { assert(stage == MESA_SHADER_VERTEX); - brw_assign_common_binding_table_offsets(MESA_SHADER_VERTEX, devinfo, - shader_prog, prog, stage_prog_data, - 0); setup_vs_payload(); if (shader_time_index >= 0) @@ -5026,8 +4988,6 @@ fs_visitor::run_fs(bool do_rep_send) sanity_param_count = prog->Parameters->NumParameters; - assign_fs_binding_table_offsets(); - if (devinfo->gen >= 6) setup_payload_gen6(); else @@ -5114,8 +5074,6 @@ fs_visitor::run_cs() sanity_param_count = prog->Parameters->NumParameters; - assign_cs_bind
[Mesa-dev] [PATCH 01/13] i965/shader: Pull assign_common_binding_table_offsets out of backend_shader
This really has nothing to do with the backend compiler and we'd like to eventually be able to set this up earlier in the compile process. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 12 +--- src/mesa/drivers/dri/i965/brw_shader.cpp | 17 + src/mesa/drivers/dri/i965/brw_shader.h| 11 +-- src/mesa/drivers/dri/i965/brw_vec4.cpp| 3 ++- src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp | 4 +++- 5 files changed, 36 insertions(+), 11 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index b062219..e90151b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -4749,7 +4749,9 @@ fs_visitor::assign_fs_binding_table_offsets() prog_data->binding_table.render_target_start = next_binding_table_offset; next_binding_table_offset += MAX2(key->nr_color_regions, 1); - assign_common_binding_table_offsets(next_binding_table_offset); + brw_assign_common_binding_table_offsets(MESA_SHADER_FRAGMENT, devinfo, + shader_prog, prog, stage_prog_data, + next_binding_table_offset); } void @@ -4763,7 +4765,9 @@ fs_visitor::assign_cs_binding_table_offsets() prog_data->binding_table.work_groups_start = next_binding_table_offset; next_binding_table_offset++; - assign_common_binding_table_offsets(next_binding_table_offset); + brw_assign_common_binding_table_offsets(MESA_SHADER_COMPUTE, devinfo, + shader_prog, prog, stage_prog_data, + next_binding_table_offset); } void @@ -4979,7 +4983,9 @@ fs_visitor::run_vs(gl_clip_plane *clip_planes) { assert(stage == MESA_SHADER_VERTEX); - assign_common_binding_table_offsets(0); + brw_assign_common_binding_table_offsets(MESA_SHADER_VERTEX, devinfo, + shader_prog, prog, stage_prog_data, + 0); setup_vs_payload(); if (shader_time_index >= 0) diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 1d184a7..7c6043d 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -1363,16 +1363,25 @@ backend_shader::invalidate_cfg() * trigger some of our asserts that surface indices are < BRW_MAX_SURFACES. */ void -backend_shader::assign_common_binding_table_offsets(uint32_t next_binding_table_offset) +brw_assign_common_binding_table_offsets(gl_shader_stage stage, +const struct brw_device_info *devinfo, +const struct gl_shader_program *shader_prog, +const struct gl_program *prog, +struct brw_stage_prog_data *stage_prog_data, +uint32_t next_binding_table_offset) { + const struct gl_shader *shader = NULL; int num_textures = _mesa_fls(prog->SamplersUsed); + if (shader_prog) + shader = shader_prog->_LinkedShaders[stage]; + stage_prog_data->binding_table.texture_start = next_binding_table_offset; next_binding_table_offset += num_textures; if (shader) { stage_prog_data->binding_table.ubo_start = next_binding_table_offset; - next_binding_table_offset += shader->base.NumUniformBlocks; + next_binding_table_offset += shader->NumUniformBlocks; } else { stage_prog_data->binding_table.ubo_start = 0xd0d0d0d0; } @@ -1403,9 +1412,9 @@ backend_shader::assign_common_binding_table_offsets(uint32_t next_binding_table_ stage_prog_data->binding_table.abo_start = 0xd0d0d0d0; } - if (shader && shader->base.NumImages) { + if (shader && shader->NumImages) { stage_prog_data->binding_table.image_start = next_binding_table_offset; - next_binding_table_offset += shader->base.NumImages; + next_binding_table_offset += shader->NumImages; } else { stage_prog_data->binding_table.image_start = 0xd0d0d0d0; } diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h index eeb3306..9690332 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.h +++ b/src/mesa/drivers/dri/i965/brw_shader.h @@ -24,6 +24,7 @@ #include #include "brw_reg.h" #include "brw_defines.h" +#include "brw_context.h" #include "main/compiler.h" #include "glsl/ir.h" #include "program/prog_parameter.h" @@ -266,8 +267,6 @@ public: void calculate_cfg(); void invalidate_cfg(); - void assign_common_binding_table_offsets(uint32_t next_binding_table_offset); - virtual void invalidate_live_intervals() = 0; }; @@ -295,6 +294,14 @@ extern "C" { struct brw_compiler * brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo); +void +brw_assign_common_binding_t
[Mesa-dev] [PATCH 03/13] i965: Move prog_data uniform setup to the codegen level
As of now, uniform setup is more-or-less unified between vec4 and fs and no longer requires the fs_visitor. This makes uniform setup more of a language/API thing than a backend compiler thing. This commit moves setting up the stage_prog_data.params arrays to the same place as we set up the rest of stage_prog_data. --- src/mesa/drivers/dri/i965/brw_cs.c | 5 - src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 7 --- src/mesa/drivers/dri/i965/brw_gs.c | 5 - src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 7 --- src/mesa/drivers/dri/i965/brw_vs.c | 9 + src/mesa/drivers/dri/i965/brw_wm.c | 10 +- 6 files changed, 26 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_cs.c b/src/mesa/drivers/dri/i965/brw_cs.c index 1857962..6b64030 100644 --- a/src/mesa/drivers/dri/i965/brw_cs.c +++ b/src/mesa/drivers/dri/i965/brw_cs.c @@ -30,7 +30,7 @@ #include "intel_mipmap_tree.h" #include "brw_state.h" #include "intel_batchbuffer.h" -#include "glsl/nir/nir.h" +#include "brw_nir.h" static void assign_cs_binding_table_offsets(const struct brw_device_info *devinfo, @@ -89,6 +89,9 @@ brw_codegen_cs_prog(struct brw_context *brw, prog_data.base.nr_params = param_count; prog_data.base.nr_image_params = cs->base.NumImages; + brw_nir_setup_glsl_uniforms(cp->program.Base.nir, prog, &cp->program.Base, + &prog_data.base, true); + if (unlikely(brw->perf_debug)) { start_busy = (brw->batch.last_bo && drm_intel_bo_busy(brw->batch.last_bo)); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index eb0fe7b..d70c672 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -182,13 +182,6 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader) uniforms = shader->num_uniforms; - if (shader_prog) { - brw_nir_setup_glsl_uniforms(shader, shader_prog, prog, - stage_prog_data, true); - } else { - brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data); - } - foreach_list_typed(nir_variable, var, node, &shader->uniforms) { /* UBO's and atomics don't take up space in the uniform file */ if (var->interface_type != NULL || var->type->contains_atomic()) diff --git a/src/mesa/drivers/dri/i965/brw_gs.c b/src/mesa/drivers/dri/i965/brw_gs.c index cfed35c..f6b9874 100644 --- a/src/mesa/drivers/dri/i965/brw_gs.c +++ b/src/mesa/drivers/dri/i965/brw_gs.c @@ -32,7 +32,7 @@ #include "brw_vec4_gs_visitor.h" #include "brw_state.h" #include "brw_ff_gs.h" -#include "glsl/nir/nir.h" +#include "brw_nir.h" static void assign_gs_binding_table_offsets(const struct brw_device_info *devinfo, @@ -91,6 +91,9 @@ brw_codegen_gs_prog(struct brw_context *brw, c.prog_data.base.base.nr_params = param_count; c.prog_data.base.base.nr_image_params = gs->NumImages; + brw_nir_setup_glsl_uniforms(gp->program.Base.nir, prog, &gp->program.Base, + &c.prog_data.base.base, false); + if (brw->gen >= 8) { c.prog_data.static_vertex_count = !gp->program.Base.nir ? -1 : nir_gs_count_vertices(gp->program.Base.nir); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 8274d48..41667eb 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -136,13 +136,6 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader) { uniforms = shader->num_uniforms; - if (shader_prog) { - brw_nir_setup_glsl_uniforms(shader, shader_prog, prog, - stage_prog_data, false); - } else { - brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data); - } - foreach_list_typed(nir_variable, var, node, &shader->uniforms) { /* UBO's and atomics don't take up space in the uniform file */ if (var->interface_type != NULL || var->type->contains_atomic()) diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 8607825..3c6ee0a 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -135,6 +135,15 @@ brw_codegen_vs_prog(struct brw_context *brw, stage_prog_data->nr_image_params); stage_prog_data->nr_params = param_count; + if (prog) { + brw_nir_setup_glsl_uniforms(vp->program.Base.nir, prog, &vp->program.Base, + &prog_data.base.base, + brw->intelScreen->compiler->scalar_vs); + } else { + brw_nir_setup_arb_uniforms(vp->program.Base.nir, &vp->program.Base, + &prog_data.base.base); + } + GLbitfield64 outputs_written = vp->program.Base.OutputsWritten; prog_data.inputs_read = vp->program.Base.InputsRead; diff --git a/src/mesa/drivers/dri/i965/brw_wm.
[Mesa-dev] [PATCH 06/13] nir: Move GS data to nir_shader_info
--- src/glsl/nir/glsl_to_nir.cpp | 4 ++-- src/glsl/nir/nir.c | 3 --- src/glsl/nir/nir.h | 16 src/glsl/nir/nir_lower_gs_intrinsics.c | 2 +- 4 files changed, 11 insertions(+), 14 deletions(-) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index 6820962..4dd6287 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -150,8 +150,6 @@ glsl_to_nir(const struct gl_shader_program *shader_prog, if (sh->Program->SamplersUsed & (1 << i)) num_textures = i; - shader->gs.vertices_out = sh->Geom.VerticesOut; - shader->gs.invocations = sh->Geom.Invocations; shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name); shader->info.num_textures = num_textures; shader->info.num_ubos = sh->NumUniformBlocks; @@ -164,6 +162,8 @@ glsl_to_nir(const struct gl_shader_program *shader_prog, shader->info.uses_texture_gather = sh->Program->UsesGather; shader->info.uses_clip_distance_out = sh->Program->UsesClipDistanceOut; shader->info.separate_shader = shader_prog->SeparateShader; + shader->info.gs.vertices_out = sh->Geom.VerticesOut; + shader->info.gs.invocations = sh->Geom.Invocations; return shader; } diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c index 7a469a6..e12da80 100644 --- a/src/glsl/nir/nir.c +++ b/src/glsl/nir/nir.c @@ -55,9 +55,6 @@ nir_shader_create(void *mem_ctx, shader->stage = stage; - shader->gs.vertices_out = 0; - shader->gs.invocations = 0; - return shader; } diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 320ffee..7bc6934 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1483,6 +1483,14 @@ typedef struct nir_shader_info { /* Whether or not separate shader objects were used */ bool separate_shader; + + struct { + /** The maximum number of vertices the geometry shader might write. */ + unsigned vertices_out; + + /** 1 .. MAX_GEOMETRY_SHADER_INVOCATIONS */ + unsigned invocations; + } gs; } nir_shader_info; typedef struct nir_shader { @@ -1527,14 +1535,6 @@ typedef struct nir_shader { /** The shader stage, such as MESA_SHADER_VERTEX. */ gl_shader_stage stage; - - struct { - /** The maximum number of vertices the geometry shader might write. */ - unsigned vertices_out; - - /** 1 .. MAX_GEOMETRY_SHADER_INVOCATIONS */ - unsigned invocations; - } gs; } nir_shader; #define nir_foreach_overload(shader, overload)\ diff --git a/src/glsl/nir/nir_lower_gs_intrinsics.c b/src/glsl/nir/nir_lower_gs_intrinsics.c index 2ee4e5c..e0d0678 100644 --- a/src/glsl/nir/nir_lower_gs_intrinsics.c +++ b/src/glsl/nir/nir_lower_gs_intrinsics.c @@ -76,7 +76,7 @@ rewrite_emit_vertex(nir_intrinsic_instr *intrin, struct state *state) b->cursor = nir_before_instr(&intrin->instr); nir_ssa_def *count = nir_load_var(b, state->vertex_count_var); - nir_ssa_def *max_vertices = nir_imm_int(b, b->shader->gs.vertices_out); + nir_ssa_def *max_vertices = nir_imm_int(b, b->shader->info.gs.vertices_out); /* Create: if (vertex_count < max_vertices) and insert it. * -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/13] nir: Add a a nir_shader_info struct
This commit also adds code to glsl_to_nir and prog_to_nir to fill it out. --- src/glsl/nir/glsl_to_nir.cpp | 18 ++ src/glsl/nir/nir.c | 1 + src/glsl/nir/nir.h | 34 ++ src/mesa/program/prog_to_nir.c | 13 + 4 files changed, 66 insertions(+) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index ba08e17..6820962 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -144,8 +144,26 @@ glsl_to_nir(const struct gl_shader_program *shader_prog, nir_lower_outputs_to_temporaries(shader); + /* TODO: Use _mesa_fls instead */ + unsigned num_textures = 0; + for (unsigned i = 0; i < 8 * sizeof(sh->Program->SamplersUsed); i++) + if (sh->Program->SamplersUsed & (1 << i)) + num_textures = i; + shader->gs.vertices_out = sh->Geom.VerticesOut; shader->gs.invocations = sh->Geom.Invocations; + shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name); + shader->info.num_textures = num_textures; + shader->info.num_ubos = sh->NumUniformBlocks; + shader->info.num_abos = shader_prog->NumAtomicBuffers; + shader->info.num_ssbos = shader_prog->NumBufferInterfaceBlocks; + shader->info.num_images = sh->NumImages; + shader->info.inputs_read = sh->Program->InputsRead; + shader->info.outputs_written = sh->Program->OutputsWritten; + shader->info.system_values_read = sh->Program->SystemValuesRead; + shader->info.uses_texture_gather = sh->Program->UsesGather; + shader->info.uses_clip_distance_out = sh->Program->UsesClipDistanceOut; + shader->info.separate_shader = shader_prog->SeparateShader; return shader; } diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c index fe10b38..7a469a6 100644 --- a/src/glsl/nir/nir.c +++ b/src/glsl/nir/nir.c @@ -41,6 +41,7 @@ nir_shader_create(void *mem_ctx, exec_list_make_empty(&shader->outputs); shader->options = options; + memset(&shader->info, 0, sizeof(shader->info)); exec_list_make_empty(&shader->functions); exec_list_make_empty(&shader->registers); diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index d0c7b04..320ffee 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1454,6 +1454,37 @@ typedef struct nir_shader_compiler_options { bool native_integers; } nir_shader_compiler_options; +typedef struct nir_shader_info { + const char *name; + + /* Number of textures used by this shader */ + unsigned num_textures; + /* Number of uniform buffers used by this shader */ + unsigned num_ubos; + /* Number of atomic buffers used by this shader */ + unsigned num_abos; + /* Number of shader storage buffers used by this shader */ + unsigned num_ssbos; + /* Number of images used by this shader */ + unsigned num_images; + + /* Which inputs are actually read */ + uint64_t inputs_read; + /* Which outputs are actually written */ + uint64_t outputs_written; + /* Which system values are actually read */ + uint64_t system_values_read; + + /* Whether or not this shader ever uses textureGather() */ + bool uses_texture_gather; + + /* Whether or not this shader uses the gl_ClipDistance output */ + bool uses_clip_distance_out; + + /* Whether or not separate shader objects were used */ + bool separate_shader; +} nir_shader_info; + typedef struct nir_shader { /** list of uniforms (nir_variable) */ struct exec_list uniforms; @@ -1471,6 +1502,9 @@ typedef struct nir_shader { */ const struct nir_shader_compiler_options *options; + /** Various bits of compile-time information about a given shader */ + struct nir_shader_info info; + /** list of global variables in the shader (nir_variable) */ struct exec_list globals; diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c index 1bd735a..fc00534 100644 --- a/src/mesa/program/prog_to_nir.c +++ b/src/mesa/program/prog_to_nir.c @@ -1122,6 +1122,19 @@ prog_to_nir(const struct gl_program *prog, ptn_add_output_stores(c); + s->info.name = ralloc_asprintf(s, "ARB%d", prog->Id); + s->info.num_textures = _mesa_fls(prog->SamplersUsed); + s->info.num_ubos = 0; + s->info.num_abos = 0; + s->info.num_ssbos = 0; + s->info.num_images = 0; + s->info.inputs_read = prog->InputsRead; + s->info.outputs_written = prog->OutputsWritten; + s->info.system_values_read = prog->SystemValuesRead; + s->info.uses_texture_gather = false; + s->info.uses_clip_distance_out = false; + s->info.separate_shader = false; + fail: if (c->error) { ralloc_free(s); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/13] i965/vec4: Use nir info instead of pulling things out of [shader_]prog
--- src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 src/mesa/drivers/dri/i965/brw_wm_iz.cpp| 3 +-- 3 files changed, 9 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 9c90726..a8cdd19 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1839,8 +1839,8 @@ vec4_visitor::run() \ if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) { \ char filename[64];\ - snprintf(filename, 64, "%s-%04d-%02d-%02d-" #pass,\ - stage_abbrev, shader_prog ? shader_prog->Name : 0, iteration, pass_num); \ + snprintf(filename, 64, "%s-%s-%02d-%02d-" #pass, \ + stage_abbrev, nir->info.name, iteration, pass_num); \ \ backend_shader::dump_instructions(filename); \ }\ @@ -1852,8 +1852,8 @@ vec4_visitor::run() if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) { char filename[64]; - snprintf(filename, 64, "%s-%04d-00-start", - stage_abbrev, shader_prog ? shader_prog->Name : 0); + snprintf(filename, 64, "%s-%s-00-start", + stage_abbrev, nir->info.name); backend_shader::dump_instructions(filename); } diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp index 8da8e2c..b25058c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp @@ -470,7 +470,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) brw_mark_surface_used(&prog_data->base, prog_data->base.binding_table.ubo_start + - shader_prog->NumBufferInterfaceBlocks - 1); + nir->info.num_ssbos - 1); } /* Offset */ @@ -617,7 +617,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) */ brw_mark_surface_used(&prog_data->base, prog_data->base.binding_table.ubo_start + - shader_prog->NumBufferInterfaceBlocks - 1); + nir->info.num_ssbos - 1); } src_reg offset_reg = src_reg(this, glsl_type::uint_type); @@ -767,7 +767,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr) */ brw_mark_surface_used(&prog_data->base, prog_data->base.binding_table.ubo_start + - shader_prog->NumBufferInterfaceBlocks - 1); + nir->info.num_ssbos - 1); } unsigned const_offset = instr->const_index[0]; @@ -837,7 +837,7 @@ vec4_visitor::nir_emit_ssbo_atomic(int op, nir_intrinsic_instr *instr) */ brw_mark_surface_used(&prog_data->base, prog_data->base.binding_table.ubo_start + -shader_prog->NumBufferInterfaceBlocks - 1); +nir->info.num_ssbos - 1); } src_reg offset = get_nir_src(instr->src[1], 1); diff --git a/src/mesa/drivers/dri/i965/brw_wm_iz.cpp b/src/mesa/drivers/dri/i965/brw_wm_iz.cpp index 14930eb..6f22f29 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_iz.cpp +++ b/src/mesa/drivers/dri/i965/brw_wm_iz.cpp @@ -124,12 +124,11 @@ void fs_visitor::setup_payload_gen4() { assert(stage == MESA_SHADER_FRAGMENT); brw_wm_prog_key *key = (brw_wm_prog_key*) this->key; - gl_fragment_program *fp = (gl_fragment_program*) prog; GLuint reg = 2; bool kill_stats_promoted_workaround = false; int lookup = key->iz_lookup; bool uses_depth = - (fp->Base.InputsRead & (1 << VARYING_SLOT_POS)) != 0; + (nir->info.inputs_read & (1 << VARYING_SLOT_POS)) != 0; assert(lookup < IZ_BIT_MAX); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 04/13] nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader
--- src/glsl/nir/glsl_to_nir.cpp| 8 ++-- src/glsl/nir/glsl_to_nir.h | 3 ++- src/mesa/drivers/dri/i965/brw_nir.c | 3 +-- 3 files changed, 9 insertions(+), 5 deletions(-) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index f874582..ba08e17 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -129,9 +129,13 @@ private: }; /* end of anonymous namespace */ nir_shader * -glsl_to_nir(struct gl_shader *sh, const nir_shader_compiler_options *options) +glsl_to_nir(const struct gl_shader_program *shader_prog, +gl_shader_stage stage, +const nir_shader_compiler_options *options) { - nir_shader *shader = nir_shader_create(NULL, sh->Stage, options); + struct gl_shader *sh = shader_prog->_LinkedShaders[stage]; + + nir_shader *shader = nir_shader_create(NULL, stage, options); nir_visitor v1(shader); nir_function_visitor v2(&v1); diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h index 3801e8c..29badcd 100644 --- a/src/glsl/nir/glsl_to_nir.h +++ b/src/glsl/nir/glsl_to_nir.h @@ -32,7 +32,8 @@ extern "C" { #endif -nir_shader *glsl_to_nir(struct gl_shader *sh, +nir_shader *glsl_to_nir(const struct gl_shader_program *shader_prog, +gl_shader_stage stage, const nir_shader_compiler_options *options); #ifdef __cplusplus diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 1d4f6ab..6ec2d76 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -83,13 +83,12 @@ brw_create_nir(struct brw_context *brw, static const nir_lower_tex_options tex_options = { .lower_txp = ~0, }; - struct gl_shader *shader = shader_prog ? shader_prog->_LinkedShaders[stage] : NULL; bool debug_enabled = INTEL_DEBUG & intel_debug_flag_for_shader_stage(stage); nir_shader *nir; /* First, lower the GLSL IR or Mesa IR to NIR */ if (shader_prog) { - nir = glsl_to_nir(shader, options); + nir = glsl_to_nir(shader_prog, stage, options); } else { nir = prog_to_nir(prog, options); nir_convert_to_ssa(nir); /* turn registers into SSA */ -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 00/13] i965: Pull shader, prog, and shader_prog out of the
This series does a bunch of code-shuffling with the end objective of getting the gl_* data structures out of the backend compiler. Now that we have NIR, we would like the compiler to be a NIR -> binary translator and not be loaded full of mesa and GL-specific stuff. This series gets us 95% of the way there. Unfortunately, we still have to pass gl_program into geometry shaders for transform-fedback and, for fragment shaders, we need to pass in gl_shader_program so that we can fix up TEXTURE_RECTANGLE on older hardware (pre-Broadwell). Incidentally, according to our CI system, this series fixes 2 piglit tests on Iron Lake and GM45. I have no idea why. Jason Ekstrand (13): i965/shader: Pull assign_common_binding_table_offsets out of backend_shader i965: Move binding table setup to codegen time. i965: Move prog_data uniform setup to the codegen level nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader nir: Add a a nir_shader_info struct nir: Move GS data to nir_shader_info i965/backend_shader: Add a field to store the NIR shader i965/cs: Remove the prog argument from local_id_payload_dwords i965/fs: Move sampler unit lookup into rescale_texcoord i965/fs: Use the nir info instead of pulling things out of [shader_]prog i965/vec4: Use nir info instead of pulling things out of [shader_]prog i965/fs,vec4: Get rid of the sanity_param_count i965/shader: Get rid of the shader, prog, and shader_prog fields src/glsl/nir/glsl_to_nir.cpp | 30 +- src/glsl/nir/glsl_to_nir.h| 3 +- src/glsl/nir/nir.c| 4 +- src/glsl/nir/nir.h| 50 -- src/glsl/nir/nir_lower_gs_intrinsics.c| 2 +- src/mesa/drivers/dri/i965/brw_cs.c| 25 - src/mesa/drivers/dri/i965/brw_cs.h| 3 +- src/mesa/drivers/dri/i965/brw_fs.cpp | 106 +- src/mesa/drivers/dri/i965/brw_fs.h| 20 ++-- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 57 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 24 +++-- src/mesa/drivers/dri/i965/brw_gs.c| 24 - src/mesa/drivers/dri/i965/brw_nir.c | 3 +- src/mesa/drivers/dri/i965/brw_shader.cpp | 30 +++--- src/mesa/drivers/dri/i965/brw_shader.h| 21 +++-- src/mesa/drivers/dri/i965/brw_vec4.cpp| 34 ++- src/mesa/drivers/dri/i965/brw_vec4.h | 13 +-- src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 6 +- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 24 ++--- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 5 +- src/mesa/drivers/dri/i965/brw_vec4_nir.cpp| 41 +++-- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 8 +- src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 11 +-- src/mesa/drivers/dri/i965/brw_vs.c| 14 +++ src/mesa/drivers/dri/i965/brw_vs.h| 4 +- src/mesa/drivers/dri/i965/brw_wm.c| 33 ++- src/mesa/drivers/dri/i965/brw_wm_iz.cpp | 3 +- src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp | 9 -- src/mesa/drivers/dri/i965/gen6_gs_visitor.h | 4 +- src/mesa/drivers/dri/i965/gen7_cs_state.c | 7 +- src/mesa/program/prog_to_nir.c| 13 +++ 31 files changed, 333 insertions(+), 298 deletions(-) -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/13] i965/fs: Move sampler unit lookup into rescale_texcoord
The texunit variable we create and assign in nir_emit_texture gets passed through two more layers of function calls before it gets to its sole use in rescale_texcoord. The best part is that we already pass the sampler into rescale_texcoord so we can just look it up there. --- src/mesa/drivers/dri/i965/brw_fs.h | 5 ++--- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 8 +--- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 14 ++ 3 files changed, 13 insertions(+), 14 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index b932ed2..213b2e0 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -205,7 +205,7 @@ public: void emit_interpolation_setup_gen6(); void compute_sample_position(fs_reg dst, fs_reg int_sample_pos); fs_reg rescale_texcoord(fs_reg coordinate, int coord_components, - bool is_rect, uint32_t sampler, int texunit); + bool is_rect, uint32_t sampler); void emit_texture(ir_texture_opcode op, const glsl_type *dest_type, fs_reg coordinate, int components, @@ -218,8 +218,7 @@ public: bool is_cube_array, bool is_rect, uint32_t sampler, - fs_reg sampler_reg, - int texunit); + fs_reg sampler_reg); fs_reg emit_mcs_fetch(const fs_reg &coordinate, unsigned components, const fs_reg &sampler); void emit_gen6_gather_wa(uint8_t wa, fs_reg dst); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 3793791..42bf050 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -1917,12 +1917,6 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr) unsigned sampler = instr->sampler_index; fs_reg sampler_reg(sampler); - /* FINISHME: We're failing to recompile our programs when the sampler is -* updated. This only matters for the texture rectangle scale parameters -* (pre-gen6, or gen6+ with GL_CLAMP). -*/ - int texunit = prog->SamplerUnits[sampler]; - int gather_component = instr->component; bool is_rect = instr->sampler_dim == GLSL_SAMPLER_DIM_RECT; @@ -2063,7 +2057,7 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr) emit_texture(op, dest_type, coordinate, instr->coord_components, shadow_comparitor, lod, lod2, lod_components, sample_index, tex_offset, mcs, gather_component, -is_cube_array, is_rect, sampler, sampler_reg, texunit); +is_cube_array, is_rect, sampler, sampler_reg); fs_reg dest = get_nir_dest(instr->dest); dest.type = this->result.type; diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 47d7ae4..c319bf7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -79,7 +79,7 @@ fs_visitor::emit_vs_system_value(int location) fs_reg fs_visitor::rescale_texcoord(fs_reg coordinate, int coord_components, - bool is_rect, uint32_t sampler, int texunit) + bool is_rect, uint32_t sampler) { bool needs_gl_clamp = true; fs_reg scale_x, scale_y; @@ -93,10 +93,16 @@ fs_visitor::rescale_texcoord(fs_reg coordinate, int coord_components, (devinfo->gen >= 6 && (key_tex->gl_clamp_mask[0] & (1 << sampler) || key_tex->gl_clamp_mask[1] & (1 << sampler) { struct gl_program_parameter_list *params = prog->Parameters; + + + /* FINISHME: We're failing to recompile our programs when the sampler is + * updated. This only matters for the texture rectangle scale + * parameters (pre-gen6, or gen6+ with GL_CLAMP). + */ int tokens[STATE_LENGTH] = { STATE_INTERNAL, STATE_TEXRECT_SCALE, -texunit, +prog->SamplerUnits[sampler], 0, 0 }; @@ -221,7 +227,7 @@ fs_visitor::emit_texture(ir_texture_opcode op, bool is_cube_array, bool is_rect, uint32_t sampler, - fs_reg sampler_reg, int texunit) + fs_reg sampler_reg) { fs_inst *inst = NULL; @@ -256,7 +262,7 @@ fs_visitor::emit_texture(ir_texture_opcode op, * samplers. This should only be a problem with GL_CLAMP on Gen7. */ coordinate = rescale_texcoord(coordinate, coord_components, is_rect, -sampler, texunit); +sampler); } /* Writemasking doesn't eliminate channels on SIMD8 texture -- 2.5.0.400.gff86faf
[Mesa-dev] [PATCH 10/13] i965/fs: Use the nir info instead of pulling things out of [shader_]prog
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 29 ++-- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 8 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +- 3 files changed, 19 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 699d3d0..2d67085 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1401,7 +1401,7 @@ fs_visitor::calculate_urb_setup() int urb_next = 0; /* Figure out where each of the incoming setup attributes lands. */ if (devinfo->gen >= 6) { - if (_mesa_bitcount_64(prog->InputsRead & + if (_mesa_bitcount_64(nir->info.inputs_read & BRW_FS_VARYING_INPUT_MASK) <= 16) { /* The SF/SBE pipeline stage can do arbitrary rearrangement of the * first 16 varying inputs, so we can put them wherever we want. @@ -1413,7 +1413,7 @@ fs_visitor::calculate_urb_setup() * a different vertex (or geometry) shader. */ for (unsigned int i = 0; i < VARYING_SLOT_MAX; i++) { -if (prog->InputsRead & BRW_FS_VARYING_INPUT_MASK & +if (nir->info.inputs_read & BRW_FS_VARYING_INPUT_MASK & BITFIELD64_BIT(i)) { prog_data->urb_setup[i] = urb_next++; } @@ -1427,7 +1427,7 @@ fs_visitor::calculate_urb_setup() struct brw_vue_map prev_stage_vue_map; brw_compute_vue_map(devinfo, &prev_stage_vue_map, key->input_slots_valid, - shader_prog->SeparateShader); + nir->info.separate_shader); int first_slot = 2 * BRW_SF_URB_ENTRY_READ_OFFSET; assert(prev_stage_vue_map.num_slots <= first_slot + 32); for (int slot = first_slot; slot < prev_stage_vue_map.num_slots; @@ -1437,7 +1437,7 @@ fs_visitor::calculate_urb_setup() * unused. */ if (varying != BRW_VARYING_SLOT_COUNT && -(prog->InputsRead & BRW_FS_VARYING_INPUT_MASK & +(nir->info.inputs_read & BRW_FS_VARYING_INPUT_MASK & BITFIELD64_BIT(varying))) { prog_data->urb_setup[varying] = slot - first_slot; } @@ -1470,7 +1470,7 @@ fs_visitor::calculate_urb_setup() * * See compile_sf_prog() for more info. */ - if (prog->InputsRead & BITFIELD64_BIT(VARYING_SLOT_PNTC)) + if (nir->info.inputs_read & BITFIELD64_BIT(VARYING_SLOT_PNTC)) prog_data->urb_setup[VARYING_SLOT_PNTC] = urb_next++; } @@ -4634,7 +4634,7 @@ void fs_visitor::setup_payload_gen6() { bool uses_depth = - (prog->InputsRead & (1 << VARYING_SLOT_POS)) != 0; + (nir->info.inputs_read & (1 << VARYING_SLOT_POS)) != 0; unsigned barycentric_interp_modes = (stage == MESA_SHADER_FRAGMENT) ? ((brw_wm_prog_data*) this->prog_data)->barycentric_interp_modes : 0; @@ -4693,7 +4693,7 @@ fs_visitor::setup_payload_gen6() } /* R32: MSAA input coverage mask */ - if (prog->SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN) { + if (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_MASK_IN) { assert(devinfo->gen >= 7); payload.sample_mask_in_reg = payload.num_regs; payload.num_regs++; @@ -4706,7 +4706,7 @@ fs_visitor::setup_payload_gen6() /* R34-: bary for 32-pixel. */ /* R58-59: interp W for 32-pixel. */ - if (prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_DEPTH)) { + if (nir->info.outputs_written & BITFIELD64_BIT(FRAG_RESULT_DEPTH)) { source_depth_to_render_target = true; } } @@ -4725,7 +4725,7 @@ fs_visitor::setup_cs_payload() payload.num_regs = 1; - if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) { + if (nir->info.system_values_read & SYSTEM_BIT_LOCAL_INVOCATION_ID) { const unsigned local_id_dwords = brw_cs_prog_local_id_payload_dwords(dispatch_width); assert((local_id_dwords & 0x7) == 0); @@ -4786,8 +4786,8 @@ fs_visitor::optimize() \ if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) { \ char filename[64]; \ - snprintf(filename, 64, "%s%d-%04d-%02d-%02d-" #pass, \ - stage_abbrev, dispatch_width, shader_prog ? shader_prog->Name : 0, iteration, pass_num); \ + snprintf(filename, 64, "%s%d-%s-%02d-%02d-" #pass, \ + stage_abbrev, dispatch_width, nir->info.name, iteration, pass_num); \ \ backend_shader::dump_instructions(filename); \ } \ @@ -4800,9 +4800,8 @@ fs_visitor::optimize() if (unlikely(INTEL_D
[Mesa-dev] [PATCH 12/13] i965/fs, vec4: Get rid of the sanity_param_count
It doesn't exist for anything other than an assert that, as far as I can tell, isn't possible to trip. Soon, we will remove prog from the visitor entirely and this will become even more impossible to hit. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 18 -- src/mesa/drivers/dri/i965/brw_fs.h | 1 - src/mesa/drivers/dri/i965/brw_vec4.cpp | 9 - src/mesa/drivers/dri/i965/brw_vec4.h | 2 -- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 1 - 5 files changed, 31 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 2d67085..7e6022b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -4985,8 +4985,6 @@ fs_visitor::run_fs(bool do_rep_send) assert(stage == MESA_SHADER_FRAGMENT); - sanity_param_count = prog->Parameters->NumParameters; - if (devinfo->gen >= 6) setup_payload_gen6(); else @@ -5055,13 +5053,6 @@ fs_visitor::run_fs(bool do_rep_send) else wm_prog_data->reg_blocks_16 = brw_register_blocks(grf_used); - /* If any state parameters were appended, then ParameterValues could have -* been realloced, in which case the driver uniform storage set up by -* _mesa_associate_uniform_storage() would point to freed memory. Make -* sure that didn't happen. -*/ - assert(sanity_param_count == prog->Parameters->NumParameters); - return !failed; } @@ -5071,8 +5062,6 @@ fs_visitor::run_cs() assert(stage == MESA_SHADER_COMPUTE); assert(shader); - sanity_param_count = prog->Parameters->NumParameters; - setup_cs_payload(); if (shader_time_index >= 0) @@ -5100,13 +5089,6 @@ fs_visitor::run_cs() if (failed) return false; - /* If any state parameters were appended, then ParameterValues could have -* been realloced, in which case the driver uniform storage set up by -* _mesa_associate_uniform_storage() would point to freed memory. Make -* sure that didn't happen. -*/ - assert(sanity_param_count == prog->Parameters->NumParameters); - return !failed; } diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 213b2e0..321d302 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -300,7 +300,6 @@ public: const struct brw_sampler_prog_key_data *key_tex; struct brw_stage_prog_data *prog_data; - unsigned int sanity_param_count; int *param_size; diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index a8cdd19..8a76f36 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1803,8 +1803,6 @@ vec4_visitor::emit_shader_time_write(int shader_time_subindex, src_reg value) bool vec4_visitor::run() { - sanity_param_count = prog->Parameters->NumParameters; - if (shader_time_index >= 0) emit_shader_time_begin(); @@ -1927,13 +1925,6 @@ vec4_visitor::run() brw_get_scratch_size(last_scratch * REG_SIZE); } - /* If any state parameters were appended, then ParameterValues could have -* been realloced, in which case the driver uniform storage set up by -* _mesa_associate_uniform_storage() would point to freed memory. Make -* sure that didn't happen. -*/ - assert(sanity_param_count == prog->Parameters->NumParameters); - return !failed; } diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 6e3af60..3095a51 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -97,8 +97,6 @@ public: const struct brw_sampler_prog_key_data * const key_tex; struct brw_vue_prog_data * const prog_data; - unsigned int sanity_param_count; - char *fail_msg; bool failed; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index bc9d9a0..f4cbc9e 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -1836,7 +1836,6 @@ vec4_visitor::vec4_visitor(const struct brw_compiler *compiler, shader_prog, prog, &prog_data->base, stage), key_tex(key_tex), prog_data(prog_data), - sanity_param_count(0), fail_msg(NULL), first_non_payload_grf(0), need_all_constants_in_pull_buffer(false), -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] nir: Add a function to determine if a source is dynamically uniform
On Oct 1, 2015 12:18 PM, "Matt Turner" wrote: > > On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts wrote: > > Adds nir_src_is_dynamically_uniform which returns true if the source > > is known to be dynamically uniform. This will be used in a later patch > > to add a workaround for cases that only work with dynamically uniform > > sources. Note that the function is not definitive, it can return false > > negatives (but not false positives). Currently it only detects > > constants and uniform accesses. It could easily be extended to include > > more cases. > > --- > > src/glsl/nir/nir.c | 29 + > > src/glsl/nir/nir.h | 1 + > > 2 files changed, 30 insertions(+) > > > > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c > > index 78ff886..242f0b4 100644 > > --- a/src/glsl/nir/nir.c > > +++ b/src/glsl/nir/nir.c > > @@ -1784,6 +1784,35 @@ nir_src_as_const_value(nir_src src) > > return &load->value; > > } > > > > +/** > > + * Returns true if the source is known to be dynamically uniform. Otherwise it > > + * returns false which means it may or may not be dynamically uniform but it > > + * can't be determined. > > + */ > > +bool > > +nir_src_is_dynamically_uniform(nir_src src) > > +{ > > + if (!src.is_ssa) > > + return false; > > + > > + /* Constants are trivially dynamically uniform */ > > + if (src.ssa->parent_instr->type == nir_instr_type_load_const) > > + return true; > > + > > + /* As are uniform variables */ > > + if (src.ssa->parent_instr->type == nir_instr_type_intrinsic) { > > + nir_intrinsic_instr *intr = nir_instr_as_intrinsic(src.ssa->parent_instr); > > + > > + if (intr->intrinsic == nir_intrinsic_load_uniform) > > + return true; > > + } > > + > > + /* XXX: this could have many more tests, such as when a sampler function is > > +* called with dynamically uniform arguments. > > +*/ > > + return false; > > +} > > This functions seems correct as-is, so it gets a > > Reviewed-by: Matt Turner > > On top of being useful for fixing the nonconst/nonuniform piglit > tests, knowing which values are uniform can allow better optimization > and knowing which branches are uniform would allow us to enable Single > Program Flow. > > Cc'ing Jason for a discussion about how we'd like to handle this kind > of information in NIR. Add a bool is_uniform or something to > nir_ssa_def and hook into the metadata system? That was more-or-less my plan. We should probably have a proper pass that handles things like phi nodes and ALU operations. --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92231] [swrast] piglit gl-1.0-beginend-coverage regression
https://bugs.freedesktop.org/show_bug.cgi?id=92231 Bug ID: 92231 Summary: [swrast] piglit gl-1.0-beginend-coverage regression Product: Mesa Version: git Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Keywords: bisected, regression Severity: normal Priority: medium Component: Mesa core Assignee: mesa-dev@lists.freedesktop.org Reporter: v...@freedesktop.org QA Contact: mesa-dev@lists.freedesktop.org CC: bri...@vmware.com, lem...@gmail.com mesa: b3f9c5cc0fab6d770d220efd87ba3746c6673875 (master 11.1.0-devel) $ ./bin/gl-1.0-beginend-coverage -auto -fbo [...] testing glTexParameterf() Testing within begin/end Unexpected GL error: GL_INVALID_FRAMEBUFFER_OPERATION 0x506 (Error at tests/spec/gl-1.0/beginend-coverage.c:686) Expected GL error: GL_INVALID_OPERATION 0x502 PIGLIT: {"subtest": {"glTexParameterf" : "fail"}} c277fa394087272c79d65fc308d268fc768b91e5 is the first bad commit commit c277fa394087272c79d65fc308d268fc768b91e5 Author: Brian Paul Date: Wed Sep 30 11:29:13 2015 -0600 mesa: consolidate texture binding code Before, we were doing the actual _mesa_reference_texobj() call and ctx->Driver.BindTexture() and misc housekeeping in three different places. This consolidates the common code in a new bind_texture() function. Reviewed-by: Tapani Pälli :04 04 01cfb3f9437ac679e95f9507e0b47516ecec347e 7d921525c01aef94abfb26cc232d225ebe5f50af Msrc bisect run success -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Fix intel_miptree_is_fast_clear_capable()
On Thu, Oct 01, 2015 at 08:20:07AM -0700, Chad Versace wrote: > There are three types of fast clears: > a. fast depth clears > b. fast singlesample color clears > c. fast multisample color clears > Function intel_miptree_is_fast_clear_capable() checks if a miptree > supports fast clears of type (b). > > Rename the function to disambiguate what it does: > old: intel_miptree_is_fast_clear_capable > new: intel_miptree_supports_non_msrt_fast_clear > > The functionally *accidentally* rejected multisampled color surfaces > because it thought they were singlesample array surfaces. Fix that by > explicitly rejecting surfaces with samples > 1. > I wasn't going to say anything except you put "accidentally" in bold. I don't think you can determine what Paul was thinking when he originally implemented that code. Calling it accidental is unnecessarily judgmental (as is my response here). The code was correct, and unless you want to ask him, we should assume it was intentional. FWIW: I didn't spot anything wrong in this patch, but I don't know the code well enough to call that a reviewed-by in the time I spent looking at it. > This fix would have been needed before we enabled layered fast > singlesample color clears (introduced in gen8), which we want to do > eventually. For now, though, this patch changes no behavior; it just > fixes how the driver chooses its behavior. > --- > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 24 +++- > 1 file changed, 19 insertions(+), 5 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > index 05dc291..a169c41 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > @@ -194,8 +194,8 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context > *brw, unsigned tiling) > * 64bpp, and 128bpp. > */ > static bool > -intel_miptree_is_fast_clear_capable(struct brw_context *brw, > -struct intel_mipmap_tree *mt) > +intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw, > + struct intel_mipmap_tree *mt) > { > /* MCS support does not exist prior to Gen7 */ > if (brw->gen < 7) > @@ -204,6 +204,10 @@ intel_miptree_is_fast_clear_capable(struct brw_context > *brw, > if (mt->disable_aux_buffers) >return false; > > + /* This function applies only to non-multisampled render targets. */ > + if (mt->num_samples > 1) > + return false; > + > /* MCS is only supported for color buffers */ > switch (_mesa_get_format_base_format(mt->format)) { > case GL_DEPTH_COMPONENT: > @@ -222,7 +226,16 @@ intel_miptree_is_fast_clear_capable(struct brw_context > *brw, > >return false; > } > + > + /* Check for layered surfaces. */ > if (mt->physical_depth0 != 1) { > + /* Multisample surfaces with the CMS layout are not layered surfaces, > +* yet still have physical_depth0 > 1. Assert that we don't > +* accidentally reject a multisampled surface here. We should have > +* rejected it earlier by explicitly checking the sample count. > +*/ > + assert(mt->num_samples <= 1); > + >if (brw->gen >= 8) { > perf_debug("Layered fast clear - giving up. (%dx%d%d)\n", > mt->logical_width0, mt->logical_height0, > @@ -494,7 +507,7 @@ intel_miptree_create_layout(struct brw_context *brw, > * 7 | ? |? > * 6 | ? |? > */ > - if (intel_miptree_is_fast_clear_capable(brw, mt)) { > + if (intel_miptree_supports_non_msrt_fast_clear(brw, mt)) { >if (brw->gen >= 9 || (brw->gen == 8 && num_samples <= 1)) > layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16; > } else if (brw->gen >= 9 && num_samples > 1) { > @@ -692,7 +705,7 @@ intel_miptree_create(struct brw_context *brw, > * clear actually occurs. > */ > if (intel_tiling_supports_non_msrt_mcs(brw, mt->tiling) && > - intel_miptree_is_fast_clear_capable(brw, mt)) { > + intel_miptree_supports_non_msrt_fast_clear(brw, mt)) { >mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED; >assert(brw->gen < 8 || mt->halign == 16 || num_samples <= 1); > } > @@ -800,8 +813,9 @@ intel_update_winsys_renderbuffer_miptree(struct > brw_context *intel, > * clear actually occurs. > */ > if (intel_tiling_supports_non_msrt_mcs(intel, singlesample_mt->tiling) && > - intel_miptree_is_fast_clear_capable(intel, singlesample_mt)) > + intel_miptree_supports_non_msrt_fast_clear(intel, singlesample_mt)) { >singlesample_mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED; > + } > > if (num_samples == 0) { >intel_miptree_release(&irb->mt); > -- > 2.5.0.342.g44e0223 > > ___ > mesa-dev mailing list
[Mesa-dev] [Bug 92231] [swrast] piglit gl-1.0-beginend-coverage regression
https://bugs.freedesktop.org/show_bug.cgi?id=92231 Tapani Pälli changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Tapani Pälli --- *** This bug has been marked as a duplicate of bug 92216 *** -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: fix GetProgramiv/GetActiveAttrib regression
Ping, could someone ack/review this? If not, I will revert the patches that caused this for now and figure out alternative solution with more time for packed varyings. It is one of the CTS tests that will fail with revert and nothing else gets hurt since it's a corner-case with SSO. I think currently any game using Unity engine is broken because of this issue. // Tapani On 09/28/2015 12:12 PM, Tapani Pälli wrote: Commit 4639cea2921669527eb43dcb49724c05afb27e8e added packed varyings as part of program resource list. We need to take this to account with all functions dealing with active input attributes. Patch fixes the issue by adding additional check to is_active_attrib and iterating active attribs explicitly in GetActiveAttrib function. I've tested that fix works for Assault Android Cactus (demo version) and does not cause Piglit or CTS regressions in glGetProgramiv tests. Signed-off-by: Tapani Pälli Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122 --- src/mesa/main/shader_query.cpp | 44 +- 1 file changed, 35 insertions(+), 9 deletions(-) diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp index 99d9e10..d023504 100644 --- a/src/mesa/main/shader_query.cpp +++ b/src/mesa/main/shader_query.cpp @@ -101,15 +101,34 @@ _mesa_BindAttribLocation(GLhandleARB program, GLuint index, */ } +/* Function checks if given input variable is packed varying by checking + * if there also exists output variable with the same name. + */ +static bool +is_packed_varying(struct gl_shader_program *shProg, const ir_variable *input) +{ + assert(input->data.mode == ir_var_shader_in); + unsigned array_index = 0; + struct gl_program_resource *out = + _mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT, input->name, + &array_index); + if (out) + return true; + return false; +} + static bool -is_active_attrib(const ir_variable *var) +is_active_attrib(struct gl_shader_program *shProg, + const ir_variable *var) { if (!var) return false; switch (var->data.mode) { case ir_var_shader_in: - return var->data.location != -1; + return (var->data.location != -1 && + var->data.location_frac == 0 && + !is_packed_varying(shProg, var)); case ir_var_system_value: /* From GL 4.3 core spec, section 11.1.1 (Vertex Attributes): @@ -154,19 +173,25 @@ _mesa_GetActiveAttrib(GLhandleARB program, GLuint desired_index, return; } - struct gl_program_resource *res = - _mesa_program_resource_find_index(shProg, GL_PROGRAM_INPUT, -desired_index); + struct gl_program_resource *res = shProg->ProgramResourceList; + unsigned index = -1; + for (unsigned i = 0; i < shProg->NumProgramResourceList; i++, res++) { + if (res->Type != GL_PROGRAM_INPUT || + !is_active_attrib(shProg, RESOURCE_VAR(res))) + continue; + if (++index == desired_index) + break; + } /* User asked for index that does not exist. */ - if (!res) { + if (!res || index != desired_index) { _mesa_error(ctx, GL_INVALID_VALUE, "glGetActiveAttrib(index)"); return; } const ir_variable *const var = RESOURCE_VAR(res); - if (!is_active_attrib(var)) + if (!is_active_attrib(shProg, var)) return; const char *var_name = var->name; @@ -252,7 +277,7 @@ _mesa_count_active_attribs(struct gl_shader_program *shProg) for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) { if (res->Type == GL_PROGRAM_INPUT && res->StageReferences & (1 << MESA_SHADER_VERTEX) && - is_active_attrib(RESOURCE_VAR(res))) + is_active_attrib(shProg, RESOURCE_VAR(res))) count++; } return count; @@ -271,7 +296,8 @@ _mesa_longest_attribute_name_length(struct gl_shader_program *shProg) size_t longest = 0; for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) { if (res->Type == GL_PROGRAM_INPUT && - res->StageReferences & (1 << MESA_SHADER_VERTEX)) { + res->StageReferences & (1 << MESA_SHADER_VERTEX) && + is_active_attrib(shProg, RESOURCE_VAR(res))) { const size_t length = strlen(RESOURCE_VAR(res)->name); if (length >= longest) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 2/2] mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program
On Thu, 2015-10-01 at 14:01 -0400, Ilia Mirkin wrote: > On Thu, Oct 1, 2015 at 7:09 AM, Iago Toral Quiroga wrote: > > These arrays provide backends with separate index spaces for UBOS and SSBOs. > > --- > > src/glsl/linker.cpp | 35 > > +++ > > src/glsl/standalone_scaffolding.cpp | 9 + > > src/mesa/main/mtypes.h | 6 ++ > > 3 files changed, 50 insertions(+) > > > > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp > > index e6eba94..3da773d 100644 > > --- a/src/glsl/linker.cpp > > +++ b/src/glsl/linker.cpp > > @@ -4107,6 +4107,41 @@ link_shaders(struct gl_context *ctx, struct > > gl_shader_program *prog) > >} > > } > > > > + /* Split prog->BufferInterfaceBlocks into prog->UniformBlocks and > > +* prog->ShaderStorageBlocks, so that drivers that need separate index > > +* spaces for each set can have that. > > +*/ > > + unsigned num_ubo_blocks; > > + unsigned num_ssbo_blocks; > > + num_ubo_blocks = 0; > > + num_ssbo_blocks = 0; > > + for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { > > + if (prog->BufferInterfaceBlocks[i].IsShaderStorage) > > + num_ssbo_blocks++; > > + else > > + num_ubo_blocks++; > > + } > > + > > + prog->UniformBlocks = > > + ralloc_array(mem_ctx, gl_uniform_block *, num_ubo_blocks); > > + prog->NumUniformBlocks = 0; > > + > > + prog->ShaderStorageBlocks = > > + ralloc_array(mem_ctx, gl_uniform_block *, num_ssbo_blocks); > > + prog->NumShaderStorageBlocks = 0; > > + > > + for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) { > > + if (prog->BufferInterfaceBlocks[i].IsShaderStorage) > > + prog->ShaderStorageBlocks[prog->NumShaderStorageBlocks++] = > > +&prog->BufferInterfaceBlocks[i]; > > + else > > + prog->UniformBlocks[prog->NumUniformBlocks++] = > > +&prog->BufferInterfaceBlocks[i]; > > + } > > Shouldn't this go through and also adjust the indices of the linked > programs? Or... something along those lines? With this, I still need a > remapping table to go from the index passed to a load_ssbo/load_ubo > instruction to a ssbo/ubo index. Maybe I am missing something, but in the case of i965 for example, we use a lowering pass (lower_ubo_reference) to compute the block indices (see src/glsl/lower_ubo_reference.cpp). If you look at lower_ubo_reference_visitor::setup_for_load_or_store, there is the code we use to compute the block index. This happens in the backend already, so I was thinking that you would do something similar, but instead of using BufferInterfaceBlocks you would use these two arrays instead. Until this point there are no references to any blocks in the shaders. > Perhaps a few well-placed helper functions can alleviate this? Also > this should erase the need for some of the O(n) iterations that have > sprung up as a result of this combined list. > > IMHO ideally the BufferInterfaceBlocks list would get freed at the end > of this function. But I understand that this will require work, and > the onus is probably on me (or anyone wanting to add ssbo support to > other drivers) to do it, or work around it. Yeah, it could go away I guess, but as you say that would require some extra work (and also rewrite the i965 driver first to use separate index spaces). In any case, notice that UniformBlocks and ShaderStorageBlocks only contain pointers into BufferInterfaceBlocks, so we would only be saving a few bytes. Iago > -ilia > > > + > > + assert(prog->NumUniformBlocks + prog->NumShaderStorageBlocks == > > + prog->NumBufferInterfaceBlocks); > > + > > /* FINISHME: Assign fragment shader output locations. */ > > > > done: > > diff --git a/src/glsl/standalone_scaffolding.cpp > > b/src/glsl/standalone_scaffolding.cpp > > index 0c53589..658245f 100644 > > --- a/src/glsl/standalone_scaffolding.cpp > > +++ b/src/glsl/standalone_scaffolding.cpp > > @@ -102,6 +102,15 @@ _mesa_clear_shader_program_data(struct > > gl_shader_program *shProg) > > ralloc_free(shProg->BufferInterfaceBlocks); > > shProg->BufferInterfaceBlocks = NULL; > > shProg->NumBufferInterfaceBlocks = 0; > > + > > + ralloc_free(shProg->UniformBlocks); > > + shProg->UniformBlocks = NULL; > > + shProg->NumUniformBlocks = 0; > > + > > + ralloc_free(shProg->ShaderStorageBlocks); > > + shProg->ShaderStorageBlocks = NULL; > > + shProg->NumShaderStorageBlocks = 0; > > + > > for (i = 0; i < MESA_SHADER_STAGES; i++) { > >ralloc_free(shProg->UniformBlockStageIndex[i]); > >shProg->UniformBlockStageIndex[i] = NULL; > > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > > index 347da14..2362f54 100644 > > --- a/src/mesa/main/mtypes.h > > +++ b/src/mesa/main/mtypes.h > > @@ -2692,6 +2692,12 @@ struct gl_shader_program > > unsigned NumBufferInterfaceBlocks; > > struct gl_uniform_block *BufferInterfaceBlocks; > > >
Re: [Mesa-dev] [PATCH] mesa: avoid leaking closure when iterating over a string_to_uint_map
Reviewed-by: Iago Toral Quiroga On Thu, 2015-10-01 at 20:19 -0400, Ilia Mirkin wrote: > Signed-off-by: Ilia Mirkin > --- > src/mesa/program/hash_table.h | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h > index e85a836..d0a2abf 100644 > --- a/src/mesa/program/hash_table.h > +++ b/src/mesa/program/hash_table.h > @@ -249,6 +249,7 @@ public: >wrapper->closure = closure; > >hash_table_call_foreach(this->ht, subtract_one_wrapper, wrapper); > + free(wrapper); > } > > /** ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: avoid leaking hiddenUniforms map when there are no uniforms
On Thu, 2015-10-01 at 20:22 -0400, Ilia Mirkin wrote: > Signed-off-by: Ilia Mirkin > --- > src/glsl/link_uniforms.cpp | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp > index 47d49c8..740b0a4 100644 > --- a/src/glsl/link_uniforms.cpp > +++ b/src/glsl/link_uniforms.cpp > @@ -1131,15 +1131,15 @@ link_assign_uniform_locations(struct > gl_shader_program *prog, > const unsigned num_data_slots = uniform_size.num_values; > const unsigned hidden_uniforms = uniform_size.num_hidden_uniforms; > > + /* assign hidden uniforms a slot id */ > + hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size); > + delete hiddenUniforms; > + > /* On the outside chance that there were no uniforms, bail out. > */ > if (num_uniforms == 0) >return; > > - /* assign hidden uniforms a slot id */ > - hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size); > - delete hiddenUniforms; > - I suppose there is no much gain in simply adding the delete statement right before the return... Reviewed-by: Iago Toral Quiroga > struct gl_uniform_storage *uniforms = >rzalloc_array(prog, struct gl_uniform_storage, num_uniforms); > union gl_constant_value *data = ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: don't forget to free image_param on prog_data free
Reviewed-by: Iago Toral Quiroga On Thu, 2015-10-01 at 20:27 -0400, Ilia Mirkin wrote: > Signed-off-by: Ilia Mirkin > --- > src/mesa/drivers/dri/i965/brw_program.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_program.c > b/src/mesa/drivers/dri/i965/brw_program.c > index fee96a8..0e4b823 100644 > --- a/src/mesa/drivers/dri/i965/brw_program.c > +++ b/src/mesa/drivers/dri/i965/brw_program.c > @@ -551,6 +551,7 @@ brw_stage_prog_data_free(const void *p) > > ralloc_free(prog_data->param); > ralloc_free(prog_data->pull_param); > + ralloc_free(prog_data->image_param); > } > > void ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev