Re: [Mesa-dev] [PATCH 1/7] mesa: clean up #includes in ff_fragment_shader.cpp

2015-10-01 Thread Tapani Pälli
There seems to be some 'extra includes', not sure if you want to start 
going through them at this same time? As example sampler.cpp does not 
need to include glsl/ir_visitor.h since it comes with glsl/ir.h.


Anyway nice cleanup, series is
Reviewed-by: Tapani Pälli 

On 10/01/2015 12:35 AM, Brian Paul wrote:

Get rid of "../glsl/" paths.  Sort alphabetically.
---
  src/mesa/main/ff_fragment_shader.cpp | 22 ++
  1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/src/mesa/main/ff_fragment_shader.cpp 
b/src/mesa/main/ff_fragment_shader.cpp
index c682892..c0030bc 100644
--- a/src/mesa/main/ff_fragment_shader.cpp
+++ b/src/mesa/main/ff_fragment_shader.cpp
@@ -27,29 +27,27 @@
   *
   **/

-#include "glheader.h"
-#include "imports.h"
-#include "mtypes.h"
+#include "main/glheader.h"
  #include "main/context.h"
+#include "main/imports.h"
  #include "main/macros.h"
  #include "main/samplerobj.h"
  #include "main/texenvprogram.h"
  #include "main/texobj.h"
  #include "main/uniforms.h"
+#include "glsl/ir_builder.h"
+#include "glsl/ir_optimization.h"
+#include "glsl/glsl_parser_extras.h"
+#include "glsl/glsl_symbol_table.h"
+#include "glsl/glsl_types.h"
+#include "program/ir_to_mesa.h"
  #include "program/program.h"
-#include "program/prog_parameter.h"
+#include "program/programopt.h"
  #include "program/prog_cache.h"
  #include "program/prog_instruction.h"
+#include "program/prog_parameter.h"
  #include "program/prog_print.h"
  #include "program/prog_statevars.h"
-#include "program/programopt.h"
-#include "../glsl/glsl_types.h"
-#include "../glsl/ir.h"
-#include "../glsl/ir_builder.h"
-#include "../glsl/glsl_symbol_table.h"
-#include "../glsl/glsl_parser_extras.h"
-#include "../glsl/ir_optimization.h"
-#include "../program/ir_to_mesa.h"

  using namespace ir_builder;



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92183] linker.cpp:3187:46: error: ‘strtok_r’ was not declared in this scope

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92183

Vinson Lee  changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|FIXED   |---

--- Comment #5 from Vinson Lee  ---
mesa: ca2e16d26ec46e604c76ab72d3ec14a7a8643cc8 (master 11.1.0-devel)

Build is still failing on older versions on MinGW. This error occurs with
mingw-w64 2.0.1-1 on Ubuntu 12.04.

  Compiling src/glsl/linker.cpp ...
src/glsl/linker.cpp: In function ‘bool included_in_packed_varying(ir_variable*,
const char*)’:
src/glsl/linker.cpp:3241:46: error: ‘strtok_s’ was not declared in this scope


Mesa might decide instead to raise the minimum required version of mingw-w64.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2

2015-10-01 Thread Mathias Fröhlich
Hi,

The series is:

Reviewed-by: Mathias Fröhlich 

Thanks for taking more care of multithread behavior!

Mathias


On Wednesday, September 30, 2015 18:59:44 Tom Stellard wrote:
> This fixes a race condition in the glx-multithreaded-shader-compile
> test.
> 
> v2:
>   - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets().
> 
> CC: "10.6 11.0" 
> ---
>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
> b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> index 5d93a4d..1a66a55 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> @@ -26,6 +26,7 @@
>  #include "radeon_llvm_emit.h"
>  #include "radeon_elf_util.h"
>  #include "c11/threads.h"
> +#include "gallivm/lp_bld_misc.h"
>  #include "util/u_memory.h"
>  #include "pipe/p_shader_tokens.h"
>  
> @@ -87,6 +88,7 @@ void radeon_llvm_shader_type(LLVMValueRef F, unsigned type)
>  
>  static void init_r600_target()
>  {
> + gallivm_init_llvm_targets();
>  #if HAVE_LLVM < 0x0307
>   LLVMInitializeR600TargetInfo();
>   LLVMInitializeR600Target();
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] List of unsupported extensions per driver

2015-10-01 Thread Romain Failliot
2015-10-01 3:33 GMT+02:00 Ian Romanick :

> The two lines that just have () are only supported on i965, nvc0, and
> radeonsi.  It's not really that clear.
>

Yep, this part I understood, once you understand it's hierarchical,
everything becomes clearer ;)

Also, a guy on GitHub helped me to understand some subtleties in the way
the drivers were supporting the hardwares, it's starting here:
https://github.com/MightyCreak/mesamatrix/issues/12#issuecomment-144483615
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: emit row_major matrix's SSBO stores only for components in writemask

2015-10-01 Thread Samuel Iglesias Gonsalvez
When writing to a column of a row-major matrix, each component of the
vector is stored to non-consecutive memory addresses, so we generate
one instruction per component.

This patch skips the disabled components in the writemask, saving some
store instructions plus avoid storing wrong data on each disabled
component.

Signed-off-by: Samuel Iglesias Gonsalvez 
---
 src/glsl/lower_ubo_reference.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/glsl/lower_ubo_reference.cpp b/src/glsl/lower_ubo_reference.cpp
index e581306..247620e 100644
--- a/src/glsl/lower_ubo_reference.cpp
+++ b/src/glsl/lower_ubo_reference.cpp
@@ -754,6 +754,12 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
 add(base_offset,
 new(mem_ctx) ir_constant(deref_offset + i * matrix_stride));
  if (is_write) {
+/* If the component is not in the writemask, then don't
+ * store any value.
+ */
+if (!((1 << i) & write_mask))
+   continue;
+
 base_ir->insert_after(ssbo_store(swizzle(deref, i, 1), 
chan_offset, 1));
  } else {
 if (!this->is_shader_storage) {
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] dri/common: drop loading /etc/drirc

2015-10-01 Thread Eero Tamminen

Hi,

On 10/01/2015 06:07 AM, Michel Dänzer wrote:

On 01.10.2015 04:35, Marek Olšák wrote:

[...]

For your issue, the only way is to remove loading ~/.drirc from Mesa,


I think that would be the proverbial "throwing out the baby with the
bathwater". :)



making driconf useless. Do we really care about driconf? I don't think
so.


~/.drirc isn't only for driconf, it's also for allowing users (who may
not have write access to .../etc/drirc) to have their own persistent DRI
configuration.


One of the original problem was users not telling / knowing that they 
have bogus drirc values when providing data to Mesa developers.


What if Mesa would store drirc settings it loads/uses & from which files 
it read them, provide API for requesting that info and "glxinfo" would 
include that info to its output?  Then one needs only to ask for 
"glxinfo" output to see relevant Mesa version and settings it uses.


This would allow you to see also bad settings from the global drirc.


Same thing should be output also by Mesa debug options and Mesa could 
output a warning about potentially stale user drirc file, if loaded user 
drirc is older than global drirc file.



- Eero

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: don't set DATA_FORMAT if ADD_TID_ENABLE is set on VI (v2)

2015-10-01 Thread Christian König

On 01.10.2015 05:44, Michel Dänzer wrote:

On 01.10.2015 04:11, Marek Olšák wrote:

From: Marek Olšák 

This can cause incorrect address calculations and hangs.

v2: do it properly

Cc: mesa-sta...@lists.freedesktop.org
Tested-and-Reviewed-by: Christian König 
---
  src/gallium/drivers/radeonsi/si_descriptors.c | 9 -
  1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index b07ab3b..b56219a 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -619,11 +619,18 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint 
shader, uint slot,
  S_008F0C_DST_SEL_Z(V_008F0C_SQ_SEL_Z) |
  S_008F0C_DST_SEL_W(V_008F0C_SQ_SEL_W) |
  S_008F0C_NUM_FORMAT(V_008F0C_BUF_NUM_FORMAT_FLOAT) |
- S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32) |
  S_008F0C_ELEMENT_SIZE(element_size) |
  S_008F0C_INDEX_STRIDE(index_stride) |
  S_008F0C_ADD_TID_ENABLE(add_tid);
  
+		/* If ADD_TID_ENABLE is set on VI, DATA_FORMAT specifies

+* STRIDE bits [14:17]
+*/
+   if (sctx->b.chip_class >= VI && add_tid)
+   desc[3] |= S_008F0C_DATA_FORMAT(stride >> 14);
+   else
+   desc[3] |= 
S_008F0C_DATA_FORMAT(V_008F0C_BUF_DATA_FORMAT_32);

The beginning of the function has:

/* The stride field in the resource descriptor has 14 bits */
assert(stride < (1 << 14));

So the if-branch is dead code in a non-release build. Would be nice if
that could be reconciled somehow, but I'm fine with doing it in a
follow-up change. Either way, this change is

Reviewed-by: Michel Dänzer 


Yeah, agree. Might be nice if someone can come up with a test for this, 
but I don't think this is absolutely necessary.


For now the patch is Reviewed-by: Christian König 

Regards,
Christian.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glx: Fix build errors with --enable-mangling (v2)

2015-10-01 Thread Emil Velikov
On 28 September 2015 at 18:59, Kyle Brenneman  wrote:
> Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up
> the renames from glx_mangle.h.
>
> Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is
> defined.
>
> v2: Add a comment clarifying why GLX_ALIAS needs two macros.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=2
> Signed-off-by: Kyle Brenneman 
> Cc: "10.6 11.0" 
> ---
>  src/glx/glxcmds.c   |  4 
>  src/glx/glxextensions.h | 10 --
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/src/glx/glxcmds.c b/src/glx/glxcmds.c
> index 26ff804..93e8db5 100644
> --- a/src/glx/glxcmds.c
> +++ b/src/glx/glxcmds.c
> @@ -2646,7 +2646,11 @@ _X_EXPORT void (*glXGetProcAddressARB(const GLubyte * 
> procName)) (void)
>   */
>  _X_EXPORT void (*glXGetProcAddress(const GLubyte * procName)) (void)
>  #if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
> +# if defined(USE_MGL_NAMESPACE)
> +   __attribute__ ((alias("mglXGetProcAddressARB")));
> +# else
> __attribute__ ((alias("glXGetProcAddressARB")));
> +# endif
>  #else
>  {
> return glXGetProcAddressARB(procName);
> diff --git a/src/glx/glxextensions.h b/src/glx/glxextensions.h
> index 90b173f..3a9bc82 100644
> --- a/src/glx/glxextensions.h
> +++ b/src/glx/glxextensions.h
> @@ -281,11 +281,17 @@ typedef void (*PFNGLXDISABLEEXTENSIONPROC) (const char 
> *name);
>  # define GLX_ALIAS_VOID(real_func, proto_args, args, aliased_func)
>  #else
>  # if defined(__GNUC__) && !defined(GLX_ALIAS_UNSUPPORTED)
> -#  define GLX_ALIAS(return_type, real_func, proto_args, args, aliased_func) \
> +/* GLX_ALIAS and GLX_ALIAS_VOID both expand to the macro GLX_ALIAS2. Using 
> the
> + * extra expansion means that the name mangling macros in glx_mangle.h will
> + * apply before stringification, so the alias attribute will have a string 
> like
> + * "mglXFoo" instead of "glXFoo". */
Hmm things are happening in quite Interesting order. On the good side,
we've been using the two layer macros in our dispatch (glapi) for a
while so things should be fully fixed now.

Reviewed-by: Emil Velikov 

Thank you for the fixes Kyle. I'll push the three patches tomorrow,
barring any objections of course.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] New/final stable-branch 10.6 candidate pushed

2015-10-01 Thread Emil Velikov
Hello list,

The candidate for the Mesa 10.6.9 is now available. Currently we have:
 - 16 queued
 - 18 nominated (outstanding)
 - and 2 rejected/obsolete patches

The present queue covers KDE crashes with i965, regression in Unreal
ournament with gallium drivers, fixes for Redway3D Flat (Demo) and
Astromenace.

Take a look at section "Mesa stable queue" for more information.


Testing
---
The following results are against piglit 246791c51ec.


Changes - classic i965(snb)
---
None.


Changes - swrast classic

None.


Changes - gallium softpipe
--
None.


Changes - gallium llvmpipe (LLVM 3.6.2)
---
None.


Testing reports/general approval

Any testing reports (or general approval of the state of the branch)
will be greatly appreciated.


Trivial merge conflicts
---
None.


The plan is to have 10.6.9 this Friday (2nd of October) or shortly after.

If you have any questions or comments that you would like to share
before the release, please go ahead.


Cheers,
Emil


Mesa stable queue
-

Nominated (18)
==

Ben Widawsky (1):
  i965/skl: Use larger URB size where available.

Boyan Ding (1):
  i915: Add XRGB format to intel_screen_make_configs

Brian Paul (1):
  configure: don't try to build gallium DRI drivers if --disable-dri is set

Emil Velikov (1):
  i965: store reference to the context within struct brw_fence

Ian Romanick (1):
  meta: Handle array textures in scaled MSAA blits

Kyle Brenneman (3):
  glx: Don't hard-code the name "libGL.so.1" in driOpenDriver (v3)
  glx: Fix build errors with --enable-mangling (v2)
  mapi: Make _glapi_get_stub work with "gl" or "mgl" prefix.

Matthew Waters (1):
  egl: rework handling EGL_CONTEXT_FLAGS

Rob Clark (1):
  xa: add xa_surface_from_handle2

Tapani Pälli (1):
  i965: fix textureGrad for cubemaps

Tom Stellard (7):
  clover: Call clBuildProgram() notification function when build completes 
v2
  gallium/drivers: Add threadsafe wrappers for pipe_context v2
  clover: Use threadsafe wrappers for pipe_context v2
  clover: Properly initialize LLVM targets when linking with component libs
  gallium/radeon: Use call_once() when initailizing LLVM targets
  gallivm: Allow drivers and state trackers to initialize gallivm LLVM 
targets v2
  radeon/llvm: Initialize gallivm targets when initializing the AMDGPU 
target v2



Queued (16)
===

Brian Paul (1):
  st/mesa: try PIPE_BIND_RENDER_TARGET when choosing float texture formats

Chris Wilson (1):
  i965: Remove early release of DRI2 miptree

Emil Velikov (3):
  docs: add sha256 checksums for 10.6.8
  cherry-ignore: add commit non applicable for 10.6
  cherry-ignore: add commit non applicable for 10.6

Iago Toral Quiroga (1):
  mesa: Fix GL_FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE for default framebuffer.

Ian Romanick (5):
  t_dd_dmatmp: Make "count" actually be the count
  t_dd_dmatmp: Clean up improper code formatting from previous patch
  t_dd_dmatmp: Use '& 3' instead of '% 4' everywhere
  t_dd_dmatmp: Pull out common 'count -= count & 3' code
  t_dd_dmatmp: Use addition instead of subtraction in loop bounds

Jeremy Huddleston (1):
  configure.ac: Add support to enable read-only text segment on x86.

Kristian Høgsberg Kristensen (1):
  i965: Respect stride and subreg_offset for ATTR registers

Leo Liu (1):
  radeon/vce: fix vui time_scale zero error

Marek Olšák (1):
  st/mesa: fix front buffer regression after dropping st_validate_state in 
Blit

Roland Scheidegger (1):
  mesa: fix mipmap generation for immutable, compressed textures


Obsolete (2)


Tom Stellard (2):
  gallivm: Allow drivers and state trackers to initialize gallivm LLVM 
targets
  radeon/llvm: Initialize gallivm targets when initializing the AMDGPU 
target
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 2/2] mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program

2015-10-01 Thread Iago Toral Quiroga
These arrays provide backends with separate index spaces for UBOS and SSBOs.
---
 src/glsl/linker.cpp | 35 +++
 src/glsl/standalone_scaffolding.cpp |  9 +
 src/mesa/main/mtypes.h  |  6 ++
 3 files changed, 50 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index e6eba94..3da773d 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -4107,6 +4107,41 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   }
}
 
+   /* Split prog->BufferInterfaceBlocks into prog->UniformBlocks and
+* prog->ShaderStorageBlocks, so that drivers that need separate index
+* spaces for each set can have that.
+*/
+   unsigned num_ubo_blocks;
+   unsigned num_ssbo_blocks;
+   num_ubo_blocks = 0;
+   num_ssbo_blocks = 0;
+   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
+  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
+ num_ssbo_blocks++;
+  else
+ num_ubo_blocks++;
+   }
+
+   prog->UniformBlocks =
+  ralloc_array(mem_ctx, gl_uniform_block *, num_ubo_blocks);
+   prog->NumUniformBlocks = 0;
+
+   prog->ShaderStorageBlocks =
+  ralloc_array(mem_ctx, gl_uniform_block *, num_ssbo_blocks);
+   prog->NumShaderStorageBlocks = 0;
+
+   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
+  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
+ prog->ShaderStorageBlocks[prog->NumShaderStorageBlocks++] =
+&prog->BufferInterfaceBlocks[i];
+  else
+ prog->UniformBlocks[prog->NumUniformBlocks++] =
+&prog->BufferInterfaceBlocks[i];
+   }
+
+   assert(prog->NumUniformBlocks + prog->NumShaderStorageBlocks ==
+  prog->NumBufferInterfaceBlocks);
+
/* FINISHME: Assign fragment shader output locations. */
 
 done:
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index 0c53589..658245f 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -102,6 +102,15 @@ _mesa_clear_shader_program_data(struct gl_shader_program 
*shProg)
ralloc_free(shProg->BufferInterfaceBlocks);
shProg->BufferInterfaceBlocks = NULL;
shProg->NumBufferInterfaceBlocks = 0;
+
+   ralloc_free(shProg->UniformBlocks);
+   shProg->UniformBlocks = NULL;
+   shProg->NumUniformBlocks = 0;
+
+   ralloc_free(shProg->ShaderStorageBlocks);
+   shProg->ShaderStorageBlocks = NULL;
+   shProg->NumShaderStorageBlocks = 0;
+
for (i = 0; i < MESA_SHADER_STAGES; i++) {
   ralloc_free(shProg->UniformBlockStageIndex[i]);
   shProg->UniformBlockStageIndex[i] = NULL;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 347da14..2362f54 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2692,6 +2692,12 @@ struct gl_shader_program
unsigned NumBufferInterfaceBlocks;
struct gl_uniform_block *BufferInterfaceBlocks;
 
+   unsigned NumUniformBlocks;
+   struct gl_uniform_block **UniformBlocks;
+
+   unsigned NumShaderStorageBlocks;
+   struct gl_uniform_block **ShaderStorageBlocks;
+
/**
 * Indices into the _LinkedShaders's UniformBlocks[] array for each stage
 * they're used in, or -1.
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC 1/2] mesa: Rename {Num}UniformBlocks to {Num}BufferInterfaceBlocks

2015-10-01 Thread Iago Toral Quiroga
Currently, these arrays in gl_shader and gl_shader_program hold both
UBOs and SSBOs, so this looks like a better name. We were already
using NumBufferInterfaceBlocks in gl_shader_program, so this makes
things more consistent as well.

In a later patch we will add {Num}UniformBlocks and
{Num}ShaderStorageBlocks which will contain only references to
UBOs and SSBOs respectively that will provide backends with
a separate index space for both types of objects.
---
 src/glsl/link_uniform_initializers.cpp   |  4 +--
 src/glsl/link_uniforms.cpp   | 22 ++--
 src/glsl/linker.cpp  | 44 
 src/glsl/lower_ubo_reference.cpp |  8 ++---
 src/glsl/standalone_scaffolding.cpp  |  4 +--
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp   |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 10 +++---
 src/mesa/main/mtypes.h   |  6 ++--
 src/mesa/main/shader_query.cpp   |  4 +--
 src/mesa/main/shaderapi.c|  4 +--
 src/mesa/main/shaderobj.c|  4 +--
 src/mesa/main/uniforms.c | 12 +++
 src/mesa/state_tracker/st_atom_constbuf.c|  4 +--
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   |  4 +--
 16 files changed, 68 insertions(+), 68 deletions(-)

diff --git a/src/glsl/link_uniform_initializers.cpp 
b/src/glsl/link_uniform_initializers.cpp
index 3483082..ab7a150 100644
--- a/src/glsl/link_uniform_initializers.cpp
+++ b/src/glsl/link_uniform_initializers.cpp
@@ -49,7 +49,7 @@ get_uniform_block_index(const gl_shader_program *shProg,
 const char *uniformBlockName)
 {
for (unsigned i = 0; i < shProg->NumBufferInterfaceBlocks; i++) {
-  if (!strcmp(shProg->UniformBlocks[i].Name, uniformBlockName))
+  if (!strcmp(shProg->BufferInterfaceBlocks[i].Name, uniformBlockName))
 return i;
}
 
@@ -169,7 +169,7 @@ set_block_binding(gl_shader_program *prog, const char 
*block_name, int binding)
 
  if (stage_index != -1) {
 struct gl_shader *sh = prog->_LinkedShaders[i];
-sh->UniformBlocks[stage_index].Binding = binding;
+sh->BufferInterfaceBlocks[stage_index].Binding = binding;
  }
   }
 }
diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 47d49c8..53b9b45 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -502,9 +502,9 @@ public:
 
 for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
if (strncmp(var->get_interface_type()->name,
-   prog->UniformBlocks[i].Name,
+   prog->BufferInterfaceBlocks[i].Name,
l) == 0
-   && prog->UniformBlocks[i].Name[l] == '[') {
+   && prog->BufferInterfaceBlocks[i].Name[l] == '[') {
   ubo_block_index = i;
   break;
}
@@ -512,7 +512,7 @@ public:
  } else {
 for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
if (strcmp(var->get_interface_type()->name,
-  prog->UniformBlocks[i].Name) == 0) {
+  prog->BufferInterfaceBlocks[i].Name) == 0) {
   ubo_block_index = i;
   break;
}
@@ -530,7 +530,7 @@ public:
 ubo_byte_offset = 0;
  } else {
 const struct gl_uniform_block *const block =
-   &prog->UniformBlocks[ubo_block_index];
+   &prog->BufferInterfaceBlocks[ubo_block_index];
 
 assert(var->data.location != -1);
 
@@ -975,10 +975,10 @@ link_update_uniform_buffer_variables(struct gl_shader 
*shader)
   }
 
   const unsigned l = strlen(var->name);
-  for (unsigned i = 0; i < shader->NumUniformBlocks; i++) {
-for (unsigned j = 0; j < shader->UniformBlocks[i].NumUniforms; j++) {
+  for (unsigned i = 0; i < shader->NumBufferInterfaceBlocks; i++) {
+for (unsigned j = 0; j < shader->BufferInterfaceBlocks[i].NumUniforms; 
j++) {
 if (sentinel) {
-   const char *begin = shader->UniformBlocks[i].Uniforms[j].Name;
+   const char *begin = 
shader->BufferInterfaceBlocks[i].Uniforms[j].Name;
const char *end = strchr(begin, sentinel);
 
if (end == NULL)
@@ -993,7 +993,7 @@ link_update_uniform_buffer_variables(struct gl_shader 
*shader)
   break;
}
 } else if (!strcmp(var->name,
-   shader->UniformBlocks[i].Uniforms[j].Name)) {
+   
shader->BufferInterfaceBlocks[i].Uniforms[j].Name)) {
   found = true;
   var->data.loc

[Mesa-dev] [RFC 0/2] Provide separate index spaces for UBOs and SSBOs

2015-10-01 Thread Iago Toral Quiroga
The compiler has a lot of code to implement UBOs that uses UniformBlocks
intensively in multiples places. When we implemented SSBOs, which need
to go pretty much through the same process, we had two options:

1) Rework all this code (and other parts of the frontend that use this
array) to accomodate a second array of objects and have all the pieces
access/index into one or the other (or both) depending on the kind of
object we are manipulating or what the code is trying to achieve.

2) Simply merge SSBOs into UniformBlocks. This lets the compiler do all
the things it needs to do for SSBOs without significant changes to the
code. This is what we implemented, since it it was less invasive and it
would not increase the complexity of code that is already complex enough
as it is. It should also make maintenance easy, since as far the compiler
is involved, we don't really have to care if we are manipulating SSBOs or
UBOs, so fixes or changes to the code will apply naturally to both, which
is probably what we want most of the time.

However, based on recent discussions with Ilia and Curro this creates an
inconvenience for drivers that expect different index spaces for both kinds
of objects, which is probably the majority since in the end the GL API
defines both as separate entities. As it is now, drivers that want separate
index spaces for UBOs and SSBOs require to remap the indices, which can be
done, but is not nice.

This 2-patch series implements a solution for this that tries to keep 
the compiler frontend untouched while providing backends with separate
index spaces. The idea is that we can have the compiler work with a
single list of blocks and do all the linking process like that.
No changes here. Then, after linking, we can create two separate arrays
for UBOs and SSBOs. Drivers that need a separate index space only need
to reference these arrays instead of the one that mixes both kinds.

I think this gives everyone what they need without major changes so
hopefully it is a good way to go for now, even if we still want to
do the bigger change in the future (not sure that we want to do that
though). Would this be sufficient?

Patch1: Renames {Num}UniformBlocks to {Num}BufferInterfaceBlocks. This is
more consistent with the current implementation.

Patch2: Adds separate UniformBlocks and ShaderStorageBlocks arrays.

Iago Toral Quiroga (2):
  mesa: Rename {Num}UniformBlocks to {Num}BufferInterfaceBlocks
  mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to
gl_shader_program

 src/glsl/link_uniform_initializers.cpp   |  4 +-
 src/glsl/link_uniforms.cpp   | 22 +++
 src/glsl/linker.cpp  | 79 +---
 src/glsl/lower_ubo_reference.cpp |  8 +--
 src/glsl/standalone_scaffolding.cpp  | 11 +++-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp   |  2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 10 +--
 src/mesa/main/mtypes.h   | 12 +++-
 src/mesa/main/shader_query.cpp   |  4 +-
 src/mesa/main/shaderapi.c|  4 +-
 src/mesa/main/shaderobj.c|  4 +-
 src/mesa/main/uniforms.c | 12 ++--
 src/mesa/state_tracker/st_atom_constbuf.c|  4 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp   |  4 +-
 16 files changed, 117 insertions(+), 67 deletions(-)

-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: emit row_major matrix's SSBO stores only for components in writemask

2015-10-01 Thread Iago Toral
On Thu, 2015-10-01 at 09:41 +0200, Samuel Iglesias Gonsalvez wrote:
> When writing to a column of a row-major matrix, each component of the
> vector is stored to non-consecutive memory addresses, so we generate
> one instruction per component.
> 
> This patch skips the disabled components in the writemask, saving some
> store instructions plus avoid storing wrong data on each disabled
> component.

Reviewed-by: Iago Toral Quiroga 

> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/glsl/lower_ubo_reference.cpp | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index e581306..247620e 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -754,6 +754,12 @@ lower_ubo_reference_visitor::emit_access(bool is_write,
>  add(base_offset,
>  new(mem_ctx) ir_constant(deref_offset + i * matrix_stride));
>   if (is_write) {
> +/* If the component is not in the writemask, then don't
> + * store any value.
> + */
> +if (!((1 << i) & write_mask))
> +   continue;
> +
>  base_ir->insert_after(ssbo_store(swizzle(deref, i, 1), 
> chan_offset, 1));
>   } else {
>  if (!this->is_shader_storage) {


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: error out if non-constant indexing of SSBO arrays with GLSL ES

2015-10-01 Thread Tapani Pälli
Fixes Piglit compiler test:
shader-storage-block-array-dynamic-indexing.frag

Signed-off-by: Tapani Pälli 
---
 src/glsl/ast_array_index.cpp | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index dfb3107..5e8f49d 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -231,15 +231,17 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
 _mesa_glsl_error(&loc, state, "unsized array index must be 
constant");
  }
   } else if (array->type->fields.array->is_interface()
- && array->variable_referenced()->data.mode == ir_var_uniform
+ && (array->variable_referenced()->data.mode == ir_var_uniform 
||
+ array->variable_referenced()->data.mode == 
ir_var_shader_storage)
  && !state->is_version(400, 0) && 
!state->ARB_gpu_shader5_enable) {
-/* Page 46 in section 4.3.7 of the OpenGL ES 3.00 spec says:
+/* Page 50 in section 4.3.9 of the OpenGL ES 3.10 spec says:
  *
- * "All indexes used to index a uniform block array must be
- * constant integral expressions."
+ * "All indices used to index a uniform or shader storage block
+ * array must be constant integral expressions."
  */
-_mesa_glsl_error(&loc, state,
- "uniform block array index must be constant");
+_mesa_glsl_error(&loc, state, "%s block array index must be constant",
+  array->variable_referenced()->data.mode
+  == ir_var_uniform ? "uniform" : "shader storage");
   } else {
 /* whole_variable_referenced can return NULL if the array is a
  * member of a structure.  In this case it is safe to not update
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: error out if non-constant indexing of SSBO arrays with GLSL ES

2015-10-01 Thread Samuel Iglesias Gonsálvez

Reviewed-by: Samuel Iglesias Gonsálvez 

On 01/10/15 13:54, Tapani Pälli wrote:
> Fixes Piglit compiler test:
>   shader-storage-block-array-dynamic-indexing.frag
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/ast_array_index.cpp | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
> index dfb3107..5e8f49d 100644
> --- a/src/glsl/ast_array_index.cpp
> +++ b/src/glsl/ast_array_index.cpp
> @@ -231,15 +231,17 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
>  _mesa_glsl_error(&loc, state, "unsized array index must be 
> constant");
>   }
>} else if (array->type->fields.array->is_interface()
> - && array->variable_referenced()->data.mode == ir_var_uniform
> + && (array->variable_referenced()->data.mode == 
> ir_var_uniform ||
> + array->variable_referenced()->data.mode == 
> ir_var_shader_storage)
>   && !state->is_version(400, 0) && 
> !state->ARB_gpu_shader5_enable) {
> -  /* Page 46 in section 4.3.7 of the OpenGL ES 3.00 spec says:
> +  /* Page 50 in section 4.3.9 of the OpenGL ES 3.10 spec says:
> *
> -   * "All indexes used to index a uniform block array must be
> -   * constant integral expressions."
> +   * "All indices used to index a uniform or shader storage block
> +   * array must be constant integral expressions."
> */
> -  _mesa_glsl_error(&loc, state,
> -   "uniform block array index must be constant");
> +  _mesa_glsl_error(&loc, state, "%s block array index must be constant",
> +  array->variable_referenced()->data.mode
> +  == ir_var_uniform ? "uniform" : "shader storage");
>} else {
>/* whole_variable_referenced can return NULL if the array is a
> * member of a structure.  In this case it is safe to not update
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span

2015-10-01 Thread Iago Toral
On Thu, 2015-10-01 at 08:28 +0300, Tapani Pälli wrote:
> Patch adds missing type (used with NV_read_depth) so that it gets
> handled correctly. Also add type to _mesa_problem output to aid
> debugging.
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/pack.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/pack.c b/src/mesa/main/pack.c
> index 7147fd6..54a0c42 100644
> --- a/src/mesa/main/pack.c
> +++ b/src/mesa/main/pack.c
> @@ -1074,6 +1074,7 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint 
> n, GLvoid *dest,
>   }
>}
>break;
> +   case GL_UNSIGNED_INT_24_8:

Is it okay to store 32-bit integers in this case? that's what the code
below does. The spec says that the 8 stencil bits are undefined, but
don't we need to convert the depth value to a 24-bit integer scale?
(i.e. make 1.0 translate to 2^24-1 not 2^32-1).

Iago

> case GL_UNSIGNED_INT:
>{
>   GLuint *dst = (GLuint *) dest;
> @@ -1124,7 +1125,8 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint 
> n, GLvoid *dest,
>}
>break;
> default:
> -  _mesa_problem(ctx, "bad type in _mesa_pack_depth_span");
> +  _mesa_problem(ctx, "bad type in _mesa_pack_depth_span (%s)",
> +_mesa_enum_to_string(dstType));
> }
>  
> free(depthCopy);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Add abs input modifier to base for POW in ffvertex_prog

2015-10-01 Thread Daniel Scharrer
The result of POW for a negative base is undefined. Even when the result
is multiplied by zero (which is the case here whenever the base is
negative), the Inf and NaNs can propagate past that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Signed-off-by: Daniel Scharrer 
Cc: "10.6 11.0" 
---
 src/mesa/main/ffvertex_prog.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c
index 95b428d..a6183b4 100644
--- a/src/mesa/main/ffvertex_prog.c
+++ b/src/mesa/main/ffvertex_prog.c
@@ -293,9 +293,10 @@ struct ureg {
GLuint file:4;
GLint idx:9;  /* relative addressing may be negative */
  /* sizeof(idx) should == sizeof(prog_src_reg::Index) */
+   GLuint abs:1;
GLuint negate:1;
GLuint swz:12;
-   GLuint pad:6;
+   GLuint pad:5;
 };
 
 
@@ -324,6 +325,7 @@ static const struct ureg undef = {
0,
0,
0,
+   0,
0
 };
 
@@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx)
struct ureg reg;
reg.file = file;
reg.idx = idx;
+   reg.abs = 0;
reg.negate = 0;
reg.swz = SWIZZLE_NOOP;
reg.pad = 0;
@@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx)
 
 
 
+static struct ureg absolute( struct ureg reg )
+{
+   reg.abs = 1;
+   reg.negate = 0;
+   return reg;
+}
+
+
 static struct ureg negate( struct ureg reg )
 {
reg.negate ^= 1;
@@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src,
src->File = reg.file;
src->Index = reg.idx;
src->Swizzle = reg.swz;
+   src->Abs = reg.abs;
src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE;
-   src->Abs = 0;
src->RelAddr = 0;
/* Check that bitfield sizes aren't exceeded */
assert(src->Index == reg.idx);
@@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct 
tnl_program *p,
 
   emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm);
   emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot);
-  emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W));
+  emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), swizzle1(attenuation, 
W));
   emit_op2(p, OPCODE_MUL, att, 0, slt, spot);
 
   release_temp(p, spot);
-- 
2.6.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] mesa: clean up #includes in ff_fragment_shader.cpp

2015-10-01 Thread Brian Paul

On 10/01/2015 01:03 AM, Tapani Pälli wrote:

There seems to be some 'extra includes', not sure if you want to start
going through them at this same time?


Right, I'm not too concerned with that now.

-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-01 Thread Rob Clark
On Wed, Sep 30, 2015 at 7:37 PM, Marek Olšák  wrote:
> On Wed, Sep 30, 2015 at 10:43 PM, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> Not actually working yet, ie. doesn't even compile yet, but an idea.
>>
>> Initial motivation was for drm_gralloc/pipe, which is essentially a sort
>> of mini state-tracker, that needs to be able to share pipe_screen with
>> libGL linked into the same process (to ensure we didn't end up with
>> duplicate pipe_resource's, etc).  I think same situation happens with
>> vdpau+gl interop.
>>
>> Currently drm_gralloc/pipe (and other state trackers??) statically link
>> the winsys code, which completely defeats the purpose magic in the
>> winsys code to detect when the same device is opened multiple times and
>> return the same pipe_screen (since you end up w/ multiple copies of the
>> hashtable in the same process).  See for example:
>>
>>   5bb41d9094b3c9bdf0669fd55418981ed83347e3
>>   fee0686c21c631d96d6042741267a3c218c23ffc
>>
>> Rough idea is that we should have something like libgallium.so which
>> contains the pipe-loader API, and then optionally (depending on
>> GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers.  The various
>> different state trackers would link against (or dlopen()) libgallium.so
>> and use the pipe-loader API to get themselves a pipe_screen (and config
>> params, etc).  And then you end up with:
>>
>> +---+
>> | l |
>> clover  --> | i |
>> | b |
>> mesa-st --> | g |
>> | a |--> pipe driver
>> vdapau  --> | l |
>> | l |
>> gralloc --> | i |
>> | u |
>> xa  --> | m |
>> |   |
>> +---+
>> or:
>> +---+-+
>> | l | |
>> clover  --> | i | |
>> | b | |
>> mesa-st --> | g | |
>> | a | pipe driver |
>> vdapau  --> | l | |
>> | l | |
>> gralloc --> | i | |
>> | u | |
>> xa  --> | m | |
>> |   | |
>> +---+-+
>>
>> depending on GALLIUM_STATIC_TARGETS.  Either way, all the state trackers
>> in the same process share a single copy of the hashtable in the winsys
>> code which allows them to share the same pipe_screen.
>>
>> I think that ends up being an extra level of library indirection vs
>> current state w/ pipe drivers, ie. with mesa dri loader stuff directly
>> loading gallium_dri.so which contains all the drivers plus mesa state
>> tracker.  If this was concerning to someone, what I'd suggest would be
>> to, for all the state trackers that already have some sort of loader
>> sitting between them and the user, just pull them directly into the
>> mega-mega libgallium.so, ie. something like:
>>
>>   +-+---+-+
>>   | |   | |
>>   | clover  | l | |
>>   | | i | |
>>   | mesa-st | b | |
>>   | | g | pipe driver |
>>   | vdapau  | a | |
>>   | | l | |
>>   +-| l | |
>> | i | |
>> gralloc --> | u | |
>> | m | |
>> xa  --> |   | |
>> |   | |
>> +---+-+
>>
>> Anyways, I'm far from an expert in the build architecture of things so
>> I might have some facts wrong or be a bit confused.  And I'll probably
>> regret bringing the subject up.  But somehow or another it seems like
>> it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS
>> ifdeffery spread throughout all the different state trackers, and (b)
>> have the pipe driver[*] only exist once per process rather than once per
>> state-tracker.  Especially for android where each process using GL will
>> have both gralloc and mesa-st.. and perhaps even clover/omx/etc.
>
> Radeon and amdgpu already have one pipe_screen per process per fd. It
> works like this:
>
> - The GL, VDPAU, OMX etc. libs each ship its own copy of the whole
> driver, so we have multiple screens and multiple winsyses.
>
> - All libs export "radeon_drm_winsys_create", which is this function:
> struct radeon_winsys *radeon_drm_winsys_create(
>int fd, radeon_screen_create_t screen_create);
>
> - The first lib that is loaded registers "radeon_drm_winsys_create",
> all other libs loaded later can't override it. Therefore, if any lib
> calls radeon_drm_winsys_create, the version from the first loaded lib
> is called.
>
> - The first caller passes the screen_create function to
> radeon_drm_winsys_create, the screen is created only once for each
> input fd in that function along with the winsys. When the winsys is
> returned, radeon_winsys::screen is the returned screen.
>
> - Any oth

Re: [Mesa-dev] [PATCH 18/18] t_dd_dmatmp: Use 'X &= ~N' instead of 'X -= X & N'

2015-10-01 Thread Brian Paul

On 09/30/2015 02:58 PM, Ian Romanick wrote:

From: Ian Romanick 

Signed-off-by: Ian Romanick 
Suggested-by: Brian Paul 
---
  src/mesa/tnl_dd/t_dd_dmatmp.h | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/tnl_dd/t_dd_dmatmp.h b/src/mesa/tnl_dd/t_dd_dmatmp.h
index 4eff883..3a30142 100644
--- a/src/mesa/tnl_dd/t_dd_dmatmp.h
+++ b/src/mesa/tnl_dd/t_dd_dmatmp.h
@@ -96,7 +96,7 @@ static void TAG(render_lines_verts)(struct gl_context *ctx,

 /* Emit whole number of lines in total and in each buffer:
  */
-   count -= count & 1;
+   count &= ~1;

 unsigned currentsz = GET_CURRENT_VB_MAX_VERTS() & ~1;
 if (currentsz < 8)
@@ -229,7 +229,7 @@ static void TAG(render_tri_strip_verts)(struct gl_context 
*ctx,

 /* From here on emit even numbers of tris when wrapping over buffers:
  */
-   currentsz -= (currentsz & 1);
+   currentsz &= ~1;

 unsigned nr;
 for (unsigned j = 0; j + 2 < count; j += nr - 2) {
@@ -322,7 +322,7 @@ static void TAG(render_quad_strip_verts)(struct gl_context 
*ctx,

/* Emit whole number of quads in total, and in each buffer.
 */
-  count -= count & 1;
+  count &= ~1;

unsigned currentsz = GET_CURRENT_VB_MAX_VERTS() & ~1;
if (currentsz < 8)
@@ -352,7 +352,7 @@ static void TAG(render_quads_verts)(struct gl_context *ctx,
LOCAL_VARS;

/* Emit whole number of quads in total. */
-  count -= count & 3;
+  count &= ~3;

/* Hardware doesn't have a quad primitive type -- try to simulate it 
using
 * triangle primitive.  This is a win for gears, but is it useful in the



Reviewed-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-01 Thread Emil Velikov
Hi Rob,

I believe that Marek managed to nicely explain how things work.
Ideally you'll be able to get on the same boat (i.e. the Android C
runtime/linker won't give you the finger). Regardless some comments
inline.

On 30 September 2015 at 21:43, Rob Clark  wrote:
> From: Rob Clark 
>
> Not actually working yet, ie. doesn't even compile yet, but an idea.
>
> Initial motivation was for drm_gralloc/pipe, which is essentially a sort
> of mini state-tracker, that needs to be able to share pipe_screen with
> libGL linked into the same process (to ensure we didn't end up with
> duplicate pipe_resource's, etc).  I think same situation happens with
> vdpau+gl interop.
>
> Currently drm_gralloc/pipe (and other state trackers??) statically link
> the winsys code, which completely defeats the purpose magic in the
> winsys code to detect when the same device is opened multiple times and
> return the same pipe_screen (since you end up w/ multiple copies of the
> hashtable in the same process).  See for example:
>
>   5bb41d9094b3c9bdf0669fd55418981ed83347e3
>   fee0686c21c631d96d6042741267a3c218c23ffc
>
> Rough idea is that we should have something like libgallium.so which
> contains the pipe-loader API, and then optionally (depending on
> GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers.  The various
> different state trackers would link against (or dlopen()) libgallium.so
> and use the pipe-loader API to get themselves a pipe_screen (and config
> params, etc).  And then you end up with:
>
> +---+
> | l |
> clover  --> | i |
> | b |
> mesa-st --> | g |
> | a |--> pipe driver
> vdapau  --> | l |
> | l |
> gralloc --> | i |
> | u |
> xa  --> | m |
> |   |
> +---+
> or:
> +---+-+
> | l | |
> clover  --> | i | |
> | b | |
> mesa-st --> | g | |
> | a | pipe driver |
> vdapau  --> | l | |
> | l | |
> gralloc --> | i | |
> | u | |
> xa  --> | m | |
> |   | |
> +---+-+
>
> depending on GALLIUM_STATIC_TARGETS.  Either way, all the state trackers
> in the same process share a single copy of the hashtable in the winsys
> code which allows them to share the same pipe_screen.
>
As others I'm not too excited about adding new library/ies. Amongst
others, It boils down to people mix'n'matching things, causing all
sorts of problems.

> I think that ends up being an extra level of library indirection vs
> current state w/ pipe drivers, ie. with mesa dri loader stuff directly
> loading gallium_dri.so which contains all the drivers plus mesa state
> tracker.  If this was concerning to someone, what I'd suggest would be
> to, for all the state trackers that already have some sort of loader
> sitting between them and the user, just pull them directly into the
> mega-mega libgallium.so, ie. something like:
>
>   +-+---+-+
>   | |   | |
>   | clover  | l | |
>   | | i | |
>   | mesa-st | b | |
>   | | g | pipe driver |
>   | vdapau  | a | |
>   | | l | |
>   +-| l | |
> | i | |
> gralloc --> | u | |
> | m | |
> xa  --> |   | |
> |   | |
> +---+-+
>
This won't work due to funny dependencies. I.e. you don't want to
force xcb and friends onto the dri module, clover etc, just because
vdpau needs it.

> Anyways, I'm far from an expert in the build architecture of things so
> I might have some facts wrong or be a bit confused.  And I'll probably
> regret bringing the subject up.  But somehow or another it seems like
> it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS
> ifdeffery spread throughout all the different state trackers,
Sure that's always great, yet it is somewhat orthogonal to the
original issue ;-)

> and (b)
> have the pipe driver[*] only exist once per process rather than once per
> state-tracker.  Especially for android where each process using GL will
> have both gralloc and mesa-st.. and perhaps even clover/omx/etc.
>
Speaking of Android, have you tried the dynamic-list approach ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-01 Thread Marek Olšák
On Thu, Oct 1, 2015 at 3:53 PM, Rob Clark  wrote:
> On Wed, Sep 30, 2015 at 7:37 PM, Marek Olšák  wrote:
>> On Wed, Sep 30, 2015 at 10:43 PM, Rob Clark  wrote:
>>> From: Rob Clark 
>>>
>>> Not actually working yet, ie. doesn't even compile yet, but an idea.
>>>
>>> Initial motivation was for drm_gralloc/pipe, which is essentially a sort
>>> of mini state-tracker, that needs to be able to share pipe_screen with
>>> libGL linked into the same process (to ensure we didn't end up with
>>> duplicate pipe_resource's, etc).  I think same situation happens with
>>> vdpau+gl interop.
>>>
>>> Currently drm_gralloc/pipe (and other state trackers??) statically link
>>> the winsys code, which completely defeats the purpose magic in the
>>> winsys code to detect when the same device is opened multiple times and
>>> return the same pipe_screen (since you end up w/ multiple copies of the
>>> hashtable in the same process).  See for example:
>>>
>>>   5bb41d9094b3c9bdf0669fd55418981ed83347e3
>>>   fee0686c21c631d96d6042741267a3c218c23ffc
>>>
>>> Rough idea is that we should have something like libgallium.so which
>>> contains the pipe-loader API, and then optionally (depending on
>>> GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers.  The various
>>> different state trackers would link against (or dlopen()) libgallium.so
>>> and use the pipe-loader API to get themselves a pipe_screen (and config
>>> params, etc).  And then you end up with:
>>>
>>> +---+
>>> | l |
>>> clover  --> | i |
>>> | b |
>>> mesa-st --> | g |
>>> | a |--> pipe driver
>>> vdapau  --> | l |
>>> | l |
>>> gralloc --> | i |
>>> | u |
>>> xa  --> | m |
>>> |   |
>>> +---+
>>> or:
>>> +---+-+
>>> | l | |
>>> clover  --> | i | |
>>> | b | |
>>> mesa-st --> | g | |
>>> | a | pipe driver |
>>> vdapau  --> | l | |
>>> | l | |
>>> gralloc --> | i | |
>>> | u | |
>>> xa  --> | m | |
>>> |   | |
>>> +---+-+
>>>
>>> depending on GALLIUM_STATIC_TARGETS.  Either way, all the state trackers
>>> in the same process share a single copy of the hashtable in the winsys
>>> code which allows them to share the same pipe_screen.
>>>
>>> I think that ends up being an extra level of library indirection vs
>>> current state w/ pipe drivers, ie. with mesa dri loader stuff directly
>>> loading gallium_dri.so which contains all the drivers plus mesa state
>>> tracker.  If this was concerning to someone, what I'd suggest would be
>>> to, for all the state trackers that already have some sort of loader
>>> sitting between them and the user, just pull them directly into the
>>> mega-mega libgallium.so, ie. something like:
>>>
>>>   +-+---+-+
>>>   | |   | |
>>>   | clover  | l | |
>>>   | | i | |
>>>   | mesa-st | b | |
>>>   | | g | pipe driver |
>>>   | vdapau  | a | |
>>>   | | l | |
>>>   +-| l | |
>>> | i | |
>>> gralloc --> | u | |
>>> | m | |
>>> xa  --> |   | |
>>> |   | |
>>> +---+-+
>>>
>>> Anyways, I'm far from an expert in the build architecture of things so
>>> I might have some facts wrong or be a bit confused.  And I'll probably
>>> regret bringing the subject up.  But somehow or another it seems like
>>> it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS
>>> ifdeffery spread throughout all the different state trackers, and (b)
>>> have the pipe driver[*] only exist once per process rather than once per
>>> state-tracker.  Especially for android where each process using GL will
>>> have both gralloc and mesa-st.. and perhaps even clover/omx/etc.
>>
>> Radeon and amdgpu already have one pipe_screen per process per fd. It
>> works like this:
>>
>> - The GL, VDPAU, OMX etc. libs each ship its own copy of the whole
>> driver, so we have multiple screens and multiple winsyses.
>>
>> - All libs export "radeon_drm_winsys_create", which is this function:
>> struct radeon_winsys *radeon_drm_winsys_create(
>>int fd, radeon_screen_create_t screen_create);
>>
>> - The first lib that is loaded registers "radeon_drm_winsys_create",
>> all other libs loaded later can't override it. Therefore, if any lib
>> calls radeon_drm_winsys_create, the version from the first loaded lib
>> is called.
>>
>> - The first caller passes the screen_create function to
>> radeon_drm_winsys_create, the screen is created

Re: [Mesa-dev] [PATCH 3/3] radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2

2015-10-01 Thread Emil Velikov
On 30 September 2015 at 19:59, Tom Stellard  wrote:
> This fixes a race condition in the glx-multithreaded-shader-compile
> test.
>
> v2:
>   - Replace gallivm_init_llvm_{begin,end}() with gallivm_init_llvm_targets().
>
Thanks for sticking with my suggestion Tom.

Fwiw the series is:
Reviewed-by: Emil Velikov 

Can you please confirm if we need the threadsafe wrappers, with this
series in place ? If we do, humble ping :)

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/11] i965: Pull stage_prog_data.nr_params out of the NIR shader

2015-10-01 Thread Iago Toral
On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> Previously, we had a bunch of code in each stage to figure out how many
> slots we needed in stage_prog_data.param.  This code was mostly identical
> across the stages and had been copied and pasted around.  Unfortunately,
> this meant that any time you did something special, you had to add code for
> it to each of these places.  In particular, none of the stages took
> subroutines into account; they were working entirely by accident.  By
> taking this data from the NIR shader, we know the exact number of entries
> we need and everything goes a bit smoother.
> ---
>  src/mesa/drivers/dri/i965/brw_cs.c |  4 ++--
>  src/mesa/drivers/dri/i965/brw_gs.c |  5 ++---
>  src/mesa/drivers/dri/i965/brw_vs.c | 16 
>  src/mesa/drivers/dri/i965/brw_wm.c | 10 +++---
>  4 files changed, 11 insertions(+), 24 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
> b/src/mesa/drivers/dri/i965/brw_cs.c
> index 02eeeda..24120fb 100644
> --- a/src/mesa/drivers/dri/i965/brw_cs.c
> +++ b/src/mesa/drivers/dri/i965/brw_cs.c
> @@ -30,6 +30,7 @@
>  #include "intel_mipmap_tree.h"
>  #include "brw_state.h"
>  #include "intel_batchbuffer.h"
> +#include "glsl/nir/nir.h"
>  
>  static bool
>  brw_codegen_cs_prog(struct brw_context *brw,
> @@ -55,8 +56,7 @@ brw_codegen_cs_prog(struct brw_context *brw,
>  * prog_data associated with the compiled program, and which will be freed
>  * by the state cache.
>  */
> -   int param_count = cs->base.num_uniform_components +
> - cs->base.NumImages * BRW_IMAGE_PARAM_SIZE;
> +   int param_count = cp->program.Base.nir->num_uniforms;
>  
> /* The backend also sometimes adds params for texture size. */
> param_count += 2 * 
> ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index 61e7b2a..0cf7ec8 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -32,6 +32,7 @@
>  #include "brw_vec4_gs_visitor.h"
>  #include "brw_state.h"
>  #include "brw_ff_gs.h"
> +#include "glsl/nir/nir.h"
>  
> 
>  bool
> @@ -60,9 +61,7 @@ brw_codegen_gs_prog(struct brw_context *brw,
>  * every uniform is a float which gets padded to the size of a vec4.
>  */
> struct gl_shader *gs = prog->_LinkedShaders[MESA_SHADER_GEOMETRY];
> -   int param_count = gs->num_uniform_components * 4;
> -
> -   param_count += gs->NumImages * BRW_IMAGE_PARAM_SIZE;
> +   int param_count = gp->program.Base.nir->num_uniforms * 4;

I think the vec4 nir backend does not handle image uniforms at the
moment, does it? At least I see that the FS backend has code
specifically for that in fs_visitor::nir_setup_uniform. Not sure if we
support images in geometry stages though, but the code you remove seems
to account for that...

> c.prog_data.base.base.param =
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
> b/src/mesa/drivers/dri/i965/brw_vs.c
> index e1a0d9c..391411c 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -109,18 +109,10 @@ brw_codegen_vs_prog(struct brw_context *brw,
>  * prog_data associated with the compiled program, and which will be freed
>  * by the state cache.
>  */
> -   int param_count;
> -   if (vs) {
> -  /* We add padding around uniform values below vec4 size, with the worst
> -   * case being a float value that gets blown up to a vec4, so be
> -   * conservative here.
> -   */
> -  param_count = vs->base.num_uniform_components * 4 +
> -vs->base.NumImages * BRW_IMAGE_PARAM_SIZE;
> -  stage_prog_data->nr_image_params = vs->base.NumImages;
> -   } else {
> -  param_count = vp->program.Base.Parameters->NumParameters * 4;
> -   }
> +   int param_count = vp->program.Base.nir->num_uniforms;
> +   if (!brw->intelScreen->compiler->scalar_vs)
> +  param_count *= 4;

Same thing here.

Also, I guess this also means that for the scalar_vs cases we were
computing a larger param_count than we really should, right?

> /* vec4_visitor::setup_uniform_clipplane_values() also uploads user clip
>  * planes as uniforms.
>  */
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
> b/src/mesa/drivers/dri/i965/brw_wm.c
> index cc97d6a..08f2416 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -35,6 +35,7 @@
>  #include "program/prog_parameter.h"
>  #include "program/program.h"
>  #include "intel_mipmap_tree.h"
> +#include "glsl/nir/nir.h"
>  
>  #include "util/ralloc.h"
>  
> @@ -173,14 +174,9 @@ brw_codegen_wm_prog(struct brw_context *brw,
>  * prog_data associated with the compiled program, and which will be freed
>  * by the state cache.
>  */
> -   int param_count;
> -   if (fs) {
> -  param_count = fs->base.num

Re: [Mesa-dev] [PATCH 01/11] glsl/types: Make subroutine types have a single matrix column

2015-10-01 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> That way, if we do the usual thing of multiplying vector_elements by
> matrix_columns we get the actual number of components in the type as per
> component_slots().
> 
> While we're at it, we also switch to using the actual C++ field
> initializers for vector_elements and matrix_columns.
> ---
>  src/glsl/glsl_types.cpp | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
> index 8586b2e..25927f6 100644
> --- a/src/glsl/glsl_types.cpp
> +++ b/src/glsl/glsl_types.cpp
> @@ -172,7 +172,7 @@ glsl_type::glsl_type(const char *subroutine_name) :
> base_type(GLSL_TYPE_SUBROUTINE),
> sampler_dimensionality(0), sampler_shadow(0), sampler_array(0),
> sampler_type(0), interface_packing(0),
> -   vector_elements(0), matrix_columns(0),
> +   vector_elements(1), matrix_columns(1),
> length(0)
>  {
> mtx_lock(&glsl_type::mutex);
> @@ -180,7 +180,6 @@ glsl_type::glsl_type(const char *subroutine_name) :
> init_ralloc_type_ctx();
> assert(subroutine_name != NULL);
> this->name = ralloc_strdup(this->mem_ctx, subroutine_name);
> -   this->vector_elements = 1;
> mtx_unlock(&glsl_type::mutex);
>  }
>  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/11] i965/vs: Move lazy NIR creation to codegen_vs_prog

2015-10-01 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> The next commit will add code to codegen_vs_prog that requires the NIR
> shader to be there in all cases.  It doesn't hurt anything to just move it
> from brw_vs_emit to its only caller.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 12 
>  src/mesa/drivers/dri/i965/brw_vs.c | 13 +
>  2 files changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 056ce39..407698f 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1953,18 +1953,6 @@ brw_vs_emit(struct brw_context *brw,
> if (unlikely(INTEL_DEBUG & DEBUG_VS))
>brw_dump_ir("vertex", prog, &shader->base, &vp->Base);
>  
> -   if (!vp->Base.nir) {
> -  /* Normally we generate NIR in LinkShader() or
> -   * ProgramStringNotify(), but Mesa's fixed-function vertex program
> -   * handling doesn't notify the driver at all.  Just do it here, at
> -   * the last minute, even though it's lame.
> -   */
> -  assert(vp->Base.Id == 0 && prog == NULL);
> -  vp->Base.nir =
> - brw_create_nir(brw, NULL, &vp->Base, MESA_SHADER_VERTEX,
> -brw->intelScreen->compiler->scalar_vs);
> -   }
> -
> if (brw->intelScreen->compiler->scalar_vs) {
>prog_data->base.dispatch_mode = DISPATCH_MODE_SIMD8;
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
> b/src/mesa/drivers/dri/i965/brw_vs.c
> index 0a348a5..e1a0d9c 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -37,6 +37,7 @@
>  #include "brw_state.h"
>  #include "program/prog_print.h"
>  #include "program/prog_parameter.h"
> +#include "brw_nir.h"
>  
>  #include "util/ralloc.h"
>  
> @@ -81,6 +82,18 @@ brw_codegen_vs_prog(struct brw_context *brw,
> bool start_busy = false;
> double start_time = 0;
>  
> +   if (!vp->program.Base.nir) {
> +  /* Normally we generate NIR in LinkShader() or
> +   * ProgramStringNotify(), but Mesa's fixed-function vertex program
> +   * handling doesn't notify the driver at all.  Just do it here, at
> +   * the last minute, even though it's lame.
> +   */
> +  assert(vp->program.Base.Id == 0 && prog == NULL);
> +  vp->program.Base.nir =
> + brw_create_nir(brw, NULL, &vp->program.Base, MESA_SHADER_VERTEX,
> +brw->intelScreen->compiler->scalar_vs);
> +   }
> +
> if (prog)
>vs = (struct brw_shader *) prog->_LinkedShaders[MESA_SHADER_VERTEX];
>  


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/11] i965: Pull stage_prog_data.nr_params out of the NIR shader

2015-10-01 Thread Jason Ekstrand
On Thu, Oct 1, 2015 at 7:52 AM, Iago Toral  wrote:
> On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
>> Previously, we had a bunch of code in each stage to figure out how many
>> slots we needed in stage_prog_data.param.  This code was mostly identical
>> across the stages and had been copied and pasted around.  Unfortunately,
>> this meant that any time you did something special, you had to add code for
>> it to each of these places.  In particular, none of the stages took
>> subroutines into account; they were working entirely by accident.  By
>> taking this data from the NIR shader, we know the exact number of entries
>> we need and everything goes a bit smoother.
>> ---
>>  src/mesa/drivers/dri/i965/brw_cs.c |  4 ++--
>>  src/mesa/drivers/dri/i965/brw_gs.c |  5 ++---
>>  src/mesa/drivers/dri/i965/brw_vs.c | 16 
>>  src/mesa/drivers/dri/i965/brw_wm.c | 10 +++---
>>  4 files changed, 11 insertions(+), 24 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
>> b/src/mesa/drivers/dri/i965/brw_cs.c
>> index 02eeeda..24120fb 100644
>> --- a/src/mesa/drivers/dri/i965/brw_cs.c
>> +++ b/src/mesa/drivers/dri/i965/brw_cs.c
>> @@ -30,6 +30,7 @@
>>  #include "intel_mipmap_tree.h"
>>  #include "brw_state.h"
>>  #include "intel_batchbuffer.h"
>> +#include "glsl/nir/nir.h"
>>
>>  static bool
>>  brw_codegen_cs_prog(struct brw_context *brw,
>> @@ -55,8 +56,7 @@ brw_codegen_cs_prog(struct brw_context *brw,
>>  * prog_data associated with the compiled program, and which will be 
>> freed
>>  * by the state cache.
>>  */
>> -   int param_count = cs->base.num_uniform_components +
>> - cs->base.NumImages * BRW_IMAGE_PARAM_SIZE;
>> +   int param_count = cp->program.Base.nir->num_uniforms;
>>
>> /* The backend also sometimes adds params for texture size. */
>> param_count += 2 * 
>> ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
>> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
>> b/src/mesa/drivers/dri/i965/brw_gs.c
>> index 61e7b2a..0cf7ec8 100644
>> --- a/src/mesa/drivers/dri/i965/brw_gs.c
>> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
>> @@ -32,6 +32,7 @@
>>  #include "brw_vec4_gs_visitor.h"
>>  #include "brw_state.h"
>>  #include "brw_ff_gs.h"
>> +#include "glsl/nir/nir.h"
>>
>>
>>  bool
>> @@ -60,9 +61,7 @@ brw_codegen_gs_prog(struct brw_context *brw,
>>  * every uniform is a float which gets padded to the size of a vec4.
>>  */
>> struct gl_shader *gs = prog->_LinkedShaders[MESA_SHADER_GEOMETRY];
>> -   int param_count = gs->num_uniform_components * 4;
>> -
>> -   param_count += gs->NumImages * BRW_IMAGE_PARAM_SIZE;
>> +   int param_count = gp->program.Base.nir->num_uniforms * 4;
>
> I think the vec4 nir backend does not handle image uniforms at the
> moment, does it? At least I see that the FS backend has code
> specifically for that in fs_visitor::nir_setup_uniform. Not sure if we
> support images in geometry stages though, but the code you remove seems
> to account for that...

We don't.  When Curro initially implemented it, he did have vec4
support.  However, that was dropped in favor of doing things
differently in the fs compiler.  It may yet come back though.  This
code was probably merged a bit speculatively but doesn't really hurt
anything.

>> c.prog_data.base.base.param =
>>rzalloc_array(NULL, const gl_constant_value *, param_count);
>> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
>> b/src/mesa/drivers/dri/i965/brw_vs.c
>> index e1a0d9c..391411c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_vs.c
>> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
>> @@ -109,18 +109,10 @@ brw_codegen_vs_prog(struct brw_context *brw,
>>  * prog_data associated with the compiled program, and which will be 
>> freed
>>  * by the state cache.
>>  */
>> -   int param_count;
>> -   if (vs) {
>> -  /* We add padding around uniform values below vec4 size, with the 
>> worst
>> -   * case being a float value that gets blown up to a vec4, so be
>> -   * conservative here.
>> -   */
>> -  param_count = vs->base.num_uniform_components * 4 +
>> -vs->base.NumImages * BRW_IMAGE_PARAM_SIZE;
>> -  stage_prog_data->nr_image_params = vs->base.NumImages;
>> -   } else {
>> -  param_count = vp->program.Base.Parameters->NumParameters * 4;
>> -   }
>> +   int param_count = vp->program.Base.nir->num_uniforms;
>> +   if (!brw->intelScreen->compiler->scalar_vs)
>> +  param_count *= 4;
>
> Same thing here.
>
> Also, I guess this also means that for the scalar_vs cases we were
> computing a larger param_count than we really should, right?

They were, but it had nothing to do with images.  It only had to do
with the fact that we were multiplying num_uniform_components by 4
which means that if your uniforms were all a bunch of floats, it would
be 4x the size needed.

>> /* vec4_visitor::setup_uniform_clipplane_values() also uploads user clip
>>  * planes as uniforms.
>>

Re: [Mesa-dev] [PATCH 03/11] i965: Pull stage_prog_data.nr_params out of the NIR shader

2015-10-01 Thread Iago Toral
On Thu, 2015-10-01 at 07:58 -0700, Jason Ekstrand wrote:
> On Thu, Oct 1, 2015 at 7:52 AM, Iago Toral  wrote:
> > On Wed, 2015-09-30 at 18:41 -0700, Jason Ekstrand wrote:
> >> Previously, we had a bunch of code in each stage to figure out how many
> >> slots we needed in stage_prog_data.param.  This code was mostly identical
> >> across the stages and had been copied and pasted around.  Unfortunately,
> >> this meant that any time you did something special, you had to add code for
> >> it to each of these places.  In particular, none of the stages took
> >> subroutines into account; they were working entirely by accident.  By
> >> taking this data from the NIR shader, we know the exact number of entries
> >> we need and everything goes a bit smoother.
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_cs.c |  4 ++--
> >>  src/mesa/drivers/dri/i965/brw_gs.c |  5 ++---
> >>  src/mesa/drivers/dri/i965/brw_vs.c | 16 
> >>  src/mesa/drivers/dri/i965/brw_wm.c | 10 +++---
> >>  4 files changed, 11 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
> >> b/src/mesa/drivers/dri/i965/brw_cs.c
> >> index 02eeeda..24120fb 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_cs.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_cs.c
> >> @@ -30,6 +30,7 @@
> >>  #include "intel_mipmap_tree.h"
> >>  #include "brw_state.h"
> >>  #include "intel_batchbuffer.h"
> >> +#include "glsl/nir/nir.h"
> >>
> >>  static bool
> >>  brw_codegen_cs_prog(struct brw_context *brw,
> >> @@ -55,8 +56,7 @@ brw_codegen_cs_prog(struct brw_context *brw,
> >>  * prog_data associated with the compiled program, and which will be 
> >> freed
> >>  * by the state cache.
> >>  */
> >> -   int param_count = cs->base.num_uniform_components +
> >> - cs->base.NumImages * BRW_IMAGE_PARAM_SIZE;
> >> +   int param_count = cp->program.Base.nir->num_uniforms;
> >>
> >> /* The backend also sometimes adds params for texture size. */
> >> param_count += 2 * 
> >> ctx->Const.Program[MESA_SHADER_COMPUTE].MaxTextureImageUnits;
> >> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> >> b/src/mesa/drivers/dri/i965/brw_gs.c
> >> index 61e7b2a..0cf7ec8 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> >> @@ -32,6 +32,7 @@
> >>  #include "brw_vec4_gs_visitor.h"
> >>  #include "brw_state.h"
> >>  #include "brw_ff_gs.h"
> >> +#include "glsl/nir/nir.h"
> >>
> >>
> >>  bool
> >> @@ -60,9 +61,7 @@ brw_codegen_gs_prog(struct brw_context *brw,
> >>  * every uniform is a float which gets padded to the size of a vec4.
> >>  */
> >> struct gl_shader *gs = prog->_LinkedShaders[MESA_SHADER_GEOMETRY];
> >> -   int param_count = gs->num_uniform_components * 4;
> >> -
> >> -   param_count += gs->NumImages * BRW_IMAGE_PARAM_SIZE;
> >> +   int param_count = gp->program.Base.nir->num_uniforms * 4;
> >
> > I think the vec4 nir backend does not handle image uniforms at the
> > moment, does it? At least I see that the FS backend has code
> > specifically for that in fs_visitor::nir_setup_uniform. Not sure if we
> > support images in geometry stages though, but the code you remove seems
> > to account for that...
> 
> We don't.  When Curro initially implemented it, he did have vec4
> support.  However, that was dropped in favor of doing things
> differently in the fs compiler.  It may yet come back though.  This
> code was probably merged a bit speculatively but doesn't really hurt
> anything.

Yeah, that's what I recall too, I was surprised to find image stuff here
and thought that maybe I missed something.

> >> c.prog_data.base.base.param =
> >>rzalloc_array(NULL, const gl_constant_value *, param_count);
> >> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
> >> b/src/mesa/drivers/dri/i965/brw_vs.c
> >> index e1a0d9c..391411c 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> >> @@ -109,18 +109,10 @@ brw_codegen_vs_prog(struct brw_context *brw,
> >>  * prog_data associated with the compiled program, and which will be 
> >> freed
> >>  * by the state cache.
> >>  */
> >> -   int param_count;
> >> -   if (vs) {
> >> -  /* We add padding around uniform values below vec4 size, with the 
> >> worst
> >> -   * case being a float value that gets blown up to a vec4, so be
> >> -   * conservative here.
> >> -   */
> >> -  param_count = vs->base.num_uniform_components * 4 +
> >> -vs->base.NumImages * BRW_IMAGE_PARAM_SIZE;
> >> -  stage_prog_data->nr_image_params = vs->base.NumImages;
> >> -   } else {
> >> -  param_count = vp->program.Base.Parameters->NumParameters * 4;
> >> -   }
> >> +   int param_count = vp->program.Base.nir->num_uniforms;
> >> +   if (!brw->intelScreen->compiler->scalar_vs)
> >> +  param_count *= 4;
> >
> > Same thing here.
> >
> > Also, I guess this also means that for the scalar_vs cases we were
> > computing a la

Re: [Mesa-dev] [PATCH v2 4/7] nir: add an instruction set API

2015-10-01 Thread Connor Abbott
On Wed, Sep 30, 2015 at 12:35 PM, Jason Ekstrand  wrote:
> On Wed, Sep 30, 2015 at 8:11 AM, Connor Abbott  wrote:
>> This will replace direct usage of nir_instrs_equal() in the CSE pass,
>> which reduces an O(n^2) algorithm with an effectively O(n) one. It'll
>> also be useful for implementing GVN on top of GCM.
>>
>> v2:
>> - Add texture support.
>> - Add more comments.
>> - Rename instr_can_hash() to instr_can_rewrite() since it's really more
>> about whether its uses can be rewritten, and it's implicitly used by
>> nir_instrs_equal() as well.
>> - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason).
>> - Make the HASH() macro less magical (Topi).
>> - Rewrite the commit message.
>>
>> Signed-off-by: Connor Abbott 
>> ---
>>  src/glsl/nir/nir_instr_set.c | 311 
>> +++
>>  src/glsl/nir/nir_instr_set.h |  35 +
>>  2 files changed, 346 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir_instr_set.c b/src/glsl/nir/nir_instr_set.c
>> index 72ab048..a6f2226 100644
>> --- a/src/glsl/nir/nir_instr_set.c
>> +++ b/src/glsl/nir/nir_instr_set.c
>> @@ -23,6 +23,178 @@
>>
>>  #include "nir_instr_set.h"
>>
>> +#define HASH(hash, data) _mesa_fnv32_1a_accumulate((hash), (data))
>> +
>> +static uint32_t
>> +hash_src(uint32_t hash, const nir_src *src)
>> +{
>> +   assert(src->is_ssa);
>> +   hash = HASH(hash, src->ssa);
>> +   return hash;
>> +}
>> +
>> +static uint32_t
>> +hash_alu_src(uint32_t hash, const nir_alu_src *src, unsigned num_components)
>> +{
>> +   hash = HASH(hash, src->abs);
>> +   hash = HASH(hash, src->negate);
>> +
>> +   for (unsigned i = 0; i < num_components; i++)
>> +  hash = HASH(hash, src->swizzle[i]);
>> +
>> +   hash = hash_src(hash, &src->src);
>> +   return hash;
>> +}
>> +
>> +static uint32_t
>> +hash_alu(uint32_t hash, const nir_alu_instr *instr)
>> +{
>> +   hash = HASH(hash, instr->op);
>> +   hash = HASH(hash, instr->dest.dest.ssa.num_components);
>
> Why are you hasing the dest.ssa.num_components instead of the
> writemask?  They should be the same in any case, but this seems a bit
> out-of-place.

No particular reason. I can change it if you want.

>
>> +
>> +   if (nir_op_infos[instr->op].algebraic_properties & 
>> NIR_OP_IS_COMMUTATIVE) {
>> +  assert(nir_op_infos[instr->op].num_inputs == 2);
>> +  uint32_t hash0 = hash_alu_src(hash, &instr->src[0],
>> +nir_ssa_alu_instr_src_components(instr, 
>> 0));
>> +  uint32_t hash1 = hash_alu_src(hash, &instr->src[1],
>> +nir_ssa_alu_instr_src_components(instr, 
>> 1));
>> +  /* For commutative operations, we need some commutative way of
>> +   * combining the hashes.  One option would be to XOR them but that
>> +   * means that anything with two identical sources will hash to 0 and
>> +   * that's common enough we probably don't want the guaranteed
>> +   * collision.  Either addition or multiplication will also work.
>> +   */
>> +  hash = hash0 * hash1;
>> +   } else {
>> +  for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) {
>> + hash = hash_alu_src(hash, &instr->src[i],
>> + nir_ssa_alu_instr_src_components(instr, i));
>> +  }
>> +   }
>> +
>> +   return hash;
>> +}
>> +
>> +static uint32_t
>> +hash_load_const(uint32_t hash, const nir_load_const_instr *instr)
>> +{
>> +   hash = HASH(hash, instr->def.num_components);
>> +
>> +   hash = _mesa_fnv32_1a_accumulate_block(hash, instr->value.f,
>> +  instr->def.num_components
>> + * sizeof(instr->value.f[0]));
>> +
>> +   return hash;
>> +}
>> +
>> +static int
>> +cmp_phi_src(const void *data1, const void *data2)
>> +{
>> +   const nir_phi_src *src1 = data1, *src2 = data2;
>> +   return src1->pred->index - src2->pred->index;
>> +}
>> +
>> +static uint32_t
>> +hash_phi(uint32_t hash, const nir_phi_instr *instr)
>> +{
>> +   hash = HASH(hash, instr->instr.block);
>> +
>> +   /* sort sources by predecessor index, since the order shouldn't matter */
>> +   unsigned num_preds = instr->instr.block->predecessors->entries;
>> +   nir_phi_src *srcs = malloc(num_preds * sizeof(nir_phi_src));
>
> You could use NIR_VLA and put it on the stack. That'd probably be more
> efficient.  Also, why are you copying the entire source and not just
> making a list of pointers?

Good point. It does prevent a pointer dereference when comparing, but
that's probably less expensive than copying the entire thing when
swapping.

>
>> +   unsigned i = 0;
>> +   nir_foreach_phi_src(instr, src) {
>> +  srcs[i++] = *src;
>> +   }
>> +   qsort(srcs, num_preds, sizeof(nir_phi_src), cmp_phi_src);
>
> Sorting seems like as good a way as any.  Another option would be to
> do like we do for commutative ALU ops and multiply.  Might be more
> efficient but it probably doesn't matter too much.

Perhaps, although sorting se

[Mesa-dev] [PATCH 1/2] i965/mt: Declare some functions as static

2015-10-01 Thread Chad Versace
intel_tiling_supports_non_msrt_mcs() and
intel_miptree_is_fast_clear_capable() are not used outside of
intel_mipmap_tree.c.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 6 +-
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index ffc356c..05dc291 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -160,7 +160,7 @@ intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree 
*mt,
}
 }
 
-bool
+static bool
 intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling)
 {
/* From the Ivy Bridge PRM, Vol2 Part1 11.7 "MCS Buffer for Render
@@ -193,7 +193,7 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, 
unsigned tiling)
  * - MCS buffer for non-MSRT is supported only for RT formats 32bpp,
  *   64bpp, and 128bpp.
  */
-bool
+static bool
 intel_miptree_is_fast_clear_capable(struct brw_context *brw,
 struct intel_mipmap_tree *mt)
 {
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 486e5c6..805cd71 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -658,11 +658,7 @@ struct intel_mipmap_tree
 void
 intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree *mt,
  unsigned *width_px, unsigned *height);
-bool
-intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling);
-bool
-intel_miptree_is_fast_clear_capable(struct brw_context *brw,
-struct intel_mipmap_tree *mt);
+
 bool
 intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
  struct intel_mipmap_tree *mt);
-- 
2.5.0.342.g44e0223

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965: Fix intel_miptree_is_fast_clear_capable()

2015-10-01 Thread Chad Versace
There are three types of fast clears:
  a. fast depth clears
  b. fast singlesample color clears
  c. fast multisample color clears
Function intel_miptree_is_fast_clear_capable() checks if a miptree
supports fast clears of type (b).

Rename the function to disambiguate what it does:
  old: intel_miptree_is_fast_clear_capable
  new: intel_miptree_supports_non_msrt_fast_clear

The functionally *accidentally* rejected multisampled color surfaces
because it thought they were singlesample array surfaces. Fix that by
explicitly rejecting surfaces with samples > 1.

This fix would have been needed before we enabled layered fast
singlesample color clears (introduced in gen8), which we want to do
eventually. For now, though, this patch changes no behavior; it just
fixes how the driver chooses its behavior.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 05dc291..a169c41 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -194,8 +194,8 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, 
unsigned tiling)
  *   64bpp, and 128bpp.
  */
 static bool
-intel_miptree_is_fast_clear_capable(struct brw_context *brw,
-struct intel_mipmap_tree *mt)
+intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw,
+   struct intel_mipmap_tree *mt)
 {
/* MCS support does not exist prior to Gen7 */
if (brw->gen < 7)
@@ -204,6 +204,10 @@ intel_miptree_is_fast_clear_capable(struct brw_context 
*brw,
if (mt->disable_aux_buffers)
   return false;
 
+   /* This function applies only to non-multisampled render targets. */
+   if (mt->num_samples > 1)
+  return false;
+
/* MCS is only supported for color buffers */
switch (_mesa_get_format_base_format(mt->format)) {
case GL_DEPTH_COMPONENT:
@@ -222,7 +226,16 @@ intel_miptree_is_fast_clear_capable(struct brw_context 
*brw,
 
   return false;
}
+
+   /* Check for layered surfaces. */
if (mt->physical_depth0 != 1) {
+   /* Multisample surfaces with the CMS layout are not layered surfaces,
+* yet still have physical_depth0 > 1. Assert that we don't
+* accidentally reject a multisampled surface here. We should have
+* rejected it earlier by explicitly checking the sample count.
+*/
+  assert(mt->num_samples <= 1);
+
   if (brw->gen >= 8) {
  perf_debug("Layered fast clear - giving up. (%dx%d%d)\n",
 mt->logical_width0, mt->logical_height0,
@@ -494,7 +507,7 @@ intel_miptree_create_layout(struct brw_context *brw,
 *  7   |  ? |?
 *  6   |  ? |?
 */
-   if (intel_miptree_is_fast_clear_capable(brw, mt)) {
+   if (intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
   if (brw->gen >= 9 || (brw->gen == 8 && num_samples <= 1))
  layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16;
} else if (brw->gen >= 9 && num_samples > 1) {
@@ -692,7 +705,7 @@ intel_miptree_create(struct brw_context *brw,
 * clear actually occurs.
 */
if (intel_tiling_supports_non_msrt_mcs(brw, mt->tiling) &&
-   intel_miptree_is_fast_clear_capable(brw, mt)) {
+   intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
   mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
   assert(brw->gen < 8 || mt->halign == 16 || num_samples <= 1);
}
@@ -800,8 +813,9 @@ intel_update_winsys_renderbuffer_miptree(struct brw_context 
*intel,
 * clear actually occurs.
 */
if (intel_tiling_supports_non_msrt_mcs(intel, singlesample_mt->tiling) &&
-   intel_miptree_is_fast_clear_capable(intel, singlesample_mt))
+   intel_miptree_supports_non_msrt_fast_clear(intel, singlesample_mt)) {
   singlesample_mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
+   }
 
if (num_samples == 0) {
   intel_miptree_release(&irb->mt);
-- 
2.5.0.342.g44e0223

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/2] i965: A few miptree fixes

2015-10-01 Thread Chad Versace
Chad Versace (2):
  i965/mt: Declare some functions as static
  i965: Fix intel_miptree_is_fast_clear_capable()

 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  6 +-
 2 files changed, 22 insertions(+), 12 deletions(-)

-- 
2.5.0.342.g44e0223

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 4/7] nir: add an instruction set API

2015-10-01 Thread Jason Ekstrand
On Thu, Oct 1, 2015 at 8:14 AM, Connor Abbott  wrote:
> On Wed, Sep 30, 2015 at 12:35 PM, Jason Ekstrand  wrote:
>> On Wed, Sep 30, 2015 at 8:11 AM, Connor Abbott  wrote:
>>> This will replace direct usage of nir_instrs_equal() in the CSE pass,
>>> which reduces an O(n^2) algorithm with an effectively O(n) one. It'll
>>> also be useful for implementing GVN on top of GCM.
>>>
>>> v2:
>>> - Add texture support.
>>> - Add more comments.
>>> - Rename instr_can_hash() to instr_can_rewrite() since it's really more
>>> about whether its uses can be rewritten, and it's implicitly used by
>>> nir_instrs_equal() as well.
>>> - Rename nir_instr_set_add() to nir_instr_set_add_or_rewrite() (Jason).
>>> - Make the HASH() macro less magical (Topi).
>>> - Rewrite the commit message.
>>>
>>> Signed-off-by: Connor Abbott 
>>> ---
>>>  src/glsl/nir/nir_instr_set.c | 311 
>>> +++
>>>  src/glsl/nir/nir_instr_set.h |  35 +
>>>  2 files changed, 346 insertions(+)
>>>
>>> diff --git a/src/glsl/nir/nir_instr_set.c b/src/glsl/nir/nir_instr_set.c
>>> index 72ab048..a6f2226 100644
>>> --- a/src/glsl/nir/nir_instr_set.c
>>> +++ b/src/glsl/nir/nir_instr_set.c
>>> @@ -23,6 +23,178 @@
>>>
>>>  #include "nir_instr_set.h"
>>>
>>> +#define HASH(hash, data) _mesa_fnv32_1a_accumulate((hash), (data))
>>> +
>>> +static uint32_t
>>> +hash_src(uint32_t hash, const nir_src *src)
>>> +{
>>> +   assert(src->is_ssa);
>>> +   hash = HASH(hash, src->ssa);
>>> +   return hash;
>>> +}
>>> +
>>> +static uint32_t
>>> +hash_alu_src(uint32_t hash, const nir_alu_src *src, unsigned 
>>> num_components)
>>> +{
>>> +   hash = HASH(hash, src->abs);
>>> +   hash = HASH(hash, src->negate);
>>> +
>>> +   for (unsigned i = 0; i < num_components; i++)
>>> +  hash = HASH(hash, src->swizzle[i]);
>>> +
>>> +   hash = hash_src(hash, &src->src);
>>> +   return hash;
>>> +}
>>> +
>>> +static uint32_t
>>> +hash_alu(uint32_t hash, const nir_alu_instr *instr)
>>> +{
>>> +   hash = HASH(hash, instr->op);
>>> +   hash = HASH(hash, instr->dest.dest.ssa.num_components);
>>
>> Why are you hasing the dest.ssa.num_components instead of the
>> writemask?  They should be the same in any case, but this seems a bit
>> out-of-place.
>
> No particular reason. I can change it if you want.
>
>>
>>> +
>>> +   if (nir_op_infos[instr->op].algebraic_properties & 
>>> NIR_OP_IS_COMMUTATIVE) {
>>> +  assert(nir_op_infos[instr->op].num_inputs == 2);
>>> +  uint32_t hash0 = hash_alu_src(hash, &instr->src[0],
>>> +
>>> nir_ssa_alu_instr_src_components(instr, 0));
>>> +  uint32_t hash1 = hash_alu_src(hash, &instr->src[1],
>>> +
>>> nir_ssa_alu_instr_src_components(instr, 1));
>>> +  /* For commutative operations, we need some commutative way of
>>> +   * combining the hashes.  One option would be to XOR them but that
>>> +   * means that anything with two identical sources will hash to 0 and
>>> +   * that's common enough we probably don't want the guaranteed
>>> +   * collision.  Either addition or multiplication will also work.
>>> +   */
>>> +  hash = hash0 * hash1;
>>> +   } else {
>>> +  for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) {
>>> + hash = hash_alu_src(hash, &instr->src[i],
>>> + nir_ssa_alu_instr_src_components(instr, i));
>>> +  }
>>> +   }
>>> +
>>> +   return hash;
>>> +}
>>> +
>>> +static uint32_t
>>> +hash_load_const(uint32_t hash, const nir_load_const_instr *instr)
>>> +{
>>> +   hash = HASH(hash, instr->def.num_components);
>>> +
>>> +   hash = _mesa_fnv32_1a_accumulate_block(hash, instr->value.f,
>>> +  instr->def.num_components
>>> + * sizeof(instr->value.f[0]));
>>> +
>>> +   return hash;
>>> +}
>>> +
>>> +static int
>>> +cmp_phi_src(const void *data1, const void *data2)
>>> +{
>>> +   const nir_phi_src *src1 = data1, *src2 = data2;
>>> +   return src1->pred->index - src2->pred->index;
>>> +}
>>> +
>>> +static uint32_t
>>> +hash_phi(uint32_t hash, const nir_phi_instr *instr)
>>> +{
>>> +   hash = HASH(hash, instr->instr.block);
>>> +
>>> +   /* sort sources by predecessor index, since the order shouldn't matter 
>>> */
>>> +   unsigned num_preds = instr->instr.block->predecessors->entries;
>>> +   nir_phi_src *srcs = malloc(num_preds * sizeof(nir_phi_src));
>>
>> You could use NIR_VLA and put it on the stack. That'd probably be more
>> efficient.  Also, why are you copying the entire source and not just
>> making a list of pointers?
>
> Good point. It does prevent a pointer dereference when comparing, but
> that's probably less expensive than copying the entire thing when
> swapping.
>
>>
>>> +   unsigned i = 0;
>>> +   nir_foreach_phi_src(instr, src) {
>>> +  srcs[i++] = *src;
>>> +   }
>>> +   qsort(srcs, num_preds, sizeof(nir_phi_src), cmp_phi_src);
>>
>> Sorting seems

Re: [Mesa-dev] [PATCH 3/3] radeon/llvm: Initialize gallivm targets when initializing the AMDGPU target v2

2015-10-01 Thread Tom Stellard
On Thu, Oct 01, 2015 at 03:40:45PM +0100, Emil Velikov wrote:
> On 30 September 2015 at 19:59, Tom Stellard  wrote:
> > This fixes a race condition in the glx-multithreaded-shader-compile
> > test.
> >
> > v2:
> >   - Replace gallivm_init_llvm_{begin,end}() with 
> > gallivm_init_llvm_targets().
> >
> Thanks for sticking with my suggestion Tom.
> 
> Fwiw the series is:
> Reviewed-by: Emil Velikov 
> 
> Can you please confirm if we need the threadsafe wrappers, with this
> series in place ? If we do, humble ping :)
> 

We do still need the threadsafe wrappers.

-Tom

> -Emil
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965/mt: Declare some functions as static

2015-10-01 Thread Anuj Phogat
On Thu, Oct 1, 2015 at 8:20 AM, Chad Versace  wrote:
> intel_tiling_supports_non_msrt_mcs() and
> intel_miptree_is_fast_clear_capable() are not used outside of
> intel_mipmap_tree.c.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 6 +-
>  2 files changed, 3 insertions(+), 7 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index ffc356c..05dc291 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -160,7 +160,7 @@ intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree 
> *mt,
> }
>  }
>
> -bool
> +static bool
>  intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling)
>  {
> /* From the Ivy Bridge PRM, Vol2 Part1 11.7 "MCS Buffer for Render
> @@ -193,7 +193,7 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context 
> *brw, unsigned tiling)
>   * - MCS buffer for non-MSRT is supported only for RT formats 32bpp,
>   *   64bpp, and 128bpp.
>   */
> -bool
> +static bool
>  intel_miptree_is_fast_clear_capable(struct brw_context *brw,
>  struct intel_mipmap_tree *mt)
>  {
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> index 486e5c6..805cd71 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
> @@ -658,11 +658,7 @@ struct intel_mipmap_tree
>  void
>  intel_get_non_msrt_mcs_alignment(struct intel_mipmap_tree *mt,
>   unsigned *width_px, unsigned *height);
> -bool
> -intel_tiling_supports_non_msrt_mcs(struct brw_context *brw, unsigned tiling);
> -bool
> -intel_miptree_is_fast_clear_capable(struct brw_context *brw,
> -struct intel_mipmap_tree *mt);
> +
>  bool
>  intel_miptree_alloc_non_msrt_mcs(struct brw_context *brw,
>   struct intel_mipmap_tree *mt);
> --
> 2.5.0.342.g44e0223
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Both patches are:
Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span

2015-10-01 Thread Tapani Pälli

On 10/01/2015 03:17 PM, Iago Toral wrote:

On Thu, 2015-10-01 at 08:28 +0300, Tapani Pälli wrote:

Patch adds missing type (used with NV_read_depth) so that it gets
handled correctly. Also add type to _mesa_problem output to aid
debugging.

Signed-off-by: Tapani Pälli 
---
  src/mesa/main/pack.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/pack.c b/src/mesa/main/pack.c
index 7147fd6..54a0c42 100644
--- a/src/mesa/main/pack.c
+++ b/src/mesa/main/pack.c
@@ -1074,6 +1074,7 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint n, 
GLvoid *dest,
   }
}
break;
+   case GL_UNSIGNED_INT_24_8:

Is it okay to store 32-bit integers in this case? that's what the code
below does. The spec says that the 8 stencil bits are undefined, but
don't we need to convert the depth value to a 24-bit integer scale?
(i.e. make 1.0 translate to 2^24-1 not 2^32-1).


Oh, I thought stencil was included, I'll check if there's any existing 
app/test that would actually use this.



Iago


 case GL_UNSIGNED_INT:
{
   GLuint *dst = (GLuint *) dest;
@@ -1124,7 +1125,8 @@ _mesa_pack_depth_span( struct gl_context *ctx, GLuint n, 
GLvoid *dest,
}
break;
 default:
-  _mesa_problem(ctx, "bad type in _mesa_pack_depth_span");
+  _mesa_problem(ctx, "bad type in _mesa_pack_depth_span (%s)",
+_mesa_enum_to_string(dstType));
 }
  
 free(depthCopy);




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-01 Thread Rob Clark
On Thu, Oct 1, 2015 at 10:30 AM, Emil Velikov  wrote:
> Hi Rob,
>
> I believe that Marek managed to nicely explain how things work.
> Ideally you'll be able to get on the same boat (i.e. the Android C
> runtime/linker won't give you the finger). Regardless some comments
> inline.
>
> On 30 September 2015 at 21:43, Rob Clark  wrote:
>> From: Rob Clark 
>>
>> Not actually working yet, ie. doesn't even compile yet, but an idea.
>>
>> Initial motivation was for drm_gralloc/pipe, which is essentially a sort
>> of mini state-tracker, that needs to be able to share pipe_screen with
>> libGL linked into the same process (to ensure we didn't end up with
>> duplicate pipe_resource's, etc).  I think same situation happens with
>> vdpau+gl interop.
>>
>> Currently drm_gralloc/pipe (and other state trackers??) statically link
>> the winsys code, which completely defeats the purpose magic in the
>> winsys code to detect when the same device is opened multiple times and
>> return the same pipe_screen (since you end up w/ multiple copies of the
>> hashtable in the same process).  See for example:
>>
>>   5bb41d9094b3c9bdf0669fd55418981ed83347e3
>>   fee0686c21c631d96d6042741267a3c218c23ffc
>>
>> Rough idea is that we should have something like libgallium.so which
>> contains the pipe-loader API, and then optionally (depending on
>> GALLIUM_STATIC_TARGETS) all of the gallium pipe drivers.  The various
>> different state trackers would link against (or dlopen()) libgallium.so
>> and use the pipe-loader API to get themselves a pipe_screen (and config
>> params, etc).  And then you end up with:
>>
>> +---+
>> | l |
>> clover  --> | i |
>> | b |
>> mesa-st --> | g |
>> | a |--> pipe driver
>> vdapau  --> | l |
>> | l |
>> gralloc --> | i |
>> | u |
>> xa  --> | m |
>> |   |
>> +---+
>> or:
>> +---+-+
>> | l | |
>> clover  --> | i | |
>> | b | |
>> mesa-st --> | g | |
>> | a | pipe driver |
>> vdapau  --> | l | |
>> | l | |
>> gralloc --> | i | |
>> | u | |
>> xa  --> | m | |
>> |   | |
>> +---+-+
>>
>> depending on GALLIUM_STATIC_TARGETS.  Either way, all the state trackers
>> in the same process share a single copy of the hashtable in the winsys
>> code which allows them to share the same pipe_screen.
>>
> As others I'm not too excited about adding new library/ies. Amongst
> others, It boils down to people mix'n'matching things, causing all
> sorts of problems.

hmm, sounds like a distro issue, doesn't it?  I mean in
non-GALLIUM_STATIC_TARGETS build configuration (which, afaict doesn't
even work these days) you couldn't mix and match state tracker .so's
and pipe-driver .so's..

If really needed we could stuff module version symbols all around and
refuse to load with nice error msg if mismatch..  seems like a
relatively solvable problem..

>> I think that ends up being an extra level of library indirection vs
>> current state w/ pipe drivers, ie. with mesa dri loader stuff directly
>> loading gallium_dri.so which contains all the drivers plus mesa state
>> tracker.  If this was concerning to someone, what I'd suggest would be
>> to, for all the state trackers that already have some sort of loader
>> sitting between them and the user, just pull them directly into the
>> mega-mega libgallium.so, ie. something like:
>>
>>   +-+---+-+
>>   | |   | |
>>   | clover  | l | |
>>   | | i | |
>>   | mesa-st | b | |
>>   | | g | pipe driver |
>>   | vdapau  | a | |
>>   | | l | |
>>   +-| l | |
>> | i | |
>> gralloc --> | u | |
>> | m | |
>> xa  --> |   | |
>> |   | |
>> +---+-+
>>
> This won't work due to funny dependencies. I.e. you don't want to
> force xcb and friends onto the dri module, clover etc, just because
> vdpau needs it.

Hmm, perhaps.. although I thought dri already was converted to use
xcb..  otherwise I'd guess vdpau and mesa-st would have roughly the
same sorts of dependencies.  Although perhaps it would be funny for
clover to have those dep's..


>> Anyways, I'm far from an expert in the build architecture of things so
>> I might have some facts wrong or be a bit confused.  And I'll probably
>> regret bringing the subject up.  But somehow or another it seems like
>> it would be good to (a) clean up all the GALLIUM_STATIC_TARGETS
>> ifdeffery spread throughou

[Mesa-dev] [PATCH] mesa: Add missing _mm_mfence() before streaming loads.

2015-10-01 Thread Matt Turner
According to the Intel Software Development Manual:

   Streaming loads may be weakly ordered and may appear to software to
   execute out of order with respect to other memory operations.
   Software must explicitly use fences (e.g. MFENCE) if it needs to
   preserve order among streaming loads or between streaming loads and
   other memory operations.

That is, a memory fence is needed to preserve the order between the GPU
writing the buffer and the streaming loads reading it back.

Reported-by: Joseph Nuzman 
---
 src/mesa/main/streaming-load-memcpy.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/main/streaming-load-memcpy.c 
b/src/mesa/main/streaming-load-memcpy.c
index d7147af..32854b6 100644
--- a/src/mesa/main/streaming-load-memcpy.c
+++ b/src/mesa/main/streaming-load-memcpy.c
@@ -54,16 +54,19 @@ _mesa_streaming_load_memcpy(void *restrict dst, void 
*restrict src, size_t len)
 
   memcpy(d, s, MIN2(bytes_before_alignment_boundary, len));
 
   d = (char *)ALIGN((uintptr_t)d, 16);
   s = (char *)ALIGN((uintptr_t)s, 16);
   len -= MIN2(bytes_before_alignment_boundary, len);
}
 
+   if (len >= 64)
+  _mm_mfence();
+
while (len >= 64) {
   __m128i *dst_cacheline = (__m128i *)d;
   __m128i *src_cacheline = (__m128i *)s;
 
   __m128i temp1 = _mm_stream_load_si128(src_cacheline + 0);
   __m128i temp2 = _mm_stream_load_si128(src_cacheline + 1);
   __m128i temp3 = _mm_stream_load_si128(src_cacheline + 2);
   __m128i temp4 = _mm_stream_load_si128(src_cacheline + 3);
-- 
2.4.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/23] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.

2015-10-01 Thread Kenneth Graunke
On Wednesday, September 30, 2015 12:58:09 AM Kenneth Graunke wrote:
> Discuss :)

NAK on patches 3-5.  It turns out they break if the outermost array
index is non-constant.  I've got two new patches that replace them.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 2/2] mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program

2015-10-01 Thread Ilia Mirkin
On Thu, Oct 1, 2015 at 7:09 AM, Iago Toral Quiroga  wrote:
> These arrays provide backends with separate index spaces for UBOS and SSBOs.
> ---
>  src/glsl/linker.cpp | 35 +++
>  src/glsl/standalone_scaffolding.cpp |  9 +
>  src/mesa/main/mtypes.h  |  6 ++
>  3 files changed, 50 insertions(+)
>
> diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> index e6eba94..3da773d 100644
> --- a/src/glsl/linker.cpp
> +++ b/src/glsl/linker.cpp
> @@ -4107,6 +4107,41 @@ link_shaders(struct gl_context *ctx, struct 
> gl_shader_program *prog)
>}
> }
>
> +   /* Split prog->BufferInterfaceBlocks into prog->UniformBlocks and
> +* prog->ShaderStorageBlocks, so that drivers that need separate index
> +* spaces for each set can have that.
> +*/
> +   unsigned num_ubo_blocks;
> +   unsigned num_ssbo_blocks;
> +   num_ubo_blocks = 0;
> +   num_ssbo_blocks = 0;
> +   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
> +  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
> + num_ssbo_blocks++;
> +  else
> + num_ubo_blocks++;
> +   }
> +
> +   prog->UniformBlocks =
> +  ralloc_array(mem_ctx, gl_uniform_block *, num_ubo_blocks);
> +   prog->NumUniformBlocks = 0;
> +
> +   prog->ShaderStorageBlocks =
> +  ralloc_array(mem_ctx, gl_uniform_block *, num_ssbo_blocks);
> +   prog->NumShaderStorageBlocks = 0;
> +
> +   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
> +  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
> + prog->ShaderStorageBlocks[prog->NumShaderStorageBlocks++] =
> +&prog->BufferInterfaceBlocks[i];
> +  else
> + prog->UniformBlocks[prog->NumUniformBlocks++] =
> +&prog->BufferInterfaceBlocks[i];
> +   }

Shouldn't this go through and also adjust the indices of the linked
programs? Or... something along those lines? With this, I still need a
remapping table to go from the index passed to a load_ssbo/load_ubo
instruction to a ssbo/ubo index.

Perhaps a few well-placed helper functions can alleviate this? Also
this should erase the need for some of the O(n) iterations that have
sprung up as a result of this combined list.

IMHO ideally the BufferInterfaceBlocks list would get freed at the end
of this function. But I understand that this will require work, and
the onus is probably on me (or anyone wanting to add ssbo support to
other drivers) to do it, or work around it.

  -ilia

> +
> +   assert(prog->NumUniformBlocks + prog->NumShaderStorageBlocks ==
> +  prog->NumBufferInterfaceBlocks);
> +
> /* FINISHME: Assign fragment shader output locations. */
>
>  done:
> diff --git a/src/glsl/standalone_scaffolding.cpp 
> b/src/glsl/standalone_scaffolding.cpp
> index 0c53589..658245f 100644
> --- a/src/glsl/standalone_scaffolding.cpp
> +++ b/src/glsl/standalone_scaffolding.cpp
> @@ -102,6 +102,15 @@ _mesa_clear_shader_program_data(struct gl_shader_program 
> *shProg)
> ralloc_free(shProg->BufferInterfaceBlocks);
> shProg->BufferInterfaceBlocks = NULL;
> shProg->NumBufferInterfaceBlocks = 0;
> +
> +   ralloc_free(shProg->UniformBlocks);
> +   shProg->UniformBlocks = NULL;
> +   shProg->NumUniformBlocks = 0;
> +
> +   ralloc_free(shProg->ShaderStorageBlocks);
> +   shProg->ShaderStorageBlocks = NULL;
> +   shProg->NumShaderStorageBlocks = 0;
> +
> for (i = 0; i < MESA_SHADER_STAGES; i++) {
>ralloc_free(shProg->UniformBlockStageIndex[i]);
>shProg->UniformBlockStageIndex[i] = NULL;
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 347da14..2362f54 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2692,6 +2692,12 @@ struct gl_shader_program
> unsigned NumBufferInterfaceBlocks;
> struct gl_uniform_block *BufferInterfaceBlocks;
>
> +   unsigned NumUniformBlocks;
> +   struct gl_uniform_block **UniformBlocks;
> +
> +   unsigned NumShaderStorageBlocks;
> +   struct gl_uniform_block **ShaderStorageBlocks;
> +
> /**
>  * Indices into the _LinkedShaders's UniformBlocks[] array for each stage
>  * they're used in, or -1.
> --
> 1.9.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/2] nir/lower_io: Return has_indirect flag from get_io_offset().

2015-10-01 Thread Kenneth Graunke
get_io_offset() already walks the dereference chain and discovers
whether or not we have an indirect; we can just return that rather than
computing it a second time.  This means moving the call a bit earlier.

More importantly, I'm about to introduce special handling for the
outermost array index, which would require changing the dereference
we pass to deref_has_indirect().  Or we can just eliminate that.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_lower_io.c | 41 ++---
 1 file changed, 14 insertions(+), 27 deletions(-)

This replaces patches 3-5 of my recent 23 patch series:

[PATCH 03/23] nir/lower_io: Switch on shader stage for inputs in load_op().
[PATCH 04/23] nir/lower_io: Make get_io_offset take a nir_deref, not a 
nir_deref_var.
[PATCH 05/23] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.

diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
index f32c09d..80967fc 100644
--- a/src/glsl/nir/nir_lower_io.c
+++ b/src/glsl/nir/nir_lower_io.c
@@ -63,22 +63,9 @@ nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
*size = location;
 }
 
-static bool
-deref_has_indirect(nir_deref_var *deref)
-{
-   for (nir_deref *tail = deref->deref.child; tail; tail = tail->child) {
-  if (tail->deref_type == nir_deref_type_array) {
- nir_deref_array *arr = nir_deref_as_array(tail);
- if (arr->deref_array_type == nir_deref_array_type_indirect)
-return true;
-  }
-   }
-
-   return false;
-}
-
 static unsigned
-get_io_offset(nir_deref_var *deref, nir_instr *instr, nir_src *indirect,
+get_io_offset(nir_deref_var *deref, nir_instr *instr,
+  nir_src *indirect, bool *has_indirect,
   struct lower_io_state *state)
 {
bool found_indirect = false;
@@ -122,6 +109,7 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr, 
nir_src *indirect,
   }
}
 
+   *has_indirect = found_indirect;
return base_offset;
 }
 
@@ -169,17 +157,17 @@ nir_lower_io_block(nir_block *block, void *void_state)
  if (mode != nir_var_shader_in && mode != nir_var_uniform)
 continue;
 
- bool has_indirect = deref_has_indirect(intrin->variables[0]);
+ nir_src indirect;
+ bool has_indirect;
+
+ unsigned offset = get_io_offset(intrin->variables[0], &intrin->instr,
+ &indirect, &has_indirect, state);
 
  nir_intrinsic_instr *load =
 nir_intrinsic_instr_create(state->mem_ctx,
load_op(mode, has_indirect));
  load->num_components = intrin->num_components;
 
- nir_src indirect;
- unsigned offset = get_io_offset(intrin->variables[0],
- &intrin->instr, &indirect, state);
-
  unsigned location = intrin->variables[0]->var->data.driver_location;
  if (mode == nir_var_uniform) {
 load->const_index[0] = location;
@@ -209,7 +197,12 @@ nir_lower_io_block(nir_block *block, void *void_state)
  if (intrin->variables[0]->var->data.mode != nir_var_shader_out)
 continue;
 
- bool has_indirect = deref_has_indirect(intrin->variables[0]);
+ nir_src indirect;
+ bool has_indirect;
+
+ unsigned offset = get_io_offset(intrin->variables[0], &intrin->instr,
+ &indirect, &has_indirect, state);
+ offset += intrin->variables[0]->var->data.driver_location;
 
  nir_intrinsic_op store_op;
  if (has_indirect) {
@@ -221,12 +214,6 @@ nir_lower_io_block(nir_block *block, void *void_state)
  nir_intrinsic_instr *store = 
nir_intrinsic_instr_create(state->mem_ctx,
  store_op);
  store->num_components = intrin->num_components;
-
- nir_src indirect;
- unsigned offset = get_io_offset(intrin->variables[0],
- &intrin->instr, &indirect, state);
- offset += intrin->variables[0]->var->data.driver_location;
-
  store->const_index[0] = offset;
 
  nir_src_copy(&store->src[0], &intrin->src[0], store);
-- 
2.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 2/2] nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.

2015-10-01 Thread Kenneth Graunke
Geometry and tessellation shaders process multiple vertices; their
inputs are arrays indexed by the vertex number.  While GLSL makes
this look like a normal array, it can be very different behind the
scenes.

On Intel hardware, all inputs for a particular vertex are stored
together - as if they were grouped into a single struct.  This means
that consecutive elements of these top-level arrays are not contiguous.
In fact, they may sometimes be in completely disjoint memory segments.

NIR's existing load_input intrinsics are awkward for this case, as they
distill everything down to a single offset.  We'd much rather keep the
vertex ID separate, but build up an offset as normal beyond that.

This patch introduces new nir_intrinsic_load_per_vertex_input
intrinsics to handle this case.  They work like ordinary load_input
intrinsics, but have an extra source (src[0]) which represents the
outermost array index.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_intrinsics.h |  1 +
 src/glsl/nir/nir_lower_io.c   | 54 ++---
 src/glsl/nir/nir_print.c  |  2 +
 src/mesa/drivers/dri/i965/brw_nir.c   | 13 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp | 58 +++
 5 files changed, 86 insertions(+), 42 deletions(-)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index ac4c2ba..263d8c1 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -228,6 +228,7 @@ SYSTEM_VALUE(num_work_groups, 3, 0)
 LOAD(uniform, 0, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(ubo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(input, 0, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+LOAD(per_vertex_input, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
 LOAD(ssbo, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE)
 
 /*
diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
index 80967fc..d066ca6 100644
--- a/src/glsl/nir/nir_lower_io.c
+++ b/src/glsl/nir/nir_lower_io.c
@@ -63,8 +63,20 @@ nir_assign_var_locations(struct exec_list *var_list, 
unsigned *size,
*size = location;
 }
 
+/**
+ * Returns true if we're processing a stage whose inputs are arrays indexed
+ * by a vertex number (such as geometry shader inputs).
+ */
+static bool
+stage_uses_per_vertex_inputs(struct lower_io_state *state)
+{
+   gl_shader_stage stage = state->builder.shader->stage;
+   return stage == MESA_SHADER_GEOMETRY;
+}
+
 static unsigned
 get_io_offset(nir_deref_var *deref, nir_instr *instr,
+  nir_src *vertex_index,
   nir_src *indirect, bool *has_indirect,
   struct lower_io_state *state)
 {
@@ -75,6 +87,22 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
b->cursor = nir_before_instr(instr);
 
nir_deref *tail = &deref->deref;
+
+   /* For per-vertex input arrays (i.e. geometry shader inputs), keep the
+* outermost array index separate.  Process the rest normally.
+*/
+   if (vertex_index != NULL) {
+  tail = tail->child;
+  assert(tail->deref_type == nir_deref_type_array);
+  nir_deref_array *deref_array = nir_deref_as_array(tail);
+
+  nir_ssa_def *vtx = nir_imm_int(b, deref_array->base_offset);
+  if (deref_array->deref_array_type == nir_deref_array_type_indirect) {
+ vtx = nir_iadd(b, vtx, nir_ssa_for_src(b, deref_array->indirect, 1));
+  }
+  *vertex_index = nir_src_for_ssa(vtx);
+   }
+
while (tail->child != NULL) {
   const struct glsl_type *parent_type = tail->type;
   tail = tail->child;
@@ -114,13 +142,19 @@ get_io_offset(nir_deref_var *deref, nir_instr *instr,
 }
 
 static nir_intrinsic_op
-load_op(nir_variable_mode mode, bool has_indirect)
+load_op(struct lower_io_state *state,
+nir_variable_mode mode, bool per_vertex, bool has_indirect)
 {
nir_intrinsic_op op;
switch (mode) {
case nir_var_shader_in:
-  op = has_indirect ? nir_intrinsic_load_input_indirect :
-  nir_intrinsic_load_input;
+  if (per_vertex) {
+ op = has_indirect ? nir_intrinsic_load_per_vertex_input_indirect :
+ nir_intrinsic_load_per_vertex_input;
+  } else {
+ op = has_indirect ? nir_intrinsic_load_input_indirect :
+ nir_intrinsic_load_input;
+  }
   break;
case nir_var_uniform:
   op = has_indirect ? nir_intrinsic_load_uniform_indirect :
@@ -157,15 +191,21 @@ nir_lower_io_block(nir_block *block, void *void_state)
  if (mode != nir_var_shader_in && mode != nir_var_uniform)
 continue;
 
+ bool per_vertex = mode == nir_var_shader_in &&
+   stage_uses_per_vertex_inputs(state);
+
+ nir_src vertex_index;
  nir_src indirect;
  bool has_indirect;
 
  unsigned offset = get_io_offset(intrin->variables[0], &intrin->instr,
+  

Re: [Mesa-dev] [PATCH] mesa: Add missing _mm_mfence() before streaming loads.

2015-10-01 Thread Jordan Justen
On 2015-10-01 10:11:33, Matt Turner wrote:
> According to the Intel Software Development Manual:

How about a more specific doc location?

According to the Intel Software Development Manual (Volume 1: Basic
Architecture, 12.10.3 Streaming Load Hint Instruction):

>Streaming loads may be weakly ordered and may appear to software to
>execute out of order with respect to other memory operations.
>Software must explicitly use fences (e.g. MFENCE) if it needs to
>preserve order among streaming loads or between streaming loads and
>other memory operations.

Does this mean we need a mfence following the load as well?

-Jordan

> That is, a memory fence is needed to preserve the order between the GPU
> writing the buffer and the streaming loads reading it back.
> 
> Reported-by: Joseph Nuzman 
> ---
>  src/mesa/main/streaming-load-memcpy.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/mesa/main/streaming-load-memcpy.c 
> b/src/mesa/main/streaming-load-memcpy.c
> index d7147af..32854b6 100644
> --- a/src/mesa/main/streaming-load-memcpy.c
> +++ b/src/mesa/main/streaming-load-memcpy.c
> @@ -54,16 +54,19 @@ _mesa_streaming_load_memcpy(void *restrict dst, void 
> *restrict src, size_t len)
>  
>memcpy(d, s, MIN2(bytes_before_alignment_boundary, len));
>  
>d = (char *)ALIGN((uintptr_t)d, 16);
>s = (char *)ALIGN((uintptr_t)s, 16);
>len -= MIN2(bytes_before_alignment_boundary, len);
> }
>  
> +   if (len >= 64)
> +  _mm_mfence();
> +
> while (len >= 64) {
>__m128i *dst_cacheline = (__m128i *)d;
>__m128i *src_cacheline = (__m128i *)s;
>  
>__m128i temp1 = _mm_stream_load_si128(src_cacheline + 0);
>__m128i temp2 = _mm_stream_load_si128(src_cacheline + 1);
>__m128i temp3 = _mm_stream_load_si128(src_cacheline + 2);
>__m128i temp4 = _mm_stream_load_si128(src_cacheline + 3);
> -- 
> 2.4.9
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span

2015-10-01 Thread Ian Romanick
On 10/01/2015 09:50 AM, Tapani Pälli wrote:
> On 10/01/2015 03:17 PM, Iago Toral wrote:
>> On Thu, 2015-10-01 at 08:28 +0300, Tapani Pälli wrote:
>>> Patch adds missing type (used with NV_read_depth) so that it gets

   NV_read_depth_stencil?

>>> handled correctly. Also add type to _mesa_problem output to aid
>>> debugging.
>>>
>>> Signed-off-by: Tapani Pälli 
>>> ---
>>>   src/mesa/main/pack.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/main/pack.c b/src/mesa/main/pack.c
>>> index 7147fd6..54a0c42 100644
>>> --- a/src/mesa/main/pack.c
>>> +++ b/src/mesa/main/pack.c
>>> @@ -1074,6 +1074,7 @@ _mesa_pack_depth_span( struct gl_context *ctx,
>>> GLuint n, GLvoid *dest,
>>>}
>>> }
>>> break;
>>> +   case GL_UNSIGNED_INT_24_8:
>> Is it okay to store 32-bit integers in this case? that's what the code
>> below does. The spec says that the 8 stencil bits are undefined, but
>> don't we need to convert the depth value to a 24-bit integer scale?
>> (i.e. make 1.0 translate to 2^24-1 not 2^32-1).

I was going to make the same comment.

> Oh, I thought stencil was included, I'll check if there's any existing
> app/test that would actually use this.

The incoming depthSpan is a float*, so there's no stencil there.  I
think we'll need to cobble together a piglit test for this case.  Can
this only be hit of NV_read_depth_stencil is exported?  I'm just
wondering because Mesa doesn't support that extension.  How is this even
being hit?

>> Iago
>>
>>>  case GL_UNSIGNED_INT:
>>> {
>>>GLuint *dst = (GLuint *) dest;
>>> @@ -1124,7 +1125,8 @@ _mesa_pack_depth_span( struct gl_context *ctx,
>>> GLuint n, GLvoid *dest,
>>> }
>>> break;
>>>  default:
>>> -  _mesa_problem(ctx, "bad type in _mesa_pack_depth_span");
>>> +  _mesa_problem(ctx, "bad type in _mesa_pack_depth_span (%s)",
>>> +_mesa_enum_to_string(dstType));
>>>  }
>>>free(depthCopy);

I definitely approve of this part.  If you want to split that into its
own patch, that's

Reviewed-by: Ian Romanick 

> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: add GL_UNSIGNED_INT_24_8 to _mesa_pack_depth_span

2015-10-01 Thread Ilia Mirkin
On Thu, Oct 1, 2015 at 3:12 PM, Ian Romanick  wrote:
> I'm just
> wondering because Mesa doesn't support that extension.  How is this even
> being hit?

See 81d2fd91a90 (mesa: add NV_read_{depth,stencil,depth_stencil} extensions)
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] nir: Add a function to determine if a source is dynamically uniform

2015-10-01 Thread Matt Turner
On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts  wrote:
> Adds nir_src_is_dynamically_uniform which returns true if the source
> is known to be dynamically uniform. This will be used in a later patch
> to add a workaround for cases that only work with dynamically uniform
> sources. Note that the function is not definitive, it can return false
> negatives (but not false positives). Currently it only detects
> constants and uniform accesses. It could easily be extended to include
> more cases.
> ---
>  src/glsl/nir/nir.c | 29 +
>  src/glsl/nir/nir.h |  1 +
>  2 files changed, 30 insertions(+)
>
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 78ff886..242f0b4 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -1784,6 +1784,35 @@ nir_src_as_const_value(nir_src src)
> return &load->value;
>  }
>
> +/**
> + * Returns true if the source is known to be dynamically uniform. Otherwise 
> it
> + * returns false which means it may or may not be dynamically uniform but it
> + * can't be determined.
> + */
> +bool
> +nir_src_is_dynamically_uniform(nir_src src)
> +{
> +   if (!src.is_ssa)
> +  return false;
> +
> +   /* Constants are trivially dynamically uniform */
> +   if (src.ssa->parent_instr->type == nir_instr_type_load_const)
> +  return true;
> +
> +   /* As are uniform variables */
> +   if (src.ssa->parent_instr->type == nir_instr_type_intrinsic) {
> +  nir_intrinsic_instr *intr = 
> nir_instr_as_intrinsic(src.ssa->parent_instr);
> +
> +  if (intr->intrinsic == nir_intrinsic_load_uniform)
> + return true;
> +   }
> +
> +   /* XXX: this could have many more tests, such as when a sampler function 
> is
> +* called with dynamically uniform arguments.
> +*/
> +   return false;
> +}

This functions seems correct as-is, so it gets a

Reviewed-by: Matt Turner 

On top of being useful for fixing the nonconst/nonuniform piglit
tests, knowing which values are uniform can allow better optimization
and knowing which branches are uniform would allow us to enable Single
Program Flow.

Cc'ing Jason for a discussion about how we'd like to handle this kind
of information in NIR. Add a bool is_uniform or something to
nir_ssa_def and hook into the metadata system?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] pipe-loader: abstract GALLIUM_STATIC_TARGETS behind pipe_loader API

2015-10-01 Thread Rob Clark
From: Rob Clark 

Signed-off-by: Rob Clark 
---
Drop the idea of more ambitious re-arrangement if libs, but keep the
pipe-loader refactoring.  With this at least drm_gralloc could still
dlopen() gallium_dri.so and then use the pipe-loader API to figure
out which pipe driver to load and hand back a screen.

The nine st is not updated.. but I don't claim to understand the whole
screen + sw-screen thing.  So I figured I'd let someone who knew what
they were doing update nine.  Once that happens, we should change to
not expose the dd_xyz fxns outside of pipe-loader, imho..

 configure.ac|  6 --
 src/gallium/auxiliary/pipe-loader/Makefile.am   | 14 ++
 src/gallium/auxiliary/pipe-loader/pipe_loader.c |  2 --
 src/gallium/auxiliary/pipe-loader/pipe_loader.h |  4 
 src/gallium/auxiliary/pipe-loader/pipe_loader_drm.c | 12 
 src/gallium/auxiliary/vl/vl_winsys_dri.c|  8 
 src/gallium/state_trackers/dri/dri2.c   | 13 -
 src/gallium/state_trackers/dri/dri_screen.c |  2 --
 src/gallium/state_trackers/xa/xa_tracker.c  | 10 +-
 src/gallium/targets/dri/Makefile.am |  1 +
 src/gallium/targets/dri/dri.sym |  3 +++
 src/gallium/targets/omx/Makefile.am |  4 +---
 src/gallium/targets/va/Makefile.am  |  4 +---
 src/gallium/targets/vdpau/Makefile.am   |  4 +---
 src/gallium/targets/xa/Makefile.am  |  4 +---
 src/gallium/targets/xvmc/Makefile.am|  5 ++---
 16 files changed, 25 insertions(+), 71 deletions(-)

diff --git a/configure.ac b/configure.ac
index 1ef5fbc..7d0f9d9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2044,7 +2044,6 @@ gallium_require_drm_loader() {
 if test "x$need_pci_id$have_pci_id" = xyesno; then
 AC_MSG_ERROR([Gallium drm loader requires libudev >= 
$LIBUDEV_REQUIRED or sysfs])
 fi
-enable_gallium_drm_loader=yes
 fi
 if test "x$enable_va" = xyes && test "x$7" != x; then
  GALLIUM_TARGET_DIRS="$GALLIUM_TARGET_DIRS $7"
@@ -2248,10 +2247,6 @@ if test "x$enable_gallium_loader" = xyes; then
 GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES 
-DHAVE_PIPE_LOADER_DRI"
 fi
 
-if test "x$enable_gallium_drm_loader" = xyes; then
-GALLIUM_PIPE_LOADER_DEFINES="$GALLIUM_PIPE_LOADER_DEFINES 
-DHAVE_PIPE_LOADER_DRM"
-fi
-
 AC_SUBST([GALLIUM_PIPE_LOADER_DEFINES])
 fi
 
@@ -2269,7 +2264,6 @@ AM_CONDITIONAL(NEED_WINSYS_XLIB, test 
"x$NEED_WINSYS_XLIB" = xyes)
 AM_CONDITIONAL(NEED_RADEON_LLVM, test x$NEED_RADEON_LLVM = xyes)
 AM_CONDITIONAL(USE_R600_LLVM_COMPILER, test x$USE_R600_LLVM_COMPILER = xyes)
 AM_CONDITIONAL(HAVE_LOADER_GALLIUM, test x$enable_gallium_loader = xyes)
-AM_CONDITIONAL(HAVE_DRM_LOADER_GALLIUM, test x$enable_gallium_drm_loader = 
xyes)
 AM_CONDITIONAL(HAVE_GALLIUM_COMPUTE, test x$enable_opencl = xyes)
 AM_CONDITIONAL(HAVE_MESA_LLVM, test x$MESA_LLVM = x1)
 AM_CONDITIONAL(USE_VC4_SIMULATOR, test x$USE_VC4_SIMULATOR = xyes)
diff --git a/src/gallium/auxiliary/pipe-loader/Makefile.am 
b/src/gallium/auxiliary/pipe-loader/Makefile.am
index 8c83799..0df5d4a 100644
--- a/src/gallium/auxiliary/pipe-loader/Makefile.am
+++ b/src/gallium/auxiliary/pipe-loader/Makefile.am
@@ -5,6 +5,7 @@ include $(top_srcdir)/src/gallium/Automake.inc
 AM_CFLAGS = \
-I$(top_srcdir)/src/loader \
-I$(top_srcdir)/src/gallium/winsys \
+   $(LIBDRM_CFLAGS) \
$(GALLIUM_PIPE_LOADER_DEFINES) \
$(GALLIUM_CFLAGS) \
$(VISIBILITY_CFLAGS)
@@ -12,17 +13,6 @@ AM_CFLAGS = \
 noinst_LTLIBRARIES = libpipe_loader.la
 
 libpipe_loader_la_SOURCES = \
-   $(COMMON_SOURCES)
-
-if HAVE_DRM_LOADER_GALLIUM
-AM_CFLAGS += \
-   $(LIBDRM_CFLAGS)
-
-libpipe_loader_la_SOURCES += \
+   $(COMMON_SOURCES) \
$(DRM_SOURCES)
 
-libpipe_loader_la_LIBADD = \
-   $(top_builddir)/src/loader/libloader.la
-
-endif
-
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.c 
b/src/gallium/auxiliary/pipe-loader/pipe_loader.c
index 8e79f85..535c2ce 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader.c
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.c
@@ -35,9 +35,7 @@
 #define MODULE_PREFIX "pipe_"
 
 static int (*backends[])(struct pipe_loader_device **, int) = {
-#ifdef HAVE_PIPE_LOADER_DRM
&pipe_loader_drm_probe,
-#endif
&pipe_loader_sw_probe
 };
 
diff --git a/src/gallium/auxiliary/pipe-loader/pipe_loader.h 
b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
index 9b87126..cf0c51b 100644
--- a/src/gallium/auxiliary/pipe-loader/pipe_loader.h
+++ b/src/gallium/auxiliary/pipe-loader/pipe_loader.h
@@ -158,8 +158,6 @@ boolean
 pipe_loader_sw_probe_wrapped(struct pipe_loader_device **dev,
  struct pipe_screen *screen);
 
-#ifdef HAVE_PIPE_LOADER_DRM
-
 /**
  * Get a list of known DRM devices.
  *
@@ -180,8 +178,6 @@ pipe_l

Re: [Mesa-dev] [PATCH] mesa: Add abs input modifier to base for POW in ffvertex_prog

2015-10-01 Thread Ilia Mirkin
On Thu, Oct 1, 2015 at 8:36 AM, Daniel Scharrer  wrote:
> The result of POW for a negative base is undefined. Even when the result
> is multiplied by zero (which is the case here whenever the base is
> negative), the Inf and NaNs can propagate past that.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
> Signed-off-by: Daniel Scharrer 
> Cc: "10.6 11.0" 

Reviewed-by: Ilia Mirkin 

This also fixes the trace in question on nouveau (tested with a
GK208). Nice find! I did some digging in the ARB_vertex_program spec,
and it explicitly says that LG2(negative) is undefined, and that POW
can be expressed as EX2 + LG2, so the compilers are well within their
right to compile the program as they have been.

Let me know if you need me to push this out for you.

  -ilia

> ---
>  src/mesa/main/ffvertex_prog.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c
> index 95b428d..a6183b4 100644
> --- a/src/mesa/main/ffvertex_prog.c
> +++ b/src/mesa/main/ffvertex_prog.c
> @@ -293,9 +293,10 @@ struct ureg {
> GLuint file:4;
> GLint idx:9;  /* relative addressing may be negative */
>   /* sizeof(idx) should == sizeof(prog_src_reg::Index) */
> +   GLuint abs:1;
> GLuint negate:1;
> GLuint swz:12;
> -   GLuint pad:6;
> +   GLuint pad:5;
>  };
>
>
> @@ -324,6 +325,7 @@ static const struct ureg undef = {
> 0,
> 0,
> 0,
> +   0,
> 0
>  };
>
> @@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx)
> struct ureg reg;
> reg.file = file;
> reg.idx = idx;
> +   reg.abs = 0;
> reg.negate = 0;
> reg.swz = SWIZZLE_NOOP;
> reg.pad = 0;
> @@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx)
>
>
>
> +static struct ureg absolute( struct ureg reg )
> +{
> +   reg.abs = 1;
> +   reg.negate = 0;
> +   return reg;
> +}
> +
> +
>  static struct ureg negate( struct ureg reg )
>  {
> reg.negate ^= 1;
> @@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src,
> src->File = reg.file;
> src->Index = reg.idx;
> src->Swizzle = reg.swz;
> +   src->Abs = reg.abs;
> src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE;
> -   src->Abs = 0;
> src->RelAddr = 0;
> /* Check that bitfield sizes aren't exceeded */
> assert(src->Index == reg.idx);
> @@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct 
> tnl_program *p,
>
>emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm);
>emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot);
> -  emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W));
> +  emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), swizzle1(attenuation, 
> W));
>emit_op2(p, OPCODE_MUL, att, 0, slt, spot);
>
>release_temp(p, spot);
> --
> 2.6.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92221

Bug ID: 92221
   Summary: Unintended code changes in _mesa_base_tex_format
commit
   Product: Mesa
   Version: git
  Hardware: All
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: nanleych...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Commit c6bf1cd1467ea5d5370394ba99366dd8a59a385c was supposed to move
_mesa_base_tex_format() from one file to another, but it also adds extra code
into the function and removes a case which handles ASTC formats.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92221

Nanley Chery  changed:

   What|Removed |Added

 CC||el...@igalia.com,
   ||ja...@jlekstrand.net,
   ||mark.a.ja...@intel.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Add abs input modifier to base for POW in ffvertex_prog

2015-10-01 Thread Daniel Scharrer
On 2015-10-01 21:45, Ilia Mirkin wrote:
> On Thu, Oct 1, 2015 at 8:36 AM, Daniel Scharrer  wrote:
>> The result of POW for a negative base is undefined. Even when the result
>> is multiplied by zero (which is the case here whenever the base is
>> negative), the Inf and NaNs can propagate past that.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
>> Signed-off-by: Daniel Scharrer 
>> Cc: "10.6 11.0" 
> 
> Reviewed-by: Ilia Mirkin 
> 
> This also fixes the trace in question on nouveau (tested with a
> GK208). Nice find! I did some digging in the ARB_vertex_program spec,
> and it explicitly says that LG2(negative) is undefined, and that POW
> can be expressed as EX2 + LG2, so the compilers are well within their
> right to compile the program as they have been.
> 
> Let me know if you need me to push this out for you.

Yeah, I don't have commit access.

> 
>   -ilia
> 
>> ---
>>  src/mesa/main/ffvertex_prog.c | 17 ++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c
>> index 95b428d..a6183b4 100644
>> --- a/src/mesa/main/ffvertex_prog.c
>> +++ b/src/mesa/main/ffvertex_prog.c
>> @@ -293,9 +293,10 @@ struct ureg {
>> GLuint file:4;
>> GLint idx:9;  /* relative addressing may be negative */
>>   /* sizeof(idx) should == sizeof(prog_src_reg::Index) */
>> +   GLuint abs:1;
>> GLuint negate:1;
>> GLuint swz:12;
>> -   GLuint pad:6;
>> +   GLuint pad:5;
>>  };
>>
>>
>> @@ -324,6 +325,7 @@ static const struct ureg undef = {
>> 0,
>> 0,
>> 0,
>> +   0,
>> 0
>>  };
>>
>> @@ -342,6 +344,7 @@ static struct ureg make_ureg(GLuint file, GLint idx)
>> struct ureg reg;
>> reg.file = file;
>> reg.idx = idx;
>> +   reg.abs = 0;
>> reg.negate = 0;
>> reg.swz = SWIZZLE_NOOP;
>> reg.pad = 0;
>> @@ -350,6 +353,14 @@ static struct ureg make_ureg(GLuint file, GLint idx)
>>
>>
>>
>> +static struct ureg absolute( struct ureg reg )
>> +{
>> +   reg.abs = 1;
>> +   reg.negate = 0;
>> +   return reg;
>> +}
>> +
>> +
>>  static struct ureg negate( struct ureg reg )
>>  {
>> reg.negate ^= 1;
>> @@ -526,8 +537,8 @@ static void emit_arg( struct prog_src_register *src,
>> src->File = reg.file;
>> src->Index = reg.idx;
>> src->Swizzle = reg.swz;
>> +   src->Abs = reg.abs;
>> src->Negate = reg.negate ? NEGATE_XYZW : NEGATE_NONE;
>> -   src->Abs = 0;
>> src->RelAddr = 0;
>> /* Check that bitfield sizes aren't exceeded */
>> assert(src->Index == reg.idx);
>> @@ -953,7 +964,7 @@ static struct ureg calculate_light_attenuation( struct 
>> tnl_program *p,
>>
>>emit_op2(p, OPCODE_DP3, spot, 0, negate(VPpli), spot_dir_norm);
>>emit_op2(p, OPCODE_SLT, slt, 0, swizzle1(spot_dir_norm,W), spot);
>> -  emit_op2(p, OPCODE_POW, spot, 0, spot, swizzle1(attenuation, W));
>> +  emit_op2(p, OPCODE_POW, spot, 0, absolute(spot), 
>> swizzle1(attenuation, W));
>>emit_op2(p, OPCODE_MUL, att, 0, slt, spot);
>>
>>release_temp(p, spot);
>> --
>> 2.6.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


-- 
Daniel Scharrer
http://constexpr.org/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92221

Mark Janes  changed:

   What|Removed |Added

 CC||nanleych...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92221

--- Comment #1 from Mark Janes  ---
Nanley, should there have been a piglit failure associated with ASTC
functionality that was disabled?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92221

--- Comment #2 from Nanley Chery  ---
(In reply to Mark Janes from comment #1)
> Nanley, should there have been a piglit failure associated with ASTC
> functionality that was disabled?

Not at the moment. The tests are still on the list and haven't been merged to
master. Some others developers and I are working to get them integrated.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Make brw_varying_to_offset take a const pointer to the VUE map.

2015-10-01 Thread Kenneth Graunke
It doesn't modify it.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 8ae3666..c46dbe2 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -581,8 +581,8 @@ static inline GLuint brw_vue_slot_to_offset(GLuint slot)
  * Convert a vertex output (brw_varying_slot) into a byte offset within the
  * VUE.
  */
-static inline GLuint brw_varying_to_offset(struct brw_vue_map *vue_map,
-   GLuint varying)
+static inline GLuint
+brw_varying_to_offset(const struct brw_vue_map *vue_map, GLuint varying)
 {
return brw_vue_slot_to_offset(vue_map->varying_to_slot[varying]);
 }
-- 
2.5.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #8 from Mauro Rossi  ---
Hi Ilia,

Sorry I noticed my previous sentence was not clear,
I have performed the test by disabling EGL_KHR_gl_colorspace, independently
frome Chih-Wei attempt with MESA_EXTENSION_OVERRIDE, by applying the following
changes that were suggested to me by Emil:

--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -586,7 +586,7 @@ dri2_setup_screen(_EGLDisplay *disp)

   if (dri2_renderer_query_integer(dri2_dpy,
   __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB))
-  disp->Extensions.KHR_gl_colorspace = EGL_TRUE;
+  disp->Extensions.KHR_gl_colorspace = EGL_FALSE;

   if (dri2_dpy->dri2 && dri2_dpy->dri2->base.version >= 3) {
  disp->Extensions.KHR_create_context = EGL_TRUE;

The result was still flashing/phantom elements in android GUI.
So there may be some other effect induced by c2c2e9a commit.

Are all the dri_single/dri_double srgb/non srgb correctly set?

What is strange is that all kind of chipsets are affected.

Mauro

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #9 from Mauro Rossi  ---
>> Hi Ilia,

My greeting also to Ilia, but you are Marek
Sorry

Mauro

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Make brw_varying_to_offset take a const pointer to the VUE map.

2015-10-01 Thread Matt Turner
Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-01 Thread Marek Olšák
Ping

On Tue, Sep 29, 2015 at 2:37 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Required by ARB_sample_shading for drivers that don't want a shader variant
> in st/mesa.
> ---
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-01 Thread Ilia Mirkin
To clarify, the effect on drivers not exposing this cap is nil right?

On Thu, Oct 1, 2015 at 6:24 PM, Marek Olšák  wrote:
> Ping
>
> On Tue, Sep 29, 2015 at 2:37 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> Required by ARB_sample_shading for drivers that don't want a shader variant
>> in st/mesa.
>> ---
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-01 Thread Marek Olšák
On Fri, Oct 2, 2015 at 12:26 AM, Ilia Mirkin  wrote:
> To clarify, the effect on drivers not exposing this cap is nil right?

Yes.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] st/mesa: set force_persample_interp if ARB_sample_shading is used

2015-10-01 Thread Ilia Mirkin
On Mon, Sep 28, 2015 at 8:38 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This is only a half of the work. The next patch will handle
> gl_SampleID/SamplePos, which is the other half of ARB_sample_shading.
> ---
>  src/mesa/state_tracker/st_atom_rasterizer.c | 8 
>  src/mesa/state_tracker/st_atom_shader.c | 1 +
>  src/mesa/state_tracker/st_context.c | 2 ++
>  src/mesa/state_tracker/st_context.h | 1 +
>  4 files changed, 12 insertions(+)
>
> diff --git a/src/mesa/state_tracker/st_atom_rasterizer.c 
> b/src/mesa/state_tracker/st_atom_rasterizer.c
> index cceed42..3a9c461 100644
> --- a/src/mesa/state_tracker/st_atom_rasterizer.c
> +++ b/src/mesa/state_tracker/st_atom_rasterizer.c
> @@ -237,6 +237,14 @@ static void update_raster_state( struct st_context *st )
> /* _NEW_MULTISAMPLE */
> raster->multisample = ctx->Multisample._Enabled;
>
> +   /* _NEW_MULTISAMPLE | _NEW_BUFFERS */
> +   raster->force_persample_interp =
> + st->can_force_persample_interp &&
> + ctx->Multisample._Enabled &&
> + ctx->Multisample.SampleShading &&
> + MAX2(ceil(ctx->Multisample.MinSampleShadingValue *
> +   ctx->DrawBuffer->Visual.samples), 1) > 1;

The max seems unnecessary... as does the ceil. You should just be able
to do MinSampleShadingValue * samples > 1.

With that fixed, Reviewed-by: Ilia Mirkin 

> +
> /* _NEW_SCISSOR */
> raster->scissor = ctx->Scissor.EnableFlags;
>
> diff --git a/src/mesa/state_tracker/st_atom_shader.c 
> b/src/mesa/state_tracker/st_atom_shader.c
> index fee15a9..dc03156 100644
> --- a/src/mesa/state_tracker/st_atom_shader.c
> +++ b/src/mesa/state_tracker/st_atom_shader.c
> @@ -72,6 +72,7 @@ update_fp( struct st_context *st )
>
> /* Ignore sample qualifier while computing this flag. */
> key.persample_shading =
> +  !st->can_force_persample_interp &&
>_mesa_get_min_invocations_per_fragment(st->ctx, &stfp->Base, true) > 1;
>
> st->fp_variant = st_get_fp_variant(st, stfp, &key);
> diff --git a/src/mesa/state_tracker/st_context.c 
> b/src/mesa/state_tracker/st_context.c
> index 72c23ca..7061191 100644
> --- a/src/mesa/state_tracker/st_context.c
> +++ b/src/mesa/state_tracker/st_context.c
> @@ -237,6 +237,8 @@ st_create_context_priv( struct gl_context *ctx, struct 
> pipe_context *pipe,
>PIPE_BIND_SAMPLER_VIEW);
> st->prefer_blit_based_texture_transfer = screen->get_param(screen,
>PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER);
> +   st->can_force_persample_interp = screen->get_param(screen,
> +  PIPE_CAP_FORCE_PERSAMPLE_INTERP);
>
> st->needs_texcoord_semantic =
>screen->get_param(screen, PIPE_CAP_TGSI_TEXCOORD);
> diff --git a/src/mesa/state_tracker/st_context.h 
> b/src/mesa/state_tracker/st_context.h
> index 81d5480..a4cda29 100644
> --- a/src/mesa/state_tracker/st_context.h
> +++ b/src/mesa/state_tracker/st_context.h
> @@ -98,6 +98,7 @@ struct st_context
> boolean has_etc1;
> boolean has_etc2;
> boolean prefer_blit_based_texture_transfer;
> +   boolean can_force_persample_interp;
>
> boolean needs_texcoord_semantic;
> boolean apply_texture_swizzle_to_border_color;
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] st/mesa: automatically set per-sample interpolation if using SampleID/Pos

2015-10-01 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Sep 28, 2015 at 8:38 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> ---
>  src/mesa/state_tracker/st_atom_shader.c | 8 +++-
>  src/mesa/state_tracker/st_program.c | 4 +++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/state_tracker/st_atom_shader.c 
> b/src/mesa/state_tracker/st_atom_shader.c
> index dc03156..1e880a1 100644
> --- a/src/mesa/state_tracker/st_atom_shader.c
> +++ b/src/mesa/state_tracker/st_atom_shader.c
> @@ -70,9 +70,15 @@ update_fp( struct st_context *st )
> key.clamp_color = st->clamp_frag_color_in_shader &&
>   st->ctx->Color._ClampFragmentColor;
>
> -   /* Ignore sample qualifier while computing this flag. */
> +   /* Don't set it if the driver can force the interpolation by itself.
> +* If SAMPLE_ID or SAMPLE_POS are used, the interpolation is set
> +* automatically.
> +* Ignore sample qualifier while computing this flag.
> +*/
> key.persample_shading =
>!st->can_force_persample_interp &&
> +  !(stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
> +SYSTEM_BIT_SAMPLE_POS)) &&
>_mesa_get_min_invocations_per_fragment(st->ctx, &stfp->Base, true) > 1;
>
> st->fp_variant = st_get_fp_variant(st, stfp, &key);
> diff --git a/src/mesa/state_tracker/st_program.c 
> b/src/mesa/state_tracker/st_program.c
> index 9c27147..a07f8fe 100644
> --- a/src/mesa/state_tracker/st_program.c
> +++ b/src/mesa/state_tracker/st_program.c
> @@ -619,7 +619,9 @@ st_translate_fragment_program(struct st_context *st,
>   else
>  interpLocation[slot] = TGSI_INTERPOLATE_LOC_CENTER;
>
> - if (key->persample_shading)
> + if (stfp->Base.Base.SystemValuesRead & (SYSTEM_BIT_SAMPLE_ID |
> + SYSTEM_BIT_SAMPLE_POS) ||
> + key->persample_shading)
>  interpLocation[slot] = TGSI_INTERPOLATE_LOC_SAMPLE;
>
>   switch (attr) {
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-01 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Sep 28, 2015 at 8:37 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Required by ARB_sample_shading for drivers that don't want a shader variant
> in st/mesa.
> ---
>  src/gallium/docs/source/screen.rst   | 9 +
>  src/gallium/drivers/freedreno/freedreno_screen.c | 1 +
>  src/gallium/drivers/i915/i915_screen.c   | 1 +
>  src/gallium/drivers/ilo/ilo_screen.c | 1 +
>  src/gallium/drivers/llvmpipe/lp_screen.c | 1 +
>  src/gallium/drivers/nouveau/nv30/nv30_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nv50/nv50_screen.c   | 1 +
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   | 1 +
>  src/gallium/drivers/r300/r300_screen.c   | 1 +
>  src/gallium/drivers/r600/r600_pipe.c | 1 +
>  src/gallium/drivers/radeonsi/si_pipe.c   | 1 +
>  src/gallium/drivers/softpipe/sp_screen.c | 1 +
>  src/gallium/drivers/svga/svga_screen.c   | 1 +
>  src/gallium/drivers/vc4/vc4_screen.c | 1 +
>  src/gallium/include/pipe/p_defines.h | 1 +
>  src/gallium/include/pipe/p_state.h   | 1 +
>  16 files changed, 24 insertions(+)
>
> diff --git a/src/gallium/docs/source/screen.rst 
> b/src/gallium/docs/source/screen.rst
> index e780047..e08844b 100644
> --- a/src/gallium/docs/source/screen.rst
> +++ b/src/gallium/docs/source/screen.rst
> @@ -268,6 +268,15 @@ The integer capabilities:
>bounds_max states of pipe_depth_stencil_alpha_state behave according
>to the GL_EXT_depth_bounds_test specification.
>  * ``PIPE_CAP_TGSI_TXQS``: Whether the `TXQS` opcode is supported
> +* ``PIPE_CAP_FORCE_PERSAMPLE_INTERP``: If the driver can force per-sample
> +  interpolation for all fragment shader inputs if
> +  pipe_rasterizer_state::force_persample_interp is set. This is only used
> +  by GL3-level sample shading (ARB_sample_shading). GL4-level sample shading
> +  (ARB_gpu_shader5) doesn't use this. While GL3 hardware has a state for it,
> +  GL4 hardware will likely need to emulate it with a shader variant, or by
> +  selecting the interpolation weights with a conditional assignment
> +  in the shader.
> +
>
>
>  .. _pipe_capf:
> diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
> b/src/gallium/drivers/freedreno/freedreno_screen.c
> index 9a684d4..0d01005 100644
> --- a/src/gallium/drivers/freedreno/freedreno_screen.c
> +++ b/src/gallium/drivers/freedreno/freedreno_screen.c
> @@ -235,6 +235,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
> case PIPE_CAP_DEPTH_BOUNDS_TEST:
> case PIPE_CAP_TGSI_TXQS:
> +   case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
> return 0;
>
> case PIPE_CAP_MAX_VIEWPORTS:
> diff --git a/src/gallium/drivers/i915/i915_screen.c 
> b/src/gallium/drivers/i915/i915_screen.c
> index 51c64ed..9d6b3d3 100644
> --- a/src/gallium/drivers/i915/i915_screen.c
> +++ b/src/gallium/drivers/i915/i915_screen.c
> @@ -248,6 +248,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
> cap)
> case PIPE_CAP_TEXTURE_HALF_FLOAT_LINEAR:
> case PIPE_CAP_DEPTH_BOUNDS_TEST:
> case PIPE_CAP_TGSI_TXQS:
> +   case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
>return 0;
>
> case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS:
> diff --git a/src/gallium/drivers/ilo/ilo_screen.c 
> b/src/gallium/drivers/ilo/ilo_screen.c
> index 9e37e24..76812a6 100644
> --- a/src/gallium/drivers/ilo/ilo_screen.c
> +++ b/src/gallium/drivers/ilo/ilo_screen.c
> @@ -470,6 +470,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap 
> param)
> case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
> case PIPE_CAP_DEPTH_BOUNDS_TEST:
> case PIPE_CAP_TGSI_TXQS:
> +   case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
>return 0;
>
> case PIPE_CAP_VENDOR_ID:
> diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
> b/src/gallium/drivers/llvmpipe/lp_screen.c
> index 697e3d9..50c3781 100644
> --- a/src/gallium/drivers/llvmpipe/lp_screen.c
> +++ b/src/gallium/drivers/llvmpipe/lp_screen.c
> @@ -297,6 +297,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
> pipe_cap param)
> case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
> case PIPE_CAP_DEPTH_BOUNDS_TEST:
> case PIPE_CAP_TGSI_TXQS:
> +   case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
>return 0;
> }
> /* should only get here on unhandled cases */
> diff --git a/src/gallium/drivers/nouveau/nv30/nv30_screen.c 
> b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> index 806d4e6..39267b3 100644
> --- a/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> +++ b/src/gallium/drivers/nouveau/nv30/nv30_screen.c
> @@ -170,6 +170,7 @@ nv30_screen_get_param(struct pipe_screen *pscreen, enum 
> pipe_cap param)
> case PIPE_CAP_TEXTURE_FLOAT_LINEAR:
> case PIPE_CAP_TEXTURE_HALF_FLOAT_LINEAR:
> case PIPE_CAP_TGSI_TXQS:
> +   case PIPE_CAP_FORCE_PERSAMPLE_INTERP:
>return 0;
>
> case PIPE_CAP_VENDOR_ID:
> diff -

[Mesa-dev] [Bug 91596] EGL_KHR_gl_colorspace (v2) causes problem with Android-x86 GUI

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=91596

--- Comment #10 from Emil Velikov  ---
In the few moments that android-x86 was up I've noticed a very interesting
commit [1]. In case you don't have a local copy, it enables a workaround
#10194508, as seen here [2].

Not sure if the android-x86 version has the extra qcom 'hacks' so it might be
worth experimenting on the topic - adding/reverting the qcom commit, reverting
the original patch [1], all that whist toggling on/off the extension ;-)


[1]
http://git.android-x86.org/?p=platform/frameworks/native.git;a=commit;h=39aa8695464e496ec4f6d22e9af54457bcfdf4ae
[2]
https://github.com/omnirom/android_frameworks_native/blob/HEAD/opengl/libs/EGL/eglApi.cpp#L431

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/2] i965/fs: Handle non-const sample number in interpolateAtSample

2015-10-01 Thread Matt Turner
On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts  wrote:
> If a non-const sample number is given to interpolateAtSample it will
> now generate an indirect send message with the sample ID similar to
> how non-const sampler array indexing works. Previously non-const
> values were ignored and instead it ended up using a constant 0 value.
>
> The generator will try to determine if the sample ID is dynamically
> uniform via nir_src_is_dynamically_uniform. If not it will query the
> pixel interpolator in a loop, once for each possible sample number.
> This is necessary because the indirect send message doesn't seem to
> have a way to specify a different value for each fragment.

Heh, I think this was the solution Curro had in mind a couple of years ago. :)

I've Cc'd him. It'd be good for him to review this.

> The range of possible sample numbers is determined using
> STATE_NUM_SAMPLES. When linking the shader it will now add a reference
> to this state if any dynamically non-uniform calls to
> interpolateAtSample are found.
>
> This fixes the following two Piglit tests:
>
> arb_gpu_shader5-interpolateAtSample-nonconst
> arb_gpu_shader5-interpolateAtSample-dynamically-nonuniform
>
> v2: Handle dynamically non-uniform sample ids.
> ---
>  src/mesa/drivers/dri/i965/brw_eu.h |   2 +-
>  src/mesa/drivers/dri/i965/brw_eu_emit.c|  34 ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp |   5 +-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 119 
> +
>  src/mesa/drivers/dri/i965/brw_program.c|  54 +++
>  src/mesa/drivers/dri/i965/brw_program.h|   1 +
>  src/mesa/drivers/dri/i965/brw_shader.cpp   |   2 +
>  7 files changed, 185 insertions(+), 32 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
> b/src/mesa/drivers/dri/i965/brw_eu.h
> index 761aa0e..0ac1ad9 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.h
> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> @@ -461,7 +461,7 @@ brw_pixel_interpolator_query(struct brw_codegen *p,
>   struct brw_reg mrf,
>   bool noperspective,
>   unsigned mode,
> - unsigned data,
> + struct brw_reg data,
>   unsigned msg_length,
>   unsigned response_length);
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 4d39762..25524d4 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -3192,26 +3192,38 @@ brw_pixel_interpolator_query(struct brw_codegen *p,
>   struct brw_reg mrf,
>   bool noperspective,
>   unsigned mode,
> - unsigned data,
> + struct brw_reg data,
>   unsigned msg_length,
>   unsigned response_length)
>  {
> const struct brw_device_info *devinfo = p->devinfo;
> -   struct brw_inst *insn = next_insn(p, BRW_OPCODE_SEND);
> +   struct brw_inst *insn;
> +   uint16_t exec_size;
>
> -   brw_set_dest(p, insn, dest);
> -   brw_set_src0(p, insn, mrf);
> -   brw_set_message_descriptor(p, insn, GEN7_SFID_PIXEL_INTERPOLATOR,
> -  msg_length, response_length,
> -  false /* header is never present for PI */,
> -  false);
> +   if (data.file == BRW_IMMEDIATE_VALUE) {
> +  insn = next_insn(p, BRW_OPCODE_SEND);
> +  brw_set_dest(p, insn, dest);
> +  brw_set_src0(p, insn, mrf);
> +  brw_set_message_descriptor(p, insn, GEN7_SFID_PIXEL_INTERPOLATOR,
> + msg_length, response_length,
> + false /* header is never present for PI */,
> + false);
> +  brw_inst_set_pi_message_data(devinfo, insn, data.dw1.ud);
> +   } else {
> +  insn = brw_send_indirect_message(p,
> +   GEN7_SFID_PIXEL_INTERPOLATOR,
> +   dest,
> +   mrf,
> +   vec1(data));
> +  brw_inst_set_mlen(devinfo, insn, msg_length);
> +  brw_inst_set_rlen(devinfo, insn, response_length);
> +   }
>
> -   brw_inst_set_pi_simd_mode(
> - devinfo, insn, brw_inst_exec_size(devinfo, insn) == BRW_EXECUTE_16);
> +   exec_size = brw_inst_exec_size(devinfo, p->current);
> +   brw_inst_set_pi_simd_mode(devinfo, insn, exec_size == BRW_EXECUTE_16);
> brw_inst_set_pi_slot_group(devinfo, insn, 0); /* zero unless 32/64px 
> dispatch */
> brw_inst_set_pi_nopersp(devinfo, insn, noperspective);
> brw_inst_set_pi_message_type(devinfo, insn, mode);
> -   brw_inst_set_pi_message_data(devinfo, insn, data);
>  }
>

[Mesa-dev] [Bug 92221] Unintended code changes in _mesa_base_tex_format commit

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92221

--- Comment #3 from Nanley Chery  ---
The code that was added was previously removed with this refactoring commit:
99b1f4751f97631011b64fabcb57acf6beae01ac

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: avoid leaking closure when iterating over a string_to_uint_map

2015-10-01 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/program/hash_table.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h
index e85a836..d0a2abf 100644
--- a/src/mesa/program/hash_table.h
+++ b/src/mesa/program/hash_table.h
@@ -249,6 +249,7 @@ public:
   wrapper->closure = closure;
 
   hash_table_call_foreach(this->ht, subtract_one_wrapper, wrapper);
+  free(wrapper);
}
 
/**
-- 
2.4.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: avoid leaking hiddenUniforms map when there are no uniforms

2015-10-01 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/glsl/link_uniforms.cpp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 47d49c8..740b0a4 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -1131,15 +1131,15 @@ link_assign_uniform_locations(struct gl_shader_program 
*prog,
const unsigned num_data_slots = uniform_size.num_values;
const unsigned hidden_uniforms = uniform_size.num_hidden_uniforms;
 
+   /* assign hidden uniforms a slot id */
+   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size);
+   delete hiddenUniforms;
+
/* On the outside chance that there were no uniforms, bail out.
 */
if (num_uniforms == 0)
   return;
 
-   /* assign hidden uniforms a slot id */
-   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size);
-   delete hiddenUniforms;
-
struct gl_uniform_storage *uniforms =
   rzalloc_array(prog, struct gl_uniform_storage, num_uniforms);
union gl_constant_value *data =
-- 
2.4.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: don't forget to free image_param on prog_data free

2015-10-01 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/drivers/dri/i965/brw_program.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
b/src/mesa/drivers/dri/i965/brw_program.c
index fee96a8..0e4b823 100644
--- a/src/mesa/drivers/dri/i965/brw_program.c
+++ b/src/mesa/drivers/dri/i965/brw_program.c
@@ -551,6 +551,7 @@ brw_stage_prog_data_free(const void *p)
 
ralloc_free(prog_data->param);
ralloc_free(prog_data->pull_param);
+   ralloc_free(prog_data->image_param);
 }
 
 void
-- 
2.4.9

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] gallium: add per-sample interpolation control into rasterizer state

2015-10-01 Thread Roland Scheidegger
Can't say I'm a big fan of doing essentially the same thing with
different methods, but well it's coming from GL, and if it helps some
drivers...

Acked-by: Roland Scheidegger 

Am 02.10.2015 um 00:24 schrieb Marek Olšák:
> Ping
> 
> On Tue, Sep 29, 2015 at 2:37 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> Required by ARB_sample_shading for drivers that don't want a shader variant
>> in st/mesa.
>> ---
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.freedesktop.org_mailman_listinfo_mesa-2Ddev&d=BQIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=Vjtt0vs_iqoI31UfJxBl7yv9I2FeiaeAYgMTLKRBc_I&m=GgTDbmCyFPdfRfqaUOiQKcxwA2fX_L2ryeWXZyYNZP4&s=G7zIQcl0J49zTLP3pzsgtFiRZCCYkqUpno7eiJdOXAg&e=
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/13] i965/backend_shader: Add a field to store the NIR shader

2015-10-01 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.h  |  8 +++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp| 32 -
 src/mesa/drivers/dri/i965/brw_shader.cpp|  1 +
 src/mesa/drivers/dri/i965/brw_shader.h  |  1 +
 src/mesa/drivers/dri/i965/brw_vec4.h|  6 ++---
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp   |  6 ++---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp  | 24 +--
 8 files changed, 39 insertions(+), 41 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 3ab01c7..b932ed2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -236,10 +236,10 @@ public:
uint32_t spill_offset, int count);
 
void emit_nir_code();
-   void nir_setup_inputs(nir_shader *shader);
-   void nir_setup_outputs(nir_shader *shader);
-   void nir_setup_uniforms(nir_shader *shader);
-   void nir_emit_system_values(nir_shader *shader);
+   void nir_setup_inputs();
+   void nir_setup_outputs();
+   void nir_setup_uniforms();
+   void nir_emit_system_values();
void nir_emit_impl(nir_function_impl *impl);
void nir_emit_cf_list(exec_list *list);
void nir_emit_if(nir_if *if_stmt);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index d70c672..3793791 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -37,15 +37,13 @@ using namespace brw::surface_access;
 void
 fs_visitor::emit_nir_code()
 {
-   nir_shader *nir = prog->nir;
-
/* emit the arrays used for inputs and outputs - load/store intrinsics will
 * be converted to reads/writes of these arrays
 */
-   nir_setup_inputs(nir);
-   nir_setup_outputs(nir);
-   nir_setup_uniforms(nir);
-   nir_emit_system_values(nir);
+   nir_setup_inputs();
+   nir_setup_outputs();
+   nir_setup_uniforms();
+   nir_emit_system_values();
 
/* get the main function and emit it */
nir_foreach_overload(nir, overload) {
@@ -56,11 +54,11 @@ fs_visitor::emit_nir_code()
 }
 
 void
-fs_visitor::nir_setup_inputs(nir_shader *shader)
+fs_visitor::nir_setup_inputs()
 {
-   nir_inputs = bld.vgrf(BRW_REGISTER_TYPE_F, shader->num_inputs);
+   nir_inputs = bld.vgrf(BRW_REGISTER_TYPE_F, nir->num_inputs);
 
-   foreach_list_typed(nir_variable, var, node, &shader->inputs) {
+   foreach_list_typed(nir_variable, var, node, &nir->inputs) {
   enum brw_reg_type type = brw_type_for_base_type(var->type);
   fs_reg input = offset(nir_inputs, bld, var->data.driver_location);
 
@@ -118,13 +116,13 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
 }
 
 void
-fs_visitor::nir_setup_outputs(nir_shader *shader)
+fs_visitor::nir_setup_outputs()
 {
brw_wm_prog_key *key = (brw_wm_prog_key*) this->key;
 
-   nir_outputs = bld.vgrf(BRW_REGISTER_TYPE_F, shader->num_outputs);
+   nir_outputs = bld.vgrf(BRW_REGISTER_TYPE_F, nir->num_outputs);
 
-   foreach_list_typed(nir_variable, var, node, &shader->outputs) {
+   foreach_list_typed(nir_variable, var, node, &nir->outputs) {
   fs_reg reg = offset(nir_outputs, bld, var->data.driver_location);
 
   int vector_elements =
@@ -175,14 +173,14 @@ fs_visitor::nir_setup_outputs(nir_shader *shader)
 }
 
 void
-fs_visitor::nir_setup_uniforms(nir_shader *shader)
+fs_visitor::nir_setup_uniforms()
 {
if (dispatch_width != 8)
   return;
 
-   uniforms = shader->num_uniforms;
+   uniforms = nir->num_uniforms;
 
-   foreach_list_typed(nir_variable, var, node, &shader->uniforms) {
+   foreach_list_typed(nir_variable, var, node, &nir->uniforms) {
   /* UBO's and atomics don't take up space in the uniform file */
   if (var->interface_type != NULL || var->type->contains_atomic())
  continue;
@@ -274,10 +272,10 @@ emit_system_values_block(nir_block *block, void 
*void_visitor)
 }
 
 void
-fs_visitor::nir_emit_system_values(nir_shader *shader)
+fs_visitor::nir_emit_system_values()
 {
nir_system_values = ralloc_array(mem_ctx, fs_reg, SYSTEM_VALUE_MAX);
-   nir_foreach_overload(shader, overload) {
+   nir_foreach_overload(nir, overload) {
   assert(strcmp(overload->function->name, "main") == 0);
   assert(overload->impl);
   nir_foreach_block(overload->impl, emit_system_values_block, this);
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 7c6043d..09a9859 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -913,6 +913,7 @@ backend_shader::backend_shader(const struct brw_compiler 
*compiler,
: compiler(compiler),
  log_data(log_data),
  devinfo(compiler->devinfo),
+ nir(prog->nir),
  shader(shader_prog ?
 (struct brw_shader *)shader_prog->_LinkedShaders[stage] : NULL),
  shader_prog(shader_prog),
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h

[Mesa-dev] [PATCH 08/13] i965/cs: Remove the prog argument from local_id_payload_dwords

2015-10-01 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_cs.h| 3 +--
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 2 +-
 src/mesa/drivers/dri/i965/gen7_cs_state.c | 7 +++
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.h 
b/src/mesa/drivers/dri/i965/brw_cs.h
index 0018c04..0c0ed2b 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.h
+++ b/src/mesa/drivers/dri/i965/brw_cs.h
@@ -49,8 +49,7 @@ brw_cs_emit(struct brw_context *brw,
 unsigned *final_assembly_size);
 
 unsigned
-brw_cs_prog_local_id_payload_dwords(const struct gl_program *prog,
-unsigned dispatch_width);
+brw_cs_prog_local_id_payload_dwords(unsigned dispatch_width);
 
 #ifdef __cplusplus
 }
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a4865b2..699d3d0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4727,7 +4727,7 @@ fs_visitor::setup_cs_payload()
 
if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) {
   const unsigned local_id_dwords =
- brw_cs_prog_local_id_payload_dwords(prog, dispatch_width);
+ brw_cs_prog_local_id_payload_dwords(dispatch_width);
   assert((local_id_dwords & 0x7) == 0);
   const unsigned local_id_regs = local_id_dwords / 8;
   payload.local_invocation_id_reg = payload.num_regs;
diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c 
b/src/mesa/drivers/dri/i965/gen7_cs_state.c
index 0b88b2c..5edc4fc 100644
--- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
@@ -72,7 +72,7 @@ brw_upload_cs_state(struct brw_context *brw)
 
if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) {
   local_id_dwords =
- brw_cs_prog_local_id_payload_dwords(prog, cs_prog_data->simd_size);
+ brw_cs_prog_local_id_payload_dwords(cs_prog_data->simd_size);
}
 
unsigned push_constant_data_size =
@@ -216,8 +216,7 @@ const struct brw_tracked_state brw_cs_state = {
  *
  */
 unsigned
-brw_cs_prog_local_id_payload_dwords(const struct gl_program *prog,
-unsigned dispatch_width)
+brw_cs_prog_local_id_payload_dwords(unsigned dispatch_width)
 {
return 3 * dispatch_width;
 }
@@ -272,7 +271,7 @@ brw_upload_cs_push_constants(struct brw_context *brw,
 
if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) {
   local_id_dwords =
- brw_cs_prog_local_id_payload_dwords(prog, cs_prog_data->simd_size);
+ brw_cs_prog_local_id_payload_dwords(cs_prog_data->simd_size);
}
 
/* Updates the ParamaterValues[i] pointers for all parameters of the
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/13] i965: Move binding table setup to codegen time.

2015-10-01 Thread Jason Ekstrand
Setting up binding tables really has little to do with the actual process
of turning shaders into instructions; it's more part of setting up
prog_data.  This commit moves it out of the visitors and with the rest of
the prog_data setup stuff.
---
 src/mesa/drivers/dri/i965/brw_cs.c| 20 +
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 42 ---
 src/mesa/drivers/dri/i965/brw_fs.h|  2 --
 src/mesa/drivers/dri/i965/brw_gs.c| 19 
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  9 --
 src/mesa/drivers/dri/i965/brw_vec4.h  |  1 -
 src/mesa/drivers/dri/i965/brw_vs.c|  5 
 src/mesa/drivers/dri/i965/brw_wm.c| 23 +++
 src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp | 11 ---
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h   |  1 -
 10 files changed, 67 insertions(+), 66 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
b/src/mesa/drivers/dri/i965/brw_cs.c
index 24120fb..1857962 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.c
+++ b/src/mesa/drivers/dri/i965/brw_cs.c
@@ -32,6 +32,23 @@
 #include "intel_batchbuffer.h"
 #include "glsl/nir/nir.h"
 
+static void
+assign_cs_binding_table_offsets(const struct brw_device_info *devinfo,
+const struct gl_shader_program *shader_prog,
+const struct gl_program *prog,
+struct brw_cs_prog_data *prog_data)
+{
+   uint32_t next_binding_table_offset = 0;
+
+   /* May not be used if the gl_NumWorkGroups variable is not accessed. */
+   prog_data->binding_table.work_groups_start = next_binding_table_offset;
+   next_binding_table_offset++;
+
+   brw_assign_common_binding_table_offsets(MESA_SHADER_COMPUTE, devinfo,
+   shader_prog, prog, &prog_data->base,
+   next_binding_table_offset);
+}
+
 static bool
 brw_codegen_cs_prog(struct brw_context *brw,
 struct gl_shader_program *prog,
@@ -52,6 +69,9 @@ brw_codegen_cs_prog(struct brw_context *brw,
 
memset(&prog_data, 0, sizeof(prog_data));
 
+   assign_cs_binding_table_offsets(brw->intelScreen->devinfo, prog,
+   &cp->program.Base, &prog_data);
+
/* Allocate the references to the uniforms that will end up in the
 * prog_data associated with the compiled program, and which will be freed
 * by the state cache.
diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index e90151b..a4865b2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4736,41 +4736,6 @@ fs_visitor::setup_cs_payload()
 }
 
 void
-fs_visitor::assign_fs_binding_table_offsets()
-{
-   assert(stage == MESA_SHADER_FRAGMENT);
-   brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this->prog_data;
-   brw_wm_prog_key *key = (brw_wm_prog_key*) this->key;
-   uint32_t next_binding_table_offset = 0;
-
-   /* If there are no color regions, we still perform an FB write to a null
-* renderbuffer, which we place at surface index 0.
-*/
-   prog_data->binding_table.render_target_start = next_binding_table_offset;
-   next_binding_table_offset += MAX2(key->nr_color_regions, 1);
-
-   brw_assign_common_binding_table_offsets(MESA_SHADER_FRAGMENT, devinfo,
-   shader_prog, prog, stage_prog_data,
-   next_binding_table_offset);
-}
-
-void
-fs_visitor::assign_cs_binding_table_offsets()
-{
-   assert(stage == MESA_SHADER_COMPUTE);
-   brw_cs_prog_data *prog_data = (brw_cs_prog_data*) this->prog_data;
-   uint32_t next_binding_table_offset = 0;
-
-   /* May not be used if the gl_NumWorkGroups variable is not accessed. */
-   prog_data->binding_table.work_groups_start = next_binding_table_offset;
-   next_binding_table_offset++;
-
-   brw_assign_common_binding_table_offsets(MESA_SHADER_COMPUTE, devinfo,
-   shader_prog, prog, stage_prog_data,
-   next_binding_table_offset);
-}
-
-void
 fs_visitor::calculate_register_pressure()
 {
invalidate_live_intervals();
@@ -4983,9 +4948,6 @@ fs_visitor::run_vs(gl_clip_plane *clip_planes)
 {
assert(stage == MESA_SHADER_VERTEX);
 
-   brw_assign_common_binding_table_offsets(MESA_SHADER_VERTEX, devinfo,
-   shader_prog, prog, stage_prog_data,
-   0);
setup_vs_payload();
 
if (shader_time_index >= 0)
@@ -5026,8 +4988,6 @@ fs_visitor::run_fs(bool do_rep_send)
 
sanity_param_count = prog->Parameters->NumParameters;
 
-   assign_fs_binding_table_offsets();
-
if (devinfo->gen >= 6)
   setup_payload_gen6();
else
@@ -5114,8 +5074,6 @@ fs_visitor::run_cs()
 
sanity_param_count = prog->Parameters->NumParameters;
 
-   assign_cs_bind

[Mesa-dev] [PATCH 01/13] i965/shader: Pull assign_common_binding_table_offsets out of backend_shader

2015-10-01 Thread Jason Ekstrand
This really has nothing to do with the backend compiler and we'd like to
eventually be able to set this up earlier in the compile process.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 12 +---
 src/mesa/drivers/dri/i965/brw_shader.cpp  | 17 +
 src/mesa/drivers/dri/i965/brw_shader.h| 11 +--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  3 ++-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp |  4 +++-
 5 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index b062219..e90151b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4749,7 +4749,9 @@ fs_visitor::assign_fs_binding_table_offsets()
prog_data->binding_table.render_target_start = next_binding_table_offset;
next_binding_table_offset += MAX2(key->nr_color_regions, 1);
 
-   assign_common_binding_table_offsets(next_binding_table_offset);
+   brw_assign_common_binding_table_offsets(MESA_SHADER_FRAGMENT, devinfo,
+   shader_prog, prog, stage_prog_data,
+   next_binding_table_offset);
 }
 
 void
@@ -4763,7 +4765,9 @@ fs_visitor::assign_cs_binding_table_offsets()
prog_data->binding_table.work_groups_start = next_binding_table_offset;
next_binding_table_offset++;
 
-   assign_common_binding_table_offsets(next_binding_table_offset);
+   brw_assign_common_binding_table_offsets(MESA_SHADER_COMPUTE, devinfo,
+   shader_prog, prog, stage_prog_data,
+   next_binding_table_offset);
 }
 
 void
@@ -4979,7 +4983,9 @@ fs_visitor::run_vs(gl_clip_plane *clip_planes)
 {
assert(stage == MESA_SHADER_VERTEX);
 
-   assign_common_binding_table_offsets(0);
+   brw_assign_common_binding_table_offsets(MESA_SHADER_VERTEX, devinfo,
+   shader_prog, prog, stage_prog_data,
+   0);
setup_vs_payload();
 
if (shader_time_index >= 0)
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 1d184a7..7c6043d 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -1363,16 +1363,25 @@ backend_shader::invalidate_cfg()
  * trigger some of our asserts that surface indices are < BRW_MAX_SURFACES.
  */
 void
-backend_shader::assign_common_binding_table_offsets(uint32_t 
next_binding_table_offset)
+brw_assign_common_binding_table_offsets(gl_shader_stage stage,
+const struct brw_device_info *devinfo,
+const struct gl_shader_program 
*shader_prog,
+const struct gl_program *prog,
+struct brw_stage_prog_data 
*stage_prog_data,
+uint32_t next_binding_table_offset)
 {
+   const struct gl_shader *shader = NULL;
int num_textures = _mesa_fls(prog->SamplersUsed);
 
+   if (shader_prog)
+  shader = shader_prog->_LinkedShaders[stage];
+
stage_prog_data->binding_table.texture_start = next_binding_table_offset;
next_binding_table_offset += num_textures;
 
if (shader) {
   stage_prog_data->binding_table.ubo_start = next_binding_table_offset;
-  next_binding_table_offset += shader->base.NumUniformBlocks;
+  next_binding_table_offset += shader->NumUniformBlocks;
} else {
   stage_prog_data->binding_table.ubo_start = 0xd0d0d0d0;
}
@@ -1403,9 +1412,9 @@ 
backend_shader::assign_common_binding_table_offsets(uint32_t next_binding_table_
   stage_prog_data->binding_table.abo_start = 0xd0d0d0d0;
}
 
-   if (shader && shader->base.NumImages) {
+   if (shader && shader->NumImages) {
   stage_prog_data->binding_table.image_start = next_binding_table_offset;
-  next_binding_table_offset += shader->base.NumImages;
+  next_binding_table_offset += shader->NumImages;
} else {
   stage_prog_data->binding_table.image_start = 0xd0d0d0d0;
}
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index eeb3306..9690332 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -24,6 +24,7 @@
 #include 
 #include "brw_reg.h"
 #include "brw_defines.h"
+#include "brw_context.h"
 #include "main/compiler.h"
 #include "glsl/ir.h"
 #include "program/prog_parameter.h"
@@ -266,8 +267,6 @@ public:
void calculate_cfg();
void invalidate_cfg();
 
-   void assign_common_binding_table_offsets(uint32_t 
next_binding_table_offset);
-
virtual void invalidate_live_intervals() = 0;
 };
 
@@ -295,6 +294,14 @@ extern "C" {
 struct brw_compiler *
 brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo);
 
+void
+brw_assign_common_binding_t

[Mesa-dev] [PATCH 03/13] i965: Move prog_data uniform setup to the codegen level

2015-10-01 Thread Jason Ekstrand
As of now, uniform setup is more-or-less unified between vec4 and fs and no
longer requires the fs_visitor.  This makes uniform setup more of a
language/API thing than a backend compiler thing.  This commit moves
setting up the stage_prog_data.params arrays to the same place as we set up
the rest of stage_prog_data.
---
 src/mesa/drivers/dri/i965/brw_cs.c |  5 -
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  7 ---
 src/mesa/drivers/dri/i965/brw_gs.c |  5 -
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp |  7 ---
 src/mesa/drivers/dri/i965/brw_vs.c |  9 +
 src/mesa/drivers/dri/i965/brw_wm.c | 10 +-
 6 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.c 
b/src/mesa/drivers/dri/i965/brw_cs.c
index 1857962..6b64030 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.c
+++ b/src/mesa/drivers/dri/i965/brw_cs.c
@@ -30,7 +30,7 @@
 #include "intel_mipmap_tree.h"
 #include "brw_state.h"
 #include "intel_batchbuffer.h"
-#include "glsl/nir/nir.h"
+#include "brw_nir.h"
 
 static void
 assign_cs_binding_table_offsets(const struct brw_device_info *devinfo,
@@ -89,6 +89,9 @@ brw_codegen_cs_prog(struct brw_context *brw,
prog_data.base.nr_params = param_count;
prog_data.base.nr_image_params = cs->base.NumImages;
 
+   brw_nir_setup_glsl_uniforms(cp->program.Base.nir, prog, &cp->program.Base,
+   &prog_data.base, true);
+
if (unlikely(brw->perf_debug)) {
   start_busy = (brw->batch.last_bo &&
 drm_intel_bo_busy(brw->batch.last_bo));
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index eb0fe7b..d70c672 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -182,13 +182,6 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader)
 
uniforms = shader->num_uniforms;
 
-   if (shader_prog) {
-  brw_nir_setup_glsl_uniforms(shader, shader_prog, prog,
-  stage_prog_data, true);
-   } else {
-  brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data);
-   }
-
foreach_list_typed(nir_variable, var, node, &shader->uniforms) {
   /* UBO's and atomics don't take up space in the uniform file */
   if (var->interface_type != NULL || var->type->contains_atomic())
diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
b/src/mesa/drivers/dri/i965/brw_gs.c
index cfed35c..f6b9874 100644
--- a/src/mesa/drivers/dri/i965/brw_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_gs.c
@@ -32,7 +32,7 @@
 #include "brw_vec4_gs_visitor.h"
 #include "brw_state.h"
 #include "brw_ff_gs.h"
-#include "glsl/nir/nir.h"
+#include "brw_nir.h"
 
 static void
 assign_gs_binding_table_offsets(const struct brw_device_info *devinfo,
@@ -91,6 +91,9 @@ brw_codegen_gs_prog(struct brw_context *brw,
c.prog_data.base.base.nr_params = param_count;
c.prog_data.base.base.nr_image_params = gs->NumImages;
 
+   brw_nir_setup_glsl_uniforms(gp->program.Base.nir, prog, &gp->program.Base,
+   &c.prog_data.base.base, false);
+
if (brw->gen >= 8) {
   c.prog_data.static_vertex_count = !gp->program.Base.nir ? -1 :
  nir_gs_count_vertices(gp->program.Base.nir);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 8274d48..41667eb 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -136,13 +136,6 @@ vec4_visitor::nir_setup_uniforms(nir_shader *shader)
 {
uniforms = shader->num_uniforms;
 
-   if (shader_prog) {
-  brw_nir_setup_glsl_uniforms(shader, shader_prog, prog,
-  stage_prog_data, false);
-   } else {
-  brw_nir_setup_arb_uniforms(shader, prog, stage_prog_data);
-   }
-
foreach_list_typed(nir_variable, var, node, &shader->uniforms) {
   /* UBO's and atomics don't take up space in the uniform file */
   if (var->interface_type != NULL || var->type->contains_atomic())
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 8607825..3c6ee0a 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -135,6 +135,15 @@ brw_codegen_vs_prog(struct brw_context *brw,
 stage_prog_data->nr_image_params);
stage_prog_data->nr_params = param_count;
 
+   if (prog) {
+  brw_nir_setup_glsl_uniforms(vp->program.Base.nir, prog, 
&vp->program.Base,
+  &prog_data.base.base,
+  brw->intelScreen->compiler->scalar_vs);
+   } else {
+  brw_nir_setup_arb_uniforms(vp->program.Base.nir, &vp->program.Base,
+ &prog_data.base.base);
+   }
+
GLbitfield64 outputs_written = vp->program.Base.OutputsWritten;
prog_data.inputs_read = vp->program.Base.InputsRead;
 
diff --git a/src/mesa/drivers/dri/i965/brw_wm.

[Mesa-dev] [PATCH 06/13] nir: Move GS data to nir_shader_info

2015-10-01 Thread Jason Ekstrand
---
 src/glsl/nir/glsl_to_nir.cpp   |  4 ++--
 src/glsl/nir/nir.c |  3 ---
 src/glsl/nir/nir.h | 16 
 src/glsl/nir/nir_lower_gs_intrinsics.c |  2 +-
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 6820962..4dd6287 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -150,8 +150,6 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
   if (sh->Program->SamplersUsed & (1 << i))
  num_textures = i;
 
-   shader->gs.vertices_out = sh->Geom.VerticesOut;
-   shader->gs.invocations = sh->Geom.Invocations;
shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name);
shader->info.num_textures = num_textures;
shader->info.num_ubos = sh->NumUniformBlocks;
@@ -164,6 +162,8 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
shader->info.uses_texture_gather = sh->Program->UsesGather;
shader->info.uses_clip_distance_out = sh->Program->UsesClipDistanceOut;
shader->info.separate_shader = shader_prog->SeparateShader;
+   shader->info.gs.vertices_out = sh->Geom.VerticesOut;
+   shader->info.gs.invocations = sh->Geom.Invocations;
 
return shader;
 }
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 7a469a6..e12da80 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -55,9 +55,6 @@ nir_shader_create(void *mem_ctx,
 
shader->stage = stage;
 
-   shader->gs.vertices_out = 0;
-   shader->gs.invocations = 0;
-
return shader;
 }
 
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 320ffee..7bc6934 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1483,6 +1483,14 @@ typedef struct nir_shader_info {
 
/* Whether or not separate shader objects were used */
bool separate_shader;
+
+   struct {
+  /** The maximum number of vertices the geometry shader might write. */
+  unsigned vertices_out;
+
+  /** 1 .. MAX_GEOMETRY_SHADER_INVOCATIONS */
+  unsigned invocations;
+   } gs;
 } nir_shader_info;
 
 typedef struct nir_shader {
@@ -1527,14 +1535,6 @@ typedef struct nir_shader {
 
/** The shader stage, such as MESA_SHADER_VERTEX. */
gl_shader_stage stage;
-
-   struct {
-  /** The maximum number of vertices the geometry shader might write. */
-  unsigned vertices_out;
-
-  /** 1 .. MAX_GEOMETRY_SHADER_INVOCATIONS */
-  unsigned invocations;
-   } gs;
 } nir_shader;
 
 #define nir_foreach_overload(shader, overload)\
diff --git a/src/glsl/nir/nir_lower_gs_intrinsics.c 
b/src/glsl/nir/nir_lower_gs_intrinsics.c
index 2ee4e5c..e0d0678 100644
--- a/src/glsl/nir/nir_lower_gs_intrinsics.c
+++ b/src/glsl/nir/nir_lower_gs_intrinsics.c
@@ -76,7 +76,7 @@ rewrite_emit_vertex(nir_intrinsic_instr *intrin, struct state 
*state)
b->cursor = nir_before_instr(&intrin->instr);
nir_ssa_def *count = nir_load_var(b, state->vertex_count_var);
 
-   nir_ssa_def *max_vertices = nir_imm_int(b, b->shader->gs.vertices_out);
+   nir_ssa_def *max_vertices = nir_imm_int(b, b->shader->info.gs.vertices_out);
 
/* Create: if (vertex_count < max_vertices) and insert it.
 *
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/13] nir: Add a a nir_shader_info struct

2015-10-01 Thread Jason Ekstrand
This commit also adds code to glsl_to_nir and prog_to_nir to fill it out.
---
 src/glsl/nir/glsl_to_nir.cpp   | 18 ++
 src/glsl/nir/nir.c |  1 +
 src/glsl/nir/nir.h | 34 ++
 src/mesa/program/prog_to_nir.c | 13 +
 4 files changed, 66 insertions(+)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index ba08e17..6820962 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -144,8 +144,26 @@ glsl_to_nir(const struct gl_shader_program *shader_prog,
 
nir_lower_outputs_to_temporaries(shader);
 
+   /* TODO: Use _mesa_fls instead */
+   unsigned num_textures = 0;
+   for (unsigned i = 0; i < 8 * sizeof(sh->Program->SamplersUsed); i++)
+  if (sh->Program->SamplersUsed & (1 << i))
+ num_textures = i;
+
shader->gs.vertices_out = sh->Geom.VerticesOut;
shader->gs.invocations = sh->Geom.Invocations;
+   shader->info.name = ralloc_asprintf(shader, "GLSL%d", sh->Name);
+   shader->info.num_textures = num_textures;
+   shader->info.num_ubos = sh->NumUniformBlocks;
+   shader->info.num_abos = shader_prog->NumAtomicBuffers;
+   shader->info.num_ssbos = shader_prog->NumBufferInterfaceBlocks;
+   shader->info.num_images = sh->NumImages;
+   shader->info.inputs_read = sh->Program->InputsRead;
+   shader->info.outputs_written = sh->Program->OutputsWritten;
+   shader->info.system_values_read = sh->Program->SystemValuesRead;
+   shader->info.uses_texture_gather = sh->Program->UsesGather;
+   shader->info.uses_clip_distance_out = sh->Program->UsesClipDistanceOut;
+   shader->info.separate_shader = shader_prog->SeparateShader;
 
return shader;
 }
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index fe10b38..7a469a6 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -41,6 +41,7 @@ nir_shader_create(void *mem_ctx,
exec_list_make_empty(&shader->outputs);
 
shader->options = options;
+   memset(&shader->info, 0, sizeof(shader->info));
 
exec_list_make_empty(&shader->functions);
exec_list_make_empty(&shader->registers);
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index d0c7b04..320ffee 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1454,6 +1454,37 @@ typedef struct nir_shader_compiler_options {
bool native_integers;
 } nir_shader_compiler_options;
 
+typedef struct nir_shader_info {
+   const char *name;
+
+   /* Number of textures used by this shader */
+   unsigned num_textures;
+   /* Number of uniform buffers used by this shader */
+   unsigned num_ubos;
+   /* Number of atomic buffers used by this shader */
+   unsigned num_abos;
+   /* Number of shader storage buffers used by this shader */
+   unsigned num_ssbos;
+   /* Number of images used by this shader */
+   unsigned num_images;
+
+   /* Which inputs are actually read */
+   uint64_t inputs_read;
+   /* Which outputs are actually written */
+   uint64_t outputs_written;
+   /* Which system values are actually read */
+   uint64_t system_values_read;
+
+   /* Whether or not this shader ever uses textureGather() */
+   bool uses_texture_gather;
+
+   /* Whether or not this shader uses the gl_ClipDistance output */
+   bool uses_clip_distance_out;
+
+   /* Whether or not separate shader objects were used */
+   bool separate_shader;
+} nir_shader_info;
+
 typedef struct nir_shader {
/** list of uniforms (nir_variable) */
struct exec_list uniforms;
@@ -1471,6 +1502,9 @@ typedef struct nir_shader {
 */
const struct nir_shader_compiler_options *options;
 
+   /** Various bits of compile-time information about a given shader */
+   struct nir_shader_info info;
+
/** list of global variables in the shader (nir_variable) */
struct exec_list globals;
 
diff --git a/src/mesa/program/prog_to_nir.c b/src/mesa/program/prog_to_nir.c
index 1bd735a..fc00534 100644
--- a/src/mesa/program/prog_to_nir.c
+++ b/src/mesa/program/prog_to_nir.c
@@ -1122,6 +1122,19 @@ prog_to_nir(const struct gl_program *prog,
 
ptn_add_output_stores(c);
 
+   s->info.name = ralloc_asprintf(s, "ARB%d", prog->Id);
+   s->info.num_textures = _mesa_fls(prog->SamplersUsed);
+   s->info.num_ubos = 0;
+   s->info.num_abos = 0;
+   s->info.num_ssbos = 0;
+   s->info.num_images = 0;
+   s->info.inputs_read = prog->InputsRead;
+   s->info.outputs_written = prog->OutputsWritten;
+   s->info.system_values_read = prog->SystemValuesRead;
+   s->info.uses_texture_gather = false;
+   s->info.uses_clip_distance_out = false;
+   s->info.separate_shader = false;
+
 fail:
if (c->error) {
   ralloc_free(s);
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/13] i965/vec4: Use nir info instead of pulling things out of [shader_]prog

2015-10-01 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 
 src/mesa/drivers/dri/i965/brw_wm_iz.cpp| 3 +--
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 9c90726..a8cdd19 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1839,8 +1839,8 @@ vec4_visitor::run()
\
   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) {  \
  char filename[64];\
- snprintf(filename, 64, "%s-%04d-%02d-%02d-" #pass,\
-  stage_abbrev, shader_prog ? shader_prog->Name : 0, 
iteration, pass_num); \
+ snprintf(filename, 64, "%s-%s-%02d-%02d-" #pass,  \
+  stage_abbrev, nir->info.name, iteration, pass_num);  \
\
  backend_shader::dump_instructions(filename);  \
   }\
@@ -1852,8 +1852,8 @@ vec4_visitor::run()
 
if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) {
   char filename[64];
-  snprintf(filename, 64, "%s-%04d-00-start",
-   stage_abbrev, shader_prog ? shader_prog->Name : 0);
+  snprintf(filename, 64, "%s-%s-00-start",
+   stage_abbrev, nir->info.name);
 
   backend_shader::dump_instructions(filename);
}
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 8da8e2c..b25058c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -470,7 +470,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
 
  brw_mark_surface_used(&prog_data->base,
prog_data->base.binding_table.ubo_start +
-   shader_prog->NumBufferInterfaceBlocks - 1);
+   nir->info.num_ssbos - 1);
   }
 
   /* Offset */
@@ -617,7 +617,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
   */
  brw_mark_surface_used(&prog_data->base,
prog_data->base.binding_table.ubo_start +
-   shader_prog->NumBufferInterfaceBlocks - 1);
+   nir->info.num_ssbos - 1);
   }
 
   src_reg offset_reg = src_reg(this, glsl_type::uint_type);
@@ -767,7 +767,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
   */
  brw_mark_surface_used(&prog_data->base,
prog_data->base.binding_table.ubo_start +
-   shader_prog->NumBufferInterfaceBlocks - 1);
+   nir->info.num_ssbos - 1);
   }
 
   unsigned const_offset = instr->const_index[0];
@@ -837,7 +837,7 @@ vec4_visitor::nir_emit_ssbo_atomic(int op, 
nir_intrinsic_instr *instr)
*/
   brw_mark_surface_used(&prog_data->base,
 prog_data->base.binding_table.ubo_start +
-shader_prog->NumBufferInterfaceBlocks - 1);
+nir->info.num_ssbos - 1);
}
 
src_reg offset = get_nir_src(instr->src[1], 1);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_iz.cpp 
b/src/mesa/drivers/dri/i965/brw_wm_iz.cpp
index 14930eb..6f22f29 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_iz.cpp
+++ b/src/mesa/drivers/dri/i965/brw_wm_iz.cpp
@@ -124,12 +124,11 @@ void fs_visitor::setup_payload_gen4()
 {
assert(stage == MESA_SHADER_FRAGMENT);
brw_wm_prog_key *key = (brw_wm_prog_key*) this->key;
-   gl_fragment_program *fp = (gl_fragment_program*) prog;
GLuint reg = 2;
bool kill_stats_promoted_workaround = false;
int lookup = key->iz_lookup;
bool uses_depth =
-  (fp->Base.InputsRead & (1 << VARYING_SLOT_POS)) != 0;
+  (nir->info.inputs_read & (1 << VARYING_SLOT_POS)) != 0;
 
assert(lookup < IZ_BIT_MAX);
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/13] nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader

2015-10-01 Thread Jason Ekstrand
---
 src/glsl/nir/glsl_to_nir.cpp| 8 ++--
 src/glsl/nir/glsl_to_nir.h  | 3 ++-
 src/mesa/drivers/dri/i965/brw_nir.c | 3 +--
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index f874582..ba08e17 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -129,9 +129,13 @@ private:
 }; /* end of anonymous namespace */
 
 nir_shader *
-glsl_to_nir(struct gl_shader *sh, const nir_shader_compiler_options *options)
+glsl_to_nir(const struct gl_shader_program *shader_prog,
+gl_shader_stage stage,
+const nir_shader_compiler_options *options)
 {
-   nir_shader *shader = nir_shader_create(NULL, sh->Stage, options);
+   struct gl_shader *sh = shader_prog->_LinkedShaders[stage];
+
+   nir_shader *shader = nir_shader_create(NULL, stage, options);
 
nir_visitor v1(shader);
nir_function_visitor v2(&v1);
diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
index 3801e8c..29badcd 100644
--- a/src/glsl/nir/glsl_to_nir.h
+++ b/src/glsl/nir/glsl_to_nir.h
@@ -32,7 +32,8 @@
 extern "C" {
 #endif
 
-nir_shader *glsl_to_nir(struct gl_shader *sh,
+nir_shader *glsl_to_nir(const struct gl_shader_program *shader_prog,
+gl_shader_stage stage,
 const nir_shader_compiler_options *options);
 
 #ifdef __cplusplus
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 1d4f6ab..6ec2d76 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -83,13 +83,12 @@ brw_create_nir(struct brw_context *brw,
static const nir_lower_tex_options tex_options = {
   .lower_txp = ~0,
};
-   struct gl_shader *shader = shader_prog ? shader_prog->_LinkedShaders[stage] 
: NULL;
bool debug_enabled = INTEL_DEBUG & intel_debug_flag_for_shader_stage(stage);
nir_shader *nir;
 
/* First, lower the GLSL IR or Mesa IR to NIR */
if (shader_prog) {
-  nir = glsl_to_nir(shader, options);
+  nir = glsl_to_nir(shader_prog, stage, options);
} else {
   nir = prog_to_nir(prog, options);
   nir_convert_to_ssa(nir); /* turn registers into SSA */
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/13] i965: Pull shader, prog, and shader_prog out of the

2015-10-01 Thread Jason Ekstrand
This series does a bunch of code-shuffling with the end objective of
getting the gl_* data structures out of the backend compiler.  Now that we
have NIR, we would like the compiler to be a NIR -> binary translator and
not be loaded full of mesa and GL-specific stuff.  This series gets us 95%
of the way there.  Unfortunately, we still have to pass gl_program into
geometry shaders for transform-fedback and, for fragment shaders, we need
to pass in gl_shader_program so that we can fix up TEXTURE_RECTANGLE on
older hardware (pre-Broadwell).

Incidentally, according to our CI system, this series fixes 2 piglit tests
on Iron Lake and GM45.  I have no idea why.

Jason Ekstrand (13):
  i965/shader: Pull assign_common_binding_table_offsets out of
backend_shader
  i965: Move binding table setup to codegen time.
  i965: Move prog_data uniform setup to the codegen level
  nir/glsl: Take a gl_shader_program and a stage rather than a gl_shader
  nir: Add a a nir_shader_info struct
  nir: Move GS data to nir_shader_info
  i965/backend_shader: Add a field to store the NIR shader
  i965/cs: Remove the prog argument from local_id_payload_dwords
  i965/fs: Move sampler unit lookup into rescale_texcoord
  i965/fs: Use the nir info instead of pulling things out of
[shader_]prog
  i965/vec4: Use nir info instead of pulling things out of [shader_]prog
  i965/fs,vec4: Get rid of the sanity_param_count
  i965/shader: Get rid of the shader, prog, and shader_prog fields

 src/glsl/nir/glsl_to_nir.cpp  |  30 +-
 src/glsl/nir/glsl_to_nir.h|   3 +-
 src/glsl/nir/nir.c|   4 +-
 src/glsl/nir/nir.h|  50 --
 src/glsl/nir/nir_lower_gs_intrinsics.c|   2 +-
 src/mesa/drivers/dri/i965/brw_cs.c|  25 -
 src/mesa/drivers/dri/i965/brw_cs.h|   3 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 106 +-
 src/mesa/drivers/dri/i965/brw_fs.h|  20 ++--
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |  57 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |  24 +++--
 src/mesa/drivers/dri/i965/brw_gs.c|  24 -
 src/mesa/drivers/dri/i965/brw_nir.c   |   3 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp  |  30 +++---
 src/mesa/drivers/dri/i965/brw_shader.h|  21 +++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  34 ++-
 src/mesa/drivers/dri/i965/brw_vec4.h  |  13 +--
 src/mesa/drivers/dri/i965/brw_vec4_gs_nir.cpp |   6 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  24 ++---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |   5 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp|  41 +++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|   8 +-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  11 +--
 src/mesa/drivers/dri/i965/brw_vs.c|  14 +++
 src/mesa/drivers/dri/i965/brw_vs.h|   4 +-
 src/mesa/drivers/dri/i965/brw_wm.c|  33 ++-
 src/mesa/drivers/dri/i965/brw_wm_iz.cpp   |   3 +-
 src/mesa/drivers/dri/i965/gen6_gs_visitor.cpp |   9 --
 src/mesa/drivers/dri/i965/gen6_gs_visitor.h   |   4 +-
 src/mesa/drivers/dri/i965/gen7_cs_state.c |   7 +-
 src/mesa/program/prog_to_nir.c|  13 +++
 31 files changed, 333 insertions(+), 298 deletions(-)

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/13] i965/fs: Move sampler unit lookup into rescale_texcoord

2015-10-01 Thread Jason Ekstrand
The texunit variable we create and assign in nir_emit_texture gets passed
through two more layers of function calls before it gets to its sole use in
rescale_texcoord.  The best part is that we already pass the sampler into
rescale_texcoord so we can just look it up there.
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  5 ++---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  8 +---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 14 ++
 3 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index b932ed2..213b2e0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -205,7 +205,7 @@ public:
void emit_interpolation_setup_gen6();
void compute_sample_position(fs_reg dst, fs_reg int_sample_pos);
fs_reg rescale_texcoord(fs_reg coordinate, int coord_components,
-   bool is_rect, uint32_t sampler, int texunit);
+   bool is_rect, uint32_t sampler);
void emit_texture(ir_texture_opcode op,
  const glsl_type *dest_type,
  fs_reg coordinate, int components,
@@ -218,8 +218,7 @@ public:
  bool is_cube_array,
  bool is_rect,
  uint32_t sampler,
- fs_reg sampler_reg,
- int texunit);
+ fs_reg sampler_reg);
fs_reg emit_mcs_fetch(const fs_reg &coordinate, unsigned components,
  const fs_reg &sampler);
void emit_gen6_gather_wa(uint8_t wa, fs_reg dst);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 3793791..42bf050 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1917,12 +1917,6 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
nir_tex_instr *instr)
unsigned sampler = instr->sampler_index;
fs_reg sampler_reg(sampler);
 
-   /* FINISHME: We're failing to recompile our programs when the sampler is
-* updated.  This only matters for the texture rectangle scale parameters
-* (pre-gen6, or gen6+ with GL_CLAMP).
-*/
-   int texunit = prog->SamplerUnits[sampler];
-
int gather_component = instr->component;
 
bool is_rect = instr->sampler_dim == GLSL_SAMPLER_DIM_RECT;
@@ -2063,7 +2057,7 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, 
nir_tex_instr *instr)
emit_texture(op, dest_type, coordinate, instr->coord_components,
 shadow_comparitor, lod, lod2, lod_components, sample_index,
 tex_offset, mcs, gather_component,
-is_cube_array, is_rect, sampler, sampler_reg, texunit);
+is_cube_array, is_rect, sampler, sampler_reg);
 
fs_reg dest = get_nir_dest(instr->dest);
dest.type = this->result.type;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 47d7ae4..c319bf7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -79,7 +79,7 @@ fs_visitor::emit_vs_system_value(int location)
 
 fs_reg
 fs_visitor::rescale_texcoord(fs_reg coordinate, int coord_components,
- bool is_rect, uint32_t sampler, int texunit)
+ bool is_rect, uint32_t sampler)
 {
bool needs_gl_clamp = true;
fs_reg scale_x, scale_y;
@@ -93,10 +93,16 @@ fs_visitor::rescale_texcoord(fs_reg coordinate, int 
coord_components,
 (devinfo->gen >= 6 && (key_tex->gl_clamp_mask[0] & (1 << sampler) ||
key_tex->gl_clamp_mask[1] & (1 << sampler) {
   struct gl_program_parameter_list *params = prog->Parameters;
+
+
+  /* FINISHME: We're failing to recompile our programs when the sampler is
+   * updated.  This only matters for the texture rectangle scale
+   * parameters (pre-gen6, or gen6+ with GL_CLAMP).
+   */
   int tokens[STATE_LENGTH] = {
 STATE_INTERNAL,
 STATE_TEXRECT_SCALE,
-texunit,
+prog->SamplerUnits[sampler],
 0,
 0
   };
@@ -221,7 +227,7 @@ fs_visitor::emit_texture(ir_texture_opcode op,
  bool is_cube_array,
  bool is_rect,
  uint32_t sampler,
- fs_reg sampler_reg, int texunit)
+ fs_reg sampler_reg)
 {
fs_inst *inst = NULL;
 
@@ -256,7 +262,7 @@ fs_visitor::emit_texture(ir_texture_opcode op,
* samplers.  This should only be a problem with GL_CLAMP on Gen7.
*/
   coordinate = rescale_texcoord(coordinate, coord_components, is_rect,
-sampler, texunit);
+sampler);
}
 
/* Writemasking doesn't eliminate channels on SIMD8 texture
-- 
2.5.0.400.gff86faf


[Mesa-dev] [PATCH 10/13] i965/fs: Use the nir info instead of pulling things out of [shader_]prog

2015-10-01 Thread Jason Ekstrand
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 29 ++--
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  8 
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |  2 +-
 3 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 699d3d0..2d67085 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1401,7 +1401,7 @@ fs_visitor::calculate_urb_setup()
int urb_next = 0;
/* Figure out where each of the incoming setup attributes lands. */
if (devinfo->gen >= 6) {
-  if (_mesa_bitcount_64(prog->InputsRead &
+  if (_mesa_bitcount_64(nir->info.inputs_read &
 BRW_FS_VARYING_INPUT_MASK) <= 16) {
  /* The SF/SBE pipeline stage can do arbitrary rearrangement of the
   * first 16 varying inputs, so we can put them wherever we want.
@@ -1413,7 +1413,7 @@ fs_visitor::calculate_urb_setup()
   * a different vertex (or geometry) shader.
   */
  for (unsigned int i = 0; i < VARYING_SLOT_MAX; i++) {
-if (prog->InputsRead & BRW_FS_VARYING_INPUT_MASK &
+if (nir->info.inputs_read & BRW_FS_VARYING_INPUT_MASK &
 BITFIELD64_BIT(i)) {
prog_data->urb_setup[i] = urb_next++;
 }
@@ -1427,7 +1427,7 @@ fs_visitor::calculate_urb_setup()
  struct brw_vue_map prev_stage_vue_map;
  brw_compute_vue_map(devinfo, &prev_stage_vue_map,
  key->input_slots_valid,
- shader_prog->SeparateShader);
+ nir->info.separate_shader);
  int first_slot = 2 * BRW_SF_URB_ENTRY_READ_OFFSET;
  assert(prev_stage_vue_map.num_slots <= first_slot + 32);
  for (int slot = first_slot; slot < prev_stage_vue_map.num_slots;
@@ -1437,7 +1437,7 @@ fs_visitor::calculate_urb_setup()
  * unused.
  */
 if (varying != BRW_VARYING_SLOT_COUNT &&
-(prog->InputsRead & BRW_FS_VARYING_INPUT_MASK &
+(nir->info.inputs_read & BRW_FS_VARYING_INPUT_MASK &
  BITFIELD64_BIT(varying))) {
prog_data->urb_setup[varying] = slot - first_slot;
 }
@@ -1470,7 +1470,7 @@ fs_visitor::calculate_urb_setup()
*
* See compile_sf_prog() for more info.
*/
-  if (prog->InputsRead & BITFIELD64_BIT(VARYING_SLOT_PNTC))
+  if (nir->info.inputs_read & BITFIELD64_BIT(VARYING_SLOT_PNTC))
  prog_data->urb_setup[VARYING_SLOT_PNTC] = urb_next++;
}
 
@@ -4634,7 +4634,7 @@ void
 fs_visitor::setup_payload_gen6()
 {
bool uses_depth =
-  (prog->InputsRead & (1 << VARYING_SLOT_POS)) != 0;
+  (nir->info.inputs_read & (1 << VARYING_SLOT_POS)) != 0;
unsigned barycentric_interp_modes =
   (stage == MESA_SHADER_FRAGMENT) ?
   ((brw_wm_prog_data*) this->prog_data)->barycentric_interp_modes : 0;
@@ -4693,7 +4693,7 @@ fs_visitor::setup_payload_gen6()
}
 
/* R32: MSAA input coverage mask */
-   if (prog->SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN) {
+   if (nir->info.system_values_read & SYSTEM_BIT_SAMPLE_MASK_IN) {
   assert(devinfo->gen >= 7);
   payload.sample_mask_in_reg = payload.num_regs;
   payload.num_regs++;
@@ -4706,7 +4706,7 @@ fs_visitor::setup_payload_gen6()
/* R34-: bary for 32-pixel. */
/* R58-59: interp W for 32-pixel. */
 
-   if (prog->OutputsWritten & BITFIELD64_BIT(FRAG_RESULT_DEPTH)) {
+   if (nir->info.outputs_written & BITFIELD64_BIT(FRAG_RESULT_DEPTH)) {
   source_depth_to_render_target = true;
}
 }
@@ -4725,7 +4725,7 @@ fs_visitor::setup_cs_payload()
 
payload.num_regs = 1;
 
-   if (prog->SystemValuesRead & SYSTEM_BIT_LOCAL_INVOCATION_ID) {
+   if (nir->info.system_values_read & SYSTEM_BIT_LOCAL_INVOCATION_ID) {
   const unsigned local_id_dwords =
  brw_cs_prog_local_id_payload_dwords(dispatch_width);
   assert((local_id_dwords & 0x7) == 0);
@@ -4786,8 +4786,8 @@ fs_visitor::optimize()
 \
   if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) {   \
  char filename[64]; \
- snprintf(filename, 64, "%s%d-%04d-%02d-%02d-" #pass,  \
-  stage_abbrev, dispatch_width, shader_prog ? 
shader_prog->Name : 0, iteration, pass_num); \
+ snprintf(filename, 64, "%s%d-%s-%02d-%02d-" #pass,  \
+  stage_abbrev, dispatch_width, nir->info.name, iteration, 
pass_num); \
 \
  backend_shader::dump_instructions(filename);   \
   } \
@@ -4800,9 +4800,8 @@ fs_visitor::optimize()
 
if (unlikely(INTEL_D

[Mesa-dev] [PATCH 12/13] i965/fs, vec4: Get rid of the sanity_param_count

2015-10-01 Thread Jason Ekstrand
It doesn't exist for anything other than an assert that, as far as I can
tell, isn't possible to trip.  Soon, we will remove prog from the visitor
entirely and this will become even more impossible to hit.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 18 --
 src/mesa/drivers/dri/i965/brw_fs.h |  1 -
 src/mesa/drivers/dri/i965/brw_vec4.cpp |  9 -
 src/mesa/drivers/dri/i965/brw_vec4.h   |  2 --
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  1 -
 5 files changed, 31 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 2d67085..7e6022b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4985,8 +4985,6 @@ fs_visitor::run_fs(bool do_rep_send)
 
assert(stage == MESA_SHADER_FRAGMENT);
 
-   sanity_param_count = prog->Parameters->NumParameters;
-
if (devinfo->gen >= 6)
   setup_payload_gen6();
else
@@ -5055,13 +5053,6 @@ fs_visitor::run_fs(bool do_rep_send)
else
   wm_prog_data->reg_blocks_16 = brw_register_blocks(grf_used);
 
-   /* If any state parameters were appended, then ParameterValues could have
-* been realloced, in which case the driver uniform storage set up by
-* _mesa_associate_uniform_storage() would point to freed memory.  Make
-* sure that didn't happen.
-*/
-   assert(sanity_param_count == prog->Parameters->NumParameters);
-
return !failed;
 }
 
@@ -5071,8 +5062,6 @@ fs_visitor::run_cs()
assert(stage == MESA_SHADER_COMPUTE);
assert(shader);
 
-   sanity_param_count = prog->Parameters->NumParameters;
-
setup_cs_payload();
 
if (shader_time_index >= 0)
@@ -5100,13 +5089,6 @@ fs_visitor::run_cs()
if (failed)
   return false;
 
-   /* If any state parameters were appended, then ParameterValues could have
-* been realloced, in which case the driver uniform storage set up by
-* _mesa_associate_uniform_storage() would point to freed memory.  Make
-* sure that didn't happen.
-*/
-   assert(sanity_param_count == prog->Parameters->NumParameters);
-
return !failed;
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 213b2e0..321d302 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -300,7 +300,6 @@ public:
const struct brw_sampler_prog_key_data *key_tex;
 
struct brw_stage_prog_data *prog_data;
-   unsigned int sanity_param_count;
 
int *param_size;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index a8cdd19..8a76f36 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1803,8 +1803,6 @@ vec4_visitor::emit_shader_time_write(int 
shader_time_subindex, src_reg value)
 bool
 vec4_visitor::run()
 {
-   sanity_param_count = prog->Parameters->NumParameters;
-
if (shader_time_index >= 0)
   emit_shader_time_begin();
 
@@ -1927,13 +1925,6 @@ vec4_visitor::run()
  brw_get_scratch_size(last_scratch * REG_SIZE);
}
 
-   /* If any state parameters were appended, then ParameterValues could have
-* been realloced, in which case the driver uniform storage set up by
-* _mesa_associate_uniform_storage() would point to freed memory.  Make
-* sure that didn't happen.
-*/
-   assert(sanity_param_count == prog->Parameters->NumParameters);
-
return !failed;
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 6e3af60..3095a51 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -97,8 +97,6 @@ public:
 
const struct brw_sampler_prog_key_data * const key_tex;
struct brw_vue_prog_data * const prog_data;
-   unsigned int sanity_param_count;
-
char *fail_msg;
bool failed;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index bc9d9a0..f4cbc9e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1836,7 +1836,6 @@ vec4_visitor::vec4_visitor(const struct brw_compiler 
*compiler,
 shader_prog, prog, &prog_data->base, stage),
  key_tex(key_tex),
  prog_data(prog_data),
- sanity_param_count(0),
  fail_msg(NULL),
  first_non_payload_grf(0),
  need_all_constants_in_pull_buffer(false),
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] nir: Add a function to determine if a source is dynamically uniform

2015-10-01 Thread Jason Ekstrand
On Oct 1, 2015 12:18 PM, "Matt Turner"  wrote:
>
> On Fri, Aug 7, 2015 at 9:31 AM, Neil Roberts  wrote:
> > Adds nir_src_is_dynamically_uniform which returns true if the source
> > is known to be dynamically uniform. This will be used in a later patch
> > to add a workaround for cases that only work with dynamically uniform
> > sources. Note that the function is not definitive, it can return false
> > negatives (but not false positives). Currently it only detects
> > constants and uniform accesses. It could easily be extended to include
> > more cases.
> > ---
> >  src/glsl/nir/nir.c | 29 +
> >  src/glsl/nir/nir.h |  1 +
> >  2 files changed, 30 insertions(+)
> >
> > diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> > index 78ff886..242f0b4 100644
> > --- a/src/glsl/nir/nir.c
> > +++ b/src/glsl/nir/nir.c
> > @@ -1784,6 +1784,35 @@ nir_src_as_const_value(nir_src src)
> > return &load->value;
> >  }
> >
> > +/**
> > + * Returns true if the source is known to be dynamically uniform.
Otherwise it
> > + * returns false which means it may or may not be dynamically uniform
but it
> > + * can't be determined.
> > + */
> > +bool
> > +nir_src_is_dynamically_uniform(nir_src src)
> > +{
> > +   if (!src.is_ssa)
> > +  return false;
> > +
> > +   /* Constants are trivially dynamically uniform */
> > +   if (src.ssa->parent_instr->type == nir_instr_type_load_const)
> > +  return true;
> > +
> > +   /* As are uniform variables */
> > +   if (src.ssa->parent_instr->type == nir_instr_type_intrinsic) {
> > +  nir_intrinsic_instr *intr =
nir_instr_as_intrinsic(src.ssa->parent_instr);
> > +
> > +  if (intr->intrinsic == nir_intrinsic_load_uniform)
> > + return true;
> > +   }
> > +
> > +   /* XXX: this could have many more tests, such as when a sampler
function is
> > +* called with dynamically uniform arguments.
> > +*/
> > +   return false;
> > +}
>
> This functions seems correct as-is, so it gets a
>
> Reviewed-by: Matt Turner 
>
> On top of being useful for fixing the nonconst/nonuniform piglit
> tests, knowing which values are uniform can allow better optimization
> and knowing which branches are uniform would allow us to enable Single
> Program Flow.
>
> Cc'ing Jason for a discussion about how we'd like to handle this kind
> of information in NIR. Add a bool is_uniform or something to
> nir_ssa_def and hook into the metadata system?

That was more-or-less my plan. We should probably have a proper pass that
handles things like phi nodes and ALU operations.

--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 92231] [swrast] piglit gl-1.0-beginend-coverage regression

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92231

Bug ID: 92231
   Summary: [swrast] piglit gl-1.0-beginend-coverage regression
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Keywords: bisected, regression
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: v...@freedesktop.org
QA Contact: mesa-dev@lists.freedesktop.org
CC: bri...@vmware.com, lem...@gmail.com

mesa: b3f9c5cc0fab6d770d220efd87ba3746c6673875 (master 11.1.0-devel)

$ ./bin/gl-1.0-beginend-coverage -auto -fbo
[...]
testing glTexParameterf()
  Testing within begin/end
Unexpected GL error: GL_INVALID_FRAMEBUFFER_OPERATION 0x506
(Error at tests/spec/gl-1.0/beginend-coverage.c:686)
Expected GL error: GL_INVALID_OPERATION 0x502
PIGLIT: {"subtest": {"glTexParameterf" : "fail"}}

c277fa394087272c79d65fc308d268fc768b91e5 is the first bad commit
commit c277fa394087272c79d65fc308d268fc768b91e5
Author: Brian Paul 
Date:   Wed Sep 30 11:29:13 2015 -0600

mesa: consolidate texture binding code

Before, we were doing the actual _mesa_reference_texobj() call and
ctx->Driver.BindTexture() and misc housekeeping in three different
places.  This consolidates the common code in a new bind_texture()
function.

Reviewed-by: Tapani Pälli 

:04 04 01cfb3f9437ac679e95f9507e0b47516ecec347e
7d921525c01aef94abfb26cc232d225ebe5f50af Msrc
bisect run success

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Fix intel_miptree_is_fast_clear_capable()

2015-10-01 Thread Ben Widawsky
On Thu, Oct 01, 2015 at 08:20:07AM -0700, Chad Versace wrote:
> There are three types of fast clears:
>   a. fast depth clears
>   b. fast singlesample color clears
>   c. fast multisample color clears
> Function intel_miptree_is_fast_clear_capable() checks if a miptree
> supports fast clears of type (b).
> 
> Rename the function to disambiguate what it does:
>   old: intel_miptree_is_fast_clear_capable
>   new: intel_miptree_supports_non_msrt_fast_clear
> 
> The functionally *accidentally* rejected multisampled color surfaces
> because it thought they were singlesample array surfaces. Fix that by
> explicitly rejecting surfaces with samples > 1.
> 

I wasn't going to say anything except you put "accidentally" in bold. I
don't think you can determine what Paul was thinking when he originally
implemented that code. Calling it accidental is unnecessarily judgmental (as is
my response here). The code was correct, and unless you want to ask him, we
should assume it was intentional.

FWIW: I didn't spot anything wrong in this patch, but I don't know the code well
enough to call that a reviewed-by in the time I spent looking at it.

> This fix would have been needed before we enabled layered fast
> singlesample color clears (introduced in gen8), which we want to do
> eventually. For now, though, this patch changes no behavior; it just
> fixes how the driver chooses its behavior.
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 24 +++-
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 05dc291..a169c41 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -194,8 +194,8 @@ intel_tiling_supports_non_msrt_mcs(struct brw_context 
> *brw, unsigned tiling)
>   *   64bpp, and 128bpp.
>   */
>  static bool
> -intel_miptree_is_fast_clear_capable(struct brw_context *brw,
> -struct intel_mipmap_tree *mt)
> +intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw,
> +   struct intel_mipmap_tree *mt)
>  {
> /* MCS support does not exist prior to Gen7 */
> if (brw->gen < 7)
> @@ -204,6 +204,10 @@ intel_miptree_is_fast_clear_capable(struct brw_context 
> *brw,
> if (mt->disable_aux_buffers)
>return false;
>  
> +   /* This function applies only to non-multisampled render targets. */
> +   if (mt->num_samples > 1)
> +  return false;
> +
> /* MCS is only supported for color buffers */
> switch (_mesa_get_format_base_format(mt->format)) {
> case GL_DEPTH_COMPONENT:
> @@ -222,7 +226,16 @@ intel_miptree_is_fast_clear_capable(struct brw_context 
> *brw,
>  
>return false;
> }
> +
> +   /* Check for layered surfaces. */
> if (mt->physical_depth0 != 1) {
> +   /* Multisample surfaces with the CMS layout are not layered surfaces,
> +* yet still have physical_depth0 > 1. Assert that we don't
> +* accidentally reject a multisampled surface here. We should have
> +* rejected it earlier by explicitly checking the sample count.
> +*/
> +  assert(mt->num_samples <= 1);
> +
>if (brw->gen >= 8) {
>   perf_debug("Layered fast clear - giving up. (%dx%d%d)\n",
>  mt->logical_width0, mt->logical_height0,
> @@ -494,7 +507,7 @@ intel_miptree_create_layout(struct brw_context *brw,
>  *  7   |  ? |?
>  *  6   |  ? |?
>  */
> -   if (intel_miptree_is_fast_clear_capable(brw, mt)) {
> +   if (intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
>if (brw->gen >= 9 || (brw->gen == 8 && num_samples <= 1))
>   layout_flags |= MIPTREE_LAYOUT_FORCE_HALIGN16;
> } else if (brw->gen >= 9 && num_samples > 1) {
> @@ -692,7 +705,7 @@ intel_miptree_create(struct brw_context *brw,
>  * clear actually occurs.
>  */
> if (intel_tiling_supports_non_msrt_mcs(brw, mt->tiling) &&
> -   intel_miptree_is_fast_clear_capable(brw, mt)) {
> +   intel_miptree_supports_non_msrt_fast_clear(brw, mt)) {
>mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
>assert(brw->gen < 8 || mt->halign == 16 || num_samples <= 1);
> }
> @@ -800,8 +813,9 @@ intel_update_winsys_renderbuffer_miptree(struct 
> brw_context *intel,
>  * clear actually occurs.
>  */
> if (intel_tiling_supports_non_msrt_mcs(intel, singlesample_mt->tiling) &&
> -   intel_miptree_is_fast_clear_capable(intel, singlesample_mt))
> +   intel_miptree_supports_non_msrt_fast_clear(intel, singlesample_mt)) {
>singlesample_mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_RESOLVED;
> +   }
>  
> if (num_samples == 0) {
>intel_miptree_release(&irb->mt);
> -- 
> 2.5.0.342.g44e0223
> 
> ___
> mesa-dev mailing list

[Mesa-dev] [Bug 92231] [swrast] piglit gl-1.0-beginend-coverage regression

2015-10-01 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92231

Tapani Pälli  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Tapani Pälli  ---


*** This bug has been marked as a duplicate of bug 92216 ***

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: fix GetProgramiv/GetActiveAttrib regression

2015-10-01 Thread Tapani Pälli
Ping, could someone ack/review this? If not, I will revert the patches 
that caused this for now and figure out alternative solution with more 
time for packed varyings. It is one of the CTS tests that will fail with 
revert and nothing else gets hurt since it's a corner-case with SSO. I 
think currently any game using Unity engine is broken because of this issue.


// Tapani


On 09/28/2015 12:12 PM, Tapani Pälli wrote:

Commit 4639cea2921669527eb43dcb49724c05afb27e8e added packed varyings
as part of program resource list. We need to take this to account with
all functions dealing with active input attributes.

Patch fixes the issue by adding additional check to is_active_attrib
and iterating active attribs explicitly in GetActiveAttrib function.

I've tested that fix works for Assault Android Cactus (demo version)
and does not cause Piglit or CTS regressions in glGetProgramiv tests.

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92122
---
  src/mesa/main/shader_query.cpp | 44 +-
  1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 99d9e10..d023504 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -101,15 +101,34 @@ _mesa_BindAttribLocation(GLhandleARB program, GLuint 
index,
  */
  }

+/* Function checks if given input variable is packed varying by checking
+ * if there also exists output variable with the same name.
+ */
+static bool
+is_packed_varying(struct gl_shader_program *shProg, const ir_variable *input)
+{
+   assert(input->data.mode == ir_var_shader_in);
+   unsigned array_index = 0;
+   struct gl_program_resource *out =
+  _mesa_program_resource_find_name(shProg, GL_PROGRAM_OUTPUT, input->name,
+   &array_index);
+   if (out)
+  return true;
+   return false;
+}
+
  static bool
-is_active_attrib(const ir_variable *var)
+is_active_attrib(struct gl_shader_program *shProg,
+ const ir_variable *var)
  {
 if (!var)
return false;

 switch (var->data.mode) {
 case ir_var_shader_in:
-  return var->data.location != -1;
+  return (var->data.location != -1 &&
+  var->data.location_frac == 0 &&
+  !is_packed_varying(shProg, var));

 case ir_var_system_value:
/* From GL 4.3 core spec, section 11.1.1 (Vertex Attributes):
@@ -154,19 +173,25 @@ _mesa_GetActiveAttrib(GLhandleARB program, GLuint 
desired_index,
return;
 }

-   struct gl_program_resource *res =
-  _mesa_program_resource_find_index(shProg, GL_PROGRAM_INPUT,
-desired_index);
+   struct gl_program_resource *res = shProg->ProgramResourceList;
+   unsigned index = -1;
+   for (unsigned i = 0; i < shProg->NumProgramResourceList; i++, res++) {
+  if (res->Type != GL_PROGRAM_INPUT ||
+  !is_active_attrib(shProg, RESOURCE_VAR(res)))
+ continue;
+  if (++index == desired_index)
+ break;
+   }

 /* User asked for index that does not exist. */
-   if (!res) {
+   if (!res || index != desired_index) {
_mesa_error(ctx, GL_INVALID_VALUE, "glGetActiveAttrib(index)");
return;
 }

 const ir_variable *const var = RESOURCE_VAR(res);

-   if (!is_active_attrib(var))
+   if (!is_active_attrib(shProg, var))
return;

 const char *var_name = var->name;
@@ -252,7 +277,7 @@ _mesa_count_active_attribs(struct gl_shader_program *shProg)
 for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
if (res->Type == GL_PROGRAM_INPUT &&
res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
-  is_active_attrib(RESOURCE_VAR(res)))
+  is_active_attrib(shProg, RESOURCE_VAR(res)))
   count++;
 }
 return count;
@@ -271,7 +296,8 @@ _mesa_longest_attribute_name_length(struct 
gl_shader_program *shProg)
 size_t longest = 0;
 for (unsigned j = 0; j < shProg->NumProgramResourceList; j++, res++) {
if (res->Type == GL_PROGRAM_INPUT &&
-  res->StageReferences & (1 << MESA_SHADER_VERTEX)) {
+  res->StageReferences & (1 << MESA_SHADER_VERTEX) &&
+  is_active_attrib(shProg, RESOURCE_VAR(res))) {

const size_t length = strlen(RESOURCE_VAR(res)->name);
if (length >= longest)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 2/2] mesa: Add {Num}UniformBlocks and {Num}ShaderStorageBlocks to gl_shader_program

2015-10-01 Thread Iago Toral
On Thu, 2015-10-01 at 14:01 -0400, Ilia Mirkin wrote:
> On Thu, Oct 1, 2015 at 7:09 AM, Iago Toral Quiroga  wrote:
> > These arrays provide backends with separate index spaces for UBOS and SSBOs.
> > ---
> >  src/glsl/linker.cpp | 35 
> > +++
> >  src/glsl/standalone_scaffolding.cpp |  9 +
> >  src/mesa/main/mtypes.h  |  6 ++
> >  3 files changed, 50 insertions(+)
> >
> > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
> > index e6eba94..3da773d 100644
> > --- a/src/glsl/linker.cpp
> > +++ b/src/glsl/linker.cpp
> > @@ -4107,6 +4107,41 @@ link_shaders(struct gl_context *ctx, struct 
> > gl_shader_program *prog)
> >}
> > }
> >
> > +   /* Split prog->BufferInterfaceBlocks into prog->UniformBlocks and
> > +* prog->ShaderStorageBlocks, so that drivers that need separate index
> > +* spaces for each set can have that.
> > +*/
> > +   unsigned num_ubo_blocks;
> > +   unsigned num_ssbo_blocks;
> > +   num_ubo_blocks = 0;
> > +   num_ssbo_blocks = 0;
> > +   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
> > +  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
> > + num_ssbo_blocks++;
> > +  else
> > + num_ubo_blocks++;
> > +   }
> > +
> > +   prog->UniformBlocks =
> > +  ralloc_array(mem_ctx, gl_uniform_block *, num_ubo_blocks);
> > +   prog->NumUniformBlocks = 0;
> > +
> > +   prog->ShaderStorageBlocks =
> > +  ralloc_array(mem_ctx, gl_uniform_block *, num_ssbo_blocks);
> > +   prog->NumShaderStorageBlocks = 0;
> > +
> > +   for (unsigned i = 0; i < prog->NumBufferInterfaceBlocks; i++) {
> > +  if (prog->BufferInterfaceBlocks[i].IsShaderStorage)
> > + prog->ShaderStorageBlocks[prog->NumShaderStorageBlocks++] =
> > +&prog->BufferInterfaceBlocks[i];
> > +  else
> > + prog->UniformBlocks[prog->NumUniformBlocks++] =
> > +&prog->BufferInterfaceBlocks[i];
> > +   }
> 
> Shouldn't this go through and also adjust the indices of the linked
> programs? Or... something along those lines? With this, I still need a
> remapping table to go from the index passed to a load_ssbo/load_ubo
> instruction to a ssbo/ubo index.

Maybe I am missing something, but in the case of i965 for example, we
use a lowering pass (lower_ubo_reference) to compute the block indices
(see src/glsl/lower_ubo_reference.cpp). If you look at
lower_ubo_reference_visitor::setup_for_load_or_store, there is the code
we use to compute the block index. This happens in the backend already,
so I was thinking that you would do something similar, but instead of
using BufferInterfaceBlocks you would use these two arrays instead.
Until this point there are no references to any blocks in the shaders.

> Perhaps a few well-placed helper functions can alleviate this? Also
> this should erase the need for some of the O(n) iterations that have
> sprung up as a result of this combined list.
> 
> IMHO ideally the BufferInterfaceBlocks list would get freed at the end
> of this function. But I understand that this will require work, and
> the onus is probably on me (or anyone wanting to add ssbo support to
> other drivers) to do it, or work around it.

Yeah, it could go away I guess, but as you say that would require some
extra work (and also rewrite the i965 driver first to use separate index
spaces). In any case, notice that UniformBlocks and ShaderStorageBlocks
only contain pointers into BufferInterfaceBlocks, so we would only be
saving a few bytes.

Iago

>   -ilia
> 
> > +
> > +   assert(prog->NumUniformBlocks + prog->NumShaderStorageBlocks ==
> > +  prog->NumBufferInterfaceBlocks);
> > +
> > /* FINISHME: Assign fragment shader output locations. */
> >
> >  done:
> > diff --git a/src/glsl/standalone_scaffolding.cpp 
> > b/src/glsl/standalone_scaffolding.cpp
> > index 0c53589..658245f 100644
> > --- a/src/glsl/standalone_scaffolding.cpp
> > +++ b/src/glsl/standalone_scaffolding.cpp
> > @@ -102,6 +102,15 @@ _mesa_clear_shader_program_data(struct 
> > gl_shader_program *shProg)
> > ralloc_free(shProg->BufferInterfaceBlocks);
> > shProg->BufferInterfaceBlocks = NULL;
> > shProg->NumBufferInterfaceBlocks = 0;
> > +
> > +   ralloc_free(shProg->UniformBlocks);
> > +   shProg->UniformBlocks = NULL;
> > +   shProg->NumUniformBlocks = 0;
> > +
> > +   ralloc_free(shProg->ShaderStorageBlocks);
> > +   shProg->ShaderStorageBlocks = NULL;
> > +   shProg->NumShaderStorageBlocks = 0;
> > +
> > for (i = 0; i < MESA_SHADER_STAGES; i++) {
> >ralloc_free(shProg->UniformBlockStageIndex[i]);
> >shProg->UniformBlockStageIndex[i] = NULL;
> > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> > index 347da14..2362f54 100644
> > --- a/src/mesa/main/mtypes.h
> > +++ b/src/mesa/main/mtypes.h
> > @@ -2692,6 +2692,12 @@ struct gl_shader_program
> > unsigned NumBufferInterfaceBlocks;
> > struct gl_uniform_block *BufferInterfaceBlocks;
> >
>

Re: [Mesa-dev] [PATCH] mesa: avoid leaking closure when iterating over a string_to_uint_map

2015-10-01 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Thu, 2015-10-01 at 20:19 -0400, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/program/hash_table.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/program/hash_table.h b/src/mesa/program/hash_table.h
> index e85a836..d0a2abf 100644
> --- a/src/mesa/program/hash_table.h
> +++ b/src/mesa/program/hash_table.h
> @@ -249,6 +249,7 @@ public:
>wrapper->closure = closure;
>  
>hash_table_call_foreach(this->ht, subtract_one_wrapper, wrapper);
> +  free(wrapper);
> }
>  
> /**


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: avoid leaking hiddenUniforms map when there are no uniforms

2015-10-01 Thread Iago Toral
On Thu, 2015-10-01 at 20:22 -0400, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/glsl/link_uniforms.cpp | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
> index 47d49c8..740b0a4 100644
> --- a/src/glsl/link_uniforms.cpp
> +++ b/src/glsl/link_uniforms.cpp
> @@ -1131,15 +1131,15 @@ link_assign_uniform_locations(struct 
> gl_shader_program *prog,
> const unsigned num_data_slots = uniform_size.num_values;
> const unsigned hidden_uniforms = uniform_size.num_hidden_uniforms;
>  
> +   /* assign hidden uniforms a slot id */
> +   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size);
> +   delete hiddenUniforms;
> +
> /* On the outside chance that there were no uniforms, bail out.
>  */
> if (num_uniforms == 0)
>return;
>  
> -   /* assign hidden uniforms a slot id */
> -   hiddenUniforms->iterate(assign_hidden_uniform_slot_id, &uniform_size);
> -   delete hiddenUniforms;
> -

I suppose there is no much gain in simply adding the delete statement
right before the return...

Reviewed-by: Iago Toral Quiroga 


> struct gl_uniform_storage *uniforms =
>rzalloc_array(prog, struct gl_uniform_storage, num_uniforms);
> union gl_constant_value *data =


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: don't forget to free image_param on prog_data free

2015-10-01 Thread Iago Toral
Reviewed-by: Iago Toral Quiroga 

On Thu, 2015-10-01 at 20:27 -0400, Ilia Mirkin wrote:
> Signed-off-by: Ilia Mirkin 
> ---
>  src/mesa/drivers/dri/i965/brw_program.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
> b/src/mesa/drivers/dri/i965/brw_program.c
> index fee96a8..0e4b823 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -551,6 +551,7 @@ brw_stage_prog_data_free(const void *p)
>  
> ralloc_free(prog_data->param);
> ralloc_free(prog_data->pull_param);
> +   ralloc_free(prog_data->image_param);
>  }
>  
>  void


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev