date:20150507

Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Michel Dänzer

On 05.05.2015 01:47, Tom Stellard wrote:
> On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote:
>> On Mon, May 4, 2015 at 10:04 AM, Tom Stellard  wrote:
>>> On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote:
 On Sat, May 2, 2015 at 1:19 PM, EdB  wrote:
> The standard ICD file path is /etc/OpenCL/vendors/.
> However, it doesn't fit well with custom builds.
> This option allows overriding the ICD vendor file installation path.
> ---
>  configure.ac   | 6 ++
>  src/gallium/targets/opencl/Makefile.am | 2 +-
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 095e23e..bf08d76 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir],
>  [D3D_DRIVER_INSTALL_DIR="$withval"],
>  [D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
> +AC_ARG_WITH([icd-file-dir],
> +[AS_HELP_STRING([--with-icd-file-dir=DIR],
> +[directory for the OpenCL ICD vendor file 
> @<:@/etc/OpenCL/vendors@:>@])],
> +[ICD_FILE_INSTALL_DIR="$withval"],
> +[ICD_FILE_INSTALL_DIR="/etc/OpenCL/vendors"])

 What about making this default to ${sysconfdir}/OpenCL/vendors ? That
 way using --prefix should auto-make it go into the prefix instead of
 unexpectedly installing things outside of the specified prefix? That
 way a distro build which specifies --sysconfdir as /etc will get it in
 the right place, while by default it'll go into /usr/local/etc and a
 user can override the icd loader's default behaviour with
 OPENCL_VENDOR_PATH?

>>>
>>> I would prefer not to make this the default behavior, because it
>>> violates the spec and there could potentially be multiple icd
>>> implementations, which may or may not have the overrides.
>>>
>>> I think the best solution would be to rename the option to something like
>>> --enable-ocl-icd-respect-prefix (suggestions for other names encouraged).
>>> and have the option enable the behavior that Ilia is describing.
>>>
>>> This will give distros and advanced users a way to set up their system
>>> the way they want.
>>
>> It's just a very anti-autoconf thing to do to have "make install" fail
>> by default unless you specify some "hey, i actually want make install
>> to work" option.
>>
>> I think it's crazy to expect that, by default, people will want to
>> write over their system installs, and having things go outside of the
>> specified --prefix is very surprising (unless you force some other
>> option). And asking the user to run "make install" as root is even
>> crazier.
>>
> 
> My expectation is that, by default, when people specify --enable-opencl-icd
> they want an implementation that conforms to the specification.
> Unfortunately, this means installing icd files to /etc.
> 
> There is no good solution here, but I'd rather have users specify a flag
> to get a sane build system, than requiring them to set a flag and set
> an environment variable just to get working OpenCL with the ICD loader.
> 
>> I guess I haven't hit this yet because there's no OpenCL support in
>> nouveau or freedreno, but I made the same stink about vdpau when Emil
>> tried to make it install to some system location by default. At least
>> a few people seemed to agree with me back then...
>>
> 
> Does the vdpau spec also require installation to a specific system directory
> (e.g. /etc/)?

Tom, I think ensuring that the OpenCL ICD loader can pick up the
mesa.icd file is something for the distributor / administrator / user to
worry about, not Mesa upstream.

There's a similar situation with the drirc file, which is installed
inside the prefix by default but only read from /etc/.
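
The lookup being argued about can be sketched roughly as follows. This is only an illustration of "default directory unless an override variable is set", using the OPENCL_VENDOR_PATH name mentioned earlier in the thread; the function names are hypothetical and this is not code from any actual ICD loader, whose behaviour may differ.

```c
#include <stdlib.h>

/* Spec-mandated default location discussed in the thread. */
#define SKETCH_DEFAULT_VENDOR_DIR "/etc/OpenCL/vendors"

/* Hypothetical pure helper: choose the directory from the value of the
 * override variable, which may be NULL or empty when it is not set. */
const char *
sketch_pick_vendor_dir(const char *override)
{
   if (override != NULL && override[0] != '\0')
      return override;
   return SKETCH_DEFAULT_VENDOR_DIR;
}

/* Hypothetical wrapper that consults the environment. */
const char *
sketch_vendor_dir(void)
{
   return sketch_pick_vendor_dir(getenv("OPENCL_VENDOR_PATH"));
}
```

With a loader behaving like this, an .icd file installed under the prefix is only found when the user or distributor points the override at that directory, which is the trade-off both sides describe.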


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 80183] [llvmpipe] triangles with vertices that map to raster positions > viewport width/height are not displayed

2015-05-07 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=80183

--- Comment #14 from cgerlac...@gmail.com ---
The problem is also reproducible with softpipe.

I understand your concern and I will try to provide some sample code to
reproduce the clipping error.

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] util: Take memset out of rzalloc_size()

2015-05-07 Thread Juha-Pekka Heikkila

On 06.05.2015 21:51, Rob Clark wrote:
> On Wed, May 6, 2015 at 1:24 PM, Kenneth Graunke  wrote:
>> On Wednesday, May 06, 2015 03:35:27 PM Juha-Pekka Heikkila wrote:
>>> rzalloc_size() calls ralloc_size() to allocate memory. ralloc_size()
>>> uses calloc to get memory, thus zeroing in rzalloc_size() is not
>>> necessary.
>>>
>>> Signed-off-by: Juha-Pekka Heikkila 
>>> ---
>>>  src/util/ralloc.c | 2 --
>>>  1 file changed, 2 deletions(-)
>>>
>>> diff --git a/src/util/ralloc.c b/src/util/ralloc.c
>>> index 01719c8..09f5fcd 100644
>>> --- a/src/util/ralloc.c
>>> +++ b/src/util/ralloc.c
>>> @@ -132,8 +132,6 @@ void *
>>>  rzalloc_size(const void *ctx, size_t size)
>>>  {
>>> void *ptr = ralloc_size(ctx, size);
>>> -   if (likely(ptr != NULL))
>>> -  memset(ptr, 0, size);
>>> return ptr;
>>>  }
>>>
>>>
>>
>> Wow, I have no idea why I did that.  This is certainly
>> counter-intuitive.
>>
>> rzalloc() is supposed to guarantee zeroed memory.  ralloc() is not, but
>> it looks like it always has for some reason.  I'm somewhat inclined to
>> change ralloc_size() to use malloc instead of calloc.
>>
>> I wonder how many things would break :)
>>
> 
> try the change conditionally ifndef DEBUG??  (abusing --enable-debug
> as a proxy for --im-actually-a-mesa-dev-and-want-to-see-the-crashes)
> 
> 

I did try putting malloc in place of calloc and saw almost all Piglit
tests start to fail. A handful of tests still worked, but the crashes came
from many different places, so I thought I would first suggest just taking
the memset out. :)
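
For reference, the contract Kenneth describes (the zeroing variant guarantees zeroed memory, the plain variant does not) could look roughly like this. The names are hypothetical stand-ins and this is not the real ralloc code; it only shows where the memset would have to live if the plain allocator ever stopped using calloc.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical plain allocator: contents are indeterminate, so callers
 * must not rely on them being zero. */
void *
sketch_alloc_size(size_t size)
{
   return malloc(size);
}

/* Hypothetical zeroing allocator: the guarantee is provided here,
 * independent of how sketch_alloc_size() obtains its memory. */
void *
sketch_zalloc_size(size_t size)
{
   void *ptr = sketch_alloc_size(size);
   if (ptr != NULL)
      memset(ptr, 0, size);
   return ptr;
}
```

Under this arrangement the memset is only redundant for as long as the plain allocator happens to zero its memory, which is why removing it and switching to malloc cannot both be done.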

/Juha-Pekka

Re: [Mesa-dev] [PATCH 2/2] glx: provide a way to disable DRI3 using an environment variable

2015-05-07 Thread Martin Peres


On 06/05/15 19:47, Axel Davy wrote:

Le 06/05/2015 14:43, Martin Peres a écrit :

  diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index ff77a91..5246737 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -2092,6 +2092,11 @@ dri3_create_display(Display * dpy)
 xcb_generic_error_t  *error;
 const xcb_query_extension_reply_t*extension;
  +   if (getenv("MESA_GLX_DRI3_DISABLE")) {
+  ErrorMessageF("DRI3 disabled by the environment\n");
+  return NULL;
+   }
+
 xcb_prefetch_extension_data(c, &xcb_dri3_id);
 xcb_prefetch_extension_data(c, &xcb_present_id);

There is already a LIBGL_DRI3_DISABLE env var.

Does this one bring something different ?

Yours,

Axel Davy


Thanks Axel! I heard that there was such a variable, but no-one could 
remember the name. I looked for it in the wrong place it would seem!


Let's drop this patch for the moment. If the variable works as expected, 
I would suggest documenting it in envvar.html :)


Re: [Mesa-dev] [PATCH 1/2] glx: report which DRI version is used when in verbose debug mode

2015-05-07 Thread Eero Tamminen


Hi,

On 05/06/2015 08:28 PM, Kenneth Graunke wrote:

I agree with Axel - I think LIBGL_DRI3_DISABLE=1 already does what you
want, so patch 2 is unnecessary.


That needs a patch to doc/envvars.html...


- Eero


[Mesa-dev] [Bug 69101] prime: black window

2015-05-07 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=69101

higu...@gmx.net changed:

   What|Removed |Added

 CC||higu...@gmx.net


Re: [Mesa-dev] [PATCH 10/13] mesa/main: Check context pointer in _mesa_error before using it

2015-05-07 Thread Pohjolainen, Topi

On Tue, May 05, 2015 at 02:25:26PM +0300, Juha-Pekka Heikkila wrote:
> I guess this should not really be able to segfault but still it
> seems to be able to during context creation.
> 
> Signed-off-by: Juha-Pekka Heikkila 
> ---
>  src/mesa/main/errors.c | 26 --
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
> index 2aa1deb..6631b82 100644
> --- a/src/mesa/main/errors.c
> +++ b/src/mesa/main/errors.c
> @@ -1458,18 +1458,23 @@ _mesa_error( struct gl_context *ctx, GLenum error, 
> const char *fmtString, ... )
>  
It looks to me like it would be better to just return early here:

  if (!ctx)
 return;

Avoids extra indentation and it doesn't look meaningful to call
should_output() with null context.
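
A minimal sketch of the guard-clause shape suggested above, using a hypothetical stand-in for the context structure rather than the real struct gl_context or _mesa_error():

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the context; only what the sketch needs. */
struct ctx_sketch {
   int last_error;
};

/* Returns true when the error was recorded, false when ctx was NULL. */
bool
record_error_sketch(struct ctx_sketch *ctx, int error)
{
   /* Leave early: nothing below is meaningful without a context, and
    * the rest of the function needs no extra indentation or NULL checks. */
   if (!ctx)
      return false;

   ctx->last_error = error;
   return true;
}
```

Compared with wrapping each later use in "if (ctx)", the single check at the top keeps the remaining body unchanged.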

 
> do_output = should_output(ctx, error, fmtString);
>  
> -   mtx_lock(&ctx->DebugMutex);
> -   if (ctx->Debug) {
> -  do_log = debug_is_message_enabled(ctx->Debug,
> -MESA_DEBUG_SOURCE_API,
> -MESA_DEBUG_TYPE_ERROR,
> -error_msg_id,
> -MESA_DEBUG_SEVERITY_HIGH);
> +   if (ctx) {
> +  mtx_lock(&ctx->DebugMutex);
> +  if (ctx->Debug) {
> + do_log = debug_is_message_enabled(ctx->Debug,
> +   MESA_DEBUG_SOURCE_API,
> +   MESA_DEBUG_TYPE_ERROR,
> +   error_msg_id,
> +   MESA_DEBUG_SEVERITY_HIGH);
> +  }
> +  else {
> + do_log = GL_FALSE;
> +  }
> +  mtx_unlock(&ctx->DebugMutex);
> }
> else {
>do_log = GL_FALSE;
> }
> -   mtx_unlock(&ctx->DebugMutex);
>  
> if (do_output || do_log) {
>char s[MAX_DEBUG_MESSAGE_LENGTH], s2[MAX_DEBUG_MESSAGE_LENGTH];
> @@ -1502,14 +1507,15 @@ _mesa_error( struct gl_context *ctx, GLenum error, 
> const char *fmtString, ... )
>}
>  
>/* Log the error via ARB_debug_output if needed.*/
> -  if (do_log) {
> +  if (ctx && do_log) {
>   log_msg(ctx, MESA_DEBUG_SOURCE_API, MESA_DEBUG_TYPE_ERROR,
>   error_msg_id, MESA_DEBUG_SEVERITY_HIGH, len, s2);
>}
> }
>  
> /* Set the GL context error state for glGetError. */
> -   _mesa_record_error(ctx, error);
> +   if (ctx)
> +  _mesa_record_error(ctx, error);
>  }
>  
>  void
> -- 
> 1.8.5.1
> 

Re: [Mesa-dev] [PATCH 13/13] mesa/main: Verify context creation on progress

2015-05-07 Thread Pohjolainen, Topi

On Tue, May 05, 2015 at 02:25:29PM +0300, Juha-Pekka Heikkila wrote:
> Stop context creation if something failed. If something errored
> during context creation we'd segfault. Now we clean up and
> return an error.
> 
> Signed-off-by: Juha-Pekka Heikkila 
> ---
>  src/mesa/main/shared.c | 66 
> +++---
>  1 file changed, 62 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
> index 0b76cc0..cc05b05 100644
> --- a/src/mesa/main/shared.c
> +++ b/src/mesa/main/shared.c
> @@ -64,9 +64,21 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>  
> mtx_init(&shared->Mutex, mtx_plain);
>  
> +   /* Mutex and timestamp for texobj state validation */
> +   mtx_init(&shared->TexMutex, mtx_recursive);
> +   shared->TextureStateStamp = 0;

Do you really need to move this here?
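
For context, the patch funnels every failure to a single error_out label. A stripped-down sketch of that pattern, with hypothetical names and plain heap allocations standing in for the hash tables, might look like the block below. Anything the label tears down has to be in a known state before the first goto; that may be why the mutex initialization was moved up, but that is a guess, not something the patch states.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the shared state; two resources are enough. */
struct shared_sketch {
   int *display_list;
   int *tex_objects;
};

/* fail_second simulates the second allocation failing. */
struct shared_sketch *
shared_sketch_create(int fail_second)
{
   /* calloc leaves every pointer NULL, so the error path can free
    * members that were never allocated. */
   struct shared_sketch *s = calloc(1, sizeof(*s));
   if (!s)
      return NULL;

   s->display_list = malloc(sizeof(int));
   if (!s->display_list)
      goto error_out;

   s->tex_objects = fail_second ? NULL : malloc(sizeof(int));
   if (!s->tex_objects)
      goto error_out;

   return s;

error_out:
   free(s->tex_objects);   /* free(NULL) is a no-op */
   free(s->display_list);
   free(s);
   return NULL;
}
```
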

> +
> shared->DisplayList = _mesa_NewHashTable();
> +   if (!shared->DisplayList)
> +  goto error_out;
> +
> shared->TexObjects = _mesa_NewHashTable();
> +   if (!shared->TexObjects)
> +  goto error_out;
> +
> shared->Programs = _mesa_NewHashTable();
> +   if (!shared->Programs)
> +  goto error_out;
>  
> shared->DefaultVertexProgram =
>gl_vertex_program(ctx->Driver.NewProgram(ctx,
> @@ -76,17 +88,28 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>   GL_FRAGMENT_PROGRAM_ARB, 
> 0));
>  
> shared->ATIShaders = _mesa_NewHashTable();
> +   if (!shared->ATIShaders)
> +  goto error_out;
> +
> shared->DefaultFragmentShader = _mesa_new_ati_fragment_shader(ctx, 0);
>  
> shared->ShaderObjects = _mesa_NewHashTable();
> +   if (!shared->ShaderObjects)
> +  goto error_out;
>  
> shared->BufferObjects = _mesa_NewHashTable();
> +   if (!shared->BufferObjects)
> +  goto error_out;
>  
> /* GL_ARB_sampler_objects */
> shared->SamplerObjects = _mesa_NewHashTable();
> +   if (!shared->SamplerObjects)
> +  goto error_out;
>  
> /* Allocate the default buffer object */
> shared->NullBufferObj = ctx->Driver.NewBufferObject(ctx, 0);
> +   if (!shared->NullBufferObj)
> +   goto error_out;
>  
> /* Create default texture objects */
> for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
> @@ -107,22 +130,57 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>};
>STATIC_ASSERT(ARRAY_SIZE(targets) == NUM_TEXTURE_TARGETS);
>shared->DefaultTex[i] = ctx->Driver.NewTextureObject(ctx, 0, 
> targets[i]);
> +
> +  if (!shared->DefaultTex[i])
> +  goto error_out;
> }
>  
> /* sanity check */
> assert(shared->DefaultTex[TEXTURE_1D_INDEX]->RefCount == 1);
>  
> -   /* Mutex and timestamp for texobj state validation */
> -   mtx_init(&shared->TexMutex, mtx_recursive);
> -   shared->TextureStateStamp = 0;
> -
> shared->FrameBuffers = _mesa_NewHashTable();
> +   if (!shared->FrameBuffers)
> +  goto error_out;
> +
> shared->RenderBuffers = _mesa_NewHashTable();
> +   if (!shared->RenderBuffers)
> +  goto error_out;
>  
> shared->SyncObjects = _mesa_set_create(NULL, _mesa_hash_pointer,
>_mesa_key_pointer_equal);
> +   if (!shared->SyncObjects)
> +  goto error_out;
>  
> return shared;
> +
> +error_out:
> +   for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
> +  if (shared->DefaultTex[i]) {
> + ctx->Driver.DeleteTexture(ctx, shared->DefaultTex[i]);
> +  }
> +   }
> +
> +   _mesa_reference_buffer_object(ctx, &shared->NullBufferObj, NULL);
> +
> +   _mesa_DeleteHashTable(shared->RenderBuffers);
> +   _mesa_DeleteHashTable(shared->FrameBuffers);
> +   _mesa_DeleteHashTable(shared->SamplerObjects);
> +   _mesa_DeleteHashTable(shared->BufferObjects);
> +   _mesa_DeleteHashTable(shared->ShaderObjects);
> +   _mesa_DeleteHashTable(shared->ATIShaders);
> +   _mesa_DeleteHashTable(shared->Programs);
> +   _mesa_DeleteHashTable(shared->TexObjects);
> +   _mesa_DeleteHashTable(shared->DisplayList);
> +
> +   _mesa_reference_vertprog(ctx, &shared->DefaultVertexProgram, NULL);
> +   _mesa_reference_geomprog(ctx, &shared->DefaultGeometryProgram, NULL);
> +   _mesa_reference_fragprog(ctx, &shared->DefaultFragmentProgram, NULL);
> +
> +   mtx_destroy(&shared->Mutex);
> +   mtx_destroy(&shared->TexMutex);
> +
> +   free(shared);
> +   return NULL;
>  }
>  
>  
> -- 
> 1.8.5.1
> 

Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions

2015-05-07 Thread Pohjolainen, Topi

On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
> v2: Get rid of the block parameter and make src a const reference
> 
> Reviewed-by: Topi Pohjolainen 
> Reviewed-by: Matt Turner 
> Reviewed-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 
> 
>  1 file changed, 38 insertions(+), 37 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> index 43370cb..9c4ed0b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
>operands_match(a, b, negate);
>  }
>  
> +static fs_inst *
> +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)

Did you mean 'src' to be a const reference? It is only used for reading,
so it could be - you claim this in the commit message yourself :)

> +{
> +   int written = inst->regs_written;
> +   int dst_width = inst->dst.width / 8;
> +   fs_reg dst = inst->dst;
> +   fs_inst *copy;
> +
> +   if (written > dst_width) {
> +  fs_reg *sources = ralloc_array(v->mem_ctx, fs_reg, written / 
> dst_width);
> +  for (int i = 0; i < written / dst_width; i++)
> + sources[i] = offset(src, i);
> +  copy = v->LOAD_PAYLOAD(dst, sources, written / dst_width);
> +   } else {
> +  copy = v->MOV(dst, src);
> +  copy->force_writemask_all = inst->force_writemask_all;
> +  copy->src[0].negate = negate;
> +   }
> +   assert(copy->regs_written == written);
> +
> +   return copy;
> +}
> +
>  bool
>  fs_visitor::opt_cse_local(bblock_t *block)
>  {
> @@ -230,49 +253,27 @@ fs_visitor::opt_cse_local(bblock_t *block)
>  bool no_existing_temp = entry->tmp.file == BAD_FILE;
>  if (no_existing_temp && !entry->generator->dst.is_null()) {
> int written = entry->generator->regs_written;
> -   int dst_width = entry->generator->dst.width / 8;
> -   assert(written % dst_width == 0);
> -
> -   fs_reg orig_dst = entry->generator->dst;
> -   fs_reg tmp = fs_reg(GRF, alloc.allocate(written),
> -   orig_dst.type, orig_dst.width);
> -   entry->tmp = tmp;
> -   entry->generator->dst = tmp;
> -
> -   fs_inst *copy;
> -   if (written > dst_width) {
> -  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / 
> dst_width);
> -  for (int i = 0; i < written / dst_width; i++)
> - sources[i] = offset(tmp, i);
> -  copy = LOAD_PAYLOAD(orig_dst, sources, written / 
> dst_width);
> -   } else {
> -  copy = MOV(orig_dst, tmp);
> -  copy->force_writemask_all =
> - entry->generator->force_writemask_all;
> -   }
> +   assert((written * 8) % entry->generator->dst.width == 0);
> +
> +   entry->tmp = fs_reg(GRF, alloc.allocate(written),
> +   entry->generator->dst.type,
> +   entry->generator->dst.width);
> +
> +   fs_inst *copy = create_copy_instr(this, entry->generator,
> + entry->tmp, false);
> entry->generator->insert_after(block, copy);
> +
> +   entry->generator->dst = entry->tmp;
>  }
>  
>  /* dest <- temp */
>  if (!inst->dst.is_null()) {
> -   int written = inst->regs_written;
> -   int dst_width = inst->dst.width / 8;
> -   assert(written == entry->generator->regs_written);
> -   assert(dst_width == entry->generator->dst.width / 8);
> +   assert(inst->regs_written == entry->generator->regs_written);
> +   assert(inst->dst.width == entry->generator->dst.width);
> assert(inst->dst.type == entry->tmp.type);
> -   fs_reg dst = inst->dst;
> -   fs_reg tmp = entry->tmp;
> -   fs_inst *copy;
> -   if (written > dst_width) {
> -  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / 
> dst_width);
> -  for (int i = 0; i < written / dst_width; i++)
> - sources[i] = offset(tmp, i);
> -  copy = LOAD_PAYLOAD(dst, sources, written / dst_width);
> -   } else {
> -  copy = MOV(dst, tmp);
> -  copy->force_writemask_all = inst->force_writemask_all;
> -  copy->src[0].negate = negate;
> -   }
> +
> +   fs_inst *copy = create_copy_instr(this, inst,
> + entry->tmp, negate);
> inst->insert_before(block, copy);
>  }
>  
> -- 
> 2.3.6
> 

Re: [Mesa-dev] [PATCH 14/27] nir: Add glsl_get_element_type() wrapper.

2015-05-07 Thread Timothy Arceri


On Tue, 2015-04-28 at 23:08 +0300, Abdiel Janulgue wrote:
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/glsl/nir/nir_types.cpp | 5 +
>  src/glsl/nir/nir_types.h   | 2 ++
>  2 files changed, 7 insertions(+)
> 
> diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
> index f0d0b46..249678f 100644
> --- a/src/glsl/nir/nir_types.cpp
> +++ b/src/glsl/nir/nir_types.cpp
> @@ -82,6 +82,11 @@ glsl_get_base_type(const struct glsl_type *type)
> return type->base_type;
>  }
>  
> +const struct glsl_type *
> +glsl_get_element_type(const struct glsl_type *type)
> +{
> +   return type->element_type();

I've sent a patch to remove the element_type() helper. I've yet to see a
case where just using is_array() and/or without_array() doesn't result in
clearer code, with the added advantage (in most cases) of free
multidimensional array support.

http://lists.freedesktop.org/archives/mesa-dev/2015-April/083195.html

> +}
>  unsigned
>  glsl_get_vector_elements(const struct glsl_type *type)
>  {
> diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
> index 276d4ad..125f075 100644
> --- a/src/glsl/nir/nir_types.h
> +++ b/src/glsl/nir/nir_types.h
> @@ -49,6 +49,8 @@ const struct glsl_type *glsl_get_array_element(const struct 
> glsl_type *type);
>  
>  const struct glsl_type *glsl_get_column_type(const struct glsl_type *type);
>  
> +const struct glsl_type *glsl_get_element_type(const struct glsl_type *type);
> +
>  enum glsl_base_type glsl_get_base_type(const struct glsl_type *type);
>  
>  unsigned glsl_get_vector_elements(const struct glsl_type *type);



[Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null

2015-05-07 Thread Neil Roberts

opt_sampler_eot enables a direct write to framebuffer from a sample.
In order to do this the sample message needs to have a message header
so if there wasn't one already then the function adds one. In addition
the function sets the destination register to null because it's no
longer used. However it was only doing this in cases where it was
adding a message header. This patch just moves setting the destination
so that it happens even if there's already a message header. In practice this
doesn't seem to make any difference but it's a bit cleaner.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1ca7ca6..72d408b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2675,6 +2675,7 @@ fs_visitor::opt_sampler_eot()
 
tex_inst->offset |= fb_write->target << 24;
tex_inst->eot = true;
+   tex_inst->dst = reg_null_ud;
fb_write->remove(cfg->blocks[cfg->num_blocks - 1]);
 
/* If a header is present, marking the eot is sufficient. Otherwise, we need
@@ -2712,7 +2713,6 @@ fs_visitor::opt_sampler_eot()
tex_inst->header_present = true;
tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
tex_inst->src[0] = send_header;
-   tex_inst->dst = reg_null_ud;
 
return true;
 }
-- 
1.9.3


[Mesa-dev] [PATCH] i965/wm/gen6: Add option for disabling statistics collection

2015-05-07 Thread Topi Pohjolainen

Normally this is always needed, but for internal blits and clears
we need to be able to disable it.

CC: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_state.h |  3 ++-
 src/mesa/drivers/dri/i965/gen6_wm_state.c | 14 +++---
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index 18449c4..26fdae6 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -339,7 +339,8 @@ gen6_upload_wm_state(struct brw_context *brw,
  bool multisampled_fbo, int min_inv_per_frag,
  bool dual_source_blend_enable, bool kill_enable,
  bool color_buffer_write_enable, bool msaa_enabled,
- bool line_stipple_enable, bool polygon_stipple_enable);
+ bool line_stipple_enable, bool polygon_stipple_enable,
+ bool statistic_enable);
 
 /* gen6_sf_state.c */
 void
diff --git a/src/mesa/drivers/dri/i965/gen6_wm_state.c 
b/src/mesa/drivers/dri/i965/gen6_wm_state.c
index e5b0f5a..7081eb7 100644
--- a/src/mesa/drivers/dri/i965/gen6_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_wm_state.c
@@ -73,7 +73,8 @@ gen6_upload_wm_state(struct brw_context *brw,
  bool multisampled_fbo, int min_inv_per_frag,
  bool dual_source_blend_enable, bool kill_enable,
  bool color_buffer_write_enable, bool msaa_enabled,
- bool line_stipple_enable, bool polygon_stipple_enable)
+ bool line_stipple_enable, bool polygon_stipple_enable,
+ bool statistic_enable)
 {
uint32_t dw2, dw4, dw5, dw6, ksp0, ksp2;
 
@@ -109,7 +110,10 @@ gen6_upload_wm_state(struct brw_context *brw,
}
 
dw2 = dw4 = dw5 = dw6 = ksp2 = 0;
-   dw4 |= GEN6_WM_STATISTICS_ENABLE;
+
+   if (statistic_enable)
+  dw4 |= GEN6_WM_STATISTICS_ENABLE;
+
dw5 |= GEN6_WM_LINE_AA_WIDTH_1_0;
dw5 |= GEN6_WM_LINE_END_CAP_AA_WIDTH_0_5;
 
@@ -300,6 +304,9 @@ upload_wm_state(struct brw_context *brw)
 ctx->Multisample.SampleAlphaToCoverage ||
 prog_data->uses_omask;
 
+   /* Rendering against the gl-context is always taken into account. */
+   const bool statistic_enable = true;
+
/* _NEW_LINE | _NEW_POLYGON | _NEW_BUFFERS | _NEW_COLOR |
 * _NEW_MULTISAMPLE
 */
@@ -308,7 +315,8 @@ upload_wm_state(struct brw_context *brw)
 dual_src_blend_enable, kill_enable,
 brw_color_buffer_write_enabled(brw),
 ctx->Multisample.Enabled,
-ctx->Line.StippleFlag, ctx->Polygon.StippleFlag);
+ctx->Line.StippleFlag, ctx->Polygon.StippleFlag,
+statistic_enable);
 }
 
 const struct brw_tracked_state gen6_wm_state = {
-- 
1.9.3


Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote:
> This patch implements the binding table enable command which is also
> used to allocate a binding table pool where hardware-generated
> binding table entries are flushed into. Each binding table offset in
> the binding table pool is unique for each shader stage that is
> enabled within a batch.
> 
> Also insert the required brw_tracked_state objects to enable
> hw-generated binding tables in normal render path.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 
> ++
>  src/mesa/drivers/dri/i965/brw_context.c|  4 ++
>  src/mesa/drivers/dri/i965/brw_context.h|  5 ++
>  src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
>  src/mesa/drivers/dri/i965/brw_state_upload.c   |  2 +
>  src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
>  6 files changed, 92 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
> b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> index 459165a..a58e32e 100644
> --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> @@ -44,6 +44,11 @@
>  #include "brw_state.h"
>  #include "intel_batchbuffer.h"
>  
> +/* Somehow the hw-binding table pool offset must start here, otherwise
> + * the GPU will hang
> + */
> +#define HW_BT_START_OFFSET 256;

I think we want to understand this a little better before enabling...

> +
>  /**
>   * Upload a shader stage's binding table as indirect state.
>   *
> @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = {
> .emit = brw_gs_upload_binding_table,
>  };
>  
> +/**
> + * Hardware-generated binding tables for the resource streamer
> + */
> +void
> +gen7_disable_hw_binding_tables(struct brw_context *brw)
> +{
> +   BEGIN_BATCH(3);
> +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
> +   OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, 
> BRW_HW_BINDING_TABLE_ENABLE) |
> + brw->is_haswell ? HSW_HW_BINDING_TABLE_RESERVED : 0);
> +   OUT_BATCH(0);
> +   ADVANCE_BATCH();
> +
> +   /* Pipe control workaround */
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
> +}
> +
> +void
> +gen7_enable_hw_binding_tables(struct brw_context *brw)
> +{
> +   if (!brw->has_resource_streamer) {
> +  gen7_disable_hw_binding_tables(brw);

I started wondering why we really need this - RS is disabled by default and
we haven't needed to do anything to disable it before.

> +  return;
> +   }
> +
> +   if (!brw->hw_bt_pool.bo) {
> +  /* From the BSpec, 3D Pipeline > Resource Streamer > Hardware Binding 
> Tables:
> +   *
> +   *  "A maximum of 16,383 Binding tables are allowed in any batch 
> buffer."
> +   */
> +  int max_size = 16383 * 4;

But does it really need this much all the time? I guess I need to go and
read the spec.

> +  brw->hw_bt_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "hw_bt",
> +  max_size, 64);
> +  brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
> +   }
> +
> +   uint32_t dw1 = SET_FIELD(BRW_HW_BINDING_TABLE_ON, 
> BRW_HW_BINDING_TABLE_ENABLE);
> +   if (brw->is_haswell)
> +  dw1 |= SET_FIELD(GEN7_MOCS_L3, GEN7_HW_BT_MOCS) | 
> HSW_HW_BINDING_TABLE_RESERVED;

These are overflowing 80 columns.

> +
> +   BEGIN_BATCH(3);
> +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
> +   OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
> +   OUT_RELOC(brw->hw_bt_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
> + brw->hw_bt_pool.bo->size);
> +   ADVANCE_BATCH();
> +
> +   /* Pipe control workaround */
> +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);

Would you have a spec reference for this?

> +}
> +
> +void
> +gen7_reset_rs_pool_offsets(struct brw_context *brw)
> +{
> +   brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
> +}
> +
> +const struct brw_tracked_state gen7_hw_binding_tables = {
> +   .dirty = {
> +  .mesa = 0,
> +  .brw = BRW_NEW_BATCH,
> +   },
> +   .emit = gen7_enable_hw_binding_tables
> +};
> +
>  /** @} */
>  
>  /**
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index c7e1e81..9c7ccae 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -953,6 +953,10 @@ intelDestroyContext(__DRIcontext * driContextPriv)
> if (brw->wm.base.scratch_bo)
>drm_intel_bo_unreference(brw->wm.base.scratch_bo);
>  
> +   gen7_reset_rs_pool_offsets(brw);
> +   drm_intel_bo_unreference(brw->hw_bt_pool.bo);
> +   brw->hw_bt_pool.bo = NULL;
> +
> drm_intel_gem_context_destroy(brw->hw_ctx);
>  
> if (ctx->swrast_context) {
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 07626af..1c72b74 100644
> --- a/src/mesa/drivers/dri/i965/b

Re: [Mesa-dev] [PATCH 01/27] i965: Define HW-binding table and resource streamer control opcodes

2015-05-07 Thread Pohjolainen, Topi

On Sun, May 03, 2015 at 06:04:05PM +0300, Pohjolainen, Topi wrote:
> On Tue, Apr 28, 2015 at 11:07:58PM +0300, Abdiel Janulgue wrote:
> > Signed-off-by: Abdiel Janulgue 
> > ---
> >  src/mesa/drivers/dri/i965/brw_context.h |  1 +
> >  src/mesa/drivers/dri/i965/brw_defines.h | 24 
> >  src/mesa/drivers/dri/i965/intel_reg.h   |  3 +++
> >  3 files changed, 28 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> > b/src/mesa/drivers/dri/i965/brw_context.h
> > index a6d6787..07626af 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -1105,6 +1105,7 @@ struct brw_context
> > bool no_simd8;
> > bool use_rep_send;
> > bool scalar_vs;
> > +   bool has_resource_streamer;
> 
> This should go to the next patch. Other than that all looks good - I checked
> the values against bspec and I couldn't find anything amiss.
> 
> Reviewed-by: Topi Pohjolainen 
> 
> >  
> > /**
> >  * Some versions of Gen hardware don't do centroid interpolation 
> > correctly
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index a97a944..da288d3 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -1586,6 +1586,30 @@ enum brw_message_target {
> >  #define _3DSTATE_BINDING_TABLE_POINTERS_GS 0x7829 /* GEN7+ */
> >  #define _3DSTATE_BINDING_TABLE_POINTERS_PS 0x782A /* GEN7+ */
> >  
> > +#define _3DSTATE_BINDING_TABLE_POOL_ALLOC   0x7919 /* GEN7.5+ */
> > +#define BRW_HW_BINDING_TABLE_ENABLE_SHIFT   11 /* GEN7.5+ */
> > > +#define BRW_HW_BINDING_TABLE_ENABLE_MASK   INTEL_MASK(11, 11)

Actually, we usually define booleans as a single flag:

 #define BRW_HW_BINDING_TABLE_ENABLE (1 << 11)
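
To illustrate why the two styles are interchangeable for a one-bit field, here is a minimal sketch. EXAMPLE_MASK is a stand-in written for this example (an assumption, not copied from the tree), and all EXAMPLE_* names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a (high, low) bit-range mask helper; assumed for this sketch. */
#define EXAMPLE_MASK(high, low) (((1u << ((high) - (low) + 1)) - 1u) << (low))

/* Style used in the patch: shift + mask + ON/OFF values. */
#define EXAMPLE_BT_ENABLE_SHIFT 11
#define EXAMPLE_BT_ENABLE_MASK  EXAMPLE_MASK(11, 11)
#define EXAMPLE_BT_ON           1u

/* Style suggested in the review: a single flag. */
#define EXAMPLE_BT_ENABLE       (1u << 11)

/* Build the enable bit using the shift/mask style. */
static inline uint32_t example_field_style(void)
{
   return (EXAMPLE_BT_ON << EXAMPLE_BT_ENABLE_SHIFT) & EXAMPLE_BT_ENABLE_MASK;
}

/* Build the enable bit using the single-flag style. */
static inline uint32_t example_flag_style(void)
{
   return EXAMPLE_BT_ENABLE;
}

/* Both styles must set exactly the same bit. */
static inline void example_check_styles(void)
{
   assert(example_field_style() == example_flag_style());
}
```

For a single bit the flag form carries the same information in one define instead of four, which is presumably why it is the usual choice.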

> > +#define BRW_HW_BINDING_TABLE_ON 1
> > > +#define BRW_HW_BINDING_TABLE_OFF 0
> > > +#define GEN7_HW_BT_MOCS_SHIFT   7
> > > +#define GEN7_HW_BT_MOCS_MASK    INTEL_MASK(10, 7)
> > > +#define GEN8_HW_BT_MOCS_SHIFT   0
> > > +#define GEN8_HW_BT_MOCS_MASK    INTEL_MASK(6, 0)
> > +/* Only required in HSW */
> > +#define HSW_HW_BINDING_TABLE_RESERVED   (3 << 5)
> > +
> > +#define _3DSTATE_BINDING_TABLE_EDIT_VS  0x7843 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_GS  0x7844 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_HS  0x7845 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_DS  0x7846 /* GEN7.5 */
> > +#define _3DSTATE_BINDING_TABLE_EDIT_PS  0x7847 /* GEN7.5 */
> > +#define BRW_BINDING_TABLE_INDEX_SHIFT   16
> > > +#define BRW_BINDING_TABLE_INDEX_MASK    INTEL_MASK(23, 16)
> > +
> > +#define BRW_BINDING_TABLE_EDIT_TARGET_ALL   3
> > +#define BRW_BINDING_TABLE_EDIT_TARGET_CORE1 2
> > +#define BRW_BINDING_TABLE_EDIT_TARGET_CORE0 1
> > +
> > >  #define _3DSTATE_SAMPLER_STATE_POINTERS 0x7802 /* GEN6+ */
> >  # define PS_SAMPLER_STATE_CHANGE   (1 << 12)
> >  # define GS_SAMPLER_STATE_CHANGE   (1 << 9)
> > diff --git a/src/mesa/drivers/dri/i965/intel_reg.h 
> > b/src/mesa/drivers/dri/i965/intel_reg.h
> > index 488fb5b..9cdb3ca 100644
> > --- a/src/mesa/drivers/dri/i965/intel_reg.h
> > +++ b/src/mesa/drivers/dri/i965/intel_reg.h
> > @@ -47,6 +47,9 @@
> >  /* Load a value from memory into a register.  Only available on Gen7+. */
> >  #define GEN7_MI_LOAD_REGISTER_MEM  (CMD_MI | (0x29 << 23))
> >  # define MI_LOAD_REGISTER_MEM_USE_GGTT (1 << 22)
> > +/* Haswell RS control */
> > +#define MI_RS_CONTROL   (CMD_MI | (0x6 << 23))
> > > +#define MI_RS_STORE_DATA_IMM    (CMD_MI | (0x2b << 23))
> >  
> >  /** @{
> >   *
> > -- 
> > 1.9.1
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 06/27] i965: Define gather push constants opcodes

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:03PM +0300, Abdiel Janulgue wrote:
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_defines.h | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> b/src/mesa/drivers/dri/i965/brw_defines.h
> index da288d3..8079433 100644
> --- a/src/mesa/drivers/dri/i965/brw_defines.h
> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> @@ -2209,6 +2209,29 @@ enum brw_wm_barycentric_interp_mode {
>  #define _3DSTATE_CONSTANT_HS  0x7819 /* GEN7+ */
>  #define _3DSTATE_CONSTANT_DS  0x781A /* GEN7+ */
>  
> +/* Resource streamer gather constants */
> +#define _3DSTATE_GATHER_POOL_ALLOC 0x791A /* GEN7.5+ */
> +#define _3DSTATE_GATHER_CONSTANT_VS   0x7834
> +#define _3DSTATE_GATHER_CONSTANT_GS   0x7835
> +#define _3DSTATE_GATHER_CONSTANT_HS   0x7836
> +#define _3DSTATE_GATHER_CONSTANT_DS   0x7837
> +#define _3DSTATE_GATHER_CONSTANT_PS   0x7838
> +/* Only required in HSW */
> +#define HSW_GATHER_CONSTANTS_RESERVED (3 << 4)
> +
> +#define BRW_GATHER_CONSTANTS_ENABLE_SHIFT 11 /* GEN7.5+ */
> +#define BRW_GATHER_CONSTANTS_ENABLE_MASK  INTEL_MASK(11, 11)
> +#define BRW_GATHER_CONSTANTS_ON   1
> +#define BRW_GATHER_CONSTANTS_OFF  0

As with SO_FUNCTION_ENABLE below, this could be a single flag:

   #define BRW_GATHER_CONSTANTS_ENABLE   (1 << 11) /* GEN7.5+ */

> +#define BRW_GATHER_BUFFER_VALID_SHIFT 16
> +#define BRW_GATHER_BUFFER_VALID_MASK  INTEL_MASK(31, 16)
> +#define BRW_GATHER_BINDING_TABLE_BLOCK_SHIFT  12
> +#define BRW_GATHER_BINDING_TABLE_BLOCK_MASK   INTEL_MASK(15, 12)
> +#define BRW_GATHER_CONST_BUFFER_OFFSET_SHIFT  8
> +#define BRW_GATHER_CONST_BUFFER_OFFSET_MASK   INTEL_MASK(15, 8)
> +#define BRW_GATHER_CHANNEL_MASK_SHIFT 4
> +#define BRW_GATHER_CHANNEL_MASK_MASK  INTEL_MASK(7, 4)
> +
>  #define _3DSTATE_STREAMOUT 0x781e /* GEN7+ */
>  /* DW1 */
>  # define SO_FUNCTION_ENABLE  (1 << 31)
> -- 
> 1.9.1
> 

Re: [Mesa-dev] [PATCH 16/27] i965: Include UBO parameter sizes in push constant parameters

2015-05-07 Thread Timothy Arceri

On Tue, 2015-04-28 at 23:08 +0300, Abdiel Janulgue wrote:
> Now that we consider UBO constants as push constants, we need to include
> the sizes of the UBO's constant slots in the visitor's uniform slot sizes.
> This information is needed to properly pack vector constants tightly next to
> each other.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_gs.c | 11 +++
>  src/mesa/drivers/dri/i965/brw_vs.c | 13 +
>  src/mesa/drivers/dri/i965/brw_wm.c | 13 +
>  3 files changed, 37 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index 97658d5..2dc3ea1 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -32,6 +32,7 @@
>  #include "brw_vec4_gs_visitor.h"
>  #include "brw_state.h"
>  #include "brw_ff_gs.h"
> +#include "glsl/nir/nir_types.h"
>  
> 
>  bool
> @@ -70,6 +71,16 @@ brw_compile_gs_prog(struct brw_context *brw,
> c.prog_data.base.base.pull_param =
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> c.prog_data.base.base.nr_params = param_count;
> +   c.prog_data.base.base.nr_ubo_params = 0;
> +   for (int i = 0; i < gs->NumUniformBlocks; i++) {
> +  for (int p = 0; p < gs->UniformBlocks[i].NumUniforms; p++) {
> + const struct glsl_type *type = 
> gs->UniformBlocks[i].Uniforms[p].Type;
> + const struct glsl_type *elem = glsl_get_element_type(type);
> + int array_sz = elem ? glsl_get_array_size(type) : 1;
> + int components = elem ? glsl_get_components(elem) : 
> glsl_get_components(type);

As mentioned on the previous patch, I've sent a patch to remove the
element type helper. I'm not sure I understand why the NIR wrappers
need to be used here; can you explain for my benefit?

Another way to write this without element type could be something like
this:

const struct glsl_type *type = gs->UniformBlocks[i].Uniforms[p].Type;
int array_sz = MAX2(glsl_get_array_size(type), 1);
int components = glsl_get_components(glsl_get_type_without_array(type));

You would obviously need to wrap the without_array() helper instead.

Assuming arrays of arrays support is required here in the future (the spec
says uniform blocks can be arrays of arrays, but I'm not overly familiar
with the code you're working on), the only bit missing would be
multiplying the array size by the other array dimensions.
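
As a rough illustration of that last point, here is a self-contained sketch of counting scalar slots through every array dimension. struct toy_type and toy_count_params() are hypothetical stand-ins written for this example; they are not struct glsl_type or any existing helper:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GLSL type; NOT Mesa's struct glsl_type. */
struct toy_type {
   unsigned components;            /* scalar components of a non-array type */
   unsigned length;                /* array length, 0 if not an array */
   const struct toy_type *element; /* element type for arrays, else NULL */
};

/* Number of scalar slots, multiplying through every array dimension. */
static unsigned toy_count_params(const struct toy_type *type)
{
   unsigned total_elems = 1;

   assert(type != NULL);

   /* Peel off one array dimension per iteration. */
   while (type->element != NULL) {
      total_elems *= type->length;
      type = type->element;
   }

   /* What is left is the innermost non-array type. */
   return total_elems * type->components;
}
```

With a vec4 as the innermost type, a vec4[3] counts 12 slots and a vec4[2][3] counts 24, so the non-array and single-array cases fall out of the same loop.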


> + c.prog_data.base.base.nr_ubo_params += components * array_sz;
> +  }
> +   }
> c.prog_data.base.base.nr_gather_table = 0;
> c.prog_data.base.base.gather_table =
>rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
> b/src/mesa/drivers/dri/i965/brw_vs.c
> index 52333c9..86bef5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -37,6 +37,7 @@
>  #include "brw_state.h"
>  #include "program/prog_print.h"
>  #include "program/prog_parameter.h"
> +#include "glsl/nir/nir_types.h"
>  
>  #include "util/ralloc.h"
>  
> @@ -243,6 +244,18 @@ brw_compile_vs_prog(struct brw_context *brw,
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> stage_prog_data->nr_params = param_count;
>  
> +   stage_prog_data->nr_ubo_params = 0;
> +   if (vs) {
> +  for (int i = 0; i < vs->NumUniformBlocks; i++) {
> + for (int p = 0; p < vs->UniformBlocks[i].NumUniforms; p++) {
> +const struct glsl_type *type = 
> vs->UniformBlocks[i].Uniforms[p].Type;
> +const struct glsl_type *elem = glsl_get_element_type(type);
> +int array_sz = elem ? glsl_get_array_size(type) : 1;
> +int components = elem ? glsl_get_components(elem) : 
> glsl_get_components(type);
> +stage_prog_data->nr_ubo_params += components * array_sz;
> + }
> +  }
> +   }
> stage_prog_data->nr_gather_table = 0;
> stage_prog_data->gather_table = rzalloc_size(NULL, 
> sizeof(*stage_prog_data->gather_table) *
>  (stage_prog_data->nr_params +
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
> b/src/mesa/drivers/dri/i965/brw_wm.c
> index 13a64d8..2060eab 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -38,6 +38,7 @@
>  #include "main/samplerobj.h"
>  #include "program/prog_parameter.h"
>  #include "program/program.h"
> +#include "glsl/nir/nir_types.h"
>  #include "intel_mipmap_tree.h"
>  
>  #include "util/ralloc.h"
> @@ -205,6 +206,18 @@ brw_compile_wm_prog(struct brw_context *brw,
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> prog_data.base.nr_params = param_count;
>  
> +   prog_data.base.nr_ubo_params = 0;
> +   if (fs) {
> +  for (int i = 0; i < fs->NumUniformBlocks; i++) {
> + for (int p = 0; p < fs->UniformBlocks[i].NumUniforms; p++) {
> +const struct glsl_type *type = 
> f

Re: [Mesa-dev] i965: Revision of texture surface setup refactoring

2015-05-07 Thread Francisco Jerez

"Pohjolainen, Topi"  writes:

> On Wed, May 06, 2015 at 02:56:53PM +0300, Francisco Jerez wrote:
>> Hi!
>> 
>> Topi Pohjolainen  writes:
>> 
>> > This series moves all the decision making of values into common
>> > hardware independent dispatcher while leaving the hardware specific
>> > logic to deal with formatting only.
>> >
>> > Curro needed a similar refactor for gen7 and gen8. However, that
>> > makes it harder to apply the changes I needed that expand all the
>> > way to gen4. Ken helped me to notice that my refactoring can in
>> > fact address both relatively easily.
>> >
>> > For context, I added the patch from Curro that makes use of the
>> > texture surface setup logic along with a small patch making it
>> > compatible with the surface state refactoring found here.
>> >
>> > Curro, what do you think? I'm not too happy with reverting your
>> > work but overall this way it becomes cleaner, I think.
>> >
>> 
>> *Shrug*, it seems weird to me that you opted to revert my patches even
>> though they are closer to where you want to get at than it was before my
>> patches.
>> 
>> This is the current interface:
>>   void (*emit_texture_surface_state)(struct brw_context *brw,
>>  struct intel_mipmap_tree *mt,
>>  GLenum target,
>>  unsigned min_layer,
>>  unsigned max_layer,
>>  unsigned min_level,
>>  unsigned max_level,
>>  unsigned format,
>>  unsigned swizzle,
>>  uint32_t *surf_offset,
>>  bool rw, bool for_gather);
>> 
>> This is the old interface we both wanted to get rid of:
>>   void (*update_texture_surface)(struct gl_context *ctx,
>>  unsigned unit,
>>  uint32_t *surf_offset,
>>  bool for_gather);
>>
>> 
>> This is the interface introduced by this series:
>>   void (*update_texture_surface)(struct brw_context *brw,
>>  const struct intel_mipmap_tree *mt,
>>  GLenum target, uint32_t effective_depth,
>>  uint32_t min_layer,
>>  uint32_t min_lod, uint32_t mip_count,
>>  uint32_t tex_format, int swizzle,
>>  uint32_t *surf_offset,
>>  bool for_gather);
>> 
>> AFAIK the only difference between your proposal and mine is the name
>> (IMHO emit_texture_surface_state is more consistent with the other
>> emit_*_surface_state hooks with similar semantics), the ordering of
>> arguments (and I find the ordering and naming of your "effective_depth",
>> "min_layer", "min_lod" and "mip_count" arguments rather asymmetric, they
>> are both pairs determining an interval of either layers or levels, it
>> doesn't make much sense to me that they are named and ordered
>> inconsistently in your series), the fact that you're using a min
>> level/layer index + count instead of half-open intervals like I did, and
>> the fact that you're missing an "rw" argument which is required for
>> ARB_shader_image_load_store support.
>> 
>> I fail to see why a revert is justified or desirable, and I fail to see
>> how your proposal will work better on Gen4, since the difference between
>> the two interfaces is mostly cosmetic.
>
> I'm just looking at the end result. Here we don't need to introduce a new
> entry to the jump table, the changes are kept to a minimum, and we both get
> an applicable interface. I didn't really intentionally choose between the
> interfaces - this was the outcome of trying to keep it as unintrusive as I
> could.

I've rebased your series on top of master.  In fact, the rebased version
has a lot less churn: two of your patches (PATCH 3 and 5) that were
re-applying changes you had previously reverted become empty, and the
diffstat goes down from +131/-195 to +89/-143.

There were a number of subtle differences between the two interfaces
that weren't at all obvious from looking at the "end result", and that I
only noticed while looking at the actual diff between master (without
reverts) and your branch, namely:

 - Your "mip_count" argument expects the number of mipmap levels minus
   one, instead of the actual number of mipmap levels (we already
   discussed this earlier today to some extent).

 - Your "min_lod" argument isn't the absolute starting mipmap level,
   instead it's relative to mt->first_level.  This could have bitten us
   in the future if some caller forgets to take this offset into
   acco
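
A minimal sketch of the two level-range conventions contrasted in this thread: an absolute half-open interval [min_level, max_level) versus a start relative to the first level plus a "number of levels minus one" count. The struct and function names are hypothetical and exist only for this example; the actual hooks take these values as separate arguments:

```c
#include <assert.h>

/* Hypothetical container for the relative convention described above. */
struct example_rel_levels {
   unsigned min_lod;             /* start level, relative to first_level */
   unsigned mip_count_minus_one; /* number of levels in the range, minus one */
};

/*
 * Convert an absolute half-open level interval [min_level, max_level)
 * into the relative convention. first_level plays the role of the
 * miptree's first level that the relative start is measured from.
 */
static struct example_rel_levels
example_to_relative(unsigned min_level, unsigned max_level,
                    unsigned first_level)
{
   struct example_rel_levels r;

   assert(min_level >= first_level); /* the offset must not underflow */
   assert(max_level > min_level);    /* half-open interval must be non-empty */

   r.min_lod = min_level - first_level;
   r.mip_count_minus_one = (max_level - min_level) - 1;
   return r;
}
```

Both the first_level offset and the minus-one bias are easy to drop at a call site, which is the kind of mistake the differences listed above could lead to.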

[Mesa-dev] [PATCH v2 0/6] Continue enabling OpenGL ES 3.1

2015-05-07 Thread Marta Lofstedt

Changes to my previous patch set according to comments
from Tapani Palli. This will only expose the enums
for the respective extensions to GLES 3.1 and GL core.
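
For readers unfamiliar with the "extra" requirement lists the patches add, here is a simplified model of how such a list could gate an enum. It assumes that an enum is allowed when any one entry in the list is satisfied, and that an empty list means no requirement. Every toy_* name is hypothetical, and this is not the code in get.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical requirement codes; the real EXTRA_* values differ. */
enum toy_extra {
   TOY_EXTRA_END = 0,
   TOY_EXTRA_API_ES31,
   TOY_EXTRA_EXT_ATOMIC_COUNTERS
};

/* Hypothetical context: just the two facts this sketch needs. */
struct toy_context {
   bool is_gles31;
   bool has_atomic_counters;
};

/* Mirrors the shape of the lists added by the patches. */
static const int toy_atomic_counters_es31[] = {
   TOY_EXTRA_EXT_ATOMIC_COUNTERS,
   TOY_EXTRA_API_ES31,
   TOY_EXTRA_END
};

/* Assumed semantics: allowed if ANY listed requirement holds. */
static bool toy_check_extra(const struct toy_context *ctx, const int *extra)
{
   bool checked = false;
   bool satisfied = false;

   assert(ctx != NULL && extra != NULL);

   for (const int *e = extra; *e != TOY_EXTRA_END; e++) {
      checked = true;
      switch (*e) {
      case TOY_EXTRA_API_ES31:
         if (ctx->is_gles31)
            satisfied = true;
         break;
      case TOY_EXTRA_EXT_ATOMIC_COUNTERS:
         if (ctx->has_atomic_counters)
            satisfied = true;
         break;
      default:
         break;
      }
   }

   /* An empty list places no requirement on the query. */
   return !checked || satisfied;
}
```

Under that assumption, a desktop context with the extension enabled and a GLES 3.1 context without it would both be allowed to query the enum, while a context with neither would be rejected.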

Marta Lofstedt (6):
  mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1
  mesa/es3.1: enable ARB_shader_atomic_counters for GLES 3.1
  mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1
  mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1
  mesa/es3.1: enable GL_ARB_compute_shader for GLES 3.1
  mesa/es3.1: enable GL_ARB_explicit_uniform_location for GLES 3.1

 src/mesa/main/get.c  | 36 
 src/mesa/main/get_hash_params.py | 88 
 2 files changed, 80 insertions(+), 44 deletions(-)

-- 
1.9.1


[Mesa-dev] [PATCH v2 2/6] mesa/es3.1: enable ARB_shader_atomic_counters for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt 

v2 :  only expose ARB_shader_atomic_counters enums
for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 23 +--
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 73739b6..f5318d5 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -361,6 +361,12 @@ static const int extra_ARB_shader_image_load_store_es31[] 
= {
EXTRA_END
 };
 
+static const int extra_ARB_shader_atomic_counters_es31[] = {
+   EXT(ARB_shader_atomic_counters),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 85c2494..f9bf749 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -421,6 +421,18 @@ descriptor=[
   [ "MAX_GEOMETRY_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
   [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
   [ "MAX_COMBINED_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
+# GL_ARB_shader_atomic_counters / GLES 3.1
+  [ "ATOMIC_COUNTER_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_ATOMIC_COUNTER_BUFFER_BINDINGS", 
"CONTEXT_INT(Const.MaxAtomicBufferBindings), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_ATOMIC_COUNTER_BUFFER_SIZE", "CONTEXT_INT(Const.MaxAtomicBufferSize), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_VERTEX_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_VERTEX_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_FRAGMENT_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_GEOMETRY_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.MaxCombinedAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31" ],
+  [ "MAX_COMBINED_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.MaxCombinedAtomicCounters), 
extra_ARB_shader_atomic_counters_es31" ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -771,18 +783,9 @@ descriptor=[
 # GL_ARB_separate_shader_objects
   [ "PROGRAM_PIPELINE_BINDING", "LOC_CUSTOM, TYPE_INT, 
GL_PROGRAM_PIPELINE_BINDING, NO_EXTRA" ],
 
-# GL_ARB_shader_atomic_counters
-  [ "ATOMIC_COUNTER_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_shader_atomic_counters" ],
-  [ "MAX_ATOMIC_COUNTER_BUFFER_BINDINGS", 
"CONTEXT_INT(Const.MaxAtomicBufferBindings), extra_ARB_shader_atomic_counters" 
],
-  [ "MAX_ATOMIC_COUNTER_BUFFER_SIZE", "CONTEXT_INT(Const.MaxAtomicBufferSize), 
extra_ARB_shader_atomic_counters" ],
-  [ "MAX_VERTEX_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters" ],
-  [ "MAX_VERTEX_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters" ],
-  [ "MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters" ],
-  [ "MAX_FRAGMENT_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters" ],
+# GL_ARB_shader_atomic_counters and geometry shaders
   [ "MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicBuffers), 
extra_ARB_shader_atomic_counters_and_geometry_shader" ],
   [ "MAX_GEOMETRY_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_and_geometry_shader" ],
-  [ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.MaxCombinedAtomicBuffers), extra_ARB_shader_atomic_counters" 
],
-  [ "MAX_COMBINED_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.MaxCombinedAtomicCounters), 
extra_ARB_shader_atomic_counters" ],
 
 # GL_ARB_vertex_attrib_binding
   [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", 
"CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ],
-- 
1.9.1


[Mesa-dev] [PATCH v2 6/6] mesa/es3.1: enable GL_ARB_explicit_uniform_location for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt 

v2 : only expose GL_ARB_explicit_uniform_location enums
for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/get.c  | 6 ++
 src/mesa/main/get_hash_params.py | 3 ++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 97d3bf0..6fc0f3f 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -385,6 +385,12 @@ static const int extra_ARB_compute_shader_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_explicit_uniform_location_es31[] = {
+   EXT(ARB_explicit_uniform_location),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 985f252..6b07888 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -454,6 +454,8 @@ descriptor=[
   [ "MAX_COMPUTE_SHARED_MEMORY_SIZE", "CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE), 
extra_ARB_compute_shader_es31" ],
   [ "MAX_COMPUTE_UNIFORM_COMPONENTS", "CONST(MAX_COMPUTE_UNIFORM_COMPONENTS), 
extra_ARB_compute_shader_es31" ],
   [ "MAX_COMPUTE_IMAGE_UNIFORMS", "CONST(MAX_COMPUTE_IMAGE_UNIFORMS), 
extra_ARB_compute_shader_es31" ],
+# GL_ARB_explicit_uniform_location / GLES 3.1
+  [ "MAX_UNIFORM_LOCATIONS", 
"CONTEXT_INT(Const.MaxUserAssignableUniformLocations), 
extra_ARB_explicit_uniform_location_es31" ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -539,7 +541,6 @@ descriptor=[
   [ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING), NO_EXTRA" ],
   [ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH), NO_EXTRA" ],
   [ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE), NO_EXTRA" ],
-  [ "MAX_UNIFORM_LOCATIONS", 
"CONTEXT_INT(Const.MaxUserAssignableUniformLocations), 
extra_ARB_explicit_uniform_location" ],
   [ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth), NO_EXTRA" ],
   [ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst), NO_EXTRA" ],
   [ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes), NO_EXTRA" ],
-- 
1.9.1


[Mesa-dev] [PATCH v2 3/6] mesa/es3.1: enable GL_ARB_texture_multisample for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt 

v2 : only expose GL_ARB_texture_multisample enums
for gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 17 -
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index f5318d5..dcf4f0a 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -367,6 +367,12 @@ static const int extra_ARB_shader_atomic_counters_es31[] = 
{
EXTRA_END
 };
 
+static const int extra_ARB_texture_multisample_es31[] = {
+   EXT(ARB_texture_multisample),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index f9bf749..10c32f2 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -433,6 +433,14 @@ descriptor=[
   [ "MAX_GEOMETRY_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxAtomicCounters), 
extra_ARB_shader_atomic_counters_es31" ],
   [ "MAX_COMBINED_ATOMIC_COUNTER_BUFFERS", 
"CONTEXT_INT(Const.MaxCombinedAtomicBuffers), 
extra_ARB_shader_atomic_counters_es31" ],
   [ "MAX_COMBINED_ATOMIC_COUNTERS", 
"CONTEXT_INT(Const.MaxCombinedAtomicCounters), 
extra_ARB_shader_atomic_counters_es31" ],
+# GL_ARB_texture_multisample / GLES 3.1
+  [ "TEXTURE_BINDING_2D_MULTISAMPLE", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_INDEX, extra_ARB_texture_multisample_es31" ],
+  [ "TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX, extra_ARB_texture_multisample_es31" ],
+  [ "MAX_COLOR_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxColorTextureSamples), 
extra_ARB_texture_multisample_es31" ],
+  [ "MAX_DEPTH_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxDepthTextureSamples), 
extra_ARB_texture_multisample_es31" ],
+  [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples), 
extra_ARB_texture_multisample_es31" ],
+  [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask), 
extra_ARB_texture_multisample_es31" ],
+  [ "MAX_SAMPLE_MASK_WORDS", "CONST(1), extra_ARB_texture_multisample_es31" ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -718,15 +726,6 @@ descriptor=[
   [ "TEXTURE_BUFFER_FORMAT_ARB", "LOC_CUSTOM, TYPE_INT, 0, 
extra_texture_buffer_object" ],
   [ "TEXTURE_BUFFER_ARB", "LOC_CUSTOM, TYPE_INT, 0, 
extra_texture_buffer_object" ],
 
-# GL_ARB_texture_multisample / GL 3.2
-  [ "TEXTURE_BINDING_2D_MULTISAMPLE", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_INDEX, extra_ARB_texture_multisample" ],
-  [ "TEXTURE_BINDING_2D_MULTISAMPLE_ARRAY", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_2D_MULTISAMPLE_ARRAY_INDEX, extra_ARB_texture_multisample" ],
-  [ "MAX_COLOR_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxColorTextureSamples), 
extra_ARB_texture_multisample" ],
-  [ "MAX_DEPTH_TEXTURE_SAMPLES", "CONTEXT_INT(Const.MaxDepthTextureSamples), 
extra_ARB_texture_multisample" ],
-  [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples), 
extra_ARB_texture_multisample" ],
-  [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask), 
extra_ARB_texture_multisample" ],
-  [ "MAX_SAMPLE_MASK_WORDS", "CONST(1), extra_ARB_texture_multisample" ],
-
 # GL 3.0
   [ "CONTEXT_FLAGS", "CONTEXT_INT(Const.ContextFlags), extra_version_30" ],
 
-- 
1.9.1


[Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt 

v2: only expose enums from GL_ARB_shader_image_load_store
for gles 3.1 and GL core

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 17 -
 2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 9898197..73739b6 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_shader_image_load_store_es31[] = {
+   EXT(ARB_shader_image_load_store),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 513d5d2..85c2494 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -413,6 +413,14 @@ descriptor=[
 { "apis": ["GL_CORE", "GLES3"], "params": [
 # GL_ARB_draw_indirect / GLES 3.1
   [ "DRAW_INDIRECT_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_draw_indirect_es31" ],
+# GL_ARB_shader_image_load_store / GLES 3.1
+  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
"CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_VERTEX_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -780,15 +788,6 @@ descriptor=[
   [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", 
"CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ],
   [ "MAX_VERTEX_ATTRIB_BINDINGS", 
"CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ],
 
-# GL_ARB_shader_image_load_store
-  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
"CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_VERTEX_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_and_geometry_shader"],
-  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store"],
-
 # GL_ARB_compute_shader
   [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
"CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader" ],
   [ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
extra_ARB_compute_shader" ],
-- 
1.9.1


[Mesa-dev] [PATCH v2 4/6] mesa/es3.1: enable GL_ARB_texture_gather for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt 

v2 : only expose GL_ARB_texture_gather enums for
gles 3.1 and GL core.

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/get.c  | 6 ++
 src/mesa/main/get_hash_params.py | 9 -
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index dcf4f0a..95868bf 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -373,6 +373,12 @@ static const int extra_ARB_texture_multisample_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_texture_gather_es31[] = {
+   EXT(ARB_texture_gather),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 10c32f2..50af078 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -441,6 +441,10 @@ descriptor=[
   [ "MAX_INTEGER_SAMPLES", "CONTEXT_INT(Const.MaxIntegerSamples), 
extra_ARB_texture_multisample_es31" ],
   [ "SAMPLE_MASK", "CONTEXT_BOOL(Multisample.SampleMask), 
extra_ARB_texture_multisample_es31" ],
   [ "MAX_SAMPLE_MASK_WORDS", "CONST(1), extra_ARB_texture_multisample_es31" ],
+# GL_ARB_texture_gather / GLES 3.1
+  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MinProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31"],
+  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MaxProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31"],
+  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", 
"CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather_es31"],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -774,11 +778,6 @@ descriptor=[
 # GL_ARB_texture_cube_map_array
   [ "TEXTURE_BINDING_CUBE_MAP_ARRAY_ARB", "LOC_CUSTOM, TYPE_INT, 
TEXTURE_CUBE_ARRAY_INDEX, extra_ARB_texture_cube_map_array" ],
 
-# GL_ARB_texture_gather
-  [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MinProgramTextureGatherOffset), extra_ARB_texture_gather"],
-  [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MaxProgramTextureGatherOffset), extra_ARB_texture_gather"],
-  [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", 
"CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather"],
-
 # GL_ARB_separate_shader_objects
   [ "PROGRAM_PIPELINE_BINDING", "LOC_CUSTOM, TYPE_INT, 
GL_PROGRAM_PIPELINE_BINDING, NO_EXTRA" ],
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
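
The extra_ARB_texture_gather_es31 array in the patch above lists two conditions, and the intent is that the queries become legal when either one holds. A minimal sketch of that any-of check, assuming simplified stand-in types (enum req, struct fake_ctx and requirements_met are hypothetical names, not Mesa's definitions):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Mesa's real definitions. */
enum req {
   REQ_END = 0,            /* terminates a requirement list */
   REQ_API_ES31,           /* satisfied on an ES 3.1 context */
   REQ_EXT_TEXTURE_GATHER  /* satisfied when the extension is enabled */
};

struct fake_ctx {
   bool is_gles31;
   bool has_texture_gather;
};

/* Mirrors the shape of extra_ARB_texture_gather_es31 in the patch. */
static const enum req texture_gather_es31[] = {
   REQ_EXT_TEXTURE_GATHER,
   REQ_API_ES31,
   REQ_END
};

/* Return true if at least one listed requirement holds.  An empty list
 * means the query is unrestricted.
 */
static bool
requirements_met(const struct fake_ctx *ctx, const enum req *reqs)
{
   bool checked = false;

   for (const enum req *r = reqs; *r != REQ_END; r++) {
      checked = true;
      switch (*r) {
      case REQ_API_ES31:
         if (ctx->is_gles31)
            return true;
         break;
      case REQ_EXT_TEXTURE_GATHER:
         if (ctx->has_texture_gather)
            return true;
         break;
      default:
         break;
      }
   }

   return !checked;
}
```

With this model an ES 3.1 context without the desktop extension still passes, which is the behaviour the _es31 variants are meant to provide.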

[Mesa-dev] [PATCH v2 5/6] mesa/es3.1: enable GL_ARB_compute_shader for GLES 3.1

2015-05-07 Thread Marta Lofstedt

From: Marta Lofstedt 

v2: Only expose the GL_ARB_compute_shader enums for
GLES 3.1 and GL core.

Signed-off-by: Marta Lofstedt 
---
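
The table entries moved by this patch come in two flavours: CONST(...) entries return a compile-time value, while CONTEXT_INT(...) entries read a field from the context. A minimal sketch of that distinction, assuming a made-up descriptor layout (struct param_desc, lookup_param and the value 12 are hypothetical, chosen only for illustration):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model; the names are illustrative only. */
struct fake_consts {
   int MaxComputeWorkGroupInvocations;
};

enum value_source { SRC_CONST, SRC_CONTEXT_INT };

struct param_desc {
   const char *name;
   enum value_source source;
   size_t offset_or_value; /* literal for SRC_CONST, byte offset otherwise */
};

#define FAKE_MAX_COMPUTE_UNIFORM_BLOCKS 12 /* made-up value */

static const struct param_desc params[] = {
   { "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", SRC_CONTEXT_INT,
     offsetof(struct fake_consts, MaxComputeWorkGroupInvocations) },
   { "MAX_COMPUTE_UNIFORM_BLOCKS", SRC_CONST,
     FAKE_MAX_COMPUTE_UNIFORM_BLOCKS },
};

/* Store the value for 'name' in *out and return 1, or return 0 if the
 * name is not in the table.
 */
static int
lookup_param(const struct fake_consts *c, const char *name, int *out)
{
   for (size_t i = 0; i < sizeof(params) / sizeof(params[0]); i++) {
      if (strcmp(params[i].name, name) != 0)
         continue;

      if (params[i].source == SRC_CONST)
         *out = (int) params[i].offset_or_value;
      else
         *out = *(const int *) ((const char *) c +
                                params[i].offset_or_value);
      return 1;
   }

   return 0;
}
```
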
 src/mesa/main/get.c  |  6 ++
 src/mesa/main/get_hash_params.py | 19 +--
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 95868bf..97d3bf0 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -379,6 +379,12 @@ static const int extra_ARB_texture_gather_es31[] = {
EXTRA_END
 };
 
+static const int extra_ARB_compute_shader_es31[] = {
+   EXT(ARB_compute_shader),
+   EXTRA_API_ES31,
+   EXTRA_END
+};
+
 EXTRA_EXT(ARB_texture_cube_map);
 EXTRA_EXT(EXT_texture_array);
 EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 50af078..985f252 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -445,6 +445,15 @@ descriptor=[
   [ "MIN_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MinProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31"],
   [ "MAX_PROGRAM_TEXTURE_GATHER_OFFSET", 
"CONTEXT_INT(Const.MaxProgramTextureGatherOffset), 
extra_ARB_texture_gather_es31"],
   [ "MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB", 
"CONTEXT_INT(Const.MaxProgramTextureGatherComponents), 
extra_ARB_texture_gather_es31"],
+# GL_ARB_compute_shader / GLES 3.1
+  [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
"CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), 
extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_TEXTURE_IMAGE_UNITS", 
"CONST(MAX_COMPUTE_TEXTURE_IMAGE_UNITS), extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS", 
"CONST(MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS), extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_ATOMIC_COUNTERS", "CONST(MAX_COMPUTE_ATOMIC_COUNTERS), 
extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_SHARED_MEMORY_SIZE", "CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE), 
extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_UNIFORM_COMPONENTS", "CONST(MAX_COMPUTE_UNIFORM_COMPONENTS), 
extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMPUTE_IMAGE_UNIFORMS", "CONST(MAX_COMPUTE_IMAGE_UNIFORMS), 
extra_ARB_compute_shader_es31" ],
 ]},
 
 # Remaining enums are only in OpenGL
@@ -789,16 +798,6 @@ descriptor=[
   [ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", 
"CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ],
   [ "MAX_VERTEX_ATTRIB_BINDINGS", 
"CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ],
 
-# GL_ARB_compute_shader
-  [ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
"CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_TEXTURE_IMAGE_UNITS", 
"CONST(MAX_COMPUTE_TEXTURE_IMAGE_UNITS), extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS", 
"CONST(MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS), extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_ATOMIC_COUNTERS", "CONST(MAX_COMPUTE_ATOMIC_COUNTERS), 
extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_SHARED_MEMORY_SIZE", "CONST(MAX_COMPUTE_SHARED_MEMORY_SIZE), 
extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_UNIFORM_COMPONENTS", "CONST(MAX_COMPUTE_UNIFORM_COMPONENTS), 
extra_ARB_compute_shader" ],
-  [ "MAX_COMPUTE_IMAGE_UNIFORMS", "CONST(MAX_COMPUTE_IMAGE_UNIFORMS), 
extra_ARB_compute_shader" ],
-
 # GL_ARB_gpu_shader5
   [ "MAX_GEOMETRY_SHADER_INVOCATIONS", 
"CONST(MAX_GEOMETRY_SHADER_INVOCATIONS), extra_ARB_gpu_shader5" ],
   [ "MIN_FRAGMENT_INTERPOLATION_OFFSET", 
"CONTEXT_FLOAT(Const.MinFragmentInterpolationOffset), extra_ARB_gpu_shader5" ],
-- 
1.9.1


[Mesa-dev] [PATCH 2/7] i965: Move tex miptree and format resolving into dispatcher

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen 

All hardware platforms have this in common, so do it in the
hardware-independent dispatcher.

v2 (Matt): Removed extra whitespace.

Reviewed-by: Matt Turner  (v1)
Reviewed-by: Kenneth Graunke  (v1)
Signed-off-by: Topi Pohjolainen 
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez 
---
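
The resolving step that moves into the dispatcher can be pictured as a small pure function: pick the miptree and surface format once, then hand both to the per-generation hook. A minimal sketch, assuming simplified stand-in types (everything prefixed fake_ and the resolve_surface helper are hypothetical and not part of the driver):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures. */
struct fake_miptree {
   int id;
   struct fake_miptree *stencil_mt; /* separate stencil tree, if any */
};

enum fake_format { FAKE_FMT_DEPTH24_STENCIL8, FAKE_FMT_R8_UINT };

struct fake_texture {
   struct fake_miptree *mt;
   enum fake_format format;
   bool stencil_sampling;  /* models obj->StencilSampling */
   bool is_depth_stencil;  /* models _BaseFormat == GL_DEPTH_STENCIL */
};

struct resolved_surface {
   const struct fake_miptree *mt;
   enum fake_format format;
};

/* Resolve the miptree and format once, before calling any
 * generation-specific code.  Stencil sampling of a depth/stencil texture
 * switches to the stencil miptree and an 8-bit integer format.
 */
static struct resolved_surface
resolve_surface(const struct fake_texture *tex)
{
   struct resolved_surface r = { tex->mt, tex->format };

   if (tex->stencil_sampling && tex->is_depth_stencil) {
      assert(tex->mt->stencil_mt != NULL);
      r.mt = tex->mt->stencil_mt;
      r.format = FAKE_FMT_R8_UINT;
   }

   return r;
}
```
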
 src/mesa/drivers/dri/i965/brw_context.h   |  4 +++-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 26 ---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 17 ++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 17 ---
 4 files changed, 31 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index a6282f4..d599ba8 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -984,7 +984,9 @@ struct brw_context
struct
{
   void (*update_texture_surface)(struct gl_context *ctx,
- unsigned unit,
+ struct intel_mipmap_tree *mt,
+ struct gl_texture_object *tObj,
+ uint32_t tex_format,
  uint32_t *surf_offset,
  bool for_gather);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 2b8040c..7ed7e18 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -309,23 +309,19 @@ update_buffer_texture_surface(struct gl_context *ctx,
 
 static void
 brw_update_texture_surface(struct gl_context *ctx,
-   unsigned unit,
+   struct intel_mipmap_tree *mt,
+   struct gl_texture_object *tObj,
+   uint32_t tex_format,
uint32_t *surf_offset,
bool for_gather)
 {
struct brw_context *brw = brw_context(ctx);
-   struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
struct intel_texture_object *intelObj = intel_texture_object(tObj);
-   struct intel_mipmap_tree *mt = intelObj->mt;
-   struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
uint32_t *surf;
 
surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
  6 * 4, 32, surf_offset);
 
-   uint32_t tex_format = translate_tex_format(brw, mt->format,
-  sampler->sRGBDecode);
-
if (for_gather) {
   /* Sandybridge's gather4 message is broken for integer formats.
* To work around this, we pretend the surface is UNORM for
@@ -801,7 +797,21 @@ update_texture_surface(struct gl_context *ctx,
if (obj->Target == GL_TEXTURE_BUFFER) {
   update_buffer_texture_surface(ctx, unit, surf_offset);
} else {
-  brw->vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
+  struct intel_texture_object *intel_obj = intel_texture_object(obj);
+  struct intel_mipmap_tree *mt = intel_obj->mt;
+  const struct gl_texture_image *firstImage = 
obj->Image[0][obj->BaseLevel];
+  const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, 
unit);
+  unsigned format = translate_tex_format(brw, intel_obj->_Format,
+ sampler->sRGBDecode);
+  if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
+ assert(brw->gen >= 8);
+ mt = mt->stencil_mt;
+ assert(mt->format == MESA_FORMAT_S_UINT8);
+ format = BRW_SURFACEFORMAT_R8_UINT;
+  }
+
+  brw->vtbl.update_texture_surface(ctx, mt, obj, format, surf_offset,
+   for_gather);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 098b5c8..7e3ee67 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -349,16 +349,14 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
 
 static void
 gen7_update_texture_surface(struct gl_context *ctx,
-unsigned unit,
+struct intel_mipmap_tree *mt,
+struct gl_texture_object *obj,
+uint32_t tex_format,
 uint32_t *surf_offset,
 bool for_gather)
 {
struct brw_context *brw = brw_context(ctx);
-   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
-
struct intel_texture_object *intel_obj = intel_texture_object(obj);
-   struct intel_mipmap_tree *mt = intel_obj->mt;
-   struct gl_sampler_object *sampler = _mesa_get_sampl

[Mesa-dev] [PATCH 7/7] i965: Drop the update_texture_surface vtbl hook.

2015-05-07 Thread Francisco Jerez

At this point the update_texture_surface and
emit_texture_surface_state hooks are almost equivalent; the only
significant difference is that emit_texture_surface_state supports
binding read-write surfaces.  The name of the latter is more
consistent with the other emit_something_surface_state hooks, so let's
keep it.
---
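
The read-write distinction mentioned above comes down to which write domain the relocation declares. A minimal sketch of that choice, assuming a placeholder domain constant (FAKE_DOMAIN_SAMPLER, struct reloc_domains and texture_reloc_domains are hypothetical names used only for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder value for illustration; not taken from the kernel headers. */
#define FAKE_DOMAIN_SAMPLER 0x4u

struct reloc_domains {
   uint32_t read_domains;
   uint32_t write_domain;
};

/* A sampled texture is always read through the sampler domain.  Only a
 * read-write binding also declares a write domain.
 */
static struct reloc_domains
texture_reloc_domains(bool rw)
{
   struct reloc_domains d = {
      FAKE_DOMAIN_SAMPLER,
      rw ? FAKE_DOMAIN_SAMPLER : 0
   };

   return d;
}
```

The texture path in this patch always passes rw = false, so its relocations stay read-only as before.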
 src/mesa/drivers/dri/i965/brw_context.h   | 10 --
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 37 ---
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 24 ++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 18 ---
 4 files changed, 23 insertions(+), 66 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 2eb4251..780edba 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -983,16 +983,6 @@ struct brw_context
 
struct
{
-  void (*update_texture_surface)(struct brw_context *brw,
- struct intel_mipmap_tree *mt,
- GLenum target,
- unsigned min_layer,
- unsigned max_layer,
- unsigned min_level,
- unsigned max_level,
- uint32_t tex_format, unsigned swizzle,
- uint32_t *surf_offset,
- bool for_gather);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
   struct gl_renderbuffer *rb,
   bool layered, unsigned unit,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index de4bdc5..870d699 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -308,16 +308,17 @@ update_buffer_texture_surface(struct gl_context *ctx,
 }
 
 static void
-brw_update_texture_surface(struct brw_context *brw,
-   struct intel_mipmap_tree *mt,
-   GLenum target,
-   unsigned min_layer /* unused */,
-   unsigned max_layer /* unused */,
-   unsigned min_level,
-   unsigned max_level,
-   uint32_t tex_format, unsigned swizzle /* unused */,
-   uint32_t *surf_offset,
-   bool for_gather)
+gen4_emit_texture_surface_state(struct brw_context *brw,
+struct intel_mipmap_tree *mt,
+GLenum target,
+unsigned min_layer /* unused */,
+unsigned max_layer /* unused */,
+unsigned min_level,
+unsigned max_level,
+unsigned tex_format,
+unsigned swizzle /* unused */,
+uint32_t *surf_offset,
+bool rw, bool for_gather)
 {
uint32_t *surf;
 
@@ -378,7 +379,8 @@ brw_update_texture_surface(struct brw_context *brw,
*surf_offset + 4,
mt->bo,
surf[1] - mt->bo->offset64,
-   I915_GEM_DOMAIN_SAMPLER, 0);
+   I915_GEM_DOMAIN_SAMPLER,
+   (rw ? I915_GEM_DOMAIN_SAMPLER : 0));
 }
 
 /**
@@ -834,11 +836,12 @@ update_texture_surface(struct gl_context *ctx,
*/
   assert(brw->gen >= 7 || obj->MinLevel == 0 || brw->meta_in_progress);
 
-  brw->vtbl.update_texture_surface(brw, mt, obj->Target,
-   obj->MinLayer, obj->MinLayer + depth,
-   obj->MinLevel + obj->BaseLevel,
-   obj->MinLevel + intel_obj->_MaxLevel + 
1,
-   format, swizzle, surf_offset, 
for_gather);
+  brw->vtbl.emit_texture_surface_state(
+ brw, mt, obj->Target,
+ obj->MinLayer, obj->MinLayer + depth,
+ obj->MinLevel + obj->BaseLevel,
+ obj->MinLevel + intel_obj->_MaxLevel + 1,
+ format, swizzle, surf_offset, false, for_gather);
}
 }
 
@@ -1071,8 +1074,8 @@ const struct brw_tracked_state brw_cs_abo_surfaces = {
 void
 gen4_init_vtable_surface_functions(struct brw_context *brw)
 {
-   brw->vtbl.update_texture_surface = brw_update_texture_surface;
brw->vtbl.update_renderbuffer_surface = brw_update_renderbuffer_surface;
brw->vtbl.emit_null_surface_state = brw_emit_null_surface_state;
+   brw->vtbl.emit_texture_surface_state = gen4_emit_texture_surface_state;
brw->vt

[Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen 

All generations do exactly the same dispatch, so it can be
done in the hardware-independent stage.

Reviewed-by: Matt Turner 
Reviewed-by: Kenneth Graunke 
Signed-off-by: Topi Pohjolainen 
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez 
---
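
The shape of the change is a single generation-independent function that handles buffer textures itself and only calls the per-generation hook for everything else. A minimal sketch of that dispatch, assuming a toy context (struct fake_ctx, the FAKE_ targets and the call counters are hypothetical and exist only to make the control flow observable):

```c
#include <stddef.h>

enum fake_target { FAKE_TEXTURE_2D, FAKE_TEXTURE_BUFFER };

struct fake_ctx;
typedef void (*texture_hook)(struct fake_ctx *ctx, unsigned unit);

/* Toy context: one target per unit plus a per-generation hook. */
struct fake_ctx {
   enum fake_target target[4];
   texture_hook update_texture_surface; /* generation-specific path */
   int buffer_calls;
   int hook_calls;
};

/* Shared by all generations, so it needs no vtable entry. */
static void
update_buffer_texture_surface(struct fake_ctx *ctx, unsigned unit)
{
   (void) unit;
   ctx->buffer_calls++;
}

/* Stand-in for a gen4/gen7/gen8 implementation of the hook. */
static void
gen_specific_update(struct fake_ctx *ctx, unsigned unit)
{
   (void) unit;
   ctx->hook_calls++;
}

/* Generation-independent dispatcher: the target check lives here once
 * instead of being repeated in every hook.
 */
static void
update_texture_surface(struct fake_ctx *ctx, unsigned unit)
{
   if (ctx->target[unit] == FAKE_TEXTURE_BUFFER)
      update_buffer_texture_surface(ctx, unit);
   else
      ctx->update_texture_surface(ctx, unit);
}
```
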
 src/mesa/drivers/dri/i965/brw_context.h   |  3 -
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 +++
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 ++
 4 files changed, 83 insertions(+), 87 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 2fcdcfa..a6282f4 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context *brw,
  uint32_t size,
  uint32_t *out_offset,
  bool dword_pitch);
-void brw_update_buffer_texture_surface(struct gl_context *ctx,
-   unsigned unit,
-   uint32_t *surf_offset);
 void
 brw_update_sol_surface(struct brw_context *brw,
struct gl_buffer_object *buffer_obj,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 160dd2f..2b8040c 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
}
 }
 
-void
-brw_update_buffer_texture_surface(struct gl_context *ctx,
-  unsigned unit,
-  uint32_t *surf_offset)
+static void
+update_buffer_texture_surface(struct gl_context *ctx,
+  unsigned unit,
+  uint32_t *surf_offset)
 {
struct brw_context *brw = brw_context(ctx);
struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
@@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
uint32_t *surf;
 
-   /* BRW_NEW_TEXTURE_BUFFER */
-   if (tObj->Target == GL_TEXTURE_BUFFER) {
-  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
-  return;
-   }
-
surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
  6 * 4, 32, surf_offset);
 
@@ -795,6 +789,21 @@ const struct brw_tracked_state gen6_renderbuffer_surfaces 
= {
.emit = update_renderbuffer_surfaces,
 };
 
+static void
+update_texture_surface(struct gl_context *ctx,
+   unsigned unit,
+   uint32_t *surf_offset,
+   bool for_gather)
+{
+   struct brw_context *brw = brw_context(ctx);
+   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
+
+   if (obj->Target == GL_TEXTURE_BUFFER) {
+  update_buffer_texture_surface(ctx, unit, surf_offset);
+   } else {
+  brw->vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
+   }
+}
 
 static void
 update_stage_texture_surfaces(struct brw_context *brw,
@@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
 
  /* _NEW_TEXTURE */
  if (ctx->Texture.Unit[unit]._Current) {
-brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s, 
for_gather);
+update_texture_surface(ctx, unit, surf_offset + s, for_gather);
  }
   }
}
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 15ab2b0..098b5c8 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -356,43 +356,38 @@ gen7_update_texture_surface(struct gl_context *ctx,
struct brw_context *brw = brw_context(ctx);
struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
 
-   if (obj->Target == GL_TEXTURE_BUFFER) {
-  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
-
-   } else {
-  struct intel_texture_object *intel_obj = intel_texture_object(obj);
-  struct intel_mipmap_tree *mt = intel_obj->mt;
-  struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
-  /* If this is a view with restricted NumLayers, then our effective depth
-   * is not just the miptree depth.
-   */
-  const unsigned depth = (obj->Immutable && obj->Target != GL_TEXTURE_3D ?
-  obj->NumLayers : mt->logical_depth0);
-
-  /* Handling GL_ALPHA as a surface format override breaks 1.30+ style
-   * texturing functions that return a float, as our code generation always
-   * selects the .x channel (which would always be 0).
-   */

[Mesa-dev] [PATCH 4/7] i965: Refactor effective depth calculation

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen 

Reviewed-by: Matt Turner 
Reviewed-by: Kenneth Graunke 
Signed-off-by: Topi Pohjolainen 
[ Francisco Jerez: Non-trivial rebase.  Pass a half-open interval of
  layers like emit_texture_surface_state does. ]
Reviewed-by: Francisco Jerez 
---
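
The effective depth calculation and the half-open layer interval can be written as one small helper. A minimal sketch, assuming plain parameters in place of the texture object and miptree fields (struct layer_range and effective_layers are hypothetical names):

```c
#include <stdbool.h>

/* Half-open interval [min_layer, max_layer). */
struct layer_range {
   unsigned min_layer;
   unsigned max_layer; /* exclusive */
};

/* For an immutable non-3D texture (a view), the layer count of the view
 * limits the depth.  Otherwise the full miptree depth applies.
 */
static struct layer_range
effective_layers(bool immutable, bool is_3d, unsigned min_layer,
                 unsigned num_layers, unsigned miptree_depth)
{
   const unsigned depth = (immutable && !is_3d) ? num_layers : miptree_depth;
   struct layer_range r = { min_layer, min_layer + depth };

   return r;
}
```

For example, a view starting at layer 2 with 3 layers yields the interval [2, 5), regardless of how deep the underlying miptree is.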
 src/mesa/drivers/dri/i965/brw_context.h   |  2 ++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 12 ++--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +++---
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  9 +++--
 4 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 9e85dd7..0e9ede9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -986,6 +986,8 @@ struct brw_context
   void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  struct gl_texture_object *tObj,
+ unsigned min_layer,
+ unsigned max_layer,
  uint32_t tex_format, unsigned swizzle,
  uint32_t *surf_offset,
  bool for_gather);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 3dddf89..92383e1 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -311,6 +311,8 @@ static void
 brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
struct gl_texture_object *tObj,
+   unsigned min_layer /* unused */,
+   unsigned max_layer /* unused */,
uint32_t tex_format, unsigned swizzle /* unused */,
uint32_t *surf_offset,
bool for_gather)
@@ -800,6 +802,11 @@ update_texture_surface(struct gl_context *ctx,
   struct intel_mipmap_tree *mt = intel_obj->mt;
   const struct gl_texture_image *firstImage = 
obj->Image[0][obj->BaseLevel];
   const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, 
unit);
+  /* If this is a view with restricted NumLayers, then our effective depth
+   * is not just the miptree depth.
+   */
+  const unsigned depth = (obj->Immutable && obj->Target != GL_TEXTURE_3D ?
+  obj->NumLayers : mt->logical_depth0);
 
   /* Handling GL_ALPHA as a surface format override breaks 1.30+ style
* texturing functions that return a float, as our code generation always
@@ -820,8 +827,9 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(brw, mt, obj, format, swizzle,
-   surf_offset, for_gather);
+  brw->vtbl.update_texture_surface(brw, mt, obj,
+   obj->MinLayer, obj->MinLayer + depth,
+   format, swizzle, surf_offset, 
for_gather);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 7576b20..9755236 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -351,22 +351,18 @@ static void
 gen7_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
 struct gl_texture_object *obj,
+unsigned min_layer,
+unsigned max_layer,
 uint32_t tex_format, unsigned swizzle,
 uint32_t *surf_offset,
 bool for_gather)
 {
struct intel_texture_object *intel_obj = intel_texture_object(obj);
-   /* If this is a view with restricted NumLayers, then our effective depth
-* is not just the miptree depth.
-*/
-   const unsigned depth = (obj->Immutable && obj->Target != GL_TEXTURE_3D ?
-   obj->NumLayers : mt->logical_depth0);
-
if (for_gather && tex_format == BRW_SURFACEFORMAT_R32G32_FLOAT)
   tex_format = BRW_SURFACEFORMAT_R32G32_FLOAT_LD;
 
gen7_emit_texture_surface_state(brw, mt, obj->Target,
-   obj->MinLayer, obj->MinLayer + depth,
+   min_layer, max_layer,
obj->MinLevel + obj->BaseLevel,
obj->MinLevel + intel_obj->_MaxLevel + 1,
tex_format, swizzle,
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/ge

[Mesa-dev] [PATCH 3/7] i965: Move texture swizzle resolving into dispatcher

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen 

Reviewed-by: Matt Turner 
Reviewed-by: Kenneth Graunke 
Signed-off-by: Topi Pohjolainen 
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez 
---
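
The swizzle selection that moves into the dispatcher is a two-way choice: depth textures in GL_ALPHA depth mode get the identity swizzle, everything else uses the swizzle computed for the texture. A minimal sketch, assuming stand-in enums and a locally defined swizzle encoding (all FAKE_ names and resolve_swizzle are hypothetical):

```c
#include <stdbool.h>

enum fake_depth_mode { FAKE_DEPTH_LUMINANCE, FAKE_DEPTH_ALPHA };
enum fake_base_format { FAKE_RGBA, FAKE_DEPTH_COMPONENT, FAKE_DEPTH_STENCIL };

/* Local encoding for illustration: three bits per channel selector. */
#define FAKE_SWIZZLE(x, y, z, w) \
   ((x) | ((y) << 3) | ((z) << 6) | ((w) << 9))
#define FAKE_SWIZZLE_XYZW FAKE_SWIZZLE(0u, 1u, 2u, 3u)

/* Use the identity swizzle for alpha-mode depth textures, so that the
 * value stays in the .x channel.  Otherwise keep the texture's swizzle.
 */
static unsigned
resolve_swizzle(enum fake_depth_mode mode, enum fake_base_format base,
                unsigned texture_swizzle)
{
   const bool alpha_depth = mode == FAKE_DEPTH_ALPHA &&
                            (base == FAKE_DEPTH_COMPONENT ||
                             base == FAKE_DEPTH_STENCIL);

   return alpha_depth ? FAKE_SWIZZLE_XYZW : texture_swizzle;
}
```
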
 src/mesa/drivers/dri/i965/brw_context.h   |  4 ++--
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 20 +++-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 16 ++--
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 16 ++--
 4 files changed, 21 insertions(+), 35 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index d599ba8..9e85dd7 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -983,10 +983,10 @@ struct brw_context
 
struct
{
-  void (*update_texture_surface)(struct gl_context *ctx,
+  void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  struct gl_texture_object *tObj,
- uint32_t tex_format,
+ uint32_t tex_format, unsigned swizzle,
  uint32_t *surf_offset,
  bool for_gather);
   uint32_t (*update_renderbuffer_surface)(struct brw_context *brw,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 7ed7e18..3dddf89 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -308,14 +308,13 @@ update_buffer_texture_surface(struct gl_context *ctx,
 }
 
 static void
-brw_update_texture_surface(struct gl_context *ctx,
+brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
struct gl_texture_object *tObj,
-   uint32_t tex_format,
+   uint32_t tex_format, unsigned swizzle /* unused */,
uint32_t *surf_offset,
bool for_gather)
 {
-   struct brw_context *brw = brw_context(ctx);
struct intel_texture_object *intelObj = intel_texture_object(tObj);
uint32_t *surf;
 
@@ -801,6 +800,17 @@ update_texture_surface(struct gl_context *ctx,
   struct intel_mipmap_tree *mt = intel_obj->mt;
   const struct gl_texture_image *firstImage = 
obj->Image[0][obj->BaseLevel];
   const struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, 
unit);
+
+  /* Handling GL_ALPHA as a surface format override breaks 1.30+ style
+   * texturing functions that return a float, as our code generation always
+   * selects the .x channel (which would always be 0).
+   */
+  const bool alpha_depth = obj->DepthMode == GL_ALPHA &&
+ (firstImage->_BaseFormat == GL_DEPTH_COMPONENT ||
+  firstImage->_BaseFormat == GL_DEPTH_STENCIL);
+  const unsigned swizzle = (unlikely(alpha_depth) ? SWIZZLE_XYZW :
+brw_get_texture_swizzle(&brw->ctx, obj));
+
   unsigned format = translate_tex_format(brw, intel_obj->_Format,
  sampler->sRGBDecode);
   if (obj->StencilSampling && firstImage->_BaseFormat == GL_DEPTH_STENCIL) 
{
@@ -810,8 +820,8 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(ctx, mt, obj, format, surf_offset,
-   for_gather);
+  brw->vtbl.update_texture_surface(brw, mt, obj, format, swizzle,
+   surf_offset, for_gather);
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 7e3ee67..7576b20 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -348,14 +348,13 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
 }
 
 static void
-gen7_update_texture_surface(struct gl_context *ctx,
+gen7_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
 struct gl_texture_object *obj,
-uint32_t tex_format,
+uint32_t tex_format, unsigned swizzle,
 uint32_t *surf_offset,
 bool for_gather)
 {
-   struct brw_context *brw = brw_context(ctx);
struct intel_texture_object *intel_obj = intel_texture_object(obj);
/* If this is a view with restricted NumLayers, then our effective depth
 * is not just the miptree depth.
@@ -363,17 +362,6 @@ gen7_update_texture_surface(struct gl_context *ctx,
const unsigned depth = (obj->Immutable && obj->Target != GL_TEXTURE_3D ?

[Mesa-dev] [PATCH 5/7] i965: Pass texture target as parameter for surface setup

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen 

Also changed a couple of direct shifts into SET_FIELD().

Reviewed-by: Matt Turner 
Reviewed-by: Kenneth Graunke 
Signed-off-by: Topi Pohjolainen 
[ Francisco Jerez: Non-trivial rebase. ]
Reviewed-by: Francisco Jerez 
---
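
The advantage of a field helper over a bare shift is that an out-of-range value is caught instead of silently spilling into neighbouring bits. A minimal sketch of that shift-and-mask idea, assuming a plain function and made-up field constants (set_field and the FAKE_ macros are hypothetical; this is not the driver's SET_FIELD definition):

```c
#include <assert.h>
#include <stdint.h>

/* Made-up 3-bit field in bits 31:29, for illustration only. */
#define FAKE_SURFACE_TYPE_SHIFT 29
#define FAKE_SURFACE_TYPE_MASK  (0x7u << FAKE_SURFACE_TYPE_SHIFT)

/* Shift 'value' into place and check that it fits inside 'mask'. */
static inline uint32_t
set_field(uint32_t value, unsigned shift, uint32_t mask)
{
   const uint32_t shifted = value << shift;

   assert((shifted & ~mask) == 0);
   return shifted & mask;
}
```

Writing set_field(type, FAKE_SURFACE_TYPE_SHIFT, FAKE_SURFACE_TYPE_MASK) in place of type << FAKE_SURFACE_TYPE_SHIFT gives the same bits for valid input and trips the assertion for invalid input in debug builds.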
 src/mesa/drivers/dri/i965/brw_context.h   |  1 +
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 12 ++--
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c |  4 ++--
 src/mesa/drivers/dri/i965/gen8_surface_state.c|  4 ++--
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 0e9ede9..6f08b06 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -986,6 +986,7 @@ struct brw_context
   void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
  struct gl_texture_object *tObj,
+ GLenum target,
  unsigned min_layer,
  unsigned max_layer,
  uint32_t tex_format, unsigned swizzle,
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 92383e1..fa4e36d 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -310,7 +310,7 @@ update_buffer_texture_surface(struct gl_context *ctx,
 static void
 brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
-   struct gl_texture_object *tObj,
+   struct gl_texture_object *tObj, GLenum target,
unsigned min_layer /* unused */,
unsigned max_layer /* unused */,
uint32_t tex_format, unsigned swizzle /* unused */,
@@ -352,10 +352,10 @@ brw_update_texture_surface(struct brw_context *brw,
   }
}
 
-   surf[0] = (translate_tex_target(tObj->Target) << BRW_SURFACE_TYPE_SHIFT |
- BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
- BRW_SURFACE_CUBEFACE_ENABLES |
- tex_format << BRW_SURFACE_FORMAT_SHIFT);
+   surf[0] = SET_FIELD(translate_tex_target(target), BRW_SURFACE_TYPE) |
+ BRW_SURFACE_MIPMAPLAYOUT_BELOW << BRW_SURFACE_MIPLAYOUT_SHIFT |
+ BRW_SURFACE_CUBEFACE_ENABLES |
+ tex_format << BRW_SURFACE_FORMAT_SHIFT;
 
surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
 
@@ -827,7 +827,7 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(brw, mt, obj,
+  brw->vtbl.update_texture_surface(brw, mt, obj, obj->Target,
obj->MinLayer, obj->MinLayer + depth,
format, swizzle, surf_offset, 
for_gather);
}
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
index 9755236..89dba40 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
@@ -350,7 +350,7 @@ gen7_emit_texture_surface_state(struct brw_context *brw,
 static void
 gen7_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
-struct gl_texture_object *obj,
+struct gl_texture_object *obj, GLenum target,
 unsigned min_layer,
 unsigned max_layer,
 uint32_t tex_format, unsigned swizzle,
@@ -361,7 +361,7 @@ gen7_update_texture_surface(struct brw_context *brw,
if (for_gather && tex_format == BRW_SURFACEFORMAT_R32G32_FLOAT)
   tex_format = BRW_SURFACEFORMAT_R32G32_FLOAT_LD;
 
-   gen7_emit_texture_surface_state(brw, mt, obj->Target,
+   gen7_emit_texture_surface_state(brw, mt, target,
min_layer, max_layer,
obj->MinLevel + obj->BaseLevel,
obj->MinLevel + intel_obj->_MaxLevel + 1,
diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 580c1a3..9858f5f 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -249,7 +249,7 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
 static void
 gen8_update_texture_surface(struct brw_context *brw,
 struct intel_mipmap_tree *mt,
-struct gl_texture_object *obj,
+struct gl_texture_object *obj, GLenum target,

Re: [Mesa-dev] [PATCH 07/27] i965: Enable gather push constants

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:04PM +0300, Abdiel Janulgue wrote:
> The 3DSTATE_GATHER_POOL_ALLOC is used to enable or disable the gather
> push constants feature within a context. This patch provides the toggle
> functionality of using gather push constants to program constant data
> within a batch.
> 
> Using gather push constants requires that a gather pool be allocated so
> that the resource streamer can flush the packed constants it gathered.
> The pool is later referenced by the 3DSTATE_CONSTANT_* command to
> program the push constant data.
> 
> Also introduce INTEL_UBO_GATHER to select which shader stages
> use gather constants for UBO fetches.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_binding_tables.c | 43 +-
>  src/mesa/drivers/dri/i965/brw_context.c| 37 ++
>  src/mesa/drivers/dri/i965/brw_context.h| 10 ++
>  src/mesa/drivers/dri/i965/brw_state.h  |  1 +
>  4 files changed, 90 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
> b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> index c1d188e..4793fbc 100644
> --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> @@ -236,9 +236,47 @@ gen7_update_binding_table_from_array(struct brw_context 
> *brw,
> ADVANCE_BATCH();
>  }
>  
> +static void
> +gen7_init_gather_pool(struct brw_context *brw)
> +{
> +   if (!brw->has_resource_streamer)
> +  return;
> +
> +   if (!brw->gather_pool.bo) {
> +  brw->gather_pool.bo = drm_intel_bo_alloc(brw->bufmgr, "gather_pool",
> +   brw->gather_pool.size, 4096);
> +  brw->gather_pool.next_offset = 0;
> +   }
> +}
> +
> +void
> +gen7_toggle_gather_constants(struct brw_context *brw, bool enable)
> +{
> +   if (enable && !brw->has_resource_streamer)
> +  return;
> +
> +   uint32_t dw1 = brw->is_haswell ? HSW_GATHER_CONSTANTS_RESERVED : 0;
> +
> +   BEGIN_BATCH(3);
> +   OUT_BATCH(_3DSTATE_GATHER_POOL_ALLOC << 16 | (3 - 2));
> +   if (enable) {
> +  dw1 |= SET_FIELD(BRW_GATHER_CONSTANTS_ON, BRW_GATHER_CONSTANTS_ENABLE) 
> |
> + (brw->is_haswell ? GEN7_MOCS_L3 : 0);

This should align with the previous line.

> +  OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0, dw1);
> +  OUT_RELOC(brw->gather_pool.bo, I915_GEM_DOMAIN_SAMPLER, 0,
> +brw->gather_pool.bo->size);
> +   } else {
> +  OUT_BATCH(dw1);
> +  OUT_BATCH(0);
> +   }
> +   ADVANCE_BATCH();
> +}
> +
>  void
>  gen7_disable_hw_binding_tables(struct brw_context *brw)
>  {
> +   gen7_toggle_gather_constants(brw, false);
> +
> BEGIN_BATCH(3);
> OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
> OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, 
> BRW_HW_BINDING_TABLE_ENABLE) |
> @@ -280,6 +318,9 @@ gen7_enable_hw_binding_tables(struct brw_context *brw)
>   brw->hw_bt_pool.bo->size);
> ADVANCE_BATCH();
>  
> +   gen7_init_gather_pool(brw);
> +   gen7_toggle_gather_constants(brw, true);
> +
> /* Pipe control workaround */
> brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
>  }
> @@ -288,6 +329,7 @@ void
>  gen7_reset_rs_pool_offsets(struct brw_context *brw)
>  {
> brw->hw_bt_pool.next_offset = HW_BT_START_OFFSET;
> +   brw->gather_pool.next_offset = 0;
>  }
>  
>  const struct brw_tracked_state gen7_hw_binding_tables = {
> @@ -371,5 +413,4 @@ const struct brw_tracked_state 
> gen6_binding_table_pointers = {
> },
> .emit = gen6_upload_binding_table_pointers,
>  };
> -

Not related to this patch.

>  /** @} */
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 9c7ccae..685ca70 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -67,6 +67,7 @@
>  #include "tnl/tnl.h"
>  #include "tnl/t_pipeline.h"
>  #include "util/ralloc.h"
> +#include "util/u_atomic.h"
>  
>  #include "glsl/nir/nir.h"
>  
> @@ -692,6 +693,25 @@ brw_get_revision(int fd)
> return revision;
>  }
>  
> +static void
> +brw_process_intel_gather_variable(struct brw_context *brw)
> +{
> +   uint64_t INTEL_UBO_GATHER = 0;
> +
> +   static const struct dri_debug_control gather_control[] = {
> +  { "vs", (1 << MESA_SHADER_VERTEX)},
> +  { "gs", (1 << MESA_SHADER_GEOMETRY)},
> +  { "fs", (1 << MESA_SHADER_FRAGMENT)},

You can drop the outermost ().

> +  { NULL, 0 }
> +   };
> +   uint64_t intel_ubo_gather = driParseDebugString(getenv("INTEL_UBO_GATHER"), gather_control);

Wrap this onto the next line; it overflows 80 columns.

> +   (void) p_atomic_cmpxchg(&INTEL_UBO_GATHER, 0, intel_ubo_gather);
> +
> +   brw->vs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_VERTEX));
> +   brw->gs_ubo_gather = (INTEL_UBO_GATHER & (1 << MESA_SHADER_GEOMETRY));
> +   brw->fs_ubo_gather = (INTEL_UBO_GATHER & (1

[Mesa-dev] [PATCH 6/7] i965: Pass slice details as parameters for surface setup

2015-05-07 Thread Francisco Jerez

From: Topi Pohjolainen 

Also changed a couple of direct shifts into SET_FIELD().

Fixes: arb_copy_image-formats -auto -fbo on ILK. In principle,
minimum level settings are only for TextureView to use. We,
however, also take advantage of that internally when blitting.
Before this patch this wasn't taken into account for ILK in the
surface setup.
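
As a rough illustration of the two ideas in this commit (SET_FIELD() instead
of open-coded shifts, and a half-open [min_level, max_level) range), here is a
minimal sketch. The EXAMPLE_* field layout and the pack_lod_and_width() helper
are hypothetical and chosen only for this example; they are not the real
surface-state bit positions. SET_FIELD() is written out locally under the
assumption that it shifts the value into place and masks it to the field.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only: a 4-bit LOD field at
 * bits 2..5 and a 13-bit width field at bits 6..18.
 */
#define EXAMPLE_LOD_SHIFT    2
#define EXAMPLE_LOD_MASK     (0xfu << EXAMPLE_LOD_SHIFT)
#define EXAMPLE_WIDTH_SHIFT  6
#define EXAMPLE_WIDTH_MASK   (0x1fffu << EXAMPLE_WIDTH_SHIFT)

/* Assumed definition: shift the value into place, then mask it so that an
 * out-of-range value cannot spill into neighbouring fields.
 */
#define SET_FIELD(value, field) \
   (((uint32_t)(value) << field ## _SHIFT) & field ## _MASK)

/* Hypothetical helper packing a level range and a width into one dword. */
static uint32_t
pack_lod_and_width(unsigned min_level, unsigned max_level, unsigned width)
{
   /* [min_level, max_level) is half-open, so the highest LOD relative to
    * min_level is (number of levels - 1).
    */
   return SET_FIELD(max_level - min_level - 1, EXAMPLE_LOD) |
          SET_FIELD(width - 1, EXAMPLE_WIDTH);
}
```

With an open-coded shift, a value wider than the field silently corrupts the
adjacent bits; the masked form at least keeps the damage inside the field.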

v2:
   - Removed extra whitespace and switched tabs to spaces (Matt)
   - Added assertion on minimum level (Ken).

v3 (Curro): Reorder min_layer and effective_depth

Reviewed-by: Matt Turner  (v1)
Reviewed-by: Kenneth Graunke  (v1)
Signed-off-by: Topi Pohjolainen 
[ Francisco Jerez: Non-trivial rebase.  Pass a half-open interval of
  levels like emit_texture_surface_state does. ]
Reviewed-by: Francisco Jerez 
---
 src/mesa/drivers/dri/i965/brw_context.h   |  3 ++-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 +++
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 10 +++-
 src/mesa/drivers/dri/i965/gen8_surface_state.c| 11 +++-
 4 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h
index 6f08b06..2eb4251 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -985,10 +985,11 @@ struct brw_context
{
   void (*update_texture_surface)(struct brw_context *brw,
  struct intel_mipmap_tree *mt,
- struct gl_texture_object *tObj,
  GLenum target,
  unsigned min_layer,
  unsigned max_layer,
+ unsigned min_level,
+ unsigned max_level,
  uint32_t tex_format, unsigned swizzle,
  uint32_t *surf_offset,
  bool for_gather);
diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index fa4e36d..de4bdc5 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -310,14 +310,15 @@ update_buffer_texture_surface(struct gl_context *ctx,
 static void
 brw_update_texture_surface(struct brw_context *brw,
struct intel_mipmap_tree *mt,
-   struct gl_texture_object *tObj, GLenum target,
+   GLenum target,
unsigned min_layer /* unused */,
unsigned max_layer /* unused */,
+   unsigned min_level,
+   unsigned max_level,
uint32_t tex_format, unsigned swizzle /* unused */,
uint32_t *surf_offset,
bool for_gather)
 {
-   struct intel_texture_object *intelObj = intel_texture_object(tObj);
uint32_t *surf;
 
surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
@@ -359,16 +360,16 @@ brw_update_texture_surface(struct brw_context *brw,
 
surf[1] = mt->bo->offset64 + mt->offset; /* reloc */
 
-   surf[2] = ((intelObj->_MaxLevel - tObj->BaseLevel) << BRW_SURFACE_LOD_SHIFT |
- (mt->logical_width0 - 1) << BRW_SURFACE_WIDTH_SHIFT |
- (mt->logical_height0 - 1) << BRW_SURFACE_HEIGHT_SHIFT);
+   surf[2] = SET_FIELD(max_level - min_level - 1, BRW_SURFACE_LOD) |
+ SET_FIELD(mt->logical_width0 - 1, BRW_SURFACE_WIDTH) |
+ SET_FIELD(mt->logical_height0 - 1, BRW_SURFACE_HEIGHT);
 
-   surf[3] = (brw_get_surface_tiling_bits(mt->tiling) |
- (mt->logical_depth0 - 1) << BRW_SURFACE_DEPTH_SHIFT |
- (mt->pitch - 1) << BRW_SURFACE_PITCH_SHIFT);
+   surf[3] = brw_get_surface_tiling_bits(mt->tiling) |
+ SET_FIELD(mt->logical_depth0 - 1, BRW_SURFACE_DEPTH) |
+ SET_FIELD(mt->pitch - 1, BRW_SURFACE_PITCH);
 
-   surf[4] = (brw_get_surface_num_multisamples(mt->num_samples) |
-  SET_FIELD(tObj->BaseLevel - mt->first_level, BRW_SURFACE_MIN_LOD));
+   surf[4] = brw_get_surface_num_multisamples(mt->num_samples) |
+ SET_FIELD(min_level - mt->first_level, BRW_SURFACE_MIN_LOD);
 
surf[5] = mt->align_h == 4 ? BRW_SURFACE_VERTICAL_ALIGN_ENABLE : 0;
 
@@ -827,8 +828,16 @@ update_texture_surface(struct gl_context *ctx,
  format = BRW_SURFACEFORMAT_R8_UINT;
   }
 
-  brw->vtbl.update_texture_surface(brw, mt, obj, obj->Target,
+  /* Minimum level is only supported for TextureView but internally it is
+   * also taken advantage of by meta blit path. The former is only enabled
+   * from gen7 onwards.
+   */
+  assert(brw->gen >= 7 || obj->MinLevel == 0 || brw->meta_in_progress);
+
+  brw->vtbl.update_texture_surface(brw, mt, obj->Target,

Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 5:52 AM, Pohjolainen, Topi
 wrote:
> On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
>> v2: Get rid of the block parameter and make src a const reference
>>
>> Reviewed-by: Topi Pohjolainen 
>> Reviewed-by: Matt Turner 
>> Reviewed-by: Kenneth Graunke 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 
>> 
>>  1 file changed, 38 insertions(+), 37 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>> index 43370cb..9c4ed0b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>> @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool *negate)
>>operands_match(a, b, negate);
>>  }
>>
>> +static fs_inst *
>> +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)
>
> Did you mean 'src' to be constant reference? It is only used for reading
> so it could be - you claim this in the commit message yourself :)

Oops...  I think what happened is that I tried to do it for
is_copy_payload not create_copy_instr.  But then is_copy_payload does
actually change it so I put it back and somehow my brain leaked it
into the commit message.  Unfortunately, it's already pushed so I
can't change it now.  However, I could make a fixup if you'd like.
--Jason

>> +{
>> +   int written = inst->regs_written;
>> +   int dst_width = inst->dst.width / 8;
>> +   fs_reg dst = inst->dst;
>> +   fs_inst *copy;
>> +
>> +   if (written > dst_width) {
>> +  fs_reg *sources = ralloc_array(v->mem_ctx, fs_reg, written / 
>> dst_width);
>> +  for (int i = 0; i < written / dst_width; i++)
>> + sources[i] = offset(src, i);
>> +  copy = v->LOAD_PAYLOAD(dst, sources, written / dst_width);
>> +   } else {
>> +  copy = v->MOV(dst, src);
>> +  copy->force_writemask_all = inst->force_writemask_all;
>> +  copy->src[0].negate = negate;
>> +   }
>> +   assert(copy->regs_written == written);
>> +
>> +   return copy;
>> +}
>> +
>>  bool
>>  fs_visitor::opt_cse_local(bblock_t *block)
>>  {
>> @@ -230,49 +253,27 @@ fs_visitor::opt_cse_local(bblock_t *block)
>>  bool no_existing_temp = entry->tmp.file == BAD_FILE;
>>  if (no_existing_temp && !entry->generator->dst.is_null()) {
>> int written = entry->generator->regs_written;
>> -   int dst_width = entry->generator->dst.width / 8;
>> -   assert(written % dst_width == 0);
>> -
>> -   fs_reg orig_dst = entry->generator->dst;
>> -   fs_reg tmp = fs_reg(GRF, alloc.allocate(written),
>> -   orig_dst.type, orig_dst.width);
>> -   entry->tmp = tmp;
>> -   entry->generator->dst = tmp;
>> -
>> -   fs_inst *copy;
>> -   if (written > dst_width) {
>> -  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / 
>> dst_width);
>> -  for (int i = 0; i < written / dst_width; i++)
>> - sources[i] = offset(tmp, i);
>> -  copy = LOAD_PAYLOAD(orig_dst, sources, written / 
>> dst_width);
>> -   } else {
>> -  copy = MOV(orig_dst, tmp);
>> -  copy->force_writemask_all =
>> - entry->generator->force_writemask_all;
>> -   }
>> +   assert((written * 8) % entry->generator->dst.width == 0);
>> +
>> +   entry->tmp = fs_reg(GRF, alloc.allocate(written),
>> +   entry->generator->dst.type,
>> +   entry->generator->dst.width);
>> +
>> +   fs_inst *copy = create_copy_instr(this, entry->generator,
>> + entry->tmp, false);
>> entry->generator->insert_after(block, copy);
>> +
>> +   entry->generator->dst = entry->tmp;
>>  }
>>
>>  /* dest <- temp */
>>  if (!inst->dst.is_null()) {
>> -   int written = inst->regs_written;
>> -   int dst_width = inst->dst.width / 8;
>> -   assert(written == entry->generator->regs_written);
>> -   assert(dst_width == entry->generator->dst.width / 8);
>> +   assert(inst->regs_written == entry->generator->regs_written);
>> +   assert(inst->dst.width == entry->generator->dst.width);
>> assert(inst->dst.type == entry->tmp.type);
>> -   fs_reg dst = inst->dst;
>> -   fs_reg tmp = entry->tmp;
>> -   fs_inst *copy;
>> -   if (written > dst_width) {
>> -  fs_reg *sources = ralloc_array(mem_ctx, fs_reg, written / 
>> dst_width);
>> -  for (int i = 0; i < written / dst_width; i++)
>> - sources[i] = offset(tmp, i);
>> -  copy = LOAD_PAYLOAD(ds

Re: [Mesa-dev] [PATCH 1/5] prog_to_nir: OPCODE_EXP is not nir_op_fexp

2015-05-07 Thread Jason Ekstrand

On Wed, May 6, 2015 at 7:29 PM, Matt Turner  wrote:
> On Wed, May 6, 2015 at 7:09 PM, Ian Romanick  wrote:
>> From: Ian Romanick 
>>
>> It's a weird thing that provides some values related to 2**x.  It's also
>> already handled by a case in the switch.
>>
>> Signed-off-by: Ian Romanick 
>
> The series is
>
> Reviewed-by: Matt Turner 

I was going to complain about you making my SPIR-V -> NIR translator
harder to write.  But, based on the discussion by Ken and Ilia on IRC,
it looks like basically no one's hardware does a base-e log.  I'll
just lower on-the-fly.  I guess maybe we could do it with pow(x, e)
but meh.  If you'd like, the series is

Acked-by: Jason Ekstrand 

I can't say I read it enough to call it a review but I glanced through
it and it seems ok.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 03/15] i965/fs_cse: Factor out code to create copy instructions

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 07:26:12AM -0700, Jason Ekstrand wrote:
> On Thu, May 7, 2015 at 5:52 AM, Pohjolainen, Topi
>  wrote:
> > On Tue, May 05, 2015 at 06:28:06PM -0700, Jason Ekstrand wrote:
> >> v2: Get rid of the block parameter and make src a const reference
> >>
> >> Reviewed-by: Topi Pohjolainen 
> >> Reviewed-by: Matt Turner 
> >> Reviewed-by: Kenneth Graunke 
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 75 
> >> 
> >>  1 file changed, 38 insertions(+), 37 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
> >> b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> >> index 43370cb..9c4ed0b 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> >> @@ -185,6 +185,29 @@ instructions_match(fs_inst *a, fs_inst *b, bool 
> >> *negate)
> >>operands_match(a, b, negate);
> >>  }
> >>
> >> +static fs_inst *
> >> +create_copy_instr(fs_visitor *v, fs_inst *inst, fs_reg src, bool negate)
> >
> > Did you mean 'src' to be constant reference? It is only used for reading
> > so it could be - you claim this in the commit message yourself :)
> 
> Oops...  I think what happened is that I tried to do it for
> is_copy_payload not create_copy_instr.  But then is_copy_payload does
> actually change it so I put it back and somehow my brain leaked it
> into the commit message.  Unfortunately, it's already pushed so I
> can't change it now.  However, I could make a fixup if you'd like.
> --Jason

No big deal really, I'm sure the compiler handles that for us anyway.

[Mesa-dev] [PATCH] docs: document the LIBGL_DRI3_DISABLE environment variable

2015-05-07 Thread Martin Peres

Suggested-by: Axel Davy 
Signed-off-by: Martin Peres 
---
 docs/envvars.html | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/envvars.html b/docs/envvars.html
index 31d14a4..c0d5a51 100644
--- a/docs/envvars.html
+++ b/docs/envvars.html
@@ -34,6 +34,7 @@ sometimes be useful for debugging end-user issues.
 LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for 
debugging)
 LIBGL_SHOW_FPS - print framerate to stdout based on the number of 
glXSwapBuffers
 calls per second.
+LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
 
 
 
-- 
2.4.0


Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote:
> From: Topi Pohjolainen 
> 
> All generations do the same exact dispatch and it could be
> therefore done in the hardware independent stage.
> 
> Reviewed-by: Matt Turner 
> Reviewed-by: Kenneth Graunke 
> Signed-off-by: Topi Pohjolainen 
> [ Francisco Jerez: Non-trivial rebase. ]
> Reviewed-by: Francisco Jerez 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h   |  3 -
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
>  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 
> +++
>  src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 ++
>  4 files changed, 83 insertions(+), 87 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 2fcdcfa..a6282f4 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context 
> *brw,
>   uint32_t size,
>   uint32_t *out_offset,
>   bool dword_pitch);
> -void brw_update_buffer_texture_surface(struct gl_context *ctx,
> -   unsigned unit,
> -   uint32_t *surf_offset);
>  void
>  brw_update_sol_surface(struct brw_context *brw,
> struct gl_buffer_object *buffer_obj,
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 160dd2f..2b8040c 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
> }
>  }
>  
> -void
> -brw_update_buffer_texture_surface(struct gl_context *ctx,
> -  unsigned unit,
> -  uint32_t *surf_offset)
> +static void
> +update_buffer_texture_surface(struct gl_context *ctx,
> +  unsigned unit,
> +  uint32_t *surf_offset)
>  {
> struct brw_context *brw = brw_context(ctx);
> struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
> @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
> struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
> uint32_t *surf;
>  
> -   /* BRW_NEW_TEXTURE_BUFFER */
> -   if (tObj->Target == GL_TEXTURE_BUFFER) {
> -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
> -  return;
> -   }
> -
> surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> 6 * 4, 32, surf_offset);
>  
> @@ -795,6 +789,21 @@ const struct brw_tracked_state 
> gen6_renderbuffer_surfaces = {
> .emit = update_renderbuffer_surfaces,
>  };
>  
> +static void
> +update_texture_surface(struct gl_context *ctx,
> +   unsigned unit,
> +   uint32_t *surf_offset,
> +   bool for_gather)
> +{
> +   struct brw_context *brw = brw_context(ctx);
> +   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
> +
> +   if (obj->Target == GL_TEXTURE_BUFFER) {
> +  update_buffer_texture_surface(ctx, unit, surf_offset);

In order to avoid an extra level of indentation I used the following. I would
have preferred it here as well.

  if (obj->Target == GL_TEXTURE_BUFFER) {
 update_buffer_texture_surface(ctx, unit, surf_offset);
 return;
  }

> +   } else {
> +  brw->vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
> +   }
> +}
>  
>  static void
>  update_stage_texture_surfaces(struct brw_context *brw,
> @@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
>  
>   /* _NEW_TEXTURE */
>   if (ctx->Texture.Unit[unit]._Current) {
> -brw->vtbl.update_texture_surface(ctx, unit, surf_offset + s, 
> for_gather);
> +update_texture_surface(ctx, unit, surf_offset + s, for_gather);
>   }
>}
> }
> diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> index 15ab2b0..098b5c8 100644
> --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
> @@ -356,43 +356,38 @@ gen7_update_texture_surface(struct gl_context *ctx,
> struct brw_context *brw = brw_context(ctx);
> struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
>  
> -   if (obj->Target == GL_TEXTURE_BUFFER) {
> -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
> -
> -   } else {
> -  struct intel_texture_object *intel_obj = intel_texture_object(obj);
> -  struct intel_mipmap_tree *mt = intel_obj->mt;
> -  struct gl_sampler_object *sampler = _mes

Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 04:43:21PM +0300, Pohjolainen, Topi wrote:
> On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote:
> > This patch implements the binding table enable command which is also
> > used to allocate a binding table pool where hardware-generated
> > binding table entries are flushed into. Each binding table offset in
> > the binding table pool is unique per each shader stage that are
> > enabled within a batch.
> > 
> > Also insert the required brw_tracked_state objects to enable
> > hw-generated binding tables in normal render path.
> > 
> > Signed-off-by: Abdiel Janulgue 
> > ---
> >  src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 
> > ++
> >  src/mesa/drivers/dri/i965/brw_context.c|  4 ++
> >  src/mesa/drivers/dri/i965/brw_context.h|  5 ++
> >  src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
> >  src/mesa/drivers/dri/i965/brw_state_upload.c   |  2 +
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
> >  6 files changed, 92 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
> > b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > index 459165a..a58e32e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > @@ -44,6 +44,11 @@
> >  #include "brw_state.h"
> >  #include "intel_batchbuffer.h"
> >  
> > +/* Somehow the hw-binding table pool offset must start here, otherwise
> > + * the GPU will hang
> > + */
> > +#define HW_BT_START_OFFSET 256;
> 
> I think we want to understand this a little better before enabling...
> 
> > +
> >  /**
> >   * Upload a shader stage's binding table as indirect state.
> >   *
> > @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = {
> > .emit = brw_gs_upload_binding_table,
> >  };
> >  
> > +/**
> > + * Hardware-generated binding tables for the resource streamer
> > + */
> > +void
> > +gen7_disable_hw_binding_tables(struct brw_context *brw)
> > +{
> > +   BEGIN_BATCH(3);
> > +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
> > +   OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, 
> > BRW_HW_BINDING_TABLE_ENABLE) |
> > + brw->is_haswell ? HSW_HW_BINDING_TABLE_RESERVED : 0);
> > +   OUT_BATCH(0);
> > +   ADVANCE_BATCH();
> > +
> > +   /* Pipe control workaround */
> > +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
> > +}
> > +
> > +void
> > +gen7_enable_hw_binding_tables(struct brw_context *brw)
> > +{
> > +   if (!brw->has_resource_streamer) {
> > +  gen7_disable_hw_binding_tables(brw);
> 
> I started wondering why we really need this - RS is disabled by default and
> we haven't needed to do anything to disable it before.

Right, patch number eight gave me the answer: we want to disable it for blorp.

Re: [Mesa-dev] [PATCH 26/27] i965: Disable gather push constants for null constants

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:23PM +0300, Abdiel Janulgue wrote:
> Programming null constants with gather constant tables seems to
> be unsupported and results in a GPU lockup even with the prescribed
> GPU workarounds in the bspec. Found out by trial and error that
> disabling HW gather constant when the constant state for a stage
> needs to be nullified is the only way to go around the issue.

Just a general question. We keep the resource streamer itself always enabled
(except for blorp, of course). Does it still do something meaningful without
gather constants, or should we disable them both?

Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
> Reserve space in the gather pool where the gathered uniforms are flushed.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
> b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> index 35d10ef..aebaa49 100644
> --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
> */
>assert(stage_state->push_const_size <= 32);
> }
> +   /* Allocate gather pool space for uniform and UBO entries in 512-bit 
> chunks*/
> +   if (brw->gather_pool.bo != NULL) {
> +  if (prog_data->nr_params > 0) {

I guess you could combine these conditions:

  if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)

Or even bail out early:

  if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
 return;

> + int num_consts = ALIGN(prog_data->nr_params, 4) / 4;

This could be const, no big deal though.

> + stage_state->push_const_offset = brw->gather_pool.next_offset;
> + brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
> +  }
> +   }
>  }
>  
>  static void
> -- 
> 1.9.1
> 

Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 05:52:12PM +0300, Pohjolainen, Topi wrote:
> On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
> > Reserve space in the gather pool where the gathered uniforms are flushed.
> > 
> > Signed-off-by: Abdiel Janulgue 
> > ---
> >  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8 
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
> > b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > index 35d10ef..aebaa49 100644
> > --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> > @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
> > */
> >assert(stage_state->push_const_size <= 32);
> > }
> > +   /* Allocate gather pool space for uniform and UBO entries in 512-bit 
> > chunks*/
> > +   if (brw->gather_pool.bo != NULL) {
> > +  if (prog_data->nr_params > 0) {
> 
> I guess you combine these conditions:
> 
>   if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)
> 
> Or even bail out early:

Never mind, you modify it even further in the next patch.

> 
>   if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
>  return;
> 
> > + int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
> 
> This could be const, no big deal though.
> 
> > + stage_state->push_const_offset = brw->gather_pool.next_offset;
> > + brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
> > +  }
> > +   }
> >  }
> >  
> >  static void
> > -- 
> > 1.9.1
> > 

Re: [Mesa-dev] [PATCH 09/27] i965: Allocate space on the gather pool for plain uniforms

2015-05-07 Thread Ilia Mirkin

On Thu, May 7, 2015 at 10:52 AM, Pohjolainen, Topi
 wrote:
> On Tue, Apr 28, 2015 at 11:08:06PM +0300, Abdiel Janulgue wrote:
>> Reserve space in the gather pool where the gathered uniforms are flushed.
>>
>> Signed-off-by: Abdiel Janulgue 
>> ---
>>  src/mesa/drivers/dri/i965/gen6_vs_state.c | 8 
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
>> b/src/mesa/drivers/dri/i965/gen6_vs_state.c
>> index 35d10ef..aebaa49 100644
>> --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
>> @@ -120,6 +120,14 @@ gen6_upload_push_constants(struct brw_context *brw,
>> */
>>assert(stage_state->push_const_size <= 32);
>> }
>> +   /* Allocate gather pool space for uniform and UBO entries in 512-bit 
>> chunks*/
>> +   if (brw->gather_pool.bo != NULL) {
>> +  if (prog_data->nr_params > 0) {
>
> I guess you combine these conditions:
>
>   if (brw->gather_pool.bo != NULL && prog_data->nr_params > 0)
>
> Or even bail out early:
>
>   if (brw->gather_pool.bo == NULL || prog_data->nr_params == 0)
>  return;
>
>> + int num_consts = ALIGN(prog_data->nr_params, 4) / 4;
>
> This could be const, no big deal though.

And it could be DIV_ROUND_UP...
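
To spell the suggestion out, here is a minimal sketch of the arithmetic from
the quoted hunk written both ways. ALIGN() and DIV_ROUND_UP() are defined
locally as assumptions so that the example stands on its own, and the
gather_pool_bytes_*() and forms_agree() helpers are hypothetical names that do
not appear in the patch.

```c
#include <stdint.h>

/* Assumed definition: round value up to a multiple of a power-of-two
 * alignment.
 */
#define ALIGN(value, alignment) \
   (((value) + (alignment) - 1) & ~((alignment) - 1))

/* Assumed definition: integer division rounding up. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* The computation as written in the patch: params are grouped four at a
 * time, and the groups are again counted in fours, 64 bytes (512 bits) each.
 */
static uint32_t
gather_pool_bytes_align(uint32_t nr_params)
{
   uint32_t num_consts = ALIGN(nr_params, 4) / 4;
   return (ALIGN(num_consts, 4) / 4) * 64;
}

/* The same computation using DIV_ROUND_UP, with num_consts made const. */
static uint32_t
gather_pool_bytes_div(uint32_t nr_params)
{
   const uint32_t num_consts = DIV_ROUND_UP(nr_params, 4);
   return DIV_ROUND_UP(num_consts, 4) * 64;
}

/* Returns 1 if both forms agree for every parameter count below limit. */
static int
forms_agree(uint32_t limit)
{
   for (uint32_t n = 0; n < limit; n++) {
      if (gather_pool_bytes_align(n) != gather_pool_bytes_div(n))
         return 0;
   }
   return 1;
}
```

Both forms give the same result; the second just states the intent ("how many
groups of four") directly.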

>
>> + stage_state->push_const_offset = brw->gather_pool.next_offset;
>> + brw->gather_pool.next_offset += (ALIGN(num_consts, 4) / 4) * 64;
>> +  }
>> +   }
>>  }
>>
>>  static void
>> --
>> 1.9.1
>>

Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Francisco Jerez

"Pohjolainen, Topi"  writes:

> On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote:
>> From: Topi Pohjolainen 
>> 
>> All generations do the same exact dispatch and it could be
>> therefore done in the hardware independent stage.
>> 
>> Reviewed-by: Matt Turner 
>> Reviewed-by: Kenneth Graunke 
>> Signed-off-by: Topi Pohjolainen 
>> [ Francisco Jerez: Non-trivial rebase. ]
>> Reviewed-by: Francisco Jerez 
>> ---
>>  src/mesa/drivers/dri/i965/brw_context.h   |  3 -
>>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
>>  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 
>> +++
>>  src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 
>> ++
>>  4 files changed, 83 insertions(+), 87 deletions(-)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
>> b/src/mesa/drivers/dri/i965/brw_context.h
>> index 2fcdcfa..a6282f4 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.h
>> +++ b/src/mesa/drivers/dri/i965/brw_context.h
>> @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context 
>> *brw,
>>   uint32_t size,
>>   uint32_t *out_offset,
>>   bool dword_pitch);
>> -void brw_update_buffer_texture_surface(struct gl_context *ctx,
>> -   unsigned unit,
>> -   uint32_t *surf_offset);
>>  void
>>  brw_update_sol_surface(struct brw_context *brw,
>> struct gl_buffer_object *buffer_obj,
>> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
>> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> index 160dd2f..2b8040c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
>> @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context *brw,
>> }
>>  }
>>  
>> -void
>> -brw_update_buffer_texture_surface(struct gl_context *ctx,
>> -  unsigned unit,
>> -  uint32_t *surf_offset)
>> +static void
>> +update_buffer_texture_surface(struct gl_context *ctx,
>> +  unsigned unit,
>> +  uint32_t *surf_offset)
>>  {
>> struct brw_context *brw = brw_context(ctx);
>> struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
>> @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
>> struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
>> uint32_t *surf;
>>  
>> -   /* BRW_NEW_TEXTURE_BUFFER */
>> -   if (tObj->Target == GL_TEXTURE_BUFFER) {
>> -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
>> -  return;
>> -   }
>> -
>> surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
>>6 * 4, 32, surf_offset);
>>  
>> @@ -795,6 +789,21 @@ const struct brw_tracked_state 
>> gen6_renderbuffer_surfaces = {
>> .emit = update_renderbuffer_surfaces,
>>  };
>>  
>> +static void
>> +update_texture_surface(struct gl_context *ctx,
>> +   unsigned unit,
>> +   uint32_t *surf_offset,
>> +   bool for_gather)
>> +{
>> +   struct brw_context *brw = brw_context(ctx);
>> +   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
>> +
>> +   if (obj->Target == GL_TEXTURE_BUFFER) {
>> +  update_buffer_texture_surface(ctx, unit, surf_offset);
>
> In order to avoid extra level of indentation I used the following. I would
> have preferred it here also.
>
>   if (obj->Target == GL_TEXTURE_BUFFER) {
>  update_buffer_texture_surface(ctx, unit, surf_offset);
>  return;
>   }
>
I kept this as an indented block because it's harmless IMHO and it
seemed a somewhat lesser evil than:
1/ Define all texture-specific variables (i.e. things that are not
   applicable to buffer textures, including some pointer dereferences)
   at the top level, which is what you did, but it seemed a bit dodgy.
2/ Mix statements and declarations. (Granted, this file is likely
   already relying on other C99 features, so it wouldn't matter in
   practice, it's just a codestyle itch)
3/ Declare stuff and leave it uninitialized until later.

That said, the reason was largely subjective, and I don't really have a
strong preference.  As you are still the author of this commit you're
free to format it as you wish, you can keep my R-b if you simply
reindent this function.
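
For reference, the two layouts being compared look roughly like this. This is
only a sketch with hypothetical stand-in types (struct texture, struct
mipmap_tree, enum target) rather than the real driver structures; it just
shows where the texture-only local ends up in each style.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real driver types. */
enum target { TARGET_BUFFER, TARGET_2D };

struct mipmap_tree { int levels; };

struct texture {
   enum target target;
   struct mipmap_tree *mt;   /* may be NULL for buffer textures */
};

/* Indented style: texture-only locals are declared and initialized inside
 * the else block, so they never exist for buffer textures.
 */
static int
levels_indented(const struct texture *tex)
{
   if (tex->target == TARGET_BUFFER) {
      return 0;
   } else {
      const struct mipmap_tree *mt = tex->mt;
      return mt->levels;
   }
}

/* Early-return style: one less level of indentation, at the cost of a
 * declaration that follows a statement (point 2/ above).
 */
static int
levels_early_return(const struct texture *tex)
{
   if (tex->target == TARGET_BUFFER)
      return 0;

   const struct mipmap_tree *mt = tex->mt;
   return mt->levels;
}
```

Either way the buffer case never dereferences mt, so the choice is purely
stylistic.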

>> +   } else {
>> +  brw->vtbl.update_texture_surface(ctx, unit, surf_offset, for_gather);
>> +   }
>> +}
>>  
>>  static void
>>  update_stage_texture_surfaces(struct brw_context *brw,
>> @@ -824,7 +833,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
>>  
>>   /* _NEW_TEXTURE */
>>   if (ctx->Texture.Unit[unit]._Current) {
>> -brw->vtbl.update_texture_surface(ctx, unit, surf_off

Re: [Mesa-dev] [PATCH 1/7] i965: Move texture buffer dispatch into single location

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 05:55:48PM +0300, Francisco Jerez wrote:
> "Pohjolainen, Topi"  writes:
> 
> > On Thu, May 07, 2015 at 05:15:35PM +0300, Francisco Jerez wrote:
> >> From: Topi Pohjolainen 
> >> 
> >> All generations do the same exact dispatch and it could be
> >> therefore done in the hardware independent stage.
> >> 
> >> Reviewed-by: Matt Turner 
> >> Reviewed-by: Kenneth Graunke 
> >> Signed-off-by: Topi Pohjolainen 
> >> [ Francisco Jerez: Non-trivial rebase. ]
> >> Reviewed-by: Francisco Jerez 
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_context.h   |  3 -
> >>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  | 31 ++
> >>  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 69 
> >> +++
> >>  src/mesa/drivers/dri/i965/gen8_surface_state.c| 67 
> >> ++
> >>  4 files changed, 83 insertions(+), 87 deletions(-)
> >> 
> >> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> >> b/src/mesa/drivers/dri/i965/brw_context.h
> >> index 2fcdcfa..a6282f4 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_context.h
> >> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> >> @@ -1698,9 +1698,6 @@ void brw_create_constant_surface(struct brw_context 
> >> *brw,
> >>   uint32_t size,
> >>   uint32_t *out_offset,
> >>   bool dword_pitch);
> >> -void brw_update_buffer_texture_surface(struct gl_context *ctx,
> >> -   unsigned unit,
> >> -   uint32_t *surf_offset);
> >>  void
> >>  brw_update_sol_surface(struct brw_context *brw,
> >> struct gl_buffer_object *buffer_obj,
> >> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> >> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >> index 160dd2f..2b8040c 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> >> @@ -274,10 +274,10 @@ gen4_emit_buffer_surface_state(struct brw_context 
> >> *brw,
> >> }
> >>  }
> >>  
> >> -void
> >> -brw_update_buffer_texture_surface(struct gl_context *ctx,
> >> -  unsigned unit,
> >> -  uint32_t *surf_offset)
> >> +static void
> >> +update_buffer_texture_surface(struct gl_context *ctx,
> >> +  unsigned unit,
> >> +  uint32_t *surf_offset)
> >>  {
> >> struct brw_context *brw = brw_context(ctx);
> >> struct gl_texture_object *tObj = ctx->Texture.Unit[unit]._Current;
> >> @@ -320,12 +320,6 @@ brw_update_texture_surface(struct gl_context *ctx,
> >> struct gl_sampler_object *sampler = _mesa_get_samplerobj(ctx, unit);
> >> uint32_t *surf;
> >>  
> >> -   /* BRW_NEW_TEXTURE_BUFFER */
> >> -   if (tObj->Target == GL_TEXTURE_BUFFER) {
> >> -  brw_update_buffer_texture_surface(ctx, unit, surf_offset);
> >> -  return;
> >> -   }
> >> -
> >> surf = brw_state_batch(brw, AUB_TRACE_SURFACE_STATE,
> >>  6 * 4, 32, surf_offset);
> >>  
> >> @@ -795,6 +789,21 @@ const struct brw_tracked_state 
> >> gen6_renderbuffer_surfaces = {
> >> .emit = update_renderbuffer_surfaces,
> >>  };
> >>  
> >> +static void
> >> +update_texture_surface(struct gl_context *ctx,
> >> +   unsigned unit,
> >> +   uint32_t *surf_offset,
> >> +   bool for_gather)
> >> +{
> >> +   struct brw_context *brw = brw_context(ctx);
> >> +   struct gl_texture_object *obj = ctx->Texture.Unit[unit]._Current;
> >> +
> >> +   if (obj->Target == GL_TEXTURE_BUFFER) {
> >> +  update_buffer_texture_surface(ctx, unit, surf_offset);
> >
> > In order to avoid extra level of indentation I used the following. I would
> > have preferred it here also.
> >
> >   if (obj->Target == GL_TEXTURE_BUFFER) {
> >  update_buffer_texture_surface(ctx, unit, surf_offset);
> >  return;
> >   }
> >
> I kept this as an indented block because it's harmless IMHO and it
> seemed a somewhat lesser evil than:
> 1/ Define all texture-specific variables (i.e. things that are not
>applicable to buffer textures, including some pointer dereferences)
>at the top level, which is what you did, but it seemed a bit dodgy.
> 2/ Mix statements and declarations. (Granted, this file is likely
>already relying on other C99 features, so it wouldn't matter in
>practice, it's just a codestyle itch)
> 3/ Declare stuff and leave it uninitialized until later.
> 
> That said, the reason was largely subjective, and I don't really have a
> strong preference.  As you are still the author of this commit you're
> free to format it as you wish, you can keep my R-b if you simply
> reindent this function.

If Ken and Matt are happy with this series, so am I. I'm just glad if we
can land it.

Re: [Mesa-dev] [PATCH 03/27] i965: Enable hardware-generated binding tables on render path.

2015-05-07 Thread Pohjolainen, Topi

On Thu, May 07, 2015 at 04:43:21PM +0300, Pohjolainen, Topi wrote:
> On Tue, Apr 28, 2015 at 11:08:00PM +0300, Abdiel Janulgue wrote:
> > This patch implements the binding table enable command which is also
> > used to allocate a binding table pool where hardware-generated
> > binding table entries are flushed into. Each binding table offset in
> > the binding table pool is unique per each shader stage that are
> > enabled within a batch.
> > 
> > Also insert the required brw_tracked_state objects to enable
> > hw-generated binding tables in normal render path.
> > 
> > Signed-off-by: Abdiel Janulgue 
> > ---
> >  src/mesa/drivers/dri/i965/brw_binding_tables.c | 70 
> > ++
> >  src/mesa/drivers/dri/i965/brw_context.c|  4 ++
> >  src/mesa/drivers/dri/i965/brw_context.h|  5 ++
> >  src/mesa/drivers/dri/i965/brw_state.h  |  7 +++
> >  src/mesa/drivers/dri/i965/brw_state_upload.c   |  2 +
> >  src/mesa/drivers/dri/i965/intel_batchbuffer.c  |  4 ++
> >  6 files changed, 92 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_binding_tables.c 
> > b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > index 459165a..a58e32e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > +++ b/src/mesa/drivers/dri/i965/brw_binding_tables.c
> > @@ -44,6 +44,11 @@
> >  #include "brw_state.h"
> >  #include "intel_batchbuffer.h"
> >  
> > +/* Somehow the hw-binding table pool offset must start here, otherwise
> > + * the GPU will hang
> > + */
> > +#define HW_BT_START_OFFSET 256;
> 
> I think we want to understand this a little better before enabling...
> 
> > +
> >  /**
> >   * Upload a shader stage's binding table as indirect state.
> >   *
> > @@ -163,6 +168,71 @@ const struct brw_tracked_state brw_gs_binding_table = {
> > .emit = brw_gs_upload_binding_table,
> >  };
> >  
> > +/**
> > + * Hardware-generated binding tables for the resource streamer
> > + */
> > +void
> > +gen7_disable_hw_binding_tables(struct brw_context *brw)
> > +{
> > +   BEGIN_BATCH(3);
> > +   OUT_BATCH(_3DSTATE_BINDING_TABLE_POOL_ALLOC << 16 | (3 - 2));
> > +   OUT_BATCH(SET_FIELD(BRW_HW_BINDING_TABLE_OFF, 
> > BRW_HW_BINDING_TABLE_ENABLE) |
> > + brw->is_haswell ? HSW_HW_BINDING_TABLE_RESERVED : 0);
> > +   OUT_BATCH(0);
> > +   ADVANCE_BATCH();
> > +
> > +   /* Pipe control workaround */
> > +   brw_emit_pipe_control_flush(brw, PIPE_CONTROL_STATE_CACHE_INVALIDATE);
> > +}
> > +
> > +void
> > +gen7_enable_hw_binding_tables(struct brw_context *brw)
> > +{
> > +   if (!brw->has_resource_streamer) {
> > +  gen7_disable_hw_binding_tables(brw);
> 
> I started wondering why we really need this - RS is disabled by default and
> we haven't needed to do anything to disable it before.
> 
> > +  return;
> > +   }
> > +
> > +   if (!brw->hw_bt_pool.bo) {
> > +  /* From the BSpec, 3D Pipeline > Resource Streamer > Hardware 
> > Binding Tables:
> > +   *
> > +   *  "A maximum of 16,383 Binding tables are allowed in any batch 
> > buffer."
> > +   */
> > +  int max_size = 16383 * 4;
> 
> But does it really need this much all the time? I guess I need to go and
> read the spec.

I haven't read through the entire series, but it seems that we can calculate
(at least for gather constants) pretty accurately how much space we need.
Could we do that here as well, based on the program data of all stages? I may
be missing something and just throwing questions up in the air, so bear with me...
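
To make the question concrete, here is a rough sketch of what sizing the
pool from per-stage data could look like. Everything in it is an
assumption for illustration: the per-stage entry counts, the bytes-per-entry
parameter and the function name do not come from the series, and only the
16383 * 4 upper bound is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Worst case used by the patch (16383 * 4); the name is made up here. */
#define HYPOTHETICAL_MAX_POOL_SIZE (16383u * 4u)

/* Sum the space each enabled stage would need and clamp the result to the
 * fixed worst case.  entries_per_stage[] is a hypothetical input that the
 * caller would derive from the program data of each stage.
 */
static unsigned
estimate_pool_size(const unsigned *entries_per_stage, size_t num_stages,
                   unsigned bytes_per_entry)
{
   unsigned total = 0;

   assert(entries_per_stage != NULL || num_stages == 0);

   for (size_t i = 0; i < num_stages; i++) {
      const unsigned stage_bytes = entries_per_stage[i] * bytes_per_entry;

      /* total stays below the maximum, so this subtraction cannot wrap. */
      if (stage_bytes >= HYPOTHETICAL_MAX_POOL_SIZE - total)
         return HYPOTHETICAL_MAX_POOL_SIZE;

      total += stage_bytes;
   }

   return total;
}
```

Whether the hardware tolerates a pool smaller than the maximum is exactly
the open question, so this only shows the arithmetic, not a recommendation.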
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 11/27] i965: Store gather table information in the program data

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:08PM +0300, Abdiel Janulgue wrote:
> The resource streamer is able to gather and pack sparsely-located
> constant data from any buffer object by referring to a gather table.
> This patch adds support for keeping track of these constant data
> fetches into a gather table.
> 
> The gather table is generated from two sources. Ordinary uniform fetches
> are stored first. These are then combined with a separate table containing
> UBO entries. The separate entry for UBOs is needed to make it easier to
> generate the gather mask when combining and packing the constant data.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  |  9 +
>  src/mesa/drivers/dri/i965/brw_gs.c   |  4 
>  src/mesa/drivers/dri/i965/brw_program.c  |  5 +
>  src/mesa/drivers/dri/i965/brw_shader.cpp |  4 +++-
>  src/mesa/drivers/dri/i965/brw_shader.h   | 11 +++
>  src/mesa/drivers/dri/i965/brw_vs.c   |  5 +
>  src/mesa/drivers/dri/i965/brw_wm.c   |  5 +
>  7 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 7fd49e9..e25c64d 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -355,9 +355,12 @@ struct brw_stage_prog_data {
>  
> GLuint nr_params;   /**< number of float params/constants */
> GLuint nr_pull_params;
> +   GLuint nr_ubo_params;
> +   GLuint nr_gather_table;

I would introduce these as non gl-types - we are trying to move away from
them. Perhaps change "nr_params" and "nr_pull_params" while you are at it.

>  
> unsigned curb_read_length;
> unsigned total_scratch;
> +   unsigned max_ubo_const_block;
>  
> /**
>  * Register where the thread expects to find input data from the URB
> @@ -375,6 +378,12 @@ struct brw_stage_prog_data {
>  */
> const gl_constant_value **param;
> const gl_constant_value **pull_param;
> +   struct {
> +  int reg;
> +  unsigned channel_mask;
> +  unsigned const_block;
> +  unsigned const_offset;
> +   } *gather_table;
>  };

Below in brw_shader.h you do:

   struct gather_table {
  int reg;
  unsigned channel_mask;
  unsigned const_block;
  unsigned const_offset;
   };
   gather_table *ubo_gather_table;

Why not here?
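
A sketch of what sharing one named type could buy, under the assumption
that both tables really hold the same four fields. The struct and function
names below are hypothetical stand-ins, not the ones in the series. With a
common type, entries can be copied by plain struct assignment instead of
field by field.

```c
#include <assert.h>
#include <stddef.h>

/* One named entry type, shared by the program data and the visitor. */
struct gather_table_entry {
   int reg;
   unsigned channel_mask;
   unsigned const_block;
   unsigned const_offset;
};

/* Hypothetical, trimmed-down stand-ins for the two owners of a table. */
struct prog_data_sketch {
   unsigned nr_gather_table;
   struct gather_table_entry *gather_table;
};

struct visitor_sketch {
   unsigned nr_ubo_gather_table;
   const struct gather_table_entry *ubo_gather_table;
};

/* Append the visitor's UBO entries to the program data table.  The caller
 * is assumed to have allocated enough room in pd->gather_table.
 */
static void
append_ubo_entries(struct prog_data_sketch *pd,
                   const struct visitor_sketch *v)
{
   assert(pd != NULL && v != NULL);

   for (unsigned i = 0; i < v->nr_ubo_gather_table; i++)
      pd->gather_table[pd->nr_gather_table++] = v->ubo_gather_table[i];
}
```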

>  
>  /* Data about a particular attempt to compile a program.  Note that
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index bea90d8..97658d5 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -70,6 +70,10 @@ brw_compile_gs_prog(struct brw_context *brw,
> c.prog_data.base.base.pull_param =
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> c.prog_data.base.base.nr_params = param_count;
> +   c.prog_data.base.base.nr_gather_table = 0;
> +   c.prog_data.base.base.gather_table =
> +  rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
> +   (c.prog_data.base.base.nr_params + 
> c.prog_data.base.base.nr_ubo_params));

Wrap this line.

>  
> if (brw->gen >= 7) {
>if (gp->program.OutputType == GL_POINTS) {
> diff --git a/src/mesa/drivers/dri/i965/brw_program.c 
> b/src/mesa/drivers/dri/i965/brw_program.c
> index 81a0c19..f27c799 100644
> --- a/src/mesa/drivers/dri/i965/brw_program.c
> +++ b/src/mesa/drivers/dri/i965/brw_program.c
> @@ -573,6 +573,10 @@ brw_stage_prog_data_compare(const struct 
> brw_stage_prog_data *a,
> if (memcmp(a->pull_param, b->pull_param, a->nr_pull_params * sizeof(void 
> *)))
>return false;
>  
> +   if (memcmp(a->gather_table, b->gather_table,
> +  a->nr_gather_table * sizeof(*a->gather_table)))
> +  return false;
> +
> return true;
>  }
>  
> @@ -583,6 +587,7 @@ brw_stage_prog_data_free(const void *p)
>  
> ralloc_free(prog_data->param);
> ralloc_free(prog_data->pull_param);
> +   ralloc_free(prog_data->gather_table);
>  }
>  
>  void
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index 0d6ac0c..8769f67 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -739,11 +739,13 @@ backend_visitor::backend_visitor(struct brw_context 
> *brw,
>   prog(prog),
>   stage_prog_data(stage_prog_data),
>   cfg(NULL),
> - stage(stage)
> + stage(stage),
> + ubo_gather_table(NULL)
>  {
> debug_enabled = INTEL_DEBUG & intel_debug_flag_for_shader_stage(stage);
> stage_name = _mesa_shader_stage_to_string(stage);
> stage_abbrev = _mesa_shader_stage_to_abbrev(stage);
> +   this->nr_ubo_gather_table = 0;

Any particular reason not to do this in the initializer along with the other
members?

>  }
>  
>  bool
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> b/src/mesa/drivers/dri/i965/brw_shader.h
> i

Re: [Mesa-dev] [PATCH 12/27] i965: Assign hw-binding table index for each UBO constant buffer.

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:09PM +0300, Abdiel Janulgue wrote:
> To be able to refer to a constant buffer, the resource streamer needs
> to index it with a hardware binding table entry. This blankets the ubo
> buffers with hardware binding table indices.
> 
> Gather constants hardware fetches in 16-entry binding table blocks.
> So we need to use a block that is unused.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  | 11 +++
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  6 ++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index e25c64d..276c359 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -678,6 +678,17 @@ struct brw_vs_prog_data {
>  
>  #define SURF_INDEX_GEN6_SOL_BINDING(t) (t)
>  
> +/** Start of hardware binding table index for uniform gather constant 
> entries.
> + *  This must be aligned to the start of a hardware binding table block (a 
> block
> + *  is a group 16 binding table entries).
> + */
> +#define BRW_UNIFORM_GATHER_INDEX_START 32
> +
> +/** Appended to the end of the binding table index for uniform constant 
> buffers to indicate

Wrap this line.

> + *  start of the UBO gather constant binding table.
> + */
> +#define BRW_UBO_GATHER_INDEX_APPEND 2
> +
>  /* Note: brw_gs_prog_data_compare() must be updated when adding fields to
>   * this struct!
>   */
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 161d140..ce61554 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -884,6 +884,7 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
>  
> uint32_t *surf_offsets =
>&stage_state->surf_offset[prog_data->binding_table.ubo_start];
> +   bool use_gather = (brw->gather_pool.bo != NULL);

I would move this closer to the only use. This won't get re-used in the
rest of the series.

>  
> for (int i = 0; i < shader->NumUniformBlocks; i++) {
>struct gl_uniform_buffer_binding *binding;
> @@ -904,6 +905,11 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
>bo->size - binding->Offset,
>&surf_offsets[i],
>dword_pitch);
> +  if (use_gather) {

Or simply:

 if (brw->gather_pool.bo) {

> + int bt_idx = BRW_UNIFORM_GATHER_INDEX_START + 
> BRW_UBO_GATHER_INDEX_APPEND + i;

Wrap this line.
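
As a side note, the index arithmetic in this hunk can be checked on its
own. The sketch below reuses the two macro values from the patch (32 and 2)
and the 16-entry block size from the commit message; the helper names are
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Block size from the commit message: gather fetches 16-entry blocks. */
#define BT_BLOCK_ENTRIES 16u

/* Values as defined in the patch. */
#define BRW_UNIFORM_GATHER_INDEX_START 32u
#define BRW_UBO_GATHER_INDEX_APPEND 2u

/* True when index is the first entry of a binding table block. */
static bool
starts_bt_block(unsigned index)
{
   return index % BT_BLOCK_ENTRIES == 0;
}

/* Hardware binding table index for the i-th uniform block, following the
 * "START + APPEND + i" expression in the patch.
 */
static unsigned
ubo_gather_bt_index(unsigned i)
{
   assert(starts_bt_block(BRW_UNIFORM_GATHER_INDEX_START));

   return BRW_UNIFORM_GATHER_INDEX_START +
          BRW_UBO_GATHER_INDEX_APPEND + i;
}
```

With these values the first UBO lands on index 34, two entries past the
block-aligned start at 32.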

> + gen7_update_binding_table(brw, stage_state->stage,
> +   bt_idx, surf_offsets[i]);
> +  }
> }
>  
> if (shader->NumUniformBlocks)
> -- 
> 1.9.1
> 

Re: [Mesa-dev] [PATCH 13/27] i965: Assign hw-binding table index for uniform constant buffer block

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:10PM +0300, Abdiel Janulgue wrote:
> Assign the uploaded uniform block with hardware binding table indices.
> This is indexed by the resource streamer to fetch the constant buffers
> referred to by our gather table entries.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/gen6_vs_state.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen6_vs_state.c 
> b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> index 7325c6e..bce597f 100644
> --- a/src/mesa/drivers/dri/i965/gen6_vs_state.c
> +++ b/src/mesa/drivers/dri/i965/gen6_vs_state.c
> @@ -72,9 +72,16 @@ gen6_upload_push_constants(struct brw_context *brw,
>gl_constant_value *param;
>int i;
>  
> -  param = brw_state_batch(brw, type,
> -   prog_data->nr_params * sizeof(gl_constant_value),
> +  uint32_t size = prog_data->nr_params * sizeof(gl_constant_value);

Const would be nice here.

> +  param = brw_state_batch(brw, type, size,
> 32, &stage_state->push_const_offset);
> +  if (brw->gather_pool.bo != NULL) {
> + uint32_t surf_offset = 0;
> + brw_create_constant_surface(brw, brw->batch.bo, 
> stage_state->push_const_offset,
> + size, &surf_offset, false);
> + gen7_update_binding_table(brw, stage_state->stage, 
> BRW_UNIFORM_GATHER_INDEX_START,

Two lines overflowing 80 columns.

> +   surf_offset);
> +  }
>  
>STATIC_ASSERT(sizeof(gl_constant_value) == sizeof(float));
>  
> -- 
> 1.9.1
> 

Re: [Mesa-dev] [PATCH 16/27] i965: Include UBO parameter sizes in push constant parameters

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:13PM +0300, Abdiel Janulgue wrote:
> Now that we consider UBO constants as push constants, we need to include
> the sizes of the UBO's constant slots in the visitor's uniform slot sizes.
> This information is needed to properly pack vector constants tightly next to
> each other.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_gs.c | 11 +++
>  src/mesa/drivers/dri/i965/brw_vs.c | 13 +
>  src/mesa/drivers/dri/i965/brw_wm.c | 13 +
>  3 files changed, 37 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_gs.c 
> b/src/mesa/drivers/dri/i965/brw_gs.c
> index 97658d5..2dc3ea1 100644
> --- a/src/mesa/drivers/dri/i965/brw_gs.c
> +++ b/src/mesa/drivers/dri/i965/brw_gs.c
> @@ -32,6 +32,7 @@
>  #include "brw_vec4_gs_visitor.h"
>  #include "brw_state.h"
>  #include "brw_ff_gs.h"
> +#include "glsl/nir/nir_types.h"
>  
>  
>  bool
> @@ -70,6 +71,16 @@ brw_compile_gs_prog(struct brw_context *brw,
> c.prog_data.base.base.pull_param =
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> c.prog_data.base.base.nr_params = param_count;
> +   c.prog_data.base.base.nr_ubo_params = 0;
> +   for (int i = 0; i < gs->NumUniformBlocks; i++) {
> +  for (int p = 0; p < gs->UniformBlocks[i].NumUniforms; p++) {
> + const struct glsl_type *type = 
> gs->UniformBlocks[i].Uniforms[p].Type;
> + const struct glsl_type *elem = glsl_get_element_type(type);
> + int array_sz = elem ? glsl_get_array_size(type) : 1;
> + int components = elem ? glsl_get_components(elem) : 
> glsl_get_components(type);
> + c.prog_data.base.base.nr_ubo_params += components * array_sz;
> +  }
> +   }
> c.prog_data.base.base.nr_gather_table = 0;
> c.prog_data.base.base.gather_table =
>rzalloc_size(NULL, sizeof(*c.prog_data.base.base.gather_table) *
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
> b/src/mesa/drivers/dri/i965/brw_vs.c
> index 52333c9..86bef5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -37,6 +37,7 @@
>  #include "brw_state.h"
>  #include "program/prog_print.h"
>  #include "program/prog_parameter.h"
> +#include "glsl/nir/nir_types.h"
>  
>  #include "util/ralloc.h"
>  
> @@ -243,6 +244,18 @@ brw_compile_vs_prog(struct brw_context *brw,
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> stage_prog_data->nr_params = param_count;
>  
> +   stage_prog_data->nr_ubo_params = 0;
> +   if (vs) {
> +  for (int i = 0; i < vs->NumUniformBlocks; i++) {
> + for (int p = 0; p < vs->UniformBlocks[i].NumUniforms; p++) {
> +const struct glsl_type *type = 
> vs->UniformBlocks[i].Uniforms[p].Type;
> +const struct glsl_type *elem = glsl_get_element_type(type);
> +int array_sz = elem ? glsl_get_array_size(type) : 1;
> +int components = elem ? glsl_get_components(elem) : 
> glsl_get_components(type);
> +stage_prog_data->nr_ubo_params += components * array_sz;
> + }
> +  }
> +   }
> stage_prog_data->nr_gather_table = 0;
> stage_prog_data->gather_table = rzalloc_size(NULL, 
> sizeof(*stage_prog_data->gather_table) *
>  (stage_prog_data->nr_params +
> diff --git a/src/mesa/drivers/dri/i965/brw_wm.c 
> b/src/mesa/drivers/dri/i965/brw_wm.c
> index 13a64d8..2060eab 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm.c
> @@ -38,6 +38,7 @@
>  #include "main/samplerobj.h"
>  #include "program/prog_parameter.h"
>  #include "program/program.h"
> +#include "glsl/nir/nir_types.h"
>  #include "intel_mipmap_tree.h"
>  
>  #include "util/ralloc.h"
> @@ -205,6 +206,18 @@ brw_compile_wm_prog(struct brw_context *brw,
>rzalloc_array(NULL, const gl_constant_value *, param_count);
> prog_data.base.nr_params = param_count;
>  
> +   prog_data.base.nr_ubo_params = 0;
> +   if (fs) {
> +  for (int i = 0; i < fs->NumUniformBlocks; i++) {
> + for (int p = 0; p < fs->UniformBlocks[i].NumUniforms; p++) {
> +const struct glsl_type *type = 
> fs->UniformBlocks[i].Uniforms[p].Type;
> +const struct glsl_type *elem = glsl_get_element_type(type);
> +int array_sz = elem ? glsl_get_array_size(type) : 1;
> +int components = elem ? glsl_get_components(elem) : 
> glsl_get_components(type);
> +prog_data.base.nr_ubo_params += components * array_sz;
> + }
> +  }
> +   }

I didn't check the exact details, but it looks to me like you could refactor
this into its own routine - all three occurrences look awfully similar.
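
Something along these lines, perhaps. This is only a sketch of the shape
such a helper could take: the three struct types are simplified stand-ins
for the shader, uniform block and glsl_type data (the real code would keep
calling glsl_get_element_type() and friends), and the function name is
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a uniform's type: component count of one element, and the
 * array length (0 when the uniform is not an array).
 */
struct type_sketch {
   unsigned components;
   unsigned array_size;
};

struct uniform_sketch {
   struct type_sketch type;
};

struct block_sketch {
   const struct uniform_sketch *uniforms;
   unsigned num_uniforms;
};

/* Count the UBO parameter slots of all uniform blocks of one stage, so
 * that the GS, VS and WM compile paths can share a single loop.
 */
static unsigned
count_ubo_params(const struct block_sketch *blocks, unsigned num_blocks)
{
   unsigned total = 0;

   assert(blocks != NULL || num_blocks == 0);

   for (unsigned i = 0; i < num_blocks; i++) {
      for (unsigned p = 0; p < blocks[i].num_uniforms; p++) {
         const struct type_sketch *type = &blocks[i].uniforms[p].type;
         const unsigned array_sz = type->array_size ? type->array_size : 1;

         total += type->components * array_sz;
      }
   }

   return total;
}
```

Each caller would then reduce to a single assignment to nr_ubo_params,
with the "if (vs)" / "if (fs)" guards kept at the call site.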

> prog_data.base.nr_gather_table = 0;
> prog_data.base.gather_table = rzalloc_size(NULL, 
> sizeof(*prog_data.base.gather_table) *
>(prog_data.base.nr_params +
> -- 
> 1.9.1
> 

Re: [Mesa-dev] [PATCH 18/27] i965/fs: Append ir_binop_ubo_load entries to the gather table

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:15PM +0300, Abdiel Janulgue wrote:
> When the const block and offset are immediate values. Otherwise just
> fall-back to the previous method of uploading the UBO constant data to
> GRF using pull constants.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 11 
>  src/mesa/drivers/dri/i965/brw_fs.h   |  4 ++
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 86 
> +++-
>  3 files changed, 100 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 071ac59..031d807 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2273,6 +2273,7 @@ fs_visitor::assign_constant_locations()
> }
>  
> stage_prog_data->nr_params = 0;
> +   stage_prog_data->nr_ubo_params = ubo_uniforms;
>  
> unsigned const_reg_access[uniforms];
> memset(const_reg_access, 0, sizeof(const_reg_access));
> @@ -2302,6 +2303,16 @@ fs_visitor::assign_constant_locations()
>stage_prog_data->gather_table[p].channel_mask =
>   const_reg_access[i];
> }
> +
> +   for (unsigned i = 0; i < this->nr_ubo_gather_table; i++) {
> +  int p = stage_prog_data->nr_gather_table++;
> +  stage_prog_data->gather_table[p].reg = this->ubo_gather_table[i].reg;
> +  stage_prog_data->gather_table[p].channel_mask = 
> this->ubo_gather_table[i].channel_mask;
> +  stage_prog_data->gather_table[p].const_block = 
> this->ubo_gather_table[i].const_block;
> +  stage_prog_data->gather_table[p].const_offset = 
> this->ubo_gather_table[i].const_offset;
> +  stage_prog_data->max_ubo_const_block = 
> MAX2(stage_prog_data->max_ubo_const_block,
> +  
> this->ubo_gather_table[i].const_block);

These are all overflowing 80 columns.

> +   }
>  }
>  
>  /**
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 32063f0..a48b2bb 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -417,6 +417,7 @@ public:
> void setup_uniform_values(ir_variable *ir);
> void setup_builtin_uniform_values(ir_variable *ir);
> int implied_mrf_writes(fs_inst *inst);
> +   bool generate_ubo_gather_table(ir_expression* ir);
>  
> virtual void dump_instructions();
> virtual void dump_instructions(const char *name);
> @@ -445,6 +446,9 @@ public:
> /** Total number of direct uniforms we can get from NIR */
> unsigned num_direct_uniforms;
>  
> +   /** Number of ubo uniform variable components visited. */
> +   unsigned ubo_uniforms;
> +
> /** Byte-offset for the next available spot in the scratch space buffer. 
> */
> unsigned last_scratch;
>  
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 4e99366..11e608b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1179,11 +1179,18 @@ fs_visitor::visit(ir_expression *ir)
>emit(FS_OPCODE_PACK_HALF_2x16_SPLIT, this->result, op[0], op[1]);
>break;
> case ir_binop_ubo_load: {
> +  /* Use gather push constants if at all possible, otherwise just
> +   * fall back to pull constants for UBOs
> +   */
> +  if (generate_ubo_gather_table(ir))
> + break;
> +
>/* This IR node takes a constant uniform block and a constant or
> * variable byte offset within the block and loads a vector from that.
> */
>ir_constant *const_uniform_block = ir->operands[0]->as_constant();
>ir_constant *const_offset = ir->operands[1]->as_constant();
> +

Not part of this patch.

>fs_reg surf_index;
>  
>if (const_uniform_block) {
> @@ -4144,6 +4151,79 @@ fs_visitor::resolve_bool_comparison(ir_rvalue *rvalue, 
> fs_reg *reg)
> *reg = neg_result;
>  }
>  
> +bool
> +fs_visitor::generate_ubo_gather_table(ir_expression *ir)
> +{
> +   ir_constant *const_uniform_block = ir->operands[0]->as_constant();
> +   ir_constant *const_offset = ir->operands[1]->as_constant();

These are only used for reading, let's use constant pointers.

> +
> +   if (ir->operation != ir_binop_ubo_load ||
> +   !brw->has_resource_streamer||
> +   !brw->fs_ubo_gather||
> +   !const_uniform_block   ||

Not really the style used elsewhere, don't align "||".

> +   !const_offset)
> +  return false;
> +
> +  /* Only allow 16 registers (128 uniform components) as push constants.
> +   */

Move the comment closing to the previous line.

> +   unsigned int max_push_components = 16 * 8;
> +   unsigned param_index = uniforms + ubo_uniforms;

These could be both declared as const.

> +   if ((param_index + ir->type->vector_elements) >= max_push_components)
> +  return false;
> +
> +   fs_reg reg;
> +

Re: [Mesa-dev] [PATCH 19/27] i965/fs/nir: Append nir_intrinsic_load_ubo entries to the gather table

2015-05-07 Thread Pohjolainen, Topi

On Tue, Apr 28, 2015 at 11:08:16PM +0300, Abdiel Janulgue wrote:
> When the const block and offset are immediate values. Otherwise just
> fall-back to the previous method of uploading the UBO constant data to
> GRF using pull constants.
> 
> Signed-off-by: Abdiel Janulgue 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h   |  2 ++
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 59 
> 
>  2 files changed, 61 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index a48b2bb..5247fa1 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -418,6 +418,8 @@ public:
> void setup_builtin_uniform_values(ir_variable *ir);
> int implied_mrf_writes(fs_inst *inst);
> bool generate_ubo_gather_table(ir_expression* ir);
> +   bool nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg 
> &dest,
> +  bool has_indirect);
>  
> virtual void dump_instructions();
> virtual void dump_instructions(const char *name);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 3972581..b68f221 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1377,6 +1377,9 @@ fs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
>has_indirect = true;
>/* fallthrough */
> case nir_intrinsic_load_ubo: {
> +  if (nir_generate_ubo_gather_table(instr, dest, has_indirect))
> + break;
> +
>nir_const_value *const_index = nir_src_as_const_value(instr->src[0]);
>fs_reg surf_index;
>  
> @@ -1774,3 +1777,59 @@ fs_visitor::nir_emit_jump(nir_jump_instr *instr)
>unreachable("unknown jump");
> }
>  }
> +
> +bool
> +fs_visitor::nir_generate_ubo_gather_table(nir_intrinsic_instr *instr, fs_reg 
> &dest,
> +  bool has_indirect)
> +{
> +   nir_const_value *const_index = nir_src_as_const_value(instr->src[0]);

Used only for reading, const.

> +
> +   if (!const_index || has_indirect || !brw->fs_ubo_gather || 
> !brw->has_resource_streamer)

Wrap this line.

> +  return false;
> +
> +   /* Only allow 16 registers (128 uniform components) as push constants.
> +*/
> +   unsigned int max_push_components = 16 * 8;
> +   unsigned param_index = uniforms + ubo_uniforms;

These would be nicer as constants.
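
For reference, a sketch of the budget check that follows with both values
held constant. The limit (16 registers of 8 components each) and the use
of the larger of param_index and num_direct_uniforms are taken from the
patch; the free-standing function and its name are only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Only allow 16 registers (128 uniform components) as push constants. */
static const unsigned max_push_components = 16 * 8;

/* Returns true when num_components more components still fit, mirroring
 * "MAX2(param_index, num_direct_uniforms) + num_components > max" being
 * the rejection condition in the patch.
 */
static bool
fits_push_budget(unsigned param_index, unsigned num_direct_uniforms,
                 unsigned num_components)
{
   const unsigned base = param_index > num_direct_uniforms ?
                         param_index : num_direct_uniforms;

   assert(num_components <= 4);

   return base + num_components <= max_push_components;
}
```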

> +   if ((MAX2(param_index, num_direct_uniforms) +
> +instr->num_components) > max_push_components)
> +  return false;
> +
> +   fs_reg uniform_reg;
> +   if (dispatch_width == 16) {
> +  for (int i = 0; i < (int) this->nr_ubo_gather_table; i++) {

Extra space.

> + if ((this->ubo_gather_table[i].const_block ==
> +  const_index->u[0]) &&
> + (this->ubo_gather_table[i].const_offset ==
> +  (unsigned) instr->const_index[0])) {

Here also.

> +uniform_reg = fs_reg(UNIFORM, this->ubo_gather_table[i].reg);
> +break;
> + }
> +  }
> +  if (uniform_reg.file != UNIFORM) {
> + /* Unlikely but this means that SIMD8 wasn't able to allocate push 
> constant

Wrap this line.

> +  * registers for this ubo load. Fall back to pull-constant method.
> +  */
> + return false;
> +  }
> +   }
> +
> +   if (uniform_reg.file != UNIFORM) {
> +  uniform_reg = fs_reg(UNIFORM, param_index);
> +  int gather = this->nr_ubo_gather_table++;
> +
> +  assert(instr->num_components <= 4);
> +  ubo_uniforms += instr->num_components;
> +  this->ubo_gather_table[gather].reg = uniform_reg.reg;
> +  this->ubo_gather_table[gather].const_block = const_index->u[0];
> +  this->ubo_gather_table[gather].const_offset = instr->const_index[0];
> +   }
> +
> +   for (unsigned j = 0; j < instr->num_components; j++) {
> +  fs_reg src = offset(retype(uniform_reg, dest.type), j);
> +  emit(MOV(dest, src));
> +  dest = offset(dest, 1);
> +   }
> +
> +   return true;
> +}
> -- 
> 1.9.1
> 

[Mesa-dev] [PATCH] i965/fs: Don't forget the force_sechalf flag in lower_load_payload().

2015-05-07 Thread Francisco Jerez

Regression from commit 41868bb6824c6106a55c8442006c1e2215abf567.
Fixes a bunch of ARB_shader_image_load_store tests.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7e4ead0..0a62e46 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3512,6 +3512,7 @@ fs_visitor::lower_load_payload()
 fs_inst *mov = MOV(retype(dst, inst->src[i].type),
inst->src[i]);
 mov->force_writemask_all = inst->force_writemask_all;
+mov->force_sechalf = inst->force_sechalf;
 inst->insert_before(block, mov);
  }
  dst = offset(dst, 1);
-- 
2.3.5


Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread Aaron Watry

I'm not sure what the final consensus will be on how to do this, but FWIW:
Tested-By: Aaron Watry 

I've tested this with 4 combinations:
1. no --with-opencl-icd option specified: libOpenCL.so gets installed in
   ${prefix}/lib
2. --with-opencl-icd=no: libOpenCL.so gets installed in ${prefix}/lib
3. --with-opencl-icd=standard: libMesaOpenCL.so installed in ${prefix}/lib,
   icd in /etc/OpenCL/vendors/mesa.icd
4. --with-opencl-icd=sysconfdir: libMesaOpenCL.so installed in ${prefix}/lib,
   icd in ${prefix}/etc//mesa.icd.  I only specified --prefix, no other
   directories overridden in the configure command.
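
The mapping these combinations exercise can be summarised with a small Python sketch of the case statement in the quoted patch. The three accepted values, the two library names and the two directories come from the patch; the function name is hypothetical, and the default sysconfdir of ${prefix}/etc is the usual autoconf default rather than something the patch sets.

```python
def resolve_opencl_icd(option="no", sysconfdir="${prefix}/etc"):
    """Map a --with-opencl-icd value to (library name, ICD file directory).

    Hypothetical helper mirroring the case statement in the patch; the
    directory is None when no ICD vendor file is installed.
    """
    if option == "no":
        return ("OpenCL", None)
    if option == "standard":
        return ("MesaOpenCL", "/etc/OpenCL/vendors")
    if option == "sysconfdir":
        return ("MesaOpenCL", sysconfdir + "/OpenCL/vendors")
    # Anything else (including "yes") hits the patch's AC_MSG_ERROR branch.
    raise ValueError("'%s' is not a valid option for --with-opencl-icd" % option)
```

Note that under this patch a bare --with-opencl-icd (which autoconf turns into the value "yes") would be rejected, since only no, standard and sysconfdir are handled.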

--Aaron


On Wed, May 6, 2015 at 4:34 PM, EdB  wrote:

> The standard ICD file path is /etc/OpenCL/vendors/.
> However, it doesn't fit well with custom builds.
> This option allows overriding the ICD vendor file installation path.
> ---
>  configure.ac   | 46 +++---
>  src/gallium/targets/opencl/Makefile.am |  2 +-
>  2 files changed, 33 insertions(+), 15 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 095e23e..90dba4e 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
>   [enable OpenCL library @<:@default=disabled@:>@])],
> [enable_opencl="$enableval"],
> [enable_opencl=no])
> -AC_ARG_ENABLE([opencl_icd],
> -   [AS_HELP_STRING([--enable-opencl-icd],
> -  [Build an OpenCL ICD library to be loaded by an ICD
> implementation
> -   @<:@default=disabled@:>@])],
> -[enable_opencl_icd="$enableval"],
> -[enable_opencl_icd=no])
>  AC_ARG_ENABLE([xlib-glx],
>  [AS_HELP_STRING([--enable-xlib-glx],
>  [make GLX library Xlib-based instead of DRI-based
> @<:@default=disabled@:>@])],
> @@ -1689,19 +1683,11 @@ if test "x$enable_opencl" = xyes; then
>  # XXX: Use $enable_shared_pipe_drivers once converted to use
> static/shared pipe-drivers
>  enable_gallium_loader=yes
>
> -if test "x$enable_opencl_icd" = xyes; then
> -OPENCL_LIBNAME="MesaOpenCL"
> -else
> -OPENCL_LIBNAME="OpenCL"
> -fi
> -
>  if test "x$have_libelf" != xyes; then
> AC_MSG_ERROR([Clover requires libelf])
>  fi
>  fi
>  AM_CONDITIONAL(HAVE_CLOVER, test "x$enable_opencl" = xyes)
> -AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$enable_opencl_icd" = xyes)
> -AC_SUBST([OPENCL_LIBNAME])
>
>  dnl
>  dnl Gallium configuration
> @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
>  [D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
>
> +dnl OpenCL ICD
> +
> +AC_ARG_WITH([opencl-icd],
> +[AS_HELP_STRING([--with-opencl-icd=@<:@no,standard,sysconfdir@:>@],
> +[Build an OpenCL ICD library to be loaded by an ICD implementation.
> + If @<:@standard@:>@ the OpenCL ICD vendor file installs in /etc/OpenCL/vendors.
> + @<:@sysconfdir@:>@ installs the file in $sysconfdir/OpenCL/vendors
> + @<:@default=no@:>@])],
> +[OPENCL_ICD="$withval"],
> +[OPENCL_ICD="no"])
> +
> +case "x$OPENCL_ICD" in
> +xno)
> +OPENCL_LIBNAME="OpenCL"
> +;;
> +xstandard)
> +OPENCL_LIBNAME="MesaOpenCL"
> +ICD_FILE_DIR="/etc/OpenCL/vendors"
> +;;
> +xsysconfdir)
> +OPENCL_LIBNAME="MesaOpenCL"
> +ICD_FILE_DIR="$sysconfdir/OpenCL/vendors"
> +;;
> +*)
> +AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for --with-opencl-icd])
> +;;
> +esac
> +
> +AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$OPENCL_ICD" != xno)
> +AC_SUBST([OPENCL_LIBNAME])
> +AC_SUBST([ICD_FILE_DIR])
> +
>  dnl
>  dnl Gallium helper functions
>  dnl
> diff --git a/src/gallium/targets/opencl/Makefile.am b/src/gallium/targets/opencl/Makefile.am
> index 5daf327..781daa0 100644
> --- a/src/gallium/targets/opencl/Makefile.am
> +++ b/src/gallium/targets/opencl/Makefile.am
> @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES = opencl.sym
>  EXTRA_DIST = mesa.icd opencl.sym
>
>  if HAVE_CLOVER_ICD
> -icddir = /etc/OpenCL/vendors/
> +icddir = $(ICD_FILE_DIR)
>  icd_DATA = mesa.icd
>  endif
>
> --
> 2.1.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>

Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Ilia Mirkin

On Thu, May 7, 2015 at 3:59 AM, Michel Dänzer  wrote:
> On 05.05.2015 01:47, Tom Stellard wrote:
>> On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote:
>>> On Mon, May 4, 2015 at 10:04 AM, Tom Stellard  wrote:
>>>> On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote:
>>>>> On Sat, May 2, 2015 at 1:19 PM, EdB wrote:
>>>>>> The standard ICD file path is /etc/OpenCL/vendors/.
>>>>>> However, it doesn't fit well with custom builds.
>>>>>> This option allows overriding the ICD vendor file installation path.
>>>>>> ---
>>>>>>  configure.ac   | 6 ++
>>>>>>  src/gallium/targets/opencl/Makefile.am | 2 +-
>>>>>>  2 files changed, 7 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/configure.ac b/configure.ac
>>>>>> index 095e23e..bf08d76 100644
>>>>>> --- a/configure.ac
>>>>>> +++ b/configure.ac
>>>>>> @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir],
>>>>>>  [D3D_DRIVER_INSTALL_DIR="$withval"],
>>>>>>  [D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
>>>>>>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
>>>>>> +AC_ARG_WITH([icd-file-dir],
>>>>>> +[AS_HELP_STRING([--with-icd-file-dir=DIR],
>>>>>> +[directory for the OpenCL ICD vendor file @<:@/etc/OpenCL/vendors@:>@])],
>>>>>> +[ICD_FILE_INSTALL_DIR="$withval"],
>>>>>> +[ICD_FILE_INSTALL_DIR="/etc/OpenCL/vendors"])
>>>>>
>>>>> What about making this default to ${sysconfdir}/OpenCL/vendors ? That
>>>>> way using --prefix should auto-make it go into the prefix instead of
>>>>> unexpectedly installing things outside of the specified prefix? That
>>>>> way a distro build which specifies --sysconfdir as /etc will get it in
>>>>> the right place, while by default it'll go into /usr/local/etc and a
>>>>> user can override the icd loader's default behaviour with
>>>>> OPENCL_VENDOR_PATH?
>>>>>
>>>>
>>>> I would prefer not to make this the default behavior, because it violates
>>>> the spec and there could potentially be multiple icd implementations, which
>>>> may or may not have the overrides.
>>>>
>>>> I think the best solution would be to rename the option to something like
>>>> --enable-ocl-icd-respect-prefix (suggestions for other names encouraged).
>>>> and have the option enable the behavior that Ilia is describing.
>>>>
>>>> This will give distros and advanced users a way to set up their system
>>>> the way they want.
>>>
>>> It's just a very anti-autoconf thing to do to have "make install" fail
>>> by default unless you specify some "hey, i actually want make install
>>> to work" option.
>>>
>>> I think it's crazy to expect that, by default, people will want to
>>> write over their system installs, and having things go outside of the
>>> specified --prefix is very surprising (unless you force some other
>>> option). And asking the user to run "make install" as root is even
>>> crazier.
>>>
>>
>> My expectation is that, by default, when people specify --enable-opencl-icd
>> they want an implementation that conforms to the specification.
>> Unfortunately, this means installing icd files to /etc.
>>
>> There is no good solution here, but I'd rather have users specify a flag
>> to get a sane build system, than requiring them to set a flag and set
>> an environment variable just to get working OpenCL with the ICD loader.
>>
>>> I guess I haven't hit this yet because there's no OpenCL support in
>>> nouveau or freedreno, but I made the same stink about vdpau when Emil
>>> tried to make it install to some system location by default. At least
>>> a few people seemed to agree with me back then...
>>>
>>
>> Does the vdpau spec also require installation to a specific system directory
>> (e.g. /etc/)?
>
> Tom, I think ensuring that the OpenCL ICD loader can pick up the
> mesa.icd file is something for the distributor / administrator / user to
> worry about, not Mesa upstream.
>
> There's a similar situation with the drirc file, which is installed
> inside the prefix by default but only read from /etc/.

FTR, I fully agree with this assessment (it's the distributor's
problem), but my main priority was making sure "make install" works.

Cheers,

  -ilia

Re: [Mesa-dev] [PATCH] i965/wm/gen6: Add option for disabling statistics collection

2015-05-07 Thread Kenneth Graunke

On Thursday, May 07, 2015 04:39:14 PM Topi Pohjolainen wrote:
> Normally this is always needed, but for internal blits and clears
> we need to be able to disable it.
> 
> CC: Kenneth Graunke
> Signed-off-by: Topi Pohjolainen

Reviewed-by: Kenneth Graunke 


[Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Neil Roberts

Commit 94ee908448 added a header size parameter to the function to
create the LOAD_PAYLOAD instruction. However this broke
opt_sampler_eot which manually constructs the instruction and so
wasn't setting the header_size. This ends up making the parameters for
the send message all have the wrong location and it all falls apart.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 3bf5866..02a1ad5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
 load_payload->sources + 1);
 
new_load_payload->regs_written = load_payload->regs_written + 1;
+   new_load_payload->header_size = 1;
tex_inst->mlen++;
tex_inst->header_size = 1;
tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
-- 
1.9.3


Re: [Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand 

On Thu, May 7, 2015 at 11:06 AM, Neil Roberts  wrote:
> Commit 94ee908448 added a header size parameter to the function to
> create the LOAD_PAYLOAD instruction. However this broke
> opt_sampler_eot which manually constructs the instruction and so
> wasn't setting the header_size. This ends up making the parameters for
> the send message all have the wrong location and it all falls apart.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 3bf5866..02a1ad5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
>  load_payload->sources + 1);
>
> new_load_payload->regs_written = load_payload->regs_written + 1;
> +   new_load_payload->header_size = 1;
> tex_inst->mlen++;
> tex_inst->header_size = 1;
> tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
> --
> 1.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Tom Stellard

On Thu, May 07, 2015 at 04:59:41PM +0900, Michel Dänzer wrote:
> On 05.05.2015 01:47, Tom Stellard wrote:
> > On Mon, May 04, 2015 at 10:13:19AM -0400, Ilia Mirkin wrote:
> >> On Mon, May 4, 2015 at 10:04 AM, Tom Stellard  wrote:
> >>> On Sat, May 02, 2015 at 01:31:41PM -0400, Ilia Mirkin wrote:
> >>>> On Sat, May 2, 2015 at 1:19 PM, EdB wrote:
> >>>>> The standard ICD file path is /etc/OpenCL/vendors/.
> >>>>> However, it doesn't fit well with custom builds.
> >>>>> This option allows overriding the ICD vendor file installation path.
> >>>>> ---
> >>>>>  configure.ac   | 6 ++
> >>>>>  src/gallium/targets/opencl/Makefile.am | 2 +-
> >>>>>  2 files changed, 7 insertions(+), 1 deletion(-)
> >>>>>
> >>>>> diff --git a/configure.ac b/configure.ac
> >>>>> index 095e23e..bf08d76 100644
> >>>>> --- a/configure.ac
> >>>>> +++ b/configure.ac
> >>>>> @@ -2005,6 +2005,12 @@ AC_ARG_WITH([d3d-libdir],
> >>>>>  [D3D_DRIVER_INSTALL_DIR="$withval"],
> >>>>>  [D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
> >>>>>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
> >>>>> +AC_ARG_WITH([icd-file-dir],
> >>>>> +[AS_HELP_STRING([--with-icd-file-dir=DIR],
> >>>>> +[directory for the OpenCL ICD vendor file @<:@/etc/OpenCL/vendors@:>@])],
> >>>>> +[ICD_FILE_INSTALL_DIR="$withval"],
> >>>>> +[ICD_FILE_INSTALL_DIR="/etc/OpenCL/vendors"])
> >>>>
> >>>> What about making this default to ${sysconfdir}/OpenCL/vendors ? That
> >>>> way using --prefix should auto-make it go into the prefix instead of
> >>>> unexpectedly installing things outside of the specified prefix? That
> >>>> way a distro build which specifies --sysconfdir as /etc will get it in
> >>>> the right place, while by default it'll go into /usr/local/etc and a
> >>>> user can override the icd loader's default behaviour with
> >>>> OPENCL_VENDOR_PATH?
> >>>>
> >>>
> >>> I would prefer not to make this the default behavior, because it violates
> >>> the spec and there could potentially be multiple icd implementations, which
> >>> may or may not have the overrides.
> >>>
> >>> I think the best solution would be to rename the option to something like
> >>> --enable-ocl-icd-respect-prefix (suggestions for other names encouraged).
> >>> and have the option enable the behavior that Ilia is describing.
> >>>
> >>> This will give distros and advanced users a way to set up their system
> >>> the way they want.
> >>
> >> It's just a very anti-autoconf thing to do to have "make install" fail
> >> by default unless you specify some "hey, i actually want make install
> >> to work" option.
> >>
> >> I think it's crazy to expect that, by default, people will want to
> >> write over their system installs, and having things go outside of the
> >> specified --prefix is very surprising (unless you force some other
> >> option). And asking the user to run "make install" as root is even
> >> crazier.
> >>
> > 
> > My expectation is that, by default, when people specify --enable-opencl-icd
> > they want an implementation that conforms to the specification.
> > Unfortunately, this means installing icd files to /etc.
> > 
> > There is no good solution here, but I'd rather have users specify a flag
> > to get a sane build system, than requiring them to set a flag and set
> > an environment variable just to get working OpenCL with the ICD loader.
> > 
> >> I guess I haven't hit this yet because there's no OpenCL support in
> >> nouveau or freedreno, but I made the same stink about vdpau when Emil
> >> tried to make it install to some system location by default. At least
> >> a few people seemed to agree with me back then...
> >>
> > 
> > Does the vdpau spec also require installation to a specific system directory
> > (e.g. /etc/)?
> 
> Tom, I think ensuring that the OpenCL ICD loader can pick up the
> mesa.icd file is something for the distributor / administrator / user to
> worry about, not Mesa upstream.
> 

I don't really disagree with this in general.  My position is that when
there is a situation where it is impossible to follow both the API spec
and build system best practices that it is more important to follow the
API spec.

I realize some people disagree with this, and I completely understand
their rationale.

For this particular situation, I'm happy with any solution that:

1. Allows a user to install the icd file to /etc if he or she wants to.
and
2. Does not require the user to read the spec to know that /etc is the
correct place to install it.

I think EdB's latest patch is a good solution:
http://lists.freedesktop.org/archives/mesa-dev/2015-May/083661.html

-Tom

> There's a similar situation with the drirc file, which is installed
> inside the prefix by default but only read from /etc/.
> 
> 
> -- 
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer

Re: [Mesa-dev] [PATCH] docs: document the LIBGL_DRI3_DISABLE environment variable

2015-05-07 Thread Kenneth Graunke

On Thursday, May 07, 2015 05:34:13 PM Martin Peres wrote:
> Suggested-by: Axel Davy 
> Signed-off-by: Martin Peres 
> ---
>  docs/envvars.html | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/docs/envvars.html b/docs/envvars.html
> index 31d14a4..c0d5a51 100644
> --- a/docs/envvars.html
> +++ b/docs/envvars.html
> @@ -34,6 +34,7 @@ sometimes be useful for debugging end-user issues.
>  LIBGL_NO_DRAWARRAYS - if set do not use DrawArrays GLX protocol (for debugging)
>  LIBGL_SHOW_FPS - print framerate to stdout based on the number of glXSwapBuffers
>  calls per second.
> +LIBGL_DRI3_DISABLE - disable DRI3 if set (the value does not matter)
>  

Documentation?!? :)  Always nice to have.

Reviewed-by: Kenneth Graunke 
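
As an illustration of the documented wording "disable DRI3 if set (the value does not matter)", here is a minimal Python sketch of a presence-only check. It is not Mesa's implementation (which lives in C); the function name is hypothetical, and treating an empty string or "0" as "set" is an assumption that follows from reading the docs literally.

```python
import os


def dri3_disabled(environ=None):
    """Return True when LIBGL_DRI3_DISABLE is present, whatever its value.

    Hypothetical helper: only the variable's presence is tested, so an
    empty string or "0" still counts as "set" under this reading.
    """
    env = os.environ if environ is None else environ
    return "LIBGL_DRI3_DISABLE" in env
```

The practical consequence for users is that LIBGL_DRI3_DISABLE=0 would not re-enable DRI3; the variable has to be unset.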



Re: [Mesa-dev] [PATCH] i965/skl: In opt_sampler_eot always set destination register to null

2015-05-07 Thread Anuj Phogat

On Thu, May 7, 2015 at 6:20 AM, Neil Roberts  wrote:
> opt_sampler_eot enables a direct write to framebuffer from a sample.
> In order to do this the sample message needs to have a message header
> so if there wasn't one already then the function adds one. In addition
> the function sets the destination register to null because it's no
> longer used. However it was only doing this in cases where it was
> adding a message header. This patch just moves setting the destination
> so that it happens even if there's a message header. In practice this
> doesn't seem to make any difference but it's a bit cleaner.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 1ca7ca6..72d408b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2675,6 +2675,7 @@ fs_visitor::opt_sampler_eot()
>
> tex_inst->offset |= fb_write->target << 24;
> tex_inst->eot = true;
> +   tex_inst->dst = reg_null_ud;
> fb_write->remove(cfg->blocks[cfg->num_blocks - 1]);
>
> /* If a header is present, marking the eot is sufficient. Otherwise, we need
> @@ -2712,7 +2713,6 @@ fs_visitor::opt_sampler_eot()
> tex_inst->header_present = true;
> tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
> tex_inst->src[0] = send_header;
> -   tex_inst->dst = reg_null_ud;
>
> return true;
>  }
> --
> 1.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

LGTM.
Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] [PATCH] i965/fs: Set the header_size on LOAD_PAYLOAD in opt_sampler_eot

2015-05-07 Thread Anuj Phogat

On Thu, May 7, 2015 at 11:06 AM, Neil Roberts  wrote:
> Commit 94ee908448 added a header size parameter to the function to
> create the LOAD_PAYLOAD instruction. However this broke
> opt_sampler_eot which manually constructs the instruction and so
> wasn't setting the header_size. This ends up making the parameters for
> the send message all have the wrong location and it all falls apart.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 3bf5866..02a1ad5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2701,6 +2701,7 @@ fs_visitor::opt_sampler_eot()
>  load_payload->sources + 1);
>
> new_load_payload->regs_written = load_payload->regs_written + 1;
> +   new_load_payload->header_size = 1;
> tex_inst->mlen++;
> tex_inst->header_size = 1;
> tex_inst->insert_before(cfg->blocks[cfg->num_blocks - 1], new_load_payload);
> --
> 1.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 

Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread EdB


On 2015-05-07 18:55, Aaron Watry wrote:

> I'm not sure what the final consensus will be on how to do this, but
> FWIW:
> Tested-By: Aaron Watry
>
> I've tested this with 4 combinations:
> no --with-opencl-icd option specified : libOpenCL.so gets installed in
> ${prefix}/lib
> --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib
> --with-opencl-icd=standard : libMesaOpenCL.so installed in
> ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
> --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in
> ${prefix}/lib, icd in ${prefix}/etc//mesa.icd.  I only specified
> --prefix, no other directories overridden in configure command.

thanks

  EdB

> --Aaron
>
> On Wed, May 6, 2015 at 4:34 PM, EdB wrote:


The standard ICD file path is /etc/OpenCL/vendors/.
However, it doesn't fit well with custom builds.
This option allows overriding the ICD vendor file installation path.
---
 configure.ac                           | 46 +++---
 src/gallium/targets/opencl/Makefile.am |  2 +-
 2 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/configure.ac b/configure.ac
index 095e23e..90dba4e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
          [enable OpenCL library @<:@default=disabled@:>@])],
    [enable_opencl="$enableval"],
    [enable_opencl=no])
-AC_ARG_ENABLE([opencl_icd],
-   [AS_HELP_STRING([--enable-opencl-icd],
-          [Build an OpenCL ICD library to be loaded by an ICD
implementation
-           @<:@default=disabled@:>@])],
-    [enable_opencl_icd="$enableval"],
-    [enable_opencl_icd=no])
 AC_ARG_ENABLE([xlib-glx],
     [AS_HELP_STRING([--enable-xlib-glx],
         [make GLX library Xlib-based instead of DRI-based
@<:@default=disabled@:>@])],
@@ -1689,19 +1683,11 @@ if test "x$enable_opencl" = xyes; then
     # XXX: Use $enable_shared_pipe_drivers once converted to
use static/shared pipe-drivers
     enable_gallium_loader=yes

-    if test "x$enable_opencl_icd" = xyes; then
-        OPENCL_LIBNAME="MesaOpenCL"
-    else
-        OPENCL_LIBNAME="OpenCL"
-    fi
-
     if test "x$have_libelf" != xyes; then
        AC_MSG_ERROR([Clover requires libelf])
     fi
 fi
 AM_CONDITIONAL(HAVE_CLOVER, test "x$enable_opencl" = xyes)
-AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$enable_opencl_icd" = xyes)
-AC_SUBST([OPENCL_LIBNAME])

 dnl
 dnl Gallium configuration
@@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
     [D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
 AC_SUBST([D3D_DRIVER_INSTALL_DIR])

+dnl OpenCL ICD
+
+AC_ARG_WITH([opencl-icd],
+   
[AS_HELP_STRING([--with-opencl-icd=@<:@no,standard,sysconfdir@:>@],
+        [Build an OpenCL ICD library to be loaded by an ICD
implementation.
+         If @<:@standard@:>@ the OpenCL ICD vendor file
installs in /etc/OpenCL/vendors.
+         @<:@sysconfdir@:>@ installs the file in
$sysconfdir/OpenCL/vendors
+         @<:@default=no@:>@])],
+    [OPENCL_ICD="$withval"],
+    [OPENCL_ICD="no"])
+
+case "x$OPENCL_ICD" in
+xno)
+    OPENCL_LIBNAME="OpenCL"
+    ;;
+xstandard)
+    OPENCL_LIBNAME="MesaOpenCL"
+    ICD_FILE_DIR="/etc/OpenCL/vendors"
+    ;;
+xsysconfdir)
+    OPENCL_LIBNAME="MesaOpenCL"
+    ICD_FILE_DIR="$sysconfdir/OpenCL/vendors"
+    ;;
+*)
+    AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for
--with-opencl-icd])
+    ;;
+esac
+
+AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$OPENCL_ICD" != xno)
+AC_SUBST([OPENCL_LIBNAME])
+AC_SUBST([ICD_FILE_DIR])
+
 dnl
 dnl Gallium helper functions
 dnl
diff --git a/src/gallium/targets/opencl/Makefile.am
b/src/gallium/targets/opencl/Makefile.am
index 5daf327..781daa0 100644
--- a/src/gallium/targets/opencl/Makefile.am
+++ b/src/gallium/targets/opencl/Makefile.am
@@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES =
opencl.sym
 EXTRA_DIST = mesa.icd opencl.sym

 if HAVE_CLOVER_ICD
-icddir = /etc/OpenCL/vendors/
+icddir = $(ICD_FILE_DIR)
 icd_DATA = mesa.icd
 endif

--
2.1.0


[Mesa-dev] [PATCH] i965/wm/gen7: Refactor state setup

2015-05-07 Thread Topi Pohjolainen

CC: Kenneth Graunke 
Signed-off-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_state.h |  9 +++
 src/mesa/drivers/dri/i965/gen7_wm_state.c | 98 ---
 2 files changed, 74 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h
index 26fdae6..5a52a74 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -264,6 +264,15 @@ void brw_update_renderbuffer_surfaces(struct brw_context *brw,
 
 /* gen7_wm_state.c */
 void
+gen7_upload_wm_state(struct brw_context *brw,
+ const struct gl_program *fp,
+ const struct brw_wm_prog_data *prog_data,
+ bool multisampled_fbo, int min_inv_per_frag,
+ bool kill_enable, bool color_buffer_write_enable,
+ bool msaa_enabled, bool statistic_enable,
+ bool line_stipple_enable, bool polygon_stipple_enable);
+
+void
 gen7_upload_ps_state(struct brw_context *brw,
  const struct gl_fragment_program *fp,
  const struct brw_stage_state *stage_state,
diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c
index b918275..b3fa5be 100644
--- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
@@ -32,63 +32,53 @@
 #include "program/prog_statevars.h"
 #include "intel_batchbuffer.h"
 
-static void
-upload_wm_state(struct brw_context *brw)
+void
+gen7_upload_wm_state(struct brw_context *brw,
+ const struct gl_program *fp,
+ const struct brw_wm_prog_data *prog_data,
+ bool multisampled_fbo, int min_inv_per_frag,
+ bool kill_enable, bool color_buffer_write_enable,
+ bool msaa_enabled, bool statistic_enable,
+ bool line_stipple_enable, bool polygon_stipple_enable)
 {
-   struct gl_context *ctx = &brw->ctx;
-   /* BRW_NEW_FRAGMENT_PROGRAM */
-   const struct brw_fragment_program *fp =
-  brw_fragment_program_const(brw->fragment_program);
-   /* BRW_NEW_FS_PROG_DATA */
-   const struct brw_wm_prog_data *prog_data = brw->wm.prog_data;
bool writes_depth = prog_data->computed_depth_mode != BRW_PSCDEPTH_OFF;
uint32_t dw1, dw2;
 
-   /* _NEW_BUFFERS */
-   bool multisampled_fbo = ctx->DrawBuffer->Visual.samples > 1;
-
dw1 = dw2 = 0;
-   dw1 |= GEN7_WM_STATISTICS_ENABLE;
+
+   if (statistic_enable)
+  dw1 |= GEN7_WM_STATISTICS_ENABLE;
+
dw1 |= GEN7_WM_LINE_AA_WIDTH_1_0;
dw1 |= GEN7_WM_LINE_END_CAP_AA_WIDTH_0_5;
 
-   /* _NEW_LINE */
-   if (ctx->Line.StippleFlag)
+   if (line_stipple_enable)
   dw1 |= GEN7_WM_LINE_STIPPLE_ENABLE;
 
-   /* _NEW_POLYGON */
-   if (ctx->Polygon.StippleFlag)
+   if (polygon_stipple_enable)
   dw1 |= GEN7_WM_POLYGON_STIPPLE_ENABLE;
 
-   if (fp->program.Base.InputsRead & VARYING_BIT_POS)
+   if (fp->InputsRead & VARYING_BIT_POS)
   dw1 |= GEN7_WM_USES_SOURCE_DEPTH | GEN7_WM_USES_SOURCE_W;
 
dw1 |= prog_data->computed_depth_mode << GEN7_WM_COMPUTED_DEPTH_MODE_SHIFT;
dw1 |= prog_data->barycentric_interp_modes <<
   GEN7_WM_BARYCENTRIC_INTERPOLATION_MODE_SHIFT;
 
-   /* _NEW_COLOR, _NEW_MULTISAMPLE */
-   /* Enable if the pixel shader kernel generates and outputs oMask.
-*/
-   if (prog_data->uses_kill || ctx->Color.AlphaEnabled ||
-   ctx->Multisample.SampleAlphaToCoverage ||
-   prog_data->uses_omask) {
+   if (kill_enable)
   dw1 |= GEN7_WM_KILL_ENABLE;
-   }
 
-   /* _NEW_BUFFERS | _NEW_COLOR */
-   if (brw_color_buffer_write_enabled(brw) || writes_depth ||
-   dw1 & GEN7_WM_KILL_ENABLE) {
+   if (color_buffer_write_enable || writes_depth ||
+   dw1 & GEN7_WM_KILL_ENABLE)
   dw1 |= GEN7_WM_DISPATCH_ENABLE;
-   }
+
if (multisampled_fbo) {
-  /* _NEW_MULTISAMPLE */
-  if (ctx->Multisample.Enabled)
+  if (msaa_enabled)
  dw1 |= GEN7_WM_MSRAST_ON_PATTERN;
   else
  dw1 |= GEN7_WM_MSRAST_OFF_PIXEL;
 
-  if (_mesa_get_min_invocations_per_fragment(ctx, brw->fragment_program, 
false) > 1)
+  if (min_inv_per_frag > 1)
  dw2 |= GEN7_WM_MSDISPMODE_PERSAMPLE;
   else
  dw2 |= GEN7_WM_MSDISPMODE_PERPIXEL;
@@ -97,9 +87,8 @@ upload_wm_state(struct brw_context *brw)
   dw2 |= GEN7_WM_MSDISPMODE_PERSAMPLE;
}
 
-   if (fp->program.Base.SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN) {
+   if (fp->SystemValuesRead & SYSTEM_BIT_SAMPLE_MASK_IN)
   dw1 |= GEN7_WM_USES_INPUT_COVERAGE_MASK;
-   }
 
BEGIN_BATCH(3);
OUT_BATCH(_3DSTATE_WM << 16 | (3 - 2));
@@ -108,6 +97,49 @@ upload_wm_state(struct brw_context *brw)
ADVANCE_BATCH();
 }
 
+static void
+upload_wm_state(struct brw_context *brw)
+{
+   struct gl_context *ctx = &brw->ctx;
+   /* BRW_NEW_FRAGMENT_PROGRAM */
+   const struct brw_fragment_program *fp =
+  brw_fragment_prog

Re: [Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1

2015-05-07 Thread Ian Romanick

On 05/07/2015 12:57 AM, Marta Lofstedt wrote:
> From: Marta Lofstedt 
> 
> v2: only expose enums from GL_ARB_shader_image_load_store
> for gles 3.1 and GL core
> 
> Signed-off-by: Marta Lofstedt 
> ---
>  src/mesa/main/get.c  |  6 ++
>  src/mesa/main/get_hash_params.py | 17 -
>  2 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
> index 9898197..73739b6 100644
> --- a/src/mesa/main/get.c
> +++ b/src/mesa/main/get.c
> @@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
> EXTRA_END
>  };
>  
> +static const int extra_ARB_shader_image_load_store_es31[] = {
> +   EXT(ARB_shader_image_load_store),
> +   EXTRA_API_ES31,

I think you're missing the patch that adds EXTRA_API_ES31.  Did you
forget to send that one out?

Also, on a few of these patches, I think the old, non-_es31 set of
requirements can be removed due to no longer being used.

> +   EXTRA_END
> +};
> +
>  EXTRA_EXT(ARB_texture_cube_map);
>  EXTRA_EXT(EXT_texture_array);
>  EXTRA_EXT(NV_fog_distance);
> diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
> index 513d5d2..85c2494 100644
> --- a/src/mesa/main/get_hash_params.py
> +++ b/src/mesa/main/get_hash_params.py
> @@ -413,6 +413,14 @@ descriptor=[
>  { "apis": ["GL_CORE", "GLES3"], "params": [
>  # GL_ARB_draw_indirect / GLES 3.1
>[ "DRAW_INDIRECT_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, 
> extra_ARB_draw_indirect_es31" ],
> +# GL_ARB_shader_image_load_store / GLES 3.1
> +  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), extra_ARB_shader_image_load_store_es31"],
> +  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), extra_ARB_shader_image_load_store_es31"],
> +  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), extra_ARB_shader_image_load_store_es31"],
> +  [ "MAX_VERTEX_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), extra_ARB_shader_image_load_store_es31"],
> +  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), extra_ARB_shader_image_load_store_es31"],
> +  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), extra_ARB_shader_image_load_store_es31"],
> +  [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms), extra_ARB_shader_image_load_store_es31"],
>  ]},
>  
>  # Remaining enums are only in OpenGL
> @@ -780,15 +788,6 @@ descriptor=[
>[ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", 
> "CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ],
>[ "MAX_VERTEX_ATTRIB_BINDINGS", 
> "CONTEXT_ENUM(Const.MaxVertexAttribBindings), NO_EXTRA" ],
>  
> -# GL_ARB_shader_image_load_store
> -  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
> "CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_VERTEX_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
> extra_ARB_shader_image_load_store_and_geometry_shader"],
> -  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
> extra_ARB_shader_image_load_store"],
> -  [ "MAX_COMBINED_IMAGE_UNIFORMS", 
> "CONTEXT_INT(Const.MaxCombinedImageUniforms), 
> extra_ARB_shader_image_load_store"],
> -
>  # GL_ARB_compute_shader
>[ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
> "CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader" 
> ],
>[ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
> extra_ARB_compute_shader" ],
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Initial amdgpu driver release

2015-05-07 Thread Alex Deucher

On Mon, Apr 20, 2015 at 6:33 PM, Alex Deucher  wrote:
> I'm pleased to announce the initial release of the new amdgpu driver.
> This is a partial replacement for the radeon driver for newer AMD
> asics.  A number of components are still shared.  Here is a comparison
> of the radeon and amdgpu stacks:
>
> 1. radeon stack
> kernel driver: radeon.ko
> libdrm: libdrm_radeon
> mesa: radeon, r200, r300, r600, radeonsi
> ddx: xf86-video-ati
>
> 2. amdgpu stack
> kernel driver: amdgpu.ko
> libdrm: libdrm_amdgpu
> mesa: radeonsi
> ddx: xf86-video-amdgpu
>
> Older asics will continue to be supported by the radeon stack; new
> asics will be supported by the amdgpu stack.  CI (Sea Islands) asics
> have support in both driver stacks, but this is purely for testing
> purposes.  CI parts are officially supported in the radeon stack.
> Support for CI on the amdgpu stack is determined by a config option in
> the kernel.  CI support is not enabled by default for amdgpu.
>
> Most of our focus has been on Carrizo support, so there are some gaps
> in the dGPU support for Tonga and Iceland, notably power management.
> Those gaps will be filled in eventually.
>
> Also included in this code base are full register headers for just
> about every block on the asics.
>
> Barring the gaps mentioned above, the driver stack is functionally on
> par with radeon including:
> - OpenGL 3.3 support using the radeonsi mesa driver
> - Video decode support using UVD
> - Video encode support using VCE
>
> The code can be found in the amdgpu branches of the following git trees.
> xf86-video-amdgpu:
> http://cgit.freedesktop.org/~agd5f/xf86-video-amdgpu/log/?h=amdgpu
> libdrm:
> http://cgit.freedesktop.org/~agd5f/drm/log/?h=amdgpu
> kernel:
> http://cgit.freedesktop.org/~agd5f/linux/log/?h=amdgpu
> mesa:
> http://cgit.freedesktop.org/~mareko/mesa/log/?h=amdgpu

Some updates on the latest source locations:

xf86-video-amdgpu:
http://cgit.freedesktop.org/xorg/driver/xf86-video-amdgpu
libdrm:
http://cgit.freedesktop.org/~agd5f/drm/log/?h=amdgpu
kernel:
http://cgit.freedesktop.org/amd/drm-amd/
mesa:
http://cgit.freedesktop.org/mesa/mesa/log/?h=amdgpu

Alex


>
> To test the new driver stack you will need to specify a device section
> in your xorg.conf with the driver set to amdgpu rather than radeon.
>
> Please review!
>
> Thanks,
>
> The AMD Linux Driver Team

Re: [Mesa-dev] [PATCH 10/13] mesa/main: Check context pointer in _mesa_error before using it

2015-05-07 Thread Ian Romanick

On 05/07/2015 05:17 AM, Pohjolainen, Topi wrote:
> On Tue, May 05, 2015 at 02:25:26PM +0300, Juha-Pekka Heikkila wrote:
>> I guess this should not really be able to segfault but still it
>> seems to be able to during context creation.
>>
>> Signed-off-by: Juha-Pekka Heikkila 
>> ---
>>  src/mesa/main/errors.c | 26 --
>>  1 file changed, 16 insertions(+), 10 deletions(-)
>>
>> diff --git a/src/mesa/main/errors.c b/src/mesa/main/errors.c
>> index 2aa1deb..6631b82 100644
>> --- a/src/mesa/main/errors.c
>> +++ b/src/mesa/main/errors.c
>> @@ -1458,18 +1458,23 @@ _mesa_error( struct gl_context *ctx, GLenum error, 
>> const char *fmtString, ... )
>>  
> To me it looks like it would be better to just return early here:
> 
>   if (!ctx)
>  return;
> 
> This avoids extra indentation, and it doesn't seem meaningful to call
> should_output() with a null context.

I like that plan.

I don't think you can even get to _mesa_error (or _mesa_warning) without
a context.  Maybe add an assert(ctx != NULL)?
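
A minimal sketch of the early-return variant discussed above, assuming a NULL context should simply be ignored. The names (demo_context, demo_error, DEMO_STRICT) are hypothetical stand-ins, not Mesa's; the optional assert only illustrates where a debug-build check could sit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GL context; not Mesa's struct gl_context. */
struct demo_context {
   int error_value;   /* 0 means no error is pending */
};

/* Records an error and returns 1, or returns 0 when there is no context. */
static int
demo_error(struct demo_context *ctx, int error)
{
#ifdef DEMO_STRICT
   /* Hypothetical debug switch: trap callers that pass no context. */
   assert(ctx != NULL);
#endif

   /* Leave early so nothing below has to re-check ctx. */
   if (!ctx)
      return 0;

   /* Keep only the first pending error until it is read back. */
   if (ctx->error_value == 0)
      ctx->error_value = error;

   return 1;
}
```

With the guard at the top, the rest of the function can dereference ctx freely, which is what removes the extra indentation level.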

>> do_output = should_output(ctx, error, fmtString);
>>  
>> -   mtx_lock(&ctx->DebugMutex);
>> -   if (ctx->Debug) {
>> -  do_log = debug_is_message_enabled(ctx->Debug,
>> -MESA_DEBUG_SOURCE_API,
>> -MESA_DEBUG_TYPE_ERROR,
>> -error_msg_id,
>> -MESA_DEBUG_SEVERITY_HIGH);
>> +   if (ctx) {
>> +  mtx_lock(&ctx->DebugMutex);
>> +  if (ctx->Debug) {
>> + do_log = debug_is_message_enabled(ctx->Debug,
>> +   MESA_DEBUG_SOURCE_API,
>> +   MESA_DEBUG_TYPE_ERROR,
>> +   error_msg_id,
>> +   MESA_DEBUG_SEVERITY_HIGH);
>> +  }
>> +  else {
>> + do_log = GL_FALSE;
>> +  }
>> +  mtx_unlock(&ctx->DebugMutex);
>> }
>> else {
>>do_log = GL_FALSE;
>> }
>> -   mtx_unlock(&ctx->DebugMutex);
>>  
>> if (do_output || do_log) {
>>char s[MAX_DEBUG_MESSAGE_LENGTH], s2[MAX_DEBUG_MESSAGE_LENGTH];
>> @@ -1502,14 +1507,15 @@ _mesa_error( struct gl_context *ctx, GLenum error, 
>> const char *fmtString, ... )
>>}
>>  
>>/* Log the error via ARB_debug_output if needed.*/
>> -  if (do_log) {
>> +  if (ctx && do_log) {
>>   log_msg(ctx, MESA_DEBUG_SOURCE_API, MESA_DEBUG_TYPE_ERROR,
>>   error_msg_id, MESA_DEBUG_SEVERITY_HIGH, len, s2);
>>}
>> }
>>  
>> /* Set the GL context error state for glGetError. */
>> -   _mesa_record_error(ctx, error);
>> +   if (ctx)
>> +  _mesa_record_error(ctx, error);
>>  }
>>  
>>  void
>> -- 
>> 1.8.5.1
>>

Re: [Mesa-dev] [PATCH 13/13] mesa/main: Verify context creation on progress

2015-05-07 Thread Ian Romanick

On 05/07/2015 05:21 AM, Pohjolainen, Topi wrote:
> On Tue, May 05, 2015 at 02:25:29PM +0300, Juha-Pekka Heikkila wrote:
>> Stop context creation if something failed. If something errored
>> during context creation we'd segfault. Now we clean up and
>> return an error.
>>
>> Signed-off-by: Juha-Pekka Heikkila 
>> ---
>>  src/mesa/main/shared.c | 66 
>> +++---
>>  1 file changed, 62 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/main/shared.c b/src/mesa/main/shared.c
>> index 0b76cc0..cc05b05 100644
>> --- a/src/mesa/main/shared.c
>> +++ b/src/mesa/main/shared.c
>> @@ -64,9 +64,21 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>>  
>> mtx_init(&shared->Mutex, mtx_plain);
>>  
>> +   /* Mutex and timestamp for texobj state validation */
>> +   mtx_init(&shared->TexMutex, mtx_recursive);
>> +   shared->TextureStateStamp = 0;
> 
> Do you really need to move this here?

I was going to ask the same thing.  I think moving it here means that it
can be unconditionally mtx_destroy'ed in the error path below.
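
A rough sketch of why the ordering matters, using hypothetical stand-ins (demo_lock, demo_shared, demo_alloc_shared) rather than Mesa's types: if both locks are set up before the first step that can fail, a single error_out label can tear everything down without tracking how far initialization got.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are Mesa types. */
struct demo_lock { int initialized; };

struct demo_shared {
   struct demo_lock lock_a;
   struct demo_lock lock_b;
   void *table_a;
   void *table_b;
};

static void demo_lock_init(struct demo_lock *l) { l->initialized = 1; }

static void
demo_lock_destroy(struct demo_lock *l)
{
   /* Destroying a lock that was never initialized would be a bug. */
   assert(l->initialized);
   l->initialized = 0;
}

/* fail_at selects which allocation pretends to fail (0 = none). */
static struct demo_shared *
demo_alloc_shared(int fail_at)
{
   struct demo_shared *shared = calloc(1, sizeof(*shared));
   if (!shared)
      return NULL;

   /* Both locks are set up before the first point that can fail. */
   demo_lock_init(&shared->lock_a);
   demo_lock_init(&shared->lock_b);

   shared->table_a = (fail_at == 1) ? NULL : malloc(16);
   if (!shared->table_a)
      goto error_out;

   shared->table_b = (fail_at == 2) ? NULL : malloc(16);
   if (!shared->table_b)
      goto error_out;

   return shared;

error_out:
   /* free(NULL) is a no-op, so untouched slots need no special casing. */
   free(shared->table_b);
   free(shared->table_a);
   demo_lock_destroy(&shared->lock_b);
   demo_lock_destroy(&shared->lock_a);
   free(shared);
   return NULL;
}

/* Normal teardown for a fully constructed object. */
static void
demo_free_shared(struct demo_shared *shared)
{
   free(shared->table_b);
   free(shared->table_a);
   demo_lock_destroy(&shared->lock_b);
   demo_lock_destroy(&shared->lock_a);
   free(shared);
}
```

Had lock_b been initialized after table_a, the error path would need to know whether it was reached before destroying it.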

>> +
>> shared->DisplayList = _mesa_NewHashTable();
>> +   if (!shared->DisplayList)
>> +  goto error_out;
>> +
>> shared->TexObjects = _mesa_NewHashTable();
>> +   if (!shared->TexObjects)
>> +  goto error_out;
>> +
>> shared->Programs = _mesa_NewHashTable();
>> +   if (!shared->Programs)
>> +  goto error_out;
>>  
>> shared->DefaultVertexProgram =
>>gl_vertex_program(ctx->Driver.NewProgram(ctx,
>> @@ -76,17 +88,28 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>>   GL_FRAGMENT_PROGRAM_ARB, 
>> 0));
>>  
>> shared->ATIShaders = _mesa_NewHashTable();
>> +   if (!shared->ATIShaders)
>> +  goto error_out;
>> +
>> shared->DefaultFragmentShader = _mesa_new_ati_fragment_shader(ctx, 0);
>>  
>> shared->ShaderObjects = _mesa_NewHashTable();
>> +   if (!shared->ShaderObjects)
>> +  goto error_out;
>>  
>> shared->BufferObjects = _mesa_NewHashTable();
>> +   if (!shared->BufferObjects)
>> +  goto error_out;
>>  
>> /* GL_ARB_sampler_objects */
>> shared->SamplerObjects = _mesa_NewHashTable();
>> +   if (!shared->SamplerObjects)
>> +  goto error_out;
>>  
>> /* Allocate the default buffer object */
>> shared->NullBufferObj = ctx->Driver.NewBufferObject(ctx, 0);
>> +   if (!shared->NullBufferObj)
>> +   goto error_out;
>>  
>> /* Create default texture objects */
>> for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
>> @@ -107,22 +130,57 @@ _mesa_alloc_shared_state(struct gl_context *ctx)
>>};
>>STATIC_ASSERT(ARRAY_SIZE(targets) == NUM_TEXTURE_TARGETS);
>>shared->DefaultTex[i] = ctx->Driver.NewTextureObject(ctx, 0, 
>> targets[i]);
>> +
>> +  if (!shared->DefaultTex[i])
>> +  goto error_out;
>> }
>>  
>> /* sanity check */
>> assert(shared->DefaultTex[TEXTURE_1D_INDEX]->RefCount == 1);
>>  
>> -   /* Mutex and timestamp for texobj state validation */
>> -   mtx_init(&shared->TexMutex, mtx_recursive);
>> -   shared->TextureStateStamp = 0;
>> -
>> shared->FrameBuffers = _mesa_NewHashTable();
>> +   if (!shared->FrameBuffers)
>> +  goto error_out;
>> +
>> shared->RenderBuffers = _mesa_NewHashTable();
>> +   if (!shared->RenderBuffers)
>> +  goto error_out;
>>  
>> shared->SyncObjects = _mesa_set_create(NULL, _mesa_hash_pointer,
>>_mesa_key_pointer_equal);
>> +   if (!shared->SyncObjects)
>> +  goto error_out;
>>  
>> return shared;
>> +
>> +error_out:
>> +   for (i = 0; i < NUM_TEXTURE_TARGETS; i++) {
>> +  if (shared->DefaultTex[i]) {
>> + ctx->Driver.DeleteTexture(ctx, shared->DefaultTex[i]);
>> +  }
>> +   }
>> +
>> +   _mesa_reference_buffer_object(ctx, &shared->NullBufferObj, NULL);
>> +
>> +   _mesa_DeleteHashTable(shared->RenderBuffers);
>> +   _mesa_DeleteHashTable(shared->FrameBuffers);
>> +   _mesa_DeleteHashTable(shared->SamplerObjects);
>> +   _mesa_DeleteHashTable(shared->BufferObjects);
>> +   _mesa_DeleteHashTable(shared->ShaderObjects);
>> +   _mesa_DeleteHashTable(shared->ATIShaders);
>> +   _mesa_DeleteHashTable(shared->Programs);
>> +   _mesa_DeleteHashTable(shared->TexObjects);
>> +   _mesa_DeleteHashTable(shared->DisplayList);
>> +
>> +   _mesa_reference_vertprog(ctx, &shared->DefaultVertexProgram, NULL);
>> +   _mesa_reference_geomprog(ctx, &shared->DefaultGeometryProgram, NULL);
>> +   _mesa_reference_fragprog(ctx, &shared->DefaultFragmentProgram, NULL);
>> +
>> +   mtx_destroy(&shared->Mutex);
>> +   mtx_destroy(&shared->TexMutex);
>> +
>> +   free(shared);
>> +   return NULL;
>>  }
>>  
>>  
>> -- 
>> 1.8.5.1
>>

Re: [Mesa-dev] [PATCH 1/5] prog_to_nir: OPCODE_EXP is not nir_op_fexp

2015-05-07 Thread Ian Romanick

On 05/07/2015 07:30 AM, Jason Ekstrand wrote:
> On Wed, May 6, 2015 at 7:29 PM, Matt Turner  wrote:
>> On Wed, May 6, 2015 at 7:09 PM, Ian Romanick  wrote:
>>> From: Ian Romanick 
>>>
>>> It's a weird thing that provides some values related to 2**x.  It's also
>>> already handled by a case in the switch.
>>>
>>> Signed-off-by: Ian Romanick 
>>
>> The series is
>>
>> Reviewed-by: Matt Turner 
> 
> I was going to complain about you making my SPIR-V -> NIR translator
> harder to write.  But, based on the discussion by Ken and Ilia on IRC,
> it looks like basically no one's hardware does a base-e log.  I'll
> just lower on-the-fly.  I guess maybe we could do it with pow(e, x)
> but meh.  If you'd like, the series is

Right.  We currently unconditionally lower exp(x) to exp2(x * M_LOG2E)
in the GLSL IR lowering code.  I believe we picked that lowering because
some older architectures lack a pow instruction.  It may be worth trying
the other way to see if we get better code.
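
For reference, the lowering above is a change of base, written out here as a worked identity (M_LOG2E being the math.h constant for log2(e)):

```latex
e^{x} = \left(2^{\log_2 e}\right)^{x} = 2^{\,x \log_2 e},
\qquad \log_2 e = \frac{1}{\ln 2} \approx 1.442695
```

So exp(x) and exp2(x * M_LOG2E) agree up to rounding, at the cost of one extra multiply.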

> Acked-by: Jason Ekstrand 
> 
> I can't say I read it enough to call it a review but I glanced through
> it and it seems ok.
> --Jason
> 


Re: [Mesa-dev] [PATCH 13/13] SQUASH: nir: Update various components for the new list-based use/def sets

2015-05-07 Thread Connor Abbott

On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand  wrote:
> ---
>  src/glsl/nir/nir_from_ssa.c | 11 +--
>  src/glsl/nir/nir_lower_locals_to_regs.c | 14 ++
>  src/glsl/nir/nir_lower_to_source_mods.c | 20 
>  src/glsl/nir/nir_lower_vars_to_ssa.c|  3 ++-
>  src/glsl/nir/nir_opt_gcm.c  | 14 ++
>  src/glsl/nir/nir_opt_global_to_local.c  | 13 ++---
>  src/glsl/nir/nir_opt_peephole_ffma.c|  9 -
>  src/glsl/nir/nir_opt_peephole_select.c  | 10 --
>  src/glsl/nir/nir_to_ssa.c   | 19 ++-
>  9 files changed, 55 insertions(+), 58 deletions(-)
>
> diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
> index 5e7deca..94d1ced 100644
> --- a/src/glsl/nir/nir_from_ssa.c
> +++ b/src/glsl/nir/nir_from_ssa.c
> @@ -345,6 +345,7 @@ isolate_phi_nodes_block(nir_block *block, void 
> *void_state)
>
>   nir_parallel_copy_entry *entry = rzalloc(state->dead_ctx,
>nir_parallel_copy_entry);
> + entry->src.parent_instr = &pcopy->instr;

I don't think this change, or the one immediately below, are needed
since nir_instr_rewrite_uses() will already set the parent_instr.

>   nir_ssa_dest_init(&pcopy->instr, &entry->dest,
> phi->dest.ssa.num_components, src->src.ssa->name);
>   exec_list_push_tail(&pcopy->entries, &entry->node);
> @@ -358,6 +359,7 @@ isolate_phi_nodes_block(nir_block *block, void 
> *void_state)
>
>nir_parallel_copy_entry *entry = rzalloc(state->dead_ctx,
> nir_parallel_copy_entry);
> +  entry->src.parent_instr = &block_pcopy->instr;
>nir_ssa_dest_init(&block_pcopy->instr, &entry->dest,
>  phi->dest.ssa.num_components, phi->dest.ssa.name);
>exec_list_push_tail(&block_pcopy->entries, &entry->node);
> @@ -503,7 +505,7 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
> }
>
> nir_ssa_def_rewrite_uses(def, nir_src_for_reg(reg), state->mem_ctx);
> -   assert(def->uses->entries == 0 && def->if_uses->entries == 0);
> +   assert(list_empty(&def->uses) && list_empty(&def->if_uses));
>
> if (def->parent_instr->type == nir_instr_type_ssa_undef)
>return true;
> @@ -515,12 +517,9 @@ rewrite_ssa_def(nir_ssa_def *def, void *void_state)
>  */
> nir_dest *dest = exec_node_data(nir_dest, def, ssa);
>
> -   _mesa_set_destroy(dest->ssa.uses, NULL);
> -   _mesa_set_destroy(dest->ssa.if_uses, NULL);
> -
> *dest = nir_dest_for_reg(reg);
> -
> -   _mesa_set_add(reg->defs, state->instr);
> +   dest->reg.parent_instr = state->instr;
> +   list_addtail(&dest->reg.def_link, &reg->defs);
>
> return true;
>  }
> diff --git a/src/glsl/nir/nir_lower_locals_to_regs.c 
> b/src/glsl/nir/nir_lower_locals_to_regs.c
> index bc6a3d3..28fdec5 100644
> --- a/src/glsl/nir/nir_lower_locals_to_regs.c
> +++ b/src/glsl/nir/nir_lower_locals_to_regs.c
> @@ -269,18 +269,16 @@ lower_locals_to_regs_block(nir_block *block, void 
> *void_state)
>  static nir_block *
>  compute_reg_usedef_lca(nir_register *reg)
>  {
> -   struct set_entry *entry;
> nir_block *lca = NULL;
>
> -   set_foreach(reg->defs, entry)
> -  lca = nir_dominance_lca(lca, ((nir_instr *)entry->key)->block);
> +   list_for_each_entry(nir_dest, def_dest, &reg->defs, reg.def_link)
> +  lca = nir_dominance_lca(lca, def_dest->reg.parent_instr->block);
>
> -   set_foreach(reg->uses, entry)
> -  lca = nir_dominance_lca(lca, ((nir_instr *)entry->key)->block);
> +   list_for_each_entry(nir_src, use_src, &reg->uses, use_link)
> +  lca = nir_dominance_lca(lca, use_src->parent_instr->block);
>
> -   set_foreach(reg->if_uses, entry) {
> -  nir_if *if_stmt = (nir_if *)entry->key;
> -  nir_cf_node *prev_node = nir_cf_node_prev(&if_stmt->cf_node);
> +   list_for_each_entry(nir_src, use_src, &reg->if_uses, use_link) {
> +  nir_cf_node *prev_node = 
> nir_cf_node_prev(&use_src->parent_if->cf_node);
>assert(prev_node->type == nir_cf_node_block);
>lca = nir_dominance_lca(lca, nir_cf_node_as_block(prev_node));
> }
> diff --git a/src/glsl/nir/nir_lower_to_source_mods.c 
> b/src/glsl/nir/nir_lower_to_source_mods.c
> index 7b4a0f6..94c7e36 100644
> --- a/src/glsl/nir/nir_lower_to_source_mods.c
> +++ b/src/glsl/nir/nir_lower_to_source_mods.c
> @@ -88,8 +88,8 @@ nir_lower_to_source_mods_block(nir_block *block, void 
> *state)
>  alu->src[i].swizzle[j] = 
> parent->src[0].swizzle[alu->src[i].swizzle[j]];
>   }
>
> - if (parent->dest.dest.ssa.uses->entries == 0 &&
> - parent->dest.dest.ssa.if_uses->entries == 0)
> + if (list_empty(&parent->dest.dest.ssa.uses) &&
> + list_empty(&parent->dest.dest.ssa.if_uses))
>  nir_instr_remove(&parent->instr);
>}
>
> @@ -131,13 +131,13 @@ nir_lower_to_source_mods_block(nir_block *bloc

Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread Jan Vesely

On Thu, 2015-05-07 at 21:52 +0200, EdB wrote:
> On 2015-05-07 18:55, Aaron Watry wrote:
> > I'm not sure what the final consensus will be on how to do this, but
> > FWIW:
> > Tested-By: Aaron Watry 
> > 
> > I've tested this with 4 combinations:
> > no --with-opencl-icd option specified : libOpenCL.so gets installed in
> > ${prefix}/lib
> > --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib
> > --with-opencl-icd=standard : libMesaOpenCL.so installed in
> > ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
> > --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in
> > ${prefix}/lib, icd in ${prefix}/etc//mesa.icd.  I only specified
> > --prefix, no other directories overridden in configure command.

shouldn't this part go to ${prefix}/etc/OpenCL/vendors?
Is it just a typo or did it install to ${prefix}/etc//?

jan

> > 
> 
> thanks
> 
>EdB
> 
> > --Aaron
> > 
> >  
> > 
> > On Wed, May 6, 2015 at 4:34 PM, EdB  wrote:
> > 
> >> The standard ICD file path is /etc/OpenCL/vendors/.
> >> However, it doesn't fit well with custom builds.
> >> This option allows overriding the ICD vendor file installation path.
> >> ---
> >>  configure.ac                           | 46 +++---
> >>  src/gallium/targets/opencl/Makefile.am |  2 +-
> >>  2 files changed, 33 insertions(+), 15 deletions(-)
> >> 
> >> diff --git a/configure.ac b/configure.ac
> >> index 095e23e..90dba4e 100644
> >> --- a/configure.ac
> >> +++ b/configure.ac
> >> @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
> >>   [enable OpenCL library @<:@default=disabled@:>@])],
> >> [enable_opencl="$enableval"],
> >> [enable_opencl=no])
> >> -AC_ARG_ENABLE([opencl_icd],
> >> -   [AS_HELP_STRING([--enable-opencl-icd],
> >> -  [Build an OpenCL ICD library to be loaded by an ICD
> >> implementation
> >> -   @<:@default=disabled@:>@])],
> >> -[enable_opencl_icd="$enableval"],
> >> -[enable_opencl_icd=no])
> >>  AC_ARG_ENABLE([xlib-glx],
> >>  [AS_HELP_STRING([--enable-xlib-glx],
> >>  [make GLX library Xlib-based instead of DRI-based
> >> @<:@default=disabled@:>@])],
> >> @@ -1689,19 +1683,11 @@ if test "x$enable_opencl" = xyes; then
> >>  # XXX: Use $enable_shared_pipe_drivers once converted to
> >> use static/shared pipe-drivers
> >>  enable_gallium_loader=yes
> >> 
> >> -if test "x$enable_opencl_icd" = xyes; then
> >> -OPENCL_LIBNAME="MesaOpenCL"
> >> -else
> >> -OPENCL_LIBNAME="OpenCL"
> >> -fi
> >> -
> >>  if test "x$have_libelf" != xyes; then
> >> AC_MSG_ERROR([Clover requires libelf])
> >>  fi
> >>  fi
> >>  AM_CONDITIONAL(HAVE_CLOVER, test "x$enable_opencl" = xyes)
> >> -AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$enable_opencl_icd" = xyes)
> >> -AC_SUBST([OPENCL_LIBNAME])
> >> 
> >>  dnl
> >>  dnl Gallium configuration
> >> @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
> >>  [D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
> >>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
> >> 
> >> +dnl OpenCL ICD
> >> +
> >> +AC_ARG_WITH([opencl-icd],
> >> +   
> >> [AS_HELP_STRING([--with-opencl-icd=@<:@no,standard,sysconfdir@:>@],
> >> +[Build an OpenCL ICD library to be loaded by an ICD
> >> implementation.
> >> + If @<:@standard@:>@ the OpenCL ICD vendor file
> >> installs in /etc/OpenCL/vendors.
> >> + @<:@sysconfdir@:>@ installs the file in
> >> $sysconfdir/OpenCL/vendors
> >> + @<:@default=no@:>@])],
> >> +[OPENCL_ICD="$withval"],
> >> +[OPENCL_ICD="no"])
> >> +
> >> +case "x$OPENCL_ICD" in
> >> +xno)
> >> +OPENCL_LIBNAME="OpenCL"
> >> +;;
> >> +xstandard)
> >> +OPENCL_LIBNAME="MesaOpenCL"
> >> +ICD_FILE_DIR="/etc/OpenCL/vendors"
> >> +;;
> >> +xsysconfdir)
> >> +OPENCL_LIBNAME="MesaOpenCL"
> >> +ICD_FILE_DIR="$sysconfdir/OpenCL/vendors"
> >> +;;
> >> +*)
> >> +AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for
> >> --with-opencl-icd])
> >> +;;
> >> +esac
> >> +
> >> +AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$OPENCL_ICD" != xno)
> >> +AC_SUBST([OPENCL_LIBNAME])
> >> +AC_SUBST([ICD_FILE_DIR])
> >> +
> >>  dnl
> >>  dnl Gallium helper functions
> >>  dnl
> >> diff --git a/src/gallium/targets/opencl/Makefile.am
> >> b/src/gallium/targets/opencl/Makefile.am
> >> index 5daf327..781daa0 100644
> >> --- a/src/gallium/targets/opencl/Makefile.am
> >> +++ b/src/gallium/targets/opencl/Makefile.am
> >> @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES =
> >> opencl.sym
> >>  EXTRA_DIST = mesa.icd opencl.sym
> >> 
> >>  if HAVE_CLOVER_ICD
> >> -icddir = /etc/OpenCL/vendors/
> >> +icddir = $(ICD_FILE_DIR)
> >>  icd_DATA = mesa.icd
> >>  endif
> >> 
> >> --
> >> 2.1.0
> >> 

Re: [Mesa-dev] [PATCH 11/13] nir/nir: Use a linked list instead of a hash set for use/def sets

2015-05-07 Thread Connor Abbott

Based on the testing you did, it sounds like switching to linked lists
gives us some pretty good performance gains, but before we go ahead
with this you should collect some numbers using
http://anholt.net/compare-perf/ and put them on this commit message.
Comparing list vs. no-list as well as NIR vs. non-NIR might be useful,
so we can compare the time saved to the total time we spend doing
NIR-related things.

On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand  wrote:
> This commit switches us from the current setup of using hash sets for
> use/def sets to using linked lists.  Doing so should save us quite a bit of
> memory because we aren't carrying around 3 hash sets per register and 2 per
> SSA value.  It should also save us CPU time because adding/removing things
> from use/def sets is 4 pointer manipulations instead of a hash lookup.
>
> On the code complexity side of things, some things are now much easier and
> others are a bit harder.  One of the operations we perform constantly in
> optimization passes is to replace one source with another.  Due to the fact
> that an instruction can use the same SSA value multiple times, we had to
> iterate through the sources of the instruction and determine if the use we
> were replacing was the only one before removing it from the set of uses.
> With this patch, uses are per-source not per-instruction so we can just
> remove it safely.  On the other hand, trying to iterate over all of the
> instructions that use a given value is more difficult.  Fortunately, the
> two places we do that are the ffma peephole where it doesn't matter and GCM
> where we already gracefully handle duplicate visits to an instruction.
>
> Another aspect here is that using linked lists in this way can be tricky to
> get right.  With sets, things were quite forgiving and the worst that
> happened if you didn't properly remove a use was that it would get caught
> in the validator.  With linked lists, it can lead to linked list corruption
> which can be harder to track.  However, we do just as much validation of
> the linked lists as we did of the sets so the validator should still catch
> these problems.  While working on this series, the vast majority of the
> bugs I had to fix were caught by assertions.  I don't think the lists are
> going to be that much worse than the sets.
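
To make the trade-off described in the commit message concrete, here is a small sketch of per-source use tracking with an intrusive list. Everything here (demo_list, demo_value, demo_src and the helpers) is a hypothetical stand-in and not NIR's or Mesa's actual API; it only shows why rewriting one source needs no scan over the instruction's other sources.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list; a stand-in, not Mesa's list.h. */
struct demo_list { struct demo_list *prev, *next; };

static void demo_list_init(struct demo_list *h) { h->prev = h; h->next = h; }
static int demo_list_empty(const struct demo_list *h) { return h->next == h; }

/* Appending is four pointer writes, with no hashing or lookup. */
static void
demo_list_addtail(struct demo_list *item, struct demo_list *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

static void
demo_list_del(struct demo_list *item)
{
   /* Catch a double removal instead of silently corrupting the list. */
   assert(item->prev != NULL && item->next != NULL);
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = item->next = NULL;
}

/* A value owns the head of its use list. */
struct demo_value { struct demo_list uses; };

/* Each source carries its own link, so uses are per-source. */
struct demo_src {
   struct demo_list use_link;
   struct demo_value *value;
   int parent_instr;   /* hypothetical instruction id */
};

static void
demo_src_set(struct demo_src *src, struct demo_value *v, int instr)
{
   src->value = v;
   src->parent_instr = instr;
   demo_list_addtail(&src->use_link, &v->uses);
}

/* Point one source at a new value without touching sibling sources. */
static void
demo_src_rewrite(struct demo_src *src, struct demo_value *new_value)
{
   demo_list_del(&src->use_link);
   src->value = new_value;
   demo_list_addtail(&src->use_link, &new_value->uses);
}

static int
demo_count_uses(const struct demo_value *v)
{
   int n = 0;
   for (const struct demo_list *it = v->uses.next; it != &v->uses; it = it->next)
      n++;
   return n;
}
```

An instruction that reads the same value through two sources shows up twice in the use list, which is why any walk over "instructions using a value" has to tolerate duplicate visits.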
> ---
>  src/glsl/nir/nir.c  | 228 
> +++-
>  src/glsl/nir/nir.h  |  45 +++--
>  src/glsl/nir/nir_validate.c | 158 +++---
>  3 files changed, 194 insertions(+), 237 deletions(-)
>
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index b8f5dd4..be13c90 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -58,12 +58,9 @@ reg_create(void *mem_ctx, struct exec_list *list)
> nir_register *reg = ralloc(mem_ctx, nir_register);
>
> reg->parent_instr = NULL;
> -   reg->uses = _mesa_set_create(reg, _mesa_hash_pointer,
> -_mesa_key_pointer_equal);
> -   reg->defs = _mesa_set_create(reg, _mesa_hash_pointer,
> -_mesa_key_pointer_equal);
> -   reg->if_uses = _mesa_set_create(reg, _mesa_hash_pointer,
> -   _mesa_key_pointer_equal);
> +   list_inithead(&reg->uses);
> +   list_inithead(&reg->defs);
> +   list_inithead(&reg->if_uses);
>
> reg->num_components = 0;
> reg->num_array_elems = 0;
> @@ -1070,11 +1067,14 @@ update_if_uses(nir_cf_node *node)
>
> nir_if *if_stmt = nir_cf_node_as_if(node);
>
> -   struct set *if_uses_set = if_stmt->condition.is_ssa ?
> - if_stmt->condition.ssa->if_uses :
> - if_stmt->condition.reg.reg->uses;
> -
> -   _mesa_set_add(if_uses_set, if_stmt);
> +   if_stmt->condition.parent_if = if_stmt;
> +   if (if_stmt->condition.is_ssa) {
> +  list_addtail(&if_stmt->condition.use_link,
> +   &if_stmt->condition.ssa->if_uses);
> +   } else {
> +  list_addtail(&if_stmt->condition.use_link,
> +   &if_stmt->condition.reg.reg->if_uses);
> +   }
>  }
>
>  void
> @@ -1227,16 +1227,7 @@ cleanup_cf_node(nir_cf_node *node)
>foreach_list_typed(nir_cf_node, child, node, &if_stmt->else_list)
>   cleanup_cf_node(child);
>
> -  struct set *if_uses;
> -  if (if_stmt->condition.is_ssa) {
> - if_uses = if_stmt->condition.ssa->if_uses;
> -  } else {
> - if_uses = if_stmt->condition.reg.reg->if_uses;
> -  }
> -
> -  struct set_entry *entry = _mesa_set_search(if_uses, if_stmt);
> -  assert(entry);
> -  _mesa_set_remove(if_uses, entry);
> +  list_del(&if_stmt->condition.use_link);
>break;
> }
>
> @@ -1293,9 +1284,9 @@ add_use_cb(nir_src *src, void *state)
>  {
> nir_instr *instr = state;
>
> -   struct set *uses_set = src->is_ssa ? src->ssa->uses : src->reg.reg->uses;
> -
> -   _mesa_set_add(uses_set, instr);
> +   src->parent_instr = instr;
> +   list_

Re: [Mesa-dev] [PATCH 01/13] nir/validate: Validate SSA def parent instructiosn

2015-05-07 Thread Connor Abbott

I can't seem to find the cover email, so I'll respond to this one.
Aside from my comments on patches 11 and 13, patches 1-5 and 11-13 are

Reviewed-by: Connor Abbott 

and FWIW 6-10 are

Acked-by: Connor Abbott 

although what matters there is that other people test those and
make sure they don't break other things (particularly Windows).


On Tue, May 5, 2015 at 8:16 PM, Connor Abbott  wrote:
> Typo in the subject line.
>
> On Tue, Apr 28, 2015 at 12:03 AM, Jason Ekstrand  wrote:
>> ---
>>  src/glsl/nir/nir_validate.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/src/glsl/nir/nir_validate.c b/src/glsl/nir/nir_validate.c
>> index a7aa798..35a853d 100644
>> --- a/src/glsl/nir/nir_validate.c
>> +++ b/src/glsl/nir/nir_validate.c
>> @@ -236,6 +236,8 @@ validate_ssa_def(nir_ssa_def *def, validate_state *state)
>> assert(!BITSET_TEST(state->ssa_defs_found, def->index));
>> BITSET_SET(state->ssa_defs_found, def->index);
>>
>> +   assert(def->parent_instr == state->instr);
>> +
>> assert(def->num_components <= 4);
>>
>> ssa_def_validate_state *def_state = ralloc(state->ssa_defs,
>> --
>> 2.3.6
>>

Re: [Mesa-dev] [PATCH 4/5] nir: Translate image load, store and atomic intrinsics from GLSL IR.

2015-05-07 Thread Connor Abbott

On Tue, May 5, 2015 at 4:29 PM, Francisco Jerez  wrote:
> ---
>  src/glsl/nir/glsl_to_nir.cpp | 125 
> +++
>  1 file changed, 114 insertions(+), 11 deletions(-)
>
> diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
> index f6b8331..a01ab3b 100644
> --- a/src/glsl/nir/glsl_to_nir.cpp
> +++ b/src/glsl/nir/glsl_to_nir.cpp
> @@ -614,27 +614,130 @@ nir_visitor::visit(ir_call *ir)
>   op = nir_intrinsic_atomic_counter_inc_var;
>} else if (strcmp(ir->callee_name(), 
> "__intrinsic_atomic_predecrement") == 0) {
>   op = nir_intrinsic_atomic_counter_dec_var;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_load") == 0) {
> + op = nir_intrinsic_image_load;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_store") == 0) {
> + op = nir_intrinsic_image_store;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_add") 
> == 0) {
> + op = nir_intrinsic_image_atomic_add;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_min") 
> == 0) {
> + op = nir_intrinsic_image_atomic_min;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_max") 
> == 0) {
> + op = nir_intrinsic_image_atomic_max;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_and") 
> == 0) {
> + op = nir_intrinsic_image_atomic_and;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_or") == 
> 0) {
> + op = nir_intrinsic_image_atomic_or;
> +  } else if (strcmp(ir->callee_name(), "__intrinsic_image_atomic_xor") 
> == 0) {
> + op = nir_intrinsic_image_atomic_xor;
> +  } else if (strcmp(ir->callee_name(), 
> "__intrinsic_image_atomic_exchange") == 0) {
> + op = nir_intrinsic_image_atomic_exchange;
> +  } else if (strcmp(ir->callee_name(), 
> "__intrinsic_image_atomic_comp_swap") == 0) {
> + op = nir_intrinsic_image_atomic_comp_swap;
>} else {
>   unreachable("not reached");
>}
>
>nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader, op);
> -  ir_dereference *param =
> - (ir_dereference *) ir->actual_parameters.get_head();
> -  instr->variables[0] = evaluate_deref(&instr->instr, param);
> -  nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
> +
> +  switch (op) {
> +  case nir_intrinsic_atomic_counter_read_var:
> +  case nir_intrinsic_atomic_counter_inc_var:
> +  case nir_intrinsic_atomic_counter_dec_var: {
> + ir_dereference *param =
> +(ir_dereference *) ir->actual_parameters.get_head();
> + instr->variables[0] = evaluate_deref(&instr->instr, param);
> + nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
> + break;
> +  }
> +  case nir_intrinsic_image_load:
> +  case nir_intrinsic_image_store:
> +  case nir_intrinsic_image_atomic_add:
> +  case nir_intrinsic_image_atomic_min:
> +  case nir_intrinsic_image_atomic_max:
> +  case nir_intrinsic_image_atomic_and:
> +  case nir_intrinsic_image_atomic_or:
> +  case nir_intrinsic_image_atomic_xor:
> +  case nir_intrinsic_image_atomic_exchange:
> +  case nir_intrinsic_image_atomic_comp_swap: {
> + nir_load_const_instr *instr_zero = 
> nir_load_const_instr_create(shader, 1);
> + instr_zero->value.u[0] = 0;
> + nir_instr_insert_after_cf_list(this->cf_node_list, 
> &instr_zero->instr);
> +
> + /* Set the image variable dereference. */
> + exec_node *param = ir->actual_parameters.get_head();
> + ir_dereference *image = (ir_dereference *)param;
> + const glsl_type *type =
> +image->variable_referenced()->type->without_array();
> +
> + instr->variables[0] = evaluate_deref(&instr->instr, image);
> + param = param->get_next();
> +
> + /* Set the address argument, extending the coordinate vector to four
> +  * components.
> +  */
> + const nir_src src_addr = evaluate_rvalue((ir_dereference *)param);
> + nir_alu_instr *instr_addr = nir_alu_instr_create(shader, 
> nir_op_vec4);
> + nir_ssa_dest_init(&instr_addr->instr, &instr_addr->dest.dest, 4, 
> NULL);
> +
> + for (int i = 0; i < 4; i++) {
> +if (i < type->coordinate_components()) {
> +   instr_addr->src[i].src = src_addr;
> +   instr_addr->src[i].swizzle[0] = i;
> +} else {
> +   instr_addr->src[i].src = nir_src_for_ssa(&instr_zero->def);

I think it would better convey the intent to create an ssa_undef_instr
and use that here instead of zero. Unless something else relies on the
extra coordinates being zeroed?

> +}
> + }
> +
> + nir_instr_insert_after_cf_list(cf_node_list, &instr_addr->instr);
> + instr->src[0] = nir_src_for_ssa(&instr_addr->dest.dest.ssa);
> +

Re: [Mesa-dev] [PATCH 1/5] nir: Define image load, store and atomic intrinsics.

2015-05-07 Thread Connor Abbott

On IRC, Ken and I were discussing using a scheme inspired by SPIR-V,
which has an OpImagePointer instruction that forms a pointer to the
particular texel of the image as well as
OpAtomic{Load,Store,Exchange,etc.} that operate on an image or shared
buffer pointer. The advantages would be:

* Makes translating from SPIR-V easier.
* Reduces the number of intrinsics we need to add for SSBO support.
* Reduces the combinatorial explosion enough that we can have separate
versions for 2, 3, and 4 components and MS vs. non-MS without it being
unbearable. I'm not sure how much of a benefit that would be though.

The disadvantages I can think of are:

* Doesn't actually save any code in the i965 backend, since we need to
do different things depending on if the pointer is to an image or a
shared buffer anyways.
* We'd have to special case nir_convert_from_ssa to ignore the SSA
value that's really a pointer since we don't have any real type-level
support for pointers.
* Since we lower to SSA before converting to i965, there are some ugly
edge cases when the coordinate argument becomes part of a phi web and
gets potentially overwritten before the instruction that uses the
pointer.

I don't have a preference one way or the other, and I guess we could
always refactor it later if we wanted to, so assuming Ken is OK with
this, then besides one minor comment on patch 4 the series is

Reviewed-by: Connor Abbott 

On Tue, May 5, 2015 at 4:29 PM, Francisco Jerez  wrote:
> ---
>  src/glsl/nir/nir_intrinsics.h | 27 +++
>  1 file changed, 27 insertions(+)
>
> diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
> index 8e28765..4b13c75 100644
> --- a/src/glsl/nir/nir_intrinsics.h
> +++ b/src/glsl/nir/nir_intrinsics.h
> @@ -89,6 +89,33 @@ ATOMIC(inc, 0)
>  ATOMIC(dec, 0)
>  ATOMIC(read, NIR_INTRINSIC_CAN_ELIMINATE)
>
> +/*
> + * Image load, store and atomic intrinsics.
> + *
> + * All image intrinsics take an image target passed as a nir_variable.  Image
> + * variables contain a number of memory and layout qualifiers that influence
> + * the semantics of the intrinsic.
> + *
> + * All image intrinsics take a four-coordinate vector and a sample index as
> + * first two sources, determining the location within the image that will be
> + * accessed by the intrinsic.  Components not applicable to the image target
> + * in use are equal to zero by convention.  Image store takes an additional
> + * four-component argument with the value to be written, and image atomic
> + * operations take either one or two additional scalar arguments with the 
> same
> + * meaning as in the ARB_shader_image_load_store specification.
> + */
> +INTRINSIC(image_load, 2, ARR(4, 1), true, 4, 1, 0,
> +  NIR_INTRINSIC_CAN_ELIMINATE)
> +INTRINSIC(image_store, 3, ARR(4, 1, 4), false, 0, 1, 0, 0)
> +INTRINSIC(image_atomic_add, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
> +INTRINSIC(image_atomic_min, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
> +INTRINSIC(image_atomic_max, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
> +INTRINSIC(image_atomic_and, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
> +INTRINSIC(image_atomic_or, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
> +INTRINSIC(image_atomic_xor, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
> +INTRINSIC(image_atomic_exchange, 3, ARR(4, 1, 1), true, 1, 1, 0, 0)
> +INTRINSIC(image_atomic_comp_swap, 4, ARR(4, 1, 1, 1), true, 1, 1, 0, 0)
> +
>  #define SYSTEM_VALUE(name, components) \
> INTRINSIC(load_##name, 0, ARR(), true, components, 0, 0, \
> NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
> --
> 2.3.5
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/5] clover: Add threadsafe wrappers for pipe_screen and pipe_context

2015-05-07 Thread Tom Stellard

Events can be added to an OpenCL command queue concurrently from
multiple threads, but pipe_context and pipe_screen objects
are not threadsafe.  The threadsafe wrappers protect all pipe_screen
and pipe_context function calls with a mutex, so we can safely use
them with multiple threads.
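
For readers unfamiliar with the approach, here is a minimal sketch of the idea, assuming a made-up raw_context type in place of pipe_context.  The patch itself does this in C by wrapping each function pointer; the names raw_context, locked_context and run_concurrent_submits are purely illustrative and do not appear in Mesa.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a non-threadsafe object such as pipe_context.
struct raw_context {
   int submitted = 0;
   void submit() { ++submitted; }   // not safe to call concurrently
};

// Wrapper that serializes every call on the wrapped object with one mutex.
class locked_context {
public:
   explicit locked_context(raw_context &ctx) : ctx(ctx) {}

   void submit() {
      std::lock_guard<std::mutex> lock(mutex);
      ctx.submit();
   }

   int submitted() {
      std::lock_guard<std::mutex> lock(mutex);
      return ctx.submitted;
   }

private:
   raw_context &ctx;
   std::mutex mutex;
};

// Calls submit() from several threads and returns the final count.
int run_concurrent_submits(int threads, int per_thread) {
   raw_context raw;
   locked_context ctx(raw);
   std::vector<std::thread> pool;
   for (int t = 0; t < threads; t++)
      pool.emplace_back([&ctx, per_thread] {
         for (int i = 0; i < per_thread; i++)
            ctx.submit();
      });
   for (std::thread &th : pool)
      th.join();
   return ctx.submitted();
}
```

Because every entry point takes the same lock, calls are fully serialized; that is simple to reason about, at the cost of any parallelism inside the wrapped object.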
---
 src/gallium/state_trackers/clover/Makefile.am  |   6 +-
 src/gallium/state_trackers/clover/Makefile.sources |   4 +
 src/gallium/state_trackers/clover/core/device.cpp  |   2 +
 .../clover/core/pipe_threadsafe_context.c  | 272 +
 .../clover/core/pipe_threadsafe_screen.c   | 184 ++
 .../state_trackers/clover/core/threadsafe.h|  39 +++
 src/gallium/targets/opencl/Makefile.am |   3 +-
 7 files changed, 508 insertions(+), 2 deletions(-)
 create mode 100644 
src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c
 create mode 100644 
src/gallium/state_trackers/clover/core/pipe_threadsafe_screen.c
 create mode 100644 src/gallium/state_trackers/clover/core/threadsafe.h

diff --git a/src/gallium/state_trackers/clover/Makefile.am 
b/src/gallium/state_trackers/clover/Makefile.am
index f46d9ef..8b615ae 100644
--- a/src/gallium/state_trackers/clover/Makefile.am
+++ b/src/gallium/state_trackers/clover/Makefile.am
@@ -1,5 +1,6 @@
 AUTOMAKE_OPTIONS = subdir-objects
 
+include $(top_srcdir)/src/gallium/Automake.inc
 include Makefile.sources
 
 AM_CPPFLAGS = \
@@ -32,6 +33,9 @@ cl_HEADERS = \
$(top_srcdir)/include/CL/opencl.h
 endif
 
+AM_CFLAGS = \
+   $(GALLIUM_CFLAGS)
+
 noinst_LTLIBRARIES = libclover.la libcltgsi.la libclllvm.la
 
 libcltgsi_la_CXXFLAGS = \
@@ -58,6 +62,6 @@ libclover_la_CXXFLAGS = \
 libclover_la_LIBADD = \
libcltgsi.la libclllvm.la
 
-libclover_la_SOURCES = $(CPP_SOURCES)
+libclover_la_SOURCES = $(CPP_SOURCES) $(C_SOURCES)
 
 EXTRA_DIST = Doxyfile
diff --git a/src/gallium/state_trackers/clover/Makefile.sources 
b/src/gallium/state_trackers/clover/Makefile.sources
index 10bbda0..90e6b7e 100644
--- a/src/gallium/state_trackers/clover/Makefile.sources
+++ b/src/gallium/state_trackers/clover/Makefile.sources
@@ -53,6 +53,10 @@ CPP_SOURCES := \
util/range.hpp \
util/tuple.hpp
 
+C_SOURCES := \
+   core/pipe_threadsafe_context.c \
+   core/pipe_threadsafe_screen.c
+
 LLVM_SOURCES := \
llvm/invocation.cpp
 
diff --git a/src/gallium/state_trackers/clover/core/device.cpp 
b/src/gallium/state_trackers/clover/core/device.cpp
index 42b45b7..b145027 100644
--- a/src/gallium/state_trackers/clover/core/device.cpp
+++ b/src/gallium/state_trackers/clover/core/device.cpp
@@ -22,6 +22,7 @@
 
 #include "core/device.hpp"
 #include "core/platform.hpp"
+#include "core/threadsafe.h"
 #include "pipe/p_screen.h"
 #include "pipe/p_state.h"
 
@@ -47,6 +48,7 @@ device::device(clover::platform &platform, pipe_loader_device 
*ldev) :
  pipe->destroy(pipe);
   throw error(CL_INVALID_DEVICE);
}
+   pipe = pipe_threadsafe_screen(pipe);
 }
 
 device::~device() {
diff --git a/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c 
b/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c
new file mode 100644
index 000..f08f56c
--- /dev/null
+++ b/src/gallium/state_trackers/clover/core/pipe_threadsafe_context.c
@@ -0,0 +1,272 @@
+/*
+ * Copyright 2015 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN 
THE
+ * SOFTWARE.
+ *
+ * Authors: Tom Stellard 
+ *
+ */
+
+#include 
+
+/**
+ * \file
+ *
+ * threadsafe_context is a wrapper around a pipe_context to make it thread
+ * safe.
+ */
+
+#include "os/os_thread.h"
+#include "pipe/p_context.h"
+#include "util/u_memory.h"
+
+#include "threadsafe.h"
+
+
+
+struct threadsafe_context {
+   struct pipe_context base;
+   struct pipe_context *ctx;
+   pipe_mutex mutex;
+};
+
+static struct pipe_context *unwrap(struct pipe_context *ctx) {
+   if (!ctx)
+  return NULL;
+

[Mesa-dev] [PATCH 2/5] clover: Replace open-coded event::signalled()

2015-05-07 Thread Tom Stellard

This consolidates signalled checks into the same place.
---
 src/gallium/state_trackers/clover/core/event.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp 
b/src/gallium/state_trackers/clover/core/event.cpp
index 58de888..3c9336e 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -66,7 +66,7 @@ event::signalled() const {
 
 void
 event::chain(event &ev) {
-   if (wait_count) {
+   if (!signalled()) {
   ev.wait_count++;
   _chain.push_back(ev);
}
-- 
2.0.4

[Mesa-dev] [PATCH 3/5] clover: Fix a bug with multi-threaded events

2015-05-07 Thread Tom Stellard

It was possible for some events never to get triggered if one thread
was creating events and another thread was waiting for them.

This patch consolidates soft_event::wait() and hard_event::wait()
into event::wait() so that hard_event objects will now wait for
all their dependencies to be submitted before flushing the command
queue.
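
The waiting scheme described above can be sketched roughly as follows.  This is a simplified illustration, not clover's event class: toy_event, its done flag and wait_for_chain are assumed names, and the flag stands in for the wait_count based signalled() check.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical, simplified event; not clover's class.
class toy_event {
public:
   explicit toy_event(std::vector<toy_event *> deps = {})
      : deps(std::move(deps)) {}

   // Mark the event as signalled and wake every waiter.
   void trigger() {
      {
         std::lock_guard<std::mutex> lock(mutex);
         done = true;
      }
      cv.notify_all();
   }

   // Wait for all dependencies first, then for this event itself.
   void wait() {
      for (toy_event *ev : deps)
         ev->wait();
      std::unique_lock<std::mutex> lock(mutex);
      cv.wait(lock, [this] { return done; });
   }

   bool signalled() {
      std::lock_guard<std::mutex> lock(mutex);
      return done;
   }

private:
   std::vector<toy_event *> deps;
   std::condition_variable cv;
   std::mutex mutex;
   bool done = false;
};

// One thread triggers both events while the caller waits on the second.
bool wait_for_chain() {
   toy_event first;
   std::vector<toy_event *> deps{&first};
   toy_event second(deps);
   std::thread producer([&] {
      first.trigger();
      second.trigger();
   });
   second.wait();
   producer.join();
   return first.signalled() && second.signalled();
}
```

The sketch updates the flag while holding the mutex, which is the usual way to keep a waiter from missing the notification; the predicate form of wait() also covers the case where trigger() runs before wait() is entered.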
---
 src/gallium/state_trackers/clover/core/event.cpp | 19 +++
 src/gallium/state_trackers/clover/core/event.hpp |  9 ++---
 2 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp 
b/src/gallium/state_trackers/clover/core/event.cpp
index 3c9336e..da227bb 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -39,6 +39,7 @@ event::~event() {
 void
 event::trigger() {
if (!--wait_count) {
+  signalled_cv.notify_all();
   action_ok(*this);
 
   while (!_chain.empty()) {
@@ -73,6 +74,15 @@ event::chain(event &ev) {
ev.deps.push_back(*this);
 }
 
+void
+event::wait() {
+   for (event &ev : deps)
+  ev.wait();
+
+   std::unique_lock<std::mutex> lock(signalled_mutex);
+   signalled_cv.wait(lock, [=]{ return signalled(); });
+}
+
 hard_event::hard_event(command_queue &q, cl_command_type command,
const ref_vector &deps, action action) :
event(q.context(), deps, profile(q, action), [](event &ev){}),
@@ -117,9 +127,11 @@ hard_event::command() const {
 }
 
 void
-hard_event::wait() const {
+hard_event::wait() {
pipe_screen *screen = queue()->device().pipe;
 
+   event::wait();
+
if (status() == CL_QUEUED)
   queue()->flush();
 
@@ -206,9 +218,8 @@ soft_event::command() const {
 }
 
 void
-soft_event::wait() const {
-   for (event &ev : deps)
-  ev.wait();
+soft_event::wait() {
+   event::wait();
 
if (status() != CL_COMPLETE)
   throw error(CL_EXEC_STATUS_ERROR_FOR_EVENTS_IN_WAIT_LIST);
diff --git a/src/gallium/state_trackers/clover/core/event.hpp 
b/src/gallium/state_trackers/clover/core/event.hpp
index d407c80..dffafb9 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -23,6 +23,7 @@
 #ifndef CLOVER_CORE_EVENT_HPP
 #define CLOVER_CORE_EVENT_HPP
 
+#include 
 #include 
 
 #include "core/object.hpp"
@@ -68,7 +69,7 @@ namespace clover {
   virtual cl_int status() const = 0;
   virtual command_queue *queue() const = 0;
   virtual cl_command_type command() const = 0;
-  virtual void wait() const = 0;
+  virtual void wait();
 
   virtual struct pipe_fence_handle *fence() const {
  return NULL;
@@ -87,6 +88,8 @@ namespace clover {
   action action_ok;
   action action_fail;
   std::vector> _chain;
+  std::condition_variable signalled_cv;
+  std::mutex signalled_mutex;
};
 
///
@@ -111,7 +114,7 @@ namespace clover {
   virtual cl_int status() const;
   virtual command_queue *queue() const;
   virtual cl_command_type command() const;
-  virtual void wait() const;
+  virtual void wait();
 
   const lazy &time_queued() const;
   const lazy &time_submit() const;
@@ -149,7 +152,7 @@ namespace clover {
   virtual cl_int status() const;
   virtual command_queue *queue() const;
   virtual cl_command_type command() const;
-  virtual void wait() const;
+  virtual void wait();
};
 }
 
-- 
2.0.4

[Mesa-dev] [PATCH 4/5] clover: Add a mutex to guard queue::queued_events

2015-05-07 Thread Tom Stellard

This fixes a potential crash in a sequence like this:

Thread 0: Check if queue is not empty.
Thread 1: Remove item from queue, making it empty.
Thread 0: Do something assuming queue is not empty.
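
As an illustration of why the check and the action have to share one lock, here is a small sketch using a made-up guarded_queue of integers.  It is not clover's command_queue; the class and its try_pop method are assumptions for the example.

```cpp
#include <deque>
#include <mutex>

// Hypothetical queue of integer "events"; not clover's command_queue.
class guarded_queue {
public:
   void push(int ev) {
      std::lock_guard<std::mutex> lock(mutex);
      events.push_back(ev);
   }

   // The emptiness check and the removal happen under one lock, so no other
   // thread can empty the queue between the test and the pop.
   bool try_pop(int &out) {
      std::lock_guard<std::mutex> lock(mutex);
      if (events.empty())
         return false;
      out = events.front();
      events.pop_front();
      return true;
   }

private:
   std::mutex mutex;
   std::deque<int> events;
};
```

Exposing separate empty() and pop() methods, each locked on its own, would not be enough: the race sits between the two calls, so they need to form a single critical section.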
---
 src/gallium/state_trackers/clover/core/queue.cpp | 2 ++
 src/gallium/state_trackers/clover/core/queue.hpp | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/core/queue.cpp 
b/src/gallium/state_trackers/clover/core/queue.cpp
index 24f9326..87f9dcc 100644
--- a/src/gallium/state_trackers/clover/core/queue.cpp
+++ b/src/gallium/state_trackers/clover/core/queue.cpp
@@ -44,6 +44,7 @@ command_queue::flush() {
pipe_screen *screen = device().pipe;
pipe_fence_handle *fence = NULL;
 
+   std::lock_guard<std::mutex> lock(queued_events_mutex);
if (!queued_events.empty()) {
   pipe->flush(pipe, &fence, 0);
 
@@ -69,6 +70,7 @@ command_queue::profiling_enabled() const {
 
 void
 command_queue::sequence(hard_event &ev) {
+   std::lock_guard<std::mutex> lock(queued_events_mutex);
if (!queued_events.empty())
   queued_events.back()().chain(ev);
 
diff --git a/src/gallium/state_trackers/clover/core/queue.hpp 
b/src/gallium/state_trackers/clover/core/queue.hpp
index b7166e6..bddb86c 100644
--- a/src/gallium/state_trackers/clover/core/queue.hpp
+++ b/src/gallium/state_trackers/clover/core/queue.hpp
@@ -24,6 +24,7 @@
 #define CLOVER_CORE_QUEUE_HPP
 
 #include 
+#include 
 
 #include "core/object.hpp"
 #include "core/context.hpp"
@@ -69,6 +70,7 @@ namespace clover {
 
   cl_command_queue_properties props;
   pipe_context *pipe;
+  std::mutex queued_events_mutex;
   std::deque> queued_events;
};
 }
-- 
2.0.4

[Mesa-dev] [PATCH 5/5] clover: Add a mutex to guard event::chain and event::wait_count

2015-05-07 Thread Tom Stellard

This mutex effectively prevents an event's chain or wait_count from
being updated while it is in the process of triggering.  Otherwise it
may be possible to add to an event's chain after it has been triggered,
which causes the chained event to never be triggered.
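
A rough sketch of the interaction, using an assumed toy_event class with a wait count and a chain of dependent events.  The names toy_event, chain_event, add_wait and fired are illustrative, and the real class does considerably more.

```cpp
#include <mutex>
#include <vector>

// Hypothetical event with a wait count and a chain of dependent events.
class toy_event {
public:
   // Returns true once the wait count has dropped to zero.
   bool fired() {
      std::lock_guard<std::mutex> lock(mutex);
      return wait_count == 0;
   }

   // Decrement the wait count; on reaching zero, trigger chained events.
   void trigger() {
      std::lock_guard<std::mutex> lock(mutex);
      if (--wait_count == 0) {
         while (!chain.empty()) {
            chain.back()->trigger();
            chain.pop_back();
         }
      }
   }

   // Make ev wait for this event, unless this event has already fired.
   // The check and the push are atomic with respect to trigger().
   void chain_event(toy_event &ev) {
      std::lock_guard<std::mutex> lock(mutex);
      if (wait_count != 0) {
         ev.add_wait();
         chain.push_back(&ev);
      }
   }

private:
   void add_wait() {
      std::lock_guard<std::mutex> lock(mutex);
      ++wait_count;
   }

   int wait_count = 1;
   std::vector<toy_event *> chain;
   std::mutex mutex;
};
```

Without the lock, chain_event() could see a non-zero count, then trigger() could drain the chain, and only afterwards would the new event be pushed, leaving it with an extra wait that nothing ever releases.  Note that the sketch takes a second event's mutex while holding the first, which is only safe as long as the dependency graph has no cycles.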
---
 src/gallium/state_trackers/clover/core/event.cpp | 3 +++
 src/gallium/state_trackers/clover/core/event.hpp | 1 +
 2 files changed, 4 insertions(+)

diff --git a/src/gallium/state_trackers/clover/core/event.cpp 
b/src/gallium/state_trackers/clover/core/event.cpp
index da227bb..646fd38 100644
--- a/src/gallium/state_trackers/clover/core/event.cpp
+++ b/src/gallium/state_trackers/clover/core/event.cpp
@@ -38,6 +38,7 @@ event::~event() {
 
 void
 event::trigger() {
+   std::lock_guard<std::mutex> lock(trigger_mutex);
if (!--wait_count) {
   signalled_cv.notify_all();
   action_ok(*this);
@@ -54,6 +55,7 @@ event::abort(cl_int status) {
_status = status;
action_fail(*this);
 
+   std::lock_guard<std::mutex> lock(trigger_mutex);
while (!_chain.empty()) {
   _chain.back()().abort(status);
   _chain.pop_back();
@@ -67,6 +69,7 @@ event::signalled() const {
 
 void
 event::chain(event &ev) {
+   std::lock_guard<std::mutex> lock(trigger_mutex);
if (!signalled()) {
   ev.wait_count++;
   _chain.push_back(ev);
diff --git a/src/gallium/state_trackers/clover/core/event.hpp 
b/src/gallium/state_trackers/clover/core/event.hpp
index dffafb9..a64fbba 100644
--- a/src/gallium/state_trackers/clover/core/event.hpp
+++ b/src/gallium/state_trackers/clover/core/event.hpp
@@ -90,6 +90,7 @@ namespace clover {
   std::vector> _chain;
   std::condition_variable signalled_cv;
   std::mutex signalled_mutex;
+  std::mutex trigger_mutex;
};
 
///
-- 
2.0.4

[Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:

   total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
   instructions in affected programs: 1860859 -> 1848166 (-0.68%)
   helped:4387
   HURT:  4758
   GAINED:1499

The gained programs are ARB vertex programs that were previously going
through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as "gained" in
the shader-db results.
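
The two percentages can be re-derived from the raw instruction counts quoted above.  The helper below is only a sanity check written for this message; percent_change is an assumed name and has nothing to do with the shader-db scripts.

```cpp
#include <cmath>

// Relative change from before to after, in percent, rounded to two decimals.
double percent_change(long before, long after) {
   double pct = 100.0 * (after - before) / before;
   return std::round(pct * 100.0) / 100.0;
}
```

Both totals drop by the same 12693 instructions; the change is only divided by a different baseline (all shared programs versus the affected programs).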
---
 src/mesa/drivers/dri/i965/brw_context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index fd7420a..8615e5e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp 
= true;
   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = 
false;
 
-  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
+  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = 
&nir_options;
}
 
-- 
2.4.0

Re: [Mesa-dev] [PATCH 07/13] util: Move gallium's linked list to util

2015-05-07 Thread Ian Romanick

Isn't this the same as src/util/simple_list.h?

On 04/27/2015 09:03 PM, Jason Ekstrand wrote:
> The linked list in gallium is pretty much the kernel list and we would like
> to have a C-based linked list for all of mesa.  Let's not duplicate and
> just steal the gallium one.
> ---
>  src/gallium/auxiliary/Makefile.sources |   1 -
>  src/gallium/auxiliary/hud/hud_private.h|   2 +-
>  .../auxiliary/pipebuffer/pb_buffer_fenced.c|   2 +-
>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c |   2 +-
>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_debug.c |   2 +-
>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c|   2 +-
>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_pool.c  |   2 +-
>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_slab.c  |   2 +-
>  src/gallium/auxiliary/util/u_debug_flush.c |   2 +-
>  src/gallium/auxiliary/util/u_debug_memory.c|   2 +-
>  src/gallium/auxiliary/util/u_dirty_surfaces.h  |   2 +-
>  src/gallium/auxiliary/util/u_double_list.h | 146 
> -
>  src/gallium/drivers/freedreno/freedreno_context.h  |   2 +-
>  src/gallium/drivers/freedreno/freedreno_query_hw.h |   2 +-
>  src/gallium/drivers/freedreno/freedreno_resource.h |   2 +-
>  src/gallium/drivers/ilo/ilo_common.h   |   2 +-
>  src/gallium/drivers/nouveau/nouveau_buffer.h   |   2 +-
>  src/gallium/drivers/nouveau/nouveau_fence.c|   2 -
>  src/gallium/drivers/nouveau/nouveau_fence.h|   2 +-
>  src/gallium/drivers/nouveau/nouveau_mm.c   |   2 +-
>  src/gallium/drivers/nouveau/nv30/nv30_screen.h |   2 +-
>  src/gallium/drivers/nouveau/nv50/nv50_resource.h   |   2 +-
>  src/gallium/drivers/r600/compute_memory_pool.c |   2 +-
>  src/gallium/drivers/r600/evergreen_compute.c   |   2 +-
>  src/gallium/drivers/r600/r600_llvm.c   |   2 +-
>  src/gallium/drivers/r600/r600_pipe.h   |   2 +-
>  src/gallium/drivers/radeon/r600_pipe_common.h  |   2 +-
>  src/gallium/drivers/radeon/radeon_vce.h|   2 +-
>  src/gallium/drivers/svga/svga_context.h|   2 +-
>  src/gallium/drivers/svga/svga_resource_buffer.h|   2 -
>  .../drivers/svga/svga_resource_buffer_upload.c |   1 -
>  src/gallium/drivers/svga/svga_screen_cache.h   |   2 +-
>  src/gallium/state_trackers/nine/basetexture9.h |   2 +-
>  src/gallium/state_trackers/nine/device9.h  |   2 +-
>  src/gallium/state_trackers/nine/nine_state.h   |   2 +-
>  src/gallium/state_trackers/nine/surface9.h |   2 +-
>  src/gallium/state_trackers/omx/vid_dec.h   |   2 +-
>  src/gallium/state_trackers/omx/vid_enc.h   |   2 +-
>  src/gallium/winsys/radeon/drm/radeon_drm_bo.c  |   2 +-
>  .../winsys/svga/drm/pb_buffer_simple_fenced.c  |   2 +-
>  src/gallium/winsys/svga/drm/vmw_fence.c|   2 +-
>  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c  |   2 +-
>  src/util/Makefile.sources  |   1 +
>  src/util/list.h| 146 
> +
>  44 files changed, 184 insertions(+), 189 deletions(-)
>  delete mode 100644 src/gallium/auxiliary/util/u_double_list.h
>  create mode 100644 src/util/list.h
> 
> diff --git a/src/gallium/auxiliary/Makefile.sources 
> b/src/gallium/auxiliary/Makefile.sources
> index ec7547c..62e6b94 100644
> --- a/src/gallium/auxiliary/Makefile.sources
> +++ b/src/gallium/auxiliary/Makefile.sources
> @@ -197,7 +197,6 @@ C_SOURCES := \
>   util/u_dirty_surfaces.h \
>   util/u_dl.c \
>   util/u_dl.h \
> - util/u_double_list.h \
>   util/u_draw.c \
>   util/u_draw.h \
>   util/u_draw_quad.c \
> diff --git a/src/gallium/auxiliary/hud/hud_private.h 
> b/src/gallium/auxiliary/hud/hud_private.h
> index 1606ada..c74dc3b 100644
> --- a/src/gallium/auxiliary/hud/hud_private.h
> +++ b/src/gallium/auxiliary/hud/hud_private.h
> @@ -29,7 +29,7 @@
>  #define HUD_PRIVATE_H
>  
>  #include "pipe/p_context.h"
> -#include "util/u_double_list.h"
> +#include "util/list.h"
>  
>  struct hud_graph {
> /* initialized by common code */
> diff --git a/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c 
> b/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
> index 9e0cace..7840467 100644
> --- a/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
> +++ b/src/gallium/auxiliary/pipebuffer/pb_buffer_fenced.c
> @@ -46,7 +46,7 @@
>  #include "util/u_debug.h"
>  #include "os/os_thread.h"
>  #include "util/u_memory.h"
> -#include "util/u_double_list.h"
> +#include "util/list.h"
>  
>  #include "pb_buffer.h"
>  #include "pb_buffer_fenced.h"
> diff --git a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c 
> b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
> index 5eb8d06..5023687 100644
> --- a/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
> +++ b/src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c
> @@ -38,7 +38,7 @@
>  #include "util/u_debug.h"
>  #include "os/os_thread.h"
>  #

Re: [Mesa-dev] [PATCH 09/13] util/list: Add list_empty and list_length functions

2015-05-07 Thread Ian Romanick

On 05/05/2015 11:21 AM, Neil Roberts wrote:
> Jason Ekstrand  writes:
> 
>> +static inline bool list_empty(struct list_head *list)
>> +{
>> +   return list->next == list;
>> +}
> 
> It would be good if list.h also included stdbool.h in order to get the
> declaration of bool. However, will that cause problems on MSVC? Is the
> Gallium code compiled on MSVC in general?
> 
>> +static inline unsigned list_length(struct list_head *list)
>> +{
>> +   unsigned length = 0;
>> +   for (struct list_head *node = list->next; node != list; node = 
>> node->next)
>> +  length++;
>> +   return length;
>> +}
> 
> Any reason not to use one of the list iterator macros here? Is it safe
> to use a C99-ism outside of a macro in this header? Maybe MSVC
> supports this particular C99-ism anyway.
> 
> For what it's worth, I'm strongly in favour of using these kernel-style
> lists instead of exec_list. The kernel ones seem much less confusing.

Huh?  They're practically identical.  The only difference is the
kernel-style lists have a single sentinel node, and that node is
impossible to identify "in a crowd."  The exec_lists use two sentinel
nodes, and those nodes have one pointer of overlapping storage (head and
tail are the next and prev pointers of one node, and tail and tail_pred
are the next and prev pointers of the other).  I thought there was some
ASCII art in list.h that showed this, but that appears to not be the case...

This gives some convenience that you can walk through a list from any
node in the list without having a pointer to the list itself.  I don't
know if we still do, but there used to be a few places where we took
advantage of that.
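
To make the single-sentinel layout concrete, here is a minimal sketch of a kernel-style list.  list_empty and list_length follow the code quoted above; toy_list_init and toy_list_add are names made up for this example and are not taken from list.h.

```cpp
// Kernel-style list: the list head is an ordinary node acting as sentinel.
struct list_head {
   list_head *prev;
   list_head *next;
};

// Hypothetical helper: an empty list is a sentinel pointing at itself.
inline void toy_list_init(list_head *list) {
   list->prev = list;
   list->next = list;
}

// Hypothetical helper: insert node right after the sentinel.
inline void toy_list_add(list_head *node, list_head *list) {
   node->prev = list;
   node->next = list->next;
   list->next->prev = node;
   list->next = node;
}

inline bool list_empty(const list_head *list) {
   return list->next == list;
}

// Walk from the sentinel until we come back around to it.
inline unsigned list_length(const list_head *list) {
   unsigned length = 0;
   for (const list_head *node = list->next; node != list; node = node->next)
      length++;
   return length;
}
```

Nothing in a node marks it as the sentinel, which is the "in a crowd" problem: the walk in list_length() only terminates correctly because the caller already knows which node is the head.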

Some of the APIs are (very) poorly named (I'm looking at you,
insert_before), and I'd welcome patches to fix that up.

> Regards,
> - Neil

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Kenneth Graunke

On Thursday, May 07, 2015 04:50:39 PM Jason Ekstrand wrote:
> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
> 
>total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>instructions in affected programs: 1860859 -> 1848166 (-0.68%)
>helped:4387
>HURT:  4758
>GAINED:1499
> 
> The gained programs are ARB vertex programs that were previously going
> through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
> programs can go through the scalar backend so they show up as "gained" in
> the shader-db results.
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index fd7420a..8615e5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>
> ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = 
> true;
>ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = 
> false;
>  
> -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
> +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
>   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = 
> &nir_options;
> }
>  
> 

We definitely want to throw the switch before 10.6, so that all the
scalar backends are using NIR, and we'll be able to delete the
deprecated ones post-release.

Acked-by: Kenneth Graunke 


Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Ian Romanick

On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
> 
>total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>instructions in affected programs: 1860859 -> 1848166 (-0.68%)
>helped:4387
>HURT:  4758
>GAINED:1499
> 
> The gained programs are ARB vertex programs that were previously going
> through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
> programs can go through the scalar backend so they show up as "gained" in
> the shader-db results.

I thought we already did this... why didn't this happen when NIR became
the default for the FS backend?  And has that reason (assuming there was
one) been resolved?

> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index fd7420a..8615e5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>
> ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp = 
> true;
>ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = 
> false;
>  
> -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
> +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
>   ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions = 
> &nir_options;
> }
>  

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On May 7, 2015 5:38 PM, "Ian Romanick"  wrote:
>
> On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
> > GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
> >
> >total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
> >instructions in affected programs: 1860859 -> 1848166 (-0.68%)
> >helped:4387
> >HURT:  4758
> >GAINED:1499
> >
> > The gained programs are ARB vertex programs that were previously going
> > through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
> > programs can go through the scalar backend so they show up as "gained" in
> > the shader-db results.
>
> I thought we already did this... why didn't this happen when NIR became
> the default for the FS backend?  And has that reason (assuming there was
> one) been resolved?

We couldn't do copy propagation of values in the attribute register file.
That, in turn, was blocked on reworking the LOAD_PAYLOAD instruction.  I
pushed a series this morning that fixed both of those and cut 7.5% off of
all SIMD8 VS instructions when using NIR.  It also helps GLSL IR but by
only 1% or so.
--Jason

> > ---
> >  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
b/src/mesa/drivers/dri/i965/brw_context.c
> > index fd7420a..8615e5e 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct brw_context
*brw)
> >
ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp =
true;
> >
ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
> >
> > -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
> > +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
> >
 ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions =
&nir_options;
> > }
> >
>

[Mesa-dev] [PATCH] main: glGetIntegeri_v fails for GL_VERTEX_BINDING_STRIDE

2015-05-07 Thread Marta Lofstedt

The return type for GL_VERTEX_BINDING_STRIDE is missing;
this causes glGetIntegeri_v to fail.

Signed-off-by: Marta Lofstedt 
---
 src/mesa/main/get.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6fc0f3f..9fb8fba 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -1959,6 +1959,7 @@ find_value_indexed(const char *func, GLenum pname, GLuint 
index, union value *v)
   if (index >= ctx->Const.Program[MESA_SHADER_VERTEX].MaxAttribs)
   goto invalid_value;
   v->value_int = 
ctx->Array.VAO->VertexBinding[VERT_ATTRIB_GENERIC(index)].Stride;
+  return TYPE_INT;
 
/* ARB_shader_image_load_store */
case GL_IMAGE_BINDING_NAME: {
-- 
1.9.1
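
To illustrate the failure mode the patch addresses, here is a minimal sketch
of a switch case that stores a value but does not return, so control falls
through into the next case.  All names below (find_value_buggy,
find_value_fixed, PNAME_STRIDE, PNAME_IMAGE_NAME) are hypothetical stand-ins
and not the actual code in src/mesa/main/get.c; the behaviour of the
following case is an assumption made for the example.

```c
/* Hypothetical stand-ins for the Mesa enums; values are arbitrary. */
enum value_type { TYPE_INVALID, TYPE_INT };
enum pname { PNAME_STRIDE, PNAME_IMAGE_NAME };

union value { int value_int; };

/* Buggy variant: the STRIDE case stores the value but has no return,
 * so control continues into the next case label. */
static enum value_type
find_value_buggy(enum pname pname, int stride, int have_images,
                 union value *v)
{
   switch (pname) {
   case PNAME_STRIDE:
      v->value_int = stride;
      /* BUG: no "return TYPE_INT;" here */
   case PNAME_IMAGE_NAME:
      if (!have_images)
         return TYPE_INVALID;   /* the stride query ends up failing here */
      v->value_int = 0;         /* or the stride gets overwritten here */
      return TYPE_INT;
   }
   return TYPE_INVALID;
}

/* Fixed variant: returning the type ends the STRIDE case. */
static enum value_type
find_value_fixed(enum pname pname, int stride, int have_images,
                 union value *v)
{
   switch (pname) {
   case PNAME_STRIDE:
      v->value_int = stride;
      return TYPE_INT;
   case PNAME_IMAGE_NAME:
      if (!have_images)
         return TYPE_INVALID;
      v->value_int = 0;
      return TYPE_INT;
   }
   return TYPE_INVALID;
}
```

In this sketch the buggy variant reports an invalid type for the stride
query whenever the feature guarding the next case is unavailable, while the
fixed variant returns the stored stride.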


Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Ian Romanick

On 05/07/2015 05:44 PM, Jason Ekstrand wrote:
> 
> On May 7, 2015 5:38 PM, "Ian Romanick" wrote:
>>
>> On 05/07/2015 04:50 PM, Jason Ekstrand wrote:
>> > GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>> >
>> >total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>> >instructions in affected programs: 1860859 -> 1848166 (-0.68%)
>> >helped:4387
>> >HURT:  4758
>> >GAINED:1499
>> >
>> > The gained programs are ARB vertex programs that were previously going
>> > through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
>> > programs can go through the scalar backend so they show up as "gained" in
>> > the shader-db results.
>>
>> I thought we already did this... why didn't this happen when NIR became
>> the default for the FS backend?  And has that reason (assuming there was
>> one) been resolved?
> 
> We couldn't do copy propagation of values in the attribute register
> file.  That, in turn, was blocked on reworking the LOAD_PAYLOAD
> instruction.  I pushed a series this morning that fixed both of those
> and cut 7.5% off of all SIMD8 VS instructions when using NIR.  It also
> helps GLSL IR but by only 1% or so.
> --Jason

Ah, that's right.  Make it so!

Reviewed-by: Ian Romanick 

>> > ---
>> >  src/mesa/drivers/dri/i965/brw_context.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_context.c
> b/src/mesa/drivers/dri/i965/brw_context.c
>> > index fd7420a..8615e5e 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_context.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_context.c
>> > @@ -588,7 +588,7 @@ brw_initialize_context_constants(struct
> brw_context *brw)
>> >   
> ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].EmitNoIndirectTemp
> = true;
>> >   
> ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = false;
>> >
>> > -  if (brw_env_var_as_boolean("INTEL_USE_NIR", false))
>> > +  if (brw_env_var_as_boolean("INTEL_USE_NIR", true))
>> > 
>  ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].NirOptions =
> &nir_options;
>> > }
>> >
>>


Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Matt Turner

On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand  wrote:
> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>
>total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>instructions in affected programs: 1860859 -> 1848166 (-0.68%)
>helped:4387
>HURT:  4758
>GAINED:1499
>
> The gained programs are ARB vertex programs that were previously going
> through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
> programs can go through the scalar backend so they show up as "gained" in
> the shader-db results.

Again, I'm kind of confused and disappointed that we're just okay with
hurting 4700 programs without more analysis. I guess I'll go do
that...

I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
297 -> 161 instructions. More concerning, the number of send
instructions drops from 36 to 12, and a loop that was 111 instructions
long suddenly becomes

   START B1 <-B0 <-B2
cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
(+f0) break(8)  JIP: 24         UIP: 24
   END B1 ->B3 ->B2
   START B2 <-B1
add(8)          g42<1>D         g42<8,8,1>D     1D
while(8)        JIP: -32
   END B2 ->B1

That deserves a lot more investigation. I'll take a gamble and say
something is broken.

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Ian Romanick

On 05/07/2015 06:17 PM, Matt Turner wrote:
> On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand  wrote:
>> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>>
>>total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>>instructions in affected programs: 1860859 -> 1848166 (-0.68%)
>>helped:4387
>>HURT:  4758
>>GAINED:1499
>>
>> The gained programs are ARB vertex programs that were previously going
>> through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
>> programs can go through the scalar backend so they show up as "gained" in
>> the shader-db results.
> 
> Again, I'm kind of confused and disappointed that we're just okay with
> hurting 4700 programs without more analysis. I guess I'll go do
> that...

Yeah... I think I just (foolishly) assumed it was mostly +/- small
amounts given the % in affected programs.
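
As a quick check of why the aggregate percentages can hide large
per-shader swings, here is a small sketch that recomputes them from the
totals quoted above.  The helper names (percent_change,
mean_delta_per_program) are made up for this example, and using
helped + HURT = 9145 as the count of changed programs is an assumption.

```c
/* Percent change from before to after; negative means fewer instructions. */
static double
percent_change(long before, long after)
{
   return 100.0 * (double)(after - before) / (double)before;
}

/* Average net instruction delta per changed program.  A small mean says
 * nothing about how far individual shaders moved in either direction. */
static double
mean_delta_per_program(long before, long after, long programs)
{
   return (double)(after - before) / (double)programs;
}
```

With the quoted totals, percent_change(2724483, 2711790) is about -0.47 and
percent_change(1860859, 1848166) is about -0.68, while the mean net change
over 9145 programs is only about -1.4 instructions, so a shader that drops
from 297 to 161 instructions has to be offset by regressions elsewhere.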

> I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
> 297 -> 161 instructions. More concerning, the number of send
> instructions drop from 36 to 12, and a loop that was 111 instructions
> long suddenly becomes
> 
>START B1 <-B0 <-B2
> cmp.ge.f0(8)nullg42<8,8,1>D g7<0,1,0>D
> (+f0) break(8)  JIP: 24 UIP: 24
>END B1 ->B3 ->B2
>START B2 <-B1
> add(8)  g42<1>D g42<8,8,1>D 1D
> while(8)JIP: -32
>END B2 ->B1
> 
> That deserves a lot more investigation. I'll take a gamble and say
> something is broken.

Yikes.  I guess I'm surprised that piglit+gles3conform+deqp didn't
already find


Re: [Mesa-dev] [PATCH] clover: replace --enable-opencl-icd with --with-opencl-icd

2015-05-07 Thread Aaron Watry

On Thu, May 7, 2015 at 5:27 PM, Jan Vesely  wrote:

> On Thu, 2015-05-07 at 21:52 +0200, EdB wrote:
> > Le 2015-05-07 18:55, Aaron Watry a écrit :
> > > I'm not sure what the final consensus will be on how to do this, but
> > > FWIW:
> > > Tested-By: Aaron Watry 
> > >
> > > I've tested this with 4 combinations:
> > > no --with-opencl-icd option specified : libOpenCL.so gets installed in
> > > ${prefix}/lib
> > > --with-opencl-icd=no : libOpenCL.so gets installed in ${prefix}/lib
> > > --with-opencl-icd=standard : libMesaOpenCL.so installed in
> > > ${prefix}/lib, icd in /etc/OpenCL/vendors/mesa.icd
> > > --with-opencl-icd=sysconfdir : libMesaOpenCL.so installed in
> > > ${prefix}/lib, icd in ${prefix}/etc//mesa.icd.  I only specified
> > > --prefix, no other directories overridden in configure command.
>
> shouldn't this part go to ${prefix}/etc/OpenCL/vendors?
> Is it just a typo or did it install to ${prefix}/etc//?
>
>
That was just a typo.  It went to ${prefix}/etc/OpenCL/vendors/mesa.icd.

--Aaron


> jan
>
> > >
> >
> > thanks
> >
> >EdB
> >
> > > --Aaron
> > >
> > >
> > >
> > > On Wed, May 6, 2015 at 4:34 PM, EdB  wrote:
> > >
> > >> The standard ICD file path is /etc/OpenCL/vendors/.
> > >> However, it doesn't fit well with custom builds.
> > >> This option allows overriding the ICD vendor file installation path.
> > >> ---
> > >>  configure.ac                           | 46 +++---
> > >>  src/gallium/targets/opencl/Makefile.am |  2 +-
> > >>  2 files changed, 33 insertions(+), 15 deletions(-)
> > >>
> > >> diff --git a/configure.ac b/configure.ac
> > >> index 095e23e..90dba4e 100644
> > >> --- a/configure.ac
> > >> +++ b/configure.ac
> > >> @@ -804,12 +804,6 @@ AC_ARG_ENABLE([opencl],
> > >>   [enable OpenCL library @<:@default=disabled@:>@])],
> > >> [enable_opencl="$enableval"],
> > >> [enable_opencl=no])
> > >> -AC_ARG_ENABLE([opencl_icd],
> > >> -   [AS_HELP_STRING([--enable-opencl-icd],
> > >> -  [Build an OpenCL ICD library to be loaded by an ICD
> > >> implementation
> > >> -   @<:@default=disabled@:>@])],
> > >> -[enable_opencl_icd="$enableval"],
> > >> -[enable_opencl_icd=no])
> > >>  AC_ARG_ENABLE([xlib-glx],
> > >>  [AS_HELP_STRING([--enable-xlib-glx],
> > >>  [make GLX library Xlib-based instead of DRI-based
> > >> @<:@default=disabled@:>@])],
> > >> @@ -1689,19 +1683,11 @@ if test "x$enable_opencl" = xyes; then
> > >>  # XXX: Use $enable_shared_pipe_drivers once converted to
> > >> use static/shared pipe-drivers
> > >>  enable_gallium_loader=yes
> > >>
> > >> -if test "x$enable_opencl_icd" = xyes; then
> > >> -OPENCL_LIBNAME="MesaOpenCL"
> > >> -else
> > >> -OPENCL_LIBNAME="OpenCL"
> > >> -fi
> > >> -
> > >>  if test "x$have_libelf" != xyes; then
> > >> AC_MSG_ERROR([Clover requires libelf])
> > >>  fi
> > >>  fi
> > >>  AM_CONDITIONAL(HAVE_CLOVER, test "x$enable_opencl" = xyes)
> > >> -AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$enable_opencl_icd" = xyes)
> > >> -AC_SUBST([OPENCL_LIBNAME])
> > >>
> > >>  dnl
> > >>  dnl Gallium configuration
> > >> @@ -2006,6 +1992,38 @@ AC_ARG_WITH([d3d-libdir],
> > >>  [D3D_DRIVER_INSTALL_DIR="${libdir}/d3d"])
> > >>  AC_SUBST([D3D_DRIVER_INSTALL_DIR])
> > >>
> > >> +dnl OpenCL ICD
> > >> +
> > >> +AC_ARG_WITH([opencl-icd],
> > >> +
> > >> [AS_HELP_STRING([--with-opencl-icd=@<:@no,standard,sysconfdir@:>@],
> > >> +[Build an OpenCL ICD library to be loaded by an ICD
> > >> implementation.
> > >> + If @<:@standard@:>@ the OpenCL ICD vendor file
> > >> installs in /etc/OpenCL/vendors.
> > >> + @<:@sysconfdir@:>@ installs the file in
> > >> $sysconfdir/OpenCL/vendors
> > >> + @<:@default=no@:>@])],
> > >> +[OPENCL_ICD="$withval"],
> > >> +[OPENCL_ICD="no"])
> > >> +
> > >> +case "x$OPENCL_ICD" in
> > >> +xno)
> > >> +OPENCL_LIBNAME="OpenCL"
> > >> +;;
> > >> +xstandard)
> > >> +OPENCL_LIBNAME="MesaOpenCL"
> > >> +ICD_FILE_DIR="/etc/OpenCL/vendors"
> > >> +;;
> > >> +xsysconfdir)
> > >> +OPENCL_LIBNAME="MesaOpenCL"
> > >> +ICD_FILE_DIR="$sysconfdir/OpenCL/vendors"
> > >> +;;
> > >> +*)
> > >> +AC_MSG_ERROR(['$OPENCL_ICD' is not a valid option for
> > >> --with-opencl-icd])
> > >> +;;
> > >> +esac
> > >> +
> > >> +AM_CONDITIONAL(HAVE_CLOVER_ICD, test "x$OPENCL_ICD" != xno)
> > >> +AC_SUBST([OPENCL_LIBNAME])
> > >> +AC_SUBST([ICD_FILE_DIR])
> > >> +
> > >>  dnl
> > >>  dnl Gallium helper functions
> > >>  dnl
> > >> diff --git a/src/gallium/targets/opencl/Makefile.am
> > >> b/src/gallium/targets/opencl/Makefile.am
> > >> index 5daf327..781daa0 100644
> > >> --- a/src/gallium/targets/opencl/Makefile.am
> > >> +++ b/src/gallium/targets/opencl/Makefile.am
> > >> @@ -47,7 +47,7 @@ EXTRA_lib@OPENCL_LIBNAME@_la_DEPENDENCIES =
> > >> opencl.sym
> > >>  EXTRA_DIST = mesa.icd opencl.sym
> > >>
> > >>  if HAVE_CLOVER_ICD
> > >> -icddir

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 6:17 PM, Matt Turner  wrote:
> On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand  wrote:
>> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>>
>>total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>>instructions in affected programs: 1860859 -> 1848166 (-0.68%)
>>helped:4387
>>HURT:  4758
>>GAINED:1499
>>
>> The gained programs are ARB vertex programs that were previously going
>> through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
>> programs can go through the scalar backend so they show up as "gained" in
>> the shader-db results.
>
> Again, I'm kind of confused and disappointed that we're just okay with
> hurting 4700 programs without more analysis. I guess I'll go do
> that...

What confuses me more is why the results aren't better.  When we first
turned NIR on by default for FS, the shader-db results looked a lot
better.  On one branch (wip/nir-by-default-v2) I applied the ATTR
copy-prop and we had the following:

GLSL IR vs. NIR shader-db results on Broadwell (VS only):

   total instructions in shared programs: 7106293 -> 7001640 (-1.47%)
   instructions in affected programs: 4604798 -> 4500145 (-2.27%)
   helped:16786
   HURT:  8442
   GAINED:1563
   LOST:  1526

The difference between gained/lost was due to capturing standard
error.  However, that shouldn't affect the overall numbers that
much.  I think adding the improved ffma stuff probably made a bunch of
the difference.

As far as when we turn it on, I do think that we want to do it before
the merge window closes if we can.  Being able to delete the visitor
after the branch would be really nice.  Also, we want to get people
testing it and reporting bugs because we're not going to find every
bug in every vertex shader by inspection.

> I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
> 297 -> 161 instructions. More concerning, the number of send
> instructions drop from 36 to 12, and a loop that was 111 instructions
> long suddenly becomes
>
>START B1 <-B0 <-B2
> cmp.ge.f0(8)nullg42<8,8,1>D g7<0,1,0>D
> (+f0) break(8)  JIP: 24 UIP: 24
>END B1 ->B3 ->B2
>START B2 <-B1
> add(8)  g42<1>D g42<8,8,1>D 1D
> while(8)JIP: -32
>END B2 ->B1
>
> That deserves a lot more investigation. I'll take a gamble and say
> something is broken.

Yes, that needs some investigation.  I can also take a look at some of
the hurt and/or really helped shaders as well and see what I find.

Re: [Mesa-dev] [PATCH] clover: add --with-icd-file-dir option

2015-05-07 Thread Michel Dänzer

On 08.05.2015 03:24, Tom Stellard wrote:
> For this particular situation, I'm happy with any solution that:
> 
> 1. Allows a user to install the icd file to /etc if he or she wants to.

--sysconfdir=/etc

That covers drirc as well.

> 2. Does not require the user to read the spec to know that /etc is the
> correct place to install it.

I think the above is pretty standard for autotools projects. I think it
would be better to document this in the appropriate place(s) for OpenCL
users than to add another convoluted option which doesn't really add any
flexibility.

-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 07/13] util: Move gallium's linked list to util

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 5:30 PM, Ian Romanick  wrote:
> Isn't this the same as src/util/simple_list.h?

In terms of being a two-pointer circularly linked list, yes.  In terms
of having a decent API, no.

1) Nothing in simple_list is namespaced in any way
2) It's all macros with do-while around them instead of static inlines
3) It assumes that you just put prev and next pointers in the
structure you're putting in the list rather than having a node you
embed.  While this provides the type safety claimed at the top of
simple_list.h, it requires that, if you want a list of struct foo's,
you use an entire struct foo as the sentinel instead of a 2 or 3
pointer list structure.
4) Point 3 isn't quite true because there is a simple_node structure.
However, it looks like a complete after-thought because none of the
iterators or manipulators do anything with it.

I could probably extend the list, but I think you get the point.
Sure, I could improve simple_list, but why do so when there's a
perfectly good list in gallium that does everything simple_list does
and more?

I did start working on replacing simple_list with the gallium list to
get us down to two lists, but we use it in things like swrast and tnl
so it turned into quite the spider-web.  Eventually, I'd like to see
simple_list die but if we can at least restrict it back to the older
parts of the code and remove it from util, that would make me happy
enough for now.
--Jason
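
To make points 2 and 3 concrete, here is a minimal sketch of the
embedded-node style: a two-pointer circular list whose sentinel is just a
node, manipulated through static inlines.  Every name here (sketch_node,
sketch_list_*, sketch_container_of, struct foo's fields) is hypothetical
and chosen for the example; this is not the API of u_double_list.h or of
the proposed src/util/list.h.

```c
#include <stddef.h>

/* The node that gets embedded in each element; the list head is one too. */
struct sketch_node {
   struct sketch_node *prev;
   struct sketch_node *next;
};

/* An empty circular list points at itself in both directions. */
static inline void
sketch_list_init(struct sketch_node *head)
{
   head->prev = head;
   head->next = head;
}

/* Link node in just before the head, i.e. at the tail of the list. */
static inline void
sketch_list_add_tail(struct sketch_node *node, struct sketch_node *head)
{
   node->next = head;
   node->prev = head->prev;
   head->prev->next = node;
   head->prev = node;
}

/* Unlink node from whatever list it is on. */
static inline void
sketch_list_del(struct sketch_node *node)
{
   node->prev->next = node->next;
   node->next->prev = node->prev;
   node->prev = NULL;
   node->next = NULL;
}

/* Recover the containing structure from a pointer to its embedded node. */
#define sketch_container_of(ptr, type, member) \
   ((type *)((char *)(ptr) - offsetof(type, member)))

/* The element type only embeds a node; the sentinel is two pointers,
 * not a whole struct foo. */
struct foo {
   int payload;
   struct sketch_node link;
};

/* Walk the list and sum the payloads of the containing struct foo's. */
static inline int
sketch_sum_foos(struct sketch_node *head)
{
   int sum = 0;
   for (struct sketch_node *n = head->next; n != head; n = n->next)
      sum += sketch_container_of(n, struct foo, link)->payload;
   return sum;
}
```

The trade-off in this style is that the iterator hands back a node pointer
and the caller names the containing type explicitly, in exchange for a
sentinel that costs two pointers regardless of the element size.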

> On 04/27/2015 09:03 PM, Jason Ekstrand wrote:
>> The linked list in gallium is pretty much the kernel list and we would like
>> to have a C-based linked list for all of mesa.  Let's not duplicate and
>> just steal the gallium one.
>> ---
>>  src/gallium/auxiliary/Makefile.sources |   1 -
>>  src/gallium/auxiliary/hud/hud_private.h|   2 +-
>>  .../auxiliary/pipebuffer/pb_buffer_fenced.c|   2 +-
>>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_cache.c |   2 +-
>>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_debug.c |   2 +-
>>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_mm.c|   2 +-
>>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_pool.c  |   2 +-
>>  src/gallium/auxiliary/pipebuffer/pb_bufmgr_slab.c  |   2 +-
>>  src/gallium/auxiliary/util/u_debug_flush.c |   2 +-
>>  src/gallium/auxiliary/util/u_debug_memory.c|   2 +-
>>  src/gallium/auxiliary/util/u_dirty_surfaces.h  |   2 +-
>>  src/gallium/auxiliary/util/u_double_list.h | 146 
>> -
>>  src/gallium/drivers/freedreno/freedreno_context.h  |   2 +-
>>  src/gallium/drivers/freedreno/freedreno_query_hw.h |   2 +-
>>  src/gallium/drivers/freedreno/freedreno_resource.h |   2 +-
>>  src/gallium/drivers/ilo/ilo_common.h   |   2 +-
>>  src/gallium/drivers/nouveau/nouveau_buffer.h   |   2 +-
>>  src/gallium/drivers/nouveau/nouveau_fence.c|   2 -
>>  src/gallium/drivers/nouveau/nouveau_fence.h|   2 +-
>>  src/gallium/drivers/nouveau/nouveau_mm.c   |   2 +-
>>  src/gallium/drivers/nouveau/nv30/nv30_screen.h |   2 +-
>>  src/gallium/drivers/nouveau/nv50/nv50_resource.h   |   2 +-
>>  src/gallium/drivers/r600/compute_memory_pool.c |   2 +-
>>  src/gallium/drivers/r600/evergreen_compute.c   |   2 +-
>>  src/gallium/drivers/r600/r600_llvm.c   |   2 +-
>>  src/gallium/drivers/r600/r600_pipe.h   |   2 +-
>>  src/gallium/drivers/radeon/r600_pipe_common.h  |   2 +-
>>  src/gallium/drivers/radeon/radeon_vce.h|   2 +-
>>  src/gallium/drivers/svga/svga_context.h|   2 +-
>>  src/gallium/drivers/svga/svga_resource_buffer.h|   2 -
>>  .../drivers/svga/svga_resource_buffer_upload.c |   1 -
>>  src/gallium/drivers/svga/svga_screen_cache.h   |   2 +-
>>  src/gallium/state_trackers/nine/basetexture9.h |   2 +-
>>  src/gallium/state_trackers/nine/device9.h  |   2 +-
>>  src/gallium/state_trackers/nine/nine_state.h   |   2 +-
>>  src/gallium/state_trackers/nine/surface9.h |   2 +-
>>  src/gallium/state_trackers/omx/vid_dec.h   |   2 +-
>>  src/gallium/state_trackers/omx/vid_enc.h   |   2 +-
>>  src/gallium/winsys/radeon/drm/radeon_drm_bo.c  |   2 +-
>>  .../winsys/svga/drm/pb_buffer_simple_fenced.c  |   2 +-
>>  src/gallium/winsys/svga/drm/vmw_fence.c|   2 +-
>>  src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c  |   2 +-
>>  src/util/Makefile.sources  |   1 +
>>  src/util/list.h| 146 
>> +
>>  44 files changed, 184 insertions(+), 189 deletions(-)
>>  delete mode 100644 src/gallium/auxiliary/util/u_double_list.h
>>  create mode 100644 src/util/list.h
>>
>> diff --git a/src/gallium/auxiliary/Makefile.sources 
>> b/src/gallium/auxiliary/Makefile.sources
>> index ec7547c..62e6b94 100644
>> --- a/src/gallium/auxiliary/Makefile.sources
>> +++ b/src/gallium/auxiliary/Makefile.sources
>> @@ -197,7 +197,6 @@ C_SOURCES := \
>>   util/u_dirty_

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 6:17 PM, Matt Turner  wrote:
> On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand  wrote:
>> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>>
>>total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>>instructions in affected programs: 1860859 -> 1848166 (-0.68%)
>>helped:4387
>>HURT:  4758
>>GAINED:1499
>>
>> The gained programs are ARB vertex programs that were previously going
>> through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
>> programs can go through the scalar backend so they show up as "gained" in
>> the shader-db results.
>
> Again, I'm kind of confused and disappointed that we're just okay with
> hurting 4700 programs without more analysis. I guess I'll go do
> that...
>
> I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
> 297 -> 161 instructions. More concerning, the number of send
> instructions drop from 36 to 12, and a loop that was 111 instructions
> long suddenly becomes
>
>START B1 <-B0 <-B2
> cmp.ge.f0(8)nullg42<8,8,1>D g7<0,1,0>D
> (+f0) break(8)  JIP: 24 UIP: 24
>END B1 ->B3 ->B2
>START B2 <-B1
> add(8)  g42<1>D g42<8,8,1>D 1D
> while(8)JIP: -32
>END B2 ->B1
>
> That deserves a lot more investigation. I'll take a gamble and say
> something is broken.

I did a little looking at that shader and it looks like NIR dead-coded
the contents of a for loop and, as a result, a bunch of stuff was
promoted to push constants, hence fewer sampler messages.  I didn't
find anything broken but, then again, that's hard to do without being
able to verifiably run the shader.  I'll try and look at the places
where we end up with more instructions.
--Jason
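
For readers who don't read the disassembly, the residual loop quoted above
is, as far as I can tell, what remains of a counted loop once its body has
been eliminated as dead code.  A rough C rendering, with made-up function
and parameter names and a placeholder body standing in for the work that
was removed:

```c
/* Before: the body computes something that never reaches an output. */
static int
loop_before_dce(int counter, int bound, const float *src)
{
   float scratch = 0.0f;
   for (;;) {
      if (counter >= bound)
         break;
      scratch += src[counter];   /* stands in for the eliminated work */
      counter = counter + 1;
   }
   (void)scratch;                /* result is never consumed */
   return counter;
}

/* After: only the loop control survives, matching the quoted blocks. */
static int
loop_after_dce(int counter, int bound)
{
   for (;;) {
      if (counter >= bound)      /* cmp.ge.f0 followed by (+f0) break */
         break;
      counter = counter + 1;     /* add g42, g42, 1 */
   }                             /* while */
   return counter;
}
```

Both variants leave the counter in the same state; the second simply has
nothing left to do per iteration, which is consistent with the drop in
instruction count.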

Re: [Mesa-dev] [PATCH v2 1/6] mesa/es3.1: enable GL_ARB_shader_image_load_store for gles3.1

2015-05-07 Thread Tapani Pälli




On 05/08/2015 12:13 AM, Ian Romanick wrote:

On 05/07/2015 12:57 AM, Marta Lofstedt wrote:

From: Marta Lofstedt 

v2: only expose enums from GL_ARB_shader_image_load_store
for gles 3.1 and GL core

Signed-off-by: Marta Lofstedt 
---
  src/mesa/main/get.c  |  6 ++
  src/mesa/main/get_hash_params.py | 17 -
  2 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 9898197..73739b6 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -355,6 +355,12 @@ static const int extra_ARB_draw_indirect_es31[] = {
 EXTRA_END
  };

+static const int extra_ARB_shader_image_load_store_es31[] = {
+   EXT(ARB_shader_image_load_store),
+   EXTRA_API_ES31,


I think you're missing the patch that adds EXTRA_API_ES31.  Did you
forget to send that one out?


Marta's series builds on top of my patch here that adds EXTRA_API_ES31:

http://lists.freedesktop.org/archives/mesa-dev/2015-May/083593.html


Also, on a few of these patches, I think the old, non-_es31 set of
requirements can be removed due to no longer being used.


+   EXTRA_END
+};
+
  EXTRA_EXT(ARB_texture_cube_map);
  EXTRA_EXT(EXT_texture_array);
  EXTRA_EXT(NV_fog_distance);
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 513d5d2..85c2494 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -413,6 +413,14 @@ descriptor=[
  { "apis": ["GL_CORE", "GLES3"], "params": [
  # GL_ARB_draw_indirect / GLES 3.1
[ "DRAW_INDIRECT_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_draw_indirect_es31" ],
+# GL_ARB_shader_image_load_store / GLES 3.1
+  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
"CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_VERTEX_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
+  [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store_es31"],
  ]},

  # Remaining enums are only in OpenGL
@@ -780,15 +788,6 @@ descriptor=[
[ "MAX_VERTEX_ATTRIB_RELATIVE_OFFSET", 
"CONTEXT_ENUM(Const.MaxVertexAttribRelativeOffset), NO_EXTRA" ],
[ "MAX_VERTEX_ATTRIB_BINDINGS", "CONTEXT_ENUM(Const.MaxVertexAttribBindings), 
NO_EXTRA" ],

-# GL_ARB_shader_image_load_store
-  [ "MAX_IMAGE_UNITS", "CONTEXT_INT(Const.MaxImageUnits), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS", 
"CONTEXT_INT(Const.MaxCombinedImageUnitsAndFragmentOutputs), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_IMAGE_SAMPLES", "CONTEXT_INT(Const.MaxImageSamples), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_VERTEX_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_VERTEX].MaxImageUniforms), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_GEOMETRY_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_GEOMETRY].MaxImageUniforms), 
extra_ARB_shader_image_load_store_and_geometry_shader"],
-  [ "MAX_FRAGMENT_IMAGE_UNIFORMS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxImageUniforms), 
extra_ARB_shader_image_load_store"],
-  [ "MAX_COMBINED_IMAGE_UNIFORMS", "CONTEXT_INT(Const.MaxCombinedImageUniforms), 
extra_ARB_shader_image_load_store"],
-
  # GL_ARB_compute_shader
[ "MAX_COMPUTE_WORK_GROUP_INVOCATIONS", 
"CONTEXT_INT(Const.MaxComputeWorkGroupInvocations), extra_ARB_compute_shader" ],
[ "MAX_COMPUTE_UNIFORM_BLOCKS", "CONST(MAX_COMPUTE_UNIFORM_BLOCKS), 
extra_ARB_compute_shader" ],




[Mesa-dev] [PATCH 1/2] nv50: keep track of PGRAPH state in nv50_screen

2015-05-07 Thread Ilia Mirkin

Normally this is kept in nv50_context, and on switching the active
context, the state is copied from the previous context. However when the
last context is destroyed, this is lost, and a new context might later
be created. When the currently-active context is destroyed, save its
state in the screen, and restore it when setting the current context.
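
The idea can be sketched in isolation as follows.  The structures are
heavily reduced, hypothetical stand-ins (sketch_screen, sketch_context,
sketch_graph_state and the two helpers are not driver names), and only the
save-on-destroy / restore-on-create hand-off is modelled.

```c
#include <stddef.h>

/* Reduced stand-in for the tracked hardware state. */
struct sketch_graph_state {
   unsigned scissor;
   int index_bias;
};

struct sketch_context;

struct sketch_screen {
   struct sketch_context *cur_ctx;        /* currently active context */
   struct sketch_graph_state save_state;  /* state of the last active one */
};

struct sketch_context {
   struct sketch_screen *screen;
   struct sketch_graph_state state;
};

/* On creation: if no context is current, inherit the saved state. */
static void
sketch_context_init(struct sketch_context *ctx, struct sketch_screen *screen)
{
   ctx->screen = screen;
   ctx->state = (struct sketch_graph_state){ 0 };
   if (!screen->cur_ctx) {
      ctx->state = screen->save_state;
      screen->cur_ctx = ctx;
   }
}

/* On destruction: if this context is current, stash its state. */
static void
sketch_context_fini(struct sketch_context *ctx)
{
   if (ctx->screen->cur_ctx == ctx) {
      ctx->screen->cur_ctx = NULL;
      ctx->screen->save_state = ctx->state;
   }
}
```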

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90363
Reported-by: Matteo Bruni 
Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/nv50/nv50_context.c| 11 ++--
 src/gallium/drivers/nouveau/nv50/nv50_context.h| 29 +-
 src/gallium/drivers/nouveau/nv50/nv50_screen.h | 24 ++
 .../drivers/nouveau/nv50/nv50_state_validate.c |  2 ++
 4 files changed, 36 insertions(+), 30 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.c 
b/src/gallium/drivers/nouveau/nv50/nv50_context.c
index 2cfd5db..5b5d391 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.c
@@ -138,8 +138,11 @@ nv50_destroy(struct pipe_context *pipe)
 {
struct nv50_context *nv50 = nv50_context(pipe);
 
-   if (nv50_context_screen(nv50)->cur_ctx == nv50)
-  nv50_context_screen(nv50)->cur_ctx = NULL;
+   if (nv50->screen->cur_ctx == nv50) {
+  nv50->screen->cur_ctx = NULL;
+  /* Save off the state in case another context gets created */
+  nv50->screen->save_state = nv50->state;
+   }
nouveau_pushbuf_bufctx(nv50->base.pushbuf, NULL);
nouveau_pushbuf_kick(nv50->base.pushbuf, nv50->base.pushbuf->channel);
 
@@ -290,6 +293,10 @@ nv50_create(struct pipe_screen *pscreen, void *priv)
pipe->get_sample_position = nv50_context_get_sample_position;
 
if (!screen->cur_ctx) {
+  /* Restore the last context's state here, normally handled during
+   * context switch
+   */
+  nv50->state = screen->save_state;
   screen->cur_ctx = nv50;
   nouveau_pushbuf_bufctx(screen->base.pushbuf, nv50->bufctx);
}
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h 
b/src/gallium/drivers/nouveau/nv50/nv50_context.h
index 45eb554..1f123ef 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
@@ -104,28 +104,7 @@ struct nv50_context {
uint32_t dirty;
boolean cb_dirty;
 
-   struct {
-  uint32_t instance_elts; /* bitmask of per-instance elements */
-  uint32_t instance_base;
-  uint32_t interpolant_ctrl;
-  uint32_t semantic_color;
-  uint32_t semantic_psize;
-  int32_t index_bias;
-  boolean uniform_buffer_bound[3];
-  boolean prim_restart;
-  boolean point_sprite;
-  boolean rt_serialize;
-  boolean flushed;
-  boolean rasterizer_discard;
-  uint8_t tls_required;
-  boolean new_tls_space;
-  uint8_t num_vtxbufs;
-  uint8_t num_vtxelts;
-  uint8_t num_textures[3];
-  uint8_t num_samplers[3];
-  uint8_t prim_size;
-  uint16_t scissor;
-   } state;
+   struct nv50_graph_state state;
 
struct nv50_blend_stateobj *blend;
struct nv50_rasterizer_stateobj *rast;
@@ -191,12 +170,6 @@ nv50_context(struct pipe_context *pipe)
return (struct nv50_context *)pipe;
 }
 
-static INLINE struct nv50_screen *
-nv50_context_screen(struct nv50_context *nv50)
-{
-   return nv50_screen(&nv50->base.screen->base);
-}
-
 /* return index used in nv50_context arrays for a specific shader type */
 static INLINE unsigned
 nv50_context_shader_stage(unsigned pipe)
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.h 
b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
index f8ce365..881051b 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_screen.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.h
@@ -25,10 +25,34 @@ struct nv50_context;
 
 struct nv50_blitter;
 
+struct nv50_graph_state {
+   uint32_t instance_elts; /* bitmask of per-instance elements */
+   uint32_t instance_base;
+   uint32_t interpolant_ctrl;
+   uint32_t semantic_color;
+   uint32_t semantic_psize;
+   int32_t index_bias;
+   boolean uniform_buffer_bound[3];
+   boolean prim_restart;
+   boolean point_sprite;
+   boolean rt_serialize;
+   boolean flushed;
+   boolean rasterizer_discard;
+   uint8_t tls_required;
+   boolean new_tls_space;
+   uint8_t num_vtxbufs;
+   uint8_t num_vtxelts;
+   uint8_t num_textures[3];
+   uint8_t num_samplers[3];
+   uint8_t prim_size;
+   uint16_t scissor;
+};
+
 struct nv50_screen {
struct nouveau_screen base;
 
struct nv50_context *cur_ctx;
+   struct nv50_graph_state save_state;
 
struct nouveau_bo *code;
struct nouveau_bo *uniforms;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
index 85e19b4..116bf4b 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state_validate.c
@@ -394,6 +394,8

[Mesa-dev] [PATCH 2/2] nvc0: keep track of PGRAPH state in nvc0_screen

2015-05-07 Thread Ilia Mirkin

See identical commit for nv50. Destroying the current context and then
creating a new one or switching to another existing context would cause
the "current" state to not be properly initialized, so we save it off in
the screen.

Signed-off-by: Ilia Mirkin 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/nvc0/nvc0_context.c|  7 +-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h| 24 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 25 ++
 .../drivers/nouveau/nvc0/nvc0_state_validate.c |  2 ++
 4 files changed, 34 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
index 7662fb5..7904984 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.c
@@ -139,8 +139,12 @@ nvc0_destroy(struct pipe_context *pipe)
 {
    struct nvc0_context *nvc0 = nvc0_context(pipe);
 
-   if (nvc0->screen->cur_ctx == nvc0)
+   if (nvc0->screen->cur_ctx == nvc0) {
       nvc0->screen->cur_ctx = NULL;
+      nvc0->screen->save_state = nvc0->state;
+      nvc0->screen->save_state.tfb = NULL;
+   }
+
    /* Unset bufctx, we don't want to revalidate any resources after the flush.
     * Other contexts will always set their bufctx again on action calls.
     */
@@ -303,6 +307,7 @@ nvc0_create(struct pipe_screen *pscreen, void *priv)
    pipe->get_sample_position = nvc0_context_get_sample_position;
 
    if (!screen->cur_ctx) {
+      nvc0->state = screen->save_state;
       screen->cur_ctx = nvc0;
       nouveau_pushbuf_bufctx(screen->base.pushbuf, nvc0->bufctx);
    }
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index ef251f3..a8d7593 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -113,29 +113,7 @@ struct nvc0_context {
    uint32_t dirty;
    uint32_t dirty_cp; /* dirty flags for compute state */
 
-   struct {
-      boolean flushed;
-      boolean rasterizer_discard;
-      boolean early_z_forced;
-      boolean prim_restart;
-      uint32_t instance_elts; /* bitmask of per-instance elements */
-      uint32_t instance_base;
-      uint32_t constant_vbos;
-      uint32_t constant_elts;
-      int32_t index_bias;
-      uint16_t scissor;
-      uint8_t vbo_mode; /* 0 = normal, 1 = translate, 3 = translate, forced */
-      uint8_t num_vtxbufs;
-      uint8_t num_vtxelts;
-      uint8_t num_textures[6];
-      uint8_t num_samplers[6];
-      uint8_t tls_required; /* bitmask of shader types using l[] */
-      uint8_t c14_bound; /* whether immediate array constbuf is bound */
-      uint8_t clip_enable;
-      uint32_t clip_mode;
-      uint32_t uniform_buffer_bound[5];
-      struct nvc0_transform_feedback_state *tfb;
-   } state;
+   struct nvc0_graph_state state;
 
    struct nvc0_blend_stateobj *blend;
    struct nvc0_rasterizer_stateobj *rast;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
index 8a1991f..bce0f4a 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.h
@@ -27,10 +27,35 @@ struct nvc0_context;
 
 struct nvc0_blitter;
 
+struct nvc0_graph_state {
+   boolean flushed;
+   boolean rasterizer_discard;
+   boolean early_z_forced;
+   boolean prim_restart;
+   uint32_t instance_elts; /* bitmask of per-instance elements */
+   uint32_t instance_base;
+   uint32_t constant_vbos;
+   uint32_t constant_elts;
+   int32_t index_bias;
+   uint16_t scissor;
+   uint8_t vbo_mode; /* 0 = normal, 1 = translate, 3 = translate, forced */
+   uint8_t num_vtxbufs;
+   uint8_t num_vtxelts;
+   uint8_t num_textures[6];
+   uint8_t num_samplers[6];
+   uint8_t tls_required; /* bitmask of shader types using l[] */
+   uint8_t c14_bound; /* whether immediate array constbuf is bound */
+   uint8_t clip_enable;
+   uint32_t clip_mode;
+   uint32_t uniform_buffer_bound[5];
+   struct nvc0_transform_feedback_state *tfb;
+};
+
 struct nvc0_screen {
    struct nouveau_screen base;
 
    struct nvc0_context *cur_ctx;
+   struct nvc0_graph_state save_state;
 
    int num_occlusion_queries_active;
 
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
index 6051f12..d3ad81d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state_validate.c
@@ -543,6 +543,8 @@ nvc0_switch_pipe_context(struct nvc0_context *ctx_to)
 
    if (ctx_from)
       ctx_to->state = ctx_from->state;
+   else
+      ctx_to->state = ctx_to->screen->save_state;
 
    ctx_to->dirty = ~0;
    ctx_to->viewports_dirty = ~0;
-- 
2.3.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/lis
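
The save/restore idea in the nvc0 patch above can be sketched in isolation. This is a minimal model, not the driver code: struct graph_state, struct screen, struct context, context_destroy() and context_switch() are hypothetical, trimmed-down stand-ins for nvc0_graph_state, nvc0_screen, nvc0_context, nvc0_destroy() and nvc0_switch_pipe_context(), and only three state fields are kept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nvc0_graph_state: the driver's shadow copy of
 * what the hardware is currently programmed with. */
struct graph_state {
   uint16_t scissor;
   uint8_t num_vtxbufs;
   void *tfb; /* points at an object owned by one context */
};

struct context;

/* Hypothetical stand-in for nvc0_screen. */
struct screen {
   struct context *cur_ctx;       /* context whose state is on the hardware */
   struct graph_state save_state; /* last known state when cur_ctx is NULL */
};

/* Hypothetical stand-in for nvc0_context. */
struct context {
   struct screen *screen;
   struct graph_state state;
};

/* On destroy, park the current state in the screen so that it outlives the
 * context. The tfb pointer is cleared because its target goes away with
 * the context. */
static void context_destroy(struct context *ctx)
{
   if (ctx->screen->cur_ctx == ctx) {
      ctx->screen->cur_ctx = NULL;
      ctx->screen->save_state = ctx->state;
      ctx->screen->save_state.tfb = NULL;
   }
}

/* On switch, inherit the state from the previous context if there is one,
 * otherwise from the copy parked in the screen. */
static void context_switch(struct context *to)
{
   struct context *from = to->screen->cur_ctx;

   if (from)
      to->state = from->state;
   else
      to->state = to->screen->save_state;
   to->screen->cur_ctx = to;
}
```

In this model, destroying the current context and then switching to another one leaves the second context with the scissor and vertex-buffer counts that were last programmed, instead of whatever its own state struct happened to contain.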

Re: [Mesa-dev] [PATCH] i965: Use NIR by default for vertex shaders on GEN8+

2015-05-07 Thread Jason Ekstrand

On Thu, May 7, 2015 at 8:49 PM, Jason Ekstrand  wrote:
> On Thu, May 7, 2015 at 6:17 PM, Matt Turner  wrote:
>> On Thu, May 7, 2015 at 4:50 PM, Jason Ekstrand  wrote:
>>> GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
>>>
>>>    total instructions in shared programs: 2724483 -> 2711790 (-0.47%)
>>>    instructions in affected programs:     1860859 -> 1848166 (-0.68%)
>>>    helped:                                4387
>>>    HURT:                                  4758
>>>    GAINED:                                1499
>>>
>>> The gained programs are ARB vertex programs that were previously going
>>> through the vec4 backend.  Now that we have prog_to_nir, ARB vertex
>>> programs can go through the scalar backend so they show up as "gained" in
>>> the shader-db results.
>>
>> Again, I'm kind of confused and disappointed that we're just okay with
>> hurting 4700 programs without more analysis. I guess I'll go do
>> that...
>>
>> I'm concerned -- lots of shaders like left-4-dead-2/low/3699 go from
>> 297 -> 161 instructions. More concerning, the number of send
>> instructions drop from 36 to 12, and a loop that was 111 instructions
>> long suddenly becomes
>>
>>    START B1 <-B0 <-B2
>> cmp.ge.f0(8)    null            g42<8,8,1>D     g7<0,1,0>D
>> (+f0) break(8)  JIP: 24         UIP: 24
>>    END B1 ->B3 ->B2
>>    START B2 <-B1
>> add(8)          g42<1>D         g42<8,8,1>D     1D
>> while(8)        JIP: -32
>>    END B2 ->B1
>>
>> That deserves a lot more investigation. I'll take a gamble and say
>> something is broken.
>
> I did a little looking at that shader and it looks like NIR dead-coded
> the contents of a for loop and, as a result, a bunch of stuff was
> promoted to push constants, hence fewer sampler messages.  I didn't
> find anything broken but, then again, that's hard to do without being
> able to verifiably run the shader.  I'll try and look at the places
> where we end up with more instructions.
> --Jason

Looking at the assembly even closer, it looks like NIR did 100% the
right thing.  The shader had a for loop that computes a bunch of
values that either don't get used at all or are over-written before
they are used.  (I didn't check every value written in the loop but I
did check a good half-dozen or so.)  NIR, probably thanks to SSA,
realized that these values were never used for anything, and
dead-coded the entire contents of the for loop.  The result was that
the 12 (yes, 12) pull constant loads inside the loop went away and the
9 after the loop were promoted to push constants.  Unfortunately, NIR
isn't yet smart enough to remove the loop entirely but an empty loop
isn't nearly as expensive as sampler invocations so I'm not too
worried about it.

I'll try and take a look at some of the hurt programs tomorrow.
--Jason
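
The situation described above (a loop whose results are overwritten before they are read) can be illustrated with a small C sketch. This is an illustrative stand-in rather than the actual shader: shade_with_dead_loop() and shade_after_dce() are hypothetical names, and consts plays the role of the constants being loaded.

```c
#include <assert.h>

/* As written in the source: the loop accumulates into tmp, but tmp is
 * overwritten right after the loop, so nothing computed in the loop
 * is ever observed. */
static float shade_with_dead_loop(const float *consts, int n)
{
   float tmp = 0.0f;

   for (int i = 0; i < n; i++)
      tmp += consts[i]; /* one load per iteration, result never used */

   tmp = consts[0] * 2.0f; /* overwrites everything the loop produced */
   return tmp;
}

/* What remains once the dead stores are removed: the loop body is empty
 * (the loop itself is kept here to mirror the empty loop left in the
 * generated code) and only the load that feeds the result survives. */
static float shade_after_dce(const float *consts, int n)
{
   for (int i = 0; i < n; i++) {
      /* empty */
   }

   return consts[0] * 2.0f;
}
```

Both functions return the same value for any input with at least one element, which is why dropping the loop body is a valid transformation; the second version simply performs no loads inside the loop.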

[Mesa-dev] [Bug 86701] [regression] weston-simple-egl not running anymore inside qemu

2015-05-07 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=86701

--- Comment #14 from Pekka Paalanen  ---
Patch series posted by Axel Davy:
http://lists.freedesktop.org/archives/mesa-dev/2015-May/083254.html
Reviews combined from Daniel Stone and Dave Airlie cover it all.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH] nir: fix sampler lowering pass for arrays

2015-05-07 Thread Tapani Pälli

This fixes bugs with special cases where we have arrays of
structures containing samplers or arrays of samplers.

I've verified that patch results in calculating same index value as
returned by _mesa_get_sampler_uniform_value for IR. Patch makes
following ES3 conformance test pass:

ES3-CTS.shaders.struct.uniform.sampler_array_fragment

Signed-off-by: Tapani Pälli 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90114
---
 src/glsl/nir/nir_lower_samplers.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_lower_samplers.cpp b/src/glsl/nir/nir_lower_samplers.cpp
index cf8ab83..9859cc0 100644
--- a/src/glsl/nir/nir_lower_samplers.cpp
+++ b/src/glsl/nir/nir_lower_samplers.cpp
@@ -78,7 +78,11 @@ lower_sampler(nir_tex_instr *instr, const struct gl_shader_program *shader_progr
          instr->sampler_index *= glsl_get_length(deref->type);
          switch (deref_array->deref_array_type) {
          case nir_deref_array_type_direct:
-            instr->sampler_index += deref_array->base_offset;
+
+            /* If this is an array of samplers. */
+            if (deref->child->type->base_type == GLSL_TYPE_SAMPLER)
+               instr->sampler_index += deref_array->base_offset;
+
             if (deref_array->deref.child)
                ralloc_asprintf_append(&name, "[%u]", deref_array->base_offset);
             break;
-- 
2.1.0

