Re: [Mesa-dev] [PATCH v3 2/5] glsl: add support for the imageSize builtin

2015-08-14 Thread Martin Peres

On 14/08/15 08:42, Pohjolainen, Topi wrote:

On Thu, Aug 13, 2015 at 07:58:53PM +0300, Martin Peres wrote:

The code is heavily inspired from Francisco Jerez's code supporting the
image_load_store extension.

Backends willing to support this builtin should handle
__intrinsic_image_size.

v2: Based on the review of Ilia Mirkin
- Enable the extension for GLES 3.1
- Fix indentation
- Fix the return type (float to int, number of components for CubeImages)
- Add a warning related to GLES 3.1

v3: Based on the review of Francisco Jerez
- Refactor the code to share both add_image_function and _image with the other
   image-related functions

Signed-off-by: Martin Peres 
---
  src/glsl/builtin_functions.cpp | 109 +++--
  1 file changed, 93 insertions(+), 16 deletions(-)

diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index 2175c66..5d0a825 100644
--- a/src/glsl/builtin_functions.cpp
+++ b/src/glsl/builtin_functions.cpp
@@ -399,6 +399,13 @@ shader_image_load_store(const _mesa_glsl_parse_state 
*state)
  }
  
  static bool

+shader_image_size(const _mesa_glsl_parse_state *state)
+{
+   return (state->is_version(430, 310) ||
+   state->ARB_shader_image_size_enable);

You can drop the extra ().

Thanks, fixed locally.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 4/5] i965: handle nir_intrinsic_image_size

2015-08-14 Thread Martin Peres

On 13/08/15 20:09, Ilia Mirkin wrote:

On Thu, Aug 13, 2015 at 1:00 PM, Martin Peres
 wrote:

v2, Review from Francisco Jerez:
- avoid the camelCase for the booleans
- init the booleans using the sampler type
- force the initialization of all the components of the output register

Signed-off-by: Martin Peres 
---
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 48 
  1 file changed, 48 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index ce4153d..cc0a5a6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1406,6 +1406,54 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
break;
 }

+   case nir_intrinsic_image_size: {
+  /* Get the referenced image variable and type. */
+  const nir_variable *var = instr->variables[0]->var;
+  const glsl_type *type = var->type->without_array();
+  const brw_reg_type base_type = get_image_base_type(type);
+
+  /* Get the size of the image. */
+  const fs_reg image = get_nir_image_deref(instr->variables[0]);
+  const fs_reg size = offset(image, bld, BRW_IMAGE_PARAM_SIZE_OFFSET);
+
+  /*
+   * For 1DArray image types, the array index is stored in the Z component.
+   * Fix this by swizzling the Z component to the Y component.
+   */
+  const bool is_1d_array_image =
+  (type->sampler_dimensionality == GLSL_SAMPLER_DIM_1D &&
+   type->sampler_array);
+
+  /*
+   * For CubeMapArray images, we should count the number of cubes instead

I think you're trying to use the actual glsl image type names? In that
case it should be CubeArray, not CubeMapArray. (It's imageCubeArray.)


Thanks :) Fixed locally.



+   * of the number of faces. Fix it by dividing the (Z component) by 6.
+   */
+  const bool is_cube_map_array_image =
+  (type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE &&
+   type->sampler_array);
+
+  /* Copy all the components. */
+  const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic];
+  for (int c = 0; c < info->dest_components; ++c) {
+ if (c > type->coordinate_components()) {
+ bld.MOV(offset(retype(dest, base_type), bld, c),
+ fs_reg(1));
+ } else if (c == 1 && is_1d_array_image) {
+bld.MOV(offset(retype(dest, base_type), bld, c),
+offset(size, bld, 2));
+ } else if (c == 2 && is_cube_map_array_image) {
+bld.emit(SHADER_OPCODE_INT_QUOTIENT,
+ offset(retype(dest, base_type), bld, c),
+ offset(size, bld, c), fs_reg(6));
+ } else {
+bld.MOV(offset(retype(dest, base_type), bld, c),
+offset(size, bld, c));
+ }
+   }
+
+  break;
+   }
+
 case nir_intrinsic_load_front_face:
bld.MOV(retype(dest, BRW_REGISTER_TYPE_D),
*emit_frontfacing_interpolation());
--
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 4/5] i965: handle nir_intrinsic_image_size

2015-08-14 Thread Martin Peres

On 14/08/15 08:35, Pohjolainen, Topi wrote:

On Thu, Aug 13, 2015 at 08:00:43PM +0300, Martin Peres wrote:

v2, Review from Francisco Jerez:
- avoid the camelCase for the booleans
- init the booleans using the sampler type
- force the initialization of all the components of the output register

Signed-off-by: Martin Peres 
---
  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 48 
  1 file changed, 48 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index ce4153d..cc0a5a6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1406,6 +1406,54 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
break;
 }
  
+   case nir_intrinsic_image_size: {

+  /* Get the referenced image variable and type. */
+  const nir_variable *var = instr->variables[0]->var;
+  const glsl_type *type = var->type->without_array();
+  const brw_reg_type base_type = get_image_base_type(type);
+
+  /* Get the size of the image. */
+  const fs_reg image = get_nir_image_deref(instr->variables[0]);
+  const fs_reg size = offset(image, bld, BRW_IMAGE_PARAM_SIZE_OFFSET);
+
+  /*
+   * For 1DArray image types, the array index is stored in the Z component.

Just a few style nits from my part.

Usually (and in the rest of the file being modified) multi-line comments do
not have separate first line, instead:

  /* For 1DArray image types, the array index is stored in the Z
   * component.


+   * Fix this by swizzling the Z component to the Y component.
+   */
+  const bool is_1d_array_image =
+  (type->sampler_dimensionality == GLSL_SAMPLER_DIM_1D &&
+   type->sampler_array);

Indentation here looks a little odd and you can drop the extra (). I would
write this:

  const bool is_1d_array_image =
 type->sampler_dimensionality == GLSL_SAMPLER_DIM_1D &&
 type->sampler_array;

Same comments just below.


Thanks Topi, fixed everything.



+
+  /*
+   * For CubeMapArray images, we should count the number of cubes instead
+   * of the number of faces. Fix it by dividing the (Z component) by 6.
+   */
+  const bool is_cube_map_array_image =
+  (type->sampler_dimensionality == GLSL_SAMPLER_DIM_CUBE &&
+   type->sampler_array);
+
+  /* Copy all the components. */
+  const nir_intrinsic_info *info = &nir_intrinsic_infos[instr->intrinsic];
+  for (int c = 0; c < info->dest_components; ++c) {
+ if (c > type->coordinate_components()) {
In addition to the changes you asked for, I added a (int) cast to c to 
avoid a warning here as type->coordinate_components() returns an int.

+ bld.MOV(offset(retype(dest, base_type), bld, c),
+ fs_reg(1));
+ } else if (c == 1 && is_1d_array_image) {
+bld.MOV(offset(retype(dest, base_type), bld, c),
+offset(size, bld, 2));
+ } else if (c == 2 && is_cube_map_array_image) {
+bld.emit(SHADER_OPCODE_INT_QUOTIENT,
+ offset(retype(dest, base_type), bld, c),
+ offset(size, bld, c), fs_reg(6));
+ } else {
+bld.MOV(offset(retype(dest, base_type), bld, c),
+offset(size, bld, c));
+ }
+   }
+
+  break;
+   }
+
 case nir_intrinsic_load_front_face:
bld.MOV(retype(dest, BRW_REGISTER_TYPE_D),
*emit_frontfacing_interpolation());
--
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/70] i965: Remove early release of DRI2 miptree

2015-08-14 Thread Chris Wilson
On Thu, Aug 13, 2015 at 09:58:52PM -0700, Kenneth Graunke wrote:
> On Thursday, August 13, 2015 02:57:20 PM Martin Peres wrote:
> > On 07/08/15 23:13, Chris Wilson wrote:
> > > intel_update_winsys_renderbuffer_miptree() will release the existing
> > > miptree when wrapping a new DRI2 buffer, so we can remove the early
> > > release and so prevent a NULL mt dereference should importing the new
> > > DRI2 name fail for any reason. (Reusing the old DRI2 name will result
> > > in the rendering going astray, to a stale buffer, and not shown on the
> > > screen, but it allows us to issue a warning and not crash much later in
> > > innocent code.)
> > >
> > > Signed-off-by: Chris Wilson 
> > > ---
> > >   src/mesa/drivers/dri/i965/brw_context.c | 1 -
> > >   1 file changed, 1 deletion(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> > > b/src/mesa/drivers/dri/i965/brw_context.c
> > > index e8d1396..72f3897 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_context.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_context.c
> > > @@ -1388,7 +1388,6 @@ intel_process_dri2_buffer(struct brw_context *brw,
> > > buffer->cpp, buffer->pitch);
> > >  }
> > >   
> > > -   intel_miptree_release(&rb->mt);
> > >  bo = drm_intel_bo_gem_create_from_name(brw->bufmgr, buffer_name,
> > > buffer->name);
> > >  if (!bo) {
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281
> > Reviewed-by: Martin Peres 
> > 
> > If no-one has any objection, I will push the patch tomorrow to avoid the 
> > crashes experienced by a lot of users, myself included!
> 
> (resending with Chris's actual email...sorry!)
> 
> This seems reasonable from a robustness point of view - try to limp
> along a bit further and hope the import works next time, somehow.
> 
> That said, I do have one concern: we might get into trouble with
> multisample buffers.
> 
> In the multisample case, irb->mt is the MSAA buffer, and
> irb->singlesample_mt is the actual single-sampled DRI2 buffer.
> 
> Previously, this call always released irb->mt, deleting the MSAA buffer
> every time.  Then, intel_update_winsys_renderbuffer_miptree would hit
> the !irb->mt case, creating a fresh new irb->mt, and setting
> irb->need_downsample = false.  That may not happen now, which is a
> subtle change.  (It might be OK...)
> 
> If front buffer drawing, we call intel_renderbuffer_upsample(), which
> asserts that irb->need_downsample == false.  So...if somehow we hit this
> path with a dirty multisample buffer and front buffer rendering...then
> I think we'd assert fail and die.  I'm not sure if that can happen; it
> at least seems unlikely...
> 
> Do you think the new multisampled behavior will work OK?  If so, perhaps
> we can at least leave a note about it in the commit message.

Hmm, yes we should be invalidating the multisample buffer if we swap out
the "true" singlesample buffer beneath it. Either the contents are
irrelevant (new undefined back buffer after swap) or the contents are
well defined and different (front buffer, aged back buffer). In the
latter case, we expect we flushed the multisample buffer before the
swap, so the only likely case is intel_prepare_render() being handed a
new front buffer which could have pending rendering, now invalid.

Just disregarding the unresolved multisample buffer seems like an easy
consistency fix:

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index e85c3f0..1ed39f2 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -837,6 +837,7 @@ intel_update_winsys_renderbuffer_miptree(struct brw_context 
*intel,
} else {
   intel_miptree_release(&irb->singlesample_mt);
   irb->singlesample_mt = singlesample_mt;
+  irb->need_downsample = false;
 
   if (!irb->mt ||
   irb->mt->logical_width0 != width ||
@@ -849,7 +850,6 @@ intel_update_winsys_renderbuffer_miptree(struct brw_context 
*intel,
  if (!multisample_mt)
 goto fail;
 
- irb->need_downsample = false;
  intel_miptree_release(&irb->mt);
  irb->mt = multisample_mt;
   }

-- 
Chris Wilson, Intel Open Source Technology Centre
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] r600, compute: setup RATs for write-only images

2015-08-14 Thread Zoltan Gilian
v2: Set CB_TARGET_MASK to zero for compute resources (Marek).
Remove unnecessary use of cb_target_mask (Marek).
Fix crash on non-contiguous RAT setup.

Non-contiguous RAT setup can occur when the kernel signature contains
no global buffer arguments, but there are write-only image args.
In this case RAT0 is not set, but RAT1 is.
---
 src/gallium/drivers/r600/evergreen_compute.c | 106 +--
 src/gallium/drivers/r600/evergreen_state.c   |  43 ++-
 src/gallium/drivers/r600/r600_pipe.h |   4 +-
 src/gallium/drivers/radeon/r600_texture.c|   1 +
 4 files changed, 113 insertions(+), 41 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index b191e0d..16816cf 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -99,8 +99,40 @@ struct r600_resource* r600_compute_buffer_alloc_vram(
return (struct r600_resource *)buffer;
 }
 
+static void evergreen_set_cbuf(
+   struct r600_context *ctx,
+   unsigned id,
+   struct pipe_surface *surf)
+{
+   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
+
+   /* Add the RAT to the list of color buffers */
+   state->cbufs[id] = surf;
+
+   /* Update the number of color buffers */
+   state->nr_cbufs = MAX2(id + 1, state->nr_cbufs);
+
+   /* cb_target_mask should be 0 for compute resources. */
+   ctx->compute_cb_target_mask &= ~(0xf << (id << 2));
+}
 
-static void evergreen_set_rat(
+static void evergreen_unset_cbuf(
+   struct r600_context *ctx,
+   unsigned id)
+{
+   unsigned i;
+   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
+
+   state->cbufs[id] = NULL;
+   for (i = state->nr_cbufs; i > 0; --i) {
+   if (state->cbufs[i - 1]) {
+   break;
+   }
+   }
+   state->nr_cbufs = i;
+}
+
+static void evergreen_set_rat_buf(
struct r600_context *ctx,
unsigned id,
struct r600_resource* bo,
@@ -123,23 +155,11 @@ static void evergreen_set_rat(
rat_templ.u.tex.first_layer = 0;
rat_templ.u.tex.last_layer = 0;
 
-   /* Add the RAT the list of color buffers */
-   ctx->framebuffer.state.cbufs[id] = ctx->b.b.create_surface(
-   (struct pipe_context *)ctx,
-   (struct pipe_resource *)bo, &rat_templ);
-
-   /* Update the number of color buffers */
-   ctx->framebuffer.state.nr_cbufs =
-   MAX2(id + 1, ctx->framebuffer.state.nr_cbufs);
-
-   /* Update the cb_target_mask
-* XXX: I think this is a potential spot for bugs once we start doing
-* GL interop.  cb_target_mask may be modified in the 3D sections
-* of this driver. */
-   ctx->compute_cb_target_mask |= (0xf << (id * 4));
+   evergreen_set_cbuf(ctx, id, ctx->b.b.create_surface(
+   (struct pipe_context *)ctx, (struct pipe_resource *)bo, 
&rat_templ));
 
surf = (struct r600_surface*)ctx->framebuffer.state.cbufs[id];
-   evergreen_init_color_surface_rat(rctx, surf);
+   evergreen_init_color_surface_rat_buf(ctx, surf);
 }
 
 static void evergreen_cs_set_vertex_buffer(
@@ -436,6 +456,8 @@ static void compute_emit_cs(struct r600_context *ctx, const 
uint *block_layout,
/* XXX support more than 8 colorbuffers (the offsets are not a multiple 
of 0x3C for CB8-11) */
for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
struct r600_surface *cb = (struct 
r600_surface*)ctx->framebuffer.state.cbufs[i];
+   if (!cb)
+   continue;
unsigned reloc = r600_context_bo_reloc(&ctx->b, 
&ctx->b.rings.gfx,
   (struct 
r600_resource*)cb->base.texture,
   RADEON_USAGE_READWRITE,
@@ -623,32 +645,34 @@ static void evergreen_set_compute_resources(struct 
pipe_context * ctx_,
struct pipe_surface ** surfaces)
 {
struct r600_context *ctx = (struct r600_context *)ctx_;
-   struct r600_surface **resources = (struct r600_surface **)surfaces;
+   struct pipe_surface *surf = NULL;
 
COMPUTE_DBG(ctx->screen, "*** evergreen_set_compute_resources: start = 
%u count = %u\n",
start, count);
 
+   if (!surfaces) {
+   for (unsigned i = 0; i < count; ++i)
+   evergreen_unset_cbuf(ctx, 1 + i);
+   return;
+   }
+
for (unsigned i = 0; i < count; i++) {
-   /* The First two vertex buffers are reserved for parameters and
-* global buffers. */
-   unsigned vtx_id = 2 + i;
-   if (resources[i]) {
-   struct r600_resource_global *buffer =
-   (struct r600_resource_global*)
-

Re: [Mesa-dev] [PATCH 2/2] r600, compute: setup RATs for write-only images

2015-08-14 Thread Zoltán Gilián
On 8/14/15, Zoltan Gilian  wrote:
> Remove unnecessary use of cb_target_mask (Marek).

This is an error, it should be "Remove unnecessary use of
util_range_add". I'll fix this in a moment.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] r600, compute: setup RATs for write-only images

2015-08-14 Thread Marek Olšák
On Fri, Aug 14, 2015 at 11:16 AM, Zoltan Gilian  wrote:
> v2: Set CB_TARGET_MASK to zero for compute resources (Marek).
> Remove unnecessary use of cb_target_mask (Marek).
> Fix crash on non-contiguous RAT setup.
>
> Non-contiguous RAT setup can occur when the kernel signature contains
> no global buffer arguments, but there are write-only image args.
> In this case RAT0 is not set, but RAT1 is.
> ---
>  src/gallium/drivers/r600/evergreen_compute.c | 106 
> +--
>  src/gallium/drivers/r600/evergreen_state.c   |  43 ++-
>  src/gallium/drivers/r600/r600_pipe.h |   4 +-
>  src/gallium/drivers/radeon/r600_texture.c|   1 +
>  4 files changed, 113 insertions(+), 41 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
> b/src/gallium/drivers/r600/evergreen_compute.c
> index b191e0d..16816cf 100644
> --- a/src/gallium/drivers/r600/evergreen_compute.c
> +++ b/src/gallium/drivers/r600/evergreen_compute.c
> @@ -99,8 +99,40 @@ struct r600_resource* r600_compute_buffer_alloc_vram(
> return (struct r600_resource *)buffer;
>  }
>
> +static void evergreen_set_cbuf(
> +   struct r600_context *ctx,
> +   unsigned id,
> +   struct pipe_surface *surf)
> +{
> +   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
> +
> +   /* Add the RAT to the list of color buffers */
> +   state->cbufs[id] = surf;
> +
> +   /* Update the number of color buffers */
> +   state->nr_cbufs = MAX2(id + 1, state->nr_cbufs);
> +
> +   /* cb_target_mask should be 0 for compute resources. */
> +   ctx->compute_cb_target_mask &= ~(0xf << (id << 2));
> +}
>
> -static void evergreen_set_rat(
> +static void evergreen_unset_cbuf(
> +   struct r600_context *ctx,
> +   unsigned id)
> +{
> +   unsigned i;
> +   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
> +
> +   state->cbufs[id] = NULL;
> +   for (i = state->nr_cbufs; i > 0; --i) {
> +   if (state->cbufs[i - 1]) {
> +   break;
> +   }
> +   }
> +   state->nr_cbufs = i;
> +}
> +
> +static void evergreen_set_rat_buf(
> struct r600_context *ctx,
> unsigned id,
> struct r600_resource* bo,
> @@ -123,23 +155,11 @@ static void evergreen_set_rat(
> rat_templ.u.tex.first_layer = 0;
> rat_templ.u.tex.last_layer = 0;
>
> -   /* Add the RAT the list of color buffers */
> -   ctx->framebuffer.state.cbufs[id] = ctx->b.b.create_surface(
> -   (struct pipe_context *)ctx,
> -   (struct pipe_resource *)bo, &rat_templ);
> -
> -   /* Update the number of color buffers */
> -   ctx->framebuffer.state.nr_cbufs =
> -   MAX2(id + 1, ctx->framebuffer.state.nr_cbufs);
> -
> -   /* Update the cb_target_mask
> -* XXX: I think this is a potential spot for bugs once we start doing
> -* GL interop.  cb_target_mask may be modified in the 3D sections
> -* of this driver. */
> -   ctx->compute_cb_target_mask |= (0xf << (id * 4));
> +   evergreen_set_cbuf(ctx, id, ctx->b.b.create_surface(
> +   (struct pipe_context *)ctx, (struct pipe_resource *)bo, 
> &rat_templ));
>
> surf = (struct r600_surface*)ctx->framebuffer.state.cbufs[id];
> -   evergreen_init_color_surface_rat(rctx, surf);
> +   evergreen_init_color_surface_rat_buf(ctx, surf);
>  }
>
>  static void evergreen_cs_set_vertex_buffer(
> @@ -436,6 +456,8 @@ static void compute_emit_cs(struct r600_context *ctx, 
> const uint *block_layout,
> /* XXX support more than 8 colorbuffers (the offsets are not a 
> multiple of 0x3C for CB8-11) */
> for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
> struct r600_surface *cb = (struct 
> r600_surface*)ctx->framebuffer.state.cbufs[i];
> +   if (!cb)
> +   continue;
> unsigned reloc = r600_context_bo_reloc(&ctx->b, 
> &ctx->b.rings.gfx,
>(struct 
> r600_resource*)cb->base.texture,
>RADEON_USAGE_READWRITE,
> @@ -623,32 +645,34 @@ static void evergreen_set_compute_resources(struct 
> pipe_context * ctx_,
> struct pipe_surface ** surfaces)
>  {
> struct r600_context *ctx = (struct r600_context *)ctx_;
> -   struct r600_surface **resources = (struct r600_surface **)surfaces;
> +   struct pipe_surface *surf = NULL;
>
> COMPUTE_DBG(ctx->screen, "*** evergreen_set_compute_resources: start 
> = %u count = %u\n",
> start, count);
>
> +   if (!surfaces) {
> +   for (unsigned i = 0; i < count; ++i)
> +   evergreen_unset_cbuf(ctx, 1 + i);
> +   return;
> +   }
> +
> for (unsigned i = 0; i < count; i++) {
> -   /* The First two vertex buffers are 

Re: [Mesa-dev] [Mesa-stable] [PATCH] radeonsi: add new OLAND pci id

2015-08-14 Thread Emil Velikov
On 13 August 2015 at 22:59, Alex Deucher  wrote:
> On Thu, Aug 13, 2015 at 5:29 PM, Emil Velikov  
> wrote:
>> On 13/08/15 22:22, Emil Velikov wrote:
>>> On 13/08/15 18:11, Alex Deucher wrote:
 On Thu, Aug 13, 2015 at 12:06 PM, Emil Velikov  
 wrote:
> On 13 August 2015 at 16:42, Alex Deucher  wrote:
>> On Thu, Aug 13, 2015 at 11:11 AM, Emil Velikov 
>>  wrote:
>>> Hi Alex,
>>>
>>> On 10 August 2015 at 20:36, Alex Deucher  wrote:
 Signed-off-by: Alex Deucher 
 Cc: mesa-sta...@lists.freedesktop.org
 ---
  include/pci_ids/radeonsi_pci_ids.h | 1 +
  1 file changed, 1 insertion(+)

 diff --git a/include/pci_ids/radeonsi_pci_ids.h 
 b/include/pci_ids/radeonsi_pci_ids.h
 index c01ee20..52eada1 100644
 --- a/include/pci_ids/radeonsi_pci_ids.h
 +++ b/include/pci_ids/radeonsi_pci_ids.h
 @@ -63,6 +63,7 @@ CHIPSET(0x6608, OLAND_6608, OLAND)
  CHIPSET(0x6610, OLAND_6610, OLAND)
  CHIPSET(0x6611, OLAND_6611, OLAND)
  CHIPSET(0x6613, OLAND_6613, OLAND)
 +CHIPSET(0x6617, OLAND_6617, OLAND)
>>> Has there been any ideas/plans on getting this information
>>> consolidated in a single place ?
>>>
>>> It feels a bit "dirty" having the same information in four places -
>>> kernel, libdrm, ddx, mesa.
>>
>> There's not really a good solution that I know of due to the way X
>> works.
> If I have to guess, obtaining OLAND via DRM_IOCTL_RADEON_INFO won't be
> impacted by (nor have any impact on) how X works. Shouldn't this "just
> work" or there is something subtly off with the idea ? Can you
> elaborate what part of X might be an obstacle ?

 IIRC, X decides what driver to load based on the pci ids they support
 rather than letting drivers claim their devices.  It's remnant of the
 UMS days.

>>> I think we're talking about different things here.
>>>
>>> * The (dare I say it) detection code used to determine the ddx/dri
>>> module name happens before (and afaics is unrelated to) the driver
>>> internals, which depend on the OLAND{,_foo} values.
>>>
>>> Obviously there is a handfull of extra information about "all the
>>> supported vendor/device ids" in the ddx that you're thinking about, that
>>> I'm afraid one cannot get rid of, unless...
>>>
>>> * The DDX uses a nouveau-like approach, accepting every device id. At
>>> PreInit stage, one does a quick check with libdrm_radeon/amdgpu
>>> (amdgpu_device_initialize?). The latter of which already has a
>>> comprehensive enough list of device ids.
>>>
>> Actually scratch this part, libdrm_radeon does have a list of the
>> devices, but does not have a "device_init" type API. On the other hand
>> amdgpu has the API, but doesn't have the device ids.
>>
>> The earlier (original) idea still stands though:
>> By the time OLAND* is queried/used in the ddx/mesa, the ddx/dri module
>> has already been picked up, irrespective of that extra info
>
> I don't think that will work.
Seems like I wasn't clear enough. Let me try one final time.

> The X server attempts to load the ddxes
> based on the pci vendor id unless you manually specify the driver in
> your xorg.conf.  In the case of 0x1002 AMD/ATI, there are 5 drivers
> for device ids for that vendor id: ati_misc, mach64, r128, radeon, and
> amdgpu.
At this point the "CHIPSET(0x6617, OLAND_6617, OLAND)" information is
not used/needed, is it ?

> The ddx
Here we already have the fd or the bus information, the latter of
which can be used to retrieve the former. Now we can fetch OLAND/etc.
using the ioctl.

> then tells mesa what driver to load via dri2 based on
> the pci id in the ddx.
>
For 1.5 years, the mesa "loader" takes presedence over the driver_name
sent over the dri2 protocol. Regardless, at this stage we already have
the fd so

Note that I'm not trying to persuade anyone that it must be done, just
pointing out that it's perfectly doable.

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] st/clover: Fix build against LLVM 3.8 SVN r244928

2015-08-14 Thread Francisco Jerez
Michel Dänzer  writes:

> From: Michel Dänzer 
>
> raw_svector_ostream::flush() is now unnecessary and forbidden:
>
>   CXX  llvm/libclllvm_la-invocation.lo
> ../../../../../src/gallium/state_trackers/clover/llvm/invocation.cpp: In 
> function 'clover::module {anonymous}::build_module_llvm(llvm::Module*, 
> unsigned int (&)[7])':
> ../../../../../src/gallium/state_trackers/clover/llvm/invocation.cpp:574:29: 
> error: use of deleted function 'void llvm::raw_svector_ostream::flush()'
>bitcode_ostream.flush();
>  ^
> In file included from 
> /home/daenzer/src/llvm-git/llvm/include/clang/Basic/VirtualFileSystem.h:22:0,
>  from 
> /home/daenzer/src/llvm-git/llvm/include/clang/Basic/FileManager.h:20,
>  from 
> /home/daenzer/src/llvm-git/llvm/include/clang/Basic/SourceManager.h:38,
>  from 
> /home/daenzer/src/llvm-git/llvm/include/clang/Frontend/CompilerInstance.h:16,
>  from 
> ../../../../../src/gallium/state_trackers/clover/llvm/invocation.cpp:25:
> /home/daenzer/src/llvm-git/llvm/include/llvm/Support/raw_ostream.h:512:8: 
> note: declared here
>void flush() = delete;
> ^
> Makefile:862: recipe for target 'llvm/libclllvm_la-invocation.lo' failed
>
> Signed-off-by: Michel Dänzer 

Thanks,

Reviewed-by: Francisco Jerez 

> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index 86859af..63c3f8e 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -571,7 +571,9 @@ namespace {
>llvm::raw_svector_ostream bitcode_ostream(llvm_bitcode);
>llvm::BitstreamWriter writer(llvm_bitcode);
>llvm::WriteBitcodeToFile(mod, bitcode_ostream);
> +#if HAVE_LLVM < 0x0308
>bitcode_ostream.flush();
> +#endif
>  
>const std::vector kernels = find_kernels(mod);
>for (unsigned i = 0; i < kernels.size(); ++i) {
> -- 
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] r600, compute: setup RATs for write-only images

2015-08-14 Thread Zoltan Gilian
v2: Set CB_TARGET_MASK to zero for compute resources (Marek).
Remove unnecessary use of util_range_add (Marek).
Fix crash on non-contiguous RAT setup.

Non-contiguous RAT setup can occur when the kernel signature contains
no global buffer arguments, but there are write-only image args.
In this case RAT0 is not set, but RAT1 is.
---
 src/gallium/drivers/r600/evergreen_compute.c | 106 +--
 src/gallium/drivers/r600/evergreen_state.c   |  43 ++-
 src/gallium/drivers/r600/r600_pipe.h |   4 +-
 src/gallium/drivers/radeon/r600_texture.c|   1 +
 4 files changed, 113 insertions(+), 41 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index b191e0d..16816cf 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -99,8 +99,40 @@ struct r600_resource* r600_compute_buffer_alloc_vram(
return (struct r600_resource *)buffer;
 }
 
+static void evergreen_set_cbuf(
+   struct r600_context *ctx,
+   unsigned id,
+   struct pipe_surface *surf)
+{
+   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
+
+   /* Add the RAT to the list of color buffers */
+   state->cbufs[id] = surf;
+
+   /* Update the number of color buffers */
+   state->nr_cbufs = MAX2(id + 1, state->nr_cbufs);
+
+   /* cb_target_mask should be 0 for compute resources. */
+   ctx->compute_cb_target_mask &= ~(0xf << (id << 2));
+}
 
-static void evergreen_set_rat(
+static void evergreen_unset_cbuf(
+   struct r600_context *ctx,
+   unsigned id)
+{
+   unsigned i;
+   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
+
+   state->cbufs[id] = NULL;
+   for (i = state->nr_cbufs; i > 0; --i) {
+   if (state->cbufs[i - 1]) {
+   break;
+   }
+   }
+   state->nr_cbufs = i;
+}
+
+static void evergreen_set_rat_buf(
struct r600_context *ctx,
unsigned id,
struct r600_resource* bo,
@@ -123,23 +155,11 @@ static void evergreen_set_rat(
rat_templ.u.tex.first_layer = 0;
rat_templ.u.tex.last_layer = 0;
 
-   /* Add the RAT the list of color buffers */
-   ctx->framebuffer.state.cbufs[id] = ctx->b.b.create_surface(
-   (struct pipe_context *)ctx,
-   (struct pipe_resource *)bo, &rat_templ);
-
-   /* Update the number of color buffers */
-   ctx->framebuffer.state.nr_cbufs =
-   MAX2(id + 1, ctx->framebuffer.state.nr_cbufs);
-
-   /* Update the cb_target_mask
-* XXX: I think this is a potential spot for bugs once we start doing
-* GL interop.  cb_target_mask may be modified in the 3D sections
-* of this driver. */
-   ctx->compute_cb_target_mask |= (0xf << (id * 4));
+   evergreen_set_cbuf(ctx, id, ctx->b.b.create_surface(
+   (struct pipe_context *)ctx, (struct pipe_resource *)bo, 
&rat_templ));
 
surf = (struct r600_surface*)ctx->framebuffer.state.cbufs[id];
-   evergreen_init_color_surface_rat(rctx, surf);
+   evergreen_init_color_surface_rat_buf(ctx, surf);
 }
 
 static void evergreen_cs_set_vertex_buffer(
@@ -436,6 +456,8 @@ static void compute_emit_cs(struct r600_context *ctx, const 
uint *block_layout,
/* XXX support more than 8 colorbuffers (the offsets are not a multiple 
of 0x3C for CB8-11) */
for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
struct r600_surface *cb = (struct 
r600_surface*)ctx->framebuffer.state.cbufs[i];
+   if (!cb)
+   continue;
unsigned reloc = r600_context_bo_reloc(&ctx->b, 
&ctx->b.rings.gfx,
   (struct 
r600_resource*)cb->base.texture,
   RADEON_USAGE_READWRITE,
@@ -623,32 +645,34 @@ static void evergreen_set_compute_resources(struct 
pipe_context * ctx_,
struct pipe_surface ** surfaces)
 {
struct r600_context *ctx = (struct r600_context *)ctx_;
-   struct r600_surface **resources = (struct r600_surface **)surfaces;
+   struct pipe_surface *surf = NULL;
 
COMPUTE_DBG(ctx->screen, "*** evergreen_set_compute_resources: start = 
%u count = %u\n",
start, count);
 
+   if (!surfaces) {
+   for (unsigned i = 0; i < count; ++i)
+   evergreen_unset_cbuf(ctx, 1 + i);
+   return;
+   }
+
for (unsigned i = 0; i < count; i++) {
-   /* The First two vertex buffers are reserved for parameters and
-* global buffers. */
-   unsigned vtx_id = 2 + i;
-   if (resources[i]) {
-   struct r600_resource_global *buffer =
-   (struct r600_resource_global*)
-

[Mesa-dev] [PATCH 2/2] r600, compute: setup RATs for write-only images

2015-08-14 Thread Zoltan Gilian
v2: Set CB_TARGET_MASK to zero for compute resources (Marek).
Remove unnecessary use of util_range_add (Marek).
Fix crash on non-contiguous RAT setup.
v3: Unreference surface instead of calling destroy directly (Marek).

Non-contiguous RAT setup can occur when the kernel signature contains
no global buffer arguments, but there are write-only image args.
In this case RAT0 is not set, but RAT1 is.
---
 src/gallium/drivers/r600/evergreen_compute.c | 106 +--
 src/gallium/drivers/r600/evergreen_state.c   |  43 ++-
 src/gallium/drivers/r600/r600_pipe.h |   4 +-
 src/gallium/drivers/radeon/r600_texture.c|   1 +
 4 files changed, 113 insertions(+), 41 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
b/src/gallium/drivers/r600/evergreen_compute.c
index b191e0d..e98e3d8 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -99,8 +99,40 @@ struct r600_resource* r600_compute_buffer_alloc_vram(
return (struct r600_resource *)buffer;
 }
 
+static void evergreen_set_cbuf(
+   struct r600_context *ctx,
+   unsigned id,
+   struct pipe_surface *surf)
+{
+   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
+
+   /* Add the RAT to the list of color buffers */
+   state->cbufs[id] = surf;
+
+   /* Update the number of color buffers */
+   state->nr_cbufs = MAX2(id + 1, state->nr_cbufs);
+
+   /* cb_target_mask should be 0 for compute resources. */
+   ctx->compute_cb_target_mask &= ~(0xf << (id << 2));
+}
 
-static void evergreen_set_rat(
+static void evergreen_unset_cbuf(
+   struct r600_context *ctx,
+   unsigned id)
+{
+   unsigned i;
+   struct pipe_framebuffer_state *state = &ctx->framebuffer.state;
+
+   state->cbufs[id] = NULL;
+   for (i = state->nr_cbufs; i > 0; --i) {
+   if (state->cbufs[i - 1]) {
+   break;
+   }
+   }
+   state->nr_cbufs = i;
+}
+
+static void evergreen_set_rat_buf(
struct r600_context *ctx,
unsigned id,
struct r600_resource* bo,
@@ -123,23 +155,11 @@ static void evergreen_set_rat(
rat_templ.u.tex.first_layer = 0;
rat_templ.u.tex.last_layer = 0;
 
-   /* Add the RAT the list of color buffers */
-   ctx->framebuffer.state.cbufs[id] = ctx->b.b.create_surface(
-   (struct pipe_context *)ctx,
-   (struct pipe_resource *)bo, &rat_templ);
-
-   /* Update the number of color buffers */
-   ctx->framebuffer.state.nr_cbufs =
-   MAX2(id + 1, ctx->framebuffer.state.nr_cbufs);
-
-   /* Update the cb_target_mask
-* XXX: I think this is a potential spot for bugs once we start doing
-* GL interop.  cb_target_mask may be modified in the 3D sections
-* of this driver. */
-   ctx->compute_cb_target_mask |= (0xf << (id * 4));
+   evergreen_set_cbuf(ctx, id, ctx->b.b.create_surface(
+   (struct pipe_context *)ctx, (struct pipe_resource *)bo, 
&rat_templ));
 
surf = (struct r600_surface*)ctx->framebuffer.state.cbufs[id];
-   evergreen_init_color_surface_rat(rctx, surf);
+   evergreen_init_color_surface_rat_buf(ctx, surf);
 }
 
 static void evergreen_cs_set_vertex_buffer(
@@ -436,6 +456,8 @@ static void compute_emit_cs(struct r600_context *ctx, const 
uint *block_layout,
/* XXX support more than 8 colorbuffers (the offsets are not a multiple 
of 0x3C for CB8-11) */
for (i = 0; i < 8 && i < ctx->framebuffer.state.nr_cbufs; i++) {
struct r600_surface *cb = (struct 
r600_surface*)ctx->framebuffer.state.cbufs[i];
+   if (!cb)
+   continue;
unsigned reloc = r600_context_bo_reloc(&ctx->b, 
&ctx->b.rings.gfx,
   (struct 
r600_resource*)cb->base.texture,
   RADEON_USAGE_READWRITE,
@@ -623,32 +645,34 @@ static void evergreen_set_compute_resources(struct 
pipe_context * ctx_,
struct pipe_surface ** surfaces)
 {
struct r600_context *ctx = (struct r600_context *)ctx_;
-   struct r600_surface **resources = (struct r600_surface **)surfaces;
+   struct pipe_surface *surf = NULL;
 
COMPUTE_DBG(ctx->screen, "*** evergreen_set_compute_resources: start = 
%u count = %u\n",
start, count);
 
+   if (!surfaces) {
+   for (unsigned i = 0; i < count; ++i)
+   evergreen_unset_cbuf(ctx, 1 + i);
+   return;
+   }
+
for (unsigned i = 0; i < count; i++) {
-   /* The First two vertex buffers are reserved for parameters and
-* global buffers. */
-   unsigned vtx_id = 2 + i;
-   if (resources[i]) {
-   struct r600_resource_global *buffer =
-  

Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement glMemoryBarrierByRegion

2015-08-14 Thread Lofstedt, Marta
> -Original Message-
> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
> Behalf Of Tapani Pälli
> Sent: Friday, August 14, 2015 7:31 AM
> To: Marta Lofstedt; mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement
> glMemoryBarrierByRegion
> 
> 
> 
> On 08/14/2015 08:21 AM, Tapani Pälli wrote:
> >
> >
> > On 08/07/2015 12:56 PM, Tapani Pälli wrote:
> >> Super, I've verified that this makes
> >> ES31-CTS.shader_image_load_store.basic-api-barrier-byRegion pass with
> >> one of Curro's branches.
> >>
> >> Reviewed-by: Tapani Pälli 
> >
> > Urgh should've tested this before but now I noticed that 'make check'
> > does not pass with this.
> >
> > This function is part of OpenGL 4.5 so we need to move it to GL4x.xml
> > and expose it in gl_core_functions_possible in dispatch_sanity.cpp
> > (with version 45). However, 'make check' still fails with these
> > changes, I'm trying to figure out why. The failing test is GL30 where
> > function ends up (no matter what GL version specified in entry).
> 
> I found that make check passes if I add 'desktop=false' to the xml entry
> making this function ES only. I guess we should go and do that to get forward.
> 
Hmm, desktop="false", certainly fix the old patch. But, since
glMemoryBarrierByRegion is now also in GL 4.5, that does not make sense. 
I have tested moving to GL4.xml and adding to dispatch_sanity with 45,
but then as you wrote, it is exposed under GL 3.0. 

I am really anxious to get this in, since the GLES 3.1 CTS segfaults without it.

I could send up a V3 with the desktop=false solution, but I don't think that 
is correct, I need to figure out how to have under GL 4.5 and GLES 3.1 only.

Anyways, thanks for your findings Tapani.
> 
> >
> >> On 08/04/2015 11:22 AM, Marta Lofstedt wrote:
> >>> From: Marta Lofstedt 
> >>>
> >>> Signed-off-by: Marta Lofstedt 
> >>> ---
> >>>   src/mapi/glapi/gen/gl_API.xml   |  4 
> >>>   src/mesa/main/shaderimage.c | 40
> >>> +
> >>>   src/mesa/main/shaderimage.h |  3 +++
> >>>   src/mesa/main/tests/dispatch_sanity.cpp |  3 +--
> >>>   4 files changed, 48 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/src/mapi/glapi/gen/gl_API.xml
> >>> b/src/mapi/glapi/gen/gl_API.xml index 658efa4..3db4349 100644
> >>> --- a/src/mapi/glapi/gen/gl_API.xml
> >>> +++ b/src/mapi/glapi/gen/gl_API.xml
> >>> @@ -2966,6 +2966,10 @@
> >>>   
> >>>   
> >>>   
> >>> +
> >>> +
> >>> +
> >>> +
> >>>   
> >>>
> >>>   
> >>> diff --git a/src/mesa/main/shaderimage.c
> >>> b/src/mesa/main/shaderimage.c index a348cdb..7337f22 100644
> >>> --- a/src/mesa/main/shaderimage.c
> >>> +++ b/src/mesa/main/shaderimage.c
> >>> @@ -653,3 +653,43 @@ _mesa_MemoryBarrier(GLbitfield barriers)
> >>>  if (ctx->Driver.MemoryBarrier)
> >>> ctx->Driver.MemoryBarrier(ctx, barriers);
> >>>   }
> >>> +
> >>> +void GLAPIENTRY
> >>> +_mesa_MemoryBarrierByRegion(GLbitfield barriers) {
> >>> +   GET_CURRENT_CONTEXT(ctx);
> >>> +
> >>> +   GLbitfield all_allowed_bits = GL_ATOMIC_COUNTER_BARRIER_BIT |
> >>> + GL_FRAMEBUFFER_BARRIER_BIT |
> >>> + GL_SHADER_IMAGE_ACCESS_BARRIER_BIT |
> >>> + GL_SHADER_STORAGE_BARRIER_BIT |
> >>> + GL_TEXTURE_FETCH_BARRIER_BIT |
> >>> + GL_UNIFORM_BARRIER_BIT;
> >>> +
> >>> +   if (ctx->Driver.MemoryBarrier) {
> >>> +  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
> >>> +   *
> >>> +   *"When barriers is ALL_BARRIER_BITS, shader memory
> >>> accesses will be
> >>> +   * synchronized relative to all these barrier bits, but not
> >>> to other
> >>> +   * barrier bits specific to MemoryBarrier."
> >>> +   *
> >>> +   * That is, if barriers is the special value
> >>> GL_ALL_BARRIER_BITS, then all
> >>> +   * barriers allowed by glMemoryBarrierByRegion should be
> >>> activated."
> >>> +   */
> >>> +  if (barriers == GL_ALL_BARRIER_BITS)
> >>> + return ctx->Driver.MemoryBarrier(ctx, all_allowed_bits);
> >>> +
> >>> +  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
> >>> +   *
> >>> +   *"An INVALID_VALUE error is generated if barriers is not
> >>> the special
> >>> +   * value ALL_BARRIER_BITS, and has any bits set other than
> >>> those
> >>> +   * described above."
> >>> +   */
> >>> +  if ((barriers & ~all_allowed_bits) != 0) {
> >>> + _mesa_error(ctx, GL_INVALID_VALUE,
> >>> + "glMemoryBarrierByRegion(unsupported barrier
> >>> bit");
> >>> +  }
> >>> +
> >>> +  ctx->Driver.MemoryBarrier(ctx, barriers);
> >>> +   }
> >>> +}
> >>> diff --git a/src/mesa/main/shaderimage.h
> >>> b/src/mesa/main/shaderimage.h index 33d8a1e..d08ece8 100644
> >>> --- a/src/mesa/main/shader

[Mesa-dev] [PATCH] glsl: add missing MS sampler builtin types for GLSL ES 3.10

2015-08-14 Thread Tapani Pälli
Signed-off-by: Tapani Pälli 
---
 src/glsl/builtin_types.cpp | 6 +++---
 src/glsl/glsl_lexer.ll | 7 ---
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/glsl/builtin_types.cpp b/src/glsl/builtin_types.cpp
index ffbc5e6..b0156a1 100644
--- a/src/glsl/builtin_types.cpp
+++ b/src/glsl/builtin_types.cpp
@@ -182,7 +182,7 @@ const static struct builtin_type_versions {
T(samplerCubeArray,400, 999)
T(sampler2DRect,   140, 999)
T(samplerBuffer,   140, 999)
-   T(sampler2DMS, 150, 999)
+   T(sampler2DMS, 150, 310)
T(sampler2DMSArray,150, 999)
 
T(isampler1D,  130, 999)
@@ -194,7 +194,7 @@ const static struct builtin_type_versions {
T(isamplerCubeArray,   400, 999)
T(isampler2DRect,  140, 999)
T(isamplerBuffer,  140, 999)
-   T(isampler2DMS,150, 999)
+   T(isampler2DMS,150, 310)
T(isampler2DMSArray,   150, 999)
 
T(usampler1D,  130, 999)
@@ -206,7 +206,7 @@ const static struct builtin_type_versions {
T(usamplerCubeArray,   400, 999)
T(usampler2DRect,  140, 999)
T(usamplerBuffer,  140, 999)
-   T(usampler2DMS,150, 999)
+   T(usampler2DMS,150, 310)
T(usampler2DMSArray,   150, 999)
 
T(sampler1DShadow, 110, 999)
diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index efa0bb6..88565e8 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -343,9 +343,10 @@ usampler2DArrayKEYWORD(130, 300, 130, 300, 
USAMPLER2DARRAY);
 
/* additional keywords in ARB_texture_multisample, included in GLSL 1.50 */
/* these are reserved but not defined in GLSL 3.00 */
-sampler2DMSKEYWORD_WITH_ALT(150, 300, 150, 0, 
yyextra->ARB_texture_multisample_enable, SAMPLER2DMS);
-isampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 0, 
yyextra->ARB_texture_multisample_enable, ISAMPLER2DMS);
-usampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 0, 
yyextra->ARB_texture_multisample_enable, USAMPLER2DMS);
+   /* [iu]sampler2DMS are defined in GLSL ES 3.10 */
+sampler2DMSKEYWORD_WITH_ALT(150, 300, 150, 310, 
yyextra->ARB_texture_multisample_enable, SAMPLER2DMS);
+isampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 310, 
yyextra->ARB_texture_multisample_enable, ISAMPLER2DMS);
+usampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 310, 
yyextra->ARB_texture_multisample_enable, USAMPLER2DMS);
 sampler2DMSArray   KEYWORD_WITH_ALT(150, 300, 150, 0, 
yyextra->ARB_texture_multisample_enable, SAMPLER2DMSARRAY);
 isampler2DMSArray  KEYWORD_WITH_ALT(150, 300, 150, 0, 
yyextra->ARB_texture_multisample_enable, ISAMPLER2DMSARRAY);
 usampler2DMSArray  KEYWORD_WITH_ALT(150, 300, 150, 0, 
yyextra->ARB_texture_multisample_enable, USAMPLER2DMSARRAY);
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement glMemoryBarrierByRegion

2015-08-14 Thread Dave Airlie
On 14 August 2015 at 20:27, Lofstedt, Marta  wrote:
>> -Original Message-
>> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
>> Behalf Of Tapani Pälli
>> Sent: Friday, August 14, 2015 7:31 AM
>> To: Marta Lofstedt; mesa-dev@lists.freedesktop.org
>> Subject: Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement
>> glMemoryBarrierByRegion
>>
>>
>>
>> On 08/14/2015 08:21 AM, Tapani Pälli wrote:
>> >
>> >
>> > On 08/07/2015 12:56 PM, Tapani Pälli wrote:
>> >> Super, I've verified that this makes
>> >> ES31-CTS.shader_image_load_store.basic-api-barrier-byRegion pass with
>> >> one of Curro's branches.
>> >>
>> >> Reviewed-by: Tapani Pälli 
>> >
>> > Urgh should've tested this before but now I noticed that 'make check'
>> > does not pass with this.
>> >
>> > This function is part of OpenGL 4.5 so we need to move it to GL4x.xml
>> > and expose it in gl_core_functions_possible in dispatch_sanity.cpp
>> > (with version 45). However, 'make check' still fails with these
>> > changes, I'm trying to figure out why. The failing test is GL30 where
>> > function ends up (no matter what GL version specified in entry).
>>
>> I found that make check passes if I add 'desktop=false' to the xml entry
>> making this function ES only. I guess we should go and do that to get 
>> forward.
>>
> Hmm, desktop="false", certainly fix the old patch. But, since
> glMemoryBarrierByRegion is now also in GL 4.5, that does not make sense.
> I have tested moving to GL4.xml and adding to dispatch_sanity with 45,
> but then as you wrote, it is exposed under GL 3.0.
>
> I am really anxious to get this in, since the GLES 3.1 CTS segfaults without 
> it.
>
> I could send up a V3 with the desktop=false solution, but I don't think that
> is correct, I need to figure out how to have under GL 4.5 and GLES 3.1 only.

you might need to add it to apiexec.py if you are making it core only,

also apiexec.py doesn't require the gl prefix.

the problems you are seeing with make check sound the same as the ones
I had with subroutines.

Dave.

>
> Anyways, thanks for your findings Tapani.
>>
>> >
>> >> On 08/04/2015 11:22 AM, Marta Lofstedt wrote:
>> >>> From: Marta Lofstedt 
>> >>>
>> >>> Signed-off-by: Marta Lofstedt 
>> >>> ---
>> >>>   src/mapi/glapi/gen/gl_API.xml   |  4 
>> >>>   src/mesa/main/shaderimage.c | 40
>> >>> +
>> >>>   src/mesa/main/shaderimage.h |  3 +++
>> >>>   src/mesa/main/tests/dispatch_sanity.cpp |  3 +--
>> >>>   4 files changed, 48 insertions(+), 2 deletions(-)
>> >>>
>> >>> diff --git a/src/mapi/glapi/gen/gl_API.xml
>> >>> b/src/mapi/glapi/gen/gl_API.xml index 658efa4..3db4349 100644
>> >>> --- a/src/mapi/glapi/gen/gl_API.xml
>> >>> +++ b/src/mapi/glapi/gen/gl_API.xml
>> >>> @@ -2966,6 +2966,10 @@
>> >>>   
>> >>>   
>> >>>   
>> >>> +
>> >>> +
>> >>> +
>> >>> +
>> >>>   
>> >>>
>> >>>   
>> >>> diff --git a/src/mesa/main/shaderimage.c
>> >>> b/src/mesa/main/shaderimage.c index a348cdb..7337f22 100644
>> >>> --- a/src/mesa/main/shaderimage.c
>> >>> +++ b/src/mesa/main/shaderimage.c
>> >>> @@ -653,3 +653,43 @@ _mesa_MemoryBarrier(GLbitfield barriers)
>> >>>  if (ctx->Driver.MemoryBarrier)
>> >>> ctx->Driver.MemoryBarrier(ctx, barriers);
>> >>>   }
>> >>> +
>> >>> +void GLAPIENTRY
>> >>> +_mesa_MemoryBarrierByRegion(GLbitfield barriers) {
>> >>> +   GET_CURRENT_CONTEXT(ctx);
>> >>> +
>> >>> +   GLbitfield all_allowed_bits = GL_ATOMIC_COUNTER_BARRIER_BIT |
>> >>> + GL_FRAMEBUFFER_BARRIER_BIT |
>> >>> + GL_SHADER_IMAGE_ACCESS_BARRIER_BIT |
>> >>> + GL_SHADER_STORAGE_BARRIER_BIT |
>> >>> + GL_TEXTURE_FETCH_BARRIER_BIT |
>> >>> + GL_UNIFORM_BARRIER_BIT;
>> >>> +
>> >>> +   if (ctx->Driver.MemoryBarrier) {
>> >>> +  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
>> >>> +   *
>> >>> +   *"When barriers is ALL_BARRIER_BITS, shader memory
>> >>> accesses will be
>> >>> +   * synchronized relative to all these barrier bits, but not
>> >>> to other
>> >>> +   * barrier bits specific to MemoryBarrier."
>> >>> +   *
>> >>> +   * That is, if barriers is the special value
>> >>> GL_ALL_BARRIER_BITS, then all
>> >>> +   * barriers allowed by glMemoryBarrierByRegion should be
>> >>> activated."
>> >>> +   */
>> >>> +  if (barriers == GL_ALL_BARRIER_BITS)
>> >>> + return ctx->Driver.MemoryBarrier(ctx, all_allowed_bits);
>> >>> +
>> >>> +  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
>> >>> +   *
>> >>> +   *"An INVALID_VALUE error is generated if barriers is not
>> >>> the special
>> >>> +   * value ALL_BARRIER_BITS, and has any bits set other than
>> >>> those
>> >>> +   * described above."
>> >>> +   */
>> >>> +  if ((barriers & ~all_all

[Mesa-dev] [PATCH v3] mesa: Implement glMemoryBarrierByRegion

2015-08-14 Thread Marta Lofstedt
From: Marta Lofstedt 

The function glMemoryBarrierByRegion is part of
OpenGL ES 3.1 and OpenGL 4.5 core and compatibility
profiles.

Signed-off-by: Marta Lofstedt 
---
 src/mapi/glapi/gen/GL4x.xml |  6 +
 src/mesa/main/shaderimage.c | 40 +
 src/mesa/main/shaderimage.h |  3 +++
 src/mesa/main/tests/dispatch_sanity.cpp |  9 ++--
 4 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index 94ddfb7..dee5027 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -44,4 +44,10 @@
   
 
 
+
+  
+
+  
+
+
 
diff --git a/src/mesa/main/shaderimage.c b/src/mesa/main/shaderimage.c
index a348cdb..7337f22 100644
--- a/src/mesa/main/shaderimage.c
+++ b/src/mesa/main/shaderimage.c
@@ -653,3 +653,43 @@ _mesa_MemoryBarrier(GLbitfield barriers)
if (ctx->Driver.MemoryBarrier)
   ctx->Driver.MemoryBarrier(ctx, barriers);
 }
+
+void GLAPIENTRY
+_mesa_MemoryBarrierByRegion(GLbitfield barriers)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   GLbitfield all_allowed_bits = GL_ATOMIC_COUNTER_BARRIER_BIT |
+ GL_FRAMEBUFFER_BARRIER_BIT |
+ GL_SHADER_IMAGE_ACCESS_BARRIER_BIT |
+ GL_SHADER_STORAGE_BARRIER_BIT |
+ GL_TEXTURE_FETCH_BARRIER_BIT |
+ GL_UNIFORM_BARRIER_BIT;
+
+   if (ctx->Driver.MemoryBarrier) {
+  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
+   *
+   *"When barriers is ALL_BARRIER_BITS, shader memory accesses will be
+   * synchronized relative to all these barrier bits, but not to other
+   * barrier bits specific to MemoryBarrier."
+   *
+   * That is, if barriers is the special value GL_ALL_BARRIER_BITS, then 
all
+   * barriers allowed by glMemoryBarrierByRegion should be activated."
+   */
+  if (barriers == GL_ALL_BARRIER_BITS)
+ return ctx->Driver.MemoryBarrier(ctx, all_allowed_bits);
+
+  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
+   *
+   *"An INVALID_VALUE error is generated if barriers is not the special
+   * value ALL_BARRIER_BITS, and has any bits set other than those
+   * described above."
+   */
+  if ((barriers & ~all_allowed_bits) != 0) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glMemoryBarrierByRegion(unsupported barrier bit");
+  }
+
+  ctx->Driver.MemoryBarrier(ctx, barriers);
+   }
+}
diff --git a/src/mesa/main/shaderimage.h b/src/mesa/main/shaderimage.h
index 33d8a1e..d08ece8 100644
--- a/src/mesa/main/shaderimage.h
+++ b/src/mesa/main/shaderimage.h
@@ -68,6 +68,9 @@ _mesa_BindImageTextures(GLuint first, GLsizei count, const 
GLuint *textures);
 void GLAPIENTRY
 _mesa_MemoryBarrier(GLbitfield barriers);
 
+void GLAPIENTRY
+_mesa_MemoryBarrierByRegion(GLbitfield barriers);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index af89d2c..59107eb 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -851,6 +851,9 @@ const struct function common_desktop_functions_possible[] = 
{
 // { "glTextureStorage2DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 // { "glTextureStorage3DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 
+/* GL 4.5 */
+   { "glMemoryBarrierByRegion", 45, -1 },
+
/* GL_ARB_internalformat_query */
{ "glGetInternalformativ", 30, -1 },
 
@@ -1739,6 +1742,9 @@ const struct function gl_core_functions_possible[] = {
 // { "glTextureStorage2DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 // { "glTextureStorage3DMultisampleEXT", 43, -1 },  // XXX: Add to xml
 
+/* GL 4.5 */
+   { "glMemoryBarrierByRegion", 45, -1 },
+
/* GL_ARB_direct_state_access */
{ "glCreateTransformFeedbacks", 45, -1 },
{ "glTransformFeedbackBufferBase", 45, -1 },
@@ -2461,8 +2467,7 @@ const struct function gles31_functions_possible[] = {
{ "glGetBooleani_v", 31, -1 },
{ "glMemoryBarrier", 31, -1 },
 
-   // FINISHME: This function has not been implemented yet.
-   // { "glMemoryBarrierByRegion", 31, -1 },
+   { "glMemoryBarrierByRegion", 31, -1 },
 
{ "glTexStorage2DMultisample", 31, -1 },
{ "glGetMultisamplefv", 31, -1 },
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement glMemoryBarrierByRegion

2015-08-14 Thread Lofstedt, Marta
> -Original Message-
> From: Dave Airlie [mailto:airl...@gmail.com]
> Sent: Friday, August 14, 2015 12:43 PM
> To: Lofstedt, Marta
> Cc: Palli, Tapani; Marta Lofstedt; mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement
> glMemoryBarrierByRegion
> 
> On 14 August 2015 at 20:27, Lofstedt, Marta 
> wrote:
> >> -Original Message-
> >> From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On
> >> Behalf Of Tapani Pälli
> >> Sent: Friday, August 14, 2015 7:31 AM
> >> To: Marta Lofstedt; mesa-dev@lists.freedesktop.org
> >> Subject: Re: [Mesa-dev] [PATCH v2] gles/es3.1: Implement
> >> glMemoryBarrierByRegion
> >>
> >>
> >>
> >> On 08/14/2015 08:21 AM, Tapani Pälli wrote:
> >> >
> >> >
> >> > On 08/07/2015 12:56 PM, Tapani Pälli wrote:
> >> >> Super, I've verified that this makes
> >> >> ES31-CTS.shader_image_load_store.basic-api-barrier-byRegion pass
> >> >> with one of Curro's branches.
> >> >>
> >> >> Reviewed-by: Tapani Pälli 
> >> >
> >> > Urgh should've tested this before but now I noticed that 'make check'
> >> > does not pass with this.
> >> >
> >> > This function is part of OpenGL 4.5 so we need to move it to
> >> > GL4x.xml and expose it in gl_core_functions_possible in
> >> > dispatch_sanity.cpp (with version 45). However, 'make check' still
> >> > fails with these changes, I'm trying to figure out why. The failing
> >> > test is GL30 where function ends up (no matter what GL version
> specified in entry).
> >>
> >> I found that make check passes if I add 'desktop=false' to the xml
> >> entry making this function ES only. I guess we should go and do that to get
> forward.
> >>
> > Hmm, desktop="false", certainly fix the old patch. But, since
> > glMemoryBarrierByRegion is now also in GL 4.5, that does not make sense.
> > I have tested moving to GL4.xml and adding to dispatch_sanity with 45,
> > but then as you wrote, it is exposed under GL 3.0.
> >
> > I am really anxious to get this in, since the GLES 3.1 CTS segfaults 
> > without it.
> >
> > I could send up a V3 with the desktop=false solution, but I don't
> > think that is correct, I need to figure out how to have under GL 4.5 and
> GLES 3.1 only.
> 
> you might need to add it to apiexec.py if you are making it core only,
> 
> also apiexec.py doesn't require the gl prefix.
> 
> the problems you are seeing with make check sound the same as the ones I
> had with subroutines.
> 
> Dave.

Yeah, that's it, thanks Dave! 
I had only added to Core profile in dispatch_sanity.cpp.

V3 is coming up, now for both GLES 3.1 and GL 4.5. 

> 
> >
> > Anyways, thanks for your findings Tapani.
> >>
> >> >
> >> >> On 08/04/2015 11:22 AM, Marta Lofstedt wrote:
> >> >>> From: Marta Lofstedt 
> >> >>>
> >> >>> Signed-off-by: Marta Lofstedt 
> >> >>> ---
> >> >>>   src/mapi/glapi/gen/gl_API.xml   |  4 
> >> >>>   src/mesa/main/shaderimage.c | 40
> >> >>> +
> >> >>>   src/mesa/main/shaderimage.h |  3 +++
> >> >>>   src/mesa/main/tests/dispatch_sanity.cpp |  3 +--
> >> >>>   4 files changed, 48 insertions(+), 2 deletions(-)
> >> >>>
> >> >>> diff --git a/src/mapi/glapi/gen/gl_API.xml
> >> >>> b/src/mapi/glapi/gen/gl_API.xml index 658efa4..3db4349 100644
> >> >>> --- a/src/mapi/glapi/gen/gl_API.xml
> >> >>> +++ b/src/mapi/glapi/gen/gl_API.xml
> >> >>> @@ -2966,6 +2966,10 @@
> >> >>>   
> >> >>>   
> >> >>>   
> >> >>> +
> >> >>> +
> >> >>> +
> >> >>> +
> >> >>>   
> >> >>>
> >> >>>   
> >> >>> diff --git a/src/mesa/main/shaderimage.c
> >> >>> b/src/mesa/main/shaderimage.c index a348cdb..7337f22 100644
> >> >>> --- a/src/mesa/main/shaderimage.c
> >> >>> +++ b/src/mesa/main/shaderimage.c
> >> >>> @@ -653,3 +653,43 @@ _mesa_MemoryBarrier(GLbitfield barriers)
> >> >>>  if (ctx->Driver.MemoryBarrier)
> >> >>> ctx->Driver.MemoryBarrier(ctx, barriers);
> >> >>>   }
> >> >>> +
> >> >>> +void GLAPIENTRY
> >> >>> +_mesa_MemoryBarrierByRegion(GLbitfield barriers) {
> >> >>> +   GET_CURRENT_CONTEXT(ctx);
> >> >>> +
> >> >>> +   GLbitfield all_allowed_bits = GL_ATOMIC_COUNTER_BARRIER_BIT
> |
> >> >>> + GL_FRAMEBUFFER_BARRIER_BIT |
> >> >>> + GL_SHADER_IMAGE_ACCESS_BARRIER_BIT |
> >> >>> + GL_SHADER_STORAGE_BARRIER_BIT |
> >> >>> + GL_TEXTURE_FETCH_BARRIER_BIT |
> >> >>> + GL_UNIFORM_BARRIER_BIT;
> >> >>> +
> >> >>> +   if (ctx->Driver.MemoryBarrier) {
> >> >>> +  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
> >> >>> +   *
> >> >>> +   *"When barriers is ALL_BARRIER_BITS, shader memory
> >> >>> accesses will be
> >> >>> +   * synchronized relative to all these barrier bits, but not
> >> >>> to other
> >> >>> +   * barrier bits specific to MemoryBarrier."
> >> >>> +   *
> >> >>> +   * That is, if barrier

[Mesa-dev] [PATCH] mesa: Use the effective internal format instead for validation

2015-08-14 Thread Eduardo Lima Mitev
When validating format+type+internalFormat for texture pixel operations
on GLES3, the effective internal format should be used if the one
specified is an unsized internal format. Page 127, section "3.8 Texturing"
of the GLES 3.0.4 spec says:

"if internalformat is a base internal format, the effective internal
 format is a sized internal format that is derived from the format and
 type for internal use by the GL. Table 3.12 specifies the mapping of
 format and type to effective internal formats. The effective internal
 format is used by the GL for purposes such as texture completeness or
 type checks for CopyTex* commands. In these cases, the GL is required
 to operate as if the effective internal format was used as the
 internalformat when specifying the texture data."

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91582
---
 src/mesa/main/glformats.c | 48 +++
 src/mesa/main/glformats.h |  4 
 src/mesa/main/teximage.c  | 27 ++
 3 files changed, 79 insertions(+)

diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index 3eb66da..fd83e9c 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -2824,3 +2824,51 @@ _mesa_format_from_format_and_type(GLenum format, GLenum 
type)
 */
unreachable("Unsupported format");
 }
+
+/**
+ * Returns the effective internal format from a texture format and type.
+ * This is used by texture image operations internally for validation, when
+ * the specified internal format is a base (unsized) format.
+ *
+ * \param format the texture format
+ * \param type the texture type
+ */
+GLenum
+_mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
+GLenum type)
+{
+  switch (format) {
+  case GL_RGBA:
+ switch (type) {
+ case GL_UNSIGNED_BYTE:
+return GL_RGBA8;
+ case GL_UNSIGNED_SHORT_4_4_4_4:
+return GL_RGBA4;
+ case GL_UNSIGNED_SHORT_5_5_5_1:
+return GL_RGB5_A1;
+ }
+ break;
+  case GL_RGB:
+ switch (type) {
+ case GL_UNSIGNED_BYTE:
+return GL_RGB8;
+ case GL_UNSIGNED_SHORT_5_6_5:
+return GL_RGB565;
+ }
+ break;
+  case GL_LUMINANCE_ALPHA:
+ if (type == GL_UNSIGNED_BYTE)
+return GL_LUMINANCE8_ALPHA8;
+  case GL_LUMINANCE:
+ if (type == GL_UNSIGNED_BYTE)
+return GL_LUMINANCE8;
+  case GL_ALPHA:
+ if (type == GL_UNSIGNED_BYTE)
+return GL_ALPHA8;
+  default:
+ /* fall through and return NONE */
+ break;
+  }
+
+  return GL_NONE;
+}
diff --git a/src/mesa/main/glformats.h b/src/mesa/main/glformats.h
index 419955a..0686c71 100644
--- a/src/mesa/main/glformats.h
+++ b/src/mesa/main/glformats.h
@@ -135,6 +135,10 @@ _mesa_es3_error_check_format_and_type(const struct 
gl_context *ctx,
 extern uint32_t
 _mesa_format_from_format_and_type(GLenum format, GLenum type);
 
+extern GLenum
+_mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
+GLenum type);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 3a556a6..c3dea8c 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -2111,6 +2111,33 @@ texture_format_error_check_gles(struct gl_context *ctx, 
GLenum format,
GLenum err;
 
if (_mesa_is_gles3(ctx)) {
+  /* Page 127, section "3.8 Texturing" of the GLES 3.0.4 spec says:
+   *
+   *"if internalformat is a base internal format, the effective
+   * internal format is a sized internal format that is derived
+   * from the format and type for internal use by the GL.
+   * Table 3.12 specifies the mapping of format and type to effective
+   * internal formats. The effective internal format is used by the GL
+   * for purposes such as texture completeness or type checks for
+   * CopyTex* commands. In these cases, the GL is required to operate
+   * as if the effective internal format was used as the internalformat
+   * when specifying the texture data."
+   */
+  if (_mesa_is_enum_format_unsized(internalFormat)) {
+ internalFormat =
+_mesa_es3_effective_internal_format_for_format_and_type(format,
+type);
+ if (internalFormat == GL_NONE) {
+_mesa_error(ctx, GL_INVALID_OPERATION,
+"%s(format = %s, type = %s, effective "
+"internalformat = %s)",
+callerName, _mesa_enum_to_string(format),
+_mesa_enum_to_string(type),
+_mesa_enum_to_string(internalFormat));
+return true;
+ }
+  }
+
   err = _mesa_es3_error_check_format_and_type(ctx, format, type,
   

[Mesa-dev] Fix saturation when coalescing registers in NIR-vec4

2015-08-14 Thread Antia Puentes
With this fix, we can load the constants in NIR-vec4 as integers,
as it is done in the FS backend.

This is related to:
http://lists.freedesktop.org/archives/mesa-dev/2015-July/089899.html

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965/vec4: Fix saturation errors when coalescing registers

2015-08-14 Thread Antia Puentes
If the register types do not match and the instruction
that contains the final destination is saturated, register
coalescing generated non-equivalent code.

This did not happen when using IR because types usually
matched, but it is visible in nir-vec4.

For example,
   mov  vgrf7:D vgrf2:D
   mov.sat  m4:F vgrf7:F

is coalesced to:
   mov.sat  m4:D vgrf2:D

The patch prevents coalescing in such scenario, unless the
instruction we want to coalesce into is a MOV. In that case,
the patch sets the register types to the type of the final
destination.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index f18915a..52982c3 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1053,6 +1053,15 @@ vec4_visitor::opt_register_coalesce()
}
 }
 
+/* This doesn't handle saturation on the instruction we
+ * want to coalesce away if the register types do not match.
+ * But if scan_inst is a 'mov' we can fix the types later.
+ */
+if (inst->saturate &&
+inst->dst.type != scan_inst->dst.type &&
+scan_inst->opcode != BRW_OPCODE_MOV)
+   break;
+
 /* If we can't handle the swizzle, bail. */
 if (!scan_inst->can_reswizzle(inst->dst.writemask,
   inst->src[0].swizzle,
@@ -1128,6 +1137,15 @@ vec4_visitor::opt_register_coalesce()
   scan_inst->dst.file = inst->dst.file;
   scan_inst->dst.reg = inst->dst.reg;
   scan_inst->dst.reg_offset = inst->dst.reg_offset;
+   if (inst->saturate &&
+   inst->dst.type != scan_inst->dst.type) {
+  /* If we have reached this point, scan_inst is a 'mov' and
+   * we can modify its register types to match the ones in 
inst.
+   * Otherwise, we could have an incorrect saturation result.
+   */
+  scan_inst->dst.type = inst->dst.type;
+  scan_inst->src[0].type = inst->src[0].type;
+   }
   scan_inst->saturate |= inst->saturate;
}
scan_inst = (vec4_instruction *)scan_inst->next;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965/vec4_nir: Load constants as integers

2015-08-14 Thread Antia Puentes
Loads constants using integer as their register type, this is done
for consistency with the FS backend.
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 923e2d3..e4ae899 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -456,7 +456,7 @@ void
 vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
 {
dst_reg reg = dst_reg(GRF, alloc.allocate(1));
-   reg.type =  BRW_REGISTER_TYPE_F;
+   reg.type =  BRW_REGISTER_TYPE_D;
 
/* @FIXME: consider emitting vector operations to save some MOVs in
 * cases where the components are representable in 8 bits.
@@ -464,7 +464,7 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr 
*instr)
 */
for (unsigned i = 0; i < instr->def.num_components; ++i) {
   reg.writemask = 1 << i;
-  emit(MOV(reg, src_reg(instr->value.f[i])));
+  emit(MOV(reg, src_reg(instr->value.i[i])));
}
 
/* Set final writemask */
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] mesa: Implement glMemoryBarrierByRegion

2015-08-14 Thread Tapani Pälli

Reviewed-by: Tapani Pälli 

On 08/14/2015 02:30 PM, Marta Lofstedt wrote:

From: Marta Lofstedt 

The function glMemoryBarrierByRegion is part of
OpenGL ES 3.1 and OpenGL 4.5 core and compatibility
profiles.

Signed-off-by: Marta Lofstedt 
---
  src/mapi/glapi/gen/GL4x.xml |  6 +
  src/mesa/main/shaderimage.c | 40 +
  src/mesa/main/shaderimage.h |  3 +++
  src/mesa/main/tests/dispatch_sanity.cpp |  9 ++--
  4 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index 94ddfb7..dee5027 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -44,4 +44,10 @@

  

+
+  
+
+  
+
+
  
diff --git a/src/mesa/main/shaderimage.c b/src/mesa/main/shaderimage.c
index a348cdb..7337f22 100644
--- a/src/mesa/main/shaderimage.c
+++ b/src/mesa/main/shaderimage.c
@@ -653,3 +653,43 @@ _mesa_MemoryBarrier(GLbitfield barriers)
 if (ctx->Driver.MemoryBarrier)
ctx->Driver.MemoryBarrier(ctx, barriers);
  }
+
+void GLAPIENTRY
+_mesa_MemoryBarrierByRegion(GLbitfield barriers)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   GLbitfield all_allowed_bits = GL_ATOMIC_COUNTER_BARRIER_BIT |
+ GL_FRAMEBUFFER_BARRIER_BIT |
+ GL_SHADER_IMAGE_ACCESS_BARRIER_BIT |
+ GL_SHADER_STORAGE_BARRIER_BIT |
+ GL_TEXTURE_FETCH_BARRIER_BIT |
+ GL_UNIFORM_BARRIER_BIT;
+
+   if (ctx->Driver.MemoryBarrier) {
+  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
+   *
+   *"When barriers is ALL_BARRIER_BITS, shader memory accesses will be
+   * synchronized relative to all these barrier bits, but not to other
+   * barrier bits specific to MemoryBarrier."
+   *
+   * That is, if barriers is the special value GL_ALL_BARRIER_BITS, then 
all
+   * barriers allowed by glMemoryBarrierByRegion should be activated."
+   */
+  if (barriers == GL_ALL_BARRIER_BITS)
+ return ctx->Driver.MemoryBarrier(ctx, all_allowed_bits);
+
+  /* From section 7.11.2 of the OpenGL ES 3.1 specification:
+   *
+   *"An INVALID_VALUE error is generated if barriers is not the special
+   * value ALL_BARRIER_BITS, and has any bits set other than those
+   * described above."
+   */
+  if ((barriers & ~all_allowed_bits) != 0) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glMemoryBarrierByRegion(unsupported barrier bit");
+  }
+
+  ctx->Driver.MemoryBarrier(ctx, barriers);
+   }
+}
diff --git a/src/mesa/main/shaderimage.h b/src/mesa/main/shaderimage.h
index 33d8a1e..d08ece8 100644
--- a/src/mesa/main/shaderimage.h
+++ b/src/mesa/main/shaderimage.h
@@ -68,6 +68,9 @@ _mesa_BindImageTextures(GLuint first, GLsizei count, const 
GLuint *textures);
  void GLAPIENTRY
  _mesa_MemoryBarrier(GLbitfield barriers);

+void GLAPIENTRY
+_mesa_MemoryBarrierByRegion(GLbitfield barriers);
+
  #ifdef __cplusplus
  }
  #endif
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index af89d2c..59107eb 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -851,6 +851,9 @@ const struct function common_desktop_functions_possible[] = 
{
  // { "glTextureStorage2DMultisampleEXT", 43, -1 },  // XXX: Add to xml
  // { "glTextureStorage3DMultisampleEXT", 43, -1 },  // XXX: Add to xml

+/* GL 4.5 */
+   { "glMemoryBarrierByRegion", 45, -1 },
+
 /* GL_ARB_internalformat_query */
 { "glGetInternalformativ", 30, -1 },

@@ -1739,6 +1742,9 @@ const struct function gl_core_functions_possible[] = {
  // { "glTextureStorage2DMultisampleEXT", 43, -1 },  // XXX: Add to xml
  // { "glTextureStorage3DMultisampleEXT", 43, -1 },  // XXX: Add to xml

+/* GL 4.5 */
+   { "glMemoryBarrierByRegion", 45, -1 },
+
 /* GL_ARB_direct_state_access */
 { "glCreateTransformFeedbacks", 45, -1 },
 { "glTransformFeedbackBufferBase", 45, -1 },
@@ -2461,8 +2467,7 @@ const struct function gles31_functions_possible[] = {
 { "glGetBooleani_v", 31, -1 },
 { "glMemoryBarrier", 31, -1 },

-   // FINISHME: This function has not been implemented yet.
-   // { "glMemoryBarrierByRegion", 31, -1 },
+   { "glMemoryBarrierByRegion", 31, -1 },

 { "glTexStorage2DMultisample", 31, -1 },
 { "glGetMultisamplefv", 31, -1 },


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: check if return_deref in lower_subroutine_visitor::visit_leave isn't NULL

2015-08-14 Thread Kai Wasserbäch
Fixes a crash in Piglit's
spec@arb_shader_subroutine@lin...@no-mutual-recursion.vert for me.

Signed-off-by: Kai Wasserbäch 
---

Hey everyone,
I ran the Piglit quick test suite afterwards and haven't observed any
regressions over my previous quick run, but the crash went away. The test
itself passes now! Instead of

| Program received signal SIGSEGV, Segmentation fault.
| (anonymous namespace)::lower_subroutine_visitor::visit_leave 
(this=0x7fffdab0, ir=0xb1af58) at 
../../../../src/glsl/lower_subroutine.cpp:102

I'm seeing the (expected)

| Failed to link: error: function `void impl_b(int)' has static recursion.
| error: function `void impl_a(int)' has static recursion.
|
| Failed to link vertex shader 
/tests/spec/arb_shader_subroutine/linker/no-mutual-recursion.vert:
 
| PIGLIT: {"result": "pass" }

The builds used in both runs have been done in a clean pbuilder chroot (the
same for both builds). Ie. there weren't any other differences between the two
Piglit runs besides the addition of the patch in the latest Mesa build.

If you accept this patch, please commit it for me, as I don't have commit
access.

Thanks,
Kai


 src/glsl/lower_subroutine.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/lower_subroutine.cpp b/src/glsl/lower_subroutine.cpp
index b29912a..c1aed61 100644
--- a/src/glsl/lower_subroutine.cpp
+++ b/src/glsl/lower_subroutine.cpp
@@ -98,7 +98,7 @@ lower_subroutine_visitor::visit_leave(ir_call *ir)
   else
  last_branch = if_tree(equal(subr_to_int(var), lc), new_call, 
last_branch);
 
-  if (s > 0)
+  if (return_deref && s > 0)
 return_deref = return_deref->clone(mem_ctx, NULL);
}
if (last_branch)
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/3] mesa: validate size parameters for glTexStorage*Multisample

2015-08-14 Thread Timothy Arceri
On Fri, 2015-08-14 at 08:55 +0300, Tapani Pälli wrote:
> 
> On 08/13/2015 11:54 AM, Timothy Arceri wrote:
> > I've sent a couple of follow-up patches I notice when reviewing 
> > this.
> > 
> > 
> > On Thu, 2015-08-13 at 09:30 +0300, Tapani Pälli wrote:
> > > v2: code cleanup
> > > 
> > > Signed-off-by: Tapani Pälli 
> > > ---
> > >   src/mesa/main/teximage.c | 24 
> > >   1 file changed, 24 insertions(+)
> > > 
> > > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> > > index d35dc12..add7438 100644
> > > --- a/src/mesa/main/teximage.c
> > > +++ b/src/mesa/main/teximage.c
> > > @@ -5782,6 +5782,18 @@ _mesa_TexImage3DMultisample(GLenum target, 
> > > GLsizei
> > > samples,
> > >  "glTexImage3DMultisample");
> > >   }
> > > 
> > > +static bool
> > > +valid_texstorage_ms_parameters(GLsizei width, GLsizei height, 
> > > GLsizei
> > > depth,
> > > +   GLsizei samples, unsigned dims)
> > > +{
> > > +   GET_CURRENT_CONTEXT(ctx);
> > > +
> > > +   if (_mesa_tex_storage_invalid_dim(width, height, depth) || 
> > > samples < 1)
> > 
> > 
> > Rather than do the samples < 1 check here you should move it to
> >   _mesa_texture_image_multisample as the spec says "An 
> > INVALID_VALUE error is
> > generated ifsamplesis zero." for glTexImage*Multisample too.
> > 
> > With that change you could change all the calls below to something 
> > like
> > 
> > if (_mesa_tex_storage_invalid_dim(width, height, depth)) {
> >_mesa_error(ctx, GL_INVALID_VALUE, 
> > "glTexStorage2DMultisample()"
> >return;
> > }
> 
> Would it be OK if we do possible unification for checking samples 
> parameter after these patches? My goal here is to do check for 
> glTexStorage*Multisample only. I don't feel comfortable changing 
> checks 
> for other functions at the same time, that should deserve it's own 
> commit so it's easier to bisect later.
> 

There is no reason to split the change into two patches. The correct
change is to update _mesa_texture_image_multisample (which I will
rename texture_image_multisample once your patch lands) to do the
samples check. The only other user is image*Multisample and the spec
clearly states that this is also an error when 0. That spec says it
undefined what happens when this is negative but we already throw an
error via _mesa_check_sample_count(). 

You don't have to touch _mesa_check_sample_count() that change should
be in another patch which I already sent [1] feel free not to review it
:)

Doing it the way your suggesting is code churn for no gain, it just
means your code needs to be deleted and replaced with the change I'm
suggesting later anyway, so why not just fix it right the first time.

Also feel free to keep your valid_texstorage_ms_parameters() function I
was just suggesting an alternative to having another function call.


[1] 
http://lists.freedesktop.org/archives/mesa-dev/2015-August/091588.html
> > > {
> > > +  _mesa_error(ctx, GL_INVALID_VALUE, 
> > > "glTexStorage%uDMultisample()",
> > > dims);
> > > +  return false;
> > > +   }
> > > +   return true;
> > > +}
> > >   void GLAPIENTRY
> > >   _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
> > > @@ -5795,6 +5807,9 @@ _mesa_TexStorage2DMultisample(GLenum 
> > > target, GLsizei
> > > samples,
> > >  if (!texObj)
> > > return;
> > > 
> > > +   if (!valid_texstorage_ms_parameters(width, height, 1, 
> > > samples, 2))
> > > +  return;
> > > +
> > >  _mesa_texture_image_multisample(ctx, 2, texObj, target, 
> > > samples,
> > >  internalformat, width, 
> > > height, 1,
> > >  fixedsamplelocations, 
> > > GL_TRUE,
> > > @@ -5814,6 +5829,9 @@ _mesa_TexStorage3DMultisample(GLenum 
> > > target, GLsizei
> > > samples,
> > >  if (!texObj)
> > > return;
> > > 
> > > +   if (!valid_texstorage_ms_parameters(width, height, depth, 
> > > samples, 3))
> > > +  return;
> > > +
> > >  _mesa_texture_image_multisample(ctx, 3, texObj, target, 
> > > samples,
> > >  internalformat, width, 
> > > height, depth,
> > >  fixedsamplelocations, 
> > > GL_TRUE,
> > > @@ -5834,6 +5852,9 @@ _mesa_TextureStorage2DMultisample(GLuint 
> > > texture,
> > > GLsizei samples,
> > >  if (!texObj)
> > > return;
> > > 
> > > +   if (!valid_texstorage_ms_parameters(width, height, 1, 
> > > samples, 2))
> > > +  return;
> > > +
> > >  _mesa_texture_image_multisample(ctx, 2, texObj, texObj
> > > ->Target, samples,
> > >  internalformat, width, 
> > > height, 1,
> > >  fixedsamplelocations, 
> > > GL_TRUE,
> > > @@ -5855,6 +5876,9 @@ _mesa_TextureStorage3DMultisample(GLuint 
> > > texture,
> > > GLsizei samples,
> > >  if (!texObj)
> > > return;
> > > 
> > > 

Re: [Mesa-dev] [PATCH v2] mesa: refactor target error checking in glGetTexLevelParameter

2015-08-14 Thread Timothy Arceri
Change the commit message to "mesa: fix target error checking in
glGetTexLevelParameter" and Reviewed-by: Timothy Arceri



On Thu, 2015-08-13 at 17:03 +0300, Tapani Pälli wrote:
> With non-dsa functions we need to do target error checking before
> _mesa_get_current_tex_object which would just call _mesa_problem 
> without
> raising GL_INVALID_ENUM error. In other places of Mesa, target gets 
> checked
> before this call.
> 
> Fixes failures in:
>ES31
> -CTS.texture_storage_multisample.APIGLGetTexLevelParameterifv.*
> 
> v2: do the check also for dsa functions (Timothy)
> 
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/texparam.c | 32 +---
>  1 file changed, 25 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/main/texparam.c b/src/mesa/main/texparam.c
> index d9efd15..76edcca 100644
> --- a/src/mesa/main/texparam.c
> +++ b/src/mesa/main/texparam.c
> @@ -1574,6 +1574,19 @@ invalid_pname:
> _mesa_enum_to_string(pname));
>  }
>  
> +static bool
> +valid_tex_level_parameteriv_target(struct gl_context *ctx, GLenum 
> target,
> +   bool dsa)
> +{
> +   const char *suffix = dsa ? "ture" : "";
> +   if (!legal_get_tex_level_parameter_target(ctx, target, dsa)) {
> +  _mesa_error(ctx, GL_INVALID_ENUM,
> +  "glGetTex%sLevelParameter[if]v(target=%s)", 
> suffix,
> +  _mesa_enum_to_string(target));
> +  return false;
> +   }
> +   return true;
> +}
>  
>  /**
>   * This isn't exposed to the rest of the driver because it is a part 
> of the
> @@ -1597,13 +1610,6 @@ get_tex_level_parameteriv(struct gl_context 
> *ctx,
>return;
> }
>  
> -   if (!legal_get_tex_level_parameter_target(ctx, target, dsa)) {
> -  _mesa_error(ctx, GL_INVALID_ENUM,
> -  "glGetTex%sLevelParameter[if]v(target=%s)", 
> suffix,
> -  _mesa_enum_to_string(target));
> -  return;
> -   }
> -
> maxLevels = _mesa_max_texture_levels(ctx, target);
> assert(maxLevels != 0);
>  
> @@ -1631,6 +1637,9 @@ _mesa_GetTexLevelParameterfv( GLenum target, 
> GLint level,
> GLint iparam;
> GET_CURRENT_CONTEXT(ctx);
>  
> +   if (!valid_tex_level_parameteriv_target(ctx, target, false))
> +  return;
> +
> texObj = _mesa_get_current_tex_object(ctx, target);
> if (!texObj)
>return;
> @@ -1648,6 +1657,9 @@ _mesa_GetTexLevelParameteriv( GLenum target, 
> GLint level,
> struct gl_texture_object *texObj;
> GET_CURRENT_CONTEXT(ctx);
>  
> +   if (!valid_tex_level_parameteriv_target(ctx, target, false))
> +  return;
> +
> texObj = _mesa_get_current_tex_object(ctx, target);
> if (!texObj)
>return;
> @@ -1669,6 +1681,9 @@ _mesa_GetTextureLevelParameterfv(GLuint 
> texture, GLint level,
> if (!texObj)
>return;
>  
> +   if (!valid_tex_level_parameteriv_target(ctx, texObj->Target, 
> true))
> +  return;
> +
> get_tex_level_parameteriv(ctx, texObj, texObj->Target, level,
>   pname, &iparam, true);
>  
> @@ -1687,6 +1702,9 @@ _mesa_GetTextureLevelParameteriv(GLuint 
> texture, GLint level,
> if (!texObj)
>return;
>  
> +   if (!valid_tex_level_parameteriv_target(ctx, texObj->Target, 
> true))
> +  return;
> +
> get_tex_level_parameteriv(ctx, texObj, texObj->Target, level,
>   pname, params, true);
>  }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965/vec4_nir: Load constants as integers

2015-08-14 Thread Jason Ekstrand
I don't think this applies on top of the patch Matt sent yesterday to use
fewer moves for loading constants.
On Aug 14, 2015 4:56 AM, "Antia Puentes"  wrote:
>
> Loads constants using integer as their register type, this is done
> for consistency with the FS backend.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 923e2d3..e4ae899 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -456,7 +456,7 @@ void
>  vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
>  {
> dst_reg reg = dst_reg(GRF, alloc.allocate(1));
> -   reg.type =  BRW_REGISTER_TYPE_F;
> +   reg.type =  BRW_REGISTER_TYPE_D;
>
> /* @FIXME: consider emitting vector operations to save some MOVs in
>  * cases where the components are representable in 8 bits.
> @@ -464,7 +464,7 @@
vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
>  */
> for (unsigned i = 0; i < instr->def.num_components; ++i) {
>reg.writemask = 1 << i;
> -  emit(MOV(reg, src_reg(instr->value.f[i])));
> +  emit(MOV(reg, src_reg(instr->value.i[i])));
> }
>
> /* Set final writemask */
> --
> 2.1.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/vec4/nir: Emit single MOV to generate a scalar constant.

2015-08-14 Thread Eduardo Lima Mitev
On 08/12/2015 10:58 PM, Matt Turner wrote:
> If an immediate is written to multiple channels, we can load it in a
> single writemasked MOV.
> 
> total instructions in shared programs: 6285144 -> 6261991 (-0.37%)
> instructions in affected programs: 718991 -> 695838 (-3.22%)
> helped:5762
> ---
> The shader-db numbers are from a tree with the scalar_vs fixup.
> 

Ohh, I had a similar patch but fixes the problem just partially:

https://github.com/Igalia/mesa/commit/56f07bdd70ef41e837dd9801e48fcc1802a8c8ea

Yours is more generic, so great!

>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 923e2d3..632e409 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -458,13 +458,28 @@ vec4_visitor::nir_emit_load_const(nir_load_const_instr 
> *instr)
> dst_reg reg = dst_reg(GRF, alloc.allocate(1));
> reg.type =  BRW_REGISTER_TYPE_F;
>  
> +   unsigned remaining = brw_writemask_for_size(instr->def.num_components);
> +
> /* @FIXME: consider emitting vector operations to save some MOVs in
>  * cases where the components are representable in 8 bits.
> -* By now, we emit a MOV for each component.
> +* For now, we emit a MOV for each distinct value.
>  */
> -   for (unsigned i = 0; i < instr->def.num_components; ++i) {
> -  reg.writemask = 1 << i;
> +   for (unsigned i = 0; i < instr->def.num_components; i++) {
> +  unsigned writemask = 1 << i;
> +
> +  if ((remaining & writemask) == 0)
> + continue;
> +
> +  for (unsigned j = i; j < instr->def.num_components; j++) {
> + if (instr->value.u[i] == instr->value.u[j]) {
> +writemask |= 1 << j;
> + }
> +  }
> +
> +  reg.writemask = writemask;
>emit(MOV(reg, src_reg(instr->value.f[i])));
> +
> +  remaining &= ~writemask;
> }
>  
> /* Set final writemask */
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Don't tell the hardware about our UAV access.

2015-08-14 Thread Francisco Jerez
The hardware documentation relating to the UAV HW-assisted coherency
mechanism and UAV access enable bits is scarce and sometimes
contradictory, and there's quite some guesswork behind this commit, so
let me summarize the background first: HSW and later hardware have
infrastructure to support a stricter form of data coherency between
shader invocations from separate primitives.  The mechanism is
controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and
_PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on
the 3DPRIMITIVE command.

Regardless of whether "UAV Coherency Required" is set, the hardware
fixed-function units will increment a per-stage semaphore for each
request received if "Accesses UAV" is set for the same or any lower
stage.  An implicit DC flush is emitted by the lowermost stage with
"Accesses UAV" set once it's done processing the request, this also
happens regardless of the value of "UAV Coherency Required".  The
completion of the DC flush will cause the same stage and all previous
ones to decrement the semaphore, marking the UAV accesses for the
primitive as coherent with L3.

The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline
stall before any threads are dispatched for the first FF stage with
"Accesses UAV" set until the semaphore is cleared for the same stage.
Effectively this guarantees that UAV memory accesses performed by
previous primitives from any stage will be strictly ordered (and
thanks to the implicit DC flush visible in memory) with UAV accesses
from the following primitives.

None of this is required by the usual image, atomic counter and SSBO
GL APIs which have very relaxed cross-primitive coherency and ordering
requirements, so we don't actually ever set the "UAV Coherency
Required" bit -- Ordering with respect to shader invocations from
previous stages on the same primitive where there is a data dependency
is of course already guaranteed as the spec requires, regardless of
this mechanism being enabled.  We do set the "Accesses UAV" bits
though since my commit ac7664e493655e290783c23a0412b9c70936da50 (which
this patch partially reverts), mainly because of comments like the
following from the BDW PRM:

> 3DSTATE_GS
>[...]
> 12 Accesses UAV
>Format: Enable
>This field must be set when GS has a UAV access.

There are similar comments in the documentation for the other
3DSTATE_*S commands.  The "must" part is misleading and unjustified
AFAIK.  Most of the "Accesses UAV" bits don't seem to have any side
effects other than the implicit DC flushes and the related
book-keeping in anticipation for a subsequent primitive with "UAV
Coherency Required" set, so in most cases they are unnecessary and may
incur a performance penalty.  There is an exception though.  On Gen8+
the PS_EXTRA UAV access bit influences the calculation of the PS
UAV-only and ThreadDispatchEnable signals which on previous
generations were set explicitly by the driver, so we cannot always
avoid enabling it on the PS stage.

The primary motivation for this change is that in fact the hardware
coherency mechanism is buggy and will cause a rather non-deterministic
hang on Gen8 when VS is the only stage with "Accesses UAV" set and the
processing of a request terminates immediately after the implicit DC
flush is sent for a previous primitive with no additional vertices
being emitted for the second primitive, what will cause the hardware
to skip sending a second DC flush and cause the VS to stall
indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
This hardware bug can be reproduced on current master with the
spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit
subtest (if you have the patience to run it a few dozen times).

The proposed workaround is to insert CS STALLs speculatively between
3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage
only.  Because this would affect one of the hottest paths in the
driver and likely decrease performance even further due to the
unnecessary serialization, and because we don't actually need the
implicit DC flushes, it seems better to just disable them.
---
 src/mesa/drivers/dri/i965/gen7_gs_state.c |  4 +---
 src/mesa/drivers/dri/i965/gen7_vs_state.c |  4 +---
 src/mesa/drivers/dri/i965/gen7_wm_state.c | 12 
 src/mesa/drivers/dri/i965/gen8_gs_state.c |  4 +---
 src/mesa/drivers/dri/i965/gen8_ps_state.c | 32 ---
 src/mesa/drivers/dri/i965/gen8_vs_state.c |  4 +---
 6 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_gs_state.c 
b/src/mesa/drivers/dri/i965/gen7_gs_state.c
index 497ecec..8d6d3fe 100644
--- a/src/mesa/drivers/dri/i965/gen7_gs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_gs_state.c
@@ -59,9 +59,7 @@ upload_gs_state(struct brw_context *brw)
   OUT_BATCH(((ALIGN(stage_state->sampler_count, 4)/4) <<
  GEN6_GS_SAMPLER_COUNT_SHIFT) |
 ((brw->gs.prog_data->ba

Re: [Mesa-dev] [PATCH] glsl: add missing MS sampler builtin types for GLSL ES 3.10

2015-08-14 Thread Kenneth Graunke
On Friday, August 14, 2015 01:36:40 PM Tapani Pälli wrote:
> Signed-off-by: Tapani Pälli 
> ---
>  src/glsl/builtin_types.cpp | 6 +++---
>  src/glsl/glsl_lexer.ll | 7 ---
>  2 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/src/glsl/builtin_types.cpp b/src/glsl/builtin_types.cpp
> index ffbc5e6..b0156a1 100644
> --- a/src/glsl/builtin_types.cpp
> +++ b/src/glsl/builtin_types.cpp
> @@ -182,7 +182,7 @@ const static struct builtin_type_versions {
> T(samplerCubeArray,400, 999)
> T(sampler2DRect,   140, 999)
> T(samplerBuffer,   140, 999)
> -   T(sampler2DMS, 150, 999)
> +   T(sampler2DMS, 150, 310)
> T(sampler2DMSArray,150, 999)
>  
> T(isampler1D,  130, 999)
> @@ -194,7 +194,7 @@ const static struct builtin_type_versions {
> T(isamplerCubeArray,   400, 999)
> T(isampler2DRect,  140, 999)
> T(isamplerBuffer,  140, 999)
> -   T(isampler2DMS,150, 999)
> +   T(isampler2DMS,150, 310)
> T(isampler2DMSArray,   150, 999)
>  
> T(usampler1D,  130, 999)
> @@ -206,7 +206,7 @@ const static struct builtin_type_versions {
> T(usamplerCubeArray,   400, 999)
> T(usampler2DRect,  140, 999)
> T(usamplerBuffer,  140, 999)
> -   T(usampler2DMS,150, 999)
> +   T(usampler2DMS,150, 310)
> T(usampler2DMSArray,   150, 999)
>  
> T(sampler1DShadow, 110, 999)
> diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
> index efa0bb6..88565e8 100644
> --- a/src/glsl/glsl_lexer.ll
> +++ b/src/glsl/glsl_lexer.ll
> @@ -343,9 +343,10 @@ usampler2DArray  KEYWORD(130, 300, 130, 300, 
> USAMPLER2DARRAY);
>  
> /* additional keywords in ARB_texture_multisample, included in GLSL 1.50 
> */
> /* these are reserved but not defined in GLSL 3.00 */
> -sampler2DMSKEYWORD_WITH_ALT(150, 300, 150, 0, 
> yyextra->ARB_texture_multisample_enable, SAMPLER2DMS);
> -isampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 0, 
> yyextra->ARB_texture_multisample_enable, ISAMPLER2DMS);
> -usampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 0, 
> yyextra->ARB_texture_multisample_enable, USAMPLER2DMS);
> +   /* [iu]sampler2DMS are defined in GLSL ES 3.10 */
> +sampler2DMSKEYWORD_WITH_ALT(150, 300, 150, 310, 
> yyextra->ARB_texture_multisample_enable, SAMPLER2DMS);
> +isampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 310, 
> yyextra->ARB_texture_multisample_enable, ISAMPLER2DMS);
> +usampler2DMS   KEYWORD_WITH_ALT(150, 300, 150, 310, 
> yyextra->ARB_texture_multisample_enable, USAMPLER2DMS);
>  sampler2DMSArray   KEYWORD_WITH_ALT(150, 300, 150, 0, 
> yyextra->ARB_texture_multisample_enable, SAMPLER2DMSARRAY);
>  isampler2DMSArray  KEYWORD_WITH_ALT(150, 300, 150, 0, 
> yyextra->ARB_texture_multisample_enable, ISAMPLER2DMSARRAY);
>  usampler2DMSArray  KEYWORD_WITH_ALT(150, 300, 150, 0, 
> yyextra->ARB_texture_multisample_enable, USAMPLER2DMSARRAY);
> 

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Don't tell the hardware about our UAV access.

2015-08-14 Thread Roland Scheidegger
I guess though you'd need these bits when implementing things like
ARB_fragment_shader_ordering? (Thus stuff actually looks useful but I
don't know if anybody wants to implement it in the near term.)

Roland



Am 14.08.2015 um 18:27 schrieb Francisco Jerez:
> The hardware documentation relating to the UAV HW-assisted coherency
> mechanism and UAV access enable bits is scarce and sometimes
> contradictory, and there's quite some guesswork behind this commit, so
> let me summarize the background first: HSW and later hardware have
> infrastructure to support a stricter form of data coherency between
> shader invocations from separate primitives.  The mechanism is
> controlled by the "Accesses UAV" bits on 3DSTATE_VS, _HS, _DS, _GS and
> _PS (or _PS_EXTRA on BDW+), and the "UAV Coherency Required" bit on
> the 3DPRIMITIVE command.
> 
> Regardless of whether "UAV Coherency Required" is set, the hardware
> fixed-function units will increment a per-stage semaphore for each
> request received if "Accesses UAV" is set for the same or any lower
> stage.  An implicit DC flush is emitted by the lowermost stage with
> "Accesses UAV" set once it's done processing the request, this also
> happens regardless of the value of "UAV Coherency Required".  The
> completion of the DC flush will cause the same stage and all previous
> ones to decrement the semaphore, marking the UAV accesses for the
> primitive as coherent with L3.
> 
> The "UAV Coherency Required" 3DPRIMITIVE bit will cause a pipeline
> stall before any threads are dispatched for the first FF stage with
> "Accesses UAV" set until the semaphore is cleared for the same stage.
> Effectively this guarantees that UAV memory accesses performed by
> previous primitives from any stage will be strictly ordered (and
> thanks to the implicit DC flush visible in memory) with UAV accesses
> from the following primitives.
> 
> None of this is required by the usual image, atomic counter and SSBO
> GL APIs which have very relaxed cross-primitive coherency and ordering
> requirements, so we don't actually ever set the "UAV Coherency
> Required" bit -- Ordering with respect to shader invocations from
> previous stages on the same primitive where there is a data dependency
> is of course already guaranteed as the spec requires, regardless of
> this mechanism being enabled.  We do set the "Accesses UAV" bits
> though since my commit ac7664e493655e290783c23a0412b9c70936da50 (which
> this patch partially reverts), mainly because of comments like the
> following from the BDW PRM:
> 
>> 3DSTATE_GS
>> [...]
>> 12 Accesses UAV
>>Format: Enable
>>This field must be set when GS has a UAV access.
> 
> There are similar comments in the documentation for the other
> 3DSTATE_*S commands.  The "must" part is misleading and unjustified
> AFAIK.  Most of the "Accesses UAV" bits don't seem to have any side
> effects other than the implicit DC flushes and the related
> book-keeping in anticipation for a subsequent primitive with "UAV
> Coherency Required" set, so in most cases they are unnecessary and may
> incur a performance penalty.  There is an exception though.  On Gen8+
> the PS_EXTRA UAV access bit influences the calculation of the PS
> UAV-only and ThreadDispatchEnable signals which on previous
> generations were set explicitly by the driver, so we cannot always
> avoid enabling it on the PS stage.
> 
> The primary motivation for this change is that in fact the hardware
> coherency mechanism is buggy and will cause a rather non-deterministic
> hang on Gen8 when VS is the only stage with "Accesses UAV" set and the
> processing of a request terminates immediately after the implicit DC
> flush is sent for a previous primitive with no additional vertices
> being emitted for the second primitive, what will cause the hardware
> to skip sending a second DC flush and cause the VS to stall
> indefinitely waiting for a response from the DC (BDWGFX HSD 1912017).
> This hardware bug can be reproduced on current master with the
> spec@arb_shader_image_load_store@host-mem-barrier@Indirect/RaW piglit
> subtest (if you have the patience to run it a few dozen times).
> 
> The proposed workaround is to insert CS STALLs speculatively between
> 3DPRIMITIVE commands when "Accesses UAV" is enabled for the VS stage
> only.  Because this would affect one of the hottest paths in the
> driver and likely decrease performance even further due to the
> unnecessary serialization, and because we don't actually need the
> implicit DC flushes, it seems better to just disable them.
> ---
>  src/mesa/drivers/dri/i965/gen7_gs_state.c |  4 +---
>  src/mesa/drivers/dri/i965/gen7_vs_state.c |  4 +---
>  src/mesa/drivers/dri/i965/gen7_wm_state.c | 12 
>  src/mesa/drivers/dri/i965/gen8_gs_state.c |  4 +---
>  src/mesa/drivers/dri/i965/gen8_ps_state.c | 32 
> ---
>  src/mesa/drivers/dri/i965/gen8_vs_state.c |  4 +---
>  6 files changed, 41 insertions(+), 19 deletions(-)
> 
> diff --g

Re: [Mesa-dev] [PATCH v2 1/3] mesa: expose dimension check for glTex*Storage functions

2015-08-14 Thread Anuj Phogat
On Wed, Aug 12, 2015 at 11:30 PM, Tapani Pälli  wrote:
> This is done so that following patch can use it to verify dimensions
> for multisample variants of glTex*Storage.
>
> v2: move function to header, use bool instead GLboolean
>
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/texstorage.c |  2 +-
>  src/mesa/main/texstorage.h | 21 +
>  2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/texstorage.c b/src/mesa/main/texstorage.c
> index 4a2cc60..50a980f 100644
> --- a/src/mesa/main/texstorage.c
> +++ b/src/mesa/main/texstorage.c
> @@ -287,7 +287,7 @@ tex_storage_error_check(struct gl_context *ctx,
>  * order to allow meta functions to use legacy formats. */
>
> /* size check */
> -   if (width < 1 || height < 1 || depth < 1) {
> +   if (_mesa_tex_storage_invalid_dim(width, height, depth)) {
>_mesa_error(ctx, GL_INVALID_VALUE,
>"glTex%sStorage%uD(width, height or depth < 1)",
>suffix, dims);
> diff --git a/src/mesa/main/texstorage.h b/src/mesa/main/texstorage.h
> index 6f5495f..e23d53a 100644
> --- a/src/mesa/main/texstorage.h
> +++ b/src/mesa/main/texstorage.h
> @@ -38,6 +38,27 @@ _mesa_texture_storage(struct gl_context *ctx, GLuint dims,
>GLenum internalformat, GLsizei width,
>GLsizei height, GLsizei depth, bool dsa);
>
> +/**
> + * Texture width, height and depth check shared with the
> + * multisample variants of TexStorage functions.
> + *
> + * From OpenGL 4.5 Core spec, page 260 (section 8.19)
> + *
> + * "An INVALID_VALUE error is generated if width, height, depth
> + * or levels are less than 1, for commands with the corresponding
> + * parameters."
> + *
> + * (referring to TextureStorage* commands, these also match values
> + * specified for OpenGL ES 3.1.)
> + */
> +static inline bool
> +_mesa_tex_storage_invalid_dim(GLsizei width, GLsizei height, GLsizei depth)
How about naming it _mesa_valid_tex_storage_dim() and then doing the check
!_mesa_tex_storage_invalid_dim() ? That's how you named other functions in
patch 2.
> +{
> +   if (width < 1 || height < 1 || depth < 1)
> +  return true;
> +   return false;
> +}
> +
>  /*@}*/
>
>  /**
> --
> 2.1.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

With or without suggested change:
Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] mesa: validate size parameters for glTexStorage*Multisample

2015-08-14 Thread Anuj Phogat
On Mon, Aug 10, 2015 at 1:06 AM, Tapani Pälli  wrote:
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/teximage.c | 28 
>  1 file changed, 28 insertions(+)
>
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index fc69387..3ea7b2a 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -5782,6 +5782,18 @@ _mesa_TexImage3DMultisample(GLenum target, GLsizei 
> samples,
> "glTexImage3DMultisample");
>  }
>
> +static bool
> +valid_texstorage_ms_parameters(GLsizei width, GLsizei height, GLsizei depth,
> +   GLsizei samples, const char *func)
> +{
> +   GET_CURRENT_CONTEXT(ctx);
> +
> +   if (_mesa_tex_storage_invalid_dim(width, height, depth) || samples < 1) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s()", func);
> +  return false;
> +   }
> +   return true;
> +}
>
I think samples = 0 is acceptable by the spec. samples < 0 or samples
> max_samples is what generate GL_INVALID_VALUE error. samples
count check is anyway taken care of by _mesa_check_sample_count()
in _mesa_texture_image_multisample(). So, no need of a helper function
here. Just do _mesa_valid_tex_storage_dim() checks.

>  void GLAPIENTRY
>  _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
> @@ -5795,6 +5807,10 @@ _mesa_TexStorage2DMultisample(GLenum target, GLsizei 
> samples,
> if (!texObj)
>return;
>
> +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
> +   "glTexStorage2DMultisample"))
> +  return;
> +
> _mesa_texture_image_multisample(ctx, 2, texObj, target, samples,
> internalformat, width, height, 1,
> fixedsamplelocations, GL_TRUE,
> @@ -5814,6 +5830,10 @@ _mesa_TexStorage3DMultisample(GLenum target, GLsizei 
> samples,
> if (!texObj)
>return;
>
> +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
> +   "glTexStorage3DMultisample"))
> +  return;
> +
> _mesa_texture_image_multisample(ctx, 3, texObj, target, samples,
> internalformat, width, height, depth,
> fixedsamplelocations, GL_TRUE,
> @@ -5834,6 +5854,10 @@ _mesa_TextureStorage2DMultisample(GLuint texture, 
> GLsizei samples,
> if (!texObj)
>return;
>
> +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
> +   "glTextureStorage2DMultisample"))
> +  return;
> +
> _mesa_texture_image_multisample(ctx, 2, texObj, texObj->Target, samples,
> internalformat, width, height, 1,
> fixedsamplelocations, GL_TRUE,
> @@ -5855,6 +5879,10 @@ _mesa_TextureStorage3DMultisample(GLuint texture, 
> GLsizei samples,
> if (!texObj)
>return;
>
> +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
> +   "glTextureStorage3DMultisample"))
> +  return;
> +
> _mesa_texture_image_multisample(ctx, 3, texObj, texObj->Target, samples,
> internalformat, width, height, depth,
> fixedsamplelocations, GL_TRUE,
> --
> 2.1.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/4] mesa: set correct error for non-renderable multisample textures

2015-08-14 Thread Anuj Phogat
On Mon, Aug 10, 2015 at 1:06 AM, Tapani Pälli  wrote:
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/teximage.c | 15 ---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index 3ea7b2a..c6fd0be 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -5639,9 +5639,18 @@ _mesa_texture_image_multisample(struct gl_context 
> *ctx, GLuint dims,
> }
>
> if (!is_renderable_texture_format(ctx, internalformat)) {
> -  _mesa_error(ctx, GL_INVALID_OPERATION,
> -"%s(internalformat=%s)",
> -func, _mesa_enum_to_string(internalformat));
> +  /* Page 172 of OpenGL ES 3.1 spec says:
> +   *   "An INVALID_ENUM error is generated if sizedinternalformat is not
> +   *   color-renderable, depth-renderable, or stencil-renderable (as
> +   *   defined in section 9.4).
> +   */
> +  if (_mesa_is_gles31(ctx)) {
> + _mesa_error(ctx, GL_INVALID_ENUM, "%s(internalformat=%s)", func,
> + _mesa_enum_to_string(internalformat));
> +  } else {
> + _mesa_error(ctx, GL_INVALID_OPERATION, "%s(internalformat=%s)", 
> func,
> + _mesa_enum_to_string(internalformat));
> +  }
>return;
> }
>
> --
> 2.1.0
>

Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] mesa: validate size parameters for glTexStorage*Multisample

2015-08-14 Thread Anuj Phogat
On Fri, Aug 14, 2015 at 10:30 AM, Anuj Phogat  wrote:
> On Mon, Aug 10, 2015 at 1:06 AM, Tapani Pälli  wrote:
>> Signed-off-by: Tapani Pälli 
>> ---
>>  src/mesa/main/teximage.c | 28 
>>  1 file changed, 28 insertions(+)
>>
>> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
>> index fc69387..3ea7b2a 100644
>> --- a/src/mesa/main/teximage.c
>> +++ b/src/mesa/main/teximage.c
>> @@ -5782,6 +5782,18 @@ _mesa_TexImage3DMultisample(GLenum target, GLsizei 
>> samples,
>> "glTexImage3DMultisample");
>>  }
>>
>> +static bool
>> +valid_texstorage_ms_parameters(GLsizei width, GLsizei height, GLsizei depth,
>> +   GLsizei samples, const char *func)
>> +{
>> +   GET_CURRENT_CONTEXT(ctx);
>> +
>> +   if (_mesa_tex_storage_invalid_dim(width, height, depth) || samples < 1) {
>> +  _mesa_error(ctx, GL_INVALID_VALUE, "%s()", func);
>> +  return false;
>> +   }
>> +   return true;
>> +}
>>
> I think samples = 0 is acceptable by the spec. samples < 0 or samples
>> max_samples is what generate GL_INVALID_VALUE error. samples
> count check is anyway taken care of by _mesa_check_sample_count()
> in _mesa_texture_image_multisample(). So, no need of a helper function
> here. Just do _mesa_valid_tex_storage_dim() checks.
>
Now I noticed that samples = 0 is not acceptable in GLES 3.1 but don't see
it in GL. Add an ES 3.1 check here.

>>  void GLAPIENTRY
>>  _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
>> @@ -5795,6 +5807,10 @@ _mesa_TexStorage2DMultisample(GLenum target, GLsizei 
>> samples,
>> if (!texObj)
>>return;
>>
>> +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
>> +   "glTexStorage2DMultisample"))
>> +  return;
>> +
>> _mesa_texture_image_multisample(ctx, 2, texObj, target, samples,
>> internalformat, width, height, 1,
>> fixedsamplelocations, GL_TRUE,
>> @@ -5814,6 +5830,10 @@ _mesa_TexStorage3DMultisample(GLenum target, GLsizei 
>> samples,
>> if (!texObj)
>>return;
>>
>> +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
>> +   "glTexStorage3DMultisample"))
>> +  return;
>> +
>> _mesa_texture_image_multisample(ctx, 3, texObj, target, samples,
>> internalformat, width, height, depth,
>> fixedsamplelocations, GL_TRUE,
>> @@ -5834,6 +5854,10 @@ _mesa_TextureStorage2DMultisample(GLuint texture, 
>> GLsizei samples,
>> if (!texObj)
>>return;
>>
>> +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
>> +   "glTextureStorage2DMultisample"))
>> +  return;
>> +
>> _mesa_texture_image_multisample(ctx, 2, texObj, texObj->Target, samples,
>> internalformat, width, height, 1,
>> fixedsamplelocations, GL_TRUE,
>> @@ -5855,6 +5879,10 @@ _mesa_TextureStorage3DMultisample(GLuint texture, 
>> GLsizei samples,
>> if (!texObj)
>>return;
>>
>> +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
>> +   "glTextureStorage3DMultisample"))
>> +  return;
>> +
>> _mesa_texture_image_multisample(ctx, 3, texObj, texObj->Target, samples,
>> internalformat, width, height, depth,
>> fixedsamplelocations, GL_TRUE,
>> --
>> 2.1.0
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] mesa: allow multisampled integer texture formats for ES 3.1

2015-08-14 Thread Anuj Phogat
Marta landed a similar patch in master.

On Mon, Aug 10, 2015 at 1:06 AM, Tapani Pälli  wrote:
> Signed-off-by: Tapani Pälli 
> ---
>  src/mesa/main/multisample.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
> index 490bad5..aac5a15 100644
> --- a/src/mesa/main/multisample.c
> +++ b/src/mesa/main/multisample.c
> @@ -164,9 +164,11 @@ _mesa_check_sample_count(struct gl_context *ctx, GLenum 
> target,
>  *
>  * "If internalformat is a signed or unsigned integer format and 
> samples
>  * is greater than zero, then the error INVALID_OPERATION is 
> generated."
> +*
> +* OpenGL ES 3.1 allows this.
>  */
> -   if (_mesa_is_gles3(ctx) && _mesa_is_enum_format_integer(internalFormat)
> -   && samples > 0) {
> +   if ((ctx->API == API_OPENGLES2 && ctx->Version == 30) &&
> +   _mesa_is_enum_format_integer(internalFormat) && samples > 0) {
>return GL_INVALID_OPERATION;
> }
>
> --
> 2.1.0
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] i965/gen9: Reuse YF alignment tables in tr_mode_..._texture_alignment()

2015-08-14 Thread Emil Velikov
Hi Anuj,

On 13 August 2015 at 22:51, Anuj Phogat  wrote:
...
> +   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YS)
> +  ret_align *= multiplier;
Out of curiosity, have you noticed if the compiler is clever enough to
optimise the above multiplication to a shift ?

Thanks
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Use the effective internal format instead for validation

2015-08-14 Thread Anuj Phogat
On Fri, Aug 14, 2015 at 4:37 AM, Eduardo Lima Mitev  wrote:
> When validating format+type+internalFormat for texture pixel operations
> on GLES3, the effective internal format should be used if the one
> specified is an unsized internal format. Page 127, section "3.8 Texturing"
> of the GLES 3.0.4 spec says:
>
> "if internalformat is a base internal format, the effective internal
>  format is a sized internal format that is derived from the format and
>  type for internal use by the GL. Table 3.12 specifies the mapping of
>  format and type to effective internal formats. The effective internal
>  format is used by the GL for purposes such as texture completeness or
>  type checks for CopyTex* commands. In these cases, the GL is required
>  to operate as if the effective internal format was used as the
>  internalformat when specifying the texture data."
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91582
> ---
>  src/mesa/main/glformats.c | 48 
> +++
>  src/mesa/main/glformats.h |  4 
>  src/mesa/main/teximage.c  | 27 ++
>  3 files changed, 79 insertions(+)
>
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index 3eb66da..fd83e9c 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -2824,3 +2824,51 @@ _mesa_format_from_format_and_type(GLenum format, 
> GLenum type)
>  */
> unreachable("Unsupported format");
>  }
> +
> +/**
> + * Returns the effective internal format from a texture format and type.
> + * This is used by texture image operations internally for validation, when
> + * the specified internal format is a base (unsized) format.
> + *
> + * \param format the texture format
> + * \param type the texture type
> + */
> +GLenum
> +_mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
> +GLenum type)
> +{
> +  switch (format) {
> +  case GL_RGBA:
> + switch (type) {
> + case GL_UNSIGNED_BYTE:
> +return GL_RGBA8;
> + case GL_UNSIGNED_SHORT_4_4_4_4:
> +return GL_RGBA4;
> + case GL_UNSIGNED_SHORT_5_5_5_1:
> +return GL_RGB5_A1;
> + }
> + break;
> +  case GL_RGB:
> + switch (type) {
> + case GL_UNSIGNED_BYTE:
> +return GL_RGB8;
> + case GL_UNSIGNED_SHORT_5_6_5:
> +return GL_RGB565;
> + }
> + break;
> +  case GL_LUMINANCE_ALPHA:
> + if (type == GL_UNSIGNED_BYTE)
> +return GL_LUMINANCE8_ALPHA8;
> +  case GL_LUMINANCE:
> + if (type == GL_UNSIGNED_BYTE)
> +return GL_LUMINANCE8;
> +  case GL_ALPHA:
> + if (type == GL_UNSIGNED_BYTE)
> +return GL_ALPHA8;
> +  default:
> + /* fall through and return NONE */
> + break;
> +  }
> +
> +  return GL_NONE;
> +}
> diff --git a/src/mesa/main/glformats.h b/src/mesa/main/glformats.h
> index 419955a..0686c71 100644
> --- a/src/mesa/main/glformats.h
> +++ b/src/mesa/main/glformats.h
> @@ -135,6 +135,10 @@ _mesa_es3_error_check_format_and_type(const struct 
> gl_context *ctx,
>  extern uint32_t
>  _mesa_format_from_format_and_type(GLenum format, GLenum type);
>
> +extern GLenum
> +_mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
> +GLenum type);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> index 3a556a6..c3dea8c 100644
> --- a/src/mesa/main/teximage.c
> +++ b/src/mesa/main/teximage.c
> @@ -2111,6 +2111,33 @@ texture_format_error_check_gles(struct gl_context 
> *ctx, GLenum format,
> GLenum err;
>
> if (_mesa_is_gles3(ctx)) {
> +  /* Page 127, section "3.8 Texturing" of the GLES 3.0.4 spec says:
> +   *
> +   *"if internalformat is a base internal format, the effective
> +   * internal format is a sized internal format that is derived
> +   * from the format and type for internal use by the GL.
> +   * Table 3.12 specifies the mapping of format and type to effective
> +   * internal formats. The effective internal format is used by the 
> GL
> +   * for purposes such as texture completeness or type checks for
> +   * CopyTex* commands. In these cases, the GL is required to operate
> +   * as if the effective internal format was used as the 
> internalformat
> +   * when specifying the texture data."
> +   */
> +  if (_mesa_is_enum_format_unsized(internalFormat)) {
> + internalFormat =
> +_mesa_es3_effective_internal_format_for_format_and_type(format,
> +type);
> + if (internalFormat == GL_NONE) {
> +_mesa_error(ctx, GL_INVALID_OPERATION,
> +"%s(format = %s, type = %s, effective "
> +"internalformat = %s)",
> +   

[Mesa-dev] [PATCH] meta: Update comment about unsupported texture types.

2015-08-14 Thread Matt Turner
Ken added support for 2DArray (commit ec23d5197e) and 1DArray (commit
14ca61125) last year.
---
 src/mesa/drivers/common/meta_generate_mipmap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c 
b/src/mesa/drivers/common/meta_generate_mipmap.c
index 0655f05..a4103b9 100644
--- a/src/mesa/drivers/common/meta_generate_mipmap.c
+++ b/src/mesa/drivers/common/meta_generate_mipmap.c
@@ -150,8 +150,7 @@ prepare_mipmap_level(struct gl_context *ctx,
 
 /**
  * Called via ctx->Driver.GenerateMipmap()
- * Note: We don't yet support 3D textures, 1D/2D array textures or texture
- * borders.
+ * Note: We don't yet support 3D textures, or texture borders.
  */
 void
 _mesa_meta_GenerateMipmap(struct gl_context *ctx, GLenum target,
-- 
2.4.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] glClearColor is broken in a weird way if compiled with mangling enabled

2015-08-14 Thread Yuzhu Lu

Hi All, 

Since I need to load osmesa and system GL api at the same time on Linux, I need 
to compile 10.5.9 osmesa in a mangled way. After a simple fix in glapi.h: 

/* Is this needed? It is incomplete anyway. */ 
/* 
#ifdef USE_MGL_NAMESPACE 
#define _glapi_set_dispatch _mglapi_set_dispatch 
#define _glapi_get_dispatch _mglapi_get_dispatch 
#define _glapi_set_context _mglapi_set_context 
#define _glapi_get_context _mglapi_get_context 
#define _glapi_Dispatch _mglapi_Dispatch 
#define _glapi_Context _mglapi_Context 
#endif 
*/ 

Now it compiles fine and here is my configuration: 

./configure CFLAGS="-O2" CXXFLAGS="-O2" --disable-xvmc --disable-glx 
--disable-dri --with-dri-drivers="" --with-gallium-drivers="swrast" 
--enable-texture-float --enable-shared-glapi --disable-egl --enable-mangling 
--with-egl-platforms="" --enable-gallium-osmesa --enable-gallium-llvm=yes 
--disable-llvm-shared-libs --with-osmesa-bits=32 --with-max-width=65536 
--with-max-height=65536 

But glClearColor is broken in a weird way that the value passed in is revised 
internally. Here is my call: 
mglClearColor(0.5f, 0.3f, 0.3f, 0.3f); 

But if I print out the value in _mesa_ClearColor() method of Clear.c. It shows: 
ClearColor: 0.00, 2.00, 2.00, 2.00 

I totally have no idea why this is happening while everything works fine with 
mangling disabled. 

Also, the compiler complains that mglGetString returns int instead of const 
GLubyte*: 
osdemo32.c:441:1: warning: format ‘%s’ expects argument of type ‘char *’, but 
argument 2 has type ‘int’ [-Wformat=] 
printf("Version: %s\n", mglGetString(GL_VERSION)); 

And mglGetString(GL_SHADING_LANGUAGE_VERSION) will crash the application while 
other string works fine. 

I also try to compile it with disable-shared-glapi flag, but I get errors. I am 
not sure this is because gl_mangle.h is dated. (If I manually add the following 
functions in gl_mangle.h, it compiles fine but some functions like 
glCreateShader will be totally messed up.) 

I would really appreciate it if someone have solutions because this is a very 
important project for us. 



CXXLD libMangledOSMesa32.la 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x5588):
 undefined reference to `glPointSizePointerOES' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6ae8):
 undefined reference to `glPolygonOffsetClampEXT' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6b78):
 undefined reference to `glAlphaFuncx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6b90):
 undefined reference to `glClearColorx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6ba8):
 undefined reference to `glClearDepthx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6bc0):
 undefined reference to `glColor4x' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6bd8):
 undefined reference to `glDepthRangex' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6bf0):
 undefined reference to `glFogx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6c08):
 undefined reference to `glFogxv' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6c20):
 undefined reference to `glFrustumf' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6c38):
 undefined reference to `glFrustumx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6c50):
 undefined reference to `glLightModelx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6c68):
 undefined reference to `glLightModelxv' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6c80):
 undefined reference to `glLightx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6c98):
 undefined reference to `glLightxv' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6cb0):
 undefined reference to `glLineWidthx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6cc8):
 undefined reference to `glLoadMatrixx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6ce0):
 undefined reference to `glMaterialx' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6cf8):
 undefined reference to `glMaterialxv' 
../../../../src/mapi/glapi/.libs/libglapi.a(glapi_libglapi_la-glapi_getproc.o):(.data.rel.ro+0x6d10):
 undefined reference to `glMultMatrixx' 
../../../../s

Re: [Mesa-dev] [PATCH] i965: add ARB_texture_barrier support

2015-08-14 Thread Eric Anholt
Ilia Mirkin  writes:

> On Thu, Aug 13, 2015 at 7:19 PM, Eric Anholt  wrote:
>> Ilia Mirkin  writes:
>>
>>> Signed-off-by: Ilia Mirkin 
>>> ---
>>>
>>> The blending-in-shader piglit test passed even without the flush,
>>> which doesn't inspire me with confidence, but those piglit_draw_rect
>>> things are pretty heavy so perhaps cause a flush on their own anyways.
>>
>> I think you don't actually need the hook, because of the
>> brw_render_cache_set_check_flush(brw, tex_obj->mt->bo) in
>> intel_update_state().  You should be able to confirm that you're getting
>> those pipe controls between your draws using INTEL_DEBUG=batch
>
> I'm a complete newbie when it comes to the i965 driver, but why would
> intel_update_state() get called in between sequential draws? The idea
> is that you can do things like
>
> tex0 = tex;
> fb = tex;
> draw;
> barrier;
> draw;
> barrier;
> draw;
>
> which is actually exactly what the blending-in-shader piglit does. So
> there should be no extra state updates between them. Am I missing
> something?

Not sure, was just giving my best guess as to why the test was working
already.  You'd probably need to trace and see if you always get an
updatestate between draws.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: add ARB_texture_barrier support

2015-08-14 Thread Ilia Mirkin
On Fri, Aug 14, 2015 at 2:34 PM, Eric Anholt  wrote:
> Ilia Mirkin  writes:
>
>> On Thu, Aug 13, 2015 at 7:19 PM, Eric Anholt  wrote:
>>> Ilia Mirkin  writes:
>>>
 Signed-off-by: Ilia Mirkin 
 ---

 The blending-in-shader piglit test passed even without the flush,
 which doesn't inspire me with confidence, but those piglit_draw_rect
 things are pretty heavy so perhaps cause a flush on their own anyways.
>>>
>>> I think you don't actually need the hook, because of the
>>> brw_render_cache_set_check_flush(brw, tex_obj->mt->bo) in
>>> intel_update_state().  You should be able to confirm that you're getting
>>> those pipe controls between your draws using INTEL_DEBUG=batch
>>
>> I'm a complete newbie when it comes to the i965 driver, but why would
>> intel_update_state() get called in between sequential draws? The idea
>> is that you can do things like
>>
>> tex0 = tex;
>> fb = tex;
>> draw;
>> barrier;
>> draw;
>> barrier;
>> draw;
>>
>> which is actually exactly what the blending-in-shader piglit does. So
>> there should be no extra state updates between them. Am I missing
>> something?
>
> Not sure, was just giving my best guess as to why the test was working
> already.  You'd probably need to trace and see if you always get an
> updatestate between draws.

Oh right. That could well be the reason why that test works. But it's
not at all guaranteed that there will be an updatestate between the
two draws.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Use the effective internal format instead for validation

2015-08-14 Thread Mark Janes
This test regresses a large number of tests in piglit and dEQP, for
example:

piglit.spec.oes_texture_float.oes_texture_float.snbgt2m64 (from piglit)
Standard Output

/tmp/build_root/m64/lib/piglit/bin/oes_texture_float -auto -fbo
oes-texture-float: error 0x502 at 
/var/lib/jenkins/jobs/Leeroy/workspace@2/repos/piglit/tests/spec/oes_texture_float/oes_texture_float.c:343
oes-texture-float: error 0x502 at 
/var/lib/jenkins/jobs/Leeroy/workspace@2/repos/piglit/tests/spec/oes_texture_float/oes_texture_float.c:343
oes-texture-float: error 0x502 at 
/var/lib/jenkins/jobs/Leeroy/workspace@2/repos/piglit/tests/spec/oes_texture_float/oes_texture_float.c:343
oes-texture-float: error 0x502 at 
/var/lib/jenkins/jobs/Leeroy/workspace@2/repos/piglit/tests/spec/oes_texture_float/oes_texture_float.c:343
oes-texture-float: error 0x502 at 
/var/lib/jenkins/jobs/Leeroy/workspace@2/repos/piglit/tests/spec/oes_texture_float/oes_texture_float.c:343

Standard Error

Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_RGBA, type 
= GL_FLOAT, effective internalformat = GL_FALSE)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_RGB, type = 
GL_FLOAT, effective internalformat = GL_FALSE)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_ALPHA, type 
= GL_FLOAT, effective internalformat = GL_FALSE)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = GL_LUMINANCE, 
type = GL_FLOAT, effective internalformat = GL_FALSE)
Mesa: User error: GL_INVALID_OPERATION in glTexImage%dD(format = 
GL_LUMINANCE_ALPHA, type = GL_FLOAT, effective internalformat = GL_FALSE)


Eduardo Lima Mitev  writes:

> When validating format+type+internalFormat for texture pixel operations
> on GLES3, the effective internal format should be used if the one
> specified is an unsized internal format. Page 127, section "3.8 Texturing"
> of the GLES 3.0.4 spec says:
>
> "if internalformat is a base internal format, the effective internal
>  format is a sized internal format that is derived from the format and
>  type for internal use by the GL. Table 3.12 specifies the mapping of
>  format and type to effective internal formats. The effective internal
>  format is used by the GL for purposes such as texture completeness or
>  type checks for CopyTex* commands. In these cases, the GL is required
>  to operate as if the effective internal format was used as the
>  internalformat when specifying the texture data."
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91582
> ---
>  src/mesa/main/glformats.c | 48 
> +++
>  src/mesa/main/glformats.h |  4 
>  src/mesa/main/teximage.c  | 27 ++
>  3 files changed, 79 insertions(+)
>
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index 3eb66da..fd83e9c 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -2824,3 +2824,51 @@ _mesa_format_from_format_and_type(GLenum format, 
> GLenum type)
>  */
> unreachable("Unsupported format");
>  }
> +
> +/**
> + * Returns the effective internal format from a texture format and type.
> + * This is used by texture image operations internally for validation, when
> + * the specified internal format is a base (unsized) format.
> + *
> + * \param format the texture format
> + * \param type the texture type
> + */
> +GLenum
> +_mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
> +GLenum type)
> +{
> +  switch (format) {
> +  case GL_RGBA:
> + switch (type) {
> + case GL_UNSIGNED_BYTE:
> +return GL_RGBA8;
> + case GL_UNSIGNED_SHORT_4_4_4_4:
> +return GL_RGBA4;
> + case GL_UNSIGNED_SHORT_5_5_5_1:
> +return GL_RGB5_A1;
> + }
> + break;
> +  case GL_RGB:
> + switch (type) {
> + case GL_UNSIGNED_BYTE:
> +return GL_RGB8;
> + case GL_UNSIGNED_SHORT_5_6_5:
> +return GL_RGB565;
> + }
> + break;
> +  case GL_LUMINANCE_ALPHA:
> + if (type == GL_UNSIGNED_BYTE)
> +return GL_LUMINANCE8_ALPHA8;
> +  case GL_LUMINANCE:
> + if (type == GL_UNSIGNED_BYTE)
> +return GL_LUMINANCE8;
> +  case GL_ALPHA:
> + if (type == GL_UNSIGNED_BYTE)
> +return GL_ALPHA8;
> +  default:
> + /* fall through and return NONE */
> + break;
> +  }
> +
> +  return GL_NONE;
> +}
> diff --git a/src/mesa/main/glformats.h b/src/mesa/main/glformats.h
> index 419955a..0686c71 100644
> --- a/src/mesa/main/glformats.h
> +++ b/src/mesa/main/glformats.h
> @@ -135,6 +135,10 @@ _mesa_es3_error_check_format_and_type(const struct 
> gl_context *ctx,
>  extern uint32_t
>  _mesa_format_from_format_and_type(GLenum format, GLenum type);
>  
> +extern GLenum
> +_mesa_es3_effective_internal_format_for_format_and_type(GLenum format,
> +GLenum type);
> +
>  #ifdef __cplusplus
> 

Re: [Mesa-dev] [PATCH] meta: Update comment about unsupported texture types.

2015-08-14 Thread Anuj Phogat
On Fri, Aug 14, 2015 at 11:19 AM, Matt Turner  wrote:
> Ken added support for 2DArray (commit ec23d5197e) and 1DArray (commit
> 14ca61125) last year.
> ---
>  src/mesa/drivers/common/meta_generate_mipmap.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c 
> b/src/mesa/drivers/common/meta_generate_mipmap.c
> index 0655f05..a4103b9 100644
> --- a/src/mesa/drivers/common/meta_generate_mipmap.c
> +++ b/src/mesa/drivers/common/meta_generate_mipmap.c
> @@ -150,8 +150,7 @@ prepare_mipmap_level(struct gl_context *ctx,
>
>  /**
>   * Called via ctx->Driver.GenerateMipmap()
> - * Note: We don't yet support 3D textures, 1D/2D array textures or texture
> - * borders.
> + * Note: We don't yet support 3D textures, or texture borders.
>   */
>  void
>  _mesa_meta_GenerateMipmap(struct gl_context *ctx, GLenum target,
> --
> 2.4.6
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reviewed-by: Anuj Phogat 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] i965/gen9: Reuse YF alignment tables in tr_mode_..._texture_alignment()

2015-08-14 Thread Anuj Phogat
On Fri, Aug 14, 2015 at 11:03 AM, Emil Velikov  wrote:
> Hi Anuj,
>
> On 13 August 2015 at 22:51, Anuj Phogat  wrote:
> ...
>> +   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YS)
>> +  ret_align *= multiplier;
> Out of curiosity, have you noticed if the compiler is clever enough to
> optimise the above multiplication to a shift ?
>
I think compiler will take care of optimizing it. But I haven't seen the
generated machine code.

> Thanks
> Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] mesa: validate size parameters for glTexStorage*Multisample

2015-08-14 Thread Timothy Arceri
On Fri, 2015-08-14 at 10:43 -0700, Anuj Phogat wrote:
> On Fri, Aug 14, 2015 at 10:30 AM, Anuj Phogat  wrote:
> > On Mon, Aug 10, 2015 at 1:06 AM, Tapani Pälli  
> > wrote:
> > > Signed-off-by: Tapani Pälli 
> > > ---
> > >  src/mesa/main/teximage.c | 28 
> > >  1 file changed, 28 insertions(+)
> > > 
> > > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
> > > index fc69387..3ea7b2a 100644
> > > --- a/src/mesa/main/teximage.c
> > > +++ b/src/mesa/main/teximage.c
> > > @@ -5782,6 +5782,18 @@ _mesa_TexImage3DMultisample(GLenum target, 
> > > GLsizei samples,
> > > "glTexImage3DMultisample");
> > >  }
> > > 
> > > +static bool
> > > +valid_texstorage_ms_parameters(GLsizei width, GLsizei height, GLsizei 
> > > depth,
> > > +   GLsizei samples, const char *func)
> > > +{
> > > +   GET_CURRENT_CONTEXT(ctx);
> > > +
> > > +   if (_mesa_tex_storage_invalid_dim(width, height, depth) || samples < 
> > > 1) {
> > > +  _mesa_error(ctx, GL_INVALID_VALUE, "%s()", func);
> > > +  return false;
> > > +   }
> > > +   return true;
> > > +}
> > > 
> > I think samples = 0 is acceptable by the spec. samples < 0 or samples
> > > max_samples is what generate GL_INVALID_VALUE error. samples
> > count check is anyway taken care of by _mesa_check_sample_count()
> > in _mesa_texture_image_multisample(). So, no need of a helper function
> > here. Just do _mesa_valid_tex_storage_dim() checks.
> > 
> Now I noticed that samples = 0 is not acceptable in GLES 3.1 but don't see
> it in GL. Add an ES 3.1 check here.

You seem to have reviewed v1 of the series. Tapani has sent a v2.

The GL 4.5 spec says that samples = 0 is not acceptable for either
*Storage*Multisample or *Image*Multisample. See my comenets on v2 of the
patch.


> 
> > >  void GLAPIENTRY
> > >  _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
> > > @@ -5795,6 +5807,10 @@ _mesa_TexStorage2DMultisample(GLenum target, 
> > > GLsizei samples,
> > > if (!texObj)
> > >return;
> > > 
> > > +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
> > > +   "glTexStorage2DMultisample"))
> > > +  return;
> > > +
> > > _mesa_texture_image_multisample(ctx, 2, texObj, target, samples,
> > > internalformat, width, height, 1,
> > > fixedsamplelocations, GL_TRUE,
> > > @@ -5814,6 +5830,10 @@ _mesa_TexStorage3DMultisample(GLenum target, 
> > > GLsizei samples,
> > > if (!texObj)
> > >return;
> > > 
> > > +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
> > > +   "glTexStorage3DMultisample"))
> > > +  return;
> > > +
> > > _mesa_texture_image_multisample(ctx, 3, texObj, target, samples,
> > > internalformat, width, height, 
> > > depth,
> > > fixedsamplelocations, GL_TRUE,
> > > @@ -5834,6 +5854,10 @@ _mesa_TextureStorage2DMultisample(GLuint texture, 
> > > GLsizei samples,
> > > if (!texObj)
> > >return;
> > > 
> > > +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
> > > +  
> > >  "glTextureStorage2DMultisample"))
> > > +  return;
> > > +
> > > _mesa_texture_image_multisample(ctx, 2, texObj, texObj->Target, 
> > > samples,
> > > internalformat, width, height, 1,
> > > fixedsamplelocations, GL_TRUE,
> > > @@ -5855,6 +5879,10 @@ _mesa_TextureStorage3DMultisample(GLuint texture, 
> > > GLsizei samples,
> > > if (!texObj)
> > >return;
> > > 
> > > +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
> > > +  
> > >  "glTextureStorage3DMultisample"))
> > > +  return;
> > > +
> > > _mesa_texture_image_multisample(ctx, 3, texObj, texObj->Target, 
> > > samples,
> > > internalformat, width, height, 
> > > depth,
> > > fixedsamplelocations, GL_TRUE,
> > > --
> > > 2.1.0
> > > 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] scons: Always define __STDC_LIMIT_MACROS.

2015-08-14 Thread Vinson Lee
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91591
Signed-off-by: Vinson Lee 
---
 scons/gallium.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scons/gallium.py b/scons/gallium.py
index 51b84d7..46dbf0e 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -300,6 +300,7 @@ def generate(env):
 
 # C preprocessor options
 cppdefines = []
+cppdefines += ['__STDC_LIMIT_MACROS']
 if env['build'] in ('debug', 'checked'):
 cppdefines += ['DEBUG']
 else:
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/12] i965/vec4/nir: simplify glsl_type_for_nir_alu_type()

2015-08-14 Thread Connor Abbott
Less duplication and one less case to handle for doubles.

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 ++--
 1 file changed, 2 insertions(+), 14 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index b13465b..19ae7fd 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -1339,20 +1339,8 @@ const glsl_type *
 glsl_type_for_nir_alu_type(nir_alu_type alu_type,
unsigned components)
 {
-   switch (alu_type) {
-   case nir_type_float:
-  return glsl_type::vec(components);
-   case nir_type_int:
-  return glsl_type::ivec(components);
-   case nir_type_unsigned:
-  return glsl_type::uvec(components);
-   case nir_type_bool:
-  return glsl_type::bvec(components);
-   default:
-  return glsl_type::error_type;
-   }
-
-   return glsl_type::error_type;
+   return glsl_type::get_instance(brw_glsl_base_type_for_nir_type(alu_type),
+  1, components);
 }
 
 void
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/12] glsl: fix ir_constant::equals() for doubles

2015-08-14 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/glsl/ir_equals.cpp | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/glsl/ir_equals.cpp b/src/glsl/ir_equals.cpp
index cc1964e..c6446f7 100644
--- a/src/glsl/ir_equals.cpp
+++ b/src/glsl/ir_equals.cpp
@@ -58,8 +58,13 @@ ir_constant::equals(const ir_instruction *ir, enum 
ir_node_type) const
   return false;
 
for (unsigned i = 0; i < type->components(); i++) {
-  if (value.u[i] != other->value.u[i])
- return false;
+  if (type->base_type == GLSL_TYPE_DOUBLE) {
+ if (value.d[i] != other->value.d[i])
+return false;
+  } else {
+ if (value.u[i] != other->value.u[i])
+return false;
+  }
}
 
return true;
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/12] i965/fs: print non-1 strides when dumping instructions

2015-08-14 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ce1edc3..812648f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4506,6 +4506,8 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
   fprintf(file, "???");
   break;
}
+   if (inst->dst.stride != 1)
+  fprintf(file, "<%u>", inst->dst.stride);
fprintf(file, ":%s, ", brw_reg_type_letters(inst->dst.type));
 
for (int i = 0; i < inst->sources; i++) {
@@ -4605,6 +4607,18 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
  fprintf(file, "|");
 
   if (inst->src[i].file != IMM) {
+ unsigned stride;
+ if (inst->src[i].file == HW_REG) {
+unsigned hstride = inst->src[i].fixed_hw_reg.hstride;
+stride = (hstride == 0 ? 0 : (1 << (hstride - 1)));
+ } else {
+stride = inst->src[i].stride;
+ }
+ if (stride != 1)
+fprintf(file, "<%u>", stride);
+  }
+
+  if (inst->src[i].file != IMM) {
  fprintf(file, ":%s", brw_reg_type_letters(inst->src[i].type));
   }
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/12] i965/fs: respect force_sechalf/force_writemask_all in CSE

2015-08-14 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index c7628dc..44af5f3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -208,6 +208,8 @@ create_copy_instr(const fs_builder &bld, fs_inst *inst, 
fs_reg src, bool negate)
   copy = bld.LOAD_PAYLOAD(inst->dst, payload, sources, header_size);
} else {
   copy = bld.MOV(inst->dst, src);
+  copy->force_sechalf = inst->force_sechalf;
+  copy->force_writemask_all = inst->force_writemask_all;
   copy->src[0].negate = negate;
}
assert(copy->regs_written == written);
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/12] nir/builder: include nir.h

2015-08-14 Thread Connor Abbott
This makes intelligent autocomplete plugins much happier.

Signed-off-by: Connor Abbott 
---
 src/glsl/nir/nir_builder.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/glsl/nir/nir_builder.h b/src/glsl/nir/nir_builder.h
index 9223e83..1745453 100644
--- a/src/glsl/nir/nir_builder.h
+++ b/src/glsl/nir/nir_builder.h
@@ -24,6 +24,8 @@
 #ifndef NIR_BUILDER_H
 #define NIR_BUILDER_H
 
+#include "nir.h"
+
 struct exec_list;
 
 typedef struct nir_builder {
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/12] i965/fs: add stride restrictions for copy propagation

2015-08-14 Thread Connor Abbott
There are various restrictions on what the hstride can be that depend on
the Gen, and now that we're using hstride == 2 for packing/unpacking
doubles, we're going to run into these restrictions a lot more often.
Pull them out into a separate function, and move the one restriction we
checked previously into it.

Signed-off-by: Connor Abbott 
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 58 +-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 5445ad5..b1febb6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -285,6 +285,61 @@ can_change_source_types(fs_inst *inst)
 !inst->src[1].abs && !inst->src[1].negate));
 }
 
+static bool
+can_take_stride(fs_inst *inst, unsigned arg, unsigned stride,
+const brw_device_info *devinfo)
+{
+   if (stride > 4)
+  return false;
+
+   /* 3-source instructions can only be Align16, which restricts what strides
+* they can take. They can only take a stride of 1 (the usual case), or 0
+* with a special "repctrl" bit. But the repctrl bit doesn't work for
+* 64-bit datatypes, so if the source type is 64-bit then only a stride of
+* 1 is allowed. From the Broadwell PRM, Volume 7 "3D Media GPGPU", page
+* 944:
+*
+*This is applicable to 32b datatypes and 16b datatype. 64b datatypes
+*cannot use the replicate control.
+*/
+
+   if (inst->is_3src()) {
+  if (type_sz(inst->src[arg].type) > 4)
+ return stride == 1;
+  else
+ return stride == 1 || stride == 0;
+   }
+
+   /* From the Broadwell PRM, Volume 2a "Command Reference - Instructions",
+* page 391 ("Extended Math Function"):
+*
+* The following restrictions apply for align1 mode: Scalar source is
+* supported. Source and destination horizontal stride must be the
+* same.
+*
+* From the Haswell PRM Volume 2b "Command Reference - Instructions", page
+* 134 ("Extended Math Function"):
+*
+*Scalar source is supported. Source and destination horizontal stride
+*must be 1.
+*
+* and similar language exists for IVB and SNB. Pre-SNB, math instructions
+* are sends, so the sources are moved to MRF's and there are no
+* restrictions.
+*/
+
+   if (inst->is_math()) {
+  if (devinfo->gen == 6 || devinfo->gen == 7) {
+ assert(inst->dst.stride == 1);
+ return stride == 1 || stride == 0;
+  } else if (devinfo->gen >= 8) {
+ return stride == inst->dst.stride || stride == 0;
+  }
+   }
+
+   return true;
+}
+
 bool
 fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
 {
@@ -336,7 +391,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
acp_entry *entry)
/* Bail if the result of composing both strides would exceed the
 * hardware limit.
 */
-   if (entry->src.stride * inst->src[arg].stride > 4)
+   if (!can_take_stride(inst, arg, entry->src.stride * inst->src[arg].stride,
+devinfo))
   return false;
 
/* Bail if the instruction type is larger than the execution type of the
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/12] i965/fs: print writemask_all when it's enabled

2015-08-14 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 386e9a2..5474eac 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4641,6 +4641,9 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
  fprintf(file, "1sthalf ");
}
 
+   if (inst->force_writemask_all)
+  fprintf(file, "WE_all ");
+
fprintf(file, "\n");
 }
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/12] nir/builder: only read meaningful channels in nir_swizzle()

2015-08-14 Thread Connor Abbott
This way the caller doesn't have to initialize all 4 channels when they
aren't using them.

Signed-off-by: Connor Abbott 
---
 src/glsl/nir/nir_builder.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_builder.h b/src/glsl/nir/nir_builder.h
index 1745453..9592b8a 100644
--- a/src/glsl/nir/nir_builder.h
+++ b/src/glsl/nir/nir_builder.h
@@ -235,7 +235,7 @@ nir_swizzle(nir_builder *build, nir_ssa_def *src, unsigned 
swiz[4],
 {
nir_alu_src alu_src = { NIR_SRC_INIT };
alu_src.src = nir_src_for_ssa(src);
-   for (int i = 0; i < 4; i++)
+   for (int i = 0; i < num_components; i++)
   alu_src.swizzle[i] = swiz[i];
 
return use_fmov ? nir_fmov_alu(build, alu_src, num_components) :
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/12] nir: fix constant folding of bfi

2015-08-14 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/glsl/nir/nir_opcodes.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_opcodes.py b/src/glsl/nir/nir_opcodes.py
index df5b7e2..77c766b 100644
--- a/src/glsl/nir/nir_opcodes.py
+++ b/src/glsl/nir/nir_opcodes.py
@@ -510,7 +510,7 @@ opcode("bcsel", 0, tunsigned, [0, 0, 0],
   [tbool, tunsigned, tunsigned], "", "src0 ? src1 : src2")
 
 triop("bfi", tunsigned, """
-unsigned mask = src0, insert = src1 & mask, base = src2;
+unsigned mask = src0, insert = src1, base = src2;
 if (mask == 0) {
dst = base;
 } else {
@@ -519,7 +519,7 @@ if (mask == 0) {
   tmp >>= 1;
   insert <<= 1;
}
-   dst = (base & ~mask) | insert;
+   dst = (base & ~mask) | (insert & mask);
 }
 """)
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/12] glsl: fix isinf() for doubles

2015-08-14 Thread Connor Abbott
Signed-off-by: Connor Abbott 
---
 src/glsl/builtin_functions.cpp | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index 2175c66..ac55170 100644
--- a/src/glsl/builtin_functions.cpp
+++ b/src/glsl/builtin_functions.cpp
@@ -3258,7 +3258,16 @@ builtin_builder::_isinf(builtin_available_predicate 
avail, const glsl_type *type
 
ir_constant_data infinities;
for (int i = 0; i < type->vector_elements; i++) {
-  infinities.f[i] = INFINITY;
+  switch (type->base_type) {
+  case GLSL_TYPE_FLOAT:
+ infinities.f[i] = INFINITY;
+ break;
+  case GLSL_TYPE_DOUBLE:
+ infinities.d[i] = INFINITY;
+ break;
+  default:
+ unreachable("unknown type");
+  }
}
 
body.emit(ret(equal(abs(x), imm(type, infinities;
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/12] i965/fs: make SIMD-splitting respect the original stride/offset

2015-08-14 Thread Connor Abbott
In some cases, we need to emit ALU instructions with a certain stride
due to a HW limitation. When splitting that instruction, we need to
respect the original stride when creating the temporaries we load from
and store into. Otherwise, we'll reintroduce the problem we were trying
to work around.

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 812648f..386e9a2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2370,12 +2370,14 @@ fs_visitor::opt_register_renaming()
 
   if (depth == 0 &&
   inst->dst.file == GRF &&
-  alloc.sizes[inst->dst.reg] == inst->exec_size / 8 &&
+  alloc.sizes[inst->dst.reg] ==
+inst->dst.component_size(inst->exec_size) &&
   !inst->is_partial_write()) {
  if (remap[dst] == -1) {
 remap[dst] = dst;
  } else {
-remap[dst] = alloc.allocate(inst->exec_size / 8);
+remap[dst] =
+   alloc.allocate(inst->dst.component_size(inst->exec_size));
 inst->dst.reg = remap[dst];
 progress = true;
  }
@@ -4334,6 +4336,8 @@ fs_visitor::lower_simd_width()
* temporary passed as source to the lowered instruction.
*/
   split_inst.src[j] = lbld.vgrf(inst->src[j].type, src_size);
+  split_inst.src[j].subreg_offset = inst->src[j].subreg_offset;
+  split_inst.src[j].stride = inst->src[j].stride;
   emit_transpose(lbld.group(copy_width, 0),
  split_inst.src[j], &src, 1, src_size, n);
}
@@ -4343,8 +4347,10 @@ fs_visitor::lower_simd_width()
/* Allocate enough space to hold the result of the lowered
 * instruction and fix up the number of registers written.
 */
-   split_inst.dst = dsts[i] =
-  lbld.vgrf(inst->dst.type, dst_size);
+   fs_reg dst = lbld.vgrf(inst->dst.type, dst_size);
+   dst.stride = inst->dst.stride;
+   dst.subreg_offset = inst->dst.subreg_offset;
+   split_inst.dst = dsts[i] = dst;
split_inst.regs_written =
   DIV_ROUND_UP(inst->regs_written * lower_width,
inst->exec_size);
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/12] i965/fs: don't propagate cmod when the exec sizes differ

2015-08-14 Thread Connor Abbott
This can happen when the source of the compare was split by the SIMD
lowering pass. Potentially, we could allow the case where the exec size
of scan_inst is larger, and scan_inst has the right quarter selected,
but doing that seems a little more risky.

Signed-off-by: Connor Abbott 
---
 src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
index 469f2ea..e444390 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
@@ -94,6 +94,9 @@ opt_cmod_propagation_local(bblock_t *block)
 scan_inst->dst.reg_offset != inst->src[0].reg_offset)
break;
 
+if (scan_inst->exec_size != inst->exec_size)
+   break;
+
 /* CMP's result is the same regardless of dest type. */
 if (inst->conditional_mod == BRW_CONDITIONAL_NZ &&
 scan_inst->opcode == BRW_OPCODE_CMP &&
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/12] Random fp64 fixes

2015-08-14 Thread Connor Abbott
While working on fp64 support for i965, I noticed a bunch of things that
were broken that weren't directly related to fp64 or weren't directly
related to i965. To help reduce the number of outstanding patches, I've
pulled out some of the ones that I think can land right now. So, here's
the result of that -- a random smattering of fixes and improvements here
and there that I did either to help debugging or to fix some of the fp64
piglit tests.

Connor Abbott (12):
  nir/builder: include nir.h
  nir/builder: only read meaningful channels in nir_swizzle()
  nir: fix constant folding of bfi
  i965/fs: print non-1 strides when dumping instructions
  glsl: fix isinf() for doubles
  glsl: fix ir_constant::equals() for doubles
  i965/fs: add stride restrictions for copy propagation
  i965/vec4/nir: simplify glsl_type_for_nir_alu_type()
  i965/fs: make SIMD-splitting respect the original stride/offset
  i965/fs: respect force_sechalf/force_writemask_all in CSE
  i965/fs: don't propagate cmod when the exec sizes differ
  i965/fs: print writemask_all when it's enabled

 src/glsl/builtin_functions.cpp | 11 +++-
 src/glsl/ir_equals.cpp |  9 +++-
 src/glsl/nir/nir_builder.h |  4 +-
 src/glsl/nir/nir_opcodes.py|  4 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 31 ++--
 .../drivers/dri/i965/brw_fs_cmod_propagation.cpp   |  3 ++
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   | 58 +-
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  2 +
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 +-
 9 files changed, 113 insertions(+), 25 deletions(-)

-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] mesa: validate size parameters for glTexStorage*Multisample

2015-08-14 Thread Anuj Phogat
On Fri, Aug 14, 2015 at 3:21 PM, Timothy Arceri  wrote:
> On Fri, 2015-08-14 at 10:43 -0700, Anuj Phogat wrote:
>> On Fri, Aug 14, 2015 at 10:30 AM, Anuj Phogat  wrote:
>> > On Mon, Aug 10, 2015 at 1:06 AM, Tapani Pälli 
>> > wrote:
>> > > Signed-off-by: Tapani Pälli 
>> > > ---
>> > >  src/mesa/main/teximage.c | 28 
>> > >  1 file changed, 28 insertions(+)
>> > >
>> > > diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
>> > > index fc69387..3ea7b2a 100644
>> > > --- a/src/mesa/main/teximage.c
>> > > +++ b/src/mesa/main/teximage.c
>> > > @@ -5782,6 +5782,18 @@ _mesa_TexImage3DMultisample(GLenum target,
>> > > GLsizei samples,
>> > > "glTexImage3DMultisample");
>> > >  }
>> > >
>> > > +static bool
>> > > +valid_texstorage_ms_parameters(GLsizei width, GLsizei height, GLsizei
>> > > depth,
>> > > +   GLsizei samples, const char *func)
>> > > +{
>> > > +   GET_CURRENT_CONTEXT(ctx);
>> > > +
>> > > +   if (_mesa_tex_storage_invalid_dim(width, height, depth) || samples <
>> > > 1) {
>> > > +  _mesa_error(ctx, GL_INVALID_VALUE, "%s()", func);
>> > > +  return false;
>> > > +   }
>> > > +   return true;
>> > > +}
>> > >
>> > I think samples = 0 is acceptable by the spec. samples < 0 or samples
>> > > max_samples is what generate GL_INVALID_VALUE error. samples
>> > count check is anyway taken care of by _mesa_check_sample_count()
>> > in _mesa_texture_image_multisample(). So, no need of a helper function
>> > here. Just do _mesa_valid_tex_storage_dim() checks.
>> >
>> Now I noticed that samples = 0 is not acceptable in GLES 3.1 but don't see
>> it in GL. Add an ES 3.1 check here.
>
> You seem to have reviewed v1 of the series. Tapani has sent a v2.
>
Right :(. Thanks for the update.

> The GL 4.5 spec says that samples = 0 is not acceptable for either
> *Storage*Multisample or *Image*Multisample. See my comenets on v2 of the
> patch.
I found the spec reference.

>
>
>>
>> > >  void GLAPIENTRY
>> > >  _mesa_TexStorage2DMultisample(GLenum target, GLsizei samples,
>> > > @@ -5795,6 +5807,10 @@ _mesa_TexStorage2DMultisample(GLenum target,
>> > > GLsizei samples,
>> > > if (!texObj)
>> > >return;
>> > >
>> > > +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
>> > > +   "glTexStorage2DMultisample"))
>> > > +  return;
>> > > +
>> > > _mesa_texture_image_multisample(ctx, 2, texObj, target, samples,
>> > > internalformat, width, height, 1,
>> > > fixedsamplelocations, GL_TRUE,
>> > > @@ -5814,6 +5830,10 @@ _mesa_TexStorage3DMultisample(GLenum target,
>> > > GLsizei samples,
>> > > if (!texObj)
>> > >return;
>> > >
>> > > +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
>> > > +   "glTexStorage3DMultisample"))
>> > > +  return;
>> > > +
>> > > _mesa_texture_image_multisample(ctx, 3, texObj, target, samples,
>> > > internalformat, width, height,
>> > > depth,
>> > > fixedsamplelocations, GL_TRUE,
>> > > @@ -5834,6 +5854,10 @@ _mesa_TextureStorage2DMultisample(GLuint texture,
>> > > GLsizei samples,
>> > > if (!texObj)
>> > >return;
>> > >
>> > > +   if (!valid_texstorage_ms_parameters(width, height, 1, samples,
>> > > +
>> > >  "glTextureStorage2DMultisample"))
>> > > +  return;
>> > > +
>> > > _mesa_texture_image_multisample(ctx, 2, texObj, texObj->Target,
>> > > samples,
>> > > internalformat, width, height, 1,
>> > > fixedsamplelocations, GL_TRUE,
>> > > @@ -5855,6 +5879,10 @@ _mesa_TextureStorage3DMultisample(GLuint texture,
>> > > GLsizei samples,
>> > > if (!texObj)
>> > >return;
>> > >
>> > > +   if (!valid_texstorage_ms_parameters(width, height, depth, samples,
>> > > +
>> > >  "glTextureStorage3DMultisample"))
>> > > +  return;
>> > > +
>> > > _mesa_texture_image_multisample(ctx, 3, texObj, texObj->Target,
>> > > samples,
>> > > internalformat, width, height,
>> > > depth,
>> > > fixedsamplelocations, GL_TRUE,
>> > > --
>> > > 2.1.0
>> > >
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] mesa/formats: only do type and component lookup for uncompressed formats

2015-08-14 Thread Chad Versace
On Tue 11 Aug 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Only uncompressed formats have a non-void type and actual components per 
> pixel.
> Rename _mesa_format_to_type_and_comps to
> _mesa_uncompressed_format_to_type_and_comps and require callers to check if
> the format is not compressed.
> 
> Cc: Jason Ekstrand 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/format_utils.c |  2 +-
>  src/mesa/main/formats.c  | 55 
> 
>  src/mesa/main/formats.h  |  2 +-
>  src/mesa/main/mipmap.c   |  2 +-
>  4 files changed, 12 insertions(+), 49 deletions(-)

I like several patches in this series, but I don't like this one. It
produces a lot of compiler warnings.

The problem is that the switch in _mesa_format_to_type_and_comps() is
intended to be an exhaustive switch. Removing the compressed format
cases causes a waterfall of compiler warnings like this:

  main/formats.c:1458:4: warning: enumeration value 
'MESA_FORMAT_LA_LATC2_UNORM' not handled in switch [-Wswitch]
  main/formats.c:1458:4: warning: enumeration value 
'MESA_FORMAT_LA_LATC2_SNORM' not handled in switch [-Wswitch]
  main/formats.c:1458:4: warning: enumeration value 'MESA_FORMAT_ETC1_RGB8' not 
handled in switch [-Wswitch]
  main/formats.c:1458:4: warning: enumeration value 'MESA_FORMAT_ETC2_RGB8' not 
handled in switch [-Wswitch]
  main/formats.c:1458:4: warning: enumeration value 'MESA_FORMAT_ETC2_SRGB8' 
not handled in switch [-Wswitch]
  main/formats.c:1458:4: warning: enumeration value 
'MESA_FORMAT_ETC2_RGBA8_EAC' not handled in switch [-Wswitch]
  main/formats.c:1458:4: warning: enumeration value 
'MESA_FORMAT_ETC2_SRGB8_ALPHA8_EAC' not handled in switch [-Wswitch]

I think you can salvage the patch by following the hint in this deleted
hunk...

> -   case MESA_FORMAT_RGB_FXT1:
> -   case MESA_FORMAT_RGBA_FXT1:
> -   case MESA_FORMAT_RGB_DXT1:
> -   case MESA_FORMAT_RGBA_DXT1:
> -   case MESA_FORMAT_RGBA_DXT3:
> -   case MESA_FORMAT_RGBA_DXT5:
> -   case MESA_FORMAT_SRGB_DXT1:
> -   case MESA_FORMAT_SRGBA_DXT1:
> -   case MESA_FORMAT_SRGBA_DXT3:
> -   case MESA_FORMAT_SRGBA_DXT5:
> -   case MESA_FORMAT_R_RGTC1_UNORM:
> -   case MESA_FORMAT_R_RGTC1_SNORM:
> -   case MESA_FORMAT_RG_RGTC2_UNORM:
> -   case MESA_FORMAT_RG_RGTC2_SNORM:
> -   case MESA_FORMAT_L_LATC1_UNORM:
> -   case MESA_FORMAT_L_LATC1_SNORM:
> -   case MESA_FORMAT_LA_LATC2_UNORM:
> -   case MESA_FORMAT_LA_LATC2_SNORM:
> -   case MESA_FORMAT_ETC1_RGB8:
> -   case MESA_FORMAT_ETC2_RGB8:
> -   case MESA_FORMAT_ETC2_SRGB8:
> -   case MESA_FORMAT_ETC2_RGBA8_EAC:
> -   case MESA_FORMAT_ETC2_SRGB8_ALPHA8_EAC:
> -   case MESA_FORMAT_ETC2_R11_EAC:
> -   case MESA_FORMAT_ETC2_RG11_EAC:
> -   case MESA_FORMAT_ETC2_SIGNED_R11_EAC:
> -   case MESA_FORMAT_ETC2_SIGNED_RG11_EAC:
> -   case MESA_FORMAT_ETC2_RGB8_PUNCHTHROUGH_ALPHA1:
> -   case MESA_FORMAT_ETC2_SRGB8_PUNCHTHROUGH_ALPHA1:
> -   case MESA_FORMAT_BPTC_RGBA_UNORM:
> -   case MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM:
> -   case MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT:
> -   case MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT:
> -  /* XXX generate error instead? */
> -  *datatype = GL_UNSIGNED_BYTE;
> -  *comps = 0;
> -  return;
> -

In other words, rename the function to
_mesa_uncompressed_format_to_type_and_comps(); keep all the format
cases; and add an assertion or _mesa_problem() for the compressed cases.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/4] mesa/formats: refactor by removing compressed formats

2015-08-14 Thread Chad Versace
On Tue 11 Aug 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> All compressed formats return GL_FALSE. Remove all switch cases for
> compressed formats. Compressed formats should be at the bottom of
> the switch statement, so ordering is still preserved.
> 
> Cc: Jason Ekstrand 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/formats.c | 47 ++-
>  1 file changed, 2 insertions(+), 45 deletions(-)

This patch has the same issue as patch 1/4. It transforms an
intentionaly exhaustive switch to an unintentionally inexhaustive
switch, spewing compiler warnings.

If you want to move all the compressed cases to the bottom of the
switch, I think that's a good idea. But I think this switch should
remain exhaustive.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] scons: Always define __STDC_LIMIT_MACROS.

2015-08-14 Thread Roland Scheidegger
I'm hardly a build expert, but looks good to me.
Reviewed-by: Roland Scheidegger 

Am 15.08.2015 um 00:22 schrieb Vinson Lee:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91591
> Signed-off-by: Vinson Lee 
> ---
>  scons/gallium.py | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/scons/gallium.py b/scons/gallium.py
> index 51b84d7..46dbf0e 100755
> --- a/scons/gallium.py
> +++ b/scons/gallium.py
> @@ -300,6 +300,7 @@ def generate(env):
>  
>  # C preprocessor options
>  cppdefines = []
> +cppdefines += ['__STDC_LIMIT_MACROS']
>  if env['build'] in ('debug', 'checked'):
>  cppdefines += ['DEBUG']
>  else:
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] mesa/formats: add more MESA_FORMAT_LAYOUTs

2015-08-14 Thread Chad Versace
On Wed 12 Aug 2015, Nanley Chery wrote:
> My comment about making ASTC format detection simpler is a little
> subjective. I'm planning to remove it from the final patch if it's
> accepted. What this change allows in general is detecting compressed
> formats belonging to a certain category of compressed formats.
> 
> Regards,
> Nanley

I like the idea of declaring the compression family as a type of texture
layout.

Patch 3/4 is
Reviewed-by: Chad Versace 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] mesa/formats: refactor by globbing on types in switch statement

2015-08-14 Thread Chad Versace
On Tue 11 Aug 2015, Nanley Chery wrote:
> From: Nanley Chery 
> 
> Combine the adjacent cases which have the same GL type in the switch statemnt.
> 
> Signed-off-by: Nanley Chery 
> ---
>  src/mesa/main/formats.c | 152 
> ++--
>  1 file changed, 17 insertions(+), 135 deletions(-)

I checked every line in this patch and found no errors.

Patch 4/4 is
Reviewed-by: Chad Versace 

As an aside... Do you think the format's datatype be stored directly in
the Mesa's format info table? It seems strange that Mesa maintains that
bit of information outside the info table.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 06/12] glsl: fix ir_constant::equals() for doubles

2015-08-14 Thread Timothy Arceri
Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/12] glsl: fix isinf() for doubles

2015-08-14 Thread Timothy Arceri
Reviewed-by: Timothy Arceri 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/8] i965/gen9: Add code to compute yf/ys tile masks

2015-08-14 Thread Anuj Phogat
A later patch in this series uses it to compute tile dimensions.

Cc: Ben Widawsky 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp   |  3 ++-
 src/mesa/drivers/dri/i965/brw_misc_state.c| 10 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 39 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  2 +-
 4 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index cb5ef58..9195514 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -144,7 +144,8 @@ brw_blorp_surface_info::compute_tile_offsets(uint32_t 
*tile_x,
 {
uint32_t mask_x, mask_y;
 
-   intel_miptree_get_tile_masks(mt->tiling, mt->cpp, &mask_x, &mask_y,
+   intel_miptree_get_tile_masks(mt->tiling, mt->tr_mode, mt->cpp,
+&mask_x, &mask_y,
 map_stencil_as_y_tiled);
 
*tile_x = x_offset & mask_x;
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index 246aefb..62bb1e3 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -174,12 +174,14 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
*depth_mt,
uint32_t tile_mask_x = 0, tile_mask_y = 0;
 
if (depth_mt) {
-  intel_miptree_get_tile_masks(depth_mt->tiling, depth_mt->cpp,
-   &tile_mask_x, &tile_mask_y, false);
+  intel_miptree_get_tile_masks(depth_mt->tiling, depth_mt->tr_mode,
+   depth_mt->cpp, &tile_mask_x, &tile_mask_y,
+   false);
 
   if (intel_miptree_level_has_hiz(depth_mt, depth_level)) {
  uint32_t hiz_tile_mask_x, hiz_tile_mask_y;
  intel_miptree_get_tile_masks(depth_mt->hiz_buf->mt->tiling,
+  depth_mt->hiz_buf->mt->tr_mode,
   depth_mt->hiz_buf->mt->cpp,
   &hiz_tile_mask_x, &hiz_tile_mask_y,
   false);
@@ -202,7 +204,9 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
*depth_mt,
  tile_mask_y |= 63;
   } else {
  uint32_t stencil_tile_mask_x, stencil_tile_mask_y;
- intel_miptree_get_tile_masks(stencil_mt->tiling, stencil_mt->cpp,
+ intel_miptree_get_tile_masks(stencil_mt->tiling,
+  stencil_mt->tr_mode,
+  stencil_mt->cpp,
   &stencil_tile_mask_x,
   &stencil_tile_mask_y, false);
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index b4f2bd8..55dc80d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1087,7 +1087,7 @@ intel_miptree_get_image_offset(const struct 
intel_mipmap_tree *mt,
  * untiled, the masks are set to 0.
  */
 void
-intel_miptree_get_tile_masks(uint32_t tiling, uint32_t cpp,
+intel_miptree_get_tile_masks(uint32_t tiling, uint32_t tr_mode, uint32_t cpp,
  uint32_t *mask_x, uint32_t *mask_y,
  bool map_stencil_as_y_tiled)
 {
@@ -1105,8 +1105,38 @@ intel_miptree_get_tile_masks(uint32_t tiling, uint32_t 
cpp,
   *mask_y = 7;
   break;
case I915_TILING_Y:
-  *mask_x = 128 / cpp - 1;
-  *mask_y = 31;
+  if (tr_mode == INTEL_MIPTREE_TRMODE_NONE)
+  {
+ *mask_x = 128 / cpp - 1;
+ *mask_y = 31;
+  } else {
+ uint32_t aspect_ratio = 1;
+ switch (cpp) {
+ case 1:
+*mask_y = 64;
+break;
+ case 2:
+aspect_ratio = 2;
+/* fallthrough */
+ case 4:
+*mask_y = 32;
+break;
+ case 8:
+aspect_ratio = 2;
+/* fallthrough */
+ case 16:
+*mask_y = 16;
+break;
+ default:
+unreachable("not reached");
+ }
+
+ if (tr_mode == INTEL_MIPTREE_TRMODE_YS)
+*mask_y *= 4;
+
+ *mask_x = *mask_y * aspect_ratio - 1;
+ *mask_y -= 1;
+  }
   break;
}
 }
@@ -1173,7 +1203,8 @@ intel_miptree_get_tile_offsets(const struct 
intel_mipmap_tree *mt,
uint32_t x, y;
uint32_t mask_x, mask_y;
 
-   intel_miptree_get_tile_masks(mt->tiling, mt->cpp, &mask_x, &mask_y, false);
+   intel_miptree_get_tile_masks(mt->tiling, mt->tr_mode, mt->cpp,
+&mask_x, &mask_y, false);
intel_miptree_get_image_offset(mt, level, slice, &x, &y);
 
*tile_x = x & mask_x;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index b1617

[Mesa-dev] [PATCH 7/8] i965: Fix {src, dst}_pitch alignment check for XY_SRC_COPY_BLT

2015-08-14 Thread Anuj Phogat
Current code checks the alignment restrictions only for Y tiling.
From Broadwell PRM vol 10:

 "pitch is of 512Byte granularity for Tile-X: This means the tiled-x
  surface pitch can be (512, 1024, 1536, 2048...)/4 (in Dwords)."

This patch adds the restriction for X tiling as well.

Cc: Ben Widawsky 
Cc: 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/intel_blit.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
b/src/mesa/drivers/dri/i965/intel_blit.c
index d15a64d..729ffb0 100644
--- a/src/mesa/drivers/dri/i965/intel_blit.c
+++ b/src/mesa/drivers/dri/i965/intel_blit.c
@@ -531,6 +531,8 @@ intelEmitCopyBlit(struct brw_context *brw,
bool dst_y_tiled = dst_tiling == I915_TILING_Y;
bool src_y_tiled = src_tiling == I915_TILING_Y;
bool use_fast_copy_blit = false;
+   uint32_t src_tile_w, src_tile_h;
+   uint32_t dst_tile_w, dst_tile_h;
 
if ((dst_y_tiled || src_y_tiled) && brw->gen < 6)
   return false;
@@ -559,6 +561,11 @@ intelEmitCopyBlit(struct brw_context *brw,
src_buffer, src_pitch, src_offset, src_x, src_y,
dst_buffer, dst_pitch, dst_offset, dst_x, dst_y, w, h);
 
+   intel_miptree_get_tile_dimensions(src_tiling, src_tr_mode, cpp,
+ &src_tile_w, &src_tile_h);
+   intel_miptree_get_tile_dimensions(dst_tiling, dst_tr_mode, cpp,
+ &dst_tile_w, &dst_tile_h);
+
use_fast_copy_blit = can_fast_copy_blit(brw,
src_buffer,
src_x, src_y,
@@ -597,8 +604,8 @@ intelEmitCopyBlit(struct brw_context *brw,
 cpp, use_fast_copy_blit);
 
} else {
-  assert(!dst_y_tiled || (dst_pitch % 128) == 0);
-  assert(!src_y_tiled || (src_pitch % 128) == 0);
+  assert(src_tiling == I915_TILING_NONE || (src_pitch % src_tile_w) == 0);
+  assert(dst_tiling == I915_TILING_NONE || (dst_pitch % dst_tile_w) == 0);
 
   /* For big formats (such as floating point), do the copy using 16 or
* 32bpp and multiply the coordinates.
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/8] i965: Delete temporary variable 'src_pitch'

2015-08-14 Thread Anuj Phogat
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/intel_blit.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
b/src/mesa/drivers/dri/i965/intel_blit.c
index f8606b8..c177eec 100644
--- a/src/mesa/drivers/dri/i965/intel_blit.c
+++ b/src/mesa/drivers/dri/i965/intel_blit.c
@@ -330,10 +330,6 @@ intel_miptree_blit(struct brw_context *brw,
if (dst_flip)
   dst_y = minify(dst_mt->physical_height0, dst_level - 
dst_mt->first_level) - dst_y - height;
 
-   int src_pitch = src_mt->pitch;
-   if (src_flip != dst_flip)
-  src_pitch = -src_pitch;
-
uint32_t src_image_x, src_image_y, dst_image_x, dst_image_y;
intel_miptree_get_image_offset(src_mt, src_level, src_slice,
   &src_image_x, &src_image_y);
@@ -356,7 +352,7 @@ intel_miptree_blit(struct brw_context *brw,
 
if (!intelEmitCopyBlit(brw,
   src_mt->cpp,
-  src_pitch,
+  src_flip == dst_flip ? src_mt->pitch : 
-src_mt->pitch,
   src_mt->bo, src_mt->offset,
   src_mt->tiling,
   src_mt->tr_mode,
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/8] i965: Add a helper function intel_miptree_get_tile_dimensions()

2015-08-14 Thread Anuj Phogat
Cc: Ben Widawsky 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 67 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h |  5 ++
 2 files changed, 50 insertions(+), 22 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 55dc80d..12f1a4d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -563,35 +563,15 @@ static unsigned long
 intel_get_yf_ys_bo_size(struct intel_mipmap_tree *mt, unsigned *alignment,
 unsigned long *pitch)
 {
-   const uint32_t bpp = mt->cpp * 8;
-   const uint32_t aspect_ratio = (bpp == 16 || bpp == 64) ? 2 : 1;
uint32_t tile_width, tile_height;
unsigned long stride, size, aligned_y;
 
assert(mt->tr_mode != INTEL_MIPTREE_TRMODE_NONE);
-
-   switch (bpp) {
-   case 8:
-  tile_height = 64;
-  break;
-   case 16:
-   case 32:
-  tile_height = 32;
-  break;
-   case 64:
-   case 128:
-  tile_height = 16;
-  break;
-   default:
-  unreachable("not reached");
-   }
-
-   if (mt->tr_mode == INTEL_MIPTREE_TRMODE_YS)
-  tile_height *= 4;
+   intel_miptree_get_tile_dimensions(mt->tiling, mt->tr_mode, mt->cpp,
+ &tile_width, &tile_height);
 
aligned_y = ALIGN(mt->total_height, tile_height);
stride = mt->total_width * mt->cpp;
-   tile_width = tile_height * mt->cpp * aspect_ratio;
stride = ALIGN(stride, tile_width);
size = stride * aligned_y;
 
@@ -1081,6 +1061,49 @@ intel_miptree_get_image_offset(const struct 
intel_mipmap_tree *mt,
*y = mt->level[level].slice[slice].y_offset;
 }
 
+
+/**
+ * This function computes the width and height in bytes of different tiling
+ * patterns. If the BO is untiled, the dimensions are set to cpp.
+ */
+void
+intel_miptree_get_tile_dimensions(uint32_t tiling, uint32_t tr_mode,
+  uint32_t cpp, uint32_t *tile_w,
+  uint32_t *tile_h)
+{
+   /* Y tiled surfaces with TRMODE_NONE allows non power of 2 cpp. Using
+* intel_miptree_get_tile_masks() in those cases will give incorrect
+* tile dimensions. So handle the TRMODE_NONE here.
+*/
+   if (tr_mode == INTEL_MIPTREE_TRMODE_NONE) {
+  switch (tiling) {
+  case I915_TILING_X:
+ *tile_w = 512;
+ *tile_h = 8 * cpp;
+ break;
+  case I915_TILING_Y:
+ *tile_w = 128;
+ *tile_h = 32 * cpp;
+ break;
+  case I915_TILING_NONE:
+ *tile_w = cpp;
+ *tile_h = cpp;
+ break;
+  default:
+ unreachable("not reached");
+  }
+   } else {
+  uint32_t mask_x, mask_y;
+  assert(_mesa_is_pow_two(cpp));
+
+  intel_miptree_get_tile_masks(tiling, tr_mode, cpp,
+   &mask_x, &mask_y, false);
+  *tile_w = (mask_x + 1) * cpp;
+  *tile_h = (mask_y + 1) * cpp;
+   }
+}
+
+
 /**
  * This function computes masks that may be used to select the bits of the X
  * and Y coordinates that indicate the offset within a tile.  If the BO is
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index fc68146..54cce1b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -622,6 +622,11 @@ intel_miptree_get_dimensions_for_image(struct 
gl_texture_image *image,
int *width, int *height, int *depth);
 
 void
+intel_miptree_get_tile_dimensions(uint32_t tiling, uint32_t tr_mode,
+  uint32_t cpp, uint32_t *tile_w,
+  uint32_t *tile_h);
+
+void
 intel_miptree_get_tile_masks(uint32_t tiling, uint32_t tr_mode, uint32_t cpp,
  uint32_t *mask_x, uint32_t *mask_y,
  bool map_stencil_as_y_tiled);
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/8] i965: Change the parameters passed to intel_miptree_get_tile_masks()

2015-08-14 Thread Anuj Phogat
This change is required by the later patches.

Cc: Ben Widawsky 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp   | 3 ++-
 src/mesa/drivers/dri/i965/brw_misc_state.c| 8 +---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 ++-
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +-
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index eac1f00..cb5ef58 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -144,7 +144,8 @@ brw_blorp_surface_info::compute_tile_offsets(uint32_t 
*tile_x,
 {
uint32_t mask_x, mask_y;
 
-   intel_miptree_get_tile_masks(mt, &mask_x, &mask_y, map_stencil_as_y_tiled);
+   intel_miptree_get_tile_masks(mt->tiling, mt->cpp, &mask_x, &mask_y,
+map_stencil_as_y_tiled);
 
*tile_x = x_offset & mask_x;
*tile_y = y_offset & mask_y;
diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c 
b/src/mesa/drivers/dri/i965/brw_misc_state.c
index e9d9467..246aefb 100644
--- a/src/mesa/drivers/dri/i965/brw_misc_state.c
+++ b/src/mesa/drivers/dri/i965/brw_misc_state.c
@@ -174,11 +174,13 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
*depth_mt,
uint32_t tile_mask_x = 0, tile_mask_y = 0;
 
if (depth_mt) {
-  intel_miptree_get_tile_masks(depth_mt, &tile_mask_x, &tile_mask_y, 
false);
+  intel_miptree_get_tile_masks(depth_mt->tiling, depth_mt->cpp,
+   &tile_mask_x, &tile_mask_y, false);
 
   if (intel_miptree_level_has_hiz(depth_mt, depth_level)) {
  uint32_t hiz_tile_mask_x, hiz_tile_mask_y;
- intel_miptree_get_tile_masks(depth_mt->hiz_buf->mt,
+ intel_miptree_get_tile_masks(depth_mt->hiz_buf->mt->tiling,
+  depth_mt->hiz_buf->mt->cpp,
   &hiz_tile_mask_x, &hiz_tile_mask_y,
   false);
 
@@ -200,7 +202,7 @@ brw_get_depthstencil_tile_masks(struct intel_mipmap_tree 
*depth_mt,
  tile_mask_y |= 63;
   } else {
  uint32_t stencil_tile_mask_x, stencil_tile_mask_y;
- intel_miptree_get_tile_masks(stencil_mt,
+ intel_miptree_get_tile_masks(stencil_mt->tiling, stencil_mt->cpp,
   &stencil_tile_mask_x,
   &stencil_tile_mask_y, false);
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index e85c3f0..b4f2bd8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1087,13 +1087,10 @@ intel_miptree_get_image_offset(const struct 
intel_mipmap_tree *mt,
  * untiled, the masks are set to 0.
  */
 void
-intel_miptree_get_tile_masks(const struct intel_mipmap_tree *mt,
+intel_miptree_get_tile_masks(uint32_t tiling, uint32_t cpp,
  uint32_t *mask_x, uint32_t *mask_y,
  bool map_stencil_as_y_tiled)
 {
-   int cpp = mt->cpp;
-   uint32_t tiling = mt->tiling;
-
if (map_stencil_as_y_tiled)
   tiling = I915_TILING_Y;
 
@@ -1176,7 +1173,7 @@ intel_miptree_get_tile_offsets(const struct 
intel_mipmap_tree *mt,
uint32_t x, y;
uint32_t mask_x, mask_y;
 
-   intel_miptree_get_tile_masks(mt, &mask_x, &mask_y, false);
+   intel_miptree_get_tile_masks(mt->tiling, mt->cpp, &mask_x, &mask_y, false);
intel_miptree_get_image_offset(mt, level, slice, &x, &y);
 
*tile_x = x & mask_x;
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index 790d312..b1617a2 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -622,7 +622,7 @@ intel_miptree_get_dimensions_for_image(struct 
gl_texture_image *image,
int *width, int *height, int *depth);
 
 void
-intel_miptree_get_tile_masks(const struct intel_mipmap_tree *mt,
+intel_miptree_get_tile_masks(uint32_t tiling, uint32_t cpp,
  uint32_t *mask_x, uint32_t *mask_y,
  bool map_stencil_as_y_tiled);
 
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/8] i965: Move conversion of {src, dst}_pitch to dwords outside if/else

2015-08-14 Thread Anuj Phogat
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/intel_blit.c | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
b/src/mesa/drivers/dri/i965/intel_blit.c
index c177eec..d15a64d 100644
--- a/src/mesa/drivers/dri/i965/intel_blit.c
+++ b/src/mesa/drivers/dri/i965/intel_blit.c
@@ -596,15 +596,6 @@ intelEmitCopyBlit(struct brw_context *brw,
 dst_tiling, dst_tr_mode,
 cpp, use_fast_copy_blit);
 
-  /* For tiled source and destination, pitch value should be specified
-   * as a number of Dwords.
-   */
-  if (dst_tiling != I915_TILING_NONE)
- dst_pitch /= 4;
-
-  if (src_tiling != I915_TILING_NONE)
- src_pitch /= 4;
-
} else {
   assert(!dst_y_tiled || (dst_pitch % 128) == 0);
   assert(!src_y_tiled || (src_pitch % 128) == 0);
@@ -645,17 +636,19 @@ intelEmitCopyBlit(struct brw_context *brw,
   CMD = xy_blit_cmd(src_tiling, src_tr_mode,
 dst_tiling, dst_tr_mode,
 cpp, use_fast_copy_blit);
+   }
 
-  if (dst_tiling != I915_TILING_NONE)
- dst_pitch /= 4;
+   /* For tiled source and destination, pitch value should be specified
+* as a number of Dwords.
+*/
+   if (dst_tiling != I915_TILING_NONE)
+  dst_pitch /= 4;
 
-  if (src_tiling != I915_TILING_NONE)
- src_pitch /= 4;
-   }
+   if (src_tiling != I915_TILING_NONE)
+  src_pitch /= 4;
 
-   if (dst_y2 <= dst_y || dst_x2 <= dst_x) {
+   if (dst_y2 <= dst_y || dst_x2 <= dst_x)
   return true;
-   }
 
assert(dst_x < dst_x2);
assert(dst_y < dst_y2);
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/8] i965: Use helper function intel_miptree_get_tile_dimensions() in surface setup

2015-08-14 Thread Anuj Phogat
It takes care of using the correct tile width if we later use other tiling
patterns (e.g. Yf) for aux miptree.

Cc: Ben Widawsky 
Signed-off-by: Anuj Phogat 
---
 src/mesa/drivers/dri/i965/gen8_surface_state.c | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
b/src/mesa/drivers/dri/i965/gen8_surface_state.c
index 6c4d3e1..354d79f 100644
--- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
@@ -262,8 +262,13 @@ gen8_emit_texture_surface_state(struct brw_context *brw,
}
 
if (aux_mt) {
+  uint32_t tile_w, tile_h;
+  assert(aux_mt->tiling == I915_TILING_Y);
+  intel_miptree_get_tile_dimensions(aux_mt->tiling, aux_mt->tr_mode,
+aux_mt->cpp, &tile_w, &tile_h);
   surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
-SET_FIELD((aux_mt->pitch / 128) - 1, GEN8_SURFACE_AUX_PITCH) |
+SET_FIELD((aux_mt->pitch / tile_w) - 1,
+  GEN8_SURFACE_AUX_PITCH) |
 aux_mode;
} else {
   surf[6] = 0;
@@ -487,8 +492,13 @@ gen8_update_renderbuffer_surface(struct brw_context *brw,
}
 
if (aux_mt) {
+  uint32_t tile_w, tile_h;
+  assert(aux_mt->tiling == I915_TILING_Y);
+  intel_miptree_get_tile_dimensions(aux_mt->tiling, aux_mt->tr_mode,
+aux_mt->cpp, &tile_w, &tile_h);
   surf[6] = SET_FIELD(mt->qpitch / 4, GEN8_SURFACE_AUX_QPITCH) |
-SET_FIELD((aux_mt->pitch / 128) - 1, GEN8_SURFACE_AUX_PITCH) |
+SET_FIELD((aux_mt->pitch / tile_w) - 1,
+  GEN8_SURFACE_AUX_PITCH) |
 aux_mode;
} else {
   surf[6] = 0;
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 8/8] i965/gen9: Fix {src, dst}_pitch alignment check for XY_FAST_COPY_BLT

2015-08-14 Thread Anuj Phogat
I misinterpreted the alignmnet restriction in XY_FAST_COPY_BLT earlier.
Instead of checking pitch for 64KB alignmnet we need to check it for
tile widh alignment.

Signed-off-by: Anuj Phogat 
Cc: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/intel_blit.c | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_blit.c 
b/src/mesa/drivers/dri/i965/intel_blit.c
index 729ffb0..b85fd3d 100644
--- a/src/mesa/drivers/dri/i965/intel_blit.c
+++ b/src/mesa/drivers/dri/i965/intel_blit.c
@@ -448,14 +448,6 @@ can_fast_copy_blit(struct brw_context *brw,
(dst_tiling_none && dst_pitch % 16 != 0))
   return false;
 
-   /* For Tiled surfaces, the pitch has to be a multiple of the Tile width
-* (X direction width of the Tile). This means the pitch value will
-* always be Cache Line aligned (64byte multiple).
-*/
-   if ((!dst_tiling_none && dst_pitch % 64 != 0) ||
-   (!src_tiling_none && src_pitch % 64 != 0))
-  return false;
-
return true;
 }
 
@@ -566,6 +558,13 @@ intelEmitCopyBlit(struct brw_context *brw,
intel_miptree_get_tile_dimensions(dst_tiling, dst_tr_mode, cpp,
  &dst_tile_w, &dst_tile_h);
 
+   /* For Tiled surfaces, the pitch has to be a multiple of the Tile width
+* (X direction width of the Tile). This is ensured while allocating the
+* buffer object.
+*/
+   assert(src_tiling == I915_TILING_NONE || (src_pitch % src_tile_w) == 0);
+   assert(dst_tiling == I915_TILING_NONE || (dst_pitch % dst_tile_w) == 0);
+
use_fast_copy_blit = can_fast_copy_blit(brw,
src_buffer,
src_x, src_y,
@@ -604,9 +603,6 @@ intelEmitCopyBlit(struct brw_context *brw,
 cpp, use_fast_copy_blit);
 
} else {
-  assert(src_tiling == I915_TILING_NONE || (src_pitch % src_tile_w) == 0);
-  assert(dst_tiling == I915_TILING_NONE || (dst_pitch % dst_tile_w) == 0);
-
   /* For big formats (such as floating point), do the copy using 16 or
* 32bpp and multiply the coordinates.
*/
-- 
2.4.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/12] nir/builder: only read meaningful channels in nir_swizzle()

2015-08-14 Thread Timothy Arceri
On Fri, 2015-08-14 at 15:30 -0700, Connor Abbott wrote:
> This way the caller doesn't have to initialize all 4 channels when they
> aren't using them.
> 
> Signed-off-by: Connor Abbott 
> ---
>  src/glsl/nir/nir_builder.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/glsl/nir/nir_builder.h b/src/glsl/nir/nir_builder.h
> index 1745453..9592b8a 100644
> --- a/src/glsl/nir/nir_builder.h
> +++ b/src/glsl/nir/nir_builder.h
> @@ -235,7 +235,7 @@ nir_swizzle(nir_builder *build, nir_ssa_def *src, 
> unsigned swiz[4],
>  {
> nir_alu_src alu_src = { NIR_SRC_INIT };
> alu_src.src = nir_src_for_ssa(src);
> -   for (int i = 0; i < 4; i++)
> +   for (int i = 0; i < num_components; i++)


Any reason you can't do this in nir_ssa_for_src() also?

>alu_src.swizzle[i] = swiz[i];
>  
> return use_fmov ? nir_fmov_alu(build, alu_src, num_components) :
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/12] i965/vec4/nir: simplify glsl_type_for_nir_alu_type()

2015-08-14 Thread Timothy Arceri
On Fri, 2015-08-14 at 15:30 -0700, Connor Abbott wrote:
> Less duplication and one less case to handle for doubles.
> 
> Signed-off-by: Connor Abbott 
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 ++--
>  1 file changed, 2 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index b13465b..19ae7fd 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1339,20 +1339,8 @@ const glsl_type *
>  glsl_type_for_nir_alu_type(nir_alu_type alu_type,
> unsigned components)
>  {
> -   switch (alu_type) {
> -   case nir_type_float:
> -  return glsl_type::vec(components);
> -   case nir_type_int:
> -  return glsl_type::ivec(components);
> -   case nir_type_unsigned:
> -  return glsl_type::uvec(components);
> -   case nir_type_bool:
> -  return glsl_type::bvec(components);
> -   default:
> -  return glsl_type::error_type;
> -   }
> -
> -   return glsl_type::error_type;
> +   return 
> glsl_type::get_instance(brw_glsl_base_type_for_nir_type(alu_type),
> +  1, components);

The switch above handles nir_type_bool using brw_glsl_base_type_for_nir_type()
you would hit unreachable("bad type") for nir_type_bool.

>  }
>  
>  void
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: Update todo regarding StencilOp and StencilOpSeparate.

2015-08-14 Thread Rhys Kidd
On 8 August 2015 at 15:09, Rhys Kidd  wrote:

> OpenGL 2.0 function StencilOp() is in part internally implemented via
> StencilOpSeparate(). This change happened some time ago, however the
> accompanying doxygen todo comment was not accordingly updated.
>
> Replace the outdated portion of this doxygen todo comment, leaving the
> remainder
> unchanged.
>
> Also better respect the 80 character suggested line length in this file.
>
> Signed-off-by: Rhys Kidd 
> ---
>  src/mesa/main/stencil.c | 17 -
>  1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/src/mesa/main/stencil.c b/src/mesa/main/stencil.c
> index 2a19a17..385a6d0 100644
> --- a/src/mesa/main/stencil.c
> +++ b/src/mesa/main/stencil.c
> @@ -124,8 +124,8 @@ _mesa_ClearStencil( GLint s )
>   * \sa glStencilFunc().
>   *
>   * Verifies the parameters and updates the respective values in
> - * __struct gl_contextRec::Stencil. On change flushes the vertices and
> notifies the
> - * driver via the dd_function_table::StencilFunc callback.
> + * __struct gl_contextRec::Stencil. On change flushes the vertices and
> notifies
> + * the driver via the dd_function_table::StencilFunc callback.
>   */
>  void GLAPIENTRY
>  _mesa_StencilFuncSeparateATI( GLenum frontfunc, GLenum backfunc, GLint
> ref, GLuint mask )
> @@ -178,8 +178,8 @@ _mesa_StencilFuncSeparateATI( GLenum frontfunc, GLenum
> backfunc, GLint ref, GLui
>   * \sa glStencilFunc().
>   *
>   * Verifies the parameters and updates the respective values in
> - * __struct gl_contextRec::Stencil. On change flushes the vertices and
> notifies the
> - * driver via the dd_function_table::StencilFunc callback.
> + * __struct gl_contextRec::Stencil. On change flushes the vertices and
> notifies
> + * the driver via the dd_function_table::StencilFunc callback.
>   */
>  void GLAPIENTRY
>  _mesa_StencilFunc( GLenum func, GLint ref, GLuint mask )
> @@ -298,8 +298,8 @@ _mesa_StencilMask( GLuint mask )
>   * \sa glStencilOp().
>   *
>   * Verifies the parameters and updates the respective fields in
> - * __struct gl_contextRec::Stencil. On change flushes the vertices and
> notifies the
> - * driver via the dd_function_table::StencilOp callback.
> + * __struct gl_contextRec::Stencil. On change flushes the vertices and
> notifies
> + * the driver via the dd_function_table::StencilOp callback.
>   */
>  void GLAPIENTRY
>  _mesa_StencilOp(GLenum fail, GLenum zfail, GLenum zpass)
> @@ -391,9 +391,8 @@ _mesa_ActiveStencilFaceEXT(GLenum face)
>
>  /**
>   * OpenGL 2.0 function.
> - * \todo Make StencilOp() call this function.  And eventually remove the
> - * ctx->Driver.StencilOp function and use ctx->Driver.StencilOpSeparate
> - * instead.
> + * \todo Eventually remove the ctx->Driver.StencilOp function and use
> + * ctx->Driver.StencilOpSeparate instead.
>   */
>  void GLAPIENTRY
>  _mesa_StencilOpSeparate(GLenum face, GLenum sfail, GLenum zfail, GLenum
> zpass)
> --
> 2.1.4
>
>
A friendly ping to the mailing list. Is there a taker for code review and
subsequent push to master?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir: Add a glsl_uint_type() wrapper.

2015-08-14 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_types.cpp | 6 ++
 src/glsl/nir/nir_types.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/glsl/nir/nir_types.cpp b/src/glsl/nir/nir_types.cpp
index 62176f5..940c676 100644
--- a/src/glsl/nir/nir_types.cpp
+++ b/src/glsl/nir/nir_types.cpp
@@ -155,6 +155,12 @@ glsl_vec4_type(void)
 }
 
 const glsl_type *
+glsl_uint_type(void)
+{
+   return glsl_type::uint_type;
+}
+
+const glsl_type *
 glsl_array_type(const glsl_type *base, unsigned elements)
 {
return glsl_type::get_array_instance(base, elements);
diff --git a/src/glsl/nir/nir_types.h b/src/glsl/nir/nir_types.h
index 276d4ad..a8ff8f2 100644
--- a/src/glsl/nir/nir_types.h
+++ b/src/glsl/nir/nir_types.h
@@ -71,6 +71,7 @@ bool glsl_type_is_matrix(const struct glsl_type *type);
 const struct glsl_type *glsl_void_type(void);
 const struct glsl_type *glsl_float_type(void);
 const struct glsl_type *glsl_vec4_type(void);
+const struct glsl_type *glsl_uint_type(void);
 const struct glsl_type *glsl_array_type(const struct glsl_type *base,
 unsigned elements);
 
-- 
2.4.6

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev