[Mesa-dev] [Bug 85586] New: Draw module crashes in LLVM generated code since commit 60ec95fa1e0c42bd42358185970b20c9b81591fa

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=85586

Bug ID: 85586
   Summary: Draw module crashes in LLVM generated code since
commit 60ec95fa1e0c42bd42358185970b20c9b81591fa
   Product: Mesa
   Version: git
  Hardware: Other
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: mic...@daenzer.net

URL:   
http://cgit.freedesktop.org/mesa/mesa/commit/?id=60ec95fa1e0c42bd42358185970b20c9b81591fa

Author: Neil Roberts 
Date:   Tue Sep 23 19:01:04 2014 +0100

mesa: Add support for the GL_KHR_context_flush_control extension

Since this commit, some piglit tests crash for me in the LLVM generated code
called from draw_pt_fetch_shade_pipeline_llvm.c:370:

   if (fetch_info->linear)
  clipped = fpme->current_variant->jit_func( &fpme->llvm->jit_context,
   llvm_vert_info.verts,
   draw->pt.user.vbuffer,
   fetch_info->start,
   fetch_info->count,
   fpme->vertex_size,
   draw->pt.vertex_buffer,
   draw->instance_id,
   draw->start_index,
   draw->start_instance);

I can avoid the crashes with the environment variable DRAW_USE_LLVM=0.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 85586] Draw module crashes in LLVM generated code since commit 60ec95fa1e0c42bd42358185970b20c9b81591fa

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=85586

--- Comment #1 from Michel Dänzer  ---
Using current LLVM 3.6 Git snapshot.



[Mesa-dev] [Bug 84570] Borderlands 2/Pre-Sequel: Constant frame rate drops while playing; really bad with additional lighting

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84570

--- Comment #31 from Michel Dänzer  ---
(In reply to Kai from comment #30)
> Michel, is there any chance attachment 107544 [details] [review] will be
> part of 3.18?

No, but it's in Alex's queue for 3.19.



[Mesa-dev] [PATCH] radeon/llvm: Dynamically allocate branch/loop stack arrays

2014-10-29 Thread Michel Dänzer
From: Michel Dänzer 

This prevents us from silently overflowing the stack arrays, and allows
arbitrary stack depths.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454

Reported-and-Tested-by: Nick Sarnie 
Signed-off-by: Michel Dänzer 
---
 src/gallium/drivers/radeon/radeon_llvm.h   | 10 ---
 .../drivers/radeon/radeon_setup_tgsi_llvm.c| 33 --
 2 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_llvm.h 
b/src/gallium/drivers/radeon/radeon_llvm.h
index 00714fb..8612ef8 100644
--- a/src/gallium/drivers/radeon/radeon_llvm.h
+++ b/src/gallium/drivers/radeon/radeon_llvm.h
@@ -33,10 +33,10 @@
 
 #define RADEON_LLVM_MAX_INPUTS 32 * 4
 #define RADEON_LLVM_MAX_OUTPUTS 32 * 4
-#define RADEON_LLVM_MAX_BRANCH_DEPTH 16
-#define RADEON_LLVM_MAX_LOOP_DEPTH 16
 #define RADEON_LLVM_MAX_ARRAYS 16
 
+#define RADEON_LLVM_INITIAL_CF_DEPTH 4
+
 #define RADEON_LLVM_MAX_SYSTEM_VALUES 4
 
 struct radeon_llvm_branch {
@@ -122,11 +122,13 @@ struct radeon_llvm_context {
 
/*=== Private Members ===*/
 
-   struct radeon_llvm_branch branch[RADEON_LLVM_MAX_BRANCH_DEPTH];
-   struct radeon_llvm_loop loop[RADEON_LLVM_MAX_LOOP_DEPTH];
+   struct radeon_llvm_branch *branch;
+   struct radeon_llvm_loop *loop;
 
unsigned branch_depth;
+   unsigned branch_depth_max;
unsigned loop_depth;
+   unsigned loop_depth_max;
 
struct tgsi_declaration_range arrays[RADEON_LLVM_MAX_ARRAYS];
unsigned num_arrays;
diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
index 2fa23ed..c30a9d0 100644
--- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
+++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
@@ -446,7 +446,19 @@ static void bgnloop_emit(
endloop_block, "LOOP");
LLVMBuildBr(gallivm->builder, loop_block);
LLVMPositionBuilderAtEnd(gallivm->builder, loop_block);
-   ctx->loop_depth++;
+
+   if (++ctx->loop_depth > ctx->loop_depth_max) {
+   unsigned new_max = ctx->loop_depth_max << 1;
+
+   if (!new_max)
+   new_max = RADEON_LLVM_INITIAL_CF_DEPTH;
+
+   ctx->loop = REALLOC(ctx->loop, ctx->loop_depth_max *
+   sizeof(ctx->loop[0]),
+   new_max * sizeof(ctx->loop[0]));
+   ctx->loop_depth_max = new_max;
+   }
+
ctx->loop[ctx->loop_depth - 1].loop_block = loop_block;
ctx->loop[ctx->loop_depth - 1].endloop_block = endloop_block;
 }
@@ -577,7 +589,18 @@ static void if_cond_emit(
LLVMBuildCondBr(gallivm->builder, cond, if_block, else_block);
LLVMPositionBuilderAtEnd(gallivm->builder, if_block);
 
-   ctx->branch_depth++;
+   if (++ctx->branch_depth > ctx->branch_depth_max) {
+   unsigned new_max = ctx->branch_depth_max << 1;
+
+   if (!new_max)
+   new_max = RADEON_LLVM_INITIAL_CF_DEPTH;
+
+   ctx->branch = REALLOC(ctx->branch, ctx->branch_depth_max *
+ sizeof(ctx->branch[0]),
+ new_max * sizeof(ctx->branch[0]));
+   ctx->branch_depth_max = new_max;
+   }
+
ctx->branch[ctx->branch_depth - 1].endif_block = endif_block;
ctx->branch[ctx->branch_depth - 1].if_block = if_block;
ctx->branch[ctx->branch_depth - 1].else_block = else_block;
@@ -1440,4 +1463,10 @@ void radeon_llvm_dispose(struct radeon_llvm_context * 
ctx)
LLVMContextDispose(ctx->soa.bld_base.base.gallivm->context);
FREE(ctx->temps);
ctx->temps = NULL;
+   FREE(ctx->loop);
+   ctx->loop = NULL;
+   ctx->loop_depth_max = 0;
+   FREE(ctx->branch);
+   ctx->branch = NULL;
+   ctx->branch_depth_max = 0;
 }
-- 
2.1.1
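
Editorial sketch (not part of the patch): the growth policy used above, a stack whose capacity starts at a small initial value and doubles whenever it is full, can be illustrated in isolation. The names `cf_stack`, `loop_entry`, `cf_stack_push` and `cf_stack_free` are hypothetical stand-ins, and plain `realloc`/`free` are assumed in place of Mesa's `REALLOC`/`FREE` macros. Unlike the patch, this sketch also checks the allocation result.

```c
#include <stdlib.h>

#define INITIAL_CF_DEPTH 4  /* mirrors RADEON_LLVM_INITIAL_CF_DEPTH */

/* Hypothetical stand-in for struct radeon_llvm_loop. */
struct loop_entry {
   int loop_block;
   int endloop_block;
};

struct cf_stack {
   struct loop_entry *entries;
   unsigned depth;      /* number of live entries */
   unsigned depth_max;  /* allocated capacity, 0 until the first push */
};

/* Push one entry, doubling the capacity when the stack is full.
 * Returns 0 on success, -1 if the allocation fails (stack unchanged). */
static int cf_stack_push(struct cf_stack *s, struct loop_entry e)
{
   if (s->depth == s->depth_max) {
      unsigned new_max = s->depth_max ? s->depth_max * 2 : INITIAL_CF_DEPTH;
      struct loop_entry *p = realloc(s->entries, new_max * sizeof(*p));

      if (!p)
         return -1;
      s->entries = p;
      s->depth_max = new_max;
   }
   s->entries[s->depth++] = e;
   return 0;
}

/* Release the storage and reset the bookkeeping, as the dispose hunk does. */
static void cf_stack_free(struct cf_stack *s)
{
   free(s->entries);
   s->entries = NULL;
   s->depth = 0;
   s->depth_max = 0;
}
```

Pushing 20 entries onto a zero-initialised `struct cf_stack` grows the capacity 4, 8, 16, 32, so the number of reallocations stays logarithmic in the nesting depth.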



[Mesa-dev] [PATCH] glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present.

2014-10-29 Thread Kenneth Graunke
v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event
rather than gettimeofday(), which gives us the presentation time
instead of the time when SwapBuffers was called.  Suggested by
Keith Packard.  This relies on the fact that the X Present
implementation uses microseconds for UST.

Signed-off-by: Kenneth Graunke 
Cc: Keith Packard 
Cc: Marek Olšák 
---
 src/glx/dri3_glx.c  | 33 -
 src/glx/dri3_priv.h |  6 +-
 2 files changed, 37 insertions(+), 2 deletions(-)

Is this what you had in mind, Keith?  It seems to work fine as well,
and as long as we can rely on UST being in microseconds, it definitely
seems nicer.

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index e8e5c4a..ff9c2f3 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -361,12 +361,34 @@ dri3_create_drawable(struct glx_screen *base, XID 
xDrawable,
return &pdraw->base;
 }
 
+static void
+show_fps(struct dri3_drawable *draw)
+{
+   const int interval =
+  ((struct dri3_screen *) draw->base.psc)->show_fps_interval;
+
+   draw->frames++;
+
+   /* The Present extension uses microseconds for UST. */
+   if (draw->previous_ust + interval * 1000000 <= draw->ust) {
+  if (draw->previous_ust) {
+ fprintf(stderr, "libGL: FPS = %.1f\n",
+ ((uint64_t)draw->frames * 1000000) /
+ (double)(draw->ust - draw->previous_ust));
+  }
+  draw->frames = 0;
+  draw->previous_ust = draw->ust;
+   }
+}
+
 /*
  * Process one Present event
  */
 static void
 dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_t *ge)
 {
+   struct dri3_screen *psc = (struct dri3_screen *) priv->base.psc;
+
switch (ge->evtype) {
case XCB_PRESENT_CONFIGURE_NOTIFY: {
   xcb_present_configure_notify_event_t *ce = (void *) ge;
@@ -400,6 +422,10 @@ dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_
   }
   priv->ust = ce->ust;
   priv->msc = ce->msc;
+
+  if (psc->show_fps_interval) {
+ show_fps(priv);
+  }
   break;
}
case XCB_PRESENT_EVENT_IDLE_NOTIFY: {
@@ -1830,7 +1856,7 @@ dri3_create_screen(int screen, struct glx_display * priv)
struct dri3_screen *psc;
__GLXDRIscreen *psp;
struct glx_config *configs = NULL, *visuals = NULL;
-   char *driverName, *deviceName;
+   char *driverName, *deviceName, *tmp;
int i;
 
psc = calloc(1, sizeof *psc);
@@ -1969,6 +1995,11 @@ dri3_create_screen(int screen, struct glx_display * priv)
free(driverName);
free(deviceName);
 
+   tmp = getenv("LIBGL_SHOW_FPS");
+   psc->show_fps_interval = tmp ? atoi(tmp) : 0;
+   if (psc->show_fps_interval < 0)
+  psc->show_fps_interval = 0;
+
return &psc->base;
 
 handle_error:
diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index bdfe224..8e46640 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -138,7 +138,7 @@ struct dri3_screen {
int fd;
int is_different_gpu;
 
-   Bool show_fps;
+   int show_fps_interval;
 };
 
 struct dri3_context
@@ -198,6 +198,10 @@ struct dri3_drawable {
xcb_present_event_t eid;
xcb_gcontext_t gc;
xcb_special_event_t *special_event;
+
+   /* LIBGL_SHOW_FPS support */
+   uint64_t previous_ust;
+   unsigned frames;
 };
 
 
-- 
2.1.2
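
Editorial sketch (not part of the patch): the FPS computation described in the v2 note can be tried outside of GLX. It assumes, as the patch does, that UST values are in microseconds and that the interval is given in whole seconds. `fps_counter` and `fps_counter_frame` are hypothetical names; instead of printing, the function returns the average so it can be checked.

```c
#include <stdint.h>

#define UST_PER_SECOND 1000000ull  /* assumption: UST is in microseconds */

/* Hypothetical per-drawable counter state. */
struct fps_counter {
   uint64_t previous_ust;  /* UST at the start of the window, 0 if none yet */
   unsigned frames;        /* frames presented since previous_ust */
};

/* Account for one frame presented at time `ust`.
 * Returns the average FPS once at least `interval` seconds have passed
 * since the window started, or -1.0 when there is nothing to report. */
static double fps_counter_frame(struct fps_counter *c, uint64_t ust,
                                unsigned interval)
{
   double fps = -1.0;

   c->frames++;

   if (c->previous_ust + (uint64_t)interval * UST_PER_SECOND <= ust) {
      /* The very first window has no start time, so it is not reported. */
      if (c->previous_ust)
         fps = (double)c->frames * UST_PER_SECOND /
               (double)(ust - c->previous_ust);
      c->frames = 0;
      c->previous_ust = ust;
   }
   return fps;
}
```

Feeding it one frame every 20000 microseconds with an interval of 1 second yields a report of 50.0 after each full second, since 50 frames fall into each window.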



[Mesa-dev] [PATCH] glsl: Improve the CSE pass debugging output.

2014-10-29 Thread Kenneth Graunke
The CSE pass now prints out why it thinks a value is not a candidate for
adding to the AE set.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/opt_cse.cpp | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp
index 9c96835..b0b67f4 100644
--- a/src/glsl/opt_cse.cpp
+++ b/src/glsl/opt_cse.cpp
@@ -194,6 +194,8 @@ is_cse_candidate_visitor::visit(ir_dereference_variable *ir)
if (ir->var->data.read_only) {
   return visit_continue;
} else {
+  if (debug)
+ printf("CSE: non-candidate: var %s is not read only\n", 
ir->var->name);
   ok = false;
   return visit_stop;
}
@@ -220,8 +222,11 @@ is_cse_candidate(ir_rvalue *ir)
/* Our temporary variable assignment generation isn't ready to handle
 * anything bigger than a vector.
 */
-   if (!ir->type->is_vector() && !ir->type->is_scalar())
+   if (!ir->type->is_vector() && !ir->type->is_scalar()) {
+  if (debug)
+ printf("CSE: non-candidate: not a vector/scalar\n");
   return false;
+   }
 
/* Only handle expressions and textures currently.  We may want to extend
 * to variable-index array dereferences at some point.
@@ -231,6 +236,8 @@ is_cse_candidate(ir_rvalue *ir)
case ir_type_texture:
   break;
default:
+  if (debug)
+ printf("CSE: non-candidate: not an expression/texture\n");
   return false;
}
 
-- 
2.1.2



Re: [Mesa-dev] [PATCH] radeon/llvm: Dynamically allocate branch/loop stack arrays

2014-10-29 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Wed, Oct 29, 2014 at 8:58 AM, Michel Dänzer  wrote:
> From: Michel Dänzer 
>
> This prevents us from silently overflowing the stack arrays, and allows
> arbitrary stack depths.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454
>
> Reported-and-Tested-by: Nick Sarnie 
> Signed-off-by: Michel Dänzer 
> ---
>  src/gallium/drivers/radeon/radeon_llvm.h   | 10 ---
>  .../drivers/radeon/radeon_setup_tgsi_llvm.c| 33 
> --
>  2 files changed, 37 insertions(+), 6 deletions(-)
>
> diff --git a/src/gallium/drivers/radeon/radeon_llvm.h 
> b/src/gallium/drivers/radeon/radeon_llvm.h
> index 00714fb..8612ef8 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm.h
> +++ b/src/gallium/drivers/radeon/radeon_llvm.h
> @@ -33,10 +33,10 @@
>
>  #define RADEON_LLVM_MAX_INPUTS 32 * 4
>  #define RADEON_LLVM_MAX_OUTPUTS 32 * 4
> -#define RADEON_LLVM_MAX_BRANCH_DEPTH 16
> -#define RADEON_LLVM_MAX_LOOP_DEPTH 16
>  #define RADEON_LLVM_MAX_ARRAYS 16
>
> +#define RADEON_LLVM_INITIAL_CF_DEPTH 4
> +
>  #define RADEON_LLVM_MAX_SYSTEM_VALUES 4
>
>  struct radeon_llvm_branch {
> @@ -122,11 +122,13 @@ struct radeon_llvm_context {
>
> /*=== Private Members ===*/
>
> -   struct radeon_llvm_branch branch[RADEON_LLVM_MAX_BRANCH_DEPTH];
> -   struct radeon_llvm_loop loop[RADEON_LLVM_MAX_LOOP_DEPTH];
> +   struct radeon_llvm_branch *branch;
> +   struct radeon_llvm_loop *loop;
>
> unsigned branch_depth;
> +   unsigned branch_depth_max;
> unsigned loop_depth;
> +   unsigned loop_depth_max;
>
> struct tgsi_declaration_range arrays[RADEON_LLVM_MAX_ARRAYS];
> unsigned num_arrays;
> diff --git a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c 
> b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> index 2fa23ed..c30a9d0 100644
> --- a/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> +++ b/src/gallium/drivers/radeon/radeon_setup_tgsi_llvm.c
> @@ -446,7 +446,19 @@ static void bgnloop_emit(
> endloop_block, "LOOP");
> LLVMBuildBr(gallivm->builder, loop_block);
> LLVMPositionBuilderAtEnd(gallivm->builder, loop_block);
> -   ctx->loop_depth++;
> +
> +   if (++ctx->loop_depth > ctx->loop_depth_max) {
> +   unsigned new_max = ctx->loop_depth_max << 1;
> +
> +   if (!new_max)
> +   new_max = RADEON_LLVM_INITIAL_CF_DEPTH;
> +
> +   ctx->loop = REALLOC(ctx->loop, ctx->loop_depth_max *
> +   sizeof(ctx->loop[0]),
> +   new_max * sizeof(ctx->loop[0]));
> +   ctx->loop_depth_max = new_max;
> +   }
> +
> ctx->loop[ctx->loop_depth - 1].loop_block = loop_block;
> ctx->loop[ctx->loop_depth - 1].endloop_block = endloop_block;
>  }
> @@ -577,7 +589,18 @@ static void if_cond_emit(
> LLVMBuildCondBr(gallivm->builder, cond, if_block, else_block);
> LLVMPositionBuilderAtEnd(gallivm->builder, if_block);
>
> -   ctx->branch_depth++;
> +   if (++ctx->branch_depth > ctx->branch_depth_max) {
> +   unsigned new_max = ctx->branch_depth_max << 1;
> +
> +   if (!new_max)
> +   new_max = RADEON_LLVM_INITIAL_CF_DEPTH;
> +
> +   ctx->branch = REALLOC(ctx->branch, ctx->branch_depth_max *
> + sizeof(ctx->branch[0]),
> + new_max * sizeof(ctx->branch[0]));
> +   ctx->branch_depth_max = new_max;
> +   }
> +
> ctx->branch[ctx->branch_depth - 1].endif_block = endif_block;
> ctx->branch[ctx->branch_depth - 1].if_block = if_block;
> ctx->branch[ctx->branch_depth - 1].else_block = else_block;
> @@ -1440,4 +1463,10 @@ void radeon_llvm_dispose(struct radeon_llvm_context * 
> ctx)
> LLVMContextDispose(ctx->soa.bld_base.base.gallivm->context);
> FREE(ctx->temps);
> ctx->temps = NULL;
> +   FREE(ctx->loop);
> +   ctx->loop = NULL;
> +   ctx->loop_depth_max = 0;
> +   FREE(ctx->branch);
> +   ctx->branch = NULL;
> +   ctx->branch_depth_max = 0;
>  }
> --
> 2.1.1
>


[Mesa-dev] [PATCH 1/3] egl: rework handling EGL_CONTEXT_FLAGS for ES debug contexts

2014-10-29 Thread Matthew Waters
From: Matthew Waters 

As of version 15 of the EGL_KHR_create_context spec, debug contexts
are allowed for ES contexts.  We should allow creation instead of
erroring.

Signed-off-by: Matthew Waters 
---
 src/egl/main/eglcontext.c  | 51 ++
 src/mesa/drivers/dri/common/dri_util.c | 17 
 2 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/src/egl/main/eglcontext.c b/src/egl/main/eglcontext.c
index 514b91a..ab50fe7 100644
--- a/src/egl/main/eglcontext.c
+++ b/src/egl/main/eglcontext.c
@@ -121,12 +121,51 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay 
*dpy,
 
  /* The EGL_KHR_create_context spec says:
   *
-  * "Flags are only defined for OpenGL context creation, and
-  * specifying a flags value other than zero for other types of
-  * contexts, including OpenGL ES contexts, will generate an
-  * error."
+  * "If the EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR flag bit is set in
+  * EGL_CONTEXT_FLAGS_KHR, then a <debug context> will be created.
+  * [...]
+  * In some cases a debug context may be identical to a non-debug
+  * context. This bit is supported for OpenGL and OpenGL ES
+  * contexts."
+  */
+ if (api != EGL_OPENGL_API && api != EGL_OPENGL_ES_API
+&& (val & EGL_CONTEXT_OPENGL_DEBUG_BIT_KHR)) {
+err = EGL_BAD_ATTRIBUTE;
+break;
+ }
+
+ /* The EGL_KHR_create_context spec says:
+  *
+  * "If the EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR flag bit
+  * is set in EGL_CONTEXT_FLAGS_KHR, then a <forward-compatible>
+  * context will be created. Forward-compatible contexts are
+  * defined only for OpenGL versions 3.0 and later. They must not
+  * support functionality marked as <deprecated> by that version of
+  * the API, while a non-forward-compatible context must support
+  * all functionality in that version, deprecated or not. This bit
+  * is supported for OpenGL contexts, and requesting a
+  * forward-compatible context for OpenGL versions less than 3.0
+  * will generate an error."
+  */
+ if ((val & EGL_CONTEXT_OPENGL_FORWARD_COMPATIBLE_BIT_KHR)
+&& (api != EGL_OPENGL_API || ctx->ClientMajorVersion < 3)) {
+err = EGL_BAD_ATTRIBUTE;
+break;
+ }
+
+ /* The EGL_KHR_create_context_spec says:
+  *
+  * "If the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR bit is set in
+  * EGL_CONTEXT_FLAGS_KHR, then a context supporting <robust buffer
+  * access> will be created. Robust buffer access is defined in the
+  * GL_ARB_robustness extension specification, and the resulting
+  * context must also support either the GL_ARB_robustness
+  * extension, or a version of OpenGL incorporating equivalent
+  * functionality. This bit is supported for OpenGL contexts.
   */
- if (api != EGL_OPENGL_API && val != 0) {
+ if ((val & EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR)
+&& (api != EGL_OPENGL_API
+|| !dpy->Extensions.EXT_create_context_robustness)) {
 err = EGL_BAD_ATTRIBUTE;
 break;
  }
@@ -194,7 +233,7 @@ _eglParseContextAttribList(_EGLContext *ctx, _EGLDisplay 
*dpy,
 break;
  }
 
- ctx->Flags = EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR;
+ ctx->Flags |= EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR;
  break;
 
   default:
diff --git a/src/mesa/drivers/dri/common/dri_util.c 
b/src/mesa/drivers/dri/common/dri_util.c
index 6c78928..7a953ba 100644
--- a/src/mesa/drivers/dri/common/dri_util.c
+++ b/src/mesa/drivers/dri/common/dri_util.c
@@ -376,23 +376,6 @@ driCreateContextAttribs(__DRIscreen *screen, int api,
return NULL;
 }
 
-/* The EGL_KHR_create_context spec says:
- *
- * "Flags are only defined for OpenGL context creation, and specifying
- * a flags value other than zero for other types of contexts,
- * including OpenGL ES contexts, will generate an error."
- *
- * The GLX_EXT_create_context_es2_profile specification doesn't say
- * anything specific about this case.  However, none of the known flags
- * have any meaning in an ES context, so this seems safe.
- */
-if (mesa_api != API_OPENGL_COMPAT
-&& mesa_api != API_OPENGL_CORE
-&& flags != 0) {
-   *error = __DRI_CTX_ERROR_BAD_FLAG;
-   return NULL;
-}
-
 /* There are no forward-compatible contexts before OpenGL 3.0.  The
  * GLX_ARB_create_context spec says:
  *
-- 
2.1.2
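
Editorial sketch (not part of the patch): the three per-flag checks introduced above can be read as one predicate. To keep it compilable without EGL headers, the sketch defines local stand-ins; the bit values follow EGL_KHR_create_context, while `client_api`, the `CTX_*` names and `context_flags_valid` are hypothetical. Like the patch, it does not reject unknown bits.

```c
#include <stdbool.h>

/* Local stand-ins for the EGL_CONTEXT_OPENGL_*_BIT_KHR flag bits. */
#define CTX_DEBUG_BIT               0x1u
#define CTX_FORWARD_COMPATIBLE_BIT  0x2u
#define CTX_ROBUST_ACCESS_BIT       0x4u

/* Hypothetical stand-in for the bound client API. */
enum client_api { API_OPENGL, API_OPENGL_ES, API_OTHER };

/* Returns true when `flags` is acceptable for the given API and version;
 * false corresponds to the EGL_BAD_ATTRIBUTE paths in the patch. */
static bool context_flags_valid(enum client_api api, int major_version,
                                bool have_robustness, unsigned flags)
{
   /* Debug contexts are allowed for OpenGL and OpenGL ES. */
   if ((flags & CTX_DEBUG_BIT) &&
       api != API_OPENGL && api != API_OPENGL_ES)
      return false;

   /* Forward-compatible contexts exist only for OpenGL 3.0 and later. */
   if ((flags & CTX_FORWARD_COMPATIBLE_BIT) &&
       (api != API_OPENGL || major_version < 3))
      return false;

   /* Robust buffer access needs OpenGL and the robustness extension. */
   if ((flags & CTX_ROBUST_ACCESS_BIT) &&
       (api != API_OPENGL || !have_robustness))
      return false;

   return true;
}
```

With this predicate, a debug flag on an ES 2 context is accepted, while a forward-compatible flag on the same context is still rejected.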



[Mesa-dev] [PATCH 2/3] glapi: add function pointers for KHR_debug for gles

2014-10-29 Thread Matthew Waters
From: Matthew Waters 

Signed-off-by: Matthew Waters 
---
 src/mapi/glapi/gen/KHR_debug.xml| 73 +
 src/mesa/main/extensions.c  |  2 +-
 src/mesa/main/tests/dispatch_sanity.cpp | 25 +++
 3 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/src/mapi/glapi/gen/KHR_debug.xml b/src/mapi/glapi/gen/KHR_debug.xml
index 48f7fa7..a5c826c 100644
--- a/src/mapi/glapi/gen/KHR_debug.xml
+++ b/src/mapi/glapi/gen/KHR_debug.xml
@@ -145,6 +145,79 @@
 
   
 
+  
+  
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+  
+
+  
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+  
+
+  
+
+  
+
+
+
+
+  
+
+  
+
+
+
+
+
+  
+
+  
+
+
+
+  
+
+  
+
+
+
+
+  
+
 
 
 
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index 0df04c2..01c3247 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -319,7 +319,7 @@ static const struct extension extension_table[] = {
{ "GL_OES_vertex_array_object", o(dummy_true),  
 ES1 | ES2, 2010 },
 
/* KHR extensions */
-   { "GL_KHR_debug",   o(dummy_true),  
GL, 2012 },
+   { "GL_KHR_debug",   o(dummy_true),  
GL | ES1 | ES2, 2012 },
{ "GL_KHR_context_flush_control",   o(dummy_true),  
GL   | ES2, 2014 },
 
/* Vendor extensions */
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 03428dd..6735422 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -1149,6 +1149,19 @@ const struct function gles11_functions_possible[] = {
{ "glUnmapBufferOES", 11, -1 },
{ "glVertexPointer", 11, _gloffset_VertexPointer },
{ "glViewport", 11, _gloffset_Viewport },
+
+   /* GL_KHR_debug */
+   { "glPushDebugGroupKHR", 20, -1 },
+   { "glPopDebugGroupKHR", 20, -1 },
+   { "glDebugMessageCallbackKHR", 20, -1 },
+   { "glDebugMessageControlKHR", 20, -1 },
+   { "glDebugMessageInsertKHR", 20, -1 },
+   { "glGetDebugMessageLogKHR", 20, -1 },
+   { "glGetObjectLabelKHR", 20, -1 },
+   { "glGetObjectPtrLabelKHR", 20, -1 },
+   { "glObjectLabelKHR", 20, -1 },
+   { "glObjectPtrLabelKHR", 20, -1 },
+
{ NULL, 0, -1 }
 };
 
@@ -1372,6 +1385,18 @@ const struct function gles2_functions_possible[] = {
{ "glEndPerfQueryINTEL", 20, -1 },
{ "glGetPerfQueryDataINTEL", 20, -1 },
 
+   /* GL_KHR_debug */
+   { "glPushDebugGroupKHR", 20, -1 },
+   { "glPopDebugGroupKHR", 20, -1 },
+   { "glDebugMessageCallbackKHR", 20, -1 },
+   { "glDebugMessageControlKHR", 20, -1 },
+   { "glDebugMessageInsertKHR", 20, -1 },
+   { "glGetDebugMessageLogKHR", 20, -1 },
+   { "glGetObjectLabelKHR", 20, -1 },
+   { "glGetObjectPtrLabelKHR", 20, -1 },
+   { "glObjectLabelKHR", 20, -1 },
+   { "glObjectPtrLabelKHR", 20, -1 },
+
{ NULL, 0, -1 }
 };
 
-- 
2.1.2



[Mesa-dev] [PATCH 3/3] main/get: make KHR_debug enums available everywhere

2014-10-29 Thread Matthew Waters
From: Matthew Waters 

Although GL_CONTEXT_FLAGS is not explicitly added by KHR_debug,
it contains,

"It is implementation defined how much debug output is generated if
the context was created without the CONTEXT_DEBUG_BIT set. This is a new
query bit added to the existing GL_CONTEXT_FLAGS state to specify whether
the context was created with debug enabled."

implying the GL_CONTEXT_FLAGS parameter is supported whenever KHR_debug
is also supported.

Signed-off-by: Matthew Waters 
---
 src/mesa/main/get_hash_params.py | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index a931d9d..2b0f1e3 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -124,6 +124,18 @@ descriptor=[
 
 # GL_EXT_texture_filter_anisotropic
   [ "MAX_TEXTURE_MAX_ANISOTROPY_EXT", 
"CONTEXT_FLOAT(Const.MaxTextureMaxAnisotropy), 
extra_EXT_texture_filter_anisotropic" ],
+
+# GL_KHR_debug (GL 4.3)/ GL_ARB_debug_output
+  [ "DEBUG_LOGGED_MESSAGES", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
+  [ "DEBUG_NEXT_LOGGED_MESSAGE_LENGTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
+  [ "MAX_DEBUG_LOGGED_MESSAGES", "CONST(MAX_DEBUG_LOGGED_MESSAGES), NO_EXTRA" 
],
+  [ "MAX_DEBUG_MESSAGE_LENGTH", "CONST(MAX_DEBUG_MESSAGE_LENGTH), NO_EXTRA" ],
+  [ "MAX_LABEL_LENGTH", "CONST(MAX_LABEL_LENGTH), NO_EXTRA" ],
+  [ "MAX_DEBUG_GROUP_STACK_DEPTH", "CONST(MAX_DEBUG_GROUP_STACK_DEPTH), 
NO_EXTRA" ],
+  [ "DEBUG_GROUP_STACK_DEPTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
+
+# GL 3.0 / KHR_debug
+  [ "CONTEXT_FLAGS", "CONTEXT_INT(Const.ContextFlags), NO_EXTRA" ],
 ]},
 
 # Enums in OpenGL and GLES1
@@ -699,9 +711,6 @@ descriptor=[
 # GL_ARB_sampler_objects / GL 3.3
   [ "SAMPLER_BINDING", "LOC_CUSTOM, TYPE_INT, GL_SAMPLER_BINDING, NO_EXTRA" ],
 
-# GL 3.0
-  [ "CONTEXT_FLAGS", "CONTEXT_INT(Const.ContextFlags), extra_version_30" ],
-
 # GL3.0 / GL_EXT_framebuffer_sRGB
   [ "FRAMEBUFFER_SRGB_EXT", "CONTEXT_BOOL(Color.sRGBEnabled), 
extra_EXT_framebuffer_sRGB" ],
   [ "FRAMEBUFFER_SRGB_CAPABLE_EXT", "BUFFER_INT(Visual.sRGBCapable), 
extra_EXT_framebuffer_sRGB_and_new_buffers" ],
@@ -723,15 +732,6 @@ descriptor=[
 # GL_ARB_robustness
   [ "RESET_NOTIFICATION_STRATEGY_ARB", "CONTEXT_ENUM(Const.ResetStrategy), 
NO_EXTRA" ],
 
-# GL_KHR_debug (GL 4.3)/ GL_ARB_debug_output
-  [ "DEBUG_LOGGED_MESSAGES", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
-  [ "DEBUG_NEXT_LOGGED_MESSAGE_LENGTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
-  [ "MAX_DEBUG_LOGGED_MESSAGES", "CONST(MAX_DEBUG_LOGGED_MESSAGES), NO_EXTRA" 
],
-  [ "MAX_DEBUG_MESSAGE_LENGTH", "CONST(MAX_DEBUG_MESSAGE_LENGTH), NO_EXTRA" ],
-  [ "MAX_LABEL_LENGTH", "CONST(MAX_LABEL_LENGTH), NO_EXTRA" ],
-  [ "MAX_DEBUG_GROUP_STACK_DEPTH", "CONST(MAX_DEBUG_GROUP_STACK_DEPTH), 
NO_EXTRA" ],
-  [ "DEBUG_GROUP_STACK_DEPTH", "LOC_CUSTOM, TYPE_INT, 0, NO_EXTRA" ],
-
   [ "MAX_DUAL_SOURCE_DRAW_BUFFERS", 
"CONTEXT_INT(Const.MaxDualSourceDrawBuffers), extra_ARB_blend_func_extended" ],
 
 # GL_ARB_uniform_buffer_object
-- 
2.1.2



[Mesa-dev] [PATCH 0/3] add KHR_debug for gles contexts

2014-10-29 Thread Matthew Waters
- rebase and resend.

v3:
 - fix up the EGL_CONTEXT_OPENGL_ROBUST_ACCESS_BIT_KHR check

v2:
 - replace &= with |=
 - remove offset="assign" from the api xml

Matthew Waters (3):
  egl: rework handling EGL_CONTEXT_FLAGS for ES debug contexts
  glapi: add function pointers for KHR_debug for gles
  main/get: make KHR_debug enums available everywhere

 src/egl/main/eglcontext.c   | 51 ---
 src/mapi/glapi/gen/KHR_debug.xml| 73 +
 src/mesa/drivers/dri/common/dri_util.c  | 17 
 src/mesa/main/extensions.c  |  2 +-
 src/mesa/main/get_hash_params.py| 24 +--
 src/mesa/main/tests/dispatch_sanity.cpp | 25 +++
 6 files changed, 156 insertions(+), 36 deletions(-)

-- 
2.1.2



[Mesa-dev] [PATCH V5] mesa: add SSE optimisation for glDrawElements

2014-10-29 Thread Timothy Arceri
Makes use of SSE to speed up computing the min and max elements

Callgrind cpu usage results from pts benchmarks:

Openarena 0.8.8: 3.67% -> 1.03%
UrbanTerror: 2.36% -> 0.81%

V5:
- actually make use of the optimisation in android (Emil Velikov)
- set a better array size limit for using SSE and added TODO

V4:
- fixed bugs with incrementing pointer and updating counters

V3:
- Removed sse_minmax.c from Makefile.sources
- handle the first few values without SSE until the pointer is aligned
 and use _mm_load_si128 rather than _mm_loadu_si128
- guard the call to the SSE code better at build time

V2:
- removed GL* types
- use _mm_store_si128() rather than _mm_store_ps()
- add runtime check for SSE
- use aligned attribute for local min/max
- bunch of tidyups

Signed-off-by: Timothy Arceri 
---
 src/mesa/Android.libmesa_dricore.mk |  8 ++-
 src/mesa/Android.libmesa_st_mesa.mk |  5 ++
 src/mesa/Makefile.am|  3 +-
 src/mesa/main/sse_minmax.c  | 97 +
 src/mesa/main/sse_minmax.h  | 30 
 src/mesa/vbo/vbo_exec_array.c   | 14 --
 6 files changed, 152 insertions(+), 5 deletions(-)
 create mode 100644 src/mesa/main/sse_minmax.c
 create mode 100644 src/mesa/main/sse_minmax.h

diff --git a/src/mesa/Android.libmesa_dricore.mk 
b/src/mesa/Android.libmesa_dricore.mk
index 1e6d948..2ab593d 100644
--- a/src/mesa/Android.libmesa_dricore.mk
+++ b/src/mesa/Android.libmesa_dricore.mk
@@ -51,10 +51,16 @@ endif # MESA_ENABLE_ASM
 
 ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
 LOCAL_SRC_FILES += \
-   $(SRCDIR)main/streaming-load-memcpy.c
+   $(SRCDIR)main/streaming-load-memcpy.c \
+   $(SRCDIR)main/sse_minmax.c
 LOCAL_CFLAGS := -msse4.1
 endif
 
+ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
+LOCAL_CFLAGS += \
+   -DUSE_SSE41
+endif
+
 LOCAL_C_INCLUDES := \
$(call intermediates-dir-for STATIC_LIBRARIES,libmesa_program,,) \
$(MESA_TOP)/src \
diff --git a/src/mesa/Android.libmesa_st_mesa.mk 
b/src/mesa/Android.libmesa_st_mesa.mk
index 8b8d652..618d6bf 100644
--- a/src/mesa/Android.libmesa_st_mesa.mk
+++ b/src/mesa/Android.libmesa_st_mesa.mk
@@ -48,6 +48,11 @@ ifeq ($(TARGET_ARCH),x86)
 endif # x86
 endif # MESA_ENABLE_ASM
 
+ifeq ($(ARCH_X86_HAVE_SSE4_1),true)
+LOCAL_CFLAGS := \
+   -DUSE_SSE41
+endif
+
 LOCAL_C_INCLUDES := \
$(call intermediates-dir-for STATIC_LIBRARIES,libmesa_program,,) \
$(MESA_TOP)/src/gallium/auxiliary \
diff --git a/src/mesa/Makefile.am b/src/mesa/Makefile.am
index e71bccb..932db4f 100644
--- a/src/mesa/Makefile.am
+++ b/src/mesa/Makefile.am
@@ -151,7 +151,8 @@ libmesagallium_la_LIBADD = \
$(ARCH_LIBS)
 
 libmesa_sse41_la_SOURCES = \
-   main/streaming-load-memcpy.c
+   main/streaming-load-memcpy.c \
+   main/sse_minmax.c
 libmesa_sse41_la_CFLAGS = $(AM_CFLAGS) -msse4.1
 
 pkgconfigdir = $(libdir)/pkgconfig
diff --git a/src/mesa/main/sse_minmax.c b/src/mesa/main/sse_minmax.c
new file mode 100644
index 0000000..91a55e5
--- /dev/null
+++ b/src/mesa/main/sse_minmax.c
@@ -0,0 +1,97 @@
+/*
+ * Copyright © 2014 Timothy Arceri
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Author:
+ *Timothy Arceri 
+ *
+ */
+
+#ifdef __SSE4_1__
+#include "main/sse_minmax.h"
+#include <smmintrin.h>
+#include <stdint.h>
+
+void
+_mesa_uint_array_min_max(const unsigned *ui_indices, unsigned *min_index,
+ unsigned *max_index, const unsigned count)
+{
+   unsigned max_ui = 0;
+   unsigned min_ui = ~0U;
+   unsigned i = 0;
+   unsigned aligned_count = count;
+
+   /* handle the first few values without SSE until the pointer is aligned */
+   while (((uintptr_t)ui_indices & 15) && aligned_count) {
+  if (*ui_indices > max_ui)
+ max_ui = *ui_indices;
+  if (*ui_indices < min_ui)
+ min_ui = *ui_indices;
+
+  aligned_count--;
+  ui_indices++;
+   }
+
+   /* TODO: The actual threshold for SSE begin usef

Re: [Mesa-dev] [PATCH V4] mesa: add SSE optimisation for glDrawElements

2014-10-29 Thread Timothy Arceri
On Wed, 2014-10-29 at 16:58 +1100, Timothy Arceri wrote:
> On Tue, 2014-10-28 at 22:14 +0000, Bruno Jimenez wrote:
> > Hi,
> > 
> > I haven't had time to play yet with OpenMP, but I have seen the assembly
> > it produces in my computer. If I enable SSE2 it can use it, and if I
> > enable SSE4.1 it uses the parallel max. But it uses unaligned loads, and
> > since we are trying to avoid them I don't know if we want to use just
> > OpenMP.
> > 
> > Processing the first unaligned elements by hand and using the
> > __builtin_assume_aligned as the article you link allows OpenMP to use
> > aligned loads.
> > 
> > Also, it must be noted that I am using GCC, I don't know what Clang may
> > produce (for the plain algorithms, as it doesn't support OpenMP if I
> > recall correctly).
> > 
> > When I have time I'll try to collect all the variants and make some kind
> > of benchmark between them to see if it is worth using any of them.
> > 
> > Also, do we know if 'count' has an upper bound? or if we can force the
> > array to be aligned so we don't have to worry about the first items?
> 
> I'm not sure about count but the indices array comes from the
> glDrawRangeElements() call so I don't think we can do anything about the
> alignment.

Sorry I meant glDrawElements()
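
The approach Bruno describes (handle the unaligned head by hand, then promise the compiler an aligned pointer) can be sketched in plain C. This is only an illustration, not the code from the patch: uint_array_min_max is a hypothetical name, the loop is scalar and left to the auto-vectorizer, and __builtin_assume_aligned is a GCC/Clang builtin, so other compilers take the fallback branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not the function in the patch. */
void
uint_array_min_max(const unsigned *indices, size_t count,
                   unsigned *min_out, unsigned *max_out)
{
   unsigned min_ui = ~0U;
   unsigned max_ui = 0;

   assert(min_out && max_out);

   /* Prologue: consume elements one at a time until the pointer is
    * 16-byte aligned or the array is exhausted. */
   while (count && ((uintptr_t)indices & 15)) {
      if (*indices < min_ui) min_ui = *indices;
      if (*indices > max_ui) max_ui = *indices;
      indices++;
      count--;
   }

   if (count) {
#if defined(__GNUC__)
      /* The prologue stopped with elements left, so the pointer really is
       * 16-byte aligned here and the promise to the optimizer holds. */
      const unsigned *aligned = __builtin_assume_aligned(indices, 16);
#else
      const unsigned *aligned = indices;
#endif
      for (size_t i = 0; i < count; i++) {
         if (aligned[i] < min_ui) min_ui = aligned[i];
         if (aligned[i] > max_ui) max_ui = aligned[i];
      }
   }

   *min_out = min_ui;
   *max_out = max_ui;
}
```

Because the indices pointer comes straight from the application, the prologue cannot be dropped; it only costs up to three scalar iterations for 4-byte indices.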

>  
> 
> > 
> > BTW, Thanks for the article!
> > - Bruno
> > 
> > > 
> > > 
> > > > 
> > > > - Bruno
> > > > 
> > > > > 
> > > > > > - Bruno
> > > > > > 
> > > > > >> +  unsigned max_arr[4] __attribute__ ((aligned (16)));
> > > > > >> +  unsigned min_arr[4] __attribute__ ((aligned (16)));
> > > > > >> +  unsigned vec_count;
> > > > > >> +  __m128i max_ui4 = _mm_setzero_si128();
> > > > > >> +  __m128i min_ui4 = _mm_set1_epi32(~0U);
> > > > > >> +  __m128i ui_indices4;
> > > > > >> +  __m128i *ui_indices_ptr;
> > > > > >> +
> > > > [snip]


Re: [Mesa-dev] [PULL] i965: rename brw_gs -> brw_ff_gs; rename brw_vec4_gs -> brw_gs.

2014-10-29 Thread Iago Toral
On Tue, 2014-10-28 at 19:27 -0700, Kenneth Graunke wrote:
> Hello,
> 
> I'd like to rename some files in i965:
> 
> - brw_gs.c  -> brw_ff_gs.c
> - brw_gs.h  -> brw_ff_gs.h
> - brw_gs_emit.c -> brw_ff_gs_emit.c
> - brw_vec4_gs.c -> brw_gs.c
> - brw_vec4_gs.h -> brw_gs.h
> 
> The current "brw_gs" files are about emulating fixed-function functionality 
> (VF primitive decomposition and SOL) via the geometry shader; actual 
> programmable geometry shader code is handled by brw_vec4_gs.[ch].
> 
> With the advent of SIMD8 geometry shaders, "vec4_gs" will be confusing.  "gs" 
> is nicer.  Most of the legacy code uses the "ff_gs" name already - when Paul 
> respun his GS series, he renamed everything, but didn't change the filenames.
> 
> Objections?  Acks?
> 
> The "i965-reorg" branch of my tree (~kwg/mesa) has two patches to do the 
> renames:
> 
> i965: Rename brw_vec4_gs.[ch] to brw_gs.[ch].
> i965: Rename brw_gs{,_emit}.[ch] to brw_ff_gs{,_emit}.[ch].
> 
> Since they're purely "git mv" and #include fixes, I figured mailing out the 
> diff would be useless.

I think this is a good idea. I remember being a bit confused by the file
names while I was working on geometry shaders. The new names look
better to me.

Iago



Re: [Mesa-dev] [PATCH] glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present.

2014-10-29 Thread Keith Packard
Kenneth Graunke  writes:

> v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event
> rather than gettimeofday(), which gives us the presentation time
> instead of the time when SwapBuffers was called.  Suggested by
> Keith Packard.  This relies on the fact that the X Present
> implementation uses microseconds for UST.
>
> Signed-off-by: Kenneth Graunke 
> Cc: Keith Packard 
> Cc: Marek Olšák 
> ---
>  src/glx/dri3_glx.c  | 33 -
>  src/glx/dri3_priv.h |  6 +-
>  2 files changed, 37 insertions(+), 2 deletions(-)
>
> Is this what you had in mind, Keith?  It seems to work fine as well,
> and as long as we can rely on UST being in microseconds, it definitely
> seems nicer.

Present doesn't actually define UST at this point, but I think we can
just fix that; it seems useless to *not* define it, and microseconds
seems like a fine resolution for this clock. Certainly anything using
DRI3 will use microseconds as that's the kind of time stamps it uses.

> diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
> index e8e5c4a..ff9c2f3 100644
> --- a/src/glx/dri3_glx.c
> +++ b/src/glx/dri3_glx.c
> @@ -361,12 +361,34 @@ dri3_create_drawable(struct glx_screen *base, XID xDrawable,
> return &pdraw->base;
>  }
>  
> +static void
> +show_fps(struct dri3_drawable *draw)
> +{
> +   const int interval =
> +  ((struct dri3_screen *) draw->base.psc)->show_fps_interval;
> +
> +   draw->frames++;
> +
> +   /* The Present extension uses microseconds for UST. */
> +   if (draw->previous_ust + interval * 1000000 <= draw->ust) {

Might want a cast here before the multiply, otherwise that gets done
with only 32 bits. It probably doesn't matter because interval is likely
to be small.
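
A small sketch of the cast being suggested here. It assumes the interval is a number of seconds stored in an int and that UST counts microseconds, as the patch comment states; interval_elapsed and USEC_PER_SEC are hypothetical names and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000

/* Hypothetical helper: returns nonzero once at least interval_sec seconds
 * of UST time (in microseconds) have passed since previous_ust. */
int
interval_elapsed(uint64_t previous_ust, uint64_t ust, int interval_sec)
{
   assert(interval_sec >= 0);

   /* Widen before multiplying so the product is computed in 64 bits.
    * Without the cast, a 32-bit int product overflows for intervals of
    * 2148 seconds or more. */
   uint64_t interval_usec = (uint64_t)interval_sec * USEC_PER_SEC;

   return previous_ust + interval_usec <= ust;
}
```

For the one- or few-second intervals the option is normally used with, both forms give the same answer, which matches the remark that it probably does not matter in practice.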

> +  if (draw->previous_ust) {
> + fprintf(stderr, "libGL: FPS = %.1f\n",
> + ((uint64_t)draw->frames * 1000000) /
> + (double)(draw->ust - draw->previous_ust));
> +  }
> +  draw->frames = 0;
> +  draw->previous_ust = draw->ust;
> +   }
> +}
> +
>  /*
>   * Process one Present event
>   */
>  static void
>  dri3_handle_present_event(struct dri3_drawable *priv, xcb_present_generic_event_t *ge)
>  {
> +   struct dri3_screen *psc = (struct dri3_screen *) priv->base.psc;
> +
> switch (ge->evtype) {
> case XCB_PRESENT_CONFIGURE_NOTIFY: {
>xcb_present_configure_notify_event_t *ce = (void *) ge;
> @@ -400,6 +422,10 @@ dri3_handle_present_event(struct dri3_drawable *priv, xcb_present_generic_event_
>}
>priv->ust = ce->ust;
>priv->msc = ce->msc;
> +
> +  if (psc->show_fps_interval) {
> + show_fps(priv);
> +  }

This actually needs to be inside the COMPLETE_KIND_PIXMAP; this same
event is delivered when the application gets the current MSC, and you
don't want to count those.
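
To make the point concrete, here is a minimal sketch of counting frames only for pixmap completions. The enum and struct are local stand-ins so the example compiles on its own; in the real code the kinds come from the XCB Present headers, and fps_counter and handle_complete are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the Present completion kinds (assumed to mirror the
 * XCB definitions; not taken from the patch). */
enum complete_kind {
   COMPLETE_KIND_PIXMAP = 0,      /* a pixmap was actually presented */
   COMPLETE_KIND_NOTIFY_MSC = 1   /* reply to a "what is the MSC" query */
};

/* Hypothetical, reduced version of the per-drawable state. */
struct fps_counter {
   unsigned frames;
   uint64_t ust;
};

void
handle_complete(struct fps_counter *c, enum complete_kind kind, uint64_t ust)
{
   assert(c);

   /* The timestamp is updated for every completion event... */
   c->ust = ust;

   /* ...but only real presentations count as frames. */
   if (kind == COMPLETE_KIND_PIXMAP)
      c->frames++;
}
```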

-- 
keith.pack...@intel.com




Re: [Mesa-dev] [PATCH 3/3] gk20a: use NOUVEAU_BO_GART as VRAM domain

2014-10-29 Thread Ilia Mirkin
On Mon, Oct 27, 2014 at 6:34 AM, Alexandre Courbot  wrote:
> GK20A does not have dedicated VRAM, therefore allocating in VRAM can be
> sub-optimal and sometimes even harmful. Set its VRAM domain to
> NOUVEAU_BO_GART so all objects are allocated in system memory.
>
> Signed-off-by: Alexandre Courbot 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> index ac5823e4a8d5..ad143cd9a140 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
> @@ -620,6 +620,16 @@ nvc0_screen_create(struct nouveau_device *dev)
>return NULL;
> pscreen = &screen->base.base;
>
> +   /* Recognize chipsets with no VRAM */
> +   switch (dev->chipset) {
> +   /* GK20A */
> +   case 0xea:
> +  screen->base.vram_domain = NOUVEAU_BO_GART;

I think you also want to set vidmem_bindings = 0... although
potentially after the |= that's done below. Although I guess that
constbuf + command args buf need to be |='d into the sysmem_bindings
for this to work out well. That said, we don't really handle explicit
migration well right now, and those PIPE_BIND_* are *incredibly*
misleading and don't actually necessarily reflect the current usage.
[I have some patches to improve the situation, but you don't really
have to worry about that.]

> +  break;
> +   default:
> +  break;
> +   }
> +
> ret = nouveau_screen_init(&screen->base, dev);
> if (ret) {
>nvc0_screen_destroy(pscreen);
> --
> 2.1.2
>


[Mesa-dev] [Bug 84566] Unify the format conversion code

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84566

--- Comment #45 from Jason Ekstrand  ---
(In reply to Iago Toral from comment #44)
> (In reply to Iago Toral from comment #43)
> (...)
> > 3) Luminance formats have special requirements. A conversion to Luminance
> > from RGBA requires to do L=R+G+B for example. This is something that
> > _mesa_format_convert cannot achieve at the moment, because neither
> > pack/unpack functions nor _mesa_swizzle_and_convert do this kind of
> > operation, so I wonder what is the right thing to do here. We could run
> > another pass that computes these values after the conversion. We could do
> > this inside _mesa_format_convert or in the client, after calling
> > _mesa_format_convert I suppose there are no other options. The current code
> > does another pass specifically for this, right before packing to the dst.
> 
> And likewise, a conversion to RGBA from Luminance requires to do
> R=L,G=0,B=0,A=1, but unpack functions and _mesa_swizzle_and_convert will do
> R=L,G=L,B=L,A=1, so again we need another pass to correct this to the
> expected result.

That one should be easy.  We can just have a fake internal format for RA.
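
For reference, the two luminance rules quoted above can be written out as a short sketch. The struct and function names are hypothetical, and the clamp to [0, 1] after the sum is an assumption about the desired range rather than something stated in this thread.

```c
#include <assert.h>

/* Hypothetical float RGBA pixel used only for this illustration. */
struct rgba {
   float r, g, b, a;
};

/* Luminance to RGBA as described in comment #44: R=L, G=0, B=0, A=1.
 * A plain unpack would instead replicate L into R, G and B. */
struct rgba
luminance_to_rgba(float l)
{
   struct rgba out = { l, 0.0f, 0.0f, 1.0f };
   return out;
}

/* RGBA to luminance as described in comment #43: L = R + G + B.
 * The clamp is an assumption made for this sketch. */
float
rgba_to_luminance(struct rgba in)
{
   float l = in.r + in.g + in.b;

   assert(l == l);   /* reject NaN inputs */

   if (l < 0.0f) l = 0.0f;
   if (l > 1.0f) l = 1.0f;
   return l;
}
```

The first function is the case that a swizzle alone can express (hence the idea of a fake internal format), while the second needs real arithmetic and therefore an extra pass.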



Re: [Mesa-dev] [PATCH] glsl: Improve the CSE pass debugging output.

2014-10-29 Thread Matt Turner
Reviewed-by: Matt Turner 


[Mesa-dev] [Bug 84566] Unify the format conversion code

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84566

--- Comment #46 from Jason Ekstrand  ---
(In reply to Iago Toral from comment #43)
> Jason, we are running into some issues when attempting to use
> _mesa_format_convert for glReadPixels and glGetTexImage.
> 
> Generally, one thing that is different in this case is that the current
> implementation never attempts direct conversions from src to dst (except for
> a couple of specific cases that had been written ad-hoc).
> 
> These functions do the conversion in 3 steps:
> 1) unpack to RGBA (float or  uint).
> 2) Rebase (to consider the base format of the source pixel data).
> 3) Pack to dst (this not only packs, also handles type conversion,
> transferops, semantics specific to things like luminance formats, type
> conversions, etc). 
> 
> So we are hitting various issues when attempting to replace this logic with
> _mesa_format_convert:
> 
> 1) You mentioned that things like transferops should be handled by the
> client. To achieve  this we have _mesa_apply_rgba_transfer_ops, which
> requires an RGBA format, so converting (in the client) to RGBA first is
> required in these cases so they can use this function. I suppose this is
> okay since this is what the current implementation is doing in all cases
> right now, with or without transferops.
> 
> 2) So far we have been focusing on pixel uploads. I mentioned then that we
> needed to add a new parameter to _mesa_format_convert to consider the base
> internal format we are converting to. Well, now we have the same situation
> but with the format we are converting from, and the semantics are different
> (in the sense that we need to know if we have a base format for the source
> or the destination in order to know what we need to do, like computing the
> right swizzle transform). I suppose that we could add another parameter to
> _mesa_format_convert and all the logic necessary to do the right thing in
> that case too. This, will complicate the implementation a bit I think, but I
> guess is the most consistent option. That said, the alternative would be to
> always transform to RGBA first, then call _mesa_rebase_rgba_* as the current
> code does... since the current code always convert to RGBA first we would
> not be losing performance and we would not have to add all the logic to
> _mesa_format_convert to account for a source base format. It would be less
> consistent though since _mesa_format_convert would support base formats for
> the dst but not for the src.
> 
> 3) Luminance formats have special requirements. A conversion to Luminance
> from RGBA requires to do L=R+G+B for example. This is something that
> _mesa_format_convert cannot achieve at the moment, because neither
> pack/unpack functions nor _mesa_swizzle_and_convert do this kind of
> operation, so I wonder what is the right thing to do here. We could run
> another pass that computes these values after the conversion. We could do
> this inside _mesa_format_convert or in the client, after calling
> _mesa_format_convert I suppose there are no other options. The current code
> does another pass specifically for this, right before packing to the dst.
> 
> 4) Step 3 does a lot of clamping work even after handling transfer ops,
> (right after packing to the dst). I am not sure that pack/unpack functions
> and _mesa_swizzle_and_convert will do the thing we need in all cases, but
> right now I don't have a specific example, so we can burn that bridge once
> we get there.

My thought there was to do the following (in pseudocode):

if (!has_transfer_ops && format != LUMINANCE_ALPHA) {
mesa_format_convert(dst, src);
} else {

}
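
The pseudocode above can be read as a simple path selection. The sketch below encodes only that decision; the body of the else branch is not given in the message, so treating it as the existing three-step route through RGBA (unpack, rebase and transfer ops, pack) from comment #43 is an assumption. All names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of base formats for this illustration. */
enum base_format {
   BASE_RGBA,
   BASE_LUMINANCE_ALPHA
};

enum convert_path {
   PATH_DIRECT,    /* one call to the unified format conversion */
   PATH_VIA_RGBA   /* assumed fallback: unpack, fix up, then pack */
};

enum convert_path
choose_convert_path(bool has_transfer_ops, enum base_format format)
{
   assert(format == BASE_RGBA || format == BASE_LUMINANCE_ALPHA);

   /* Same condition as the pseudocode: the direct path is only taken
    * when no transfer ops apply and no luminance semantics are needed. */
   if (!has_transfer_ops && format != BASE_LUMINANCE_ALPHA)
      return PATH_DIRECT;

   return PATH_VIA_RGBA;
}
```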



[Mesa-dev] [Bug 77449] Tracker bug for all bugs related to Steam titles

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77449

Neil Roberts  changed:

   What|Removed |Added

 Depends on||83908



Re: [Mesa-dev] [PATCH 05/14] i965: Add SIMD8 URB write low-level IR instruction

2014-10-29 Thread Kristian Høgsberg
On Tue, Oct 28, 2014 at 4:17 PM, Matt Turner  wrote:
> On Tue, Oct 28, 2014 at 3:17 PM, Kristian Høgsberg  wrote:
>> This is all we need from the generator for SIMD8 vertex shaders.  This
>> opcode is just the send instruction, all the hard work will happen
>> in the visitor using LOAD_PAYLOAD.
>>
>> Signed-off-by: Kristian Høgsberg 
>> ---
>>  src/mesa/drivers/dri/i965/brw_defines.h   |  1 +
>>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  4 
>>  src/mesa/drivers/dri/i965/brw_fs.h|  1 +
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 25 
>> +++
>>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 16 ++-
>>  src/mesa/drivers/dri/i965/brw_shader.cpp  |  1 +
>>  6 files changed, 47 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
>> b/src/mesa/drivers/dri/i965/brw_defines.h
>> index ab45d3d..bc7304b 100644
>> --- a/src/mesa/drivers/dri/i965/brw_defines.h
>> +++ b/src/mesa/drivers/dri/i965/brw_defines.h
>> @@ -1520,6 +1520,7 @@ enum brw_message_target {
>>
>>  #define BRW_URB_OPCODE_WRITE_HWORD  0
>>  #define BRW_URB_OPCODE_WRITE_OWORD  1
>> +#define BRW_URB_OPCODE_SIMD8_WRITE  7
>
> BSpec is failing me -- if this is Gen8+, prefix with GEN8 rather than BRW.

It is, I'll fix the prefix.

>>
>>  #define BRW_URB_SWIZZLE_NONE  0
>>  #define BRW_URB_SWIZZLE_INTERLEAVE1
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 97fefff..815c8c2 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -509,6 +509,7 @@ fs_inst::is_send_from_grf() const
>> case FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET:
>> case SHADER_OPCODE_UNTYPED_ATOMIC:
>> case SHADER_OPCODE_UNTYPED_SURFACE_READ:
>> +   case VS_OPCODE_URB_WRITE:
>
> Presumably we'll do SIMD8 geometry shaders (and tessellation in the
> future?). As a follow on, could we consolidate [GV]S_OPCODE_URB_WRITE
> into one SHADER_OPCODE_URB_WRITE?

Yeah, good point, I'll just update the opcode now.

Kristian


Re: [Mesa-dev] [PATCH 08/14] i965: Prepare for using the ATTR register file in the fs backend

2014-10-29 Thread Kristian Høgsberg
On Tue, Oct 28, 2014 at 4:33 PM, Matt Turner  wrote:
> On Tue, Oct 28, 2014 at 3:17 PM, Kristian Høgsberg  wrote:
>> The scalar vertex shader will use the ATTR register file for vertex
>> attributes.  This patch adds support for the ATTR file to fs_visitor.
>>
>> Signed-off-by: Kristian Høgsberg 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp   | 12 ++--
>>  src/mesa/drivers/dri/i965/brw_fs.h |  3 +++
>>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp |  2 ++
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 11 +--
>>  4 files changed, 24 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 815c8c2..e8819ef 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -76,7 +76,7 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const 
>> fs_reg &dst,
>>   this->exec_size = dst.width;
>>} else {
>>   for (int i = 0; i < sources; ++i) {
>> -if (src[i].file != GRF)
>> +if (src[i].file != GRF && src[i].file != ATTR)
>> continue;
>>
>>  if (this->exec_size <= 1)
>> @@ -97,6 +97,7 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, const 
>> fs_reg &dst,
>>   break;
>>case GRF:
>>case HW_REG:
>> +  case ATTR:
>>   assert(this->src[i].width > 0);
>>   if (this->src[i].width == 1) {
>>  this->src[i].effective_width = this->exec_size;
>> @@ -121,6 +122,7 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, 
>> const fs_reg &dst,
>> case GRF:
>> case HW_REG:
>> case MRF:
>> +   case ATTR:
>>this->regs_written = (dst.width * dst.stride * type_sz(dst.type) + 
>> 31) / 32;
>>break;
>> case BAD_FILE:
>> @@ -636,7 +638,7 @@ fs_reg::is_contiguous() const
>>  bool
>>  fs_reg::is_valid_3src() const
>>  {
>> -   return file == GRF || file == UNIFORM;
>> +   return file == GRF || file == UNIFORM || file == ATTR;
>>  }
>>
>>  int
>> @@ -3148,6 +3150,9 @@ fs_visitor::dump_instruction(backend_instruction 
>> *be_inst, FILE *file)
>> case UNIFORM:
>>fprintf(file, "***u%d***", inst->dst.reg + inst->dst.reg_offset);
>>break;
>> +   case ATTR:
>> +  fprintf(file, "a%d", inst->dst.reg + inst->dst.reg_offset);
>
> a0 is an address register, not this. Print these like the vec4 code
> does -- attr%d.

Ugh, right.

>> +  break;
>> case HW_REG:
>>if (inst->dst.fixed_hw_reg.file == BRW_ARCHITECTURE_REGISTER_FILE) {
>>   switch (inst->dst.fixed_hw_reg.nr) {
>> @@ -3199,6 +3204,9 @@ fs_visitor::dump_instruction(backend_instruction 
>> *be_inst, FILE *file)
>>case MRF:
>>   fprintf(file, "***m%d***", inst->src[i].reg);
>>   break;
>> +  case ATTR:
>> + fprintf(file, "a%d", inst->src[i].reg + inst->src[i].reg_offset);
>
> and here.
>
>> + break;
>>case UNIFORM:
>>   fprintf(file, "u%d", inst->src[i].reg + inst->src[i].reg_offset);
>>   if (inst->src[i].reladdr) {
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
>> b/src/mesa/drivers/dri/i965/brw_fs.h
>> index 67a5cdd..8d60544 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.h
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
>> @@ -132,6 +132,7 @@ byte_offset(fs_reg reg, unsigned delta)
>> case BAD_FILE:
>>break;
>> case GRF:
>> +   case ATTR:
>>reg.reg_offset += delta / 32;
>>break;
>> case MRF:
>> @@ -157,6 +158,7 @@ horiz_offset(fs_reg reg, unsigned delta)
>>break;
>> case GRF:
>> case MRF:
>> +   case ATTR:
>>return byte_offset(reg, delta * reg.stride * type_sz(reg.type));
>> default:
>>assert(delta == 0);
>> @@ -173,6 +175,7 @@ offset(fs_reg reg, unsigned delta)
>>break;
>> case GRF:
>> case MRF:
>> +   case ATTR:
>>return byte_offset(reg, delta * reg.width * reg.stride * 
>> type_sz(reg.type));
>> case UNIFORM:
>>reg.reg_offset += delta;
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> index a463386..74fe79c 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
>> @@ -1272,6 +1272,8 @@ brw_reg_from_fs_reg(fs_reg *reg)
>>break;
>> case UNIFORM:
>>unreachable("not reached");
>> +   case ATTR:
>> +  unreachable("not reached");
>
> How about
>
>case UNIFORM:
>case ATTR:
>default:
>   unreachable("not reached");
>
> instead?
>
>> default:
>>unreachable("not reached");
>> }

Yeah... I think the idea with having different cases for the different
unexpected values is that you can see which one it is from the
assertion failure (from the line number).  But in that case you'll
want to run it under gdb anyway, so I'll just leave only the default
case (like

[Mesa-dev] [PATCH 2/2] util: Move bitset to the util/ folder

2014-10-29 Thread Jason Ekstrand
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |   2 +-
 .../drivers/dri/i965/brw_performance_monitor.c |   2 +-
 .../drivers/dri/i965/brw_vec4_live_variables.h |   2 +-
 src/mesa/drivers/dri/nouveau/nouveau_context.h |   2 +-
 src/mesa/main/bitset.h | 100 -
 src/mesa/main/performance_monitor.c|   2 +-
 src/mesa/main/texstate.c   |   2 +-
 src/util/bitset.h  |  99 
 src/util/register_allocate.c   |   2 +-
 10 files changed, 107 insertions(+), 108 deletions(-)
 delete mode 100644 src/mesa/main/bitset.h
 create mode 100644 src/util/bitset.h

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index e1989cb..1a97153 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -34,7 +34,7 @@
 
 #define ACP_HASH_SIZE 16
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 #include "brw_fs.h"
 #include "brw_cfg.h"
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index 6cc8a98..d5f883d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -26,7 +26,7 @@
  */
 
 #include "brw_fs.h"
-#include "main/bitset.h"
+#include "util/bitset.h"
 
 struct cfg_t;
 
diff --git a/src/mesa/drivers/dri/i965/brw_performance_monitor.c 
b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
index edfa3d2..c174c81 100644
--- a/src/mesa/drivers/dri/i965/brw_performance_monitor.c
+++ b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
@@ -44,7 +44,7 @@
 
 #include 
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 #include "main/hash.h"
 #include "main/macros.h"
 #include "main/mtypes.h"
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index 03cc813..b50a36a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -25,7 +25,7 @@
  *
  */
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 #include "brw_vec4.h"
 
 namespace brw {
diff --git a/src/mesa/drivers/dri/nouveau/nouveau_context.h 
b/src/mesa/drivers/dri/nouveau/nouveau_context.h
index 8ea431b..b6cbde4 100644
--- a/src/mesa/drivers/dri/nouveau/nouveau_context.h
+++ b/src/mesa/drivers/dri/nouveau/nouveau_context.h
@@ -32,7 +32,7 @@
 #include "nouveau_scratch.h"
 #include "nouveau_render.h"
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 
 enum nouveau_fallback {
HWTNL = 0,
diff --git a/src/mesa/main/bitset.h b/src/mesa/main/bitset.h
deleted file mode 100644
index f50b14f..0000000
--- a/src/mesa/main/bitset.h
+++ /dev/null
@@ -1,100 +0,0 @@
-/*
- * Mesa 3-D graphics library
- *
- * Copyright (C) 2006  Brian Paul   All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included
- * in all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- */
-
-/**
- * \file bitset.h
- * \brief Bitset of arbitrary size definitions.
- * \author Michal Krol
- */
-
-#ifndef BITSET_H
-#define BITSET_H
-
-#include "imports.h"
-#include "util/bitcount.h"
-
-/
- * generic bitset implementation
- */
-
-#define BITSET_WORD GLuint
-#define BITSET_WORDBITS (sizeof (BITSET_WORD) * 8)
-
-/* bitset declarations
- */
-#define BITSET_WORDS(bits) (ALIGN(bits, BITSET_WORDBITS) / BITSET_WORDBITS)
-#define BITSET_DECLARE(name, bits) BITSET_WORD name[BITSET_WORDS(bits)]
-
-/* bitset operations
- */
-#define BITSET_COPY(x, y) memcpy( (x), (y), sizeof (x) )
-#define BITSET_EQUAL(x, y) (memcmp( (x), (y), sizeof (x) ) == 0)
-#define BITSET_ZERO(x) memset( (x), 0, sizeof (x) )
-#define BITSET_ONES(x) memset

Re: [Mesa-dev] [PATCH 09/14] i965: Move more code into codegen-branch of the fs_visitor::run() if statement

2014-10-29 Thread Kristian Høgsberg
On Tue, Oct 28, 2014 at 4:36 PM, Matt Turner  wrote:
> On Tue, Oct 28, 2014 at 3:17 PM, Kristian Høgsberg  wrote:
>> These last few operations all only apply when we've actually generated code,
>> optimized and allocated registers.  The dummy and the repclear shaders don't
>> touch uncompressed_stack, don't need the gen4 send workaround, and don't
>> spill.  This means we can move these lines into the else-branch, which will
>> make the following refactoring easier.
>>
>> Signed-off-by: Kristian Høgsberg 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 24 
>>  1 file changed, 12 insertions(+), 12 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index e8819ef..cfb56bb 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -3649,22 +3649,22 @@ fs_visitor::run()
>> break;
>>   }
>>}
>> -   }
>> -   assert(force_uncompressed_stack == 0);
>>
>> -   /* This must come after all optimization and register allocation, since
>> -* it inserts dead code that happens to have side effects, and it does
>> -* so based on the actual physical registers in use.
>> -*/
>> -   insert_gen4_send_dependency_workarounds();
>> +  assert(force_uncompressed_stack == 0);
>>
>> -   if (failed)
>> -  return false;
>> +  /* This must come after all optimization and register allocation, 
>> since
>> +   * it inserts dead code that happens to have side effects, and it does
>> +   * so based on the actual physical registers in use.
>> +   */
>> +  insert_gen4_send_dependency_workarounds();
>> +
>> +  if (failed)
>> + return false;
>>
>> -   if (!allocated_without_spills)
>> -  schedule_instructions(SCHEDULE_POST);
>> +  if (!allocated_without_spills)
>> + schedule_instructions(SCHEDULE_POST);
>>
>> -   if (last_scratch > 0) {
>> +  if (last_scratch > 0)
>>prog_data->total_scratch = brw_get_scratch_size(last_scratch);
>
> Need to indent this line too.

Yup.

Kristian


[Mesa-dev] [PATCH 1/2] util: Move ffs, _mesa_bitcount, and friends to the util folder

2014-10-29 Thread Jason Ekstrand
---
 src/gallium/state_trackers/glx/xlib/glx_api.c |   6 +-
 src/gallium/state_trackers/glx/xlib/xm_api.c  |  10 +-
 src/mesa/drivers/common/meta.c|   3 +-
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  |   4 +-
 src/mesa/drivers/dri/i965/brw_curbe.c |   2 +-
 src/mesa/drivers/dri/i965/brw_draw.c  |   6 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp  |  12 +--
 src/mesa/drivers/dri/i965/brw_shader.cpp  |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +-
 src/mesa/drivers/dri/i965/brw_wm.c|   4 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |   2 +-
 src/mesa/drivers/x11/fakeglx.c|   6 +-
 src/mesa/drivers/x11/xm_api.c |  16 +--
 src/mesa/main/bitset.h|   1 +
 src/mesa/main/buffers.c   |   6 +-
 src/mesa/main/imports.c   |  88 -
 src/mesa/main/imports.h   |  54 +-
 src/mesa/program/program_parse.y  |   2 +-
 src/util/Makefile.sources |   1 +
 src/util/bitcount.c   | 115 ++
 src/util/bitcount.h   |  94 ++
 22 files changed, 255 insertions(+), 183 deletions(-)
 create mode 100644 src/util/bitcount.c
 create mode 100644 src/util/bitcount.h

diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c 
b/src/gallium/state_trackers/glx/xlib/glx_api.c
index 976791b..9914116 100644
--- a/src/gallium/state_trackers/glx/xlib/glx_api.c
+++ b/src/gallium/state_trackers/glx/xlib/glx_api.c
@@ -402,9 +402,9 @@ get_visual( Display *dpy, int scr, unsigned int depth, int 
xclass )
 * 10 bits per color channel.  Mesa's limited to a max of 8 bits/channel.
 */
if (vis && depth > 24 && (xclass==TrueColor || xclass==DirectColor)) {
-  if (_mesa_bitcount((GLuint) vis->red_mask  ) <= 8 &&
-  _mesa_bitcount((GLuint) vis->green_mask) <= 8 &&
-  _mesa_bitcount((GLuint) vis->blue_mask ) <= 8) {
+  if (util_bitcount((GLuint) vis->red_mask  ) <= 8 &&
+  util_bitcount((GLuint) vis->green_mask) <= 8 &&
+  util_bitcount((GLuint) vis->blue_mask ) <= 8) {
  return vis;
   }
   else {
diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c 
b/src/gallium/state_trackers/glx/xlib/xm_api.c
index 1b77729..74c5637 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_api.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
@@ -736,9 +736,9 @@ XMesaVisual XMesaCreateVisual( Display *display,
{
   const int xclass = v->visualType;
   if (xclass == GLX_TRUE_COLOR || xclass == GLX_DIRECT_COLOR) {
- red_bits   = _mesa_bitcount(GET_REDMASK(v));
- green_bits = _mesa_bitcount(GET_GREENMASK(v));
- blue_bits  = _mesa_bitcount(GET_BLUEMASK(v));
+ red_bits   = util_bitcount(GET_REDMASK(v));
+ green_bits = util_bitcount(GET_GREENMASK(v));
+ blue_bits  = util_bitcount(GET_BLUEMASK(v));
   }
   else {
  /* this is an approximation */
@@ -1067,8 +1067,8 @@ XMesaCreatePixmapTextureBuffer(XMesaVisual v, Pixmap p,
   if (ctx->Extensions.ARB_texture_non_power_of_two) {
  target = GLX_TEXTURE_2D_EXT;
   }
-  else if (   _mesa_bitcount(b->width)  == 1
-   && _mesa_bitcount(b->height) == 1) {
+  else if (   util_bitcount(b->width)  == 1
+   && util_bitcount(b->height) == 1) {
  /* power of two size */
  if (b->height == 1) {
 target = GLX_TEXTURE_1D_EXT;
diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 87532c1..22a5b3e 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -85,6 +85,7 @@
 #include "main/enums.h"
 #include "main/glformats.h"
 #include "util/ralloc.h"
+#include "util/bitcount.h"
 
 /** Return offset in bytes of the field within a vertex struct */
 #define OFFSET(FIELD) ((void *) offsetof(struct vertex, FIELD))
@@ -1640,7 +1641,7 @@ _mesa_meta_drawbuffers_from_bitfield(GLbitfield bits)
assert((bits & ~BUFFER_BITS_COLOR) == 0);
 
/* Make sure we don't overflow any arrays. */
-   assert(_mesa_bitcount(bits) <= MAX_DRAW_BUFFERS);
+   assert(util_bitcount(bits) <= MAX_DRAW_BUFFERS);
 
enums[0] = GL_NONE;
 
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 844f5e4..7ccdff5 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -1346,7 +1346,7 @@ inline int count_trailing_one_bits(unsigned value)
 #ifdef HAVE___BUILTIN_CTZ
return __builtin_ctz(~value);
 #else
-   return _mesa_bitcount(value & ~(value + 1));
+   return util_bitcount(value & ~(value + 1));
 #endif
 }
 
@@ -1388,7 +1388,7 @@ brw_blor

Re: [Mesa-dev] [PATCH WIP 1/1] configure: include llvm systemlibs when using static llvm

2014-10-29 Thread Emil Velikov
On 27/10/14 21:03, Jan Vesely wrote:
> On Mon, 2014-10-27 at 20:22 +0000, Emil Velikov wrote:
>> On 27/10/14 18:05, Jan Vesely wrote:
>>> On Mon, 2014-10-27 at 02:24 +0000, Emil Velikov wrote:
>>>> On 26/10/14 19:36, Jan Vesely wrote:
>>>>> On Fri, 2014-10-24 at 23:54 +0000, Emil Velikov wrote:
>>>>>> On 24/10/14 17:03, Jan Vesely wrote:
>>>>>>> -Wl,--exclude-libs prevents automatic export of symbols
>>>>>>>
>>>>>>>
>>>>>>> CC: Kai Wasserbach 
>>>>>>> CC: Emil Velikov 
>>>>>>> Signed-off-by: Jan Vesely 
>>>>>>> ---
>>>>>>>
>>>>>>> Kai,
>>>>>>> can you try this patch with your setup, and check whether LLVM symbols are
>>>>>>> exported from mesa library? (and it's still working)
>>>>>>>
>>>>>>> Emil,
>>>>>>> would it help to have --exclude-libs ALL enabled globally?
>>>>>>>
>>>>>> Haven't really looked up on the documentation about it, yet there should
>>>>>> be no (unneeded) exported symbols thanks to the version scripts.
>>>>>> As such I'm not entirely sure what this patch (attempts to) resolve :(
>>>>>
>>>>> you are right. I don't know why I thought it was still a problem.
>>>>> In that case the attached patch should fix compiling with llvm static
>>>>> libs (#70410)
>>>>>
>>>> For future patches please add the full link in the commit message
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70410
>>>>
>>>> Afaics the bug mentioned has a slightly different patch, which brings
>>>> the question - do we need to append to the existing LLVM_LIBS, or can we
>>>> overwrite them ? Either way it would be nice if we can get a tested-by
>>>> or two :)
>>>
>>> Hi,
>>>
>>> I looked at this again. LLVM cmake links system-libs as either PUBLIC or
>>> INTERFACE [0].
>>> it means my patch is incorrect, and we should link against system-libs
>>> even if we use llvm-shared-libs.
>>> you can add my R-b to the patch by K. Sobiecky that is attached to the
>>> bug.[1]
>>>
>> Sigh... "cmake why you so PoS ?"
>>
>> On a more mature note:
>> I do not see why would we need it to link against those libraries for
>> shared linking. If their libs are broken (have unresolved symbols), and
>> we need this hack to get them working then maybe, but
>> ... looking at line 151 - # FIXME: Should this be really PUBLIC?
>> Answer: PRIVATE for shared libs, PUBLIC for static ones.
>>
>> Using PUBLIC causes all the users to recursively link against those
>> deps. Leading to over-linking and opening the door for serious issues.
> 
> looks like misdesign on llvm side. they use public to bring systemlibs
> to other llvm libs, while ignore private deps elsewhere (like libffi).
> I'd try to send a patch, but my last attempt to get Alexander's rtti
> patch in is stuck for 9 months...
> 
Yes, dealing with build systems is a job no-one wants to do, and I fear
most people underestimate it. Please let them know and be persistent,
otherwise they might end up abusing it too much :)

> anyway, since we only need those libs in static builds, do you want me
> to repost the patch with bug reference and Kai's tested by?
> 
No need to repost. I've picked it up, added the tags, and cc'd
mesa-stable, as I expect a few more people will want to use mesa
with statically linked llvm.

Thanks
Emil

> jan
> 
>>
>>
>> -Emil
>>
>> P.S. Both their automake + cmake builds seems _quite_ bad.
>> autoconf/Readme has a nice documentation of it :)
>>
>>
>>> jan
>>>
>>>
>>> [0] lib/Support/CMakeLists.txt:150
>>> [1] https://bugs.freedesktop.org/attachment.cgi?id=91764

>>>> Thanks
>>>> Emil
>>>>
>>>>> jan
>>>>>
>>>>>>
>>>>>> -Emil
>>>>>>
>>>>>>> jan
>>>>>>>
>>>>>>>  configure.ac | 10 +++++++++-
>>>>>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/configure.ac b/configure.ac
>>>>>>> index 3c76deb..b4b4b13 100644
>>>>>>> --- a/configure.ac
>>>>>>> +++ b/configure.ac
>>>>>>> @@ -1981,7 +1981,15 @@ if test "x$MESA_LLVM" != x0; then
>>>>>>> dnl already added all of these objects to LLVM_LIBS.
>>>>>>>  fi
>>>>>>>  else
>>>>>>> -AC_MSG_WARN([Building mesa with staticly linked LLVM may cause 
>>>>>>> compilation issues])
>>>>>>> +AC_MSG_WARN([Building mesa with statically linked LLVM may 
>>>>>>> cause compilation issues])
>>>>>>> +   dnl Don't export symbols automatically
>>>>>>> +   dnl TODO: Do we want to list llvm libs explicitly here?
>>>>>>> +   LLVM_LDFLAGS+=" -Wl,exclude-libs ALL"
>>>>>>> +   dnl We need to link to llvm system libs when using static libs
>>>>>>> +   dnl However, only llvm 3.5+ provides --system-libs
>>>>>>> +   if test $LLVM_VERSION_MAJOR -eq 3 -a $LLVM_VERSION_MINOR -ge 5; 
>>>>>>> then
>>>>>>> +   LLVM_LIBS+=" `$LLVM_CONFIG --system-libs`"
>>>>>>> +   fi
>>>>>>>  fi
>>>>>>>  fi
>>>>>>>  
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 70410] egl-static/Makefile: linking fails with llvm >= 3.4

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=70410

Emil Velikov  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #24 from Emil Velikov  ---
Afaict system-libs should not be used when linking against shared llvm. I've
just pushed a similar patch which should resolve the problems with static llvm,
while preserving the shared one as is.

commit af9551e68c8c964a3a80d74b6ed543b800318b33
Author: Jan Vesely 
Date:   Thu Oct 23 17:17:07 2014 -0400

configure: include llvm systemlibs when using static llvm

v2: drop -WL,--exclude-libs, it's not necessary
fix tabs/spaces



Re: [Mesa-dev] [PATCH 1/1] configure: fix typos

2014-10-29 Thread Emil Velikov
On 21/10/14 16:19, Jan Vesely wrote:
> Signed-off-by: Jan Vesely 
> ---
>  configure.ac | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 93b25a2..a588d55 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1970,7 +1970,7 @@ if test -n "$with_gallium_drivers"; then
>  fi
>  
>  dnl Set LLVM_LIBS - This is done after the driver configuration so
> -dnl that drivers can add additonal components to LLVM_COMPONENTS.
> +dnl that drivers can add additional components to LLVM_COMPONENTS.
>  dnl Previously, gallium drivers were updating LLVM_LIBS directly
>  dnl by calling llvm-config --libs ${DRIVER_LLVM_COMPONENTS}, but
>  dnl this was causing the same libraries to be appear multiple times
> @@ -2003,11 +2003,11 @@ if test "x$MESA_LLVM" != x0; then
>   invocation and rebuild.])])
>  
> dnl We don't need to update LLVM_LIBS in this case because the 
> LLVM
> -   dnl install uses a shared object for each compoenent and we have
> +   dnl install uses a shared object for each component and we have
> dnl already added all of these objects to LLVM_LIBS.
>  fi
>  else
> -AC_MSG_WARN([Building mesa with staticly linked LLVM may cause 
> compilation issues])
> +AC_MSG_WARN([Building mesa with statically linked LLVM may cause 
> compilation issues])
>  fi
>  fi
>  
> 
Pushed to master.

Thanks
Emil


[Mesa-dev] [PATCH] glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present.

2014-10-29 Thread Kenneth Graunke
v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event
rather than gettimeofday(), which gives us the presentation time
instead of the time when SwapBuffers was called.  Suggested by
Keith Packard.  This relies on the fact that the X DRI3/Present
implementations use microseconds for UST.

v3: Properly ignore PresentCompleteKindMSCNotify; multiply in 64 bits
(caught by Keith Packard).

Signed-off-by: Kenneth Graunke 
Cc: Keith Packard 
---
 src/glx/dri3_glx.c  | 32 +++-
 src/glx/dri3_priv.h |  6 +-
 2 files changed, 36 insertions(+), 2 deletions(-)

Oops - thanks for catching that.  I should've looked at the protocol.

This moves show_fps into the PresentCompleteKindPixmap block, which
necessitates passing ce->ust, since it hasn't been assigned to priv->ust
yet.  It also changes "const int interval" to "const uint64_t interval",
effectively doing a cast so the multiply will be done in 64 bits.
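
A minimal standalone sketch of the arithmetic described above, assuming UST is in microseconds as the commit message states. The names fps_counter and fps_counter_frame are hypothetical stand-ins, not part of the patch; the point is only that keeping the interval in a uint64_t makes the multiply by 1000000 happen in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-drawable fields added by the patch. */
struct fps_counter {
   uint64_t previous_ust;   /* UST (microseconds) of the last report, 0 if none */
   unsigned frames;         /* frames presented since the last report */
};

/*
 * Record one presented frame at time current_ust (microseconds).
 * Returns the measured frames per second when a report is due,
 * or a negative value when no report is due yet.
 */
static double
fps_counter_frame(struct fps_counter *c, uint64_t interval_sec,
                  uint64_t current_ust)
{
   double fps = -1.0;

   assert(c != NULL);
   c->frames++;

   /* interval_sec is 64 bits wide, so this multiply is done in 64 bits. */
   if (c->previous_ust + interval_sec * 1000000 <= current_ust) {
      if (c->previous_ust) {
         fps = ((uint64_t) c->frames * 1000000) /
               (double) (current_ust - c->previous_ust);
      }
      c->frames = 0;
      c->previous_ust = current_ust;
   }

   return fps;
}
```

The first call only records a starting timestamp, so no rate is reported until a full interval has elapsed after it.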

diff --git a/src/glx/dri3_glx.c b/src/glx/dri3_glx.c
index e8e5c4a..a9ff73b 100644
--- a/src/glx/dri3_glx.c
+++ b/src/glx/dri3_glx.c
@@ -361,12 +361,34 @@ dri3_create_drawable(struct glx_screen *base, XID 
xDrawable,
return &pdraw->base;
 }
 
+static void
+show_fps(struct dri3_drawable *draw, uint64_t current_ust)
+{
+   const uint64_t interval =
+  ((struct dri3_screen *) draw->base.psc)->show_fps_interval;
+
+   draw->frames++;
+
+   /* DRI3+Present together uses microseconds for UST. */
+   if (draw->previous_ust + interval * 1000000 <= current_ust) {
+      if (draw->previous_ust) {
+         fprintf(stderr, "libGL: FPS = %.1f\n",
+                 ((uint64_t) draw->frames * 1000000) /
+                 (double)(current_ust - draw->previous_ust));
+      }
+  draw->frames = 0;
+  draw->previous_ust = current_ust;
+   }
+}
+
 /*
  * Process one Present event
  */
 static void
 dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_t *ge)
 {
+   struct dri3_screen *psc = (struct dri3_screen *) priv->base.psc;
+
switch (ge->evtype) {
case XCB_PRESENT_CONFIGURE_NOTIFY: {
   xcb_present_configure_notify_event_t *ce = (void *) ge;
@@ -395,6 +417,9 @@ dri3_handle_present_event(struct dri3_drawable *priv, 
xcb_present_generic_event_
 break;
  }
  dri3_update_num_back(priv);
+
+ if (psc->show_fps_interval)
+show_fps(priv, ce->ust);
   } else {
  priv->recv_msc_serial = ce->serial;
   }
@@ -1830,7 +1855,7 @@ dri3_create_screen(int screen, struct glx_display * priv)
struct dri3_screen *psc;
__GLXDRIscreen *psp;
struct glx_config *configs = NULL, *visuals = NULL;
-   char *driverName, *deviceName;
+   char *driverName, *deviceName, *tmp;
int i;
 
psc = calloc(1, sizeof *psc);
@@ -1969,6 +1994,11 @@ dri3_create_screen(int screen, struct glx_display * priv)
free(driverName);
free(deviceName);
 
+   tmp = getenv("LIBGL_SHOW_FPS");
+   psc->show_fps_interval = tmp ? atoi(tmp) : 0;
+   if (psc->show_fps_interval < 0)
+  psc->show_fps_interval = 0;
+
return &psc->base;
 
 handle_error:
diff --git a/src/glx/dri3_priv.h b/src/glx/dri3_priv.h
index bdfe224..8e46640 100644
--- a/src/glx/dri3_priv.h
+++ b/src/glx/dri3_priv.h
@@ -138,7 +138,7 @@ struct dri3_screen {
int fd;
int is_different_gpu;
 
-   Bool show_fps;
+   int show_fps_interval;
 };
 
 struct dri3_context
@@ -198,6 +198,10 @@ struct dri3_drawable {
xcb_present_event_t eid;
xcb_gcontext_t gc;
xcb_special_event_t *special_event;
+
+   /* LIBGL_SHOW_FPS support */
+   uint64_t previous_ust;
+   unsigned frames;
 };
 
 
-- 
2.1.2



Re: [Mesa-dev] [PATCH 1/3] egl: rework handling EGL_CONTEXT_FLAGS for ES debug contexts

2014-10-29 Thread Emil Velikov
On 29/10/14 10:43, Matthew Waters wrote:
> From: Matthew Waters 
> 
> As of version 15 of the EGL_KHR_create_context spec, debug contexts
> are allowed for ES contexts.  We should allow creation instead of
> erroring.
> 
By moving the check from the dri module to the loader we can end up with
a combination (old loader and new dri module) where neither one does the
checking.
No objections against the patch, just wondering what the correct
solution for that case would be.

-Emil




Re: [Mesa-dev] [PATCH 1/2] util: Move ffs, _mesa_bitcount, and friends to the util folder

2014-10-29 Thread Kenneth Graunke
On Wednesday, October 29, 2014 11:27:56 AM Jason Ekstrand wrote:
> ---
>  src/gallium/state_trackers/glx/xlib/glx_api.c |   6 +-
>  src/gallium/state_trackers/glx/xlib/xm_api.c  |  10 +-
>  src/mesa/drivers/common/meta.c|   3 +-
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  |   4 +-
>  src/mesa/drivers/dri/i965/brw_curbe.c |   2 +-
>  src/mesa/drivers/dri/i965/brw_draw.c  |   6 +-
>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  12 +--
>  src/mesa/drivers/dri/i965/brw_shader.cpp  |   2 +-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +-
>  src/mesa/drivers/dri/i965/brw_wm.c|   4 +-
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |   2 +-
>  src/mesa/drivers/x11/fakeglx.c|   6 +-
>  src/mesa/drivers/x11/xm_api.c |  16 +--
>  src/mesa/main/bitset.h|   1 +
>  src/mesa/main/buffers.c   |   6 +-
>  src/mesa/main/imports.c   |  88 -
>  src/mesa/main/imports.h   |  54 +-
>  src/mesa/program/program_parse.y  |   2 +-
>  src/util/Makefile.sources |   1 +
>  src/util/bitcount.c   | 115 
++
>  src/util/bitcount.h   |  94 ++
>  22 files changed, 255 insertions(+), 183 deletions(-)
>  create mode 100644 src/util/bitcount.c
>  create mode 100644 src/util/bitcount.h

I like the idea of moving these to src/util, but I don't see much point in 
renaming them from (e.g.) _mesa_bitcount() to util_bitcount().  I suppose it 
matches the Gallium name, so if your intent is to unify them, then it might be 
a reasonable move.

But I don't see you deleting the Gallium code - they already have a 
util_bitcount in u_math.h, and I imagine it would conflict.  Did you build 
test it?



Re: [Mesa-dev] [PATCH] glx/dri3: Implement LIBGL_SHOW_FPS=1 for DRI3/Present.

2014-10-29 Thread Keith Packard
Kenneth Graunke  writes:

> v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event
> rather than gettimeofday(), which gives us the presentation time
> instead of the time when SwapBuffers was called.  Suggested by
> Keith Packard.  This relies on the fact that the X DRI3/Present
> implementations use microseconds for UST.
>
> v3: Properly ignore PresentCompleteKindMSCNotify; multiply in 64 bits
> (caught by Keith Packard).

Reviewed-by: Keith Packard 

-- 
keith.pack...@intel.com




Re: [Mesa-dev] [PATCH 1/2] util: Move ffs, _mesa_bitcount, and friends to the util folder

2014-10-29 Thread Roland Scheidegger
I like the idea of the series. However, gallium still uses its own
definitions (which appear to support more compilers for the native
versions, but in some cases have worse code for the fallback),
sometimes under different names (fls/util_last_bit) and sometimes under
the same names (ffs and util_bitcount), which looks like it might
conflict. I think it would be great if these were unified.
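
For reference, the "native definition with fallback" pattern under discussion can be sketched as below. This is only an illustration under stated assumptions: the sketch_ names are hypothetical (chosen so they cannot collide with either the Mesa or the Gallium helpers), the builtin is guarded by a plain __GNUC__ check rather than a configure-time test, and the fallback shown is not necessarily the one either tree uses.

```c
#include <stdint.h>

/* Count the set bits in n. */
static inline unsigned
sketch_bitcount(uint32_t n)
{
#if defined(__GNUC__)
   /* GCC and Clang provide a population-count builtin. */
   return (unsigned) __builtin_popcount(n);
#else
   /* Portable fallback: each iteration clears the lowest set bit. */
   unsigned bits = 0;
   while (n) {
      n &= n - 1;
      bits++;
   }
   return bits;
#endif
}

/* A size is a power of two exactly when one bit is set. */
static inline int
sketch_is_pot(uint32_t n)
{
   return sketch_bitcount(n) == 1;
}

/* value & ~(value + 1) keeps only the run of ones at the low end. */
static inline unsigned
sketch_count_trailing_ones(uint32_t value)
{
   return sketch_bitcount(value & ~(value + 1));
}
```

The last two helpers mirror how the series uses the bit count: the power-of-two test in xm_api.c and the non-builtin path of count_trailing_one_bits() in brw_blorp_blit.cpp.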

Roland

Am 29.10.2014 um 19:27 schrieb Jason Ekstrand:
> ---
>  src/gallium/state_trackers/glx/xlib/glx_api.c |   6 +-
>  src/gallium/state_trackers/glx/xlib/xm_api.c  |  10 +-
>  src/mesa/drivers/common/meta.c|   3 +-
>  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  |   4 +-
>  src/mesa/drivers/dri/i965/brw_curbe.c |   2 +-
>  src/mesa/drivers/dri/i965/brw_draw.c  |   6 +-
>  src/mesa/drivers/dri/i965/brw_fs.cpp  |  12 +--
>  src/mesa/drivers/dri/i965/brw_shader.cpp  |   2 +-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +-
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +-
>  src/mesa/drivers/dri/i965/brw_wm.c|   4 +-
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |   2 +-
>  src/mesa/drivers/x11/fakeglx.c|   6 +-
>  src/mesa/drivers/x11/xm_api.c |  16 +--
>  src/mesa/main/bitset.h|   1 +
>  src/mesa/main/buffers.c   |   6 +-
>  src/mesa/main/imports.c   |  88 -
>  src/mesa/main/imports.h   |  54 +-
>  src/mesa/program/program_parse.y  |   2 +-
>  src/util/Makefile.sources |   1 +
>  src/util/bitcount.c   | 115 
> ++
>  src/util/bitcount.h   |  94 ++
>  22 files changed, 255 insertions(+), 183 deletions(-)
>  create mode 100644 src/util/bitcount.c
>  create mode 100644 src/util/bitcount.h
> 
> diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c 
> b/src/gallium/state_trackers/glx/xlib/glx_api.c
> index 976791b..9914116 100644
> --- a/src/gallium/state_trackers/glx/xlib/glx_api.c
> +++ b/src/gallium/state_trackers/glx/xlib/glx_api.c
> @@ -402,9 +402,9 @@ get_visual( Display *dpy, int scr, unsigned int depth, 
> int xclass )
>  * 10 bits per color channel.  Mesa's limited to a max of 8 bits/channel.
>  */
> if (vis && depth > 24 && (xclass==TrueColor || xclass==DirectColor)) {
> -  if (_mesa_bitcount((GLuint) vis->red_mask  ) <= 8 &&
> -  _mesa_bitcount((GLuint) vis->green_mask) <= 8 &&
> -  _mesa_bitcount((GLuint) vis->blue_mask ) <= 8) {
> +  if (util_bitcount((GLuint) vis->red_mask  ) <= 8 &&
> +  util_bitcount((GLuint) vis->green_mask) <= 8 &&
> +  util_bitcount((GLuint) vis->blue_mask ) <= 8) {
>   return vis;
>}
>else {
> diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c 
> b/src/gallium/state_trackers/glx/xlib/xm_api.c
> index 1b77729..74c5637 100644
> --- a/src/gallium/state_trackers/glx/xlib/xm_api.c
> +++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
> @@ -736,9 +736,9 @@ XMesaVisual XMesaCreateVisual( Display *display,
> {
>const int xclass = v->visualType;
>if (xclass == GLX_TRUE_COLOR || xclass == GLX_DIRECT_COLOR) {
> - red_bits   = _mesa_bitcount(GET_REDMASK(v));
> - green_bits = _mesa_bitcount(GET_GREENMASK(v));
> - blue_bits  = _mesa_bitcount(GET_BLUEMASK(v));
> + red_bits   = util_bitcount(GET_REDMASK(v));
> + green_bits = util_bitcount(GET_GREENMASK(v));
> + blue_bits  = util_bitcount(GET_BLUEMASK(v));
>}
>else {
>   /* this is an approximation */
> @@ -1067,8 +1067,8 @@ XMesaCreatePixmapTextureBuffer(XMesaVisual v, Pixmap p,
>if (ctx->Extensions.ARB_texture_non_power_of_two) {
>   target = GLX_TEXTURE_2D_EXT;
>}
> -  else if (   _mesa_bitcount(b->width)  == 1
> -   && _mesa_bitcount(b->height) == 1) {
> +  else if (   util_bitcount(b->width)  == 1
> +   && util_bitcount(b->height) == 1) {
>   /* power of two size */
>   if (b->height == 1) {
>  target = GLX_TEXTURE_1D_EXT;
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index 87532c1..22a5b3e 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -85,6 +85,7 @@
>  #include "main/enums.h"
>  #include "main/glformats.h"
>  #include "util/ralloc.h"
> +#include "util/bitcount.h"
>  
>  /** Return offset in bytes of the field within a vertex struct */
>  #define OFFSET(FIELD) ((void *) offsetof(struct vertex, FIELD))
> @@ -1640,7 +1641,7 @@ _mesa_meta_drawbuffers_from_bitfield(GLbitfield bits)
> assert((bits & ~BUFFER_BITS_COLOR) == 0);
>  
> /* Make sure we don't overflow any arrays. */
> -   assert(_mes

[Mesa-dev] [PATCH 2/2] i965/vec4: Perform CSE on MAD instructions with final arguments switched.

2014-10-29 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
index 28c69ca..630d335 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp
@@ -104,7 +104,11 @@ is_expression_commutative(enum opcode op)
 static bool
 operands_match(enum opcode op, src_reg *xs, src_reg *ys)
 {
-   if (!is_expression_commutative(op)) {
+   if (op == BRW_OPCODE_MAD) {
+  return xs[0].equals(ys[0]) &&
+ ((xs[1].equals(ys[1]) && xs[2].equals(ys[2])) ||
+  (xs[2].equals(ys[1]) && xs[1].equals(ys[2])));
+   } else if (!is_expression_commutative(op)) {
   return xs[0].equals(ys[0]) && xs[1].equals(ys[1]) && xs[2].equals(ys[2]);
} else {
   return (xs[0].equals(ys[0]) && xs[1].equals(ys[1])) ||
-- 
2.0.4



[Mesa-dev] [PATCH 1/2] i965/fs: Perform CSE on MAD instructions with final arguments switched.

2014-10-29 Thread Matt Turner
Multiplication is commutative.

instructions in affected programs: 48314 -> 47954 (-0.75%)
---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index 8012001..5fdbf46 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -128,7 +128,11 @@ operands_match(fs_inst *a, fs_inst *b)
fs_reg *xs = a->src;
fs_reg *ys = b->src;
 
-   if (!is_expression_commutative(a->opcode)) {
+   if (a->opcode == BRW_OPCODE_MAD) {
+  return xs[0].equals(ys[0]) &&
+ ((xs[1].equals(ys[1]) && xs[2].equals(ys[2])) ||
+  (xs[2].equals(ys[1]) && xs[1].equals(ys[2])));
+   } else if (!is_expression_commutative(a->opcode)) {
   bool match = true;
   for (int i = 0; i < a->sources; i++) {
  if (!xs[i].equals(ys[i])) {
-- 
2.0.4
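
The operand comparison added above can be sketched in isolation as follows. This is a simplified illustration, not the driver code: sketch_reg and its two fields are hypothetical stand-ins for the real register classes, and it assumes, as the patch's comparison implies, that source 0 is the addend while sources 1 and 2 are the multiplied pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a source register; only identity matters here. */
struct sketch_reg {
   int file;
   int nr;
};

static bool
sketch_reg_equals(const struct sketch_reg *a, const struct sketch_reg *b)
{
   return a->file == b->file && a->nr == b->nr;
}

/*
 * Two MAD instructions compute the same value when their addends match
 * and their multiplicands match in either order.
 */
static bool
sketch_mad_operands_match(const struct sketch_reg *xs,
                          const struct sketch_reg *ys)
{
   assert(xs != NULL && ys != NULL);

   return sketch_reg_equals(&xs[0], &ys[0]) &&
          ((sketch_reg_equals(&xs[1], &ys[1]) &&
            sketch_reg_equals(&xs[2], &ys[2])) ||
           (sketch_reg_equals(&xs[2], &ys[1]) &&
            sketch_reg_equals(&xs[1], &ys[2])));
}
```

Swapping the addend with a multiplicand must not match, which is why MAD gets its own case instead of being added to the commutative opcode list.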



[Mesa-dev] [PATCH 1/5] i965/vec4: Make live_intervals part of the vec4_visitor class.

2014-10-29 Thread Matt Turner
Like in fs_visitor.
---
 src/mesa/drivers/dri/i965/brw_vec4.h  |  5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp | 15 +++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 750f491..795f4ff 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -87,6 +87,8 @@ namespace brw {
 
 class dst_reg;
 
+class vec4_live_variables;
+
 unsigned
 swizzle_for_size(int size);
 
@@ -301,6 +303,7 @@ public:
unsigned int max_grf;
int *virtual_grf_start;
int *virtual_grf_end;
+   brw::vec4_live_variables *live_intervals;
dst_reg userplane[MAX_CLIP_PLANES];
 
/**
@@ -311,8 +314,6 @@ public:
/** Per-virtual-grf indices into an array of size virtual_grf_reg_count */
int *virtual_grf_reg_map;
 
-   bool live_intervals_valid;
-
dst_reg *variable_storage(ir_variable *var);
 
void reladdr_to_temp(ir_instruction *ir, src_reg *reg, int *num_reladdr);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
index 80b912a..44eed1c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp
@@ -195,7 +195,7 @@ vec4_live_variables::~vec4_live_variables()
 void
 vec4_visitor::calculate_live_intervals()
 {
-   if (this->live_intervals_valid)
+   if (this->live_intervals)
   return;
 
int *start = ralloc_array(mem_ctx, int, this->virtual_grf_count * 4);
@@ -247,29 +247,28 @@ vec4_visitor::calculate_live_intervals()
 * The control flow-aware analysis was done at a channel level, while at
 * this point we're distilling it down to vgrfs.
 */
-   vec4_live_variables livevars(this, cfg);
+   this->live_intervals = new(mem_ctx) vec4_live_variables(this, cfg);
 
foreach_block (block, cfg) {
-  for (int i = 0; i < livevars.num_vars; i++) {
-if (BITSET_TEST(livevars.bd[block->num].livein, i)) {
+  for (int i = 0; i < live_intervals->num_vars; i++) {
+if (BITSET_TEST(live_intervals->bd[block->num].livein, i)) {
start[i] = MIN2(start[i], block->start_ip);
end[i] = MAX2(end[i], block->start_ip);
 }
 
-if (BITSET_TEST(livevars.bd[block->num].liveout, i)) {
+if (BITSET_TEST(live_intervals->bd[block->num].liveout, i)) {
start[i] = MIN2(start[i], block->end_ip);
end[i] = MAX2(end[i], block->end_ip);
 }
   }
}
-
-   this->live_intervals_valid = true;
 }
 
 void
 vec4_visitor::invalidate_live_intervals()
 {
-   live_intervals_valid = false;
+   ralloc_free(live_intervals);
+   live_intervals = NULL;
 }
 
 bool
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index b46879b..a6afc7a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3547,7 +3547,7 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
this->virtual_grf_reg_map = NULL;
this->virtual_grf_reg_count = 0;
this->virtual_grf_array_size = 0;
-   this->live_intervals_valid = false;
+   this->live_intervals = NULL;
 
this->max_grf = brw->gen >= 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
 
-- 
2.0.4



[Mesa-dev] [PATCH 3/5] i965/fs: Track liveness of the flag register.

2014-10-29 Thread Matt Turner
---
 .../drivers/dri/i965/brw_fs_live_variables.cpp | 35 ++
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |  5 
 2 files changed, 40 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index ab81e94..dbe1d34 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -157,6 +157,18 @@ fs_live_variables::setup_def_use()
reg.reg_offset++;
 }
 }
+ if (inst->reads_flag()) {
+/* The vertical combination predicates read f0.0 and f0.1. */
+if (inst->predicate == BRW_PREDICATE_ALIGN1_ANYV ||
+inst->predicate == BRW_PREDICATE_ALIGN1_ALLV) {
+   if (!BITSET_TEST(bd->flag_def, 1 - inst->flag_subreg)) {
+  BITSET_SET(bd->flag_use, 1 - inst->flag_subreg);
+   }
+}
+if (!BITSET_TEST(bd->flag_def, inst->flag_subreg)) {
+   BITSET_SET(bd->flag_use, inst->flag_subreg);
+}
+ }
 
  /* Set def[] for this instruction */
  if (inst->dst.file == GRF) {
@@ -166,6 +178,11 @@ fs_live_variables::setup_def_use()
reg.reg_offset++;
 }
 }
+ if (inst->writes_flag()) {
+if (!BITSET_TEST(bd->flag_use, inst->flag_subreg)) {
+   BITSET_SET(bd->flag_def, inst->flag_subreg);
+}
+ }
 
 ip++;
   }
@@ -199,6 +216,13 @@ fs_live_variables::compute_live_variables()
cont = true;
}
 }
+ BITSET_WORD new_livein = (bd->flag_use[0] |
+   (bd->flag_liveout[0] &
+~bd->flag_def[0]));
+ if (new_livein & ~bd->flag_livein[0]) {
+bd->flag_livein[0] |= new_livein;
+cont = true;
+ }
 
 /* Update liveout */
 foreach_list_typed(bblock_link, child_link, link, &block->children) {
@@ -212,6 +236,12 @@ fs_live_variables::compute_live_variables()
   cont = true;
}
}
+BITSET_WORD new_liveout = (child_bd->flag_livein[0] &
+   ~bd->flag_liveout[0]);
+if (new_liveout) {
+   bd->flag_liveout[0] |= new_liveout;
+   cont = true;
+}
 }
   }
}
@@ -283,6 +313,11 @@ fs_live_variables::fs_live_variables(fs_visitor *v, const 
cfg_t *cfg)
   block_data[i].use = rzalloc_array(mem_ctx, BITSET_WORD, bitset_words);
   block_data[i].livein = rzalloc_array(mem_ctx, BITSET_WORD, bitset_words);
   block_data[i].liveout = rzalloc_array(mem_ctx, BITSET_WORD, 
bitset_words);
+
+  block_data[i].flag_def[0] = 0;
+  block_data[i].flag_use[0] = 0;
+  block_data[i].flag_livein[0] = 0;
+  block_data[i].flag_liveout[0] = 0;
}
 
setup_def_use();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index 5d63901..2bfb583 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -51,6 +51,11 @@ struct block_data {
 
/** Which defs reach the exit point of the block. */
BITSET_WORD *liveout;
+
+   BITSET_WORD flag_def[1];
+   BITSET_WORD flag_use[1];
+   BITSET_WORD flag_livein[1];
+   BITSET_WORD flag_liveout[1];
 };
 
 class fs_live_variables {
-- 
2.0.4
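
The dataflow step this patch adds for the flag register can be sketched on its own as below. It is a simplified model under stated assumptions: sketch_block is hypothetical, each set is a single 32-bit word, and every block has at most two successors stored by index, whereas the real code walks the CFG's child links.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_SUCCS 2

/* Hypothetical, simplified per-block record: one word of flag bits per set. */
struct sketch_block {
   uint32_t flag_def;      /* flags written before any read in the block */
   uint32_t flag_use;      /* flags read before any write in the block */
   uint32_t flag_livein;
   uint32_t flag_liveout;
   int succs[SKETCH_MAX_SUCCS];   /* successor block indices, -1 if unused */
};

/* Iterate the liveness equations until nothing changes. */
static void
sketch_compute_flag_liveness(struct sketch_block *blocks, int num_blocks)
{
   bool cont = true;

   while (cont) {
      cont = false;

      for (int b = 0; b < num_blocks; b++) {
         struct sketch_block *bd = &blocks[b];

         /* livein = use | (liveout & ~def) */
         uint32_t new_livein = bd->flag_use |
                               (bd->flag_liveout & ~bd->flag_def);
         if (new_livein & ~bd->flag_livein) {
            bd->flag_livein |= new_livein;
            cont = true;
         }

         /* liveout = union of the successors' livein */
         for (int s = 0; s < SKETCH_MAX_SUCCS; s++) {
            int child = bd->succs[s];
            if (child < 0)
               continue;
            assert(child < num_blocks);

            uint32_t new_liveout = blocks[child].flag_livein &
                                   ~bd->flag_liveout;
            if (new_liveout) {
               bd->flag_liveout |= new_liveout;
               cont = true;
            }
         }
      }
   }
}
```

With a block that writes f0.0, an empty block, and a block that reads f0.0 in sequence, the flag ends up live out of the first two blocks but not live into the first one, since its own write screens off earlier values.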



[Mesa-dev] [PATCH 4/5] i965/fs: Dead code eliminate instructions writing the flag.

2014-10-29 Thread Matt Turner
Most prominently helps Natural Selection 2, which has a surprising
number of shaders that do very complicated things before drawing black.

instructions in affected programs: 23824 -> 19570 (-17.86%)
---
 .../dri/i965/brw_fs_dead_code_eliminate.cpp| 23 +++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 9cf8d89..414c4a0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -43,15 +43,16 @@ fs_visitor::dead_code_eliminate()
 
int num_vars = live_intervals->num_vars;
BITSET_WORD *live = ralloc_array(NULL, BITSET_WORD, BITSET_WORDS(num_vars));
+   BITSET_WORD *flag_live = ralloc_array(NULL, BITSET_WORD, 1);
 
foreach_block (block, cfg) {
   memcpy(live, live_intervals->block_data[block->num].liveout,
  sizeof(BITSET_WORD) * BITSET_WORDS(num_vars));
+  memcpy(flag_live, live_intervals->block_data[block->num].flag_liveout,
+ sizeof(BITSET_WORD));
 
   foreach_inst_in_block_reverse(fs_inst, inst, block) {
- if (inst->dst.file == GRF &&
- !inst->has_side_effects() &&
- !inst->writes_flag()) {
+ if (inst->dst.file == GRF && !inst->has_side_effects()) {
 bool result_live = false;
 
 if (inst->regs_written == 1) {
@@ -76,6 +77,13 @@ fs_visitor::dead_code_eliminate()
 }
  }
 
+ if (inst->dst.is_null() && inst->writes_flag()) {
+if (!BITSET_TEST(flag_live, inst->flag_subreg)) {
+   inst->opcode = BRW_OPCODE_NOP;
+   continue;
+}
+ }
+
  if (inst->dst.file == GRF) {
 if (!inst->is_partial_write()) {
int var = live_intervals->var_from_reg(&inst->dst);
@@ -85,6 +93,10 @@ fs_visitor::dead_code_eliminate()
 }
  }
 
+ if (inst->writes_flag()) {
+BITSET_CLEAR(flag_live, inst->flag_subreg);
+ }
+
  for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file == GRF) {
int var = live_intervals->var_from_reg(&inst->src[i]);
@@ -94,10 +106,15 @@ fs_visitor::dead_code_eliminate()
}
 }
  }
+
+ if (inst->reads_flag()) {
+BITSET_SET(flag_live, inst->flag_subreg);
+ }
   }
}
 
ralloc_free(live);
+   ralloc_free(flag_live);
 
if (progress) {
   foreach_block_and_inst_safe (block, backend_instruction, inst, cfg) {
-- 
2.0.4
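
The backward scan for dead flag writes can be sketched independently of the driver as below. This is an illustration only: sketch_inst is a hypothetical record, GRF liveness and has_side_effects() are left out, and a single 32-bit word stands in for the flag bitset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced instruction record. */
struct sketch_inst {
   bool writes_flag;
   bool reads_flag;
   bool has_dest;          /* false models a null destination */
   bool is_nop;
   unsigned flag_subreg;   /* which flag subregister is touched */
};

/*
 * Walk one block backwards, starting from the flags live at its exit.
 * A flag write with a null destination whose flag is not live is turned
 * into a NOP. Returns the number of instructions removed.
 */
static int
sketch_eliminate_dead_flag_writes(struct sketch_inst *insts, int count,
                                  uint32_t flag_liveout)
{
   uint32_t flag_live = flag_liveout;
   int removed = 0;

   for (int i = count - 1; i >= 0; i--) {
      struct sketch_inst *inst = &insts[i];

      if (inst->is_nop)
         continue;

      assert(inst->flag_subreg < 32);
      uint32_t bit = UINT32_C(1) << inst->flag_subreg;

      if (!inst->has_dest && inst->writes_flag && !(flag_live & bit)) {
         inst->is_nop = true;
         removed++;
         continue;
      }

      /* A write kills the flag above this point; a read makes it live. */
      if (inst->writes_flag)
         flag_live &= ~bit;
      if (inst->reads_flag)
         flag_live |= bit;
   }

   return removed;
}
```

Given two compares that write f0.0 followed by a predicated instruction that reads it, only the first compare is removed, because the second one overwrites its result before anything reads it.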



[Mesa-dev] [PATCH 2/5] i965: Use local pointer to block_data in live intervals.

2014-10-29 Thread Matt Turner
This simplifies the next patch and makes the code a lot easier to
read.
---
 .../dri/i965/brw_fs_dead_code_eliminate.cpp|  2 +-
 .../drivers/dri/i965/brw_fs_live_variables.cpp | 54 --
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |  6 +--
 .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 46 ++
 .../drivers/dri/i965/brw_vec4_live_variables.h |  2 +-
 5 files changed, 61 insertions(+), 49 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 7838775..9cf8d89 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -45,7 +45,7 @@ fs_visitor::dead_code_eliminate()
BITSET_WORD *live = ralloc_array(NULL, BITSET_WORD, BITSET_WORDS(num_vars));
 
foreach_block (block, cfg) {
-  memcpy(live, live_intervals->bd[block->num].liveout,
+  memcpy(live, live_intervals->block_data[block->num].liveout,
  sizeof(BITSET_WORD) * BITSET_WORDS(num_vars));
 
   foreach_inst_in_block_reverse(fs_inst, inst, block) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index ea3c0d1..ab81e94 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -53,7 +53,7 @@ using namespace brw;
  */
 
 void
-fs_live_variables::setup_one_read(bblock_t *block, fs_inst *inst,
+fs_live_variables::setup_one_read(struct block_data *bd, fs_inst *inst,
   int ip, fs_reg reg)
 {
    int var = var_from_reg(&reg);
@@ -100,12 +100,12 @@ fs_live_variables::setup_one_read(bblock_t *block, 
fs_inst *inst,
 * channel) without having completely defined that variable within the
 * block.
 */
-   if (!BITSET_TEST(bd[block->num].def, var))
-  BITSET_SET(bd[block->num].use, var);
+   if (!BITSET_TEST(bd->def, var))
+  BITSET_SET(bd->use, var);
 }
 
 void
-fs_live_variables::setup_one_write(bblock_t *block, fs_inst *inst,
+fs_live_variables::setup_one_write(struct block_data *bd, fs_inst *inst,
int ip, fs_reg reg)
 {
int var = var_from_reg(&reg);
@@ -118,8 +118,8 @@ fs_live_variables::setup_one_write(bblock_t *block, fs_inst *inst,
 * screens off previous updates of that variable (VGRF channel).
 */
if (inst->dst.file == GRF && !inst->is_partial_write()) {
-  if (!BITSET_TEST(bd[block->num].use, var))
- BITSET_SET(bd[block->num].def, var);
+  if (!BITSET_TEST(bd->use, var))
+ BITSET_SET(bd->def, var);
}
 }
 
@@ -142,6 +142,8 @@ fs_live_variables::setup_def_use()
   if (block->num > 0)
 assert(cfg->blocks[block->num - 1]->end_ip == ip - 1);
 
+  struct block_data *bd = &block_data[block->num];
+
   foreach_inst_in_block(fs_inst, inst, block) {
 /* Set use[] for this instruction */
 for (unsigned int i = 0; i < inst->sources; i++) {
@@ -151,7 +153,7 @@ fs_live_variables::setup_def_use()
continue;
 
 for (int j = 0; j < inst->regs_read(v, i); j++) {
-   setup_one_read(block, inst, ip, reg);
+   setup_one_read(bd, inst, ip, reg);
reg.reg_offset++;
 }
 }
@@ -160,7 +162,7 @@ fs_live_variables::setup_def_use()
  if (inst->dst.file == GRF) {
 fs_reg reg = inst->dst;
 for (int j = 0; j < inst->regs_written; j++) {
-   setup_one_write(block, inst, ip, reg);
+   setup_one_write(bd, inst, ip, reg);
reg.reg_offset++;
 }
 }
@@ -185,26 +187,28 @@ fs_live_variables::compute_live_variables()
   cont = false;
 
   foreach_block (block, cfg) {
+ struct block_data *bd = &block_data[block->num];
+
 /* Update livein */
 for (int i = 0; i < bitset_words; i++) {
-BITSET_WORD new_livein = (bd[block->num].use[i] |
-  (bd[block->num].liveout[i] &
-   ~bd[block->num].def[i]));
-   if (new_livein & ~bd[block->num].livein[i]) {
-   bd[block->num].livein[i] |= new_livein;
+BITSET_WORD new_livein = (bd->use[i] |
+  (bd->liveout[i] &
+   ~bd->def[i]));
+   if (new_livein & ~bd->livein[i]) {
+   bd->livein[i] |= new_livein;
cont = true;
}
 }
 
 /* Update liveout */
 foreach_list_typed(bblock_link, child_link, link, &block->children) {
-   bblock_t *child = child_link->block;
+struct block_data *child_bd = &block_data[child_link->block->num];
 
for (int i = 0; i < bitset_words; i++) {
-   BITSET_WORD new_liveout = (bd[
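
As a reading aid for the truncated hunk above: the loop being refactored is a backward liveness fixed point, where livein = use | (liveout & ~def) and each block's liveout accumulates the livein of its children. The following is only a minimal sketch under stated assumptions: one 32-bit word per set, and a toy CFG type. `ToyBlock` and `compute_liveness` are hypothetical names, not Mesa's.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-block liveness data (not Mesa's struct).
struct ToyBlock {
   uint32_t def = 0, use = 0, livein = 0, liveout = 0;
   std::vector<int> children;   // indices of successor blocks
};

// Iterate until no livein/liveout set grows any further.
inline void compute_liveness(std::vector<ToyBlock> &blocks)
{
   bool cont = true;
   while (cont) {
      cont = false;

      for (ToyBlock &block : blocks) {
         // Local pointer to the block's data, in the spirit of the patch.
         ToyBlock *bd = &block;

         // Update livein: used here, or live on exit and not defined here.
         uint32_t new_livein = bd->use | (bd->liveout & ~bd->def);
         if (new_livein & ~bd->livein) {
            bd->livein |= new_livein;
            cont = true;
         }

         // Update liveout: anything live on entry to a child.
         for (int child : bd->children) {
            assert(child >= 0 && child < (int) blocks.size());
            const ToyBlock *child_bd = &blocks[child];
            uint32_t new_liveout = child_bd->livein;
            if (new_liveout & ~bd->liveout) {
               bd->liveout |= new_liveout;
               cont = true;
            }
         }
      }
   }
}
```

With two blocks, where block 0 defines variable 0 and block 1 reads it, the variable ends up in block 0's liveout and block 1's livein, but not in block 0's livein.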

[Mesa-dev] [PATCH 5/5] i965/fs: Use const fs_reg & rather than a copy or pointer.

2014-10-29 Thread Matt Turner
Also while we're touching var_from_reg, just make it an inline function.
---
 src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp  |  8 
 src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp   | 14 --
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h | 11 ---
 src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp |  2 +-
 4 files changed, 17 insertions(+), 18 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 414c4a0..2b26177 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -56,10 +56,10 @@ fs_visitor::dead_code_eliminate()
 bool result_live = false;
 
 if (inst->regs_written == 1) {
-   int var = live_intervals->var_from_reg(&inst->dst);
+   int var = live_intervals->var_from_reg(inst->dst);
result_live = BITSET_TEST(live, var);
 } else {
-   int var = live_intervals->var_from_reg(&inst->dst);
+   int var = live_intervals->var_from_reg(inst->dst);
for (int i = 0; i < inst->regs_written; i++) {
   result_live = result_live || BITSET_TEST(live, var + i);
}
@@ -86,7 +86,7 @@ fs_visitor::dead_code_eliminate()
 
  if (inst->dst.file == GRF) {
 if (!inst->is_partial_write()) {
-   int var = live_intervals->var_from_reg(&inst->dst);
+   int var = live_intervals->var_from_reg(inst->dst);
for (int i = 0; i < inst->regs_written; i++) {
   BITSET_CLEAR(live, var + i);
}
@@ -99,7 +99,7 @@ fs_visitor::dead_code_eliminate()
 
  for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file == GRF) {
-   int var = live_intervals->var_from_reg(&inst->src[i]);
+   int var = live_intervals->var_from_reg(inst->src[i]);
 
for (int j = 0; j < inst->regs_read(this, i); j++) {
   BITSET_SET(live, var + j);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
index dbe1d34..b5c81cc 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
@@ -54,9 +54,9 @@ using namespace brw;
 
 void
 fs_live_variables::setup_one_read(struct block_data *bd, fs_inst *inst,
-  int ip, fs_reg reg)
+  int ip, const fs_reg &reg)
 {
-   int var = var_from_reg(&reg);
+   int var = var_from_reg(reg);
assert(var < num_vars);
 
/* In most cases, a register can be written over safely by the
@@ -106,9 +106,9 @@ fs_live_variables::setup_one_read(struct block_data *bd, fs_inst *inst,
 
 void
 fs_live_variables::setup_one_write(struct block_data *bd, fs_inst *inst,
-   int ip, fs_reg reg)
+   int ip, const fs_reg &reg)
 {
-   int var = var_from_reg(&reg);
+   int var = var_from_reg(reg);
assert(var < num_vars);
 
start[var] = MIN2(start[var], ip);
@@ -272,12 +272,6 @@ fs_live_variables::compute_start_end()
}
 }
 
-int
-fs_live_variables::var_from_reg(fs_reg *reg)
-{
-   return var_from_vgrf[reg->reg] + reg->reg_offset;
-}
-
 fs_live_variables::fs_live_variables(fs_visitor *v, const cfg_t *cfg)
: v(v), cfg(cfg)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index 2bfb583..a52f922 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -66,7 +66,10 @@ public:
~fs_live_variables();
 
bool vars_interfere(int a, int b);
-   int var_from_reg(fs_reg *reg);
+   int var_from_reg(const fs_reg &reg) const
+   {
+  return var_from_vgrf[reg.reg] + reg.reg_offset;
+   }
 
/** Map from virtual GRF number to index in block_data arrays. */
int *var_from_vgrf;
@@ -96,8 +99,10 @@ public:
 
 protected:
void setup_def_use();
-   void setup_one_read(struct block_data *bd, fs_inst *inst, int ip, fs_reg reg);
-   void setup_one_write(struct block_data *bd, fs_inst *inst, int ip, fs_reg reg);
+   void setup_one_read(struct block_data *bd, fs_inst *inst, int ip,
+   const fs_reg &reg);
+   void setup_one_write(struct block_data *bd, fs_inst *inst, int ip,
+const fs_reg &reg);
void compute_live_variables();
void compute_start_end();
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
index 347a78e..a4145ac 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp
@@ -45,7 +45,7 @@ opt_saturate_propagation_local(fs_vi
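
To illustrate the calling-convention change in this patch: a helper that only reads its argument can take it by const reference, which avoids both the copy and the `&` at every call site, and a one-line body can live inline in the header. This is a minimal sketch, not the Mesa code; `ToyReg` and `ToyLiveVariables` are hypothetical simplifications.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for fs_reg.
struct ToyReg {
   int reg = 0;
   int reg_offset = 0;
};

class ToyLiveVariables {
public:
   explicit ToyLiveVariables(std::vector<int> map)
      : var_from_vgrf(std::move(map)) {}

   // Takes a const reference: no copy of the register, no pointer syntax
   // at the call site, and the method itself can be const.
   int var_from_reg(const ToyReg &reg) const
   {
      assert(reg.reg >= 0 && reg.reg < (int) var_from_vgrf.size());
      return var_from_vgrf[reg.reg] + reg.reg_offset;
   }

private:
   // Map from virtual register number to first variable index.
   std::vector<int> var_from_vgrf;
};
```

A caller then writes `live.var_from_reg(inst_dst)` rather than `live.var_from_reg(&inst_dst)`, matching the call-site changes in the diff.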

Re: [Mesa-dev] [PATCH 12/14] i965: Add fs_visitor::run_vs() to generate scalar vertex shader code

2014-10-29 Thread Kristian Høgsberg
On Tue, Oct 28, 2014 at 4:50 PM, Matt Turner  wrote:
> On Tue, Oct 28, 2014 at 3:17 PM, Kristian Høgsberg  wrote:
>> This patch uses the previous refactoring to add a new run_vs() method
>> that generates vertex shader code using the scalar visitor and
>> optimizer.
>>
>> Signed-off-by: Kristian Høgsberg 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 101 -
>>  src/mesa/drivers/dri/i965/brw_fs.h   |  21 +-
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 312 ++-
>>  3 files changed, 423 insertions(+), 11 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index dfad6b9..93f6a49 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -1828,6 +1828,56 @@ fs_visitor::assign_urb_setup()
>>urb_start + prog_data->num_varying_inputs * 2;
>>  }
>>
>> +void
>> +fs_visitor::assign_vs_urb_setup()
>> +{
>> +   brw_vs_prog_data *vs_prog_data = (brw_vs_prog_data *) prog_data;
>> +   int grf, count, slot, channel, attr;
>> +
>> +   assert(stage == MESA_SHADER_VERTEX);
>> +   count = _mesa_bitcount_64(vs_prog_data->inputs_read);
>> +   if (vs_prog_data->uses_vertexid || vs_prog_data->uses_instanceid)
>> +  count++;
>> +
>> +   /* Each attribute is 4 regs. */
>> +   this->first_non_payload_grf =
>> +  payload.num_regs + prog_data->curb_read_length + count * 4;
>> +
>> +   unsigned vue_entries =
>> +  MAX2(count, vs_prog_data->base.vue_map.num_slots);
>> +
>> +   vs_prog_data->base.urb_entry_size = ALIGN(vue_entries, 4) / 4;
>> +   vs_prog_data->base.urb_read_length = (count + 1) / 2;
>> +
>> +   assert(vs_prog_data->base.urb_read_length <= 15);
>> +
>> +   /* Rewrite all ATTR file references to the hw grf that they land in. */
>> +   foreach_block_and_inst(block, fs_inst, inst, cfg) {
>> +  for (int i = 0; i < inst->sources; i++) {
>> + if (inst->src[i].file == ATTR) {
>> +
>> +if (inst->src[i].reg == VERT_ATTRIB_MAX) {
>> +   slot = count - 1;
>> +} else {
>> +   attr = inst->src[i].reg + inst->src[i].reg_offset / 4;
>> +   slot = _mesa_bitcount_64(vs_prog_data->inputs_read &
>> +BITFIELD64_MASK(attr));
>> +}
>> +
>> +channel = inst->src[i].reg_offset & 3;
>> +
>> +grf = payload.num_regs +
>> +   prog_data->curb_read_length +
>> +   slot * 4 + channel;
>> +
>> +inst->src[i].file = HW_REG;
>> +inst->src[i].fixed_hw_reg =
>> +   retype(brw_vec8_grf(grf, 0), inst->src[i].type);
>> + }
>> +  }
>> +   }
>> +}
>> +
>>  /**
>>   * Split large virtual GRFs into separate components if we can.
>>   *
>> @@ -3405,6 +3455,13 @@ fs_visitor::setup_payload_gen6()
>>  }
>>
>>  void
>> +fs_visitor::setup_vs_payload()
>> +{
>> +   /* R0: thread header, R1: urb handles */
>> +   payload.num_regs = 2;
>> +}
>> +
>> +void
>>  fs_visitor::assign_binding_table_offsets()
>>  {
>> assert(stage == MESA_SHADER_FRAGMENT);
>> @@ -3471,6 +3528,8 @@ fs_visitor::opt_drop_redundant_mov_to_flags()
>>  void
>>  fs_visitor::optimize()
>>  {
>> +   const char *stage_name = stage == MESA_SHADER_VERTEX ? "vs" : "fs";
>> +
>> calculate_cfg();
>>
>> split_virtual_grfs();
>> @@ -3487,8 +3546,8 @@ fs_visitor::optimize()
>>  \
>>if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER) && this_progress) {   \
>>   char filename[64]; \
>> - snprintf(filename, 64, "fs%d-%04d-%02d-%02d-" #pass,   \
>> -  dispatch_width, shader_prog ? shader_prog->Name : 0, iteration, pass_num); \
>> + snprintf(filename, 64, "%s%d-%04d-%02d-%02d-" #pass,  \
>> +  stage_name, dispatch_width, shader_prog ? shader_prog->Name : 0, iteration, pass_num); \
>>  \
>>   backend_visitor::dump_instructions(filename);  \
>>} \
>> @@ -3498,8 +3557,8 @@ fs_visitor::optimize()
>>
>> if (unlikely(INTEL_DEBUG & DEBUG_OPTIMIZER)) {
>>char filename[64];
>> -  snprintf(filename, 64, "fs%d-%04d-00-start",
>> -   dispatch_width, shader_prog ? shader_prog->Name : 0);
>> +  snprintf(filename, 64, "%s%d-%04d-00-start",
>> +   stage_name, dispatch_width, shader_prog ? shader_prog->Name : 0);
>>
>>backend_visitor::dump_instructions(filename);
>> }
>> @@ -3608,6 +3667,40 @@ fs_visitor::allocate_registers()
>>  }
>>
>>  bool
>> +fs_visitor::run_vs()
>> +{
>> +   assert(stage == MESA_SHADER_VERTEX);
>> +
>> +   assign_common_binding_table_offsets(0);
>> +   setup_vs_payload();
>> +
>> +  
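
The attribute-to-register arithmetic in the quoted assign_vs_urb_setup() can be followed with a small standalone sketch. It assumes the same formula as the quoted code (slot = number of enabled inputs below the attribute, four registers per attribute, channel = low two bits of the offset). The function names and the portable bit count are hypothetical, and the VERT_ATTRIB_MAX special case is left out.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count, standing in for _mesa_bitcount_64().
inline int toy_bitcount_64(uint64_t v)
{
   int n = 0;
   for (; v != 0; v &= v - 1)   // clear the lowest set bit each iteration
      n++;
   return n;
}

// Mask with the low `bits` bits set; this sketch only handles bits < 64.
inline uint64_t toy_mask_64(int bits)
{
   assert(bits >= 0 && bits < 64);
   return (uint64_t(1) << bits) - 1;
}

// Hardware register that an ATTR reference (reg, reg_offset) lands in.
inline int toy_attr_to_grf(uint64_t inputs_read, int reg, int reg_offset,
                           int payload_regs, int curb_read_length)
{
   int attr = reg + reg_offset / 4;
   // Attributes are packed: the slot is the count of enabled inputs
   // with a lower index than this one.
   int slot = toy_bitcount_64(inputs_read & toy_mask_64(attr));
   int channel = reg_offset & 3;

   return payload_regs + curb_read_length + slot * 4 + channel;
}
```

For example, with inputs 0, 3 and 5 enabled, attribute 5 occupies slot 2, so its channel 2 lands ten registers past the payload and push constants.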

[Mesa-dev] [PATCH] i965/fs: Don't compute_to_mrf() in the optimization loop.

2014-10-29 Thread Matt Turner
... or on Gen >= 7 at all. We use load_payload to gather results for the
FB write(s) now, so we never write to MRFs directly. It's still called
after lower_load_payload() since that will generate MOVs to MRFs on
platforms with MRFs.

No differences in shader-db on Haswell (Gen 7.5).
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index aa1d8d2..b223ae5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2427,6 +2427,10 @@ fs_visitor::compute_to_mrf()
bool progress = false;
int next_ip = 0;
 
+   /* No MRFs on Gen >= 7. */
+   if (brw->gen >= 7)
+  return false;
+
calculate_live_intervals();
 
foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
@@ -3575,7 +3579,6 @@ fs_visitor::run()
  OPT(opt_register_renaming);
  OPT(opt_saturate_propagation);
  OPT(register_coalesce);
- OPT(compute_to_mrf);
 
  OPT(compact_virtual_grfs);
   } while (progress);
-- 
2.0.4



[Mesa-dev] [Bug 84186] X 1.16.1 RC 1 segfaults and reports "XXX fail to create fbo" with Radeon HD 7970.

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84186

Mathias Brodala  changed:

   What|Removed |Added

 CC||i...@noctus.net



[Mesa-dev] [Bug 57702] Eliminate RTLD_GLOBAL glapi hacks after removing support for static libglapi

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=57702

fjhenigman  changed:

   What|Removed |Added

 CC||fjhmesa...@gmail.com



[Mesa-dev] [Bug 84186] X 1.16.1 RC 1 segfaults and reports "XXX fail to create fbo" with Radeon HD 7970.

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=84186

--- Comment #6 from Mathias Brodala  ---
Created attachment 108659
  --> https://bugs.freedesktop.org/attachment.cgi?id=108659&action=edit
X startup with EGL debug info

I am experiencing basically the same issue with my HD4670 when trying to enable
Glamor acceleration. I'll attach a log produced by the command line mentioned
in comment 5.

A noticeable difference is the "AIGLX error: r600 does not export required DRI
extension" line and the related messages.



Re: [Mesa-dev] [PATCH] i965/fs: Don't compute_to_mrf() in the optimization loop.

2014-10-29 Thread Kenneth Graunke
On Wednesday, October 29, 2014 02:31:53 PM Matt Turner wrote:
> ... or on Gen >= 7 at all. We use load_payload to gather results for the
> FB write(s) now, so we never write to MRFs directly. It's still called
> after lower_load_payload() since that will generate MOVs to MRFs on
> platforms with MRFs.
> 
> No differences in shader-db on Haswell (Gen 7.5).
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index aa1d8d2..b223ae5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2427,6 +2427,10 @@ fs_visitor::compute_to_mrf()
> bool progress = false;
> int next_ip = 0;
>  
> +   /* No MRFs on Gen >= 7. */
> +   if (brw->gen >= 7)
> +  return false;
> +
> calculate_live_intervals();
>  
> foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
> @@ -3575,7 +3579,6 @@ fs_visitor::run()
>   OPT(opt_register_renaming);
>   OPT(opt_saturate_propagation);
>   OPT(register_coalesce);
> - OPT(compute_to_mrf);
>  
>   OPT(compact_virtual_grfs);
>} while (progress);
> 

Reviewed-by: Kenneth Graunke 



Re: [Mesa-dev] [PATCH 1/2] util: Move ffs, _mesa_bitcount, and friends to the util folder

2014-10-29 Thread Jason Ekstrand
On Wed, Oct 29, 2014 at 1:45 PM, Roland Scheidegger wrote:

> I like the idea of the series, however gallium still uses its own
> definitions (by the looks of it supporting more compilers for native
> definitions but in some cases with worse code for the fallback)
> sometimes with different names (fls/util_last_bit) and sometimes with
> the same even (ffs and util_bitcount) which looks like it might
> conflict. I think it would be great if these would be unified.
>

I'm sending out a v2 with the gallium versions removed and gallium pointing
at the util/ version.  Please try and build it on MSVC and let me know if
anything needs to be squashed in so we don't break the build.


> Roland
>
> Am 29.10.2014 um 19:27 schrieb Jason Ekstrand:
> > ---
> >  src/gallium/state_trackers/glx/xlib/glx_api.c |   6 +-
> >  src/gallium/state_trackers/glx/xlib/xm_api.c  |  10 +-
> >  src/mesa/drivers/common/meta.c|   3 +-
> >  src/mesa/drivers/dri/i965/brw_blorp_blit.cpp  |   4 +-
> >  src/mesa/drivers/dri/i965/brw_curbe.c |   2 +-
> >  src/mesa/drivers/dri/i965/brw_draw.c  |   6 +-
> >  src/mesa/drivers/dri/i965/brw_fs.cpp  |  12 +--
> >  src/mesa/drivers/dri/i965/brw_shader.cpp  |   2 +-
> >  src/mesa/drivers/dri/i965/brw_vec4.cpp|   2 +-
> >  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |   2 +-
> >  src/mesa/drivers/dri/i965/brw_wm.c|   4 +-
> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c  |   2 +-
> >  src/mesa/drivers/x11/fakeglx.c|   6 +-
> >  src/mesa/drivers/x11/xm_api.c |  16 +--
> >  src/mesa/main/bitset.h|   1 +
> >  src/mesa/main/buffers.c   |   6 +-
> >  src/mesa/main/imports.c   |  88 -
> >  src/mesa/main/imports.h   |  54 +-
> >  src/mesa/program/program_parse.y  |   2 +-
> >  src/util/Makefile.sources |   1 +
> >  src/util/bitcount.c   | 115 ++
> >  src/util/bitcount.h   |  94 ++
> >  22 files changed, 255 insertions(+), 183 deletions(-)
> >  create mode 100644 src/util/bitcount.c
> >  create mode 100644 src/util/bitcount.h
> >
> > diff --git a/src/gallium/state_trackers/glx/xlib/glx_api.c b/src/gallium/state_trackers/glx/xlib/glx_api.c
> > index 976791b..9914116 100644
> > --- a/src/gallium/state_trackers/glx/xlib/glx_api.c
> > +++ b/src/gallium/state_trackers/glx/xlib/glx_api.c
> > @@ -402,9 +402,9 @@ get_visual( Display *dpy, int scr, unsigned int depth, int xclass )
> >  * 10 bits per color channel.  Mesa's limited to a max of 8 bits/channel.
> >  */
> > if (vis && depth > 24 && (xclass==TrueColor || xclass==DirectColor)) {
> > -  if (_mesa_bitcount((GLuint) vis->red_mask  ) <= 8 &&
> > -  _mesa_bitcount((GLuint) vis->green_mask) <= 8 &&
> > -  _mesa_bitcount((GLuint) vis->blue_mask ) <= 8) {
> > +  if (util_bitcount((GLuint) vis->red_mask  ) <= 8 &&
> > +  util_bitcount((GLuint) vis->green_mask) <= 8 &&
> > +  util_bitcount((GLuint) vis->blue_mask ) <= 8) {
> >   return vis;
> >}
> >else {
> > diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c b/src/gallium/state_trackers/glx/xlib/xm_api.c
> > index 1b77729..74c5637 100644
> > --- a/src/gallium/state_trackers/glx/xlib/xm_api.c
> > +++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
> > @@ -736,9 +736,9 @@ XMesaVisual XMesaCreateVisual( Display *display,
> > {
> >const int xclass = v->visualType;
> >if (xclass == GLX_TRUE_COLOR || xclass == GLX_DIRECT_COLOR) {
> > - red_bits   = _mesa_bitcount(GET_REDMASK(v));
> > - green_bits = _mesa_bitcount(GET_GREENMASK(v));
> > - blue_bits  = _mesa_bitcount(GET_BLUEMASK(v));
> > + red_bits   = util_bitcount(GET_REDMASK(v));
> > + green_bits = util_bitcount(GET_GREENMASK(v));
> > + blue_bits  = util_bitcount(GET_BLUEMASK(v));
> >}
> >else {
> >   /* this is an approximation */
> > @@ -1067,8 +1067,8 @@ XMesaCreatePixmapTextureBuffer(XMesaVisual v, Pixmap p,
> >if (ctx->Extensions.ARB_texture_non_power_of_two) {
> >   target = GLX_TEXTURE_2D_EXT;
> >}
> > -  else if (   _mesa_bitcount(b->width)  == 1
> > -   && _mesa_bitcount(b->height) == 1) {
> > +  else if (   util_bitcount(b->width)  == 1
> > +   && util_bitcount(b->height) == 1) {
> >   /* power of two size */
> >   if (b->height == 1) {
> >  target = GLX_TEXTURE_1D_EXT;
> > diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> > index 87532c1..22a5b3e 100644
> > --- a/src/mesa/drivers/common/meta.c
> > +++ b/src/mesa/drivers/common/meta.c
> > @@ -85,6 +85,7 @@
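
For context on the helpers being unified in this thread, here is a minimal sketch of a bit count with a compiler-builtin fast path and a portable fallback, plus the power-of-two test that the quoted xm_api.c code builds on it. The `toy_` names are hypothetical and this is not the contents of the proposed util/bitcount.h; `__builtin_popcount` is the GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstdint>

// Number of set bits in a 32-bit word.
inline unsigned toy_bitcount(uint32_t n)
{
#if defined(__GNUC__)
   return (unsigned) __builtin_popcount(n);
#else
   unsigned bits = 0;
   for (; n != 0; n &= n - 1)   // clear the lowest set bit each iteration
      bits++;
   return bits;
#endif
}

// A size is a power of two exactly when one bit is set.
inline bool toy_is_power_of_two(uint32_t v)
{
   return toy_bitcount(v) == 1;
}

// Find first bit set. Least significant bit is 1; returns 0 if none set.
inline int toy_ffs(uint32_t n)
{
   if (n == 0)
      return 0;

   int index = 1;
   while (!(n & 1u)) {
      n >>= 1;
      index++;
   }
   assert(index <= 32);
   return index;
}
```

Keeping one definition with one fallback is what avoids the name clashes (ffs, util_bitcount) raised in the quoted review.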

[Mesa-dev] [PATCH 1/2] util: Add a bitcount.h file and move stuff from both mesa and gallium to it

2014-10-29 Thread Jason Ekstrand
---
 configure.ac   |   1 +
 src/gallium/auxiliary/tgsi/tgsi_exec.c |   1 +
 src/gallium/auxiliary/tgsi/tgsi_scan.c |   2 +-
 src/gallium/auxiliary/util/u_helpers.c |   1 +
 src/gallium/auxiliary/util/u_math.h| 118 -
 src/gallium/auxiliary/util/u_vbuf.c|   1 +
 src/gallium/drivers/i915/i915_state_emit.c |   1 +
 src/gallium/drivers/ilo/ilo_shader.c   |   1 +
 src/gallium/drivers/ilo/ilo_state.c|   1 +
 src/gallium/drivers/llvmpipe/lp_rast_tri.c |   1 +
 src/gallium/drivers/llvmpipe/lp_setup_tri.c|   1 +
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |   5 +-
 .../drivers/nouveau/codegen/nv50_ir_util.cpp   |   1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |   1 +
 .../drivers/nouveau/nv50/nv50_shader_state.c   |   1 +
 src/gallium/drivers/r600/evergreen_compute.c   |   1 +
 src/gallium/drivers/r600/r600_blit.c   |   1 +
 src/gallium/drivers/r600/r600_state_common.c   |   1 +
 src/gallium/drivers/radeon/r600_streamout.c|   1 +
 src/gallium/drivers/radeonsi/si_descriptors.c  |   1 +
 src/gallium/drivers/radeonsi/si_state_draw.c   |   3 +-
 src/gallium/drivers/softpipe/sp_quad_fs.c  |   1 +
 src/gallium/state_trackers/clover/api/memory.cpp   |   1 +
 src/gallium/state_trackers/glx/xlib/glx_api.c  |   6 +-
 src/gallium/state_trackers/glx/xlib/xm_api.c   |  10 +-
 src/mesa/drivers/common/meta.c |   3 +-
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   |   4 +-
 src/mesa/drivers/dri/i965/brw_curbe.c  |   2 +-
 src/mesa/drivers/dri/i965/brw_draw.c   |   6 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  12 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp   |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |   2 +-
 src/mesa/drivers/dri/i965/brw_wm.c |   4 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c   |   2 +-
 src/mesa/drivers/x11/fakeglx.c |   6 +-
 src/mesa/drivers/x11/xm_api.c  |  16 +-
 src/mesa/main/bitset.h |   1 +
 src/mesa/main/buffers.c|   6 +-
 src/mesa/main/imports.c|  88 -
 src/mesa/main/imports.h|  54 +-
 src/mesa/program/program_parse.y   |   2 +-
 src/util/bitcount.h| 196 +
 43 files changed, 264 insertions(+), 307 deletions(-)
 create mode 100644 src/util/bitcount.h

diff --git a/configure.ac b/configure.ac
index 03f1bca..e2258eb 100644
--- a/configure.ac
+++ b/configure.ac
@@ -131,6 +131,7 @@ dnl Check for compiler builtins
 AX_GCC_BUILTIN([__builtin_bswap32])
 AX_GCC_BUILTIN([__builtin_bswap64])
 AX_GCC_BUILTIN([__builtin_clz])
+AX_GCC_BUILTIN([__builtin_clrsb])
 AX_GCC_BUILTIN([__builtin_clzll])
 AX_GCC_BUILTIN([__builtin_ctz])
 AX_GCC_BUILTIN([__builtin_expect])
diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 7794801..d5830b0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -60,6 +60,7 @@
 #include "tgsi_exec.h"
 #include "util/u_memory.h"
 #include "util/u_math.h"
+#include "util/bitcount.h"
 
 
 #define DEBUG_EXECUTION 0
diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index 42bc61e..b87a7b0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -41,7 +41,7 @@
 #include "tgsi/tgsi_parse.h"
 #include "tgsi/tgsi_util.h"
 #include "tgsi/tgsi_scan.h"
-
+#include "util/bitcount.h"
 
 
 
diff --git a/src/gallium/auxiliary/util/u_helpers.c b/src/gallium/auxiliary/util/u_helpers.c
index ac1edcd..f8df4b9 100644
--- a/src/gallium/auxiliary/util/u_helpers.c
+++ b/src/gallium/auxiliary/util/u_helpers.c
@@ -27,6 +27,7 @@
 
 #include "util/u_helpers.h"
 #include "util/u_inlines.h"
+#include "util/bitcount.h"
 
 /**
  * This function is used to copy an array of pipe_vertex_buffer structures,
diff --git a/src/gallium/auxiliary/util/u_math.h b/src/gallium/auxiliary/util/u_math.h
index 0113fb1..6004e96 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/src/gallium/auxiliary/util/u_math.h
@@ -52,10 +52,6 @@ extern "C" {
 #include 
 #include 
 
-#ifdef PIPE_OS_UNIX
-#include  /* for ffs */
-#endif
-
 
 #ifndef M_SQRT2
 #define M_SQRT2 1.41421356237309504880
@@ -492,85 +488,6 @@ util_half_inf_sign(int16_t x)
return (x < 0) ? -1 : 1;
 }
 
-
-/**
- * Find first bit set in word.  Least significant bit is 1.
- * Return 0 if no bits set.
- */
-#ifndef FFS_DEFINED
-#define FFS_DEFINED 1
-
-#if defined(_MSC_VER) && _MSC_VER >= 1300 && (_M_IX86 || _M_AMD64 || _M_IA64)
-unsigned char _BitScanForward(uns

Re: [Mesa-dev] [PATCH 1/2] util: Add a bitcount.h file and move stuff from both mesa and gallium to it

2014-10-29 Thread Matt Turner
On Wed, Oct 29, 2014 at 3:42 PM, Jason Ekstrand  wrote:
> diff --git a/configure.ac b/configure.ac
> index 03f1bca..e2258eb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -131,6 +131,7 @@ dnl Check for compiler builtins
>  AX_GCC_BUILTIN([__builtin_bswap32])
>  AX_GCC_BUILTIN([__builtin_bswap64])
>  AX_GCC_BUILTIN([__builtin_clz])
> +AX_GCC_BUILTIN([__builtin_clrsb])
>  AX_GCC_BUILTIN([__builtin_clzll])
>  AX_GCC_BUILTIN([__builtin_ctz])
>  AX_GCC_BUILTIN([__builtin_expect])

I think I just need a script that reads mesa-dev and responds to
patches to configure.ac with "Alphabetize!"

Also, the scons file scons/gallium.py needs to be updated (just
duplicating the old gcc version logic).


Re: [Mesa-dev] [PATCH 3/5] i965/fs: Track liveness of the flag register.

2014-10-29 Thread Matt Turner
On Wed, Oct 29, 2014 at 2:10 PM, Matt Turner  wrote:
> ---
>  .../drivers/dri/i965/brw_fs_live_variables.cpp | 35 ++
>  src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |  5 
>  2 files changed, 40 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> index ab81e94..dbe1d34 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.cpp
> @@ -157,6 +157,18 @@ fs_live_variables::setup_def_use()
> reg.reg_offset++;
>  }
>  }
> + if (inst->reads_flag()) {
> +/* The vertical combination predicates read f0.0 and f0.1. */
> +if (inst->predicate == BRW_PREDICATE_ALIGN1_ANYV ||
> +inst->predicate == BRW_PREDICATE_ALIGN1_ALLV) {
> +   if (!BITSET_TEST(bd->flag_def, 1 - inst->flag_subreg)) {
> +  BITSET_SET(bd->flag_use, 1 - inst->flag_subreg);

Since we don't expect (+f0.1.allv) to work (i.e., vertical predicates
with a subregister of 1), maybe I should just assert(inst->flag_subreg
== 0) and then do BITSET_*(..., 1) here.
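
A minimal sketch of the alternative suggested here, assuming two flag subregisters tracked as bits 0 and 1 of a def set and a use set. The enum, struct and function names are hypothetical, and the handling of the instruction's own subregister after the vertical-predicate case is an assumption, since the quoted hunk is cut off.

```cpp
#include <cassert>

enum ToyPredicate {
   TOY_PREDICATE_NONE,
   TOY_PREDICATE_NORMAL,
   TOY_PREDICATE_ANYV,
   TOY_PREDICATE_ALLV,
};

// Hypothetical per-block flag liveness sets: bit 0 = f0.0, bit 1 = f0.1.
struct ToyFlagSets {
   unsigned flag_def = 0;
   unsigned flag_use = 0;
};

// Record the flag reads of one predicated instruction.
inline void toy_mark_flag_read(ToyFlagSets &bd, ToyPredicate predicate,
                               int flag_subreg)
{
   if (predicate == TOY_PREDICATE_NONE)
      return;

   assert(flag_subreg == 0 || flag_subreg == 1);

   if (predicate == TOY_PREDICATE_ANYV || predicate == TOY_PREDICATE_ALLV) {
      // Vertical predicates read f0.0 and f0.1; only subregister 0 is
      // expected to work, so assert it and mark bit 1 directly instead of
      // computing 1 - flag_subreg.
      assert(flag_subreg == 0);
      if (!(bd.flag_def & (1u << 1)))
         bd.flag_use |= 1u << 1;
   }

   // The instruction's own flag subregister is read as well (assumed).
   if (!(bd.flag_def & (1u << flag_subreg)))
      bd.flag_use |= 1u << flag_subreg;
}
```

The assert turns an unsupported combination into an immediate failure instead of silently marking the wrong subregister.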


Re: [Mesa-dev] [PATCH 1/2] i965/fs: Perform CSE on MAD instructions with final arguments switched.

2014-10-29 Thread Kenneth Graunke
On Wednesday, October 29, 2014 02:09:55 PM Matt Turner wrote:
> Multiplication is commutative.
> 
> instructions in affected programs: 48314 -> 47954 (-0.75%)
> ---
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> index 8012001..5fdbf46 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> @@ -128,7 +128,11 @@ operands_match(fs_inst *a, fs_inst *b)
> fs_reg *xs = a->src;
> fs_reg *ys = b->src;
>  
> -   if (!is_expression_commutative(a->opcode)) {
> +   if (a->opcode == BRW_OPCODE_MAD) {
> +  return xs[0].equals(ys[0]) &&
> + ((xs[1].equals(ys[1]) && xs[2].equals(ys[2])) ||
> +  (xs[2].equals(ys[1]) && xs[1].equals(ys[2])));
> +   } else if (!is_expression_commutative(a->opcode)) {
>bool match = true;
>for (int i = 0; i < a->sources; i++) {
>   if (!xs[i].equals(ys[i])) {
> 

Series is:
Reviewed-by: Kenneth Graunke 
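
The matching rule in the quoted hunk can be tried in isolation: the first source must match exactly, while the last two (the multiplied pair) may appear in either order. This is a sketch with sources reduced to plain integers; `ToySrcs` and the function names are hypothetical.

```cpp
#include <array>
#include <cassert>

// Hypothetical: each source operand reduced to an integer id.
using ToySrcs = std::array<int, 3>;

// Same shape as the MAD case in operands_match(): source 0 is fixed,
// sources 1 and 2 commute.
inline bool toy_mad_operands_match(const ToySrcs &xs, const ToySrcs &ys)
{
   return xs[0] == ys[0] &&
          ((xs[1] == ys[1] && xs[2] == ys[2]) ||
           (xs[2] == ys[1] && xs[1] == ys[2]));
}

// Small usage check of the rule.
inline void toy_mad_example()
{
   const ToySrcs a = {1, 2, 3};
   const ToySrcs swapped = {1, 3, 2};   // multiplied pair exchanged
   const ToySrcs moved = {2, 1, 3};     // addend exchanged: not equivalent

   assert(toy_mad_operands_match(a, swapped));
   assert(!toy_mad_operands_match(a, moved));
}
```

Two MADs that differ only in the order of the multiplied pair compute the same value, so CSE can treat them as one expression.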



[Mesa-dev] [Bug 57702] Eliminate RTLD_GLOBAL glapi hacks after removing support for static libglapi

2014-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=57702

--- Comment #3 from fjhenigman  ---
If I'm understanding correctly:
- this bug asks that, for example, /usr/lib64/dri/i965_dri.so pull in libglapi
- at the moment it's deliberately not pulled because some use case didn't want
that
- that other use case no longer exists in current code, but there's reluctance
to break older versions by making the requested change
- in time, fear of breaking old code will reduce, and this change could happen

Here's another reason for making that change: gbm_create_device() fails because
dlopening (for example) i965_dri.so fails due to missing glapi symbols, unless
you link in or dlopen libglapi, or link in something that pulls it in, such as
libGL.  The dlopen(libglapi) hack seems to be widespread:

chrome:
http://src.chromium.org/chrome/trunk/src/ui/ozone/platform/dri/ozone_platform_gbm.cc

wayland:
http://fossies.org/linux/weston/src/compositor-drm.c

enlightenment:
https://git.enlightenment.org/core/efl.git/commit/?h=devs/devilhorns/drm&id=73a7ac2ec8201123785ec17eff97364f72a474a1

Now I find waffle has the same problem.  Do I need to add the same hack there?
https://github.com/waffle-gl/waffle/pull/21

If we must use the hack for now, wouldn't it be better in gbm_create_device, so
every gbm user doesn't clutter their code with it?
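
For reference, the hack under discussion roughly amounts to loading libglapi with RTLD_GLOBAL before the driver is dlopened, so that its symbols are visible when the driver's undefined references are resolved. The following is a minimal sketch using the POSIX dlopen() interface; the helper name is hypothetical and the soname "libglapi.so.0" is an assumption that may differ per system.

```cpp
#include <cassert>
#include <cstdio>
#include <dlfcn.h>

// Try to make glapi symbols globally visible before a DRI driver is loaded.
// Returns the library handle, or nullptr if the library could not be loaded.
// The default soname is an assumption, not something gbm itself dictates.
inline void *toy_preload_glapi(const char *soname = "libglapi.so.0")
{
   assert(soname != nullptr);

   // RTLD_GLOBAL makes the library's symbols available for resolving
   // references in libraries loaded afterwards.
   void *handle = dlopen(soname, RTLD_LAZY | RTLD_GLOBAL);
   if (!handle) {
      const char *err = dlerror();
      std::fprintf(stderr, "could not preload %s: %s\n", soname,
                   err ? err : "unknown error");
   }
   return handle;
}
```

An application would call this once before gbm_create_device(). On older glibc versions the program also has to be linked with -ldl.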



Re: [Mesa-dev] [PATCH] glsl: Drop constant 0.0 components from dot products.

2014-10-29 Thread Kenneth Graunke
On Thursday, October 23, 2014 04:19:19 PM Matt Turner wrote:
> Helps a small number of vertex shaders in the games Dungeon Defenders
> and Shank, as well as an internal benchmark.
> 
> instructions in affected programs: 2801 -> 2719 (-2.93%)
> ---
>  src/glsl/opt_algebraic.cpp | 25 +
>  1 file changed, 25 insertions(+)
> 
> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
> index 0cdb8ec..6976ee7 100644
> --- a/src/glsl/opt_algebraic.cpp
> +++ b/src/glsl/opt_algebraic.cpp
> @@ -553,6 +553,31 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
>}
>return new(mem_ctx) ir_swizzle(ir->operands[0], component, 0, 0, 0, 1);
>}
> +
> +  for (int i = 0; i < 2; i++) {
> + if (!op_const[i])
> +continue;
> +
> + unsigned components[4] = { 0 }, count = 0;
> +
> + for (unsigned c = 0; c < op_const[i]->type->vector_elements; c++) {
> +if (op_const[i]->value.f[c] == 0.0)
> +   continue;
> +

   /* Store which channels have non-zero values. */

> +components[count] = c;
> +count++;
> + }
> +

/* No channels had zero values; bail. */

> + if (count >= op_const[i]->type->vector_elements)
> +break;

/* Swizzle both operands to remove the channels that were zero. */

> + return new(mem_ctx)
> +ir_expression(ir_binop_dot, glsl_type::float_type,
> +  new(mem_ctx) ir_swizzle(ir->operands[0],
> +  components, count),
> +  new(mem_ctx) ir_swizzle(ir->operands[1],
> +  components, count));
> +  }
>break;
>  
> case ir_binop_less:

With or without the comments,
Reviewed-by: Kenneth Graunke 
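
The effect of the reviewed optimization can be sketched numerically: channels where the constant operand is 0.0 contribute nothing to the dot product, so only the remaining channels need to be multiplied. This is a standalone illustration on plain float arrays, not the GLSL IR code; the names are hypothetical, and it ignores the case where the other operand holds NaN or infinity in a dropped channel.

```cpp
#include <array>
#include <cassert>

using ToyVec4 = std::array<float, 4>;

// Store which channels of the constant have non-zero values.
// Returns how many channels were kept.
inline unsigned toy_nonzero_channels(const ToyVec4 &constant, unsigned elements,
                                     std::array<unsigned, 4> &components)
{
   assert(elements <= 4);

   unsigned count = 0;
   for (unsigned c = 0; c < elements; c++) {
      if (constant[c] == 0.0f)
         continue;

      components[count] = c;
      count++;
   }
   return count;
}

// Dot product over only the channels where the constant is non-zero.
inline float toy_reduced_dot(const ToyVec4 &value, const ToyVec4 &constant,
                             unsigned elements)
{
   std::array<unsigned, 4> components = {{0, 0, 0, 0}};
   unsigned count = toy_nonzero_channels(constant, elements, components);

   float sum = 0.0f;
   for (unsigned i = 0; i < count; i++)
      sum += value[components[i]] * constant[components[i]];
   return sum;
}
```

With a constant of (0, 5, 0, 2), only channels 1 and 3 are kept, which corresponds to swizzling both operands down to two components before the dot product.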



Re: [Mesa-dev] [PATCH 06/14] i965: Add new SIMD8 VS prog data flag

2014-10-29 Thread Kristian Høgsberg
On Tue, Oct 28, 2014 at 5:48 PM, Kenneth Graunke  wrote:
> On Tuesday, October 28, 2014 04:25:05 PM Matt Turner wrote:
>> On Tue, Oct 28, 2014 at 3:17 PM, Kristian Høgsberg 
> wrote:
>> > This flag signals that we have a SIMD8 VS shader so we can set up the
>> > corresponding state accordingly.  This boils down to setting
>> > the BDW+ SIMD8 enable bit in 3DSTATE_VS and making UBO and pull
>> > constant buffers use dword pitch.
>> >
>> > Signed-off-by: Kristian Høgsberg 
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_context.h  |  5 -
>> >  src/mesa/drivers/dri/i965/brw_defines.h  |  2 ++
>> >  src/mesa/drivers/dri/i965/brw_gs_surface_state.c |  2 +-
>> >  src/mesa/drivers/dri/i965/brw_vs_surface_state.c | 10 --
>> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  7 ---
>> >  src/mesa/drivers/dri/i965/gen8_vs_state.c|  2 ++
>> >  6 files changed, 21 insertions(+), 7 deletions(-)
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> b/src/mesa/drivers/dri/i965/brw_context.h
>> > index eb37e75..e7cd30f 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_context.h
>> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
>> > @@ -543,6 +543,8 @@ struct brw_vec4_prog_data {
>> >  * is the size of the URB entry used for output.
>> >  */
>> > GLuint urb_entry_size;
>> > +
>> > +   bool simd8;
>>
>> brw_vec4_prog_data is going to be the prog_data struct for SIMD8
>> vertex shaders? :\
>>
>> brw_gs_prog_data, which inherits brw_vec4_prog_data, has
>>
>>/**
>> * Dispatch mode, can be any of:
>> * GEN7_GS_DISPATCH_MODE_DUAL_OBJECT
>> * GEN7_GS_DISPATCH_MODE_DUAL_INSTANCE
>> * GEN7_GS_DISPATCH_MODE_SINGLE
>> */
>>int dispatch_mode;
>>
>> Maybe it shouldn't hold the values of the things in the comment
>> directly (since they're things like 2<<11) but shouldn't we pull this
>> field out and have an enum or something?
>
> I like that plan.  A bunch of stages can (or will be able to) run in 4x1
> (single), 4x2 dual object, 4x2 dual instance, or SIMD8 mode.
>
> enum shader_dispatch_mode {
>DISPATCH_MODE_4X2_DUAL_OBJECT,
>DISPATCH_MODE_4X2_DUAL_INSTANCE,
>DISPATCH_MODE_4X1_SINGLE,
>DISPATCH_MODE_SIMD8,
> };
>
> The state upload code would use these values.

I think that makes sense once we get SIMD8 GS and at that point it
will be an easy enough change to make. For now, we only have 4x2 or
SIMD8 for VS so a simd8 flag is sufficient.  I'm not saying it's a bad
idea, but let's do it when we need it.

As for using brw_vec4_prog_data for scalar vs, I agree that it looks
weird, but there's nothing vec4 specific in there.  A better name may
be brw_vue_prog_data, but is this really worse than using fs_visitor
to generate vs code?  Either way, we have some renaming to decide on,
which we can do before or after this lands.

Kristian

>> >  const struct brw_tracked_state brw_gs_ubo_surfaces = {
>> > diff --git a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
> b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
>> > index 1cc96cf..24bc06d 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_vs_surface_state.c
>> > @@ -112,6 +112,7 @@ static void
>> >  brw_upload_vs_pull_constants(struct brw_context *brw)
>> >  {
>> > struct brw_stage_state *stage_state = &brw->vs.base;
>> > +   bool dword_pitch;
>>
>> I can't figure out the name of this variable. In most any context I
>> would imagine 'dword_pitch' is an integer.
>>
>> What does this mean, and can we name it something more descriptive?
>
> See brw_create_constant_surface in brw_wm_surface_state.c.  For SIMD8 shader
> access, we configure the constant buffer SURFACE_STATE to use a pitch of 4,
> while for SIMD4x2 access, we configure it to use a pitch of 16 (vec4 size).
>
> I believe Eric introduced the name, and it's been around for quite a while
> now.  Feel free to submit a patch to rename it.
>
> --Ken
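
A minimal sketch of how the enum proposed above could drive the constant-buffer pitch, using the pitch values Ken quotes (4 for SIMD8, 16 for SIMD4x2). The helper constant_surface_pitch is hypothetical, and treating every non-SIMD8 mode as vec4 pitch is an assumption, not something stated in the thread.

```cpp
// Enum as suggested in the review; the values are illustrative and are
// not the hardware encodings.
enum shader_dispatch_mode {
   DISPATCH_MODE_4X2_DUAL_OBJECT,
   DISPATCH_MODE_4X2_DUAL_INSTANCE,
   DISPATCH_MODE_4X1_SINGLE,
   DISPATCH_MODE_SIMD8,
};

// Hypothetical helper: pitch in bytes for the constant buffer surface.
// Assumption: all vec4 modes use a pitch of 16 (vec4 size), SIMD8 uses
// a pitch of 4 (one dword).
constexpr unsigned constant_surface_pitch(shader_dispatch_mode mode)
{
   return mode == DISPATCH_MODE_SIMD8 ? 4u : 16u;
}
```

Under that assumption, a "dword pitch" boolean is just the question of whether this helper returns 4.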


Re: [Mesa-dev] [PATCH 1/2] util: Add a bitcount.h file and move stuff from both mesa and gallium to it

2014-10-29 Thread Jason Ekstrand
On Wed, Oct 29, 2014 at 3:51 PM, Matt Turner  wrote:

> On Wed, Oct 29, 2014 at 3:42 PM, Jason Ekstrand 
> wrote:
> > diff --git a/configure.ac b/configure.ac
> > index 03f1bca..e2258eb 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -131,6 +131,7 @@ dnl Check for compiler builtins
> >  AX_GCC_BUILTIN([__builtin_bswap32])
> >  AX_GCC_BUILTIN([__builtin_bswap64])
> >  AX_GCC_BUILTIN([__builtin_clz])
> > +AX_GCC_BUILTIN([__builtin_clrsb])
> >  AX_GCC_BUILTIN([__builtin_clzll])
> >  AX_GCC_BUILTIN([__builtin_ctz])
> >  AX_GCC_BUILTIN([__builtin_expect])
>
> I think I just need a script that reads mesa-dev and responds to
> patches to configure.ac with "Alphabetize!"
>
> Also, the scons file scons/gallium.py needs to be updated (just
> duplicating the old gcc version logic).
>

Both issues fixed locally
--Jason
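
For readers unfamiliar with the builtin being probed for: GCC's __builtin_clrsb returns the number of leading redundant sign bits, that is, the number of bits following the most significant bit that are identical to it. The function below is a portable sketch of that behaviour for 32-bit values; clrsb32 is a hypothetical name and this is not claimed to be the fallback used in bitcount.h.

```cpp
#include <cstdint>

// Count the bits below the sign bit that are equal to the sign bit.
inline int clrsb32(int32_t value)
{
   // Fold negative values onto non-negative ones so that only leading
   // zeros below the sign bit need to be counted.
   uint32_t bits = static_cast<uint32_t>(value);
   if (value < 0)
      bits = ~bits;

   int count = 0;
   for (int bit = 30; bit >= 0; bit--) {
      if ((bits >> bit) & 1u)
         break;
      count++;
   }
   return count;
}
```

Both 0 and -1 give 31, since every bit below the sign bit repeats it.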


[Mesa-dev] [PATCH v2 2/2] glsl: Lower constant arrays to uniform arrays.

2014-10-29 Thread Kenneth Graunke
Consider GLSL code such as:

   const ivec2 offsets[] =
  ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
  ivec2(0, -1),  ivec2(0, 0),  ivec2(0, 1),
  ivec2(1, -1),  ivec2(1, 0),  ivec2(1, 1));

   ivec2 offset = offsets[<non-constant expression>];

Both i965 and nv50 currently handle this very poorly.  On i965, this
becomes a pile of MOVs to load the immediate constants into registers,
a pile of scratch writes to move the whole array to memory, and one
scratch read to actually access the value - effectively the same as if
it were a non-constant array.

We'd much rather upload large blocks of constant data as uniform data,
so drivers can simply upload the data via constbufs, and not have to
populate it via shader instructions.

This is currently non-optional because both i965 and nouveau benefit
from it, and according to Marek radeonsi would benefit today as well.
(According to Tom, radeonsi may want to handle this itself in the long
term, but we can always add a flag when it becomes useful.)

Improves performance in a terrain rendering microbenchmark by about 2x,
and cuts the number of instructions in about half.  Helps a lot of
"Natural Selection 2" shaders, as well as one "HOARD" shader.

total instructions in shared programs: 5473459 -> 5471765 (-0.03%)
instructions in affected programs: 5880 -> 4186 (-28.81%)

v2: Use ir_var_hidden to avoid exposing the new uniform via the GL
uniform introspection API.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957
Signed-off-by: Kenneth Graunke 
---
 src/glsl/Makefile.sources   |   1 +
 src/glsl/ir_optimization.h  |   1 +
 src/glsl/linker.cpp |   2 +
 src/glsl/lower_const_arrays_to_uniforms.cpp | 102 
 4 files changed, 106 insertions(+)
 create mode 100644 src/glsl/lower_const_arrays_to_uniforms.cpp

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 0c55327..6aed52d 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -58,6 +58,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/loop_analysis.cpp \
$(GLSL_SRCDIR)/loop_controls.cpp \
$(GLSL_SRCDIR)/loop_unroll.cpp \
+   $(GLSL_SRCDIR)/lower_const_arrays_to_uniforms.cpp \
$(GLSL_SRCDIR)/lower_clip_distance.cpp \
$(GLSL_SRCDIR)/lower_discard.cpp \
$(GLSL_SRCDIR)/lower_discard_flow.cpp \
diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index e25857a..34e0b4b 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -114,6 +114,7 @@ bool lower_noise(exec_list *instructions);
 bool lower_variable_index_to_cond_assign(exec_list *instructions,
 bool lower_input, bool lower_output, bool lower_temp, bool lower_uniform);
 bool lower_quadop_vector(exec_list *instructions, bool dont_lower_swz);
+bool lower_const_arrays_to_uniforms(exec_list *instructions);
 bool lower_clip_distance(gl_shader *shader);
 void lower_output_reads(exec_list *instructions);
 bool lower_packing_builtins(exec_list *instructions, int op_mask);
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 2d31801..bd2aa3c 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2678,6 +2678,8 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 &ctx->Const.ShaderCompilerOptions[i],
 ctx->Const.NativeIntegers))
 ;
+
+  lower_const_arrays_to_uniforms(prog->_LinkedShaders[i]->ir);
}
 
/* Check and validate stream emissions in geometry shaders */
diff --git a/src/glsl/lower_const_arrays_to_uniforms.cpp 
b/src/glsl/lower_const_arrays_to_uniforms.cpp
new file mode 100644
index 000..b3c0ee2
--- /dev/null
+++ b/src/glsl/lower_const_arrays_to_uniforms.cpp
@@ -0,0 +1,102 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE
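
The instruction-count percentages quoted in the commit message above can be checked with a few lines of arithmetic. This is a minimal sketch; percent_change is a hypothetical helper, and rounding to two decimals is an assumption about how the figures were produced.

```cpp
#include <cmath>

// Relative change from `before` to `after`, in percent, rounded to two
// decimal places.
inline double percent_change(long before, long after)
{
   double pct = 100.0 * static_cast<double>(after - before) /
                static_cast<double>(before);
   return std::round(pct * 100.0) / 100.0;
}
```

For the affected programs, (4186 - 5880) / 5880 is about -0.2881, which matches the reported -28.81%. Across all shared programs the same 1694 saved instructions are only about 0.03% of the total.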

[Mesa-dev] [PATCH v2 1/2] glsl: Add infrastructure for "hidden" uniforms.

2014-10-29 Thread Kenneth Graunke
In the compiler, we'd like to generate implicit uniforms for internal
use.  These should not be visible via the GL uniform introspection API.

To support that, we add a new ir_variable::how_declared value of
ir_var_hidden, and plumb that through to gl_uniform_storage.

v2 (idr): Fix some memory management issues in
move_hidden_uniforms_to_end.  The comment block on the function has more
details.

Signed-off-by: Kenneth Graunke 
Signed-off-by: Ian Romanick 
---
 src/glsl/ir.h  |  6 ++
 src/glsl/ir_uniform.h  |  6 ++
 src/glsl/link_uniforms.cpp | 50 ++
 src/mesa/main/mtypes.h |  1 +
 src/mesa/main/shaderapi.c  |  6 --
 5 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 90c443c..a7c4c6b 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -359,6 +359,12 @@ enum ir_var_declaration_type {
 * re-declared by the shader.
 */
ir_var_declared_implicitly,
+
+   /**
+* Variable is implicitly generated by the compiler and should not be
+* visible via the API.
+*/
+   ir_var_hidden,
 };
 
 /**
diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h
index b9ecf7c..21b5d05 100644
--- a/src/glsl/ir_uniform.h
+++ b/src/glsl/ir_uniform.h
@@ -175,6 +175,12 @@ struct gl_uniform_storage {
 * arrays this is the first element in the array.
 */
unsigned remap_location;
+
+   /**
+* This is a compiler-generated uniform that should not be advertised
+* via the API.
+*/
+   bool hidden;
 };
 
 #ifdef __cplusplus
diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp
index 400e134..de2f6c9 100644
--- a/src/glsl/link_uniforms.cpp
+++ b/src/glsl/link_uniforms.cpp
@@ -585,6 +585,8 @@ private:
   this->uniforms[id].driver_storage = NULL;
   this->uniforms[id].storage = this->values;
   this->uniforms[id].atomic_buffer_index = -1;
+  this->uniforms[id].hidden =
+ current_var->data.how_declared == ir_var_hidden;
   if (this->ubo_block_index != -1) {
 this->uniforms[id].block_index = this->ubo_block_index;
 
@@ -806,6 +808,50 @@ link_set_image_access_qualifiers(struct gl_shader_program 
*prog)
}
 }
 
+/**
+ * Sort the array of uniform storage so that the non-hidden uniforms are first
+ *
+ * This function sorts the list "in place."  This is important because some of
+ * the storage accessible from \c uniforms has \c uniforms as its \c ralloc
+ * context.  If \c uniforms is freed, some other storage will also be freed.
+ */
+static unsigned
+move_hidden_uniforms_to_end(struct gl_shader_program *prog,
+struct gl_uniform_storage *uniforms,
+unsigned num_elements)
+{
+   struct gl_uniform_storage *sorted_uniforms =
+  ralloc_array(prog, struct gl_uniform_storage, num_elements);
+   unsigned hidden_uniforms = 0;
+   unsigned j = 0;
+
+   /* Add the non-hidden uniforms. */
+   for (unsigned i = 0; i < num_elements; i++) {
+  if (!uniforms[i].hidden)
+ sorted_uniforms[j++] = uniforms[i];
+   }
+
+   /* Add and count the hidden uniforms. */
+   for (unsigned i = 0; i < num_elements; i++) {
+  if (uniforms[i].hidden) {
+ sorted_uniforms[j++] = uniforms[i];
+ hidden_uniforms++;
+  }
+   }
+
+   assert(prog->UniformHash != NULL);
+   prog->UniformHash->clear();
+   for (unsigned i = 0; i < num_elements; i++) {
+  if (sorted_uniforms[i].name != NULL)
+ prog->UniformHash->put(i, sorted_uniforms[i].name);
+   }
+
+   memcpy(uniforms, sorted_uniforms, sizeof(uniforms[0]) * num_elements);
+   ralloc_free(sorted_uniforms);
+
+   return hidden_uniforms;
+}
+
 void
 link_assign_uniform_locations(struct gl_shader_program *prog,
   unsigned int boolean_true)
@@ -926,6 +972,9 @@ link_assign_uniform_locations(struct gl_shader_program 
*prog,
  sizeof(prog->_LinkedShaders[i]->SamplerTargets));
}
 
+   const unsigned hidden_uniforms =
+  move_hidden_uniforms_to_end(prog, uniforms, num_user_uniforms);
+
/* Reserve all the explicit locations of the active uniforms. */
for (unsigned i = 0; i < num_user_uniforms; i++) {
   if (uniforms[i].remap_location != UNMAPPED_UNIFORM_LOC) {
@@ -978,6 +1027,7 @@ link_assign_uniform_locations(struct gl_shader_program 
*prog,
 #endif
 
prog->NumUserUniformStorage = num_user_uniforms;
+   prog->NumHiddenUniforms = hidden_uniforms;
prog->UniformStorage = uniforms;
 
link_set_image_access_qualifiers(prog);
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 35f5f69..7583f2c 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2843,6 +2843,7 @@ struct gl_shader_program
 
/* post-link info: */
unsigned NumUserUniformStorage;
+   unsigned NumHiddenUniforms;
struct gl_uniform_storage *UniformStorage;
 
/**
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 2be9092..6657820 
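
The reordering done by move_hidden_uniforms_to_end is a stable partition: visible uniforms first, hidden ones last, with relative order kept inside each group. The sketch below shows that idea with std::stable_partition rather than the temporary array and memcpy used in the patch. uniform_entry and move_hidden_to_end are hypothetical simplifications, and the hash-table rebuild from the patch is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for gl_uniform_storage.
struct uniform_entry {
   std::string name;
   bool hidden;
};

// Reorder in place so that non-hidden uniforms come first, keeping the
// relative order within each group.  Returns the number of hidden
// uniforms, which now occupy the tail of the vector.
std::size_t move_hidden_to_end(std::vector<uniform_entry> &uniforms)
{
   auto first_hidden = std::stable_partition(
      uniforms.begin(), uniforms.end(),
      [](const uniform_entry &u) { return !u.hidden; });
   return static_cast<std::size_t>(uniforms.end() - first_hidden);
}
```

Because the indices of entries change, anything keyed by index (such as the name-to-index hash in the patch) has to be rebuilt afterwards.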

[Mesa-dev] [prefix=PATCH v3 1/3] util: Add a bitcount.h file and move stuff from both mesa and gallium to it

2014-10-29 Thread Jason Ekstrand
---
 configure.ac   |   1 +
 scons/gallium.py   |   2 +
 src/gallium/auxiliary/tgsi/tgsi_exec.c |   1 +
 src/gallium/auxiliary/tgsi/tgsi_scan.c |   2 +-
 src/gallium/auxiliary/util/u_helpers.c |   1 +
 src/gallium/auxiliary/util/u_math.h| 118 -
 src/gallium/auxiliary/util/u_vbuf.c|   1 +
 src/gallium/drivers/i915/i915_state_emit.c |   1 +
 src/gallium/drivers/ilo/ilo_shader.c   |   1 +
 src/gallium/drivers/ilo/ilo_state.c|   1 +
 src/gallium/drivers/llvmpipe/lp_rast_tri.c |   1 +
 src/gallium/drivers/llvmpipe/lp_setup_tri.c|   1 +
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   |   5 +-
 .../drivers/nouveau/codegen/nv50_ir_util.cpp   |   1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |   1 +
 .../drivers/nouveau/nv50/nv50_shader_state.c   |   1 +
 src/gallium/drivers/r600/evergreen_compute.c   |   1 +
 src/gallium/drivers/r600/r600_blit.c   |   1 +
 src/gallium/drivers/r600/r600_state_common.c   |   1 +
 src/gallium/drivers/radeon/r600_streamout.c|   1 +
 src/gallium/drivers/radeonsi/si_descriptors.c  |   1 +
 src/gallium/drivers/radeonsi/si_state_draw.c   |   3 +-
 src/gallium/drivers/softpipe/sp_quad_fs.c  |   1 +
 src/gallium/state_trackers/clover/api/memory.cpp   |   1 +
 src/gallium/state_trackers/glx/xlib/glx_api.c  |   6 +-
 src/gallium/state_trackers/glx/xlib/xm_api.c   |  10 +-
 src/mesa/drivers/common/meta.c |   3 +-
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   |   4 +-
 src/mesa/drivers/dri/i965/brw_curbe.c  |   2 +-
 src/mesa/drivers/dri/i965/brw_draw.c   |   6 +-
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  12 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp   |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp |   2 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |   2 +-
 src/mesa/drivers/dri/i965/brw_wm.c |   4 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c   |   2 +-
 src/mesa/drivers/x11/fakeglx.c |   6 +-
 src/mesa/drivers/x11/xm_api.c  |  16 +-
 src/mesa/main/bitset.h |   1 +
 src/mesa/main/buffers.c|   6 +-
 src/mesa/main/imports.c|  88 -
 src/mesa/main/imports.h|  54 +-
 src/mesa/program/program_parse.y   |   2 +-
 src/util/bitcount.h| 196 +
 44 files changed, 266 insertions(+), 307 deletions(-)
 create mode 100644 src/util/bitcount.h

diff --git a/configure.ac b/configure.ac
index 03f1bca..be673da 100644
--- a/configure.ac
+++ b/configure.ac
@@ -130,6 +130,7 @@ fi
 dnl Check for compiler builtins
 AX_GCC_BUILTIN([__builtin_bswap32])
 AX_GCC_BUILTIN([__builtin_bswap64])
+AX_GCC_BUILTIN([__builtin_clrsb])
 AX_GCC_BUILTIN([__builtin_clz])
 AX_GCC_BUILTIN([__builtin_clzll])
 AX_GCC_BUILTIN([__builtin_ctz])
diff --git a/scons/gallium.py b/scons/gallium.py
index dd5ca56..2eb6e91 100755
--- a/scons/gallium.py
+++ b/scons/gallium.py
@@ -606,6 +606,8 @@ def generate(env):
 ]
 if distutils.version.LooseVersion(ccversion) >= 
distutils.version.LooseVersion('4.5'):
 cppdefines += ['HAVE___BUILTIN_UNREACHABLE']
+if distutils.version.LooseVersion(ccversion) >= 
distutils.version.LooseVersion('4.7'):
+cppdefines += ['HAVE___BUILTIN_CLRSB']
 
 # Load tools
 env.Tool('lex')
diff --git a/src/gallium/auxiliary/tgsi/tgsi_exec.c 
b/src/gallium/auxiliary/tgsi/tgsi_exec.c
index 7794801..d5830b0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_exec.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_exec.c
@@ -60,6 +60,7 @@
 #include "tgsi_exec.h"
 #include "util/u_memory.h"
 #include "util/u_math.h"
+#include "util/bitcount.h"
 
 
 #define DEBUG_EXECUTION 0
diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c 
b/src/gallium/auxiliary/tgsi/tgsi_scan.c
index 42bc61e..b87a7b0 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
@@ -41,7 +41,7 @@
 #include "tgsi/tgsi_parse.h"
 #include "tgsi/tgsi_util.h"
 #include "tgsi/tgsi_scan.h"
-
+#include "util/bitcount.h"
 
 
 
diff --git a/src/gallium/auxiliary/util/u_helpers.c 
b/src/gallium/auxiliary/util/u_helpers.c
index ac1edcd..f8df4b9 100644
--- a/src/gallium/auxiliary/util/u_helpers.c
+++ b/src/gallium/auxiliary/util/u_helpers.c
@@ -27,6 +27,7 @@
 
 #include "util/u_helpers.h"
 #include "util/u_inlines.h"
+#include "util/bitcount.h"
 
 /**
  * This function is used to copy an array of pipe_vertex_buffer structures,
diff --git a/src/gallium/auxiliary/util/u_math.h 
b/src/gallium/auxiliary/util/u_math.h
index 0113fb1..6004e96 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/s

[Mesa-dev] [prefix=PATCH v3 3/3] util: Move bitset to the util/ folder

2014-10-29 Thread Jason Ekstrand
---
 .../drivers/dri/i965/brw_fs_copy_propagation.cpp   |   2 +-
 src/mesa/drivers/dri/i965/brw_fs_live_variables.h  |   2 +-
 .../drivers/dri/i965/brw_performance_monitor.c |   2 +-
 .../drivers/dri/i965/brw_vec4_live_variables.h |   2 +-
 src/mesa/drivers/dri/nouveau/nouveau_context.h |   2 +-
 src/mesa/main/bitset.h | 101 -
 src/mesa/main/performance_monitor.c|   2 +-
 src/mesa/main/texstate.c   |   2 +-
 src/util/bitset.h  | 100 
 src/util/register_allocate.c   |   2 +-
 10 files changed, 108 insertions(+), 109 deletions(-)
 delete mode 100644 src/mesa/main/bitset.h
 create mode 100644 src/util/bitset.h

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index e1989cb..1a97153 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -34,7 +34,7 @@
 
 #define ACP_HASH_SIZE 16
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 #include "brw_fs.h"
 #include "brw_cfg.h"
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
index 6cc8a98..d5f883d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_fs_live_variables.h
@@ -26,7 +26,7 @@
  */
 
 #include "brw_fs.h"
-#include "main/bitset.h"
+#include "util/bitset.h"
 
 struct cfg_t;
 
diff --git a/src/mesa/drivers/dri/i965/brw_performance_monitor.c 
b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
index edfa3d2..c174c81 100644
--- a/src/mesa/drivers/dri/i965/brw_performance_monitor.c
+++ b/src/mesa/drivers/dri/i965/brw_performance_monitor.c
@@ -44,7 +44,7 @@
 
 #include 
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 #include "main/hash.h"
 #include "main/macros.h"
 #include "main/mtypes.h"
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h 
b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
index 03cc813..b50a36a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h
@@ -25,7 +25,7 @@
  *
  */
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 #include "brw_vec4.h"
 
 namespace brw {
diff --git a/src/mesa/drivers/dri/nouveau/nouveau_context.h 
b/src/mesa/drivers/dri/nouveau/nouveau_context.h
index 8ea431b..b6cbde4 100644
--- a/src/mesa/drivers/dri/nouveau/nouveau_context.h
+++ b/src/mesa/drivers/dri/nouveau/nouveau_context.h
@@ -32,7 +32,7 @@
 #include "nouveau_scratch.h"
 #include "nouveau_render.h"
 
-#include "main/bitset.h"
+#include "util/bitset.h"
 
 enum nouveau_fallback {
HWTNL = 0,
diff --git a/src/mesa/main/bitset.h b/src/mesa/main/bitset.h
deleted file mode 100644
index dbf1af9..000
--- a/src/mesa/main/bitset.h
+++ /dev/null
@@ -1,101 +0,0 @@
-/*
- * Mesa 3-D graphics library
- *
- * Copyright (C) 2006  Brian Paul   All Rights Reserved.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included
- * in all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
- * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- */
-
-/**
- * \file bitset.h
- * \brief Bitset of arbitrary size definitions.
- * \author Michal Krol
- */
-
-#ifndef BITSET_H
-#define BITSET_H
-
-#include "imports.h"
-#include "util/bitcount.h"
-#include "util/macros.h"
-
-/
- * generic bitset implementation
- */
-
-#define BITSET_WORD GLuint
-#define BITSET_WORDBITS (sizeof (BITSET_WORD) * 8)
-
-/* bitset declarations
- */
-#define BITSET_WORDS(bits) (ALIGN(bits, BITSET_WORDBITS) / BITSET_WORDBITS)
-#define BITSET_DECLARE(name, bits) BITSET_WORD name[BITSET_WORDS(bits)]
-
-/* bitset operations
- */
-#define BITSET_COPY(x, y) memcpy( (x), (y), sizeof (x) )
-#define BITSET_EQUAL(x, y) (memcmp( (x), (y), sizeof (x) ) == 0)
-#define BITSET_ZERO(x) memset( (x), 0, sizeof (x) )
-#de

[Mesa-dev] [prefix=PATCH v3 2/3] util: Move ALIGN from mesa/main/macros.h to util/macros.h

2014-10-29 Thread Jason Ekstrand
---
 src/mesa/main/bitset.h |  1 +
 src/mesa/main/macros.h | 27 ---
 src/util/macros.h  | 27 +++
 3 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/src/mesa/main/bitset.h b/src/mesa/main/bitset.h
index f50b14f..dbf1af9 100644
--- a/src/mesa/main/bitset.h
+++ b/src/mesa/main/bitset.h
@@ -33,6 +33,7 @@
 
 #include "imports.h"
 #include "util/bitcount.h"
+#include "util/macros.h"
 
 /
  * generic bitset implementation
diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
index cd5f2d6..33cc583 100644
--- a/src/mesa/main/macros.h
+++ b/src/mesa/main/macros.h
@@ -702,33 +702,6 @@ is_power_of_two(unsigned value)
return (value & (value - 1)) == 0;
 }
 
-/**
- * Align a value up to an alignment value
- *
- * If \c value is not already aligned to the requested alignment value, it
- * will be rounded up.
- *
- * \param value  Value to be rounded
- * \param alignment  Alignment value to be used.  This must be a power of two.
- *
- * \sa ROUND_DOWN_TO()
- */
-#define ALIGN(value, alignment)  (((value) + (alignment) - 1) & ~((alignment) - 1))
-
-/**
- * Align a value down to an alignment value
- *
- * If \c value is not already aligned to the requested alignment value, it
- * will be rounded down.
- *
- * \param value  Value to be rounded
- * \param alignment  Alignment value to be used.  This must be a power of two.
- *
- * \sa ALIGN()
- */
-#define ROUND_DOWN_TO(value, alignment) ((value) & ~(alignment - 1))
-
-
 /** Cross product of two 3-element vectors */
 static inline void
 CROSS3(GLfloat n[3], const GLfloat u[3], const GLfloat v[3])
diff --git a/src/util/macros.h b/src/util/macros.h
index ff37a7d..1b750b6 100644
--- a/src/util/macros.h
+++ b/src/util/macros.h
@@ -31,6 +31,33 @@
 
 
 /**
+ * Align a value up to an alignment value
+ *
+ * If \c value is not already aligned to the requested alignment value, it
+ * will be rounded up.
+ *
+ * \param value  Value to be rounded
+ * \param alignment  Alignment value to be used.  This must be a power of two.
+ *
+ * \sa ROUND_DOWN_TO()
+ */
+#define ALIGN(value, alignment)  (((value) + (alignment) - 1) & ~((alignment) - 1))
+
+/**
+ * Align a value down to an alignment value
+ *
+ * If \c value is not already aligned to the requested alignment value, it
+ * will be rounded down.
+ *
+ * \param value  Value to be rounded
+ * \param alignment  Alignment value to be used.  This must be a power of two.
+ *
+ * \sa ALIGN()
+ */
+#define ROUND_DOWN_TO(value, alignment) ((value) & ~(alignment - 1))
+
+
+/**
  * __builtin_expect macros
  */
 #if !defined(HAVE___BUILTIN_EXPECT)
-- 
2.1.0
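
As a worked example of what the two moved macros compute, here is a function-style sketch of the same bit tricks. align_up and round_down_to are hypothetical names, and, as the macro comments say, the alignment is assumed to be a power of two.

```cpp
#include <cstdint>

// Round `value` up to the next multiple of `alignment` (power of two).
constexpr uint32_t align_up(uint32_t value, uint32_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}

// Round `value` down to the previous multiple of `alignment`
// (power of two).
constexpr uint32_t round_down_to(uint32_t value, uint32_t alignment)
{
   return value & ~(alignment - 1);
}

static_assert(align_up(13, 8) == 16, "rounds up to the next multiple");
static_assert(align_up(16, 8) == 16, "aligned values are unchanged");
static_assert(round_down_to(13, 8) == 8, "rounds down to a multiple");
```

Using functions also sidesteps the fact that ROUND_DOWN_TO does not parenthesize its alignment argument, which matters only if an expression is passed in.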



Re: [Mesa-dev] [PATCH 02/14] i965: Generalize fs_generator further

2014-10-29 Thread Kristian Høgsberg
On Tue, Oct 28, 2014 at 5:30 PM, Matt Turner  wrote:
> On Tue, Oct 28, 2014 at 3:59 PM, Matt Turner  wrote:
 -   assert(stage == MESA_SHADER_FRAGMENT);
>>>
>>> I like removing these asserts from the function bodies, but I'm
>>> confused why you're doing it. The VS isn't going to call
>>> fire_fb_write, or emit a derivative instruction.
>>
>> Oh, you're actually removing the stage member entirely. Isn't that
>> useful to have?
>>
>> Maybe I need to keep reading the series..
>
> Okay, still don't understand. It seems like your objective in patch 3
> is to be able to print things other than "fragment" under various
> INTEL_DEBUG=... settings. Wouldn't it be simpler to contain that logic
> in the generator using the stage member -- which would be nice to have
> for assertions anyway?

What I was trying to do here was to make the generator independent of
the shader stage.  It receives a cfg_t and spits out assembly and
doesn't care about what stage it may be.  It simplifies the generator
a little and passing in the debug name means that the generator
doesn't have to know about blorp, for example.  I think my biggest
problem with losing the asserts is that they protect casts of
prog_data to brw_wm_prog_data (except for the one in
generate_assembly), but those casts only happens in fs specific
opcodes.

Either way, I'm not attached to this change, we can put stage back in.

Kristian


[Mesa-dev] [PATCH 1/6] glsl: Move common code to constant_util

2014-10-29 Thread Thomas Helland
This will be used later on in opt_minmax

Signed-off-by: Thomas Helland 
---
 src/glsl/ir_constant_util.h | 103 
 src/glsl/opt_algebraic.cpp  |  95 ++--
 src/glsl/opt_minmax.cpp |  19 ++--
 3 files changed, 109 insertions(+), 108 deletions(-)
 create mode 100644 src/glsl/ir_constant_util.h

diff --git a/src/glsl/ir_constant_util.h b/src/glsl/ir_constant_util.h
new file mode 100644
index 000..b3b9a19
--- /dev/null
+++ b/src/glsl/ir_constant_util.h
@@ -0,0 +1,103 @@
+/*
+ * ir_constant_util.h
+ *
+ *  Created on: 13. okt. 2014
+ *  Author: helland
+ */
+
+#ifndef IR_CONSTANT_UTIL_H_
+#define IR_CONSTANT_UTIL_H_
+
+#include "main/macros.h"
+#include "ir_builder.h"
+#include "program/prog_instruction.h"
+
+using namespace ir_builder;
+
+/* When eliminating an expression and just returning one of its operands,
+ * we may need to swizzle that operand out to a vector if the expression was
+ * vector type.
+ */
+static ir_rvalue *
+swizzle_if_required(ir_expression *expr,
+ ir_rvalue *operand)
+{
+   if (expr->type->is_vector() && operand->type->is_scalar()) {
+  return swizzle(operand, SWIZZLE_XXXX, expr->type->vector_elements);
+   } else
+  return operand;
+}
+
+static inline bool
+is_vec_zero(ir_constant *ir)
+{
+   return (ir == NULL) ? false : ir->is_zero();
+}
+
+static inline bool
+is_vec_one(ir_constant *ir)
+{
+   return (ir == NULL) ? false : ir->is_one();
+}
+
+static inline bool
+is_vec_two(ir_constant *ir)
+{
+   return (ir == NULL) ? false : ir->is_value(2.0, 2);
+}
+
+static inline bool
+is_vec_negative_one(ir_constant *ir)
+{
+   return (ir == NULL) ? false : ir->is_negative_one();
+}
+
+static inline bool
+is_vec_basis(ir_constant *ir)
+{
+   return (ir == NULL) ? false : ir->is_basis();
+}
+
+static inline bool
+is_valid_vec_const(ir_constant *ir)
+{
+   if (ir == NULL)
+  return false;
+
+   if (!ir->type->is_scalar() && !ir->type->is_vector())
+  return false;
+
+   return true;
+}
+
+static inline bool
+is_less_than_one(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c < ir->type->vector_elements; c++) {
+  if (ir->get_float_component(c) < 1.0f)
+ component++;
+   }
+
+   return (component == ir->type->vector_elements);
+}
+
+static inline bool
+is_greater_than_zero(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c < ir->type->vector_elements; c++) {
+  if (ir->get_float_component(c) > 0.0f)
+ component++;
+   }
+
+   return (component == ir->type->vector_elements);
+}
+
+#endif /* IR_CONSTANT_UTIL_H_ */
diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 0cdb8ec..8392017 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -29,13 +29,13 @@
  */
 
 #include "ir.h"
-#include "ir_visitor.h"
+//#include "ir_visitor.h"
 #include "ir_rvalue_visitor.h"
 #include "ir_optimization.h"
-#include "ir_builder.h"
 #include "glsl_types.h"
+#include "ir_constant_util.h"
+
 
-using namespace ir_builder;
 
 namespace {
 
@@ -68,8 +68,6 @@ public:
 int op1,
 ir_expression *ir2,
 int op2);
-   ir_rvalue *swizzle_if_required(ir_expression *expr,
- ir_rvalue *operand);
 
const struct gl_shader_compiler_options *options;
void *mem_ctx;
@@ -80,78 +78,6 @@ public:
 
 } /* unnamed namespace */
 
-static inline bool
-is_vec_zero(ir_constant *ir)
-{
-   return (ir == NULL) ? false : ir->is_zero();
-}
-
-static inline bool
-is_vec_one(ir_constant *ir)
-{
-   return (ir == NULL) ? false : ir->is_one();
-}
-
-static inline bool
-is_vec_two(ir_constant *ir)
-{
-   return (ir == NULL) ? false : ir->is_value(2.0, 2);
-}
-
-static inline bool
-is_vec_negative_one(ir_constant *ir)
-{
-   return (ir == NULL) ? false : ir->is_negative_one();
-}
-
-static inline bool
-is_vec_basis(ir_constant *ir)
-{
-   return (ir == NULL) ? false : ir->is_basis();
-}
-
-static inline bool
-is_valid_vec_const(ir_constant *ir)
-{
-   if (ir == NULL)
-  return false;
-
-   if (!ir->type->is_scalar() && !ir->type->is_vector())
-  return false;
-
-   return true;
-}
-
-static inline bool
-is_less_than_one(ir_constant *ir)
-{
-   if (!is_valid_vec_const(ir))
-  return false;
-
-   unsigned component = 0;
-   for (int c = 0; c < ir->type->vector_elements; c++) {
-  if (ir->get_float_component(c) < 1.0f)
- component++;
-   }
-
-   return (component == ir->type->vector_elements);
-}
-
-static inline bool
-is_greater_than_zero(ir_constant *ir)
-{
-   if (!is_valid_vec_const(ir))
-  return false;
-
-   unsigned component = 0;
-   for (int c = 0; c < ir->type->vector_elements; c++) {
-  if (ir->get_float_component(c) > 0.0f)
- component++;
-   }
-
-   

[Mesa-dev] [PATCH 0/6][RFC] glsl: Expand opt_minmax get_range

2014-10-29 Thread Thomas Helland
This series does some initial work to make expanding
the get_range function a lot cleaner.
It also adds a couple of simple initial ranges.
These patches are by no means perfect, but I hope
they will provide some feedback and ideas.
I'm hoping to expand this to do the following:
  -Add get_range for most opcodes I can think of
  -Add more utility functions to the constant_util file.
  -Repurpose the file to optimize more than just min/max.
  -Eliminate ifs whose result we already know
  -Whatever pops into my head

I have some questions about undefined behaviour regarding this.
Do we have any way of signaling in our IR that
the variable is the result of undefined behaviour?

In compilers like LLVM, if I recall correctly, there is a flag for this
so they can signal undefined behaviour and use whatever value
gives the most efficient code for its uses (used in -ffast-math).

A hypothetical situation:
We find that we have sqrt(x) where x has an upper bound < 0.
The spec says the behavior is undefined for x < 0.
The same applies for inverse sqrt, log, log2 and pow.
How should this be handled?
Should a warning be issued?
Could we simplify this to a constant 0?
That would allow more optimizations to occur.

Thomas Helland (6):
  glsl: Move common code to constant_util
  glsl: Expand constant_util
  glsl: Change to using switch-case in get_range
  glsl: Expand get_range to include sin/cos/sign
  glsl: Add saturate to get_range
  glsl: Add abs/sqrt/exp to get_range

 src/glsl/ir_constant_util.h | 134 
 src/glsl/opt_algebraic.cpp  |  95 +--
 src/glsl/opt_minmax.cpp |  73 +---
 3 files changed, 189 insertions(+), 113 deletions(-)
 create mode 100644 src/glsl/ir_constant_util.h

-- 
2.0.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] glsl: Add abs/sqrt/exp to get_range

2014-10-29 Thread Thomas Helland
All of these are guaranteed to be greater than or equal to 0
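
For illustration only, here is a minimal standalone sketch of what a
range with just a lower bound allows in principle. It is not the
opt_minmax code: Range, non_negative_range and min_folds_to_constant
are hypothetical names, and plain floats stand in for ir_constant
vectors.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for minmax_range: an empty bound means "unknown".
struct Range {
   std::optional<float> low;
   std::optional<float> high;
};

// abs, sqrt, rsq, exp and exp2 never go below zero; no upper bound is known.
inline Range non_negative_range()
{
   return Range{0.0f, std::nullopt};
}

// min(expr, c) always yields c when c is at most the lower bound of expr.
inline bool min_folds_to_constant(const Range &expr, float c)
{
   return expr.low.has_value() && c <= *expr.low;
}
```

With this, min(abs(x), -1.0) could be replaced by -1.0, while
min(abs(x), 2.0) has to stay as it is.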

Signed-off-by: Thomas Helland 
---
 src/glsl/opt_minmax.cpp | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/glsl/opt_minmax.cpp b/src/glsl/opt_minmax.cpp
index 4088c80..e768857 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -307,6 +307,14 @@ get_range(ir_rvalue *rval)
 high = r0.high;
  return minmax_range(low, high);
 
+  case ir_unop_abs:
+  case ir_unop_sqrt:
+  case ir_unop_rsq:
+  case ir_unop_exp:
+  case ir_unop_exp2:
+ low = new(mem_ctx) ir_constant(0.0f);
+ return minmax_range(low, NULL);
+
   default:
  break;
   }
-- 
2.0.3



[Mesa-dev] [PATCH 5/6] glsl: Add saturate to get_range

2014-10-29 Thread Thomas Helland
Also, if the operand has bounds between 0.0 and 1.0
then copy that range up.
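
The tightening described above can be sketched standalone, assuming
plain floats instead of ir_constant vectors. Range and saturate_range
are hypothetical names used only for this illustration; the strict
comparisons are meant to mirror the "> EQUAL" and "< EQUAL" checks in
the patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for minmax_range: an empty bound means "unknown".
struct Range {
   std::optional<float> low;
   std::optional<float> high;
};

// Range of saturate(x), given whatever is known about the operand x.
inline Range saturate_range(const Range &operand)
{
   float low = 0.0f;
   float high = 1.0f;

   // A known lower bound strictly inside (0, 1) tightens the result.
   if (operand.low && *operand.low > low && *operand.low < high)
      low = *operand.low;

   // Likewise for a known upper bound strictly inside (low, 1).
   if (operand.high && *operand.high > low && *operand.high < high)
      high = *operand.high;

   return Range{low, high};
}
```

An operand known to lie in [0.25, 0.75] keeps that range, while an
operand with unknown or wider bounds simply yields [0.0, 1.0].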

Signed-off-by: Thomas Helland 
---
 src/glsl/opt_minmax.cpp | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/glsl/opt_minmax.cpp b/src/glsl/opt_minmax.cpp
index 0b9ddc2..4088c80 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -293,6 +293,20 @@ get_range(ir_rvalue *rval)
  low = new(mem_ctx) ir_constant(-1.0f);
  return minmax_range(low, high);
 
+  case ir_unop_saturate:
+ high = new(mem_ctx) ir_constant(1.0f);
+ low = new(mem_ctx) ir_constant(0.0f);
+ r0 = get_range(expr->operands[0]);
+ // An operand lower bound between 0.0 and 1.0 gives us a new lower bound
+ if (r0.low && compare_components(r0.low, low) > EQUAL &&
+   compare_components(r0.low, high) < EQUAL)
+low = r0.low;
+ // An operand upper bound between 0.0 and 1.0 gives us a new upper bound
+ if (r0.high && compare_components(r0.high, low) > EQUAL &&
+compare_components(r0.high, high) < EQUAL)
+high = r0.high;
+ return minmax_range(low, high);
+
   default:
  break;
   }
-- 
2.0.3



[Mesa-dev] [PATCH 4/6] glsl: Expand get_range to include sin/cos/sign

2014-10-29 Thread Thomas Helland
This gets rid of extra instructions in some
shaders I purposefully wrote to test this.
Works for shaders similar to the following:

vec3 c = {8, 8, 8};
gl_FragColor.rgb = max(sin(d), c);
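
Why this folds can be shown with a minimal standalone sketch. This is
not the opt_minmax code: Range, sin_range and max_folds_to_constant
are hypothetical names, and a scalar stands in for the vec3 constant.
Since sin(d) never exceeds 1.0, max(sin(d), 8.0) is always 8.0.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for minmax_range: an empty bound means "unknown".
struct Range {
   std::optional<float> low;
   std::optional<float> high;
};

// sin, cos and sign all produce values in [-1, 1].
inline Range sin_range()
{
   return Range{-1.0f, 1.0f};
}

// max(expr, c) always yields c when c is at least the upper bound of expr.
inline bool max_folds_to_constant(const Range &expr, float c)
{
   return expr.high.has_value() && c >= *expr.high;
}
```

A constant of 0.5 would not fold, because sin(d) may be larger than it.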

Signed-off-by: Thomas Helland 
---
 src/glsl/opt_minmax.cpp | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/glsl/opt_minmax.cpp b/src/glsl/opt_minmax.cpp
index b21daca..0b9ddc2 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -272,6 +272,10 @@ get_range(ir_rvalue *rval)
minmax_range r0;
minmax_range r1;
 
+   void *mem_ctx = ralloc_parent(rval);
+   ir_constant *low;
+   ir_constant *high;
+
if(expr) {
   switch(expr->operation) {
   case ir_binop_min:
@@ -279,6 +283,19 @@ get_range(ir_rvalue *rval)
  r0 = get_range(expr->operands[0]);
  r1 = get_range(expr->operands[1]);
  return combine_range(r0, r1, expr->operation == ir_binop_min);
+
+  case ir_unop_sin:
+  case ir_unop_sin_reduced:
+  case ir_unop_cos:
+  case ir_unop_cos_reduced:
+  case ir_unop_sign:
+ high = new(mem_ctx) ir_constant(1.0f);
+ low = new(mem_ctx) ir_constant(-1.0f);
+ return minmax_range(low, high);
+
+  default:
+ break;
+  }
}
 
ir_constant *c = rval->as_constant();
-- 
2.0.3



[Mesa-dev] [PATCH 2/6] glsl: Expand constant_util

2014-10-29 Thread Thomas Helland
Add functions for is_greater_than_one
and is_less_than_zero

Signed-off-by: Thomas Helland 
---
 src/glsl/ir_constant_util.h | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/src/glsl/ir_constant_util.h b/src/glsl/ir_constant_util.h
index b3b9a19..9dae974 100644
--- a/src/glsl/ir_constant_util.h
+++ b/src/glsl/ir_constant_util.h
@@ -86,6 +86,21 @@ is_less_than_one(ir_constant *ir)
 }
 
 static inline bool
+is_greater_than_one(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c < ir->type->vector_elements; c++) {
+  if (ir->get_float_component(c) > 1.0f)
+ component++;
+   }
+
+   return (component == ir->type->vector_elements);
+}
+
+static inline bool
 is_greater_than_zero(ir_constant *ir)
 {
if (!is_valid_vec_const(ir))
@@ -100,4 +115,20 @@ is_greater_than_zero(ir_constant *ir)
return (component == ir->type->vector_elements);
 }
 
+static inline bool
+is_less_than_zero(ir_constant *ir)
+{
+   if (!is_valid_vec_const(ir))
+  return false;
+
+   unsigned component = 0;
+   for (int c = 0; c < ir->type->vector_elements; c++) {
+  if (ir->get_float_component(c) < 0.0f)
+ component++;
+   }
+
+   return (component == ir->type->vector_elements);
+}
+
+
 #endif /* IR_CONSTANT_UTIL_H_ */
-- 
2.0.3



[Mesa-dev] [PATCH 3/6] glsl: Change to using switch-case in get_range

2014-10-29 Thread Thomas Helland
This will make expansion easier and less cluttered.

Signed-off-by: Thomas Helland 
---
 src/glsl/opt_minmax.cpp | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/glsl/opt_minmax.cpp b/src/glsl/opt_minmax.cpp
index e4141bc..b21daca 100644
--- a/src/glsl/opt_minmax.cpp
+++ b/src/glsl/opt_minmax.cpp
@@ -269,11 +269,16 @@ static minmax_range
 get_range(ir_rvalue *rval)
 {
ir_expression *expr = rval->as_expression();
-   if (expr && (expr->operation == ir_binop_min ||
-expr->operation == ir_binop_max)) {
-  minmax_range r0 = get_range(expr->operands[0]);
-  minmax_range r1 = get_range(expr->operands[1]);
-  return combine_range(r0, r1, expr->operation == ir_binop_min);
+   minmax_range r0;
+   minmax_range r1;
+
+   if(expr) {
+  switch(expr->operation) {
+  case ir_binop_min:
+  case ir_binop_max:
+ r0 = get_range(expr->operands[0]);
+ r1 = get_range(expr->operands[1]);
+ return combine_range(r0, r1, expr->operation == ir_binop_min);
}
 
ir_constant *c = rval->as_constant();
-- 
2.0.3



Re: [Mesa-dev] [PATCH 1/6] glsl: Move common code to constant_util

2014-10-29 Thread Matt Turner
On Wed, Oct 29, 2014 at 6:11 PM, Thomas Helland
 wrote:
> This will be used later on in opt_minmax
>
> Signed-off-by: Thomas Helland 
> ---
>  src/glsl/ir_constant_util.h | 103
>  src/glsl/opt_algebraic.cpp  |  95 ++--
>  src/glsl/opt_minmax.cpp |  19 ++--
>  3 files changed, 109 insertions(+), 108 deletions(-)
>  create mode 100644 src/glsl/ir_constant_util.h
>
> diff --git a/src/glsl/ir_constant_util.h b/src/glsl/ir_constant_util.h
> new file mode 100644
> index 000..b3b9a19
> --- /dev/null
> +++ b/src/glsl/ir_constant_util.h
> @@ -0,0 +1,103 @@
> +/*
> + * ir_constant_util.h
> + *
> + *  Created on: 13. okt. 2014
> + *  Author: helland

The file needs to have a copyright and license header.

We've been trying to get away from author tags, but if you want one,
at least put your whole name and email like the others. I don't see
any value in "Created on".

> + */
> +
> +#ifndef IR_CONSTANT_UTIL_H_
> +#define IR_CONSTANT_UTIL_H_
> +
> +#include "main/macros.h"
> +#include "ir_builder.h"
> +#include "program/prog_instruction.h"
> +
> +using namespace ir_builder;
> +
> +/* When eliminating an expression and just returning one of its operands,
> + * we may need to swizzle that operand out to a vector if the expression was
> + * vector type.
> + */
> +static ir_rvalue *
> +swizzle_if_required(ir_expression *expr,
> + ir_rvalue *operand)
> +{
> +   if (expr->type->is_vector() && operand->type->is_scalar()) {
> +  return swizzle(operand, SWIZZLE_, expr->type->vector_elements);
> +   } else
> +  return operand;
> +}
> +
> +static inline bool
> +is_vec_zero(ir_constant *ir)
> +{
> +   return (ir == NULL) ? false : ir->is_zero();
> +}
> +
> +static inline bool
> +is_vec_one(ir_constant *ir)
> +{
> +   return (ir == NULL) ? false : ir->is_one();
> +}
> +
> +static inline bool
> +is_vec_two(ir_constant *ir)
> +{
> +   return (ir == NULL) ? false : ir->is_value(2.0, 2);
> +}
> +
> +static inline bool
> +is_vec_negative_one(ir_constant *ir)
> +{
> +   return (ir == NULL) ? false : ir->is_negative_one();
> +}
> +
> +static inline bool
> +is_vec_basis(ir_constant *ir)
> +{
> +   return (ir == NULL) ? false : ir->is_basis();
> +}
> +
> +static inline bool
> +is_valid_vec_const(ir_constant *ir)
> +{
> +   if (ir == NULL)
> +  return false;
> +
> +   if (!ir->type->is_scalar() && !ir->type->is_vector())
> +  return false;
> +
> +   return true;
> +}
> +
> +static inline bool
> +is_less_than_one(ir_constant *ir)
> +{
> +   if (!is_valid_vec_const(ir))
> +  return false;
> +
> +   unsigned component = 0;
> +   for (int c = 0; c < ir->type->vector_elements; c++) {
> +  if (ir->get_float_component(c) < 1.0f)
> + component++;
> +   }
> +
> +   return (component == ir->type->vector_elements);
> +}
> +
> +static inline bool
> +is_greater_than_zero(ir_constant *ir)
> +{
> +   if (!is_valid_vec_const(ir))
> +  return false;
> +
> +   unsigned component = 0;
> +   for (int c = 0; c < ir->type->vector_elements; c++) {
> +  if (ir->get_float_component(c) > 0.0f)
> + component++;
> +   }
> +
> +   return (component == ir->type->vector_elements);
> +}
> +
> +#endif /* IR_CONSTANT_UTIL_H_ */
> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
> index 0cdb8ec..8392017 100644
> --- a/src/glsl/opt_algebraic.cpp
> +++ b/src/glsl/opt_algebraic.cpp
> @@ -29,13 +29,13 @@
>   */
>
>  #include "ir.h"
> -#include "ir_visitor.h"
> +//#include "ir_visitor.h"

Presumably you meant to just delete this line.

>  #include "ir_rvalue_visitor.h"
>  #include "ir_optimization.h"
> -#include "ir_builder.h"
>  #include "glsl_types.h"
> +#include "ir_constant_util.h"
> +

Extra new line.

>
> -using namespace ir_builder;
>
>  namespace {
>
> @@ -68,8 +68,6 @@ public:
>  int op1,
>  ir_expression *ir2,
>  int op2);
> -   ir_rvalue *swizzle_if_required(ir_expression *expr,
> - ir_rvalue *operand);
>
> const struct gl_shader_compiler_options *options;
> void *mem_ctx;
> @@ -80,78 +78,6 @@ public:
>
>  } /* unnamed namespace */
>
> -static inline bool
> -is_vec_zero(ir_constant *ir)
> -{
> -   return (ir == NULL) ? false : ir->is_zero();
> -}
> -
> -static inline bool
> -is_vec_one(ir_constant *ir)
> -{
> -   return (ir == NULL) ? false : ir->is_one();
> -}
> -
> -static inline bool
> -is_vec_two(ir_constant *ir)
> -{
> -   return (ir == NULL) ? false : ir->is_value(2.0, 2);
> -}
> -
> -static inline bool
> -is_vec_negative_one(ir_constant *ir)
> -{
> -   return (ir == NULL) ? false : ir->is_negative_one();
> -}
> -
> -static inline bool
> -is_vec_basis(ir_constant *ir)
> -{
> -   return (ir == NULL) ? false : ir->is_basis();
> -}
> -
> -static inline bool
> -is_valid_vec_const(ir_constant *ir)
> -{
> -   if (ir == NULL)
> -  return false;
> -
> -   if (!ir-

Re: [Mesa-dev] [PATCH v2 2/2] glsl: Lower constant arrays to uniform arrays.

2014-10-29 Thread Matt Turner
On Wed, Oct 29, 2014 at 5:16 PM, Kenneth Graunke  wrote:
> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
> index 0c55327..6aed52d 100644
> --- a/src/glsl/Makefile.sources
> +++ b/src/glsl/Makefile.sources
> @@ -58,6 +58,7 @@ LIBGLSL_FILES = \
> $(GLSL_SRCDIR)/loop_analysis.cpp \
> $(GLSL_SRCDIR)/loop_controls.cpp \
> $(GLSL_SRCDIR)/loop_unroll.cpp \
> +   $(GLSL_SRCDIR)/lower_const_arrays_to_uniforms.cpp \

Alphabetize!

> $(GLSL_SRCDIR)/lower_clip_distance.cpp \
> $(GLSL_SRCDIR)/lower_discard.cpp \
> $(GLSL_SRCDIR)/lower_discard_flow.cpp \


[Mesa-dev] [PATCH] glsl: Skip loop-too-large heuristic if indexing arrays of a certain size

2014-10-29 Thread Kenneth Graunke
A pattern in certain shaders is:

   uniform vec4 colors[NUM_LIGHTS];

   for (int i = 0; i < NUM_LIGHTS; i++) {
  ...use colors[i]...
   }

In this case, the application author expects the shader compiler to
unroll the loop.  By doing so, it replaces variable indexing of the
array with constant indexing, which is more efficient.

This patch extends the heuristic to see if arrays accessed within the
loop are indexed by an induction variable, and if the array size exactly
matches the number of loop iterations.  If so, the application author
probably intended us to unroll it.  If not, we rely on the existing
loop-too-large heuristic.
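
The resulting decision can be sketched standalone as follows. The
field names are taken from the patch, but LoopCount and skip_unroll
are hypothetical names used only for illustration, and the visitor
that fills in the fields is left out.

```cpp
#include <cassert>

// Hypothetical summary of what the loop_unroll_count visitor collects.
struct LoopCount {
   int nodes;
   bool nested_loop;
   bool unsupported_variable_indexing;
   bool array_indexed_by_induction_var_with_exact_iterations;
};

// Returns true when the loop is left alone: it is too large and neither
// of the two indexing conditions argues for unrolling it anyway.
inline bool skip_unroll(const LoopCount &count, int iterations,
                        int max_iterations)
{
   const bool loop_too_large =
      count.nested_loop || count.nodes * iterations > max_iterations * 5;

   return loop_too_large &&
          !count.unsupported_variable_indexing &&
          !count.array_indexed_by_induction_var_with_exact_iterations;
}
```

A large loop is normally skipped, but it is still unrolled once the
exact-match flag is set.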

Improves performance in a phong shading microbenchmark by 2.88x, and a
shadow mapping microbenchmark by 1.63x.  Without variable indexing, we
can upload the small uniform arrays as push constants instead of pull
constants, avoiding shader memory access.  Affects several games, but
doesn't appear to impact their performance.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/loop_unroll.cpp | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/glsl/loop_unroll.cpp b/src/glsl/loop_unroll.cpp
index ce795f6..635e1dd 100644
--- a/src/glsl/loop_unroll.cpp
+++ b/src/glsl/loop_unroll.cpp
@@ -64,6 +64,7 @@ class loop_unroll_count : public ir_hierarchical_visitor {
 public:
int nodes;
bool unsupported_variable_indexing;
+   bool array_indexed_by_induction_var_with_exact_iterations;
/* If there are nested loops, the node count will be inaccurate. */
bool nested_loop;
 
@@ -74,6 +75,7 @@ public:
   nodes = 0;
   nested_loop = false;
   unsupported_variable_indexing = false;
+  array_indexed_by_induction_var_with_exact_iterations = false;
 
   run(list);
}
@@ -112,6 +114,14 @@ public:
  ir_variable *array = ir->array->variable_referenced();
  loop_variable *lv = ls->get(ir->array_index->variable_referenced());
  if (array && lv && lv->is_induction_var()) {
+/* If an array is indexed by a loop induction variable, and the
+ * array size is exactly the number of loop iterations, this is
+ * probably a simple for-loop trying to access each element in
+ * turn; the application may expect it to be unrolled.
+ */
+if (int(array->type->length) == ls->limiting_terminator->iterations)
+   array_indexed_by_induction_var_with_exact_iterations = true;
+
 switch (array->data.mode) {
 case ir_var_auto:
 case ir_var_temporary:
@@ -314,7 +324,8 @@ loop_unroll_visitor::visit_leave(ir_loop *ir)
bool loop_too_large =
   count.nested_loop || count.nodes * iterations > max_iterations * 5;
 
-   if (loop_too_large && !count.unsupported_variable_indexing)
+   if (loop_too_large && !count.unsupported_variable_indexing &&
+   !count.array_indexed_by_induction_var_with_exact_iterations)
   return visit_continue;
 
/* Note: the limiting terminator contributes 1 to ls->num_loop_jumps.
-- 
2.1.2



Re: [Mesa-dev] [PATCH 3/5] mesa: Handle clip control in meta operations.

2014-10-29 Thread Mathias Fröhlich

Hi Neil,

On Tuesday, October 28, 2014 18:22:33 Neil Roberts wrote:
> On inspection it looks like this would potentially break
> _mesa_meta_Clear when it is using GLSL because that does not save the
> MESA_META_TRANSFORM state.
> 
> I wonder if MESA_META_TRANSFORM is not the right state flag for this
> because all of the other state in it is about fixed-function stuff which
> is irrelevant for shaders. It would be a shame for shader-based meta ops
> to suddenly have to save all of that state too. Maybe it would make more
> sense in MESA_META_VIEWPORT?
That moved to MESA_META_TRANSFORM due to a review request. As I
understood the rationale, it's already connected to the gl_context::Transform
variable by the GL spec, so it belongs in transform for meta operations as well.
Brian, or was there another reason for moving this to MESA_META_TRANSFORM?

Initially I had that saved if either MESA_META_VIEWPORT or
MESA_META_DEPTH_TEST is requested since the origin argument affects
the viewport transform and the depth mode affects the mapping of depth
values into the depth buffer.
At that time I also thought about introducing a completely new
MESA_META_CLIP_CONTROL since clip control does not exactly fit anywhere.
Or, as a third alternative, split out a _mesa_clip_control_origin(origin)
and a _mesa_clip_control_mode(mode) from _mesa_ClipControl(mode, origin)
and use the mesa internal functions in meta.c to save if either
MESA_META_VIEWPORT or MESA_META_DEPTH_TEST was requested.

Greetings

Mathias