Re: [Mesa-dev] [PATCH] mesa: set override_version per api version override

2015-06-10 Thread Jordan Justen
Whoops. I re-wrote this last night:

http://cgit.freedesktop.org/~jljusten/mesa/commit/?h=cs-33&id=3ebe7b79

On 2015-06-04 21:41:15, Tapani Pälli wrote:
> Before 9b5e92f get_gl_override was called only once, but now it is
> called for multiple APIs (GLES2, GL), version needs to be set always.
> 
> Signed-off-by: Tapani Pälli 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90797
> ---
>  src/mesa/main/version.c | 36 ++--
>  1 file changed, 22 insertions(+), 14 deletions(-)
> 
> diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> index 409e5ae..5e9943c 100644
> --- a/src/mesa/main/version.c
> +++ b/src/mesa/main/version.c
> @@ -58,34 +58,42 @@ get_gl_override(gl_api api, int *version, bool 
> *fwd_context,
>? "MESA_GL_VERSION_OVERRIDE" : "MESA_GLES_VERSION_OVERRIDE";
> const char *version_str;
> int major, minor, n;
> -   static int override_version = -1;
> -   static bool fc_suffix = false;
> -   static bool compat_suffix = false;
> +   static struct override_info {
> +  int version;
> +  bool fc_suffix;
> +  bool compat_suffix;
> +   } override[API_OPENGL_LAST + 1] = {
> +  { -1, false, false},
> +  { -1, false, false},
> +  { -1, false, false},
> +  { -1, false, false},
> +   };

I did this a bit differently. Rather than one for each API, I had one
per override env var.

If you think once per API is preferable, how about these changes to
your patch:

   override[API_OPENGL_LAST + 1] => override[]

   STATIC_ASSERT(ARRAY_SIZE(override) == API_OPENGL_LAST + 1);

-Jordan

> if (api == API_OPENGLES)
>return;
>  
> -   if (override_version < 0) {
> -  override_version = 0;
> +   if (override[api].version < 0) {
> +  override[api].version = 0;
>  
>version_str = getenv(env_var);
>if (version_str) {
> - fc_suffix = check_for_ending(version_str, "FC");
> - compat_suffix = check_for_ending(version_str, "COMPAT");
> + override[api].fc_suffix = check_for_ending(version_str, "FC");
> + override[api].compat_suffix = check_for_ending(version_str, 
> "COMPAT");
>  
>   n = sscanf(version_str, "%u.%u", &major, &minor);
>   if (n != 2) {
>  fprintf(stderr, "error: invalid value for %s: %s\n",
>  env_var, version_str);
> -override_version = 0;
> +override[api].version = 0;
>   } else {
> -override_version = major * 10 + minor;
> +override[api].version = major * 10 + minor;
>  
>  /* There is no such thing as compatibility or forward-compatible 
> for
>   * OpenGL ES 2.0 or 3.x APIs.
>   */
> -if ((override_version < 30 && fc_suffix) ||
> -(api == API_OPENGLES2 && (fc_suffix || compat_suffix))) {
> +if ((override[api].version < 30 && override[api].fc_suffix) ||
> +(api == API_OPENGLES2 && (override[api].fc_suffix ||
> +  override[api].compat_suffix))) {
> fprintf(stderr, "error: invalid value for %s: %s\n",
> env_var, version_str);
>  }
> @@ -93,9 +101,9 @@ get_gl_override(gl_api api, int *version, bool 
> *fwd_context,
>}
> }
>  
> -   *version = override_version;
> -   *fwd_context = fc_suffix;
> -   *compat_context = compat_suffix;
> +   *version = override[api].version;
> +   *fwd_context = override[api].fc_suffix;
> +   *compat_context = override[api].compat_suffix;
>  }
>  
>  /**
> -- 
> 2.1.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 09/82] mesa: Add shader storage buffer support to struct gl_context

2015-06-11 Thread Jordan Justen
On 2015-06-10 02:30:10, Samuel Iglesias Gonsálvez wrote:
> On 09/06/15 19:34, Jordan Justen wrote:
> > On 2015-06-09 00:30:45, Samuel Iglesias Gonsálvez wrote:
> >> On 09/06/15 08:18, Samuel Iglesias Gonsálvez wrote:
> >> I plan to modify patch [0] to set prog->MaxShaderStorageBlocks = 8 and
> >> write a separate patch that makes i965 driver to set the following
> >> constant values:
> >>
> >> * MaxShaderStorageBlocks = 12
> >> * MaxCombinedUniformBlocks = 36 (12 * 3 different shader types: vertex,
> >> geometry and fragment)
> > 
> > Will it need to be '* 5' with the 2 tessellation stages? I wonder if
> > we can somehow guard against it being missed later. I guess TS tests
> > should be able to catch this.
> > 
> 
> There are two possibilities:
> 
> 1) Add a FIXME comment in the separate patch for i965, where I am going
> to modify the value of MaxCombinedUniformBlocks, saying that we don't
> count tessellation stages yet.
> 
> 2) Set MaxCombinedUniformBlocks = 12 * 5 in that separate patch and add
> a comment saying that it counts tessellation stages, although they are
> not supported yet.
> 
> I prefer 1) because future TS support will need to do similar changes in
> other places. What do you think?

Yeah, 1 seems fine.

Thanks,

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: set override_version per api version override

2015-06-11 Thread Jordan Justen
On 2015-06-10 13:03:20, Jordan Justen wrote:
> Whoops. I re-wrote this last night:
> 
> http://cgit.freedesktop.org/~jljusten/mesa/commit/?h=cs-33&id=3ebe7b79
> 
> On 2015-06-04 21:41:15, Tapani Pälli wrote:
> > Before 9b5e92f get_gl_override was called only once, but now it is
> > called for multiple APIs (GLES2, GL), version needs to be set always.
> > 
> > Signed-off-by: Tapani Pälli 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90797
> > ---
> >  src/mesa/main/version.c | 36 ++--
> >  1 file changed, 22 insertions(+), 14 deletions(-)
> > 
> > diff --git a/src/mesa/main/version.c b/src/mesa/main/version.c
> > index 409e5ae..5e9943c 100644
> > --- a/src/mesa/main/version.c
> > +++ b/src/mesa/main/version.c
> > @@ -58,34 +58,42 @@ get_gl_override(gl_api api, int *version, bool 
> > *fwd_context,
> >? "MESA_GL_VERSION_OVERRIDE" : "MESA_GLES_VERSION_OVERRIDE";
> > const char *version_str;
> > int major, minor, n;
> > -   static int override_version = -1;
> > -   static bool fc_suffix = false;
> > -   static bool compat_suffix = false;
> > +   static struct override_info {
> > +  int version;
> > +  bool fc_suffix;
> > +  bool compat_suffix;
> > +   } override[API_OPENGL_LAST + 1] = {
> > +  { -1, false, false},
> > +  { -1, false, false},
> > +  { -1, false, false},
> > +  { -1, false, false},
> > +   };
> 
> I did this a bit differently. Rather than one for each API, I had one
> per override env var.
> 
> If you think once per API is preferable, how about these changes to
> your patch:
> 
>override[API_OPENGL_LAST + 1] => override[]
> 
>STATIC_ASSERT(ARRAY_SIZE(override) == API_OPENGL_LAST + 1);

Oh ... and with those changes,

Reviewed-by: Jordan Justen 

> 
> > if (api == API_OPENGLES)
> >return;
> >  
> > -   if (override_version < 0) {
> > -  override_version = 0;
> > +   if (override[api].version < 0) {
> > +  override[api].version = 0;
> >  
> >version_str = getenv(env_var);
> >if (version_str) {
> > - fc_suffix = check_for_ending(version_str, "FC");
> > - compat_suffix = check_for_ending(version_str, "COMPAT");
> > + override[api].fc_suffix = check_for_ending(version_str, "FC");
> > + override[api].compat_suffix = check_for_ending(version_str, 
> > "COMPAT");
> >  
> >   n = sscanf(version_str, "%u.%u", &major, &minor);
> >   if (n != 2) {
> >  fprintf(stderr, "error: invalid value for %s: %s\n",
> >  env_var, version_str);
> > -override_version = 0;
> > +override[api].version = 0;
> >   } else {
> > -override_version = major * 10 + minor;
> > +override[api].version = major * 10 + minor;
> >  
> >  /* There is no such thing as compatibility or 
> > forward-compatible for
> >   * OpenGL ES 2.0 or 3.x APIs.
> >   */
> > -if ((override_version < 30 && fc_suffix) ||
> > -(api == API_OPENGLES2 && (fc_suffix || compat_suffix))) {
> > +if ((override[api].version < 30 && override[api].fc_suffix) ||
> > +(api == API_OPENGLES2 && (override[api].fc_suffix ||
> > +  override[api].compat_suffix))) {
> > fprintf(stderr, "error: invalid value for %s: %s\n",
> > env_var, version_str);
> >  }
> > @@ -93,9 +101,9 @@ get_gl_override(gl_api api, int *version, bool 
> > *fwd_context,
> >}
> > }
> >  
> > -   *version = override_version;
> > -   *fwd_context = fc_suffix;
> > -   *compat_context = compat_suffix;
> > +   *version = override[api].version;
> > +   *fwd_context = override[api].fc_suffix;
> > +   *compat_context = override[api].compat_suffix;
> >  }
> >  
> >  /**
> > -- 
> > 2.1.0
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 13/82] glsl: buffer variables cannot be defined outside interface blocks

2015-06-11 Thread Jordan Justen
On 2015-06-03 00:01:03, Iago Toral Quiroga wrote:
> From: Samuel Iglesias Gonsalvez 
> 
> Section 4.3.7 "Buffer Variables", GLSL 4.30 spec:
> 
> "Buffer variables may only be declared inside interface blocks
> (section 4.3.9 “Interface Blocks”), which are then referred to as
> shader storage blocks. It is a compile-time error to declare buffer
> variables at global scope (outside a block)."
> 
> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/glsl/ast_to_hir.cpp | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index d246c00..99b6eeb 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -3378,6 +3378,18 @@ ast_declarator_list::hir(exec_list *instructions,
>  
> decl_type = this->type->glsl_type(& type_name, state);
>  
> +   /* Section 4.3.7 "Buffer Variables" of the GLSL 4.30 spec:
> +*"Buffer variables may only be declared inside interface blocks
> +*(section 4.3.9 “Interface Blocks”), which are then referred to as
> +*shader storage blocks. It is a compile-time error to declare buffer
> +*variables at global scope (outside a block)."
> +*/
> +   if (type->qualifier.flags.q.buffer && !decl_type->is_interface()) {
> +  _mesa_glsl_error(&loc, state,
> +   "buffer variables cannot be declared outside"
> +   " interface blocks");

I think we usually put the space at the end of the previous line.

Reviewed-by: Jordan Justen 

> +   }
> +
> /* An offset-qualified atomic counter declaration sets the default
>  * offset for the next declaration within the same atomic counter
>  * buffer.
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/cs: Use exec all for CS terminate

2015-06-11 Thread Jordan Justen
This prevents an assertion from being hit with SIMD16:

Assertion `inst->exec_size == dispatch_width() || force_writemask_all' failed.

Signed-off-by: Jordan Justen 
Cc: Francisco Jerez 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index c41284b..588966b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1948,7 +1948,8 @@ fs_visitor::emit_cs_terminate()
bld.exec_all().MOV(payload, g0);
 
/* Send a message to the thread spawner to terminate the thread. */
-   fs_inst *inst = bld.emit(CS_OPCODE_CS_TERMINATE, reg_undef, payload);
+   fs_inst *inst = bld.exec_all()
+  .emit(CS_OPCODE_CS_TERMINATE, reg_undef, payload);
inst->eot = true;
 }
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/cs: Initialize GPGPU Thread Count

2015-06-11 Thread Jordan Justen
This field should always be set for gen8. In the bdw PRM, Volume 2d:
Command Reference: Structures under INTERFACE_DESCRIPTOR_DATA, DWORD
6, Bits 9:0, Number of Threads in GPGPU Thread Group:

"This field should not be set to 0 even if the barrier is disabled,
since an accurate value is needed for proper pre-emption."

In the HSW PRM, the it doesn't mention that it must always be set, but
it should not hurt.

Reported-by: Kristian Høgsberg 
Signed-off-by: Jordan Justen 
Cc: Kristian Høgsberg 
---
 src/mesa/drivers/dri/i965/brw_cs.cpp| 19 +++
 src/mesa/drivers/dri/i965/brw_defines.h |  5 +
 2 files changed, 24 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
b/src/mesa/drivers/dri/i965/brw_cs.cpp
index 1f2a9d2..44c76ba 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -284,6 +284,17 @@ brw_cs_precompile(struct gl_context *ctx,
 }
 
 
+static unsigned
+get_cs_thread_count(const struct brw_cs_prog_data *cs_prog_data)
+{
+   const unsigned simd_size = cs_prog_data->simd_size;
+   unsigned group_size = cs_prog_data->local_size[0] *
+  cs_prog_data->local_size[1] * cs_prog_data->local_size[2];
+
+   return (group_size + simd_size - 1) / simd_size;
+}
+
+
 static void
 brw_upload_cs_state(struct brw_context *brw)
 {
@@ -309,6 +320,8 @@ brw_upload_cs_state(struct brw_context *brw)
 
prog_data->binding_table.size_bytes,
 32, &stage_state->bind_bo_offset);
 
+   unsigned threads = get_cs_thread_count(cs_prog_data);
+
uint32_t dwords = brw->gen < 8 ? 8 : 9;
BEGIN_BATCH(dwords);
OUT_BATCH(MEDIA_VFE_STATE << 16 | (dwords - 2));
@@ -358,6 +371,12 @@ brw_upload_cs_state(struct brw_context *brw)
desc[dw++] = 0;
desc[dw++] = 0;
desc[dw++] = stage_state->bind_bo_offset;
+   desc[dw++] = 0;
+   const uint32_t media_threads =
+  brw->gen >= 8 ?
+  SET_FIELD(threads, GEN8_MEDIA_GPGPU_THREAD_COUNT) :
+  SET_FIELD(threads, MEDIA_GPGPU_THREAD_COUNT);
+   desc[dw++] = media_threads;
 
BEGIN_BATCH(4);
OUT_BATCH(MEDIA_INTERFACE_DESCRIPTOR_LOAD << 16 | (4 - 2));
diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index f6da305..2a8f500 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -2495,6 +2495,11 @@ enum brw_wm_barycentric_interp_mode {
 # define MEDIA_VFE_STATE_CURBE_ALLOC_MASK   INTEL_MASK(15, 0)
 
 #define MEDIA_INTERFACE_DESCRIPTOR_LOAD 0x7002
+/* GEN7 DW5, GEN8+ DW6 */
+# define MEDIA_GPGPU_THREAD_COUNT_SHIFT 0
+# define MEDIA_GPGPU_THREAD_COUNT_MASK  INTEL_MASK(7, 0)
+# define GEN8_MEDIA_GPGPU_THREAD_COUNT_SHIFT0
+# define GEN8_MEDIA_GPGPU_THREAD_COUNT_MASK INTEL_MASK(9, 0)
 #define MEDIA_STATE_FLUSH   0x7004
 #define GPGPU_WALKER0x7105
 /* GEN8+ DW2 */
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965/cs: Setup push constant data for uniforms

2015-06-16 Thread Jordan Justen
brw_upload_cs_push_constants was based on gen6_upload_push_constants.

Signed-off-by: Jordan Justen 
---
 These 2 patches allow this piglit to pass:
 tests/spec/arb_compute_shader/execution/basic-uniform-access-atomic.shader_test
 (Also requires overriding the GL version and some extensions...)

 src/mesa/drivers/dri/i965/brw_context.h  |   2 +-
 src/mesa/drivers/dri/i965/brw_cs.cpp | 119 ++-
 src/mesa/drivers/dri/i965/brw_defines.h  |   6 ++
 src/mesa/drivers/dri/i965/brw_state.h|   1 +
 src/mesa/drivers/dri/i965/brw_state_upload.c |   2 +
 5 files changed, 125 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 01c4283..9ea0dfd 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1457,7 +1457,7 @@ struct brw_context
 
int num_atoms[BRW_NUM_PIPELINES];
const struct brw_tracked_state render_atoms[57];
-   const struct brw_tracked_state compute_atoms[3];
+   const struct brw_tracked_state compute_atoms[4];
 
/* If (INTEL_DEBUG & DEBUG_BATCH) */
struct {
diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
b/src/mesa/drivers/dri/i965/brw_cs.cpp
index 44c76ba..e26d576 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -320,6 +320,9 @@ brw_upload_cs_state(struct brw_context *brw)
 
prog_data->binding_table.size_bytes,
 32, &stage_state->bind_bo_offset);
 
+   unsigned push_constant_size =
+  prog_data->nr_params * sizeof(gl_constant_value);
+   unsigned reg_aligned_constant_size = ALIGN(push_constant_size, 32);
unsigned threads = get_cs_thread_count(cs_prog_data);
 
uint32_t dwords = brw->gen < 8 ? 8 : 9;
@@ -352,12 +355,24 @@ brw_upload_cs_state(struct brw_context *brw)
 
OUT_BATCH(0);
const uint32_t vfe_urb_allocation = brw->gen >= 8 ? 2 : 0;
-   OUT_BATCH(SET_FIELD(vfe_urb_allocation, MEDIA_VFE_STATE_URB_ALLOC));
+   const uint32_t vfe_curbe_allocation =
+  (reg_aligned_constant_size / 32) * threads + 32;
+   OUT_BATCH(SET_FIELD(vfe_urb_allocation, MEDIA_VFE_STATE_URB_ALLOC) |
+ SET_FIELD(vfe_curbe_allocation, MEDIA_VFE_STATE_CURBE_ALLOC));
OUT_BATCH(0);
OUT_BATCH(0);
OUT_BATCH(0);
ADVANCE_BATCH();
 
+   if (reg_aligned_constant_size > 0) {
+  BEGIN_BATCH(4);
+  OUT_BATCH(MEDIA_CURBE_LOAD << 16 | (4 - 2));
+  OUT_BATCH(0);
+  OUT_BATCH(reg_aligned_constant_size * threads);
+  OUT_BATCH(stage_state->push_const_offset);
+  ADVANCE_BATCH();
+   }
+
/* BRW_NEW_SURFACES and BRW_NEW_*_CONSTBUF */
memcpy(bind, stage_state->surf_offset,
   prog_data->binding_table.size_bytes);
@@ -371,7 +386,8 @@ brw_upload_cs_state(struct brw_context *brw)
desc[dw++] = 0;
desc[dw++] = 0;
desc[dw++] = stage_state->bind_bo_offset;
-   desc[dw++] = 0;
+   desc[dw++] = SET_FIELD((reg_aligned_constant_size / 32) + 0,
+  MEDIA_CURBE_READ_LENGTH);
const uint32_t media_threads =
   brw->gen >= 8 ?
   SET_FIELD(threads, GEN8_MEDIA_GPGPU_THREAD_COUNT) :
@@ -392,8 +408,103 @@ const struct brw_tracked_state brw_cs_state = {
/* explicit initialisers aren't valid C++, comment
 * them for documentation purposes */
/* .dirty = */{
-  /* .mesa = */ 0,
-  /* .brw = */  BRW_NEW_CS_PROG_DATA,
+  /* .mesa = */ _NEW_PROGRAM_CONSTANTS,
+  /* .brw = */  BRW_NEW_CS_PROG_DATA |
+BRW_NEW_PUSH_CONSTANT_ALLOCATION,
},
/* .emit = */ brw_upload_cs_state
 };
+
+
+/**
+ * Creates a region containing the push constants for the CS on gen7+.
+ *
+ * Push constants are constant values (such as GLSL uniforms) that are
+ * pre-loaded into a shader stage's register space at thread spawn time.
+ *
+ * Not all GLSL uniforms will be uploaded as push constants: The hardware has
+ * a limitation of 32 or 64 EU registers (256 or 512 floats) per stage to be
+ * uploaded as push constants, while GL 4.4 requires at least 1024 components
+ * to be usable for the VS.  Plus, currently we always use pull constants
+ * instead of push constants when doing variable-index array access.
+ *
+ * For other stages, see brw_curbe.c for the equivalent gen4/5 code and
+ * gen6_vs_state.c for gen6+.
+ */
+static void
+brw_upload_cs_push_constants(struct brw_context *brw,
+ const struct gl_program *prog,
+ const struct brw_cs_prog_data *cs_prog_data,
+ struct brw_stage_state *stage_state,
+ enum aub_state_struct_type type)
+{
+   struct gl_context *ctx = &brw->ctx;
+   const struct brw_stage_prog_data *prog_data =
+  (brw_stage_prog_data*) cs_prog_data;
+
+   /* Updates the ParamaterValues[i] 

[Mesa-dev] [PATCH 2/2] main/state: Flag new constants for compute shaders

2015-06-16 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/main/state.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
index bede7fe..beb2721 100644
--- a/src/mesa/main/state.c
+++ b/src/mesa/main/state.c
@@ -278,6 +278,16 @@ update_program_constants(struct gl_context *ctx)
   }
}
 
+   if (ctx->ComputeProgram._Current) {
+  const struct gl_program_parameter_list *params =
+ ctx->ComputeProgram._Current->Base.Parameters;
+  /*FIXME: StateFlags is always 0 because we have unnamed constant
+   *   not state changes */
+  if (params /*&& params->StateFlags & ctx->NewState*/) {
+ new_state |= _NEW_PROGRAM_CONSTANTS;
+  }
+   }
+
return new_state;
 }
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/compute: Fix undefined code with right_mask for SIMD32

2015-06-16 Thread Jordan Justen
Although we don't support SIMD32, krh pointed out that the left shift
by 32 is undefined by C/C++ for 32-bit integers.

Suggested-by: Kristian Høgsberg 
Signed-off-by: Jordan Justen 
Cc: Kristian Høgsberg 
---
 src/mesa/drivers/dri/i965/brw_compute.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compute.c 
b/src/mesa/drivers/dri/i965/brw_compute.c
index b3d6de5..5693ab5 100644
--- a/src/mesa/drivers/dri/i965/brw_compute.c
+++ b/src/mesa/drivers/dri/i965/brw_compute.c
@@ -45,7 +45,7 @@ brw_emit_gpgpu_walker(struct brw_context *brw, const GLuint 
*num_groups)
unsigned thread_width_max =
   (group_size + simd_size - 1) / simd_size;
 
-   uint32_t right_mask = (1u << simd_size) - 1;
+   uint32_t right_mask = 0xu >> (32 - simd_size);
const unsigned right_non_aligned = group_size & (simd_size - 1);
if (right_non_aligned != 0)
   right_mask >>= (simd_size - right_non_aligned);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 15/82] mesa: Initialize and free shader storage buffers

2015-06-16 Thread Jordan Justen
On 2015-06-03 00:01:05, Iago Toral Quiroga wrote:
> ---
>  src/mesa/main/bufferobj.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index c5d4ada..a528787 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -836,6 +836,9 @@ _mesa_init_buffer_objects( struct gl_context *ctx )
> _mesa_reference_buffer_object(ctx, &ctx->UniformBuffer,
>  ctx->Shared->NullBufferObj);
>  
> +   _mesa_reference_buffer_object(ctx, &ctx->ShaderStorageBuffer,
> +ctx->Shared->NullBufferObj);

Tabs here?

With that fixed, this and
14/82 glsl: fix error messages in invalid declarations of shader storage blocks

Reviewed-by: Jordan Justen 

> _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer,
>  ctx->Shared->NullBufferObj);
>  
> @@ -850,6 +853,14 @@ _mesa_init_buffer_objects( struct gl_context *ctx )
>ctx->UniformBufferBindings[i].Size = -1;
> }
>  
> +   for (i = 0; i < MAX_COMBINED_SHADER_STORAGE_BUFFERS; i++) {
> +  _mesa_reference_buffer_object(ctx,
> +
> &ctx->ShaderStorageBufferBindings[i].BufferObject,
> +ctx->Shared->NullBufferObj);
> +  ctx->ShaderStorageBufferBindings[i].Offset = -1;
> +  ctx->ShaderStorageBufferBindings[i].Size = -1;
> +   }
> +
> for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) {
>_mesa_reference_buffer_object(ctx,
> 
> &ctx->AtomicBufferBindings[i].BufferObject,
> @@ -872,6 +883,8 @@ _mesa_free_buffer_objects( struct gl_context *ctx )
>  
> _mesa_reference_buffer_object(ctx, &ctx->UniformBuffer, NULL);
>  
> +   _mesa_reference_buffer_object(ctx, &ctx->ShaderStorageBuffer, NULL);
> +
> _mesa_reference_buffer_object(ctx, &ctx->AtomicBuffer, NULL);
>  
> _mesa_reference_buffer_object(ctx, &ctx->DrawIndirectBuffer, NULL);
> @@ -882,6 +895,12 @@ _mesa_free_buffer_objects( struct gl_context *ctx )
> NULL);
> }
>  
> +   for (i = 0; i < MAX_COMBINED_SHADER_STORAGE_BUFFERS; i++) {
> +  _mesa_reference_buffer_object(ctx,
> +
> &ctx->ShaderStorageBufferBindings[i].BufferObject,
> +NULL);
> +   }
> +
> for (i = 0; i < MAX_COMBINED_ATOMIC_BUFFERS; i++) {
>_mesa_reference_buffer_object(ctx,
> 
> &ctx->AtomicBufferBindings[i].BufferObject,
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/46] mesa: require VS if TCS or TES is present in pipeline

2015-06-16 Thread Jordan Justen
I guess you wanted to lower case the initial word of all Chris's patch
subjects? :)

This patch was in a range of 11 patches that I added my r-b for:
http://lists.freedesktop.org/archives/mesa-dev/2014-September/068742.html

It seems Chris didn't get a chance to add my r-b to these patches,
including the recently already committed 358b6bb.

-Jordan

On 2015-06-16 16:01:09, Marek Olšák wrote:
> From: Chris Forbes 
> 
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/main/pipelineobj.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> index 0b5fa29..7d4127e 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -788,7 +788,9 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
>  *   executable vertex shader."
>  */
> if (!pipe->CurrentProgram[MESA_SHADER_VERTEX]
> -   && pipe->CurrentProgram[MESA_SHADER_GEOMETRY]) {
> +   && (pipe->CurrentProgram[MESA_SHADER_GEOMETRY] ||
> +   pipe->CurrentProgram[MESA_SHADER_TESS_CTRL] ||
> +   pipe->CurrentProgram[MESA_SHADER_TESS_EVAL])) {
>pipe->InfoLog = ralloc_strdup(pipe, "Program lacks a vertex shader");
>goto err;
> }
> -- 
> 2.1.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] main/state: Flag new constants for compute shaders

2015-06-16 Thread Jordan Justen
On 2015-06-16 14:51:38, Marek Olšák wrote:
> One question: Why?
> 
> I just removed the same code for GS:
> http://cgit.freedesktop.org/mesa/mesa/commit/?id=42a3c1ec8471fc76eb6d3d6f1bd1739e24a5f33a

Yeah. I think you are right. I'll drop this patch. Thanks!

-Jordan

> On Tue, Jun 16, 2015 at 11:21 PM, Jordan Justen
>  wrote:
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/mesa/main/state.c | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
> > index bede7fe..beb2721 100644
> > --- a/src/mesa/main/state.c
> > +++ b/src/mesa/main/state.c
> > @@ -278,6 +278,16 @@ update_program_constants(struct gl_context *ctx)
> >}
> > }
> >
> > +   if (ctx->ComputeProgram._Current) {
> > +  const struct gl_program_parameter_list *params =
> > + ctx->ComputeProgram._Current->Base.Parameters;
> > +  /*FIXME: StateFlags is always 0 because we have unnamed constant
> > +   *   not state changes */
> > +  if (params /*&& params->StateFlags & ctx->NewState*/) {
> > + new_state |= _NEW_PROGRAM_CONSTANTS;
> > +  }
> > +   }
> > +
> > return new_state;
> >  }
> >
> > --
> > 2.1.4
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/46] mesa: require VS if TCS or TES is present in pipeline

2015-06-16 Thread Jordan Justen
On 2015-06-16 17:02:57, Marek Olšák wrote:
> On Wed, Jun 17, 2015 at 1:46 AM, Jordan Justen
>  wrote:
> > This patch was in a range of 11 patches that I added my r-b for:
> > http://lists.freedesktop.org/archives/mesa-dev/2014-September/068742.html
> >
> > It seems Chris didn't get a chance to add my r-b to these patches,
> > including the recently already committed 358b6bb.
> 
> Sorry for not including your Rb. I did read all the comments on this
> mailing list and also github (I think it was tess-i965-4) and
> incorporated them into the series, but when you have a ton of comments
> and possibly incomplete 130 patches in your branch, the last thing you
> think of is adding Rb's.

It is understandable to miss feedback from last September. Especially
if it was directed to someone else...

Anyway, can I get my r-b on the remaining 10 patches, assuming they
are still alive?

Thanks,

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 16/82] mesa: Implement _mesa_DeleteBuffers for target GL_SHADER_STORAGE_BUFFER

2015-06-17 Thread Jordan Justen
On 2015-06-03 00:01:06, Iago Toral Quiroga wrote:
> ---
>  src/mesa/main/bufferobj.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index a528787..0e762df 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -1264,6 +1264,17 @@ _mesa_DeleteBuffers(GLsizei n, const GLuint *ids)
>  _mesa_BindBuffer( GL_UNIFORM_BUFFER, 0 );
>   }
>  
> + /* unbind SSBO binding points */
> + for (j = 0; j < ctx->Const.MaxShaderStorageBufferBindings; j++) {
> +if (ctx->ShaderStorageBufferBindings[j].BufferObject == bufObj) {
> +   _mesa_BindBufferBase( GL_SHADER_STORAGE_BUFFER, j, 0 );
> +}
> + }
> +
> + if (ctx->ShaderStorageBuffer == bufObj) {
> +_mesa_BindBuffer( GL_SHADER_STORAGE_BUFFER, 0 );
> +     }

Can you remove the extra spaces after '(' and before ')'?

Reviewed-by: Jordan Justen 

> +
>   /* unbind Atomci Buffer binding points */
>   for (j = 0; j < ctx->Const.MaxAtomicBufferBindings; j++) {
>  if (ctx->AtomicBufferBindings[j].BufferObject == bufObj) {
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 17/82] mesa: Implement _mesa_BindBuffersBase for target GL_SHADER_STORAGE_BUFFER

2015-06-17 Thread Jordan Justen
On 2015-06-03 00:01:07, Iago Toral Quiroga wrote:
> ---
>  src/mesa/main/bufferobj.c | 142 
> ++
>  src/mesa/main/mtypes.h|   7 +++
>  2 files changed, 149 insertions(+)
> 
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index 0e762df..c8b29a7 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -3034,6 +3034,33 @@ set_ubo_binding(struct gl_context *ctx,
>  }
>  
>  /**
> + * Binds a buffer object to a shader storage buffer binding point.
> + *
> + * The caller is responsible for flushing vertices and updating
> + * NewDriverState.
> + */
> +static void
> +set_ssbo_binding(struct gl_context *ctx,
> + struct gl_shader_storage_buffer_binding *binding,
> + struct gl_buffer_object *bufObj,
> + GLintptr offset,
> + GLsizeiptr size,
> + GLboolean autoSize)
> +{
> +   _mesa_reference_buffer_object(ctx, &binding->BufferObject, bufObj);
> +
> +   binding->Offset = offset;
> +   binding->Size = size;
> +   binding->AutomaticSize = autoSize;
> +
> +   /* If this is a real buffer object, mark it has having been used
> +* at some point as a SSBO.
> +*/
> +   if (size >= 0)
> +  bufObj->UsageHistory |= USAGE_SHADER_STORAGE_BUFFER;
> +}
> +
> +/**
>   * Binds a buffer object to a uniform buffer binding point.
>   *
>   * Unlike set_ubo_binding(), this function also flushes vertices
> @@ -3254,6 +3281,35 @@ error_check_bind_uniform_buffers(struct gl_context 
> *ctx,
> return true;
>  }
>  
> +static bool
> +error_check_bind_shader_storage_buffers(struct gl_context *ctx,
> +GLuint first, GLsizei count,
> +const char *caller)
> +{
> +   if (!ctx->Extensions.ARB_shader_storage_buffer_object) {
> +  _mesa_error(ctx, GL_INVALID_ENUM,
> +  "%s(target=GL_SHADER_STORAGE_BUFFER)", caller);
> +  return false;
> +   }
> +
> +   /* The ARB_multi_bind_spec says:
> +*
> +* "An INVALID_OPERATION error is generated if  +  is
> +*  greater than the number of target-specific indexed binding points,
> +*  as described in section 6.7.1."
> +*/
> +   if (first + count > ctx->Const.MaxShaderStorageBufferBindings) {
> +  _mesa_error(ctx, GL_INVALID_OPERATION,
> +  "%s(first=%u + count=%d > the value of "
> +  "GL_MAX_SHADER_STORAGE_BUFFER_BINDINGS=%u)",
> +  caller, first, count,
> +  ctx->Const.MaxShaderStorageBufferBindings);
> +  return false;
> +   }
> +
> +   return true;
> +}
> +
>  /**
>   * Unbind all uniform buffers in the range
>   *  through +-1
> @@ -3269,6 +3325,22 @@ unbind_uniform_buffers(struct gl_context *ctx, GLuint 
> first, GLsizei count)
>bufObj, -1, -1, GL_TRUE);
>  }
>  
> +/**
> + * Unbind all shader storage buffers in the range
> + *  through +-1
> + */
> +static void
> +unbind_shader_storage_buffers(struct gl_context *ctx, GLuint first,
> +  GLsizei count)
> +{
> +   struct gl_buffer_object *bufObj = ctx->Shared->NullBufferObj;
> +   GLint i;
> +
> +   for (i = 0; i < count; i++)
> +  set_ssbo_binding(ctx, &ctx->ShaderStorageBufferBindings[first + i],
> +   bufObj, -1, -1, GL_TRUE);
> +}
> +
>  static void
>  bind_uniform_buffers_base(struct gl_context *ctx, GLuint first, GLsizei 
> count,
>const GLuint *buffers)
> @@ -3336,6 +3408,73 @@ bind_uniform_buffers_base(struct gl_context *ctx, 
> GLuint first, GLsizei count,
>  }
>  
>  static void
> +bind_shader_storage_buffers_base(struct gl_context *ctx, GLuint first,
> + GLsizei count,const GLuint *buffers)

space before const

Reviewed-by: Jordan Justen 

> +{
> +   GLint i;
> +
> +   if (!error_check_bind_shader_storage_buffers(ctx, first, count,
> +"glBindBuffersBase"))
> +  return;
> +
> +   /* Assume that at least one binding will be changed */
> +   FLUSH_VERTICES(ctx, 0);
> +   ctx->NewDriverState |= ctx->DriverFlags.NewShaderStorageBuffer;
> +
> +   if (!buffers) {
> +  /* The ARB_multi_bind spec says:
> +   *
> +   *   "If  is NULL, all bindings from  through
> +   *+-1 are reset to their unbound (zero) state."
> +   */
> +  unbind_shader_storage_buffer

Re: [Mesa-dev] [PATCH v2 18/82] mesa: Implement _mesa_BindBuffersRange for target GL_SHADER_STORAGE_BUFFER

2015-06-17 Thread Jordan Justen
On 2015-06-03 00:01:08, Iago Toral Quiroga wrote:
> ---
>  src/mesa/main/bufferobj.c | 110 
> ++
>  1 file changed, 110 insertions(+)
> 
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index c8b29a7..70ac638 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -3579,6 +3579,112 @@ bind_uniform_buffers_range(struct gl_context *ctx, 
> GLuint first, GLsizei count,
> _mesa_end_bufferobj_lookups(ctx);
>  }
>  
> +static void
> +bind_shader_storage_buffers_range(struct gl_context *ctx, GLuint first,
> +  GLsizei count, const GLuint *buffers,
> +  const GLintptr *offsets,
> +  const GLsizeiptr *sizes)
> +{
> +   GLint i;
> +
> +   if (!error_check_bind_shader_storage_buffers(ctx, first, count,
> +"glBindBuffersRange"))
> +  return;
> +
> +   /* Assume that at least one binding will be changed */
> +   FLUSH_VERTICES(ctx, 0);
> +   ctx->NewDriverState |= ctx->DriverFlags.NewShaderStorageBuffer;
> +
> +   if (!buffers) {
> +  /* The ARB_multi_bind spec says:
> +   *
> +   *"If  is NULL, all bindings from  through
> +   * +-1 are reset to their unbound (zero) state.
> +   * In this case, the offsets and sizes associated with the
> +   * binding points are set to default values, ignoring
> +   *  and ."
> +   */
> +  unbind_shader_storage_buffers(ctx, first, count);
> +  return;
> +   }
> +
> +   /* Note that the error semantics for multi-bind commands differ from
> +* those of other GL commands.
> +*
> +* The Issues section in the ARB_multi_bind spec says:
> +*
> +*"(11) Typically, OpenGL specifies that if an error is generated by a
> +*  command, that command has no effect.  This is somewhat
> +*  unfortunate for multi-bind commands, because it would require 
> a
> +*  first pass to scan the entire list of bound objects for errors
> +*  and then a second pass to actually perform the bindings.
> +*  Should we have different error semantics?
> +*
> +*   RESOLVED:  Yes.  In this specification, when the parameters for
> +*   one of the  binding points are invalid, that binding point
> +*   is not updated and an error will be generated.  However, other
> +*   binding points in the same command will be updated if their
> +*   parameters are valid and no other error occurs."
> +*/
> +
> +   _mesa_begin_bufferobj_lookups(ctx);
> +
> +   for (i = 0; i < count; i++) {
> +  struct gl_shader_storage_buffer_binding *binding =
> + &ctx->ShaderStorageBufferBindings[first + i];
> +  struct gl_buffer_object *bufObj;
> +
> +  if (!bind_buffers_check_offset_and_size(ctx, i, offsets, sizes))
> + continue;
> +
> +  /* The ARB_multi_bind spec says:
> +   *
> +   * "An INVALID_VALUE error is generated by BindBuffersRange if any
> +   *  pair of values in  and  does not respectively
> +   *  satisfy the constraints described for those parameters for the
> +   *  specified target, as described in section 6.7.1 (per binding)."
> +   *
> +   * Section 6.7.1 refers to table 6.5, which says:
> +   *
> +   * 
> "┌───┐
> +   *  │ Shader storage buffer array bindings (see sec. 7.8)  
>  │
> +   *  
> ├─┬─┤
> +   *  │  ...│  ...   
>  │
> +   *  │  offset restriction │  multiple of value of SHADER_STORAGE_- 
>  │
> +   *  │ │  BUFFER_OFFSET_ALIGNMENT   
>  │
> +   *  │  ...│  ...   
>  │
> +   *  │  size restriction   │  none  
>  │
> +   *  
> └─┴─┘"
> +   */
> +  if (offsets[i] & (ctx->Const.ShaderStorageBufferOffsetAlignment - 1)) {
> + _mesa_error(ctx, GL_INVALID_VALUE,
> + "glBindBuffersRange(offsets[%u]=%" PRId64
> + " is misaligned; it must be a multiple of the value of "
> + "GL_SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT=%u whe

Re: [Mesa-dev] [PATCH v2 20/82] mesa: Implement _mesa_BindBufferRange for target GL_SHADER_STORAGE_BUFFER

2015-06-17 Thread Jordan Justen
19-20 Reviewed-by: Jordan Justen 

On 2015-06-03 00:01:10, Iago Toral Quiroga wrote:
> ---
>  src/mesa/main/bufferobj.c | 37 +
>  1 file changed, 37 insertions(+)
> 
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index fb5331e..4277880 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -3157,6 +3157,40 @@ bind_buffer_range_uniform_buffer(struct gl_context 
> *ctx,
> bind_uniform_buffer(ctx, index, bufObj, offset, size, GL_FALSE);
>  }
>  
> +/**
> + * Bind a region of a buffer object to a shader storage block binding point.
> + * \param index  the shader storage buffer binding point index
> + * \param bufObj  the buffer object
> + * \param offset  offset to the start of buffer object region
> + * \param size  size of the buffer object region
> + */
> +static void
> +bind_buffer_range_shader_storage_buffer(struct gl_context *ctx,
> +GLuint index,
> +struct gl_buffer_object *bufObj,
> +GLintptr offset,
> +GLsizeiptr size)
> +{
> +   if (index >= ctx->Const.MaxShaderStorageBufferBindings) {
> +  _mesa_error(ctx, GL_INVALID_VALUE, "glBindBufferRange(index=%d)", 
> index);
> +  return;
> +   }
> +
> +   if (offset & (ctx->Const.ShaderStorageBufferOffsetAlignment - 1)) {
> +  _mesa_error(ctx, GL_INVALID_VALUE,
> +  "glBindBufferRange(offset misaligned %d/%d)", (int) offset,
> +  ctx->Const.ShaderStorageBufferOffsetAlignment);
> +  return;
> +   }
> +
> +   if (bufObj == ctx->Shared->NullBufferObj) {
> +  offset = -1;
> +  size = -1;
> +   }
> +
> +   _mesa_reference_buffer_object(ctx, &ctx->ShaderStorageBuffer, bufObj);
> +   bind_shader_storage_buffer(ctx, index, bufObj, offset, size, GL_FALSE);
> +}
>  
>  /**
>   * Bind a buffer object to a uniform block binding point.
> @@ -4227,6 +4261,9 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
> case GL_UNIFORM_BUFFER:
>bind_buffer_range_uniform_buffer(ctx, index, bufObj, offset, size);
>return;
> +   case GL_SHADER_STORAGE_BUFFER:
> +  bind_buffer_range_shader_storage_buffer(ctx, index, bufObj, offset, 
> size);
> +  return;
> case GL_ATOMIC_COUNTER_BUFFER:
>bind_atomic_buffer(ctx, index, bufObj, offset, size,
>   "glBindBufferRange");
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 21/82] glsl: Don't do tree grafting on buffer variables

2015-06-17 Thread Jordan Justen
On 2015-06-03 00:01:11, Iago Toral Quiroga wrote:
> Otherwise we can lose writes into the buffers backing the variables.
> ---
>  src/glsl/opt_tree_grafting.cpp | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/src/glsl/opt_tree_grafting.cpp b/src/glsl/opt_tree_grafting.cpp
> index d47613c..7f2ee6c 100644
> --- a/src/glsl/opt_tree_grafting.cpp
> +++ b/src/glsl/opt_tree_grafting.cpp
> @@ -359,10 +359,11 @@ tree_grafting_basic_block(ir_instruction *bb_first,
>if (!lhs_var)
>  continue;
>  
> -  if (lhs_var->data.mode == ir_var_function_out ||
> - lhs_var->data.mode == ir_var_function_inout ||
> -  lhs_var->data.mode == ir_var_shader_out)
> -continue;
> +   if (lhs_var->data.mode == ir_var_function_out ||
> +   lhs_var->data.mode == ir_var_function_inout ||
> +   lhs_var->data.mode == ir_var_shader_out ||
> +   lhs_var->data.mode == ir_var_shader_storage)
> +  continue;

This indentation looks wrong. If fixed,
Reviewed-by: Jordan Justen 

>  
>ir_variable_refcount_entry *entry = 
> info->refs->get_variable_entry(lhs_var);
>  
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 22/82] glsl: Do not kill dead assignments to buffer variables or SSBO declarations.

2015-06-17 Thread Jordan Justen
On 2015-06-03 00:01:12, Iago Toral Quiroga wrote:
> If we kill dead assignments we lose the buffer writes.
> 
> Also, we never kill UBO declarations even if they are never referenced
> by the shader, they are always considered active. Although the spec
> does not seem say this specifically for SSBOs, it is probably implied
> since SSBOs are pretty much the same as UBOs, only that you can write
> to them.
> ---
>  src/glsl/opt_dead_code.cpp | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/src/glsl/opt_dead_code.cpp b/src/glsl/opt_dead_code.cpp
> index f45bf5d..1bb5f32 100644
> --- a/src/glsl/opt_dead_code.cpp
> +++ b/src/glsl/opt_dead_code.cpp
> @@ -77,11 +77,13 @@ do_dead_code(exec_list *instructions, bool 
> uniform_locations_assigned)
>  
>if (entry->assign) {
>  /* Remove a single dead assignment to the variable we found.
> - * Don't do so if it's a shader or function output, though.
> + * Don't do so if it's a shader or function output or a buffer
> + * variable though.

buffer => shader storage

Reviewed-by: Jordan Justen 

>   */
>  if (entry->var->data.mode != ir_var_function_out &&
>  entry->var->data.mode != ir_var_function_inout &&
> - entry->var->data.mode != ir_var_shader_out) {
> + entry->var->data.mode != ir_var_shader_out &&
> + entry->var->data.mode != ir_var_shader_storage) {
> entry->assign->remove();
> progress = true;
>  
> @@ -99,7 +101,8 @@ do_dead_code(exec_list *instructions, bool 
> uniform_locations_assigned)
>   * stage.  Also, once uniform locations have been assigned, the
>   * declaration cannot be deleted.
>   */
> - if (entry->var->data.mode == ir_var_uniform) {
> + if (entry->var->data.mode == ir_var_uniform ||
> + entry->var->data.mode == ir_var_shader_storage) {
>  if (uniform_locations_assigned || entry->var->constant_value)
> continue;
>  
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

2015-06-17 Thread Jordan Justen
I wanted to question whether this was required, based on this text
from the extension spec:

"The ability to write to buffer objects creates the potential for
 multiple independent shader invocations to read and write the same
 underlying memory. The same issue exists with the
 ARB_shader_image_load_store extension provided in OpenGL 4.2, which
 can write to texture objects and buffers. In both cases, the
 specification makes few guarantees related to the relative order of
 memory reads and writes performed by the shader invocations."

But I'm not sure if we can reconcile CSE with 'memoryBarrier' and
'barrier'. curro, any thoughts from image load/store?

-Jordan

On 2015-06-03 00:01:13, Iago Toral Quiroga wrote:
> SSBOs are read/write and this CSE pass only handles read-only variables.
> ---
>  src/glsl/opt_cse.cpp | 33 -
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp
> index 4b8e9a0..a05ab46 100644
> --- a/src/glsl/opt_cse.cpp
> +++ b/src/glsl/opt_cse.cpp
> @@ -245,6 +245,28 @@ contains_rvalue(ir_rvalue *haystack, ir_rvalue *needle)
>  }
>  
>  static bool
> +expression_contains_ssbo_load(ir_expression *expr)
> +{
> +   if (expr->operation == ir_binop_ssbo_load)
> +  return true;
> +
> +   for (unsigned i = 0; i < expr->get_num_operands(); i++) {
> +  ir_rvalue *op = expr->operands[i];
> +  if (op->ir_type == ir_type_expression &&
> +  expression_contains_ssbo_load(op->as_expression())) {
> + return true;
> +  } else if (op->ir_type == ir_type_swizzle) {
> + ir_swizzle *swizzle = op->as_swizzle();
> + ir_expression *val = swizzle->val->as_expression();
> + if (val && expression_contains_ssbo_load(val))
> +return true;
> +  }
> +   }
> +
> +   return false;
> +}
> +
> +static bool
>  is_cse_candidate(ir_rvalue *ir)
>  {
> /* Our temporary variable assignment generation isn't ready to handle
> @@ -260,7 +282,16 @@ is_cse_candidate(ir_rvalue *ir)
>  * to variable-index array dereferences at some point.
>  */
> switch (ir->ir_type) {
> -   case ir_type_expression:
> +   case ir_type_expression: {
> + /* Skip expressions involving SSBO loads, since these operate on
> +  * read-write variables, meaning that the same ssbo_load expression
> +  * may return a different value if the underlying buffer storage
> +  * is written in between.
> +  */
> + if (expression_contains_ssbo_load(ir->as_expression()))
> +return false;
> +  }
> +  break;
> case ir_type_texture:
>break;
> default:
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

2015-06-17 Thread Jordan Justen
On 2015-06-17 19:33:23, Matt Turner wrote:
> On Wed, Jun 17, 2015 at 5:20 PM, Jordan Justen
>  wrote:
> > I wanted to question whether this was required, based on this text
> > from the extension spec:
> >
> > "The ability to write to buffer objects creates the potential for
> >  multiple independent shader invocations to read and write the same
> >  underlying memory. The same issue exists with the
> >  ARB_shader_image_load_store extension provided in OpenGL 4.2, which
> >  can write to texture objects and buffers. In both cases, the
> >  specification makes few guarantees related to the relative order of
> >  memory reads and writes performed by the shader invocations."
> >
> > But I'm not sure if we can reconcile CSE with 'memoryBarrier' and
> > 'barrier'. curro, any thoughts from image load/store?
> 
> I think the point is that if you have two reads of the same location,
> you can't combine them into a single load since in the intervening
> time another shader invocation might have written to that location...?

I don't think the spec requires you to have gotten the newly written
value in that case. So, it might be okay to just read it once.

"In both cases, the specification makes few guarantees related to the
 relative order of memory reads and writes performed by the shader
 invocations."

> The question about the semantics of memory-barriers is a good one though.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

2015-06-18 Thread Jordan Justen
On 2015-06-17 22:57:27, Iago Toral wrote:
> On Wed, 2015-06-17 at 17:20 -0700, Jordan Justen wrote:
> > I wanted to question whether this was required, based on this text
> > from the extension spec:
> > 
> > "The ability to write to buffer objects creates the potential for
> >  multiple independent shader invocations to read and write the same
> >  underlying memory. The same issue exists with the
> >  ARB_shader_image_load_store extension provided in OpenGL 4.2, which
> >  can write to texture objects and buffers. In both cases, the
> >  specification makes few guarantees related to the relative order of
> >  memory reads and writes performed by the shader invocations."
> > 
> > But I'm not sure if we can reconcile CSE with 'memoryBarrier' and
> > 'barrier'. curro, any thoughts from image load/store?
> 
> I think the problem is within the same thread, that text above talks
> about multiple invocations reading from and writing to the same
> location, but within the same invocation, the order of reads and writes
> must be preserved:
> 
> "Buffer variable memory reads and writes within a single shader
> invocation are processed in order.  However, the order of reads and
> writes performed in one invocation relative to those performed by
> another invocation is largely undefined."
> 
> For example, if X is a shader storage buffer variable and we have code
> like this with just one invocation:
> 
> ssbo_store(X, 1);
> a = ssbo_load(X) + 1  // a = 2
> ssbo_store(X, 2);
> b = ssbo_load(X) + 1; // b = 3
> 
> CSE could mess it up like this:
> 
> ssbo_store(X, 1);
> tmp = ssbo_load(X) + 1  // tmp = 2
> a = tmp;
> ssbo_store(X, 2);
> b = tmp;
> 
> which would be incorrect. I think I wrote this patch after seeing
> something like this happening. The CSE pass clearly states that it does
> not support write variables after all.
> 
> Also, notice the same would apply if there are multiple invocations but
> the shader code used something like gl_VertexID or gl_FragCoord to make
> each invocation read from/write to a different address within the SSBO
> buffer (I imagine this is the usual way to operate with SSBOs). In these
> cases, even if we have multiple invocations, keeping the relative order
> of reads and writes within each one is necessary.

Ok. Reviewed-by: Jordan Justen 

> > 
> > On 2015-06-03 00:01:13, Iago Toral Quiroga wrote:
> > > SSBOs are read/write and this CSE pass only handles read-only variables.
> > > ---
> > >  src/glsl/opt_cse.cpp | 33 -
> > >  1 file changed, 32 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp
> > > index 4b8e9a0..a05ab46 100644
> > > --- a/src/glsl/opt_cse.cpp
> > > +++ b/src/glsl/opt_cse.cpp
> > > @@ -245,6 +245,28 @@ contains_rvalue(ir_rvalue *haystack, ir_rvalue 
> > > *needle)
> > >  }
> > >  
> > >  static bool
> > > +expression_contains_ssbo_load(ir_expression *expr)
> > > +{
> > > +   if (expr->operation == ir_binop_ssbo_load)
> > > +  return true;
> > > +
> > > +   for (unsigned i = 0; i < expr->get_num_operands(); i++) {
> > > +  ir_rvalue *op = expr->operands[i];
> > > +  if (op->ir_type == ir_type_expression &&
> > > +  expression_contains_ssbo_load(op->as_expression())) {
> > > + return true;
> > > +  } else if (op->ir_type == ir_type_swizzle) {
> > > + ir_swizzle *swizzle = op->as_swizzle();
> > > + ir_expression *val = swizzle->val->as_expression();
> > > + if (val && expression_contains_ssbo_load(val))
> > > +return true;
> > > +  }
> > > +   }
> > > +
> > > +   return false;
> > > +}
> > > +
> > > +static bool
> > >  is_cse_candidate(ir_rvalue *ir)
> > >  {
> > > /* Our temporary variable assignment generation isn't ready to handle
> > > @@ -260,7 +282,16 @@ is_cse_candidate(ir_rvalue *ir)
> > >  * to variable-index array dereferences at some point.
> > >  */
> > > switch (ir->ir_type) {
> > > -   case ir_type_expression:
> > > +   case ir_type_expression: {
> > > + /* Skip expressions involving SSBO loads, since these operate on
> > > +  * read-write variables, meaning that the same ssbo_load 
> > > expression
> > > +  * may return a different value if the underlying buffer storage
> > > +  * is written in between.
> > > +  */
> > > + if (expression_contains_ssbo_load(ir->as_expression()))
> > > +return false;
> > > +  }
> > > +  break;
> > > case ir_type_texture:
> > >break;
> > > default:
> > > -- 
> > > 1.9.1
> > > 
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 26/82] glsl: Don't do copy propagation on buffer variables

2015-06-22 Thread Jordan Justen
24-26 once again makes me wonder if these optimization *can* be used
with SSBOs based on the same ext spec wording I referenced before:

"The ability to write to buffer objects creates the potential for
 multiple independent shader invocations to read and write the same
 underlying memory. The same issue exists with the
 ARB_shader_image_load_store extension provided in OpenGL 4.2, which
 can write to texture objects and buffers. In both cases, the
 specification makes few guarantees related to the relative order of
 memory reads and writes performed by the shader invocations."

In these patches "other threads" were specifically mentioned.

Did these patches also prevent bad things from happening in generated
code? (Like mentioned for patch 23.)

-Jordan

On 2015-06-03 00:01:16, Iago Toral Quiroga wrote:
> Since the backing storage for these is shared we cannot ensure that the
> value won't change by writes from other threads.
> ---
>  src/glsl/opt_copy_propagation.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/glsl/opt_copy_propagation.cpp 
> b/src/glsl/opt_copy_propagation.cpp
> index 806027b..f206995 100644
> --- a/src/glsl/opt_copy_propagation.cpp
> +++ b/src/glsl/opt_copy_propagation.cpp
> @@ -330,7 +330,7 @@ ir_copy_propagation_visitor::add_copy(ir_assignment *ir)
>   */
>  ir->condition = new(ralloc_parent(ir)) ir_constant(false);
>  this->progress = true;
> -  } else {
> +  } else if (lhs_var->data.mode != ir_var_shader_storage) {
>  entry = new(this->acp) acp_entry(lhs_var, rhs_var);
>  this->acp->push_tail(entry);
>}
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 27/82] mesa: Add new IR node ir_ssbo_store

2015-06-22 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-03 00:01:17, Iago Toral Quiroga wrote:
> Shader storage buffer objects (SSBO) require special handling: when we
> detect writes to any channel of a shader buffer variable we need to
> emit the corresponding write to memory. We will later add a lowering pass
> that detects these writes and injects ir_ssbo_store nodes in the IR so
> drivers can  generate code for the memory writes.
> ---
>  src/glsl/ir.h  | 38 
> ++
>  src/glsl/ir_hierarchical_visitor.cpp   | 18 
>  src/glsl/ir_hierarchical_visitor.h |  2 ++
>  src/glsl/ir_hv_accept.cpp  | 23 
>  src/glsl/ir_print_visitor.cpp  | 12 
>  src/glsl/ir_print_visitor.h|  1 +
>  src/glsl/ir_rvalue_visitor.cpp | 21 ++
>  src/glsl/ir_rvalue_visitor.h   |  3 ++
>  src/glsl/ir_visitor.h  |  2 ++
>  src/glsl/nir/glsl_to_nir.cpp   |  7 +
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  6 
>  src/mesa/program/ir_to_mesa.cpp|  7 +
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  7 +
>  14 files changed, 148 insertions(+)
> 
> diff --git a/src/glsl/ir.h b/src/glsl/ir.h
> index 1118732..2a0b28c 100644
> --- a/src/glsl/ir.h
> +++ b/src/glsl/ir.h
> @@ -78,6 +78,7 @@ enum ir_node_type {
> ir_type_discard,
> ir_type_emit_vertex,
> ir_type_end_primitive,
> +   ir_type_ssbo_store,
> ir_type_max, /**< maximum ir_type enum number, for validation */
> ir_type_unset = ir_type_max
>  };
> @@ -2407,6 +2408,43 @@ public:
> ir_rvalue *stream;
>  };
>  
> +/**
> + * IR instruction to write to a shader storage buffer object (SSBO)
> + */
> +class ir_ssbo_store : public ir_instruction {
> +public:
> +   ir_ssbo_store(ir_rvalue *block, ir_rvalue *offset, ir_rvalue *val,
> + unsigned write_mask)
> +  : ir_instruction(ir_type_ssbo_store),
> +block(block), offset(offset), val(val), write_mask(write_mask)
> +   {
> +  assert(block);
> +  assert(offset);
> +  assert(val);
> +  assert(write_mask != 0);
> +   }
> +
> +   virtual void accept(ir_visitor *v)
> +   {
> +  v->visit(this);
> +   }
> +
> +   virtual ir_ssbo_store *clone(void *mem_ctx, struct hash_table *ht) const
> +   {
> +  return new(mem_ctx) ir_ssbo_store(this->block->clone(mem_ctx, ht),
> +this->offset->clone(mem_ctx, ht),
> +this->val->clone(mem_ctx, ht),
> +this->write_mask);
> +   }
> +
> +   virtual ir_visitor_status accept(ir_hierarchical_visitor *);
> +
> +   ir_rvalue *block;
> +   ir_rvalue *offset;
> +   ir_rvalue *val;
> +   unsigned write_mask;
> +};
> +
>  /*@}*/
>  
>  /**
> diff --git a/src/glsl/ir_hierarchical_visitor.cpp 
> b/src/glsl/ir_hierarchical_visitor.cpp
> index adb6294..1aa5cc0 100644
> --- a/src/glsl/ir_hierarchical_visitor.cpp
> +++ b/src/glsl/ir_hierarchical_visitor.cpp
> @@ -349,6 +349,24 @@ ir_hierarchical_visitor::visit_leave(ir_end_primitive 
> *ir)
> return visit_continue;
>  }
>  
> +ir_visitor_status
> +ir_hierarchical_visitor::visit_enter(ir_ssbo_store *ir)
> +{
> +   if (this->callback_enter != NULL)
> +  this->callback_enter(ir, this->data_enter);
> +
> +   return visit_continue;
> +}
> +
> +ir_visitor_status
> +ir_hierarchical_visitor::visit_leave(ir_ssbo_store *ir)
> +{
> +   if (this->callback_leave != NULL)
> +  this->callback_leave(ir, this->data_leave);
> +
> +   return visit_continue;
> +}
> +
>  void
>  ir_hierarchical_visitor::run(exec_list *instructions)
>  {
> diff --git a/src/glsl/ir_hierarchical_visitor.h 
> b/src/glsl/ir_hierarchical_visitor.h
> index faa52fd..49dc37e 100644
> --- a/src/glsl/ir_hierarchical_visitor.h
> +++ b/src/glsl/ir_hierarchical_visitor.h
> @@ -139,6 +139,8 @@ public:
> virtual ir_visitor_status visit_leave(class ir_emit_vertex *);
> virtual ir_visitor_status visit_enter(class ir_end_primitive *);
> virtual ir_visitor_status visit_leave(class ir_end_primitive *);
> +   virtual ir_visitor_status visit_enter(class ir_ssbo_store *);
> +   virtual ir_visitor_status visit_leave(class ir_ssbo_store *);
> /*@}*/
>  
>  
> diff --git a/src/glsl/ir_hv_accept.cpp b/src/glsl/ir_hv_accept.cpp
> index be5b3ea..500ce4b 100644
> --- a/src/glsl/ir_hv_a

Re: [Mesa-dev] [PATCH] mesa: Delete unused ICEIL().

2015-06-22 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-22 14:58:27, Matt Turner wrote:
> Can't find any uses of it in git history.
> ---
> Strangely, when it was moved to its current location in commit 27558a1,
> it was moved from mmath.h... which seems to have been lost from git's
> history. Searching further git log --grep mmath.h shows that various
> commit messages mention modifying mmath.h and none of the commits
> actually do.
> 
>  src/mesa/main/imports.h | 32 
>  1 file changed, 32 deletions(-)
> 
> diff --git a/src/mesa/main/imports.h b/src/mesa/main/imports.h
> index c4d917e..9ffe3de 100644
> --- a/src/mesa/main/imports.h
> +++ b/src/mesa/main/imports.h
> @@ -230,38 +230,6 @@ static inline int IFLOOR(float f)
>  }
>  
>  
> -/** Return (as an integer) ceiling of float */
> -static inline int ICEIL(float f)
> -{
> -#if defined(USE_X86_ASM) && defined(__GNUC__) && defined(__i386__)
> -   /*
> -* IEEE ceil for computers that round to nearest or even.
> -* 'f' must be between -4194304 and 4194303.
> -* This ceil operation is done by "(iround(f + .5) + iround(f - .5) + 1) 
> >> 1",
> -* but uses some IEEE specific tricks for better speed.
> -* Contributed by Josh Vanderhoof
> -*/
> -   int ai, bi;
> -   double af, bf;
> -   af = (3 << 22) + 0.5 + (double)f;
> -   bf = (3 << 22) + 0.5 - (double)f;
> -   /* GCC generates an extra fstp/fld without this. */
> -   __asm__ ("fstps %0" : "=m" (ai) : "t" (af) : "st");
> -   __asm__ ("fstps %0" : "=m" (bi) : "t" (bf) : "st");
> -   return (ai - bi + 1) >> 1;
> -#else
> -   int ai, bi;
> -   double af, bf;
> -   fi_type u;
> -   af = (3 << 22) + 0.5 + (double)f;
> -   bf = (3 << 22) + 0.5 - (double)f;
> -   u.f = (float) af; ai = u.i;
> -   u.f = (float) bf; bi = u.i;
> -   return (ai - bi + 1) >> 1;
> -#endif
> -}
> -
> -
>  /**
>   * Is x a power of two?
>   */
> -- 
> 2.3.6
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] i965/cfg: Assert that cur_do/while/if pointers are non-NULL.

2015-06-22 Thread Jordan Justen
Series Reviewed-by: Jordan Justen 

On 2015-06-22 14:56:06, Matt Turner wrote:
> Coverity sees that the functions immediately below the new assertions
> dereference these pointers, but is unaware that an ENDIF always follows
> an IF, etc.
> ---
>  src/mesa/drivers/dri/i965/brw_cfg.cpp | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp 
> b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> index 39c419b..f1f230e 100644
> --- a/src/mesa/drivers/dri/i965/brw_cfg.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp
> @@ -231,6 +231,7 @@ cfg_t::cfg_t(exec_list *instructions)
>   if (cur_else) {
>  cur_else->add_successor(mem_ctx, cur_endif);
>   } else {
> +assert(cur_if != NULL);
>  cur_if->add_successor(mem_ctx, cur_endif);
>   }
>  
> @@ -299,6 +300,7 @@ cfg_t::cfg_t(exec_list *instructions)
>   inst->exec_node::remove();
>   cur->instructions.push_tail(inst);
>  
> + assert(cur_do != NULL && cur_while != NULL);
>  cur->add_successor(mem_ctx, cur_do);
>  set_next_block(&cur, cur_while, ip);
>  
> -- 
> 2.3.6
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 26/82] glsl: Don't do copy propagation on buffer variables

2015-06-23 Thread Jordan Justen
On 2015-06-22 23:38:14, Iago Toral wrote:
> On Mon, 2015-06-22 at 14:28 -0700, Jordan Justen wrote:
> > 24-26 once again makes me wonder if these optimization *can* be used
> > with SSBOs based on the same ext spec wording I referenced before:
> > 
> > "The ability to write to buffer objects creates the potential for
> >  multiple independent shader invocations to read and write the same
> >  underlying memory. The same issue exists with the
> >  ARB_shader_image_load_store extension provided in OpenGL 4.2, which
> >  can write to texture objects and buffers. In both cases, the
> >  specification makes few guarantees related to the relative order of
> >  memory reads and writes performed by the shader invocations."
> > 
> > In these patches "other threads" were specifically mentioned.
> > 
> > Did these patches also prevent bad things from happening in generated
> > code? (Like mentioned for patch 23.)
> 
> I think the problem here is the possibility for shaders to use
> memoryBarrier():
> 
> "SHADER_STORAGE_BARRIER_BIT:  Memory accesses using shader buffer
> variables issued after the barrier will reflect data written by
> shaders prior to the barrier.  Additionally, assignments to and atomic
> operations performed on shader buffer variables after the barrier will
> not execute until all memory accesses (e.g., loads, stores, texture
> fetches, vertex fetches) initiated prior to the barrier complete."
> 
> I think in these cases we can't allow these optimizations to kick in. 
> 
> That said, maybe we can check if we are using any memorybarriers in the
> shader code and decide if we want to enable these optimizations for
> ssbos based on that. I think we can try to do that in a later patch.

Ok. What do you think about updating the commit messages on these
three patches?

For example, currently you have:

"Since the backing storage for these is shared we cannot ensure that
 the value won't change by writes from other threads."

How does something like this sound?

"Since the backing storage for these is shared we cannot ensure that
 the value won't change by writes from other threads. Normally SSBO
 accesses are not guaranteed to be syncronized with other threads,
 except when memoryBarrier is used. So, we might be able to optimize
 some SSBO accesses, but for now we always take the safe path and emit
 the SSBO access."

With a change like that to these commit messages, you can add
Reviewed-by: Jordan Justen 
to all 3 patches.

> > On 2015-06-03 00:01:16, Iago Toral Quiroga wrote:
> > > Since the backing storage for these is shared we cannot ensure that the
> > > value won't change by writes from other threads.
> > > ---
> > >  src/glsl/opt_copy_propagation.cpp | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/src/glsl/opt_copy_propagation.cpp 
> > > b/src/glsl/opt_copy_propagation.cpp
> > > index 806027b..f206995 100644
> > > --- a/src/glsl/opt_copy_propagation.cpp
> > > +++ b/src/glsl/opt_copy_propagation.cpp
> > > @@ -330,7 +330,7 @@ ir_copy_propagation_visitor::add_copy(ir_assignment 
> > > *ir)
> > >   */
> > >  ir->condition = new(ralloc_parent(ir)) ir_constant(false);
> > >  this->progress = true;
> > > -  } else {
> > > +  } else if (lhs_var->data.mode != ir_var_shader_storage) {
> > >  entry = new(this->acp) acp_entry(lhs_var, rhs_var);
> > >  this->acp->push_tail(entry);
> > >}
> > > -- 
> > > 1.9.1
> > > 
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> > 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Get rid of an unused variable in emit_barrier()

2015-06-23 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-23 15:40:00, Jason Ekstrand wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index ea29341..9a4bad6 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1963,11 +1963,11 @@ fs_visitor::emit_barrier()
> fs_reg payload = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
>  
> /* Clear the message payload */
> -   fs_inst *inst = bld.exec_all().MOV(payload, fs_reg(0u));
> +   bld.exec_all().MOV(payload, fs_reg(0u));
>  
> /* Copy bits 27:24 of r0.2 (barrier id) to the message payload reg.2 */
> fs_reg r0_2 = fs_reg(retype(brw_vec1_grf(0, 2), BRW_REGISTER_TYPE_UD));
> -   inst = bld.exec_all().AND(component(payload, 2), r0_2, 
> fs_reg(0x0f00u));
> +   bld.exec_all().AND(component(payload, 2), r0_2, fs_reg(0x0f00u));
>  
> /* Emit a gateway "barrier" message using the payload we set up, followed
>  * by a wait instruction.
> -- 
> 2.4.3
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/skl: Use more compact hiz dimensions

2015-06-25 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-24 20:07:54, Ben Widawsky wrote:
> gen8 had some special restrictions which don't seem to carry over to gen9.
> Quoting the spec for SKL:
> "The Z_Height and Z_Width values must equal those present in
> 3DSTATE_DEPTH_BUFFER incremented by one."
> 
> This fixes nothing in piglit (and regresses nothing).
> 
> Cc: Jordan Justen 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 32 
> ++-
>  1 file changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
> b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> index 6aa969a..432a47c 100644
> --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
> @@ -1550,21 +1550,23 @@ intel_gen8_hiz_buf_create(struct brw_context *brw,
> /* Gen7 PRM Volume 2, Part 1, 11.5.3 "Hierarchical Depth Buffer" documents
>  * adjustments required for Z_Height and Z_Width based on multisampling.
>  */
> -   switch (mt->num_samples) {
> -   case 0:
> -   case 1:
> -  break;
> -   case 2:
> -   case 4:
> -  z_width *= 2;
> -  z_height *= 2;
> -  break;
> -   case 8:
> -  z_width *= 4;
> -  z_height *= 2;
> -  break;
> -   default:
> -  unreachable("unsupported sample count");
> +   if (brw->gen < 9) {
> +  switch (mt->num_samples) {
> +  case 0:
> +  case 1:
> + break;
> +  case 2:
> +  case 4:
> + z_width *= 2;
> + z_height *= 2;
> + break;
> +  case 8:
> + z_width *= 4;
> + z_height *= 2;
> + break;
> +  default:
> + unreachable("unsupported sample count");
> +  }
> }
>  
> const unsigned vertical_align = 8; /* 'j' in the docs */
> -- 
> 2.4.4
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Fix ir_txs in emit_texture_gen4_simd16().

2015-06-26 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-22 11:18:55, Kenneth Graunke wrote:
> We were not emitting the LOD, which led to message lengths of 1 instead
> of 3.  Setting has_lod makes us emit the LOD, but I had to make changes
> to avoid emitting the non-existent coordinate as well.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91022
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 4770838..12253e4 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -247,7 +247,7 @@ fs_visitor::emit_texture_gen4_simd16(ir_texture_opcode 
> op, fs_reg dst,
>   uint32_t sampler)
>  {
> fs_reg message(MRF, 2, BRW_REGISTER_TYPE_F, dispatch_width);
> -   bool has_lod = op == ir_txl || op == ir_txb || op == ir_txf;
> +   bool has_lod = op == ir_txl || op == ir_txb || op == ir_txf || op == 
> ir_txs;
>  
> if (has_lod && shadow_c.file != BAD_FILE)
>no16("TXB and TXL with shadow comparison unsupported in SIMD16.");
> @@ -264,14 +264,15 @@ fs_visitor::emit_texture_gen4_simd16(ir_texture_opcode 
> op, fs_reg dst,
> fs_reg msg_end = offset(message, vector_elements);
>  
> /* Messages other than sample and ld require all three components */
> -   if (has_lod || shadow_c.file != BAD_FILE) {
> +   if (vector_elements > 0 && (has_lod || shadow_c.file != BAD_FILE)) {
>for (int i = vector_elements; i < 3; i++) {
>   bld.MOV(offset(message, i), fs_reg(0.0f));
>}
> +  msg_end = offset(message, 3);
> }
>  
> if (has_lod) {
> -  fs_reg msg_lod = retype(offset(message, 3), op == ir_txf ?
> +  fs_reg msg_lod = retype(msg_end, op == ir_txf ?
>BRW_REGISTER_TYPE_UD : BRW_REGISTER_TYPE_F);
>bld.MOV(msg_lod, lod);
>msg_end = offset(msg_lod, 1);
> -- 
> 1.7.10.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Write at least some data in SIMD8 URB write messages.

2015-06-26 Thread Jordan Justen
On 2015-06-26 15:18:52, Kenneth Graunke wrote:
> According to the "URB SIMD8 Write > Write Data Payload" documentation,
> "The write data payload can be between 1 and 8 message phases long."

Would a more precise PRM ref location be possible?

Reviewed-by: Jordan Justen 

> Apparently, the simulator considers it an error if you issue an URB
> SIMD8 message with only a header and no actual data to write.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp |5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 9a4bad6..7074b5c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1800,14 +1800,13 @@ fs_visitor::emit_urb_writes(gl_clip_plane 
> *clip_planes)
> /* If we don't have any valid slots to write, just do a minimal urb write
>  * send to terminate the shader. */
> if (vue_map->slots_valid == 0) {
> -
> -  fs_reg payload = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
> +  fs_reg payload = fs_reg(GRF, alloc.allocate(2), BRW_REGISTER_TYPE_UD);
>bld.exec_all().MOV(payload, fs_reg(retype(brw_vec8_grf(1, 0),
>  BRW_REGISTER_TYPE_UD)));
>  
>fs_inst *inst = bld.emit(SHADER_OPCODE_URB_WRITE_SIMD8, reg_undef, 
> payload);
>inst->eot = true;
> -  inst->mlen = 1;
> +  inst->mlen = 2;
>inst->offset = 1;
>return;
> }
> -- 
> 1.7.10.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] i965: Run vector splitting for CS in brw_link_shader

2015-06-26 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 5653d6b..309f495 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -185,6 +185,7 @@ is_scalar_shader_stage(struct brw_context *brw, int stage)
 {
switch (stage) {
case MESA_SHADER_FRAGMENT:
+   case MESA_SHADER_COMPUTE:
   return true;
case MESA_SHADER_VERTEX:
   return brw->intelScreen->compiler->scalar_vs;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] i965/fs: Set first_non_payload_grf in assign_curb_setup

2015-06-26 Thread Jordan Justen
first_non_payload_grf may be updated in assign_urb_setup for FS or
assign_vs_urb_setup for VS.

Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 4292aa6..39ed89a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1324,6 +1324,9 @@ fs_visitor::assign_curb_setup()
 }
   }
}
+
+   /* This may be updated in assign_urb_setup or assign_vs_urb_setup. */
+   this->first_non_payload_grf = payload.num_regs + 
prog_data->curb_read_length;
 }
 
 void
@@ -1438,8 +1441,7 @@ fs_visitor::assign_urb_setup()
}
 
/* Each attribute is 4 setup channels, each of which is half a reg. */
-   this->first_non_payload_grf =
-  urb_start + prog_data->num_varying_inputs * 2;
+   this->first_non_payload_grf += prog_data->num_varying_inputs * 2;
 }
 
 void
@@ -1454,8 +1456,7 @@ fs_visitor::assign_vs_urb_setup()
   count++;
 
/* Each attribute is 4 regs. */
-   this->first_non_payload_grf =
-  payload.num_regs + prog_data->curb_read_length + count * 4;
+   this->first_non_payload_grf += count * 4;
 
unsigned vue_entries =
   MAX2(count, vs_prog_data->base.vue_map.num_slots);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] i965/cs: Emit texture surfaces to enable CS sampling

2015-06-26 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_context.h  | 2 +-
 src/mesa/drivers/dri/i965/brw_state_upload.c | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 51a1447..67ace37 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1449,7 +1449,7 @@ struct brw_context
 
int num_atoms[BRW_NUM_PIPELINES];
const struct brw_tracked_state render_atoms[57];
-   const struct brw_tracked_state compute_atoms[4];
+   const struct brw_tracked_state compute_atoms[5];
 
/* If (INTEL_DEBUG & DEBUG_BATCH) */
struct {
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index c2d0ba1..f677826 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -253,6 +253,7 @@ static const struct brw_tracked_state *gen7_compute_atoms[] 
=
&brw_state_base_address,
&gen7_cs_push_constants,
&brw_cs_abo_surfaces,
+   &brw_texture_surfaces,
&brw_cs_state,
 };
 
@@ -337,6 +338,7 @@ static const struct brw_tracked_state *gen8_compute_atoms[] 
=
&gen8_state_base_address,
&gen7_cs_push_constants,
&brw_cs_abo_surfaces,
+   &brw_texture_surfaces,
&brw_cs_state,
 };
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] i965/cs: Support texture sampling for CS

2015-06-26 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_cs.cpp | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_cs.cpp 
b/src/mesa/drivers/dri/i965/brw_cs.cpp
index 63e3b8d..bf43def 100644
--- a/src/mesa/drivers/dri/i965/brw_cs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_cs.cpp
@@ -221,12 +221,18 @@ brw_codegen_cs_prog(struct brw_context *brw,
 static void
 brw_cs_populate_key(struct brw_context *brw, struct brw_cs_prog_key *key)
 {
+   struct gl_context *ctx = &brw->ctx;
/* BRW_NEW_COMPUTE_PROGRAM */
const struct brw_compute_program *cp =
   (struct brw_compute_program *) brw->compute_program;
+   const struct gl_program *prog = (struct gl_program *) cp;
 
memset(key, 0, sizeof(*key));
 
+   /* _NEW_TEXTURE */
+   brw_populate_sampler_prog_key_data(ctx, prog, brw->cs.base.sampler_count,
+  &key->tex);
+
/* The unique compute program ID */
key->program_string_id = cp->id;
 }
@@ -244,9 +250,12 @@ brw_upload_cs_prog(struct brw_context *brw)
if (!cp)
   return;
 
-   if (!brw_state_dirty(brw, 0, BRW_NEW_COMPUTE_PROGRAM))
+   if (!brw_state_dirty(brw, _NEW_TEXTURE, BRW_NEW_COMPUTE_PROGRAM))
   return;
 
+   brw->cs.base.sampler_count =
+  _mesa_fls(ctx->ComputeProgram._Current->Base.SamplersUsed);
+
brw_cs_populate_key(brw, &key);
 
if (!brw_search_cache(&brw->cache, BRW_CACHE_CS_PROG,
@@ -391,7 +400,8 @@ brw_upload_cs_state(struct brw_context *brw)
if (brw->gen >= 8)
   desc[dw++] = 0; /* Kernel Start Pointer High */
desc[dw++] = 0;
-   desc[dw++] = 0;
+   desc[dw++] = stage_state->sampler_offset |
+  ((stage_state->sampler_count + 3) / 4);
desc[dw++] = stage_state->bind_bo_offset;
desc[dw++] = SET_FIELD((reg_aligned_constant_size / 32) + 0,
   MEDIA_CURBE_READ_LENGTH);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/5] i965/cs: Texture sampling support

2015-06-26 Thread Jordan Justen
With these environment overrides:
 MESA_GL_VERSION_OVERRIDE=4.3
 MESA_GLSL_VERSION_OVERRIDE=430
 MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader

This series allows this piglit test to pass:
tests/spec/arb_compute_shader/execution/basic-texelFetch.shader_test

Note, these patches were applied on top of this unreviewed patch:
http://patchwork.freedesktop.org/patch/52010/

Jordan Justen (5):
  i965: Support CS in update_stage_texture_surfaces
  i965: Run vector splitting for CS in brw_link_shader
  i965/fs: Set first_non_payload_grf in assign_curb_setup
  i965/cs: Support texture sampling for CS
  i965/cs: Emit texture surfaces to enable CS sampling

 src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
 src/mesa/drivers/dri/i965/brw_cs.cpp | 14 --
 src/mesa/drivers/dri/i965/brw_fs.cpp |  9 +
 src/mesa/drivers/dri/i965/brw_shader.cpp |  1 +
 src/mesa/drivers/dri/i965/brw_state_upload.c |  2 ++
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c |  9 -
 6 files changed, 29 insertions(+), 8 deletions(-)

-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] i965: Support CS in update_stage_texture_surfaces

2015-06-26 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
index 72aad96..0eb5361 100644
--- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
+++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
@@ -804,7 +804,7 @@ update_stage_texture_surfaces(struct brw_context *brw,
   struct brw_stage_state *stage_state,
   bool for_gather)
 {
-   if (!prog)
+   if (!prog || !stage_state->prog_data)
   return;
 
struct gl_context *ctx = &brw->ctx;
@@ -848,10 +848,14 @@ brw_update_texture_surfaces(struct brw_context *brw)
/* BRW_NEW_FRAGMENT_PROGRAM */
struct gl_program *fs = (struct gl_program *) brw->fragment_program;
 
+   /* BRW_NEW_COMPUTE_PROGRAM */
+   struct gl_program *cs = (struct gl_program *) brw->compute_program;
+
/* _NEW_TEXTURE */
update_stage_texture_surfaces(brw, vs, &brw->vs.base, false);
update_stage_texture_surfaces(brw, gs, &brw->gs.base, false);
update_stage_texture_surfaces(brw, fs, &brw->wm.base, false);
+   update_stage_texture_surfaces(brw, cs, &brw->cs.base, false);
 
/* emit alternate set of surface state for gather. this
 * allows the surface format to be overriden for only the
@@ -863,6 +867,8 @@ brw_update_texture_surfaces(struct brw_context *brw)
  update_stage_texture_surfaces(brw, gs, &brw->gs.base, true);
   if (fs && fs->UsesGather)
  update_stage_texture_surfaces(brw, fs, &brw->wm.base, true);
+  if (cs && cs->UsesGather)
+ update_stage_texture_surfaces(brw, cs, &brw->cs.base, true);
}
 
brw->ctx.NewDriverState |= BRW_NEW_SURFACES;
@@ -872,6 +878,7 @@ const struct brw_tracked_state brw_texture_surfaces = {
.dirty = {
   .mesa = _NEW_TEXTURE,
   .brw = BRW_NEW_BATCH |
+ BRW_NEW_COMPUTE_PROGRAM |
  BRW_NEW_FRAGMENT_PROGRAM |
  BRW_NEW_FS_PROG_DATA |
  BRW_NEW_GEOMETRY_PROGRAM |
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Switch on shader stage in nir_setup_outputs().

2015-06-27 Thread Jordan Justen
On 2015-06-26 16:03:21, Kenneth Graunke wrote:
> Adding new shader stages to a switch statement is less confusing than an
> if-else-if ladder where all but the first case are fragment shader
> specific (but don't claim to be).
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |   59 
> +-
>  1 file changed, 33 insertions(+), 26 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 59081ea..8bcd5e2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -133,38 +133,45 @@ fs_visitor::nir_setup_outputs(nir_shader *shader)
>   var->type->is_array() ? var->type->fields.array->vector_elements
> : var->type->vector_elements;
>  
> -  if (stage == MESA_SHADER_VERTEX) {
> +  switch (stage) {
> +  case MESA_SHADER_VERTEX:
>   for (int i = 0; i < ALIGN(type_size(var->type), 4) / 4; i++) {
>  int output = var->data.location + i;
>  this->outputs[output] = offset(reg, 4 * i);
>  this->output_components[output] = vector_elements;
>   }
> -  } else if (var->data.index > 0) {
> - assert(var->data.location == FRAG_RESULT_DATA0);
> - assert(var->data.index == 1);
> - this->dual_src_output = reg;
> - this->do_dual_src = true;
> -  } else if (var->data.location == FRAG_RESULT_COLOR) {
> - /* Writing gl_FragColor outputs to all color regions. */
> - for (unsigned int i = 0; i < MAX2(key->nr_color_regions, 1); i++) {
> -this->outputs[i] = reg;
> -this->output_components[i] = 4;
> - }
> -  } else if (var->data.location == FRAG_RESULT_DEPTH) {
> - this->frag_depth = reg;
> -  } else if (var->data.location == FRAG_RESULT_SAMPLE_MASK) {
> - this->sample_mask = reg;
> -  } else {
> - /* gl_FragData or a user-defined FS output */
> - assert(var->data.location >= FRAG_RESULT_DATA0 &&
> -var->data.location < FRAG_RESULT_DATA0 + 
> BRW_MAX_DRAW_BUFFERS);
> -
> - /* General color output. */
> - for (unsigned int i = 0; i < MAX2(1, var->type->length); i++) {
> -int output = var->data.location - FRAG_RESULT_DATA0 + i;
> -this->outputs[output] = offset(reg, vector_elements * i);
> -this->output_components[output] = vector_elements;
> + break;
> +  case MESA_SHADER_FRAGMENT:
> + if (var->data.index > 0) {
> +assert(var->data.location == FRAG_RESULT_DATA0);
> +assert(var->data.index == 1);
> +this->dual_src_output = reg;
> +this->do_dual_src = true;
> + } else if (var->data.location == FRAG_RESULT_COLOR) {
> +/* Writing gl_FragColor outputs to all color regions. */
> +for (unsigned int i = 0; i < MAX2(key->nr_color_regions, 1); 
> i++) {
> +   this->outputs[i] = reg;
> +   this->output_components[i] = 4;
> +}
> + } else if (var->data.location == FRAG_RESULT_DEPTH) {
> +this->frag_depth = reg;
> + } else if (var->data.location == FRAG_RESULT_SAMPLE_MASK) {
> +this->sample_mask = reg;
> + } else {
> +/* gl_FragData or a user-defined FS output */
> +assert(var->data.location >= FRAG_RESULT_DATA0 &&
> +   var->data.location < 
> FRAG_RESULT_DATA0+BRW_MAX_DRAW_BUFFERS);
> +
> +/* General color output. */
> +for (unsigned int i = 0; i < MAX2(1, var->type->length); i++) {
> +   int output = var->data.location - FRAG_RESULT_DATA0 + i;
> +   this->outputs[output] = offset(reg, vector_elements * i);
> +   this->output_components[output] = vector_elements;
> +}

I noticed that it looks like MESA_SHADER_FRAGMENT case could use a
switch as well on var->data.location. Does it help, or is it better to
keep the if-ladder there?

switch (var->data.location) {
   case FRAG_RESULT_COLOR;
  /* do stuff */
  break;
   case FRAG_RESULT_DEPTH:
  /* do stuff */
  break;
   case FRAG_RESULT_SAMPLE_MASK:
  /* do stuff */
  break;
   case FRAG_RESULT_DATA0:
  if (var->data.index > 0) {
 /* do stuff */
 break;
  }
  /* fall-through */
   default:
  assert(var->data.index == 0);
  /* do stuff */
  break;
}

-Jordan

>   }
> + break;
> +  default:
> + unreachable("unhandled shader stage");
>}
> }
>  }
> -- 
> 1.7.10.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists

Re: [Mesa-dev] [PATCH] i965: Switch on shader stage in nir_setup_outputs().

2015-06-28 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-26 16:03:21, Kenneth Graunke wrote:
> Adding new shader stages to a switch statement is less confusing than an
> if-else-if ladder where all but the first case are fragment shader
> specific (but don't claim to be).
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |   59 
> +-
>  1 file changed, 33 insertions(+), 26 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 59081ea..8bcd5e2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -133,38 +133,45 @@ fs_visitor::nir_setup_outputs(nir_shader *shader)
>   var->type->is_array() ? var->type->fields.array->vector_elements
> : var->type->vector_elements;
>  
> -  if (stage == MESA_SHADER_VERTEX) {
> +  switch (stage) {
> +  case MESA_SHADER_VERTEX:
>   for (int i = 0; i < ALIGN(type_size(var->type), 4) / 4; i++) {
>  int output = var->data.location + i;
>  this->outputs[output] = offset(reg, 4 * i);
>  this->output_components[output] = vector_elements;
>   }
> -  } else if (var->data.index > 0) {
> - assert(var->data.location == FRAG_RESULT_DATA0);
> - assert(var->data.index == 1);
> - this->dual_src_output = reg;
> - this->do_dual_src = true;
> -  } else if (var->data.location == FRAG_RESULT_COLOR) {
> - /* Writing gl_FragColor outputs to all color regions. */
> - for (unsigned int i = 0; i < MAX2(key->nr_color_regions, 1); i++) {
> -this->outputs[i] = reg;
> -this->output_components[i] = 4;
> - }
> -  } else if (var->data.location == FRAG_RESULT_DEPTH) {
> - this->frag_depth = reg;
> -  } else if (var->data.location == FRAG_RESULT_SAMPLE_MASK) {
> - this->sample_mask = reg;
> -  } else {
> - /* gl_FragData or a user-defined FS output */
> - assert(var->data.location >= FRAG_RESULT_DATA0 &&
> -var->data.location < FRAG_RESULT_DATA0 + 
> BRW_MAX_DRAW_BUFFERS);
> -
> - /* General color output. */
> - for (unsigned int i = 0; i < MAX2(1, var->type->length); i++) {
> -int output = var->data.location - FRAG_RESULT_DATA0 + i;
> -this->outputs[output] = offset(reg, vector_elements * i);
> -this->output_components[output] = vector_elements;
> + break;
> +  case MESA_SHADER_FRAGMENT:
> + if (var->data.index > 0) {
> +assert(var->data.location == FRAG_RESULT_DATA0);
> +assert(var->data.index == 1);
> +this->dual_src_output = reg;
> +this->do_dual_src = true;
> + } else if (var->data.location == FRAG_RESULT_COLOR) {
> +/* Writing gl_FragColor outputs to all color regions. */
> +for (unsigned int i = 0; i < MAX2(key->nr_color_regions, 1); 
> i++) {
> +   this->outputs[i] = reg;
> +   this->output_components[i] = 4;
> +}
> + } else if (var->data.location == FRAG_RESULT_DEPTH) {
> +this->frag_depth = reg;
> + } else if (var->data.location == FRAG_RESULT_SAMPLE_MASK) {
> +this->sample_mask = reg;
> + } else {
> +/* gl_FragData or a user-defined FS output */
> +assert(var->data.location >= FRAG_RESULT_DATA0 &&
> +   var->data.location < 
> FRAG_RESULT_DATA0+BRW_MAX_DRAW_BUFFERS);
> +
> +/* General color output. */
> +for (unsigned int i = 0; i < MAX2(1, var->type->length); i++) {
> +   int output = var->data.location - FRAG_RESULT_DATA0 + i;
> +   this->outputs[output] = offset(reg, vector_elements * i);
> +   this->output_components[output] = vector_elements;
> +}
>   }
> + break;
> +  default:
> + unreachable("unhandled shader stage");
>}
> }
>  }
> -- 
> 1.7.10.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 23/82] glsl: Do not do CSE for expressions involving SSBO loads

2015-06-29 Thread Jordan Justen
On 2015-06-24 07:36:24, Iago Toral wrote:
> On Wed, 2015-06-24 at 15:43 +0300, Francisco Jerez wrote:
> > AFAICT the reason why this (and many of the other changes in GLSL
> > optimization passes) is needed is because SSBO loads have been
> > implemented as ir_expression nodes instead of being lowered into
> > intrinsics (as other side-effectful operations do like
> > ARB_shader_image_load_store and ARB_shader_atomic_counters).  This
> > surely broke the assumption of a number of optimization passes that
> > ir_expression nodes behave as pure functions.  I guess the reason why
> > you've done it this way is because UBO loads were already being
> > represented as expressions, so I see why you may have wanted to use the
> > same approach for SSBOs even though there is a fundamental difference
> > between the two: UBO loads have no side effects and are constant for a
> > given set of arguments and a given shader execution, SSBO loads and
> > stores are not.  SSBO stores couldn't be accommodated into the same
> > framework so easily, and you decided to create a separate ir node for
> > them, what seems inconsistent with loads.  Intrinsics would probably
> > have been a good fit for both loads and stores, and would have made all
> > these optimization changes unnecessary...
> > 
> > P.S.: Sorry for the late reply, I was on vacation when I was CC'ed.
> 
> Right, your assessment about the reasons behind the current
> implementation is correct. I did not realize of these issues when I
> decided to go with the current implementation, now it does look like
> going with GLSL intrinsics would have made things a bit easier. I
> suppose it would make sense to revisit the implementation in the near
> future taking your work on arb_shader_image_load_store as a reference.

While we're waiting for curro's work to land, I was hoping to review
and let you guys land the first ~30 front end patches. These patches
would allow some compiler tests to pass if the extension is
overridden. (Plus, it would take a big chunk out of this large
series.)

Unfortunately, I think you should rework the load/store ops as
intrinsics as recommended by curro.

Once you have the extension working again with intrinsics, could you
re-post the early patches before the 'i965' patches start?

Does this seem like a reasonable plan?

Thanks,

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] i965/fs: Mark last used ip for all regs read in the payload

2015-07-01 Thread Jordan Justen
On 2015-07-01 11:51:37, Connor Abbott wrote:
> From: Jordan Justen 
> 
> If a source register in the push constant registers uses more than on

Doh. I have to point out my own typo. :)

on=>one

> register, then we wouldn't update payload_last_use_ip for subsequent
> registers.
> 
> Unlike most uniform data pushed into registers, the CS gl_LocalInvocationID
> data varies per execution channel. Therefore for SIMD16 mode, we have vec16
> data in the payload. In this case we then need to mark 2 registers in
> payload_last_use_ip as last used by the instruction. There's a similar
> situation for the z and w coordinates of gl_FragCoord for fragment shaders,
> where it had only happened to work before because of some bogus interferences
> which the next commit removes.
> 
> (Connor: added bit about gl_FragCoord to commit message)
> 
> Reviewed-by: Connor Abbott 
> Signed-off-by: Jordan Justen 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> index 8e5621d..620fc23 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> @@ -380,7 +380,10 @@ fs_visitor::setup_payload_interference(struct ra_graph 
> *g,
>  if (node_nr >= payload_node_count)
> continue;
>  
> -payload_last_use_ip[node_nr] = use_ip;
> +for (int j = 0; j < inst->regs_read(i); j++) {
> +   payload_last_use_ip[node_nr + j] = use_ip;
> +   assert(node_nr + j < payload_node_count);
> +}
>   }
>}
>  
> -- 
> 2.4.3
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] i965/fs: fix regs_read() for LINTERP

2015-07-01 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-07-01 11:51:35, Connor Abbott wrote:
> The second source always stays within the same SIMD8 register.
> 
> Signed-off-by: Connor Abbott 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index dd95519..38b9095 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -676,7 +676,8 @@ fs_inst::regs_read(int arg) const
> case FS_OPCODE_LINTERP:
>if (arg == 0)
>   return exec_size / 4;
> -  break;
> +  else
> + return 1;
>  
> case FS_OPCODE_PIXEL_X:
> case FS_OPCODE_PIXEL_Y:
> -- 
> 2.4.3
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] i965/fs: remove special case in setup_payload_interference()

2015-07-01 Thread Jordan Justen
On 2015-07-01 11:51:38, Connor Abbott wrote:
> From: Connor Abbott 
> 
> regs_read() will handle LINTERP for us since the previous commit. In
> addition, we were being too conservative, since it will only read 2
> registers on SIMD8.
> 
> instructions in affected programs: 9061 -> 8893 (-1.85%)
> helped:10
> HURT:  0
> GAINED:0
> LOST:      0

Nice. :)

Reviewed-by: Jordan Justen 

> All of the changes were due to spills being eliminated, mostly in KSP
> shaders.
> 
> Signed-off-by: Connor Abbott 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 20 
>  1 file changed, 20 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> index 620fc23..1ee19e4 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> @@ -389,26 +389,6 @@ fs_visitor::setup_payload_interference(struct ra_graph 
> *g,
>  
>/* Special case instructions which have extra implied registers used. 
> */
>switch (inst->opcode) {
> -  case FS_OPCODE_LINTERP:
> - /* On gen6+ in SIMD16, there are 4 adjacent registers used by
> -  * PLN's sourcing of the deltas, while we list only the first one
> -  * in the arguments.  Pre-gen6, the deltas are computed in normal
> -  * VGRFs.
> -  */
> - if (devinfo->gen >= 6) {
> -int delta_x_arg = 0;
> -if (inst->src[delta_x_arg].file == HW_REG &&
> -inst->src[delta_x_arg].fixed_hw_reg.file ==
> -BRW_GENERAL_REGISTER_FILE) {
> -   for (int i = 1; i < 4; ++i) {
> -  int node = inst->src[delta_x_arg].fixed_hw_reg.nr + i;
> -  assert(node < payload_node_count);
> -  payload_last_use_ip[node] = use_ip;
> -   }
> -}
> - }
> - break;
> -
>case CS_OPCODE_CS_TERMINATE:
>   payload_last_use_ip[0] = use_ip;
>   break;
> -- 
> 2.4.3
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/5] i965/fs: don't make unused payload registers interfere

2015-07-01 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-07-01 11:51:39, Connor Abbott wrote:
> From: Connor Abbott 
> 
> Before, we were setting payload_last_use_ip for unused payload
> registers to 0, which made them interfere with whatever the first
> instruction wrote to due to the workaround for SIMD16 uniform arguments.
> Just use -1 to mean "unused" instead, and then skip setting any
> interferences for unused payload registers.
> 
> instructions in affected programs: 0 -> 0
> helped:0
> HURT:  0
> GAINED:1
> LOST:  0
> 
> Signed-off-by: Connor Abbott 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> index 1ee19e4..f25f2ec 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
> @@ -341,7 +341,9 @@ fs_visitor::setup_payload_interference(struct ra_graph *g,
> int loop_end_ip = 0;
>  
> int payload_last_use_ip[payload_node_count];
> -   memset(payload_last_use_ip, 0, sizeof(payload_last_use_ip));
> +   for (int i = 0; i < payload_node_count; i++)
> +  payload_last_use_ip[i] = -1;
> +
> int ip = 0;
> foreach_block_and_inst(block, fs_inst, inst, cfg) {
>switch (inst->opcode) {
> @@ -411,6 +413,9 @@ fs_visitor::setup_payload_interference(struct ra_graph *g,
> }
>  
> for (int i = 0; i < payload_node_count; i++) {
> +  if (payload_last_use_ip[i] == -1)
> + continue;
> +
>/* Mark the payload node as interfering with any virtual grf that is
> * live between the start of the program and our last use of the 
> payload
> * node.
> -- 
> 2.4.3
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/cs: Initialize GPGPU Thread Count

2015-07-02 Thread Jordan Justen
On 2015-06-25 11:34:59, Ben Widawsky wrote:
> On Thu, Jun 11, 2015 at 09:04:45PM -0700, Jordan Justen wrote:
> > +   desc[dw++] = 0;
> > +   const uint32_t media_threads =
> > +  brw->gen >= 8 ?
> > +  SET_FIELD(threads, GEN8_MEDIA_GPGPU_THREAD_COUNT) :
> > +  SET_FIELD(threads, MEDIA_GPGPU_THREAD_COUNT);
> > +   desc[dw++] = media_threads;
> 
> What's the deal with, "The maximum value for global barriers is limited by the
> number of threads in the system, or by 511," Can we add an assert?

I guess we are using a local barrier, so "the maximum value is the
number of threads in a subslice for local barriers".

How about I add assert(threads <= brw->max_cs_threads)? Although, in
brw_compute:brw_emit_gpgpu_walker we have a similar assertion.

-Jordan

> >  
> > BEGIN_BATCH(4);
> > OUT_BATCH(MEDIA_INTERFACE_DESCRIPTOR_LOAD << 16 | (4 - 2));
> > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > index f6da305..2a8f500 100644
> > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > @@ -2495,6 +2495,11 @@ enum brw_wm_barycentric_interp_mode {
> >  # define MEDIA_VFE_STATE_CURBE_ALLOC_MASK   INTEL_MASK(15, 0)
> >  
> >  #define MEDIA_INTERFACE_DESCRIPTOR_LOAD 0x7002
> > +/* GEN7 DW5, GEN8+ DW6 */
> > +# define MEDIA_GPGPU_THREAD_COUNT_SHIFT 0
> > +# define MEDIA_GPGPU_THREAD_COUNT_MASK  INTEL_MASK(7, 0)
> > +# define GEN8_MEDIA_GPGPU_THREAD_COUNT_SHIFT0
> > +# define GEN8_MEDIA_GPGPU_THREAD_COUNT_MASK INTEL_MASK(9, 0)
> >  #define MEDIA_STATE_FLUSH   0x7004
> >  #define GPGPU_WALKER0x7105
> >  /* GEN8+ DW2 */
> > -- 
> > 2.1.4
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 30/82] i965: Use 16-byte offset alignment for shader storage buffers

2015-07-09 Thread Jordan Justen
30-32 Reviewed-by: Jordan Justen 

On 2015-06-03 00:01:20, Iago Toral Quiroga wrote:
> This is the same we do for other things like uniforms because it ensures
> optimal performance.
> ---
>  src/mesa/drivers/dri/i965/brw_context.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
> b/src/mesa/drivers/dri/i965/brw_context.c
> index 274a237..5d21747 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -544,6 +544,7 @@ brw_initialize_context_constants(struct brw_context *brw)
>  * However, unaligned accesses are slower, so enforce buffer alignment.
>  */
> ctx->Const.UniformBufferOffsetAlignment = 16;
> +   ctx->Const.ShaderStorageBufferOffsetAlignment = 16;
> ctx->Const.TextureBufferOffsetAlignment = 16;
>  
> if (brw->gen >= 6) {
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 (part1) 25/26] glsl: Lower shader storage buffer object writes to GLSL IR instrinsics

2015-07-10 Thread Jordan Justen
isitor::ubo_load(const glsl_type 
> *type,
>  
>  }
>  
> +static bool
> +shader_storage_buffer_object(const _mesa_glsl_parse_state *state)
> +{
> +   return state->ARB_shader_storage_buffer_object_enable;
> +}
> +
> +ir_call *
> +lower_ubo_reference_visitor::ssbo_write(ir_rvalue *deref,
> +ir_rvalue *offset,
> +unsigned write_mask)
> +{
> +   exec_list sig_params;
> +
> +   ir_variable *block_ref = new(mem_ctx)
> +  ir_variable(glsl_type::uint_type, "block_ref" , ir_var_function_in);
> +   sig_params.push_tail(block_ref);
> +
> +   ir_variable *offset_ref = new(mem_ctx)
> +  ir_variable(glsl_type::uint_type, "offset" , ir_var_function_in);
> +   sig_params.push_tail(offset_ref);
> +
> +   ir_variable *val_ref = new(mem_ctx)
> +  ir_variable(deref->type, "value" , ir_var_function_in);
> +   sig_params.push_tail(val_ref);
> +
> +   ir_variable *writemask_ref = new(mem_ctx)
> +  ir_variable(glsl_type::uint_type, "write_mask" , ir_var_function_in);
> +   sig_params.push_tail(writemask_ref);
> +
> +   ir_function_signature *sig = new(mem_ctx)
> +  ir_function_signature(glsl_type::void_type, 
> shader_storage_buffer_object);
> +   assert(sig);
> +   sig->replace_parameters(&sig_params);
> +   sig->is_intrinsic = true;
> +
> +   ir_function *f = new(mem_ctx) ir_function("__intrinsic_store_ssbo");
> +   f->add_signature(sig);
> +
> +   exec_list call_params;
> +   call_params.push_tail(this->uniform_block->clone(mem_ctx, NULL));
> +   call_params.push_tail(offset->clone(mem_ctx, NULL));
> +   call_params.push_tail(deref->clone(mem_ctx, NULL));
> +   call_params.push_tail(new(mem_ctx) ir_constant(write_mask));
> +   return new(mem_ctx) ir_call(sig, NULL, &call_params);
> +}
> +
> +static inline int
> +writemask_for_size(unsigned n)
> +{
> +   return ((1 << n) - 1);
> +}
> +
>  /**
> - * Takes LHS and emits a series of assignments into its components
> - * from the UBO variable at variable_offset + deref_offset.
> - *
> - * Recursively calls itself to break the deref down to the point that
> - * the ir_binop_ubo_load expressions generated are contiguous scalars
> - * or vectors.
> + * Takes a deref and recursively calls itself to break the deref down to the
> + * point that the reads or writes generated are contiguous scalars or 
> vectors.
>   */
>  void
> -lower_ubo_reference_visitor::emit_ubo_loads(ir_dereference *deref,
> -   ir_variable *base_offset,
> -unsigned int deref_offset,
> -bool row_major,
> -int matrix_columns)
> +lower_ubo_reference_visitor::emit_reads_or_writes(bool is_write,
> +  ir_dereference *deref,
> +  ir_variable *base_offset,
> +  unsigned int deref_offset,
> +      bool row_major,
> +  int matrix_columns,
> +  unsigned write_mask)
>  {
> if (deref->type->is_record()) {
>unsigned int field_offset = 0;
>  
>for (unsigned i = 0; i < deref->type->length; i++) {
> -const struct glsl_struct_field *field =
> -   &deref->type->fields.structure[i];
> -ir_dereference *field_deref =
> -   new(mem_ctx) ir_dereference_record(deref->clone(mem_ctx, NULL),
> -  field->name);
> -
> -field_offset =
> -   glsl_align(field_offset,
> + const struct glsl_struct_field *field =
> +&deref->type->fields.structure[i];
> + ir_dereference *field_deref =
> +new(mem_ctx) ir_dereference_record(deref->clone(mem_ctx, NULL),
> +   field->name);
> +
> + field_offset =
> +glsl_align(field_offset,
> field->type->std140_base_alignment(row_major));
>  
> -emit_ubo_loads(field_deref, base_offset, deref_offset + field_offset,
> -row_major, 1);
> + emit_reads_or_writes(is_write, field_deref, base_offset,
> +  deref_offset + field_offset,
> +  row_major, 1,
> +  
> w

Re: [Mesa-dev] [PATCH v3 (part1) 26/26] glsl: Lower shader storage buffer object loads to GLSL IR instrinsics

2015-07-10 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-07-10 03:13:45, Iago Toral Quiroga wrote:
> From: Samuel Iglesias Gonsalvez 
> 
> Extend the existing lower_ubo_reference pass to also detect SSBO loads
> and lower them to __intrinsic_load_ssbo intrinsics.
> 
> Signed-off-by: Samuel Iglesias Gonsalvez 
> ---
>  src/glsl/lower_ubo_reference.cpp | 73 
> +++-
>  1 file changed, 65 insertions(+), 8 deletions(-)
> 
> diff --git a/src/glsl/lower_ubo_reference.cpp 
> b/src/glsl/lower_ubo_reference.cpp
> index 460b490..822b723 100644
> --- a/src/glsl/lower_ubo_reference.cpp
> +++ b/src/glsl/lower_ubo_reference.cpp
> @@ -150,7 +150,8 @@ public:
>  int *matrix_columns);
> ir_expression *ubo_load(const struct glsl_type *type,
>ir_rvalue *offset);
> -
> +   ir_call *ssbo_load(const struct glsl_type *type,
> +  ir_rvalue *offset);
>  
> void check_for_ssbo_write(ir_assignment *ir);
> void write_to_memory(ir_dereference *deref,
> @@ -170,6 +171,7 @@ public:
> struct gl_uniform_buffer_variable *ubo_var;
> ir_rvalue *uniform_block;
> bool progress;
> +   bool is_shader_storage;
>  };
>  
>  /**
> @@ -266,6 +268,8 @@ 
> lower_ubo_reference_visitor::setup_for_load_or_write(ir_variable *var,
>  this->uniform_block = index;
>   }
>  
> + this->is_shader_storage = shader->UniformBlocks[i].IsShaderStorage;
> +
>   struct gl_uniform_block *block = &shader->UniformBlocks[i];
>  
>   this->ubo_var = var->is_interface_instance()
> @@ -415,7 +419,7 @@ lower_ubo_reference_visitor::handle_rvalue(ir_rvalue 
> **rvalue)
> if (!var || !var->is_in_buffer_block())
>return;
>  
> -   mem_ctx = ralloc_parent(*rvalue);
> +   mem_ctx = ralloc_parent(shader->ir);
>  
> ir_rvalue *offset = NULL;
> unsigned const_offset;
> @@ -512,6 +516,42 @@ lower_ubo_reference_visitor::ssbo_write(ir_rvalue *deref,
> return new(mem_ctx) ir_call(sig, NULL, &call_params);
>  }
>  
> +ir_call *
> +lower_ubo_reference_visitor::ssbo_load(const struct glsl_type *type,
> +   ir_rvalue *offset)
> +{
> +   exec_list sig_params;
> +
> +   ir_variable *block_ref = new(mem_ctx)
> +  ir_variable(glsl_type::uint_type, "block_ref" , ir_var_function_in);
> +   sig_params.push_tail(block_ref);
> +
> +   ir_variable *offset_ref = new(mem_ctx)
> +  ir_variable(glsl_type::uint_type, "offset_ref" , ir_var_function_in);
> +   sig_params.push_tail(offset_ref);
> +
> +   ir_function_signature *sig =
> +  new(mem_ctx) ir_function_signature(type, shader_storage_buffer_object);
> +   assert(sig);
> +   sig->replace_parameters(&sig_params);
> +   sig->is_intrinsic = true;
> +
> +   ir_function *f = new(mem_ctx) ir_function("__intrinsic_load_ssbo");
> +   f->add_signature(sig);
> +
> +   ir_variable *result = new(mem_ctx)
> +  ir_variable(type, "ssbo_load_result", ir_var_temporary);
> +   base_ir->insert_before(result);
> +   ir_dereference_variable *deref_result = new(mem_ctx)
> +  ir_dereference_variable(result);
> +
> +   exec_list call_params;
> +   call_params.push_tail(this->uniform_block->clone(mem_ctx, NULL));
> +   call_params.push_tail(offset->clone(mem_ctx, NULL));
> +
> +   return new(mem_ctx) ir_call(sig, deref_result, &call_params);
> +}
> +
>  static inline int
>  writemask_for_size(unsigned n)
>  {
> @@ -610,9 +650,17 @@ lower_ubo_reference_visitor::emit_reads_or_writes(bool 
> is_write,
>   add(base_offset, new(mem_ctx) ir_constant(deref_offset));
>if (is_write)
>   base_ir->insert_after(ssbo_write(deref, offset, write_mask));
> -  else
> - base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> -   ubo_load(deref->type, offset)));
> +  else {
> + if (!this->is_shader_storage) {
> + base_ir->insert_before(assign(deref->clone(mem_ctx, NULL),
> +   ubo_load(deref->type, offset)));
> + } else {
> +ir_call *load_ssbo = ssbo_load(deref->type, offset);
> +base_ir->insert_before(load_ssbo);
> +ir_rvalue *value = 
> load_ssbo->return_deref->as_rvalue()->clone(mem_ctx, NULL);
> +base_ir->insert_before(assign(deref->clone(mem_ctx, NULL), 
> value));
> + }
> +  }
> } else {
>unsigned N = deref->type->

Re: [Mesa-dev] [PATCH v2 33/82] i965: Upload Shader Storage Buffer Object surfaces

2015-07-10 Thread Jordan Justen
On 2015-06-03 00:01:23, Iago Toral Quiroga wrote:
> Since these are a special kind of UBOs we emit them together reusing the
> same infrastructure, however, we use a RAW surface so we can reuse
> existing untyped read/write/atomic messages which include a pixel mask
> header that we need to set to obtain correct behavior with helper
> invocations of the fragment shader.
> ---
>  src/mesa/drivers/dri/i965/brw_context.h  |  6 +++
>  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 62 
> +++-
>  2 files changed, 56 insertions(+), 12 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 01c4283..154d7ae 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1718,6 +1718,12 @@ void brw_create_constant_surface(struct brw_context 
> *brw,
>   uint32_t size,
>   uint32_t *out_offset,
>   bool dword_pitch);
> +void brw_create_buffer_surface(struct brw_context *brw,
> +   drm_intel_bo *bo,
> +   uint32_t offset,
> +   uint32_t size,
> +   uint32_t *out_offset,
> +   bool dword_pitch);
>  void brw_update_buffer_texture_surface(struct gl_context *ctx,
> unsigned unit,
> uint32_t *surf_offset);
> diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c 
> b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> index 160dd2f..e9ccdd6 100644
> --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> @@ -410,6 +410,29 @@ brw_create_constant_surface(struct brw_context *brw,
>  }
>  
>  /**
> + * Create the buffer surface.  Vertex/fragment shader buffer variables will 
> be

Only vertex/fragment? How about 'Shader buffer variables' ...?

Reviewed-by: Jordan Justen 

> + * read from / write to this buffer with Data Port Read/Write
> + * instructions/messages.
> + */
> +void
> +brw_create_buffer_surface(struct brw_context *brw,
> +  drm_intel_bo *bo,
> +  uint32_t offset,
> +  uint32_t size,
> +  uint32_t *out_offset,
> +  bool dword_pitch)
> +{
> +   /* Use a raw surface so we can reuse existing untyped read/write/atomic
> +* messages. We need these specifically for the fragment shader since they
> +* include a pixel mask header that we need to ensure correct behavior
> +* with helper invocations, which cannot write to the buffer.
> +*/
> +   brw->vtbl.emit_buffer_surface_state(brw, out_offset, bo, offset,
> +   BRW_SURFACEFORMAT_RAW,
> +   size, 1, true);
> +}
> +
> +/**
>   * Set up a binding table entry for use by stream output logic (transform
>   * feedback).
>   *
> @@ -897,24 +920,39 @@ brw_upload_ubo_surfaces(struct brw_context *brw,
>&stage_state->surf_offset[prog_data->binding_table.ubo_start];
>  
> for (int i = 0; i < shader->NumUniformBlocks; i++) {
> -  struct gl_uniform_buffer_binding *binding;
>struct intel_buffer_object *intel_bo;
>  
> -  binding = 
> &ctx->UniformBufferBindings[shader->UniformBlocks[i].Binding];
> -  intel_bo = intel_buffer_object(binding->BufferObject);
> -  drm_intel_bo *bo =
> - intel_bufferobj_buffer(brw, intel_bo,
> -binding->Offset,
> -binding->BufferObject->Size - 
> binding->Offset);
> -
>/* Because behavior for referencing outside of the binding's size in 
> the
> * glBindBufferRange case is undefined, we can just bind the whole 
> buffer
> * glBindBufferBase wants and be a correct implementation.
> */
> -  brw_create_constant_surface(brw, bo, binding->Offset,
> -  bo->size - binding->Offset,
> -  &surf_offsets[i],
> -  dword_pitch);
> +  if (!shader->UniformBlocks[i].IsShaderStorage) {
> + struct gl_uniform_buffer_binding *binding;
> + binding =
> +&ctx->UniformBufferBindings[shader->UniformBlocks[i].Binding];
> + intel_bo = intel_buffer_object(binding->BufferObject);
> + drm_intel_bo *bo =
>

Re: [Mesa-dev] [PATCH v2 34/82] i965: handle visiting of ir_var_shader_storage variables

2015-07-10 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-03 00:01:24, Iago Toral Quiroga wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index a2a75a4..13496a3 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -1053,11 +1053,12 @@ vec4_visitor::visit(ir_variable *ir)
>break;
>  
> case ir_var_uniform:
> +   case ir_var_shader_storage:
>reg = new(this->mem_ctx) dst_reg(UNIFORM, this->uniforms);
>  
>/* Thanks to the lower_ubo_reference pass, we will see only
> -   * ir_binop_ubo_load expressions and not ir_dereference_variable for 
> UBO
> -   * variables, so no need for them to be in variable_ht.
> +   * ir_binop_{ubo,ssbo}_load expressions and not ir_dereference_variable
> +   * for UBO/SSBO variables, so no need for them to be in variable_ht.
> *
> * Some uniforms, such as samplers and atomic counters, have no actual
> * storage, so we should ignore them.
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 35/82] i965/fs: Do not split buffer variables

2015-07-10 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-06-03 00:01:25, Iago Toral Quiroga wrote:
> Buffer variables are the same as uniforms, only that read/write, so we want
> the same treatment.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
> index 01d3a56..61b6ebc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
> @@ -104,6 +104,7 @@ 
> ir_vector_reference_visitor::get_variable_entry(ir_variable *var)
>  
> switch (var->data.mode) {
> case ir_var_uniform:
> +   case ir_var_shader_storage:
> case ir_var_shader_in:
> case ir_var_shader_out:
> case ir_var_system_value:
> -- 
> 1.9.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] i965: Push miptree tiling request into flags

2015-07-14 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-07-14 09:56:09, Ben Widawsky wrote:
> With the last few patches a way was provided to influence lower layer miptree
> layout and allocation decisions via flags (replacing bools). For simplicity, I
> chose not to touch the tiling requests because the change was slightly less
> mechanical than replacing the bools.
> 
> The goal is to organize the code so we can continue to add new parameters and
> tiling types while minimizing risk to the existing code, and not having to
> constantly add new function parameters.
> 
> v2: Rebased on Anuj's recent Yf/Ys changes
> Fix non-msrt MCS allocation (was only happening in gen8 case before)
> 
> Cc: Anuj Phogat 
> Cc: Chad Versace 
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/dri/i965/brw_tex_layout.c | 21 ++--
>  src/mesa/drivers/dri/i965/intel_fbo.c  |  6 ++--
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.c  | 45 
> +-
>  src/mesa/drivers/dri/i965/intel_mipmap_tree.h  | 15 -
>  src/mesa/drivers/dri/i965/intel_tex.c  |  2 +-
>  src/mesa/drivers/dri/i965/intel_tex_image.c|  3 +-
>  src/mesa/drivers/dri/i965/intel_tex_validate.c |  5 +--
>  7 files changed, 50 insertions(+), 47 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
> b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> index 389834f..a12b4af 100644
> --- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
> +++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
> @@ -614,8 +614,8 @@ brw_miptree_layout_texture_3d(struct brw_context *brw,
>   */
>  static uint32_t
>  brw_miptree_choose_tiling(struct brw_context *brw,
> -  enum intel_miptree_tiling_mode requested,
> -  const struct intel_mipmap_tree *mt)
> +  const struct intel_mipmap_tree *mt,
> +  uint32_t layout_flags)
>  {
> if (mt->format == MESA_FORMAT_S_UINT8) {
>/* The stencil buffer is W tiled. However, we request from the kernel a
> @@ -624,15 +624,18 @@ brw_miptree_choose_tiling(struct brw_context *brw,
>return I915_TILING_NONE;
> }
>  
> +   /* Do not support changing the tiling for miptrees with pre-allocated 
> BOs. */
> +   assert((layout_flags & MIPTREE_LAYOUT_FOR_BO) == 0);
> +
> /* Some usages may want only one type of tiling, like depth miptrees (Y
>  * tiled), or temporary BOs for uploading data once (linear).
>  */
> -   switch (requested) {
> -   case INTEL_MIPTREE_TILING_ANY:
> +   switch (layout_flags & MIPTREE_LAYOUT_ALLOC_ANY_TILED) {
> +   case MIPTREE_LAYOUT_ALLOC_ANY_TILED:
>break;
> -   case INTEL_MIPTREE_TILING_Y:
> +   case MIPTREE_LAYOUT_ALLOC_YTILED:
>return I915_TILING_Y;
> -   case INTEL_MIPTREE_TILING_NONE:
> +   case MIPTREE_LAYOUT_ALLOC_LINEAR:
>return I915_TILING_NONE;
> }
>  
> @@ -835,7 +838,6 @@ intel_miptree_can_use_tr_mode(const struct 
> intel_mipmap_tree *mt)
>  void
>  brw_miptree_layout(struct brw_context *brw,
> struct intel_mipmap_tree *mt,
> -   enum intel_miptree_tiling_mode requested,
> uint32_t layout_flags)
>  {
> const unsigned bpp = mt->cpp * 8;
> @@ -852,8 +854,7 @@ brw_miptree_layout(struct brw_context *brw,
>!(layout_flags & MIPTREE_LAYOUT_FOR_BO) &&
>!mt->compressed &&
>_mesa_is_format_color_format(mt->format) &&
> -  (requested == INTEL_MIPTREE_TILING_Y ||
> -   requested == INTEL_MIPTREE_TILING_ANY) &&
> +  (layout_flags & MIPTREE_LAYOUT_ALLOC_YTILED) &&
>(bpp && is_power_of_two(bpp)) &&
>/* FIXME: To avoid piglit regressions keep the Yf/Ys tiling
> * disabled at the moment.
> @@ -897,7 +898,7 @@ brw_miptree_layout(struct brw_context *brw,
>if (layout_flags & MIPTREE_LAYOUT_FOR_BO)
>   break;
>  
> -  mt->tiling = brw_miptree_choose_tiling(brw, requested, mt);
> +  mt->tiling = brw_miptree_choose_tiling(brw, mt, layout_flags);
>if (is_tr_mode_yf_ys_allowed) {
>   if (intel_miptree_can_use_tr_mode(mt))
>  break;
> diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
> b/src/mesa/drivers/dri/i965/intel_fbo.c
> index 05e3f8b..26f895b 100644
> --- a/src/mesa/drivers/dri/i965/intel_fbo.c
> +++ b/src/mesa/drivers/dri/i965/intel_fbo.c
> @@ -1022,6 +1022,9 @@ intel_renderbuffer_move_to_temp(struct brw_context *brw,
> struct intel_mipmap_tree *new_mt;
> int width, height, depth;
>  
> +   uint32_t layout_flags = MI

[Mesa-dev] [PATCH] i965/cs: Use dispatch width of 8 for cs terminate

2015-07-16 Thread Jordan Justen
This prevents an assertion failure in brw_fs_live_variables.cpp,
fs_live_variables::setup_one_write: Assertion `var < num_vars' failed.

Signed-off-by: Jordan Justen 
Cc: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 94d6a58..62dfb9a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1960,11 +1960,12 @@ fs_visitor::emit_cs_terminate()
 */
struct brw_reg g0 = retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD);
fs_reg payload = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
-   bld.exec_all().MOV(payload, g0);
+   const fs_builder bld8 = bld.group(8, 0);
+   bld8.exec_all().MOV(payload, g0);
 
/* Send a message to the thread spawner to terminate the thread. */
-   fs_inst *inst = bld.exec_all()
-  .emit(CS_OPCODE_CS_TERMINATE, reg_undef, payload);
+   fs_inst *inst = bld8.exec_all()
+   .emit(CS_OPCODE_CS_TERMINATE, reg_undef, payload);
inst->eot = true;
 }
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/cs: Use dispatch width of 8 for cs terminate

2015-07-16 Thread Jordan Justen
On 2015-07-16 13:59:45, Jason Ekstrand wrote:
>On Jul 16, 2015 2:00 PM, "Jordan Justen" 
>wrote:
>>
>> This prevents an assertion failure in brw_fs_live_variables.cpp,
>> fs_live_variables::setup_one_write: Assertion `var < num_vars' failed.
> 
>Best guess is that you should just fix regs_read to return the right value
>(1 in this case).  Most other send instructions use mlen but that may not
>be needed tour CS_TERMINATE.

I think regs_read will fix a similar assert in setup_one_read. The MOV
will still hit the setup_one_write assert. Should I be generating the
MOV differently?

For example

   bld.group(8, 0).exec_all().MOV(payload, g0);

works in combo with an update to regs_read.

-Jordan

>> Signed-off-by: Jordan Justen 
>> Cc: Jason Ekstrand 
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> index 94d6a58..62dfb9a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> @@ -1960,11 +1960,12 @@ fs_visitor::emit_cs_terminate()
>>  */
>> struct brw_reg g0 = retype(brw_vec8_grf(0, 0),
>BRW_REGISTER_TYPE_UD);
>> fs_reg payload = fs_reg(GRF, alloc.allocate(1),
>BRW_REGISTER_TYPE_UD);
>> -   bld.exec_all().MOV(payload, g0);
>> +   const fs_builder bld8 = bld.group(8, 0);
>> +   bld8.exec_all().MOV(payload, g0);
>>
>> /* Send a message to the thread spawner to terminate the thread. */
>> -   fs_inst *inst = bld.exec_all()
>> -  .emit(CS_OPCODE_CS_TERMINATE, reg_undef,
>payload);
>> +   fs_inst *inst = bld8.exec_all()
>> +   .emit(CS_OPCODE_CS_TERMINATE, reg_undef,
>payload);
>> inst->eot = true;
>>  }
>>
>> --
>> 2.1.4
>>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/cs: Use dispatch width of 8 for cs terminate

2015-07-16 Thread Jordan Justen
On 2015-07-16 14:37:59, Jason Ekstrand wrote:
>On Jul 16, 2015 4:35 PM, "Jordan Justen" 
>wrote:
>>
>> On 2015-07-16 13:59:45, Jason Ekstrand wrote:
>> >On Jul 16, 2015 2:00 PM, "Jordan Justen"
>
>> >wrote:
>> >>
>> >> This prevents an assertion failure in brw_fs_live_variables.cpp,
>> >> fs_live_variables::setup_one_write: Assertion `var < num_vars'
>failed.
>> >
>> >Best guess is that you should just fix regs_read to return the
>right value
>> >(1 in this case).  Most other send instructions use mlen but that
>may not
>> >be needed tour CS_TERMINATE.
>>
>> I think regs_read will fix a similar assert in setup_one_read. The MOV
>> will still hit the setup_one_write assert. Should I be generating the
>> MOV differently?
>>
>> For example
>>
>>bld.group(8, 0).exec_all().MOV(payload, g0);
> 
>Yes, that's how we need to do the MOV.
> 
>Out of curiosity, why are you emitting a MOV at all and not just calling
>CS_TERMINATE with g0 directly?

While sending from g0 appears to work, apparently we are supposed to
do the final send from a high register. Ken found some wording that
appeared to indicate that we needed to do this even on compute:

http://article.gmane.org/gmane.comp.video.mesa3d.devel/98098

-Jordan

>> works in combo with an update to regs_read.
>>
>> -Jordan
>>
>> >> Signed-off-by: Jordan Justen 
>> >> Cc: Jason Ekstrand 
>> >> ---
>> >>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ---
>> >>  1 file changed, 4 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >> index 94d6a58..62dfb9a 100644
>> >> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> >> @@ -1960,11 +1960,12 @@ fs_visitor::emit_cs_terminate()
>> >>  */
>> >> struct brw_reg g0 = retype(brw_vec8_grf(0, 0),
>> >BRW_REGISTER_TYPE_UD);
>> >> fs_reg payload = fs_reg(GRF, alloc.allocate(1),
>> >BRW_REGISTER_TYPE_UD);
>> >> -   bld.exec_all().MOV(payload, g0);
>> >> +   const fs_builder bld8 = bld.group(8, 0);
>> >> +   bld8.exec_all().MOV(payload, g0);
>> >>
>> >> /* Send a message to the thread spawner to terminate the
>thread. */
>> >> -   fs_inst *inst = bld.exec_all()
>> >> -  .emit(CS_OPCODE_CS_TERMINATE, reg_undef,
>> >payload);
>> >> +   fs_inst *inst = bld8.exec_all()
>> >> +   .emit(CS_OPCODE_CS_TERMINATE, reg_undef,
>> >payload);
>> >> inst->eot = true;
>> >>  }
>> >>
>> >> --
>> >> 2.1.4
>> >>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965/cs: Use dispatch width of 8 for cs terminate payload setup

2015-07-16 Thread Jordan Justen
This prevents an assertion failure in brw_fs_live_variables.cpp,
fs_live_variables::setup_one_write: Assertion `var < num_vars' failed.

Signed-off-by: Jordan Justen 
Cc: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 94d6a58..d6a60a7 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1960,7 +1960,7 @@ fs_visitor::emit_cs_terminate()
 */
struct brw_reg g0 = retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD);
fs_reg payload = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD);
-   bld.exec_all().MOV(payload, g0);
+   bld.group(8, 0).exec_all().MOV(payload, g0);
 
/* Send a message to the thread spawner to terminate the thread. */
fs_inst *inst = bld.exec_all()
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965/cs: Return 1 for regs_read on CS_OPCODE_CS_TERMINATE

2015-07-16 Thread Jordan Justen
This prevents an assertion failure in brw_fs_live_variables.cpp,
fs_live_variables::setup_one_read: Assertion `var < num_vars' failed.

Signed-off-by: Jordan Justen 
Cc: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index ff0675d..2e3eb05 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -698,6 +698,9 @@ fs_inst::regs_read(int arg) const
  return 1;
   break;
 
+   case CS_OPCODE_CS_TERMINATE:
+  return 1;
+
default:
   if (is_tex() && arg == 0 && src[0].file == GRF)
  return mlen;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965/cs: Setup push constant data for uniforms

2015-07-17 Thread Jordan Justen
On 2015-07-09 17:15:14, Ben Widawsky wrote:
> On Tue, Jun 16, 2015 at 02:21:39PM -0700, Jordan Justen wrote:
> > +static void
> > +gen7_upload_cs_push_constants(struct brw_context *brw)
> > +{
> > +   struct brw_stage_state *stage_state = &brw->cs.base;
> > +
> > +   /* BRW_NEW_COMPUTE_PROGRAM */
> > +   const struct brw_compute_program *cp =
> > +  (struct brw_compute_program *) brw->compute_program;
> > +
> > +   if (cp) {
> > +  /* CACHE_NEW_CS_PROG */
> > +  struct brw_cs_prog_data *cs_prog_data = brw->cs.prog_data;
> > +
> > +  brw_upload_cs_push_constants(brw, &cp->program.Base, cs_prog_data,
> > +   stage_state, AUB_TRACE_WM_CONSTANTS);
> 
> I'm uncertain, but I wonder if we need to flag that you've destroyed the URB
> state like it's done in gen7_push_constant_space state atom.

I tried looking for some documentation that indicated if compute's
usage of the URB would invalidate the render URB allocations/state,
but I didn't see anything...

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/4] i965: skip control-flow aware liveness analysis if we only have 1 block

2015-10-13 Thread Jordan Justen
On 2015-10-13 05:17:37, Francisco Jerez wrote:
> Iago Toral Quiroga  writes:
> 
> > This fixes the following test:
> >
> > [require]
> > GL >= 3.3
> > GLSL >= 3.30
> > GL_ARB_shader_storage_buffer_object
> >
> > [fragment shader]
> > #version 330
> > #extension GL_ARB_shader_storage_buffer_object: require
> >
> > buffer SSBO {
> > mat4 sm4;
> > };
> >
> >
> > mat4 um4;
> >
> > void main() {
> > sm4 *= um4;
> 
> This is using the value of "um4", which is never assigned, so liveness
> analysis will correctly extend its live interval up to the start of the
> block.

This test was derived by simplifying a CTS test case.

Anyway, I'm not sure what happened on the way to the commit message,
but um4 should be a uniform.

http://sprunge.us/cEUe

-Jordan

> The other problem here seems to be that the liveness analysis pass
> doesn't consider scalar writes (like the ones emitted by the
> combine_constants optimization pass and by emit_uniformize()) to fully
> define the contents of a register, so it will incorrectly extend up the
> live interval of registers defined using scalar writes even if all
> components ever used in the shader have been defined individually.
> Incidentally the use-def-chains-based implementation of liveness
> analysis I'm working on will fix this issue easily.
> 
> > sm4[0][0] = 0.0;
> > }
> >
> > [test]
> > link success
> >
> > where our liveness analysis works really bad because the control-flow part
> > will expand register liveness to the end of only block we have (so every
> > register will be marked alive until the end of the program). This 
> > artificially
> > increases register pressure to a point in which we run out of registers.
> > Of course, this is only a simple optimization for a trivial case, the
> > underlying problem still exists and would manifest when we have more than
> > one block, but it helps simple shaders such as the one in the example above
> > without any effort. I guess the real fix would involve re-thinking parts of 
> > the
> > liveness analysis algorithm...
> >
> > Jordan, there is another thing that we can improve for this shader that is
> > specific to SSBOs: we emit the same ssbo load multiple times because we are
> > playing it safe just in case there are writes in between. I think we can 
> > try to
> > do better and not re-issue the same load if we don't have ssbo stores to
> > the same address in between. I'll try to come up with a patch for this
> > (hopefully we can do this inside lower_ubo_reference).
> >
> > The actual fix goes into patches 2 (FS) and 4 (VS). Patches 1,3 are just
> > indentation fixes in the code around these.
> >
> > Iago Toral Quiroga (4):
> >   i965/fs: Fix indentation in fs_live_variables::compute_start_end
> >   i965/fs: skip control-flow aware liveness analysis if we only have 1
> > block
> >   i965/vec4: fix indentation in vec4_visitor::calculate_live_intervals
> >   i965/vec4: skip control-flow aware liveness analysis if we only have 1
> > block
> >
> >  .../drivers/dri/i965/brw_fs_live_variables.cpp | 24 
> > ++
> >  .../drivers/dri/i965/brw_vec4_live_variables.cpp   | 23 
> > +
> >  2 files changed, 30 insertions(+), 17 deletions(-)
> >
> > -- 
> > 1.9.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/fs: Restore compute shader support in brw_nir_lower_inputs

2015-10-13 Thread Jordan Justen
The commit shown below caused compute shaders to hit the unreachable
in the default of the switch block. Restore compute shaders to use the
fragment shader path.

Also, simplify the fragment/compute path to only support scalar mode.

commit 2953c3d76178d7589947e6ea1dbd902b7b02b3d4
Author: Kenneth Graunke 
Date:   Fri Aug 14 15:15:11 2015 -0700

i965/vs: Map scalar VS input locations properly; avoid tons of MOVs.

Signed-off-by: Jordan Justen 
Cc: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_nir.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 4f35d81..357ee4f 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -96,8 +96,10 @@ brw_nir_lower_inputs(nir_shader *nir, bool is_scalar)
   }
   break;
case MESA_SHADER_FRAGMENT:
+   case MESA_SHADER_COMPUTE:
+  assert(is_scalar);
   nir_assign_var_locations(&nir->inputs, &nir->num_inputs,
-   is_scalar ? type_size_scalar : type_size_vec4);
+   type_size_scalar);
   break;
default:
   unreachable("unsupported shader stage");
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] glsl: Support uint index in do_vec_index_to_cond_assign

2015-10-13 Thread Jordan Justen
The ES31-CTS.compute_shader.pipeline-compute-chain test case was
generating an unsigned index by using gl_LocalInvocationID.x and
gl_LocalInvocationID.y as array indices.

Signed-off-by: Jordan Justen 
---
 src/glsl/lower_vec_index_to_cond_assign.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/glsl/lower_vec_index_to_cond_assign.cpp 
b/src/glsl/lower_vec_index_to_cond_assign.cpp
index 0c3394a..b623882 100644
--- a/src/glsl/lower_vec_index_to_cond_assign.cpp
+++ b/src/glsl/lower_vec_index_to_cond_assign.cpp
@@ -88,7 +88,9 @@ 
ir_vec_index_to_cond_assign_visitor::convert_vec_index_to_cond_assign(void *mem_
exec_list list;
 
/* Store the index to a temporary to avoid reusing its tree. */
-   index = new(base_ir) ir_variable(glsl_type::int_type,
+   assert(orig_index->type == glsl_type::int_type ||
+  orig_index->type == glsl_type::uint_type);
+   index = new(base_ir) ir_variable(orig_index->type,
"vec_index_tmp_i",
ir_var_temporary);
list.push_tail(index);
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] glsl: Support uint index in lower_vector_insert

2015-10-13 Thread Jordan Justen
The ES31-CTS.compute_shader.pipeline-compute-chain test case was
generating an unsigned index by using gl_LocalInvocationID.x and
gl_LocalInvocationID.y as array indices.

Signed-off-by: Jordan Justen 
---
 src/glsl/lower_vector_insert.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/glsl/lower_vector_insert.cpp b/src/glsl/lower_vector_insert.cpp
index 6d7cfa9..26d31b0 100644
--- a/src/glsl/lower_vector_insert.cpp
+++ b/src/glsl/lower_vector_insert.cpp
@@ -108,9 +108,13 @@ vector_insert_visitor::handle_rvalue(ir_rvalue **rv)
   factory.emit(assign(temp, expr->operands[0]));
   factory.emit(assign(src_temp, expr->operands[1]));
 
+  assert(expr->operands[2]->type == glsl_type::int_type ||
+ expr->operands[2]->type == glsl_type::uint_type);
+
   for (unsigned i = 0; i < expr->type->vector_elements; i++) {
  ir_constant *const cmp_index =
-new(factory.mem_ctx) ir_constant(int(i));
+ir_constant::zero(factory.mem_ctx, expr->operands[2]->type);
+ cmp_index->value.u[0] = i;
 
  ir_variable *const cmp_result =
 factory.make_temp(glsl_type::bool_type, "index_condition");
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Restore compute shader support in brw_nir_lower_inputs

2015-10-13 Thread Jordan Justen
On 2015-10-13 20:04:36, Kenneth Graunke wrote:
> On Tuesday, October 13, 2015 01:44:55 PM Jordan Justen wrote:
> > The commit shown below caused compute shaders to hit the unreachable
> > in the default of the switch block. Restore compute shaders to use the
> > fragment shader path.
> > 
> > Also, simplify the fragment/compute path to only support scalar mode.
> > 
> > commit 2953c3d76178d7589947e6ea1dbd902b7b02b3d4
> > Author: Kenneth Graunke 
> > Date:   Fri Aug 14 15:15:11 2015 -0700
> > 
> > i965/vs: Map scalar VS input locations properly; avoid tons of MOVs.
> > 
> > Signed-off-by: Jordan Justen 
> > Cc: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_nir.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
> > b/src/mesa/drivers/dri/i965/brw_nir.c
> > index 4f35d81..357ee4f 100644
> > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > @@ -96,8 +96,10 @@ brw_nir_lower_inputs(nir_shader *nir, bool is_scalar)
> >}
> >break;
> > case MESA_SHADER_FRAGMENT:
> > +   case MESA_SHADER_COMPUTE:
> > +  assert(is_scalar);
> >nir_assign_var_locations(&nir->inputs, &nir->num_inputs,
> > -   is_scalar ? type_size_scalar : 
> > type_size_vec4);
> > +   type_size_scalar);
> >break;
> > default:
> >unreachable("unsupported shader stage");
> > 
> 
> I didn't think compute shaders had inputs...so it might make more sense
> just to do:
> 
>case MESA_SHADER_COMPUTE:
>   /* Compute shaders have no inputs. */
>   break;

Huh. I looked at nir_assign_var_locations and it seemed to ignore
UBO's and SSBO's, but not normal uniforms. So, I thought it might be
needed for uniforms.

But, I didn't see a regression after skipping calling it. Can you
confirm it is not needed for uniforms?

Strange, it looks like fs_visitor::nir_setup_uniforms will use
var->data.driver_location which nir_assign_var_locations sets. So, I'm
wondering why not calling it is not causing trouble.

> Either way, sorry for breaking this, and...
> Reviewed-by: Kenneth Graunke 

No problem. Hitting unreachable is easy to debug. :)

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/4] i965: skip control-flow aware liveness analysis if we only have 1 block

2015-10-14 Thread Jordan Justen
On 2015-10-13 22:49:08, Iago Toral wrote:
> On Tue, 2015-10-13 at 09:44 -0700, Jordan Justen wrote:
> > On 2015-10-13 05:17:37, Francisco Jerez wrote:
> > > Iago Toral Quiroga  writes:
> > > 
> > > > This fixes the following test:
> > > >
> > > > [require]
> > > > GL >= 3.3
> > > > GLSL >= 3.30
> > > > GL_ARB_shader_storage_buffer_object
> > > >
> > > > [fragment shader]
> > > > #version 330
> > > > #extension GL_ARB_shader_storage_buffer_object: require
> > > >
> > > > buffer SSBO {
> > > > mat4 sm4;
> > > > };
> > > >
> > > >
> > > > mat4 um4;
> > > >
> > > > void main() {
> > > > sm4 *= um4;
> > > 
> > > This is using the value of "um4", which is never assigned, so liveness
> > > analysis will correctly extend its live interval up to the start of the
> > > block.
> > 
> > This test was derived by simplifying a CTS test case.
> > 
> > Anyway, I'm not sure what happened on the way to the commit message,
> > but um4 should be a uniform.
> > 
> > http://sprunge.us/cEUe
> 
> Oh yes, that was me playing around with the example. The patches also
> fix the uniform version. Jordan, can you verify if this fixes the CTS
> test case?

Unfortunately, no. The CTS case has some control flow. I had removed
it to minimize the test case.

Here is a small shader_test that has control flow and still fails to
compile with your patches:

http://sprunge.us/LIjA

> In any case, since Curro is working on a more general fix for this
> stuff, I guess you'd rather wait for his patches...

It depends how long we'd have to wait. :) Anyway, since we don't have
a short-term fix anyhow, let's wait to see what curro has to say.

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] main: Match DispatchCompute* API validation from main specification

2015-10-14 Thread Jordan Justen
There is a discrepancy between the ARB_compute_shader specification,
and the OpenGL 4.3 and OpenGLES 3.1 specifications. With regards to
the indirect dispatch parameter, unsupported value errors should
return INVALID_VALUE according to the main specifications, whereas the
extension specification indicated INVALID_OPERATION should be
returned.

Here we update the code to match the main specifications, and update
the citations use the main specification rather than the extension
specification.

Signed-off-by: Jordan Justen 
---
 Fixes ES31-CTS.compute_shader.api-indirect

 src/mesa/main/api_validate.c | 37 ++---
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index a46c194..c286945 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -895,6 +895,11 @@ check_valid_to_compute(struct gl_context *ctx, const char 
*function)
   return false;
}
 
+   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
+*
+* "An INVALID_OPERATION error is generated if there is no active program
+*  for the compute shader stage."
+*/
prog = ctx->Shader.CurrentProgram[MESA_SHADER_COMPUTE];
if (prog == NULL || prog->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
@@ -917,6 +922,12 @@ _mesa_validate_DispatchCompute(struct gl_context *ctx,
   return GL_FALSE;
 
for (i = 0; i < 3; i++) {
+  /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
+   *
+   * "An INVALID_VALUE error is generated if any of num_groups_x,
+   *  num_groups_y and num_groups_z are greater than or equal to the
+   *  maximum work group count for the corresponding dimension."
+   */
   if (num_groups[i] > ctx->Const.MaxComputeWorkGroupCount[i]) {
  _mesa_error(ctx, GL_INVALID_VALUE,
  "glDispatchCompute(num_groups_%c)", 'x' + i);
@@ -937,24 +948,33 @@ valid_dispatch_indirect(struct gl_context *ctx,
if (!check_valid_to_compute(ctx, name))
   return GL_FALSE;
 
-   /* From the ARB_compute_shader specification:
+   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
 *
-* "An INVALID_OPERATION error is generated [...] if  is less
-*  than zero or not a multiple of the size, in basic machine units, of
-*  uint."
+* "An INVALID_VALUE error is generated if indirect is negative or is not a
+*  multiple of four."
+*
+* Note that the OpenGLES 3.1 specification matches this, but this is
+* different than the extension specification which has a return of
+* INVALID_OPERATION instead.
 */
if ((GLintptr)indirect & (sizeof(GLuint) - 1)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
+  _mesa_error(ctx, GL_INVALID_VALUE,
   "%s(indirect is not aligned)", name);
   return GL_FALSE;
}
 
if ((GLintptr)indirect < 0) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
+  _mesa_error(ctx, GL_INVALID_VALUE,
   "%s(indirect is less than zero)", name);
   return GL_FALSE;
}
 
+   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
+*
+* "An INVALID_OPERATION error is generated if no buffer is bound to the
+*  DRAW_INDIRECT_BUFFER binding, or if the command would source data
+*  beyond the end of the buffer object."
+*/
if (!_mesa_is_bufferobj(ctx->DispatchIndirectBuffer)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "%s: no buffer bound to DISPATCH_INDIRECT_BUFFER", name);
@@ -967,11 +987,6 @@ valid_dispatch_indirect(struct gl_context *ctx,
   return GL_FALSE;
}
 
-   /* From the ARB_compute_shader specification:
-*
-* "An INVALID_OPERATION error is generated if this command sources data
-*  beyond the end of the buffer object [...]"
-*/
if (ctx->DispatchIndirectBuffer->Size < end) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "%s(DISPATCH_INDIRECT_BUFFER too small)", name);
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/fs: Disable CSE optimization for untyped & typed surface reads

2015-10-20 Thread Jordan Justen
An untyped surface read is volatile because it might be affected by a
write.

In the ES31-CTS.compute_shader.resources-max test, two back to back
read/modify/writes of an SSBO variable looked something like this:

  r1 = untyped_surface_read(ssbo_float)
  r2 = r1 + 1
  untyped_surface_write(ssbo_float, r2)
  r3 = untyped_surface_read(ssbo_float)
  r4 = r3 + 1
  untyped_surface_write(ssbo_float, r4)

And after CSE, we had:

  r1 = untyped_surface_read(ssbo_float)
  r2 = r1 + 1
  untyped_surface_write(ssbo_float, r2)
  r4 = r1 + 1
  untyped_surface_write(ssbo_float, r4)

Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp |  3 ++-
 src/mesa/drivers/dri/i965/brw_shader.cpp | 14 ++
 src/mesa/drivers/dri/i965/brw_shader.h   |  6 ++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index c7628dc..3a28c8d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -93,7 +93,8 @@ is_expression(const fs_visitor *v, const fs_inst *const inst)
case SHADER_OPCODE_LOAD_PAYLOAD:
   return !inst->is_copy_payload(v->alloc);
default:
-  return inst->is_send_from_grf() && !inst->has_side_effects();
+  return inst->is_send_from_grf() && !inst->has_side_effects() &&
+ !inst->is_volatile();
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 2324b56..be911ed 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -969,6 +969,20 @@ backend_instruction::has_side_effects() const
}
 }
 
+bool
+backend_instruction::is_volatile() const
+{
+   switch (opcode) {
+   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
+   case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
+   case SHADER_OPCODE_TYPED_SURFACE_READ:
+   case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
+  return true;
+   default:
+  return false;
+   }
+}
+
 #ifndef NDEBUG
 static bool
 inst_is_in_block(const bblock_t *block, const backend_instruction *inst)
diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
b/src/mesa/drivers/dri/i965/brw_shader.h
index b33b08f..35ee210 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.h
+++ b/src/mesa/drivers/dri/i965/brw_shader.h
@@ -115,6 +115,12 @@ struct backend_instruction : public exec_node {
 * optimize these out unless you know what you are doing.
 */
bool has_side_effects() const;
+
+   /**
+* True if the instruction might be affected by side effects of other
+* instructions.
+*/
+   bool is_volatile() const;
 #else
 struct backend_instruction {
struct exec_node link;
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] i965: Don't use message headers for typed/untyped reads

2015-10-20 Thread Jordan Justen
On 2015-10-19 12:23:27, Kristian Høgsberg wrote:
> On Mon, Oct 19, 2015 at 3:19 AM, Francisco Jerez  
> wrote:
> > Hey Kristian,
> >
> > Kristian Høgsberg Kristensen  writes:
> >
> >> We always set the mask to 0x, which is what it defaults to when no
> >> header is present. Let's drop the header instead.
> >>
> >> Signed-off-by: Kristian Høgsberg Kristensen 
> >> ---
> >>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 3 +--
> >>  src/mesa/drivers/dri/i965/brw_fs.cpp| 4 ++--
> >>  2 files changed, 3 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> >> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> >> index bf2fee9..ebd811f 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> >> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> >> @@ -2906,11 +2906,10 @@ brw_untyped_surface_read(struct brw_codegen *p,
> >> const unsigned sfid = (devinfo->gen >= 8 || devinfo->is_haswell ?
> >>HSW_SFID_DATAPORT_DATA_CACHE_1 :
> >>GEN7_SFID_DATAPORT_DATA_CACHE);
> >> -   const bool align1 = (brw_inst_access_mode(devinfo, p->current) == 
> >> BRW_ALIGN_1);
> >> struct brw_inst *insn = brw_send_indirect_surface_message(
> >>p, sfid, dst, payload, surface, msg_length,
> >>brw_surface_payload_size(p, num_channels, true, true),
> >> -  align1);
> >> +  false);
> >>
> >> brw_set_dp_untyped_surface_read_message(
> >>p, insn, num_channels);
> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> >> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> index a2fd441..457bf59 100644
> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> >> @@ -4032,7 +4032,7 @@ fs_visitor::lower_logical_sends()
> >>case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
> >>   lower_surface_logical_send(ibld, inst,
> >>  SHADER_OPCODE_UNTYPED_SURFACE_READ,
> >> -fs_reg(0x));
> >> +fs_reg());
> >>   break;
> >>
> >>case SHADER_OPCODE_UNTYPED_SURFACE_WRITE_LOGICAL:
> >> @@ -4050,7 +4050,7 @@ fs_visitor::lower_logical_sends()
> >>case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
> >>   lower_surface_logical_send(ibld, inst,
> >>  SHADER_OPCODE_TYPED_SURFACE_READ,
> >> -fs_reg(0x));
> >> +fs_reg());
> >>   break;
> >>
> >
> > Whatever you do here must be in agreement with the generator and whether
> > it sets the header-present bit in the message descriptor or not,
> > otherwise you're likely to get a hang or misrendering (I guess Jenkins
> > would've caught this).
> >
> > However, according to my hardware docs "Typed messages (which go to the
> > render cache data port) must include the header.".  There's no mention
> > of which generation that restriction applies to, but I believe it
> > applies to IVB/VLV *only*, which is the only generation in which typed
> > surface messages are handled by the render cache instead of by the data
> > cache -- The parenthesized comment doesn't quite make sense on newer
> > gens.  A quick test shows no piglit regressions on HSW after removing
> > the header from typed surface read messages.
> >
> > I guess there are two things you could do, I'm okay with either:
> >
> >  - Just drop this hunk in order to stick to the letter of the BSpec and
> >always pass a header together with typed surface read messages.  I
> >have the strong suspicion though that the docs are just being
> >inaccurate and the header is in fact unnecessary on HSW+.  No need to
> >resend if you decide to go down this path, if you just drop the
> >change for typed reads feel free to include my:
> >
> >Reviewed-by: Francisco Jerez 
> 
> You're right, that should not have been there. I did get some image
> load/store regressions from this (on BDW as well) that I only noticed
> this morning. Backing out this change fixes it, thanks.

With this change:

Reviewed-by: Jordan Justen 

and, at least with CS,

Tested-by: Jordan Justen 

> >  - Pass 'gen == 7 ? fs_reg(0x) : fs_reg()' as sample_mask argument
> >to lower_surface_logical_send() in the TYPED_SURFACE_READ case.  For
> >this to work you'll also have to change the generator code to pass
> >the fs_inst::header_size field down to brw_typed_surface_read() so it
> >knows whether the header is present or not.
> >
> >>case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
> >> --
> >> 2.6.2
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs: Make emit_uniformize a no-op for immediates

2015-10-20 Thread Jordan Justen
I wrote a similar patch a while back because I was annoyed by how this
was causing the send disassembly to not be as nice. :) I dropped it
from my branch at some point, but I still think it is a good idea.

Reviewed-by: Jordan Justen 

On 2015-10-18 21:31:43, Kristian Høgsberg Kristensen wrote:
> An immdiate is already uniform so just return it up front. Without this,
> brw_fs_surface_builder ends up passing immediate surface indices through
> SHADER_OPCODE_BROADCAST. This writes to a stride 0 dst, which we can't
> constant propagate out of, and further, we don't constant propagate into
> the typed/untype read/write opcodes at all.  The end result is that all
> typed/untyped read/write/atomics end up as indirect sends.
> 
> Code generation should always produce either an immediate or an actual
> indirect surface index, so we can fix this by just special casing
> immediates in emit_uniformize.
> 
> Signed-off-by: Kristian Høgsberg Kristensen 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_builder.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_builder.h 
> b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> index df10a9d..98ce71e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_builder.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs_builder.h
> @@ -394,6 +394,9 @@ namespace brw {
>   const dst_reg chan_index = component(vgrf(BRW_REGISTER_TYPE_UD), 0);
>   const dst_reg dst = component(vgrf(src.type), 0);
>  
> + if (src.file == IMM)
> +return src;
> +
>   ubld.emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, chan_index);
>   ubld.emit(SHADER_OPCODE_BROADCAST, dst, src, chan_index);
>  
> -- 
> 2.6.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965/fs: Read all components of a SSBO field with one send

2015-10-20 Thread Jordan Justen
Reviewed-by: Jordan Justen 

On 2015-10-18 21:31:44, Kristian Høgsberg Kristensen wrote:
> Instead of looping through single-component reads, read all components
> in one go.
> 
> Signed-off-by: Kristian Høgsberg Kristensen 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 25 +++--
>  1 file changed, 7 insertions(+), 18 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 792663f..f09578c 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1509,24 +1509,13 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
>}
>  
>/* Read the vector */
> -  for (int i = 0; i < instr->num_components; i++) {
> - fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,
> -1 /* dims */, 1 /* size */,
> -BRW_PREDICATE_NONE);
> - read_result.type = dest.type;
> - bld.MOV(dest, read_result);
> - dest = offset(dest, bld, 1);
> -
> - /* Vector components are stored contiguous in memory */
> - if (i < instr->num_components) {
> -if (!has_indirect) {
> -   const_offset_bytes += 4;
> -   bld.MOV(offset_reg, fs_reg(const_offset_bytes));
> -} else {
> -   bld.ADD(offset_reg, offset_reg, brw_imm_ud(4));
> -}
> - }
> -  }
> +  fs_reg read_result = emit_untyped_read(bld, surf_index, offset_reg,
> + 1 /* dims */,
> + instr->num_components,
> + BRW_PREDICATE_NONE);
> +  read_result.type = dest.type;
> +  for (int i = 0; i < instr->num_components; i++)
> + bld.MOV(offset(dest, bld, i), offset(read_result, bld, i));
>  
>break;
> }
> -- 
> 2.6.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Disable CSE optimization for untyped & typed surface reads

2015-10-21 Thread Jordan Justen
On 2015-10-20 00:43:13, Iago Toral wrote:
> On Tue, 2015-10-20 at 00:12 -0700, Jordan Justen wrote:
> > An untyped surface read is volatile because it might be affected by a
> > write.
> > 
> > In the ES31-CTS.compute_shader.resources-max test, two back to back
> > read/modify/writes of an SSBO variable looked something like this:
> > 
> >   r1 = untyped_surface_read(ssbo_float)
> >   r2 = r1 + 1
> >   untyped_surface_write(ssbo_float, r2)
> >   r3 = untyped_surface_read(ssbo_float)
> >   r4 = r3 + 1
> >   untyped_surface_write(ssbo_float, r4)
> > 
> > And after CSE, we had:
> > 
> >   r1 = untyped_surface_read(ssbo_float)
> >   r2 = r1 + 1
> >   untyped_surface_write(ssbo_float, r2)
> >   r4 = r1 + 1
> >   untyped_surface_write(ssbo_float, r4)
> 
> Yeah, we cannot do CSE with SSBO loads. Patch looks good to me, but we
> should do the same in the vec4 CSE pass.

Yeah, I checked vec4 CSE. It looks like is_expression will
unconditionally return false for those opcodes.

r-b?

-Jordan

> 
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_cse.cpp |  3 ++-
> >  src/mesa/drivers/dri/i965/brw_shader.cpp | 14 ++
> >  src/mesa/drivers/dri/i965/brw_shader.h   |  6 ++
> >  3 files changed, 22 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > index c7628dc..3a28c8d 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> > @@ -93,7 +93,8 @@ is_expression(const fs_visitor *v, const fs_inst *const 
> > inst)
> > case SHADER_OPCODE_LOAD_PAYLOAD:
> >return !inst->is_copy_payload(v->alloc);
> > default:
> > -  return inst->is_send_from_grf() && !inst->has_side_effects();
> > +  return inst->is_send_from_grf() && !inst->has_side_effects() &&
> > + !inst->is_volatile();
> > }
> >  }
> >  
> > diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
> > b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > index 2324b56..be911ed 100644
> > --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> > @@ -969,6 +969,20 @@ backend_instruction::has_side_effects() const
> > }
> >  }
> >  
> > +bool
> > +backend_instruction::is_volatile() const
> > +{
> > +   switch (opcode) {
> > +   case SHADER_OPCODE_UNTYPED_SURFACE_READ:
> > +   case SHADER_OPCODE_UNTYPED_SURFACE_READ_LOGICAL:
> > +   case SHADER_OPCODE_TYPED_SURFACE_READ:
> > +   case SHADER_OPCODE_TYPED_SURFACE_READ_LOGICAL:
> > +  return true;
> > +   default:
> > +  return false;
> > +   }
> > +}
> > +
> >  #ifndef NDEBUG
> >  static bool
> >  inst_is_in_block(const bblock_t *block, const backend_instruction *inst)
> > diff --git a/src/mesa/drivers/dri/i965/brw_shader.h 
> > b/src/mesa/drivers/dri/i965/brw_shader.h
> > index b33b08f..35ee210 100644
> > --- a/src/mesa/drivers/dri/i965/brw_shader.h
> > +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> > @@ -115,6 +115,12 @@ struct backend_instruction : public exec_node {
> >  * optimize these out unless you know what you are doing.
> >  */
> > bool has_side_effects() const;
> > +
> > +   /**
> > +* True if the instruction might be affected by side effects of other
> > +* instructions.
> > +*/
> > +   bool is_volatile() const;
> >  #else
> >  struct backend_instruction {
> > struct exec_node link;
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Disable CSE optimization for untyped & typed surface reads

2015-10-22 Thread Jordan Justen
On 2015-10-22 00:06:37, Iago Toral wrote:
> On Wed, 2015-10-21 at 23:24 -0700, Jordan Justen wrote:
> > On 2015-10-20 00:43:13, Iago Toral wrote:
> > > On Tue, 2015-10-20 at 00:12 -0700, Jordan Justen wrote:
> > > > An untyped surface read is volatile because it might be affected by a
> > > > write.
> > > > 
> > > > In the ES31-CTS.compute_shader.resources-max test, two back to back
> > > > read/modify/writes of an SSBO variable looked something like this:
> > > > 
> > > >   r1 = untyped_surface_read(ssbo_float)
> > > >   r2 = r1 + 1
> > > >   untyped_surface_write(ssbo_float, r2)
> > > >   r3 = untyped_surface_read(ssbo_float)
> > > >   r4 = r3 + 1
> > > >   untyped_surface_write(ssbo_float, r4)
> > > > 
> > > > And after CSE, we had:
> > > > 
> > > >   r1 = untyped_surface_read(ssbo_float)
> > > >   r2 = r1 + 1
> > > >   untyped_surface_write(ssbo_float, r2)
> > > >   r4 = r1 + 1
> > > >   untyped_surface_write(ssbo_float, r4)
> > > 
> > > Yeah, we cannot do CSE with SSBO loads. Patch looks good to me, but we
> > > should do the same in the vec4 CSE pass.
> > 
> > Yeah, I checked vec4 CSE. It looks like is_expression will
> > unconditionally return false for those opcodes.
> 
> Oh right.
> 
> > r-b?
> 
> Reviewed-by: Iago Toral Quiroga 
> 
> FWIW, my ssbo load optimization pass is trying to "undo" this since it
> is all about doing CSE for ssbo loads that are safe to CSE, that is,
> when we know that we don't have stores/atomics that write to the same
> offset or memory barriers in between. I am trying to implement that in
> NIR though, so we still need this, to prevent i965 from trying to CSE
> the remaining loads it sees, since thise would not be safe to CSE.
> 
> Also, as I mentioned in another e-mail, we did not notice this issue
> earlier was because there are a couple of problems in i965 that make it
> quite difficult that the CSE pass identifies identical SSBO loads at the
> moment, but that is bound to change as soon as those things get
> eventually fixed.

Yeah. I only saw this after applying some optimization patches from krh.

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC 18/21] mesa/extensions: Remove extra memsets on gl_extensions

2015-10-23 Thread Jordan Justen
On 2015-10-22 03:32:58, Emil Velikov wrote:
> On 19 October 2015 at 23:44, Nanley Chery  wrote:
> > From: Nanley Chery 
> >
> > Aside from those modified in this commit, all gl_extensions structs are
> > zero-initialized by default. There is therefore no need to memset the
> > structs to 0. Also, remove the open-coded memset in
> > _mesa_init_extensions().
> >
> > Signed-off-by: Nanley Chery 
> > ---
> >  src/mesa/main/extensions.c | 18 --
> >  1 file changed, 4 insertions(+), 14 deletions(-)
> >
> > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
> > index 365e7ed..6cd2b27 100644
> > --- a/src/mesa/main/extensions.c
> > +++ b/src/mesa/main/extensions.c
> > @@ -37,8 +37,8 @@
> >  #include "macros.h"
> >  #include "mtypes.h"
> >
> > -struct gl_extensions _mesa_extension_override_enables;
> > -struct gl_extensions _mesa_extension_override_disables;
> > +struct gl_extensions _mesa_extension_override_enables = {0};
> > +struct gl_extensions _mesa_extension_override_disables = {0};
> Side note: once ARB_compute_shader lands in for i965 we can even make
> these local/static.

I added these variables specifically so a driver could check to see
what extensions were overridden early in context creation. Making them
static would undo that feature...

The alternative before was to add a custom environment variable. (See
10e03b44)

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] mesa: Map program UBOs and SSBOs to Interface Blocks

2015-10-27 Thread Jordan Justen
Signed-off-by: Jordan Justen 
Cc: Samuel Iglesias Gonsálvez 
Cc: Iago Toral Quiroga 
---
 src/glsl/linker.cpp | 14 ++
 src/glsl/standalone_scaffolding.cpp |  5 +
 src/mesa/main/mtypes.h  |  7 +++
 3 files changed, 26 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 3db2fd3..d925393 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -4448,6 +4448,20 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
 &prog->ShaderStorageBlocks,
 &prog->NumShaderStorageBlocks);
 
+   prog->UboInterfaceBlockIndex =
+  ralloc_array(prog, int, prog->NumUniformBlocks);
+   prog->SsboInterfaceBlockIndex =
+  ralloc_array(prog, int, prog->NumShaderStorageBlocks);
+   for (unsigned i = 0, u = 0, s = 0;
+i < prog->NumBufferInterfaceBlocks;
+i++) {
+  if (prog->BufferInterfaceBlocks[i].IsShaderStorage) {
+ prog->SsboInterfaceBlockIndex[s++] = i;
+  } else {
+ prog->UboInterfaceBlockIndex[u++] = i;
+  }
+   }
+
/* FINISHME: Assign fragment shader output locations. */
 
 done:
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index fe1d820..5c76295 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -124,6 +124,11 @@ _mesa_clear_shader_program_data(struct gl_shader_program 
*shProg)
   shProg->InterfaceBlockStageIndex[i] = NULL;
}
 
+   ralloc_free(shProg->UboInterfaceBlockIndex);
+   shProg->UboInterfaceBlockIndex = NULL;
+   ralloc_free(shProg->SsboInterfaceBlockIndex);
+   shProg->SsboInterfaceBlockIndex = NULL;
+
ralloc_free(shProg->AtomicBuffers);
shProg->AtomicBuffers = NULL;
shProg->NumAtomicBuffers = 0;
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index cce959e..c1cbe96 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2725,6 +2725,13 @@ struct gl_shader_program
int *InterfaceBlockStageIndex[MESA_SHADER_STAGES];
 
/**
+* Indices into the BufferInterfaceBlocks[] array for Uniform Buffer
+* Objects and Shader Storage Buffer Objects.
+*/
+   int *UboInterfaceBlockIndex;
+   int *SsboInterfaceBlockIndex;
+
+   /**
 * Map of active uniform names to locations
 *
 * Maps any active uniform that is not an array element to a location.
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] mesa: rename UniformBlockStageIndex to InterfaceBlockStageIndex

2015-10-27 Thread Jordan Justen
Signed-off-by: Jordan Justen 
Cc: Samuel Iglesias Gonsálvez 
Cc: Iago Toral Quiroga 
---
 src/glsl/link_uniform_initializers.cpp |  2 +-
 src/glsl/linker.cpp| 16 
 src/glsl/standalone_scaffolding.cpp|  4 ++--
 src/mesa/main/mtypes.h | 11 ++-
 src/mesa/main/shader_query.cpp |  2 +-
 src/mesa/main/shaderobj.c  |  4 ++--
 src/mesa/main/uniforms.c   |  4 ++--
 7 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/src/glsl/link_uniform_initializers.cpp 
b/src/glsl/link_uniform_initializers.cpp
index 682a4ee..58d21e5 100644
--- a/src/glsl/link_uniform_initializers.cpp
+++ b/src/glsl/link_uniform_initializers.cpp
@@ -178,7 +178,7 @@ set_block_binding(gl_shader_program *prog, const char 
*block_name, int binding)
 
   /* This is a field of a UBO.  val is the binding index. */
   for (int i = 0; i < MESA_SHADER_STAGES; i++) {
- int stage_index = prog->UniformBlockStageIndex[i][block_index];
+ int stage_index = prog->InterfaceBlockStageIndex[i][block_index];
 
  if (stage_index != -1) {
 struct gl_shader *sh = prog->_LinkedShaders[i];
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index cfd8f81..3db2fd3 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1174,10 +1174,10 @@ interstage_cross_validate_uniform_blocks(struct 
gl_shader_program *prog)
for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
   struct gl_shader *sh = prog->_LinkedShaders[i];
 
-  prog->UniformBlockStageIndex[i] = ralloc_array(prog, int,
-max_num_uniform_blocks);
+  prog->InterfaceBlockStageIndex[i] = ralloc_array(prog, int,
+   max_num_uniform_blocks);
   for (unsigned int j = 0; j < max_num_uniform_blocks; j++)
-prog->UniformBlockStageIndex[i][j] = -1;
+prog->InterfaceBlockStageIndex[i][j] = -1;
 
   if (sh == NULL)
 continue;
@@ -1194,7 +1194,7 @@ interstage_cross_validate_uniform_blocks(struct 
gl_shader_program *prog)
return false;
 }
 
-prog->UniformBlockStageIndex[i][index] = j;
+prog->InterfaceBlockStageIndex[i][index] = j;
   }
}
 
@@ -2836,9 +2836,9 @@ check_resources(struct gl_context *ctx, struct 
gl_shader_program *prog)
   }
 
   for (unsigned j = 0; j < MESA_SHADER_STAGES; j++) {
-if (prog->UniformBlockStageIndex[j][i] != -1) {
+if (prog->InterfaceBlockStageIndex[j][i] != -1) {
 struct gl_shader *sh = prog->_LinkedShaders[j];
-int stage_index = prog->UniformBlockStageIndex[j][i];
+int stage_index = prog->InterfaceBlockStageIndex[j][i];
 if (sh && sh->BufferInterfaceBlocks[stage_index].IsShaderStorage) {
shader_blocks[j]++;
total_shader_storage_blocks++;
@@ -2955,7 +2955,7 @@ check_image_resources(struct gl_context *ctx, struct 
gl_shader_program *prog)
  total_image_units += sh->NumImages;
 
  for (unsigned j = 0; j < prog->NumBufferInterfaceBlocks; j++) {
-int stage_index = prog->UniformBlockStageIndex[i][j];
+int stage_index = prog->InterfaceBlockStageIndex[i][j];
 if (stage_index != -1 && 
sh->BufferInterfaceBlocks[stage_index].IsShaderStorage)
total_shader_storage_blocks++;
  }
@@ -3734,7 +3734,7 @@ build_program_resource_list(struct gl_shader_program 
*shProg)
   int block_index = shProg->UniformStorage[i].block_index;
   if (block_index != -1) {
  for (unsigned j = 0; j < MESA_SHADER_STAGES; j++) {
- if (shProg->UniformBlockStageIndex[j][block_index] != -1)
+ if (shProg->InterfaceBlockStageIndex[j][block_index] != -1)
 stageref |= (1 << j);
  }
   }
diff --git a/src/glsl/standalone_scaffolding.cpp 
b/src/glsl/standalone_scaffolding.cpp
index eccf094..fe1d820 100644
--- a/src/glsl/standalone_scaffolding.cpp
+++ b/src/glsl/standalone_scaffolding.cpp
@@ -120,8 +120,8 @@ _mesa_clear_shader_program_data(struct gl_shader_program 
*shProg)
shProg->NumShaderStorageBlocks = 0;
 
for (i = 0; i < MESA_SHADER_STAGES; i++) {
-  ralloc_free(shProg->UniformBlockStageIndex[i]);
-  shProg->UniformBlockStageIndex[i] = NULL;
+  ralloc_free(shProg->InterfaceBlockStageIndex[i]);
+  shProg->InterfaceBlockStageIndex[i] = NULL;
}
 
ralloc_free(shProg->AtomicBuffers);
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 34120cf..cce959e 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2715,13 +2715,14 @@ struct gl_shader_program
struct gl_uniform_block **ShaderStorageBlocks;
 
/**
-* Indices into the _LinkedShaders's Unifor

[Mesa-dev] [PATCH 3/3] mesa: Use UBO/SSBO indices during binding

2015-10-27 Thread Jordan Justen
Previously we were treating the binding index for Uniform Buffer
Objects and Shader Storage Buffer Objects as being part of the
combined BufferInterfaceBlocks array.

Fixes ES31-CTS.compute_shader.resource-ubo on i965.

Signed-off-by: Jordan Justen 
Cc: Samuel Iglesias Gonsálvez 
Cc: Iago Toral Quiroga 
---
 src/mesa/main/uniforms.c | 30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/uniforms.c b/src/mesa/main/uniforms.c
index 758ca24..47f80ce 100644
--- a/src/mesa/main/uniforms.c
+++ b/src/mesa/main/uniforms.c
@@ -1002,10 +1002,10 @@ _mesa_UniformBlockBinding(GLuint program,
if (!shProg)
   return;
 
-   if (uniformBlockIndex >= shProg->NumBufferInterfaceBlocks) {
+   if (uniformBlockIndex >= shProg->NumUniformBlocks) {
   _mesa_error(ctx, GL_INVALID_VALUE,
  "glUniformBlockBinding(block index %u >= %u)",
- uniformBlockIndex, shProg->NumBufferInterfaceBlocks);
+ uniformBlockIndex, shProg->NumUniformBlocks);
   return;
}
 
@@ -1016,17 +1016,22 @@ _mesa_UniformBlockBinding(GLuint program,
   return;
}
 
-   if (shProg->BufferInterfaceBlocks[uniformBlockIndex].Binding !=
+   if (shProg->UniformBlocks[uniformBlockIndex]->Binding !=
uniformBlockBinding) {
   int i;
 
   FLUSH_VERTICES(ctx, 0);
   ctx->NewDriverState |= ctx->DriverFlags.NewUniformBuffer;
 
-  shProg->BufferInterfaceBlocks[uniformBlockIndex].Binding = 
uniformBlockBinding;
+  const int interface_block_index =
+ shProg->UboInterfaceBlockIndex[uniformBlockIndex];
+
+  shProg->BufferInterfaceBlocks[interface_block_index].Binding =
+ uniformBlockBinding;
 
   for (i = 0; i < MESA_SHADER_STAGES; i++) {
-int stage_index = 
shProg->InterfaceBlockStageIndex[i][uniformBlockIndex];
+int stage_index =
+shProg->InterfaceBlockStageIndex[i][interface_block_index];
 
 if (stage_index != -1) {
struct gl_shader *sh = shProg->_LinkedShaders[i];
@@ -1054,10 +1059,10 @@ _mesa_ShaderStorageBlockBinding(GLuint program,
if (!shProg)
   return;
 
-   if (shaderStorageBlockIndex >= shProg->NumBufferInterfaceBlocks) {
+   if (shaderStorageBlockIndex >= shProg->NumShaderStorageBlocks) {
   _mesa_error(ctx, GL_INVALID_VALUE,
  "glShaderStorageBlockBinding(block index %u >= %u)",
- shaderStorageBlockIndex, shProg->NumBufferInterfaceBlocks);
+ shaderStorageBlockIndex, shProg->NumShaderStorageBlocks);
   return;
}
 
@@ -1069,17 +1074,22 @@ _mesa_ShaderStorageBlockBinding(GLuint program,
   return;
}
 
-   if (shProg->BufferInterfaceBlocks[shaderStorageBlockIndex].Binding !=
+   if (shProg->ShaderStorageBlocks[shaderStorageBlockIndex]->Binding !=
shaderStorageBlockBinding) {
   int i;
 
   FLUSH_VERTICES(ctx, 0);
   ctx->NewDriverState |= ctx->DriverFlags.NewShaderStorageBuffer;
 
-  shProg->BufferInterfaceBlocks[shaderStorageBlockIndex].Binding = 
shaderStorageBlockBinding;
+  const int interface_block_index =
+ shProg->SsboInterfaceBlockIndex[shaderStorageBlockIndex];
+
+  shProg->BufferInterfaceBlocks[interface_block_index].Binding =
+ shaderStorageBlockBinding;
 
   for (i = 0; i < MESA_SHADER_STAGES; i++) {
-int stage_index = 
shProg->InterfaceBlockStageIndex[i][shaderStorageBlockIndex];
+int stage_index =
+shProg->InterfaceBlockStageIndex[i][interface_block_index];
 
 if (stage_index != -1) {
struct gl_shader *sh = shProg->_LinkedShaders[i];
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Add compute shader builtin variables for OpenGLES 3.1

2015-10-29 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/glsl/builtin_variables.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index a6ad105..00113d5 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -710,7 +710,7 @@ builtin_variable_generator::generate_constants()
   }
}
 
-   if (state->is_version(430, 0) || state->ARB_compute_shader_enable) {
+   if (state->is_version(430, 310) || state->ARB_compute_shader_enable) {
   add_const("gl_MaxComputeAtomicCounterBuffers", 
MAX_COMPUTE_ATOMIC_COUNTER_BUFFERS);
   add_const("gl_MaxComputeAtomicCounters", MAX_COMPUTE_ATOMIC_COUNTERS);
   add_const("gl_MaxComputeImageUniforms", MAX_COMPUTE_IMAGE_UNIFORMS);
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: OpenGLES GLSL 3.1 precision qualifiers ordering rules

2015-10-29 Thread Jordan Justen
The OpenGLES GLSL 3.1 specification uses the precision qualifier
ordering rules from ARB_shading_language_420pack.

Signed-off-by: Jordan Justen 
---
 src/glsl/glsl_parser.yy | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index 2f2e10d..4636435 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -948,7 +948,8 @@ parameter_qualifier:
   if ($2.precision != ast_precision_none)
  _mesa_glsl_error(&@1, state, "duplicate precision qualifier");
 
-  if (!state->has_420pack() && $2.flags.i != 0)
+  if (!(state->has_420pack() || state->is_version(420, 310)) &&
+  $2.flags.i != 0)
  _mesa_glsl_error(&@1, state, "precision qualifiers must come last");
 
   $$ = $2;
@@ -1847,7 +1848,8 @@ type_qualifier:
   if ($2.precision != ast_precision_none)
  _mesa_glsl_error(&@1, state, "duplicate precision qualifier");
 
-  if (!state->has_420pack() && $2.flags.i != 0)
+  if (!(state->has_420pack() || state->is_version(420, 310)) &&
+  $2.flags.i != 0)
  _mesa_glsl_error(&@1, state, "precision qualifiers must come last");
 
   $$ = $2;
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] main/get: Add MAX_COMBINED_COMPUTE_UNIFORM_COMPONENTS

2015-10-29 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/main/get_hash_params.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index c295615..fbc7b8f 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -460,6 +460,7 @@ descriptor=[
   [ "MAX_COMPUTE_UNIFORM_COMPONENTS", "CONST(MAX_COMPUTE_UNIFORM_COMPONENTS), 
extra_ARB_compute_shader_es31" ],
   [ "MAX_COMPUTE_IMAGE_UNIFORMS", "CONST(MAX_COMPUTE_IMAGE_UNIFORMS), 
extra_ARB_compute_shader_es31" ],
   [ "DISPATCH_INDIRECT_BUFFER_BINDING", "LOC_CUSTOM, TYPE_INT, 0, 
extra_ARB_compute_shader_es31" ],
+  [ "MAX_COMBINED_COMPUTE_UNIFORM_COMPONENTS", 
"CONTEXT_INT(Const.Program[MESA_SHADER_COMPUTE].MaxCombinedUniformComponents), 
extra_ARB_compute_shader_es31" ],
 
 # GL_ARB_framebuffer_no_attachments / GLES 3.1
   ["MAX_FRAMEBUFFER_WIDTH", "CONTEXT_INT(Const.MaxFramebufferWidth), 
extra_ARB_framebuffer_no_attachments"],
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Setup pull constant state for compute programs

2015-10-29 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
 src/mesa/drivers/dri/i965/brw_state.h|  1 +
 src/mesa/drivers/dri/i965/brw_state_upload.c |  2 ++
 src/mesa/drivers/dri/i965/gen7_cs_state.c| 32 
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 18c361e..887b57b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1177,7 +1177,7 @@ struct brw_context
 
int num_atoms[BRW_NUM_PIPELINES];
const struct brw_tracked_state render_atoms[60];
-   const struct brw_tracked_state compute_atoms[8];
+   const struct brw_tracked_state compute_atoms[9];
 
/* If (INTEL_DEBUG & DEBUG_BATCH) */
struct {
diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
b/src/mesa/drivers/dri/i965/brw_state.h
index dc2b941..2c7c2f3 100644
--- a/src/mesa/drivers/dri/i965/brw_state.h
+++ b/src/mesa/drivers/dri/i965/brw_state.h
@@ -49,6 +49,7 @@ extern const struct brw_tracked_state brw_clip_unit;
 extern const struct brw_tracked_state brw_vs_pull_constants;
 extern const struct brw_tracked_state brw_gs_pull_constants;
 extern const struct brw_tracked_state brw_wm_pull_constants;
+extern const struct brw_tracked_state brw_cs_pull_constants;
 extern const struct brw_tracked_state brw_constant_buffer;
 extern const struct brw_tracked_state brw_curbe_offsets;
 extern const struct brw_tracked_state brw_invariant_state;
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
b/src/mesa/drivers/dri/i965/brw_state_upload.c
index 79b8301..0344b8a 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -259,6 +259,7 @@ static const struct brw_tracked_state *gen7_compute_atoms[] 
=
&brw_state_base_address,
&brw_cs_image_surfaces,
&gen7_cs_push_constants,
+   &brw_cs_pull_constants,
&brw_cs_ubo_surfaces,
&brw_cs_abo_surfaces,
&brw_texture_surfaces,
@@ -353,6 +354,7 @@ static const struct brw_tracked_state *gen8_compute_atoms[] 
=
&gen8_state_base_address,
&brw_cs_image_surfaces,
&gen7_cs_push_constants,
+   &brw_cs_pull_constants,
&brw_cs_ubo_surfaces,
&brw_cs_abo_surfaces,
&brw_texture_surfaces,
diff --git a/src/mesa/drivers/dri/i965/gen7_cs_state.c 
b/src/mesa/drivers/dri/i965/gen7_cs_state.c
index 6aeb0cb..da1d05f 100644
--- a/src/mesa/drivers/dri/i965/gen7_cs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_cs_state.c
@@ -29,6 +29,7 @@
 #include "brw_shader.h"
 #include "intel_mipmap_tree.h"
 #include "intel_batchbuffer.h"
+#include "intel_buffer_objects.h"
 #include "brw_state.h"
 
 static unsigned
@@ -285,3 +286,34 @@ const struct brw_tracked_state gen7_cs_push_constants = {
},
.emit = gen7_upload_cs_push_constants,
 };
+
+/**
+ * Creates a new CS constant buffer reflecting the current CS program's
+ * constants, if needed by the CS program.
+ */
+static void
+brw_upload_cs_pull_constants(struct brw_context *brw)
+{
+   struct brw_stage_state *stage_state = &brw->cs.base;
+
+   /* BRW_NEW_COMPUTE_PROGRAM */
+   struct brw_compute_program *cp =
+  (struct brw_compute_program *) brw->compute_program;
+
+   /* BRW_NEW_CS_PROG_DATA */
+   const struct brw_stage_prog_data *prog_data = &brw->cs.prog_data->base;
+
+   /* _NEW_PROGRAM_CONSTANTS */
+   brw_upload_pull_constants(brw, BRW_NEW_SURFACES, &cp->program.Base,
+ stage_state, prog_data, true);
+}
+
+const struct brw_tracked_state brw_cs_pull_constants = {
+   .dirty = {
+  .mesa = _NEW_PROGRAM_CONSTANTS,
+  .brw = BRW_NEW_BATCH |
+ BRW_NEW_COMPUTE_PROGRAM |
+ BRW_NEW_CS_PROG_DATA,
+   },
+   .emit = brw_upload_cs_pull_constants,
+};
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/nir: Mark const index UBO surfaces as used

2015-10-29 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 6 --
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 6 --
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 4950ba4..6d69e96 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1438,8 +1438,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
   fs_reg surf_index;
 
   if (const_index) {
- surf_index = fs_reg(stage_prog_data->binding_table.ubo_start +
- const_index->u[0]);
+ unsigned index = stage_prog_data->binding_table.ubo_start +
+  const_index->u[0];
+ surf_index = fs_reg(index);
+ brw_mark_surface_used(prog_data, index);
   } else {
  /* The block index is not a constant. Evaluate the index expression
   * per-channel and add the base UBO index; we have to select a value
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 0f04f65..efbdaa9 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -749,8 +749,10 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
  /* The block index is a constant, so just emit the binding table entry
   * as an immediate.
   */
- surf_index = src_reg(prog_data->base.binding_table.ubo_start +
-  const_block_index->u[0]);
+ unsigned index = prog_data->base.binding_table.ubo_start +
+  const_block_index->u[0];
+ surf_index = src_reg(index);
+ brw_mark_surface_used(&prog_data->base, index);
   } else {
  /* The block index is not a constant. Evaluate the index expression
   * per-channel and add the base UBO index; we have to select a value
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] mesa/sso: Add MESA_VERBOSE=api trace support

2015-10-29 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/main/pipelineobj.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 51ee10f..c8c50fa 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -230,6 +230,10 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, 
GLuint program)
struct gl_shader_program *shProg = NULL;
GLbitfield any_valid_stages;
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glUseProgramStages(%d, 0x%x, %d)\n",
+  pipeline, stages, program);
+
if (!pipe) {
   _mesa_error(ctx, GL_INVALID_OPERATION, "glUseProgramStages(pipeline)");
   return;
@@ -345,6 +349,9 @@ _mesa_ActiveShaderProgram(GLuint pipeline, GLuint program)
struct gl_shader_program *shProg = NULL;
struct gl_pipeline_object *pipe = _mesa_lookup_pipeline_object(ctx, 
pipeline);
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glActiveShaderProgram(%d, %d)\n", pipeline, program);
+
if (program != 0) {
   shProg = _mesa_lookup_shader_program_err(ctx, program,

"glActiveShaderProgram(program)");
@@ -380,6 +387,9 @@ _mesa_BindProgramPipeline(GLuint pipeline)
GET_CURRENT_CONTEXT(ctx);
struct gl_pipeline_object *newObj = NULL;
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glBindProgramPipeline(%d)\n", pipeline);
+
/* Rebinding the same pipeline object: no change.
 */
if (ctx->_Shader->Name == pipeline)
@@ -467,6 +477,9 @@ _mesa_DeleteProgramPipelines(GLsizei n, const GLuint 
*pipelines)
GET_CURRENT_CONTEXT(ctx);
GLsizei i;
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glDeleteProgramPipelines(%d, %p)\n", n, pipelines);
+
if (n < 0) {
   _mesa_error(ctx, GL_INVALID_VALUE, "glDeleteProgramPipelines(n<0)");
   return;
@@ -551,6 +564,9 @@ _mesa_GenProgramPipelines(GLsizei n, GLuint *pipelines)
 {
GET_CURRENT_CONTEXT(ctx);
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glGenProgramPipelines(%d, %p)\n", n, pipelines);
+
create_program_pipelines(ctx, n, pipelines, false);
 }
 
@@ -559,6 +575,9 @@ _mesa_CreateProgramPipelines(GLsizei n, GLuint *pipelines)
 {
GET_CURRENT_CONTEXT(ctx);
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glCreateProgramPipelines(%d, %p)\n", n, pipelines);
+
create_program_pipelines(ctx, n, pipelines, true);
 }
 
@@ -574,6 +593,9 @@ _mesa_IsProgramPipeline(GLuint pipeline)
 {
GET_CURRENT_CONTEXT(ctx);
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glIsProgramPipeline(%d)\n", pipeline);
+
struct gl_pipeline_object *obj = _mesa_lookup_pipeline_object(ctx, 
pipeline);
if (obj == NULL)
   return GL_FALSE;
@@ -590,6 +612,10 @@ _mesa_GetProgramPipelineiv(GLuint pipeline, GLenum pname, 
GLint *params)
GET_CURRENT_CONTEXT(ctx);
struct gl_pipeline_object *pipe = _mesa_lookup_pipeline_object(ctx, 
pipeline);
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glGetProgramPipelineiv(%d, %d, %p)\n",
+  pipeline, pname, params);
+
/* Are geometry shaders available in this context?
 */
const bool has_gs = _mesa_has_geometry_shaders(ctx);
@@ -857,6 +883,9 @@ _mesa_ValidateProgramPipeline(GLuint pipeline)
 {
GET_CURRENT_CONTEXT(ctx);
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glValidateProgramPipeline(%d)\n", pipeline);
+
struct gl_pipeline_object *pipe = _mesa_lookup_pipeline_object(ctx, 
pipeline);
 
if (!pipe) {
@@ -875,6 +904,10 @@ _mesa_GetProgramPipelineInfoLog(GLuint pipeline, GLsizei 
bufSize,
 {
GET_CURRENT_CONTEXT(ctx);
 
+   if (MESA_VERBOSE & VERBOSE_API)
+  _mesa_debug(ctx, "glGetProgramPipelineInfoLog(%d, %d, %p, %p)\n",
+  pipeline, bufSize, length, infoLog);
+
struct gl_pipeline_object *pipe = _mesa_lookup_pipeline_object(ctx, 
pipeline);
 
if (!pipe) {
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] mesa/sso: Add compute shader support

2015-10-29 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/mesa/main/api_validate.c |  2 +-
 src/mesa/main/pipelineobj.c  | 11 +++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index c59b6f3..46f39e7 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -923,7 +923,7 @@ check_valid_to_compute(struct gl_context *ctx, const char 
*function)
 * "An INVALID_OPERATION error is generated if there is no active program
 *  for the compute shader stage."
 */
-   prog = ctx->Shader.CurrentProgram[MESA_SHADER_COMPUTE];
+   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
if (prog == NULL || prog->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "%s(no active compute shader)",
diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index c8c50fa..58730f4 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -255,6 +255,8 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, 
GLuint program)
if (_mesa_has_tessellation(ctx))
   any_valid_stages |= GL_TESS_CONTROL_SHADER_BIT |
   GL_TESS_EVALUATION_SHADER_BIT;
+   if (_mesa_has_compute_shaders(ctx))
+  any_valid_stages |= GL_COMPUTE_SHADER_BIT;
 
if (stages != GL_ALL_SHADER_BITS && (stages & ~any_valid_stages) != 0) {
   _mesa_error(ctx, GL_INVALID_VALUE, "glUseProgramStages(Stages)");
@@ -336,6 +338,9 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield stages, 
GLuint program)
 
if ((stages & GL_TESS_EVALUATION_SHADER_BIT) != 0)
   _mesa_use_shader_program(ctx, GL_TESS_EVALUATION_SHADER, shProg, pipe);
+
+   if ((stages & GL_COMPUTE_SHADER_BIT) != 0)
+  _mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, pipe);
 }
 
 /**
@@ -669,6 +674,12 @@ _mesa_GetProgramPipelineiv(GLuint pipeline, GLenum pname, 
GLint *params)
   *params = pipe->CurrentProgram[MESA_SHADER_FRAGMENT]
  ? pipe->CurrentProgram[MESA_SHADER_FRAGMENT]->Name : 0;
   return;
+   case GL_COMPUTE_SHADER:
+  if (!_mesa_has_compute_shaders(ctx))
+ break;
+  *params = pipe->CurrentProgram[MESA_SHADER_COMPUTE]
+ ? pipe->CurrentProgram[MESA_SHADER_COMPUTE]->Name : 0;
+  return;
default:
   break;
}
-- 
2.5.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: OpenGLES GLSL 3.1 precision qualifiers ordering rules

2015-10-29 Thread Jordan Justen
On 2015-10-29 00:53:01, Ilia Mirkin wrote:
> Would it make sense to just modify the has_420pack function? Or do you
> not want all of it?

Yeah, that was my first thought.

Looking at the OpenGLES 3.1 spec, it seems to not have picked up very
much of 420pack. This was the only part I've noticed so far, but I did
see several parts that did not adopt 420pack changes.

-Jordan

> On Thu, Oct 29, 2015 at 3:47 AM, Jordan Justen
>  wrote:
> > The OpenGLES GLSL 3.1 specification uses the precision qualifier
> > ordering rules from ARB_shading_language_420pack.
> >
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/glsl/glsl_parser.yy | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> > index 2f2e10d..4636435 100644
> > --- a/src/glsl/glsl_parser.yy
> > +++ b/src/glsl/glsl_parser.yy
> > @@ -948,7 +948,8 @@ parameter_qualifier:
> >if ($2.precision != ast_precision_none)
> >   _mesa_glsl_error(&@1, state, "duplicate precision qualifier");
> >
> > -  if (!state->has_420pack() && $2.flags.i != 0)
> > +  if (!(state->has_420pack() || state->is_version(420, 310)) &&
> > +  $2.flags.i != 0)
> >   _mesa_glsl_error(&@1, state, "precision qualifiers must come 
> > last");
> >
> >$$ = $2;
> > @@ -1847,7 +1848,8 @@ type_qualifier:
> >if ($2.precision != ast_precision_none)
> >   _mesa_glsl_error(&@1, state, "duplicate precision qualifier");
> >
> > -  if (!state->has_420pack() && $2.flags.i != 0)
> > +  if (!(state->has_420pack() || state->is_version(420, 310)) &&
> > +  $2.flags.i != 0)
> >   _mesa_glsl_error(&@1, state, "precision qualifiers must come 
> > last");
> >
> >$$ = $2;
> > --
> > 2.5.1
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa/sso: Add compute shader support

2015-10-29 Thread Jordan Justen
On 2015-10-29 03:04:38, Iago Toral wrote:
> On Thu, 2015-10-29 at 00:52 -0700, Jordan Justen wrote:
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/mesa/main/api_validate.c |  2 +-
> >  src/mesa/main/pipelineobj.c  | 11 +++
> >  2 files changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> > index c59b6f3..46f39e7 100644
> > --- a/src/mesa/main/api_validate.c
> > +++ b/src/mesa/main/api_validate.c
> > @@ -923,7 +923,7 @@ check_valid_to_compute(struct gl_context *ctx, const 
> > char *function)
> >  * "An INVALID_OPERATION error is generated if there is no active 
> > program
> >  *  for the compute shader stage."
> >  */
> > -   prog = ctx->Shader.CurrentProgram[MESA_SHADER_COMPUTE];
> > +   prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
> > if (prog == NULL || prog->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
> >_mesa_error(ctx, GL_INVALID_OPERATION,
> >"%s(no active compute shader)",
> 
> This hunk won't apply on current master, there is no such comment before
> this line. Maybe this is part of another series of patches?

I guess this depends on some nearby changed lines in my unreviewed
patch:

  "main: Match DispatchCompute* API validation from main specification"

http://patchwork.freedesktop.org/patch/61881/

-Jordan

> For the rest of the patch:
> Reviewed-by: Iago Toral Quiroga 
> 
> > diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> > index c8c50fa..58730f4 100644
> > --- a/src/mesa/main/pipelineobj.c
> > +++ b/src/mesa/main/pipelineobj.c
> > @@ -255,6 +255,8 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield 
> > stages, GLuint program)
> > if (_mesa_has_tessellation(ctx))
> >any_valid_stages |= GL_TESS_CONTROL_SHADER_BIT |
> >GL_TESS_EVALUATION_SHADER_BIT;
> > +   if (_mesa_has_compute_shaders(ctx))
> > +  any_valid_stages |= GL_COMPUTE_SHADER_BIT;
> >  
> > if (stages != GL_ALL_SHADER_BITS && (stages & ~any_valid_stages) != 0) {
> >_mesa_error(ctx, GL_INVALID_VALUE, "glUseProgramStages(Stages)");
> > @@ -336,6 +338,9 @@ _mesa_UseProgramStages(GLuint pipeline, GLbitfield 
> > stages, GLuint program)
> >  
> > if ((stages & GL_TESS_EVALUATION_SHADER_BIT) != 0)
> >_mesa_use_shader_program(ctx, GL_TESS_EVALUATION_SHADER, shProg, 
> > pipe);
> > +
> > +   if ((stages & GL_COMPUTE_SHADER_BIT) != 0)
> > +  _mesa_use_shader_program(ctx, GL_COMPUTE_SHADER, shProg, pipe);
> >  }
> >  
> >  /**
> > @@ -669,6 +674,12 @@ _mesa_GetProgramPipelineiv(GLuint pipeline, GLenum 
> > pname, GLint *params)
> >*params = pipe->CurrentProgram[MESA_SHADER_FRAGMENT]
> >   ? pipe->CurrentProgram[MESA_SHADER_FRAGMENT]->Name : 0;
> >return;
> > +   case GL_COMPUTE_SHADER:
> > +  if (!_mesa_has_compute_shaders(ctx))
> > + break;
> > +  *params = pipe->CurrentProgram[MESA_SHADER_COMPUTE]
> > + ? pipe->CurrentProgram[MESA_SHADER_COMPUTE]->Name : 0;
> > +  return;
> > default:
> >break;
> > }
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/nir: Mark const index UBO surfaces as used

2015-10-29 Thread Jordan Justen
On 2015-10-29 02:17:20, Iago Toral wrote:
> On Thu, 2015-10-29 at 00:50 -0700, Jordan Justen wrote:
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 6 --
> >  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 6 --
> >  2 files changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > index 4950ba4..6d69e96 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> > @@ -1438,8 +1438,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder 
> > &bld, nir_intrinsic_instr *instr
> >fs_reg surf_index;
> >  
> >if (const_index) {
> > - surf_index = fs_reg(stage_prog_data->binding_table.ubo_start +
> > - const_index->u[0]);
> > + unsigned index = stage_prog_data->binding_table.ubo_start +
> > +  const_index->u[0];
> > + surf_index = fs_reg(index);
> > + brw_mark_surface_used(prog_data, index);
> >} else {
> >   /* The block index is not a constant. Evaluate the index 
> > expression
> >* per-channel and add the base UBO index; we have to select a 
> > value
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > index 0f04f65..efbdaa9 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > @@ -749,8 +749,10 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> > *instr)
> >   /* The block index is a constant, so just emit the binding table 
> > entry
> >* as an immediate.
> >*/
> > - surf_index = src_reg(prog_data->base.binding_table.ubo_start +
> > -  const_block_index->u[0]);
> > + unsigned index = prog_data->base.binding_table.ubo_start +
> > +  const_block_index->u[0];
> > + surf_index = src_reg(index);
> > + brw_mark_surface_used(&prog_data->base, index);
> >} else {
> >   /* The block index is not a constant. Evaluate the index 
> > expression
> >* per-channel and add the base UBO index; we have to select a 
> > value
> 
> Is this necessary? As far as I can see, the generator opcodes will mark
> surfaces for which we have a constant index as used, so this should not
> be necessary.

Ah. You are correct.

I was debugging a UBO+SSBO test, and I noticed the discrepancy between
SSBO and UBO, but it appears SSBO has to mark it in brw_*_nir.cpp,
whereas the generator code can also mark it for the constant case for
UBOs.

Maybe we should just handle this in brw_*_nir.cpp for consistency?

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: DispatchComputeIndirect should return INVALID_VALUE on unaligned

2015-10-30 Thread Jordan Justen
I sent a patch on Oct 14 for this:

  "main: Match DispatchCompute* API validation from main specification"

http://patchwork.freedesktop.org/patch/61881/

You might want monitor my cs git branch for related patches to help
with the OpenGLES 3.1 CTS compute shader tests.

-Jordan

On 2015-10-29 07:23:33, Marta Lofstedt wrote:
> From: Marta Lofstedt 
> 
> From the ARB_compute_shader specification:
> 
> "An INVALID_OPERATION error is generated [...] if  is
> less than zero or not a multiple of the size, in basic machine
> units, of uint."
> 
> However, OpenGL ES 3.1 specification, section 17 and OpenGL 4.5
> specification, section 19, has the updated definition:
> 
> "An INVALID_VALUE error is generated if indirect is negative or
> is not a multiple of the size, in basic machine units, of uint."
> 
> Mesa should use the updated version.
> ---
>  src/mesa/main/api_validate.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index 06efe02..9ee8252 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -960,14 +960,15 @@ valid_dispatch_indirect(struct gl_context *ctx,
> if (!check_valid_to_compute(ctx, name))
>return GL_FALSE;
>  
> -   /* From the ARB_compute_shader specification:
> +   /* From the OpenGL ES 3.1 specification, section 17 and the
> +* OpenGL 4.5 specification, section 19:
>  *
> -* "An INVALID_OPERATION error is generated [...] if  is less
> -*  than zero or not a multiple of the size, in basic machine units, of
> -*  uint."
> +*   "An INVALID_VALUE error is generated if indirect is negative
> +*or is not a multiple of the size, in basic machine units,
> +*of uint."
>  */
> if ((GLintptr)indirect & (sizeof(GLuint) - 1)) {
> -  _mesa_error(ctx, GL_INVALID_OPERATION,
> +  _mesa_error(ctx, GL_INVALID_VALUE,
>"%s(indirect is not aligned)", name);
>return GL_FALSE;
> }
> -- 
> 2.1.4
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 00/10] i965: always mark used surfaces in the visitors

2015-10-30 Thread Jordan Justen
On 2015-10-30 09:28:10, Matt Turner wrote:
> On Fri, Oct 30, 2015 at 4:11 AM, Iago Toral Quiroga  wrote:
> > Right now some opcodes that only use constant surface indexing mark them as
> > used in the generator while others do it in the visitor. When the opcode can
> > handle both direct and indirect surface indexing then some opcodes handle
> > only the constant part in the generator and leave the indirect case to the
> > caller. It is all very inconsistent and leads to confusion, since one has to
> > go and look into the generator code in each case to check if it marks 
> > surfaces
> > as used or not, and in which cases.
> >
> > when I was working on SSBOs I was tempted to try and fix this but then I
> > forgot. Jordan bumped into this recently too when comparing visitor
> > code paths for similar opcodes (ubos and ssbos) that need to handle this
> > differently because they use different generator opcodes.
> >
> > Since the generator opcodes never handle marking of indirect surfaces, just
> > leave surface marking to the caller completely, since callers always have
> > all the information needed for this. It also makes things more consistent
> > and clear for everyone: marking surfaces as used is always on the side
> > of the visitor, never the generator.
> 
> What happens if we dead code eliminate the last texture() on a given
> surface? Before in the constant indexed case, we wouldn't mark the
> surface as used, but since this series moves that to the NIR ->
> backend IR translation, presumably it'll still be marked used?
> 
> Is that important, and if so is there anything we can do to "unmark"
> the unused surface?

Seems like a good point, but I can't comment on how important that is.

This could apply to indirect accesses as well, right? But, I guess we
can't deal with that being optimized out today.

What if we had a range of possible surfaces referenced in the
backend_instruction struct? Could we then just look over them once
after optimizing to find out the max surface used?

-Jordan
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] main: Match DispatchCompute* API validation from main specification

2015-10-30 Thread Jordan Justen
On 2015-10-30 00:16:43, Iago Toral wrote:
> On Wed, 2015-10-14 at 13:46 -0700, Jordan Justen wrote:
> > There is a discrepancy between the ARB_compute_shader specification,
> > and the OpenGL 4.3 and OpenGLES 3.1 specifications. With regards to
> > the indirect dispatch parameter, unsupported value errors should
> > return INVALID_VALUE according to the main specifications, whereas the
> > extension specification indicated INVALID_OPERATION should be
> > returned.
> > 
> > Here we update the code to match the main specifications, and update
> > the citations use the main specification rather than the extension
> > specification.
> > 
> > Signed-off-by: Jordan Justen 
> > ---
> >  Fixes ES31-CTS.compute_shader.api-indirect
> > 
> >  src/mesa/main/api_validate.c | 37 ++---
> >  1 file changed, 26 insertions(+), 11 deletions(-)
> > 
> > diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> > index a46c194..c286945 100644
> > --- a/src/mesa/main/api_validate.c
> > +++ b/src/mesa/main/api_validate.c
> > @@ -895,6 +895,11 @@ check_valid_to_compute(struct gl_context *ctx, const 
> > char *function)
> >return false;
> > }
> >  
> > +   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
> > +*
> > +* "An INVALID_OPERATION error is generated if there is no active 
> > program
> > +*  for the compute shader stage."
> > +*/
> > prog = ctx->Shader.CurrentProgram[MESA_SHADER_COMPUTE];
> > if (prog == NULL || prog->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
> >_mesa_error(ctx, GL_INVALID_OPERATION,
> > @@ -917,6 +922,12 @@ _mesa_validate_DispatchCompute(struct gl_context *ctx,
> >return GL_FALSE;
> >  
> > for (i = 0; i < 3; i++) {
> > +  /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute 
> > Shaders:
> > +   *
> > +   * "An INVALID_VALUE error is generated if any of num_groups_x,
> > +   *  num_groups_y and num_groups_z are greater than or equal to the
> > +   *  maximum work group count for the corresponding dimension."
> > +   */
> >if (num_groups[i] > ctx->Const.MaxComputeWorkGroupCount[i]) {
> 
> This should be >= according to that spec quotation.

Wow. Good catch, except, I am fairly certain that this must be a spec
error. 'greater than' instead of 'greater than or equal' is more
consistent with the usage and other parts of the spec.

Unfortunately, this wording exists in the OpenGL 4.3 - 4.5 specs. What
a mess. :)

Here is some contradicting text from the DispatchCompute section of
4.3 spec:

"num groups x, num groups y and num groups z specify the number of
 local work groups that will be dispatched in the X, Y and Z
 dimensions, respectively."

"The maximum number of work groups that may be dispatched at one time
 may be determined by calling GetIntegeri v with target set to
 MAX_COMPUTE_WORK_GROUP_COUNT and index set to zero, one, or two,
 representing the X, Y, and Z dimensions respectively."

So, it is saying that MAX_COMPUTE_WORK_GROUP_COUNT is the maximum
values. Thus being 'equal' to the max should be valid.

And, then in the DispatchComputeIndirect section of the 4.3 spec:

"If any of num groups x, num groups y or num groups z is greater than
 the value of MAX_COMPUTE_WORK_GROUP_COUNT for the corresponding
 dimension then the results are undefined."

Note 'greater than' rather than 'greater than or equal'.

I'll ask the spec author about it, but in the meantime, I think I
should just skip adding this spec citation, or instead use the one
from above that starts with "The maximum number of " ...

What do you think?

> 
> >   _mesa_error(ctx, GL_INVALID_VALUE,
> >   "glDispatchCompute(num_groups_%c)", 'x' + i);
> > @@ -937,24 +948,33 @@ valid_dispatch_indirect(struct gl_context *ctx,
> > if (!check_valid_to_compute(ctx, name))
> >return GL_FALSE;
> >  
> > -   /* From the ARB_compute_shader specification:
> > +   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
> >  *
> > -* "An INVALID_OPERATION error is generated [...] if  is less
> > -*  than zero or not a multiple of the size, in basic machine units, of
> > -*  uint."
> > +* "An INVALID_VALUE error is generated if indirect is negative or is 
> > not a
> > +*  multiple of four."
> > +* Note that the OpenGLES 3.1 specification match

Re: [Mesa-dev] [PATCH] glsl: OpenGLES GLSL 3.1 precision qualifiers ordering rules

2015-10-31 Thread Jordan Justen
On 2015-10-29 01:22:37, Iago Toral wrote:
> On Thu, 2015-10-29 at 00:47 -0700, Jordan Justen wrote:
> > The OpenGLES GLSL 3.1 specification uses the precision qualifier
> > ordering rules from ARB_shading_language_420pack.
> 
> Maybe expand the commit log to make explicit that this is for GLES 3.1
> and desktop GL since 4.2

has_420pack already checks for OpenGL 4.2, so for desktop GL, this
patch doesn't change anything.

I don't think we want to add OpenGLES 3.1 to has_420pack because the
3.1 spec doesn't appear to adopt all of 420pack.

-Jordan

> Reviewed-by: Iago Toral Quiroga 
> 
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/glsl/glsl_parser.yy | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> > index 2f2e10d..4636435 100644
> > --- a/src/glsl/glsl_parser.yy
> > +++ b/src/glsl/glsl_parser.yy
> > @@ -948,7 +948,8 @@ parameter_qualifier:
> >if ($2.precision != ast_precision_none)
> >   _mesa_glsl_error(&@1, state, "duplicate precision qualifier");
> >  
> > -  if (!state->has_420pack() && $2.flags.i != 0)
> > +  if (!(state->has_420pack() || state->is_version(420, 310)) &&
> > +  $2.flags.i != 0)
> >   _mesa_glsl_error(&@1, state, "precision qualifiers must come 
> > last");
> >  
> >$$ = $2;
> > @@ -1847,7 +1848,8 @@ type_qualifier:
> >if ($2.precision != ast_precision_none)
> >   _mesa_glsl_error(&@1, state, "duplicate precision qualifier");
> >  
> > -  if (!state->has_420pack() && $2.flags.i != 0)
> > +  if (!(state->has_420pack() || state->is_version(420, 310)) &&
> > +  $2.flags.i != 0)
> >   _mesa_glsl_error(&@1, state, "precision qualifiers must come 
> > last");
> >  
> >$$ = $2;
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/2] mesa: Update DispatchComputeIndirect errors for indirect parameter

2015-11-02 Thread Jordan Justen
There is some discrepancy between the return values for some error
cases for the DispatchComputeIndirect call in the ARB_compute_shader
specification. Regarding the indirect parameter, in one place the
extension spec lists that the error returned for invalid values should
be INVALID_OPERATION, while later it specifies INVALID_VALUE.

The OpenGL 4.3 and OpenGLES 3.1 specifications appear to be consistent
in requiring the INVALID_VALUE error return in this case.

Here we update the code to match the main specifications, and update
the citations use the main specification rather than the extension
specification.

v2:
 * Updates based on review from Iago

Signed-off-by: Jordan Justen 
Cc: Iago Toral Quiroga 
Cc: Marta Lofstedt 
---
 src/mesa/main/api_validate.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 1f729e8..a3ee8c0 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -960,20 +960,19 @@ valid_dispatch_indirect(struct gl_context *ctx,
if (!check_valid_to_compute(ctx, name))
   return GL_FALSE;
 
-   /* From the ARB_compute_shader specification:
+   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
 *
-* "An INVALID_OPERATION error is generated [...] if  is less
-*  than zero or not a multiple of the size, in basic machine units, of
-*  uint."
+* "An INVALID_VALUE error is generated if indirect is negative or is not a
+*  multiple of four."
 */
if ((GLintptr)indirect & (sizeof(GLuint) - 1)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
+  _mesa_error(ctx, GL_INVALID_VALUE,
   "%s(indirect is not aligned)", name);
   return GL_FALSE;
}
 
if ((GLintptr)indirect < 0) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
+  _mesa_error(ctx, GL_INVALID_VALUE,
   "%s(indirect is less than zero)", name);
   return GL_FALSE;
}
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 2/2] mesa: Add spec citations for DispatchCompute*

2015-11-02 Thread Jordan Justen
Note: The OpenGL 4.3 - 4.5 specification language for DispatchCompute
appears to have an error regarding the max allowed values. When adding
the specification citation, we note why the code does not match the
specification language.

v2:
 * Updates based on review from Iago

Signed-off-by: Jordan Justen 
Cc: Iago Toral Quiroga 
Cc: Marta Lofstedt 
---
 src/mesa/main/api_validate.c | 34 +-
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index a3ee8c0..a490189 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -918,6 +918,11 @@ check_valid_to_compute(struct gl_context *ctx, const char 
*function)
   return false;
}
 
+   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
+*
+* "An INVALID_OPERATION error is generated if there is no active program
+*  for the compute shader stage."
+*/
prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
if (prog == NULL || prog->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
@@ -940,6 +945,24 @@ _mesa_validate_DispatchCompute(struct gl_context *ctx,
   return GL_FALSE;
 
for (i = 0; i < 3; i++) {
+  /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
+   *
+   * "An INVALID_VALUE error is generated if any of num_groups_x,
+   *  num_groups_y and num_groups_z are greater than or equal to the
+   *  maximum work group count for the corresponding dimension."
+   *
+   * However, the "or equal to" portions appears to be a specification
+   * bug. In all other areas, the specification appears to indicate that
+   * the number of workgroups can match the MAX_COMPUTE_WORK_GROUP_COUNT
+   * value. For example, under DispatchComputeIndirect:
+   *
+   * "If any of num_groups_x, num_groups_y or num_groups_z is greater than
+   *  the value of MAX_COMPUTE_WORK_GROUP_COUNT for the corresponding
+   *  dimension then the results are undefined."
+   *
+   * Additionally, the OpenGLES 3.1 specification does not contain "or
+   * equal to" as an error condition.
+   */
   if (num_groups[i] > ctx->Const.MaxComputeWorkGroupCount[i]) {
  _mesa_error(ctx, GL_INVALID_VALUE,
  "glDispatchCompute(num_groups_%c)", 'x' + i);
@@ -977,6 +1000,12 @@ valid_dispatch_indirect(struct gl_context *ctx,
   return GL_FALSE;
}
 
+   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
+*
+* "An INVALID_OPERATION error is generated if no buffer is bound to the
+*  DRAW_INDIRECT_BUFFER binding, or if the command would source data
+*  beyond the end of the buffer object."
+*/
if (!_mesa_is_bufferobj(ctx->DispatchIndirectBuffer)) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "%s: no buffer bound to DISPATCH_INDIRECT_BUFFER", name);
@@ -989,11 +1018,6 @@ valid_dispatch_indirect(struct gl_context *ctx,
   return GL_FALSE;
}
 
-   /* From the ARB_compute_shader specification:
-*
-* "An INVALID_OPERATION error is generated if this command sources data
-*  beyond the end of the buffer object [...]"
-*/
if (ctx->DispatchIndirectBuffer->Size < end) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
   "%s(DISPATCH_INDIRECT_BUFFER too small)", name);
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/2] mesa: Add spec citations for DispatchCompute*

2015-11-02 Thread Jordan Justen
On 2015-11-02 00:35:36, Jordan Justen wrote:
> Note: The OpenGL 4.3 - 4.5 specification language for DispatchCompute
> appears to have an error regarding the max allowed values. When adding
> the specification citation, we note why the code does not match the
> specification language.
> 
> v2:
>  * Updates based on review from Iago
> 
> Signed-off-by: Jordan Justen 
> Cc: Iago Toral Quiroga 
> Cc: Marta Lofstedt 
> ---
>  src/mesa/main/api_validate.c | 34 +-
>  1 file changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index a3ee8c0..a490189 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -918,6 +918,11 @@ check_valid_to_compute(struct gl_context *ctx, const 
> char *function)
>return false;
> }
>  
> +   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
> +*
> +* "An INVALID_OPERATION error is generated if there is no active program
> +*  for the compute shader stage."
> +*/
> prog = ctx->_Shader->CurrentProgram[MESA_SHADER_COMPUTE];
> if (prog == NULL || prog->_LinkedShaders[MESA_SHADER_COMPUTE] == NULL) {
>_mesa_error(ctx, GL_INVALID_OPERATION,
> @@ -940,6 +945,24 @@ _mesa_validate_DispatchCompute(struct gl_context *ctx,
>return GL_FALSE;
>  
> for (i = 0; i < 3; i++) {
> +  /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
> +   *
> +   * "An INVALID_VALUE error is generated if any of num_groups_x,
> +   *  num_groups_y and num_groups_z are greater than or equal to the
> +   *  maximum work group count for the corresponding dimension."
> +   *
> +   * However, the "or equal to" portions appears to be a specification
> +   * bug. In all other areas, the specification appears to indicate that
> +   * the number of workgroups can match the MAX_COMPUTE_WORK_GROUP_COUNT
> +   * value. For example, under DispatchComputeIndirect:
> +   *
> +   * "If any of num_groups_x, num_groups_y or num_groups_z is greater 
> than
> +   *  the value of MAX_COMPUTE_WORK_GROUP_COUNT for the corresponding
> +   *  dimension then the results are undefined."
> +   *
> +   * Additionally, the OpenGLES 3.1 specification does not contain "or
> +   * equal to" as an error condition.

I did get confirmation from the extension spec author that this
appears to be a bug in the OpenGL 4.3 - 4.5 specs, and I filed an
OpenGL spec bug.

-Jordan

> +   */
>if (num_groups[i] > ctx->Const.MaxComputeWorkGroupCount[i]) {
>   _mesa_error(ctx, GL_INVALID_VALUE,
>   "glDispatchCompute(num_groups_%c)", 'x' + i);
> @@ -977,6 +1000,12 @@ valid_dispatch_indirect(struct gl_context *ctx,
>return GL_FALSE;
> }
>  
> +   /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute Shaders:
> +*
> +* "An INVALID_OPERATION error is generated if no buffer is bound to the
> +*  DRAW_INDIRECT_BUFFER binding, or if the command would source data
> +*  beyond the end of the buffer object."
> +*/
> if (!_mesa_is_bufferobj(ctx->DispatchIndirectBuffer)) {
>_mesa_error(ctx, GL_INVALID_OPERATION,
>"%s: no buffer bound to DISPATCH_INDIRECT_BUFFER", name);
> @@ -989,11 +1018,6 @@ valid_dispatch_indirect(struct gl_context *ctx,
>return GL_FALSE;
> }
>  
> -   /* From the ARB_compute_shader specification:
> -*
> -* "An INVALID_OPERATION error is generated if this command sources data
> -*  beyond the end of the buffer object [...]"
> -*/
> if (ctx->DispatchIndirectBuffer->Size < end) {
>_mesa_error(ctx, GL_INVALID_OPERATION,
>"%s(DISPATCH_INDIRECT_BUFFER too small)", name);
> -- 
> 2.6.2
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] intel: Add SKL GT4 PCI IDs

2015-11-03 Thread Jordan Justen
Series Reviewed-by: Jordan Justen 

For the 2de4e8fdbae1e1909ce35f8ba15608a124686fb0 version on your drm
gt4 branch.

On 2015-10-23 10:56:33, Ben Widawsky wrote:
> Cc: Kristian Høgsberg 
> Cc: Damien Lespiau 
> Signed-off-by: Ben Widawsky 
> ---
>  intel/intel_chipset.h | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/intel/intel_chipset.h b/intel/intel_chipset.h
> index 253ea71..6c8dc73 100644
> --- a/intel/intel_chipset.h
> +++ b/intel/intel_chipset.h
> @@ -180,6 +180,10 @@
>  #define PCI_CHIP_SKYLAKE_SRV_GT3   0x192A
>  #define PCI_CHIP_SKYLAKE_SRV_GT1   0x190A
>  #define PCI_CHIP_SKYLAKE_WKS_GT2   0x191D
> +#define PCI_CHIP_SKYLAKE_DT_GT40x1932
> +#define PCI_CHIP_SKYLAKE_SRV_GT4   0x193A
> +#define PCI_CHIP_SKYLAKE_H_GT4 0x193B
> +#define PCI_CHIP_SKYLAKE_WKS_GT4   0x193D
>  
>  #define PCI_CHIP_BROXTON_0 0x0A84
>  #define PCI_CHIP_BROXTON_1 0x1A84
> @@ -362,9 +366,15 @@
>  (devid) == PCI_CHIP_SKYLAKE_HALO_GT3   || \
>  (devid) == PCI_CHIP_SKYLAKE_SRV_GT3)
>  
> +#define IS_SKL_GT4(devid)  ((devid) == PCI_CHIP_SKYLAKE_DT_GT4 || \
> +(devid) == PCI_CHIP_SKYLAKE_SRV_GT4|| \
> +(devid) == PCI_CHIP_SKYLAKE_H_GT4  || \
> +(devid) == PCI_CHIP_SKYLAKE_WKS_GT4)
> +
>  #define IS_SKYLAKE(devid)  (IS_SKL_GT1(devid) || \
>  IS_SKL_GT2(devid) || \
> -IS_SKL_GT3(devid))
> +IS_SKL_GT3(devid) || \
> +IS_SKL_GT4(devid))
>  
>  #define IS_BROXTON(devid)  ((devid) == PCI_CHIP_BROXTON_0  || \
>  (devid) == PCI_CHIP_BROXTON_1  || \
> -- 
> 2.6.1
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] [v3] i965/skl: Add GT4 PCI IDs

2015-11-03 Thread Jordan Justen
Seried Reviewed-by: Jordan Justen 

For the dde33fc23c4ef8b8e02fb5768161fdaa078847d5 version on your mesa
gt4 branch.

On 2015-10-29 17:30:35, Ben Widawsky wrote:
> Like other gen8+ hardware, the hardware automatically scales up thread counts
> and URB sizes, so there is no need to do anything but add the PCI IDs.
> 
> FINISHME: This patch still needs testing before merge.
> 
> v2: Remove the PCI ID removal. That should be done as part of the next patch.
> 
> v3: Update the wm thread count to support GT4.
> 
> Cc: mesa-sta...@lists.freedesktop.org
> Signed-off-by: Ben Widawsky 
> ---
>  include/pci_ids/i965_pci_ids.h  | 4 
>  src/mesa/drivers/dri/i965/brw_device_info.c | 6 +-
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/include/pci_ids/i965_pci_ids.h b/include/pci_ids/i965_pci_ids.h
> index 8a42599..626064a 100644
> --- a/include/pci_ids/i965_pci_ids.h
> +++ b/include/pci_ids/i965_pci_ids.h
> @@ -124,6 +124,10 @@ CHIPSET(0x1921, skl_gt2, "Intel(R) Skylake ULT GT2F")
>  CHIPSET(0x1926, skl_gt3, "Intel(R) Skylake ULT GT3")
>  CHIPSET(0x192A, skl_gt3, "Intel(R) Skylake SRV GT3")
>  CHIPSET(0x192B, skl_gt3, "Intel(R) Skylake Halo GT3")
> +CHIPSET(0x1932, skl_gt4, "Intel(R) Skylake GT4")
> +CHIPSET(0x193A, skl_gt4, "Intel(R) Skylake GT4")
> +CHIPSET(0x193B, skl_gt4, "Intel(R) Skylake GT4")
> +CHIPSET(0x193D, skl_gt4, "Intel(R) Skylake GT4")
>  CHIPSET(0x22B0, chv, "Intel(R) HD Graphics (Cherryview)")
>  CHIPSET(0x22B1, chv, "Intel(R) HD Graphics (Cherryview)")
>  CHIPSET(0x22B2, chv, "Intel(R) HD Graphics (Cherryview)")
> diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c 
> b/src/mesa/drivers/dri/i965/brw_device_info.c
> index e86b530..2ebc084 100644
> --- a/src/mesa/drivers/dri/i965/brw_device_info.c
> +++ b/src/mesa/drivers/dri/i965/brw_device_info.c
> @@ -311,7 +311,7 @@ static const struct brw_device_info brw_device_info_chv = 
> {
> .max_gs_threads = 336,   \
> .max_hs_threads = 336,   \
> .max_ds_threads = 336,   \
> -   .max_wm_threads = 64 * 6,\
> +   .max_wm_threads = 64 * 9,\
> .max_cs_threads = 56,\
> .urb = { \
>.size = 384,  \
> @@ -335,6 +335,10 @@ static const struct brw_device_info 
> brw_device_info_skl_gt3 = {
> GEN9_FEATURES, .gt = 3,
>  };
>  
> +static const struct brw_device_info brw_device_info_skl_gt4 = {
> +   GEN9_FEATURES, .gt = 4,
> +};
> +
>  static const struct brw_device_info brw_device_info_bxt = {
> GEN9_FEATURES,
> .is_broxton = 1,
> -- 
> 2.6.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] glsl: Add new barrier functions for compute shaders

2015-11-03 Thread Jordan Justen
When these functions are called in GLSL code, we create an intrinsic
function call:

 * groupMemoryBarrier => __intrinsic_group_memory_barrier
 * memoryBarrierAtomicCounter => __intrinsic_memory_barrier_atomic_counter
 * memoryBarrierBuffer => __intrinsic_memory_barrier_buffer
 * memoryBarrierImage => __intrinsic_memory_barrier_image
 * memoryBarrierShared => __intrinsic_memory_barrier_shared

Signed-off-by: Jordan Justen 
---

Notes:
git://people.freedesktop.org/~jljusten/mesa cs-shader-barrier-funcs-v1
http://patchwork.freedesktop.org/bundle/jljusten/cs-shader-barrier-funcs-v1

Tested with the OpenGLES 3.1 CTS test suite on my cs branch. With
these patches reverted, these tests fail:

 * ES31-CTS.compute_shader.work-group-size
 * ES31-CTS.compute_shader.shared-simple
 * ES31-CTS.compute_shader.shared-struct
 * ES31-CTS.compute_shader.atomic-case1
 * ES31-CTS.compute_shader.shared-indexing

 src/glsl/builtin_functions.cpp | 134 -
 1 file changed, 133 insertions(+), 1 deletion(-)

diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index 509a57b..2bb7bcb 100644
--- a/src/glsl/builtin_functions.cpp
+++ b/src/glsl/builtin_functions.cpp
@@ -459,9 +459,15 @@ fp64(const _mesa_glsl_parse_state *state)
 }
 
 static bool
+compute_shader(const _mesa_glsl_parse_state *state)
+{
+   return state->stage == MESA_SHADER_COMPUTE;
+}
+
+static bool
 barrier_supported(const _mesa_glsl_parse_state *state)
 {
-   return state->stage == MESA_SHADER_COMPUTE ||
+   return compute_shader(state) ||
   state->stage == MESA_SHADER_TESS_CTRL;
 }
 
@@ -715,6 +721,11 @@ private:
ir_function_signature *_EndStreamPrimitive(builtin_available_predicate 
avail,
   const glsl_type *stream_type);
B0(barrier)
+   B0(groupMemoryBarrier)
+   B0(memoryBarrierAtomicCounter)
+   B0(memoryBarrierBuffer)
+   B0(memoryBarrierImage)
+   B0(memoryBarrierShared)
 
BA2(textureQueryLod);
B1(textureQueryLevels);
@@ -787,6 +798,26 @@ private:
   builtin_available_predicate avail);
ir_function_signature *_memory_barrier(
   builtin_available_predicate avail);
+   ir_function_signature *_group_memory_barrier_intrinsic(
+  builtin_available_predicate avail);
+   ir_function_signature *_group_memory_barrier(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_atomic_counter_intrinsic(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_atomic_counter(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_buffer_intrinsic(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_buffer(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_image_intrinsic(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_image(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_shared_intrinsic(
+  builtin_available_predicate avail);
+   ir_function_signature *_memory_barrier_shared(
+  builtin_available_predicate avail);
 
ir_function_signature *_shader_clock_intrinsic(builtin_available_predicate 
avail,
   const glsl_type *type);
@@ -968,6 +999,22 @@ builtin_builder::create_intrinsics()
 _shader_clock_intrinsic(shader_clock,
 glsl_type::uvec2_type),
 NULL);
+
+   add_function("__intrinsic_group_memory_barrier",
+_group_memory_barrier_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_atomic_counter",
+_memory_barrier_atomic_counter_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_buffer",
+_memory_barrier_buffer_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_image",
+_memory_barrier_image_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_shared",
+_memory_barrier_shared_intrinsic(compute_shader),
+NULL);
 }
 
 /**
@@ -2081,6 +2128,11 @@ builtin_builder::create_builtins()
 _EndStreamPrimitive(gs_streams, glsl_type::int_type),
 NULL);
add_function("barrier", _barrier(), NULL);
+   add_function("groupMemoryBarrier", _group_memory_barrier(compute_shader), 
NULL);
+   add_function("memoryBarrierAtomicCounter", 
_memory_barrier_atomic_counter(compute_shader), NULL);
+   add_function("memoryBarrierBuffer", _memory_barrier_buffer(compute_shader), 
NULL);
+   add_function("memoryBarrierImage", _memory_

[Mesa-dev] [PATCH 2/3] nir: Add new barrier functions for compute shaders

2015-11-03 Thread Jordan Justen
When these functions are called in glsl-ir, we create a corresponding
nir intrinsic function call.

Signed-off-by: Jordan Justen 
---
 src/glsl/nir/glsl_to_nir.cpp  | 15 +++
 src/glsl/nir/nir_intrinsics.h | 11 +++
 2 files changed, 26 insertions(+)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 57aba5b..facb9fa 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -719,6 +719,16 @@ nir_visitor::visit(ir_call *ir)
  op = nir_intrinsic_ssbo_atomic_comp_swap;
   } else if (strcmp(ir->callee_name(), "__intrinsic_shader_clock") == 0) {
  op = nir_intrinsic_shader_clock;
+  } else if (strcmp(ir->callee_name(), "__intrinsic_group_memory_barrier") 
== 0) {
+ op = nir_intrinsic_group_memory_barrier;
+  } else if (strcmp(ir->callee_name(), 
"__intrinsic_memory_barrier_atomic_counter") == 0) {
+ op = nir_intrinsic_memory_barrier_atomic_counter;
+  } else if (strcmp(ir->callee_name(), 
"__intrinsic_memory_barrier_buffer") == 0) {
+ op = nir_intrinsic_memory_barrier_buffer;
+  } else if (strcmp(ir->callee_name(), "__intrinsic_memory_barrier_image") 
== 0) {
+ op = nir_intrinsic_memory_barrier_image;
+  } else if (strcmp(ir->callee_name(), 
"__intrinsic_memory_barrier_shared") == 0) {
+ op = nir_intrinsic_memory_barrier_shared;
   } else {
  unreachable("not reached");
   }
@@ -821,6 +831,11 @@ nir_visitor::visit(ir_call *ir)
  break;
   }
   case nir_intrinsic_memory_barrier:
+  case nir_intrinsic_group_memory_barrier:
+  case nir_intrinsic_memory_barrier_atomic_counter:
+  case nir_intrinsic_memory_barrier_buffer:
+  case nir_intrinsic_memory_barrier_image:
+  case nir_intrinsic_memory_barrier_shared:
  nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
  break;
   case nir_intrinsic_shader_clock:
diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index c2b6fe7..36fb286 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -91,6 +91,17 @@ BARRIER(memory_barrier)
  */
 INTRINSIC(shader_clock, 0, ARR(), true, 1, 0, 0, NIR_INTRINSIC_CAN_ELIMINATE)
 
+/*
+ * Memory barrier with semantics analogous to the compute shader
+ * groupMemoryBarrier(), memoryBarrierAtomicCounter(), memoryBarrierBuffer(),
+ * memoryBarrierImage() and memoryBarrierShared() GLSL intrinsics.
+ */
+BARRIER(group_memory_barrier)
+BARRIER(memory_barrier_atomic_counter)
+BARRIER(memory_barrier_buffer)
+BARRIER(memory_barrier_image)
+BARRIER(memory_barrier_shared)
+
 /** A conditional discard, with a single boolean source. */
 INTRINSIC(discard_if, 1, ARR(1), false, 0, 0, 0, 0)
 
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] i965/nir/fs: Implement new barrier functions for compute shaders

2015-11-03 Thread Jordan Justen
For these nir intrinsics, we emit the same code as
nir_intrinsic_memory_barrier:

 * nir_intrinsic_memory_barrier_atomic_counter
 * nir_intrinsic_memory_barrier_buffer
 * nir_intrinsic_memory_barrier_image

We treat these nir intrinsics as no-ops:

 * nir_intrinsic_group_memory_barrier
 * nir_intrinsic_memory_barrier_shared

Signed-off-by: Jordan Justen 
Cc: Francisco Jerez 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 7eeff93..044b83e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1324,6 +1324,9 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
   break;
}
 
+   case nir_intrinsic_memory_barrier_atomic_counter:
+   case nir_intrinsic_memory_barrier_buffer:
+   case nir_intrinsic_memory_barrier_image:
case nir_intrinsic_memory_barrier: {
   const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD, 16 / dispatch_width);
   bld.emit(SHADER_OPCODE_MEMORY_FENCE, tmp)
@@ -1331,6 +1334,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
   break;
}
 
+   case nir_intrinsic_group_memory_barrier:
+   case nir_intrinsic_memory_barrier_shared:
+  break;
+
case nir_intrinsic_shader_clock: {
   /* We cannot do anything if there is an event, so ignore it for now */
   fs_reg shader_clock = get_timestamp(bld);
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/3] glsl: Add new barrier functions for compute shaders

2015-11-04 Thread Jordan Justen
When these functions are called in GLSL code, we create an intrinsic
function call:

 * groupMemoryBarrier => __intrinsic_group_memory_barrier
 * memoryBarrierAtomicCounter => __intrinsic_memory_barrier_atomic_counter
 * memoryBarrierBuffer => __intrinsic_memory_barrier_buffer
 * memoryBarrierImage => __intrinsic_memory_barrier_image
 * memoryBarrierShared => __intrinsic_memory_barrier_shared

v2:
 * Consolidate with memoryBarrier function/intrinsic creation (curro)

Signed-off-by: Jordan Justen 
---
 src/glsl/builtin_functions.cpp | 58 +++---
 1 file changed, 49 insertions(+), 9 deletions(-)

diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
index 509a57b..21ae5ce 100644
--- a/src/glsl/builtin_functions.cpp
+++ b/src/glsl/builtin_functions.cpp
@@ -459,9 +459,15 @@ fp64(const _mesa_glsl_parse_state *state)
 }
 
 static bool
+compute_shader(const _mesa_glsl_parse_state *state)
+{
+   return state->stage == MESA_SHADER_COMPUTE;
+}
+
+static bool
 barrier_supported(const _mesa_glsl_parse_state *state)
 {
-   return state->stage == MESA_SHADER_COMPUTE ||
+   return compute_shader(state) ||
   state->stage == MESA_SHADER_TESS_CTRL;
 }
 
@@ -785,7 +791,9 @@ private:
 
ir_function_signature *_memory_barrier_intrinsic(
   builtin_available_predicate avail);
-   ir_function_signature *_memory_barrier(
+   void add_memory_barrier_function(
+  const char *function_name,
+  const char *intrinsic_name,
   builtin_available_predicate avail);
 
ir_function_signature *_shader_clock_intrinsic(builtin_available_predicate 
avail,
@@ -963,6 +971,21 @@ builtin_builder::create_intrinsics()
add_function("__intrinsic_memory_barrier",
 _memory_barrier_intrinsic(shader_image_load_store),
 NULL);
+   add_function("__intrinsic_group_memory_barrier",
+_memory_barrier_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_atomic_counter",
+_memory_barrier_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_buffer",
+_memory_barrier_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_image",
+_memory_barrier_intrinsic(compute_shader),
+NULL);
+   add_function("__intrinsic_memory_barrier_shared",
+_memory_barrier_intrinsic(compute_shader),
+NULL);
 
add_function("__intrinsic_shader_clock",
 _shader_clock_intrinsic(shader_clock,
@@ -2753,9 +2776,24 @@ builtin_builder::create_builtins()
 
add_image_functions(true);
 
-   add_function("memoryBarrier",
-_memory_barrier(shader_image_load_store),
-NULL);
+   add_memory_barrier_function("memoryBarrier",
+   "__intrinsic_memory_barrier",
+   shader_image_load_store);
+   add_memory_barrier_function("groupMemoryBarrier",
+   "__intrinsic_group_memory_barrier",
+   compute_shader);
+   add_memory_barrier_function("memoryBarrierAtomicCounter",
+   "__intrinsic_memory_barrier_atomic_counter",
+   compute_shader);
+   add_memory_barrier_function("memoryBarrierBuffer",
+   "__intrinsic_memory_barrier_buffer",
+   compute_shader);
+   add_memory_barrier_function("memoryBarrierImage",
+   "__intrinsic_memory_barrier_image",
+   compute_shader);
+   add_memory_barrier_function("memoryBarrierShared",
+   "__intrinsic_memory_barrier_shared",
+   compute_shader);
 
add_function("clock2x32ARB",
 _shader_clock(shader_clock,
@@ -5263,13 +5301,15 @@ 
builtin_builder::_memory_barrier_intrinsic(builtin_available_predicate avail)
return sig;
 }
 
-ir_function_signature *
-builtin_builder::_memory_barrier(builtin_available_predicate avail)
+void
+builtin_builder::add_memory_barrier_function(const char *function_name,
+ const char *intrinsic_name,
+ builtin_available_predicate avail)
 {
MAKE_SIG(glsl_type::void_type, avail, 0);
-   body.emit(call(shader->symbols->get_function("__intrinsic_memory_barrier"),
+   body.emit(call(shader->symbols->get_function(intrinsic_name),
   NULL, sig->parameters));
-   return sig;
+   add_function(function_name, sig, NULL);
 }
 
 ir_function_signature *
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/3] glsl: Add new barrier functions for compute shaders

2015-11-05 Thread Jordan Justen
On 2015-11-05 06:07:02, Francisco Jerez wrote:
> Jordan Justen  writes:
> 
> > When these functions are called in GLSL code, we create an intrinsic
> > function call:
> >
> >  * groupMemoryBarrier => __intrinsic_group_memory_barrier
> >  * memoryBarrierAtomicCounter => __intrinsic_memory_barrier_atomic_counter
> >  * memoryBarrierBuffer => __intrinsic_memory_barrier_buffer
> >  * memoryBarrierImage => __intrinsic_memory_barrier_image
> >  * memoryBarrierShared => __intrinsic_memory_barrier_shared
> >
> > v2:
> >  * Consolidate with memoryBarrier function/intrinsic creation (curro)
> >
> > Signed-off-by: Jordan Justen 
> > ---
> >  src/glsl/builtin_functions.cpp | 58 
> > +++---
> >  1 file changed, 49 insertions(+), 9 deletions(-)
> >
> > diff --git a/src/glsl/builtin_functions.cpp b/src/glsl/builtin_functions.cpp
> > index 509a57b..21ae5ce 100644
> > --- a/src/glsl/builtin_functions.cpp
> > +++ b/src/glsl/builtin_functions.cpp
> > @@ -459,9 +459,15 @@ fp64(const _mesa_glsl_parse_state *state)
> >  }
> >  
> >  static bool
> > +compute_shader(const _mesa_glsl_parse_state *state)
> > +{
> > +   return state->stage == MESA_SHADER_COMPUTE;
> > +}
> > +
> > +static bool
> >  barrier_supported(const _mesa_glsl_parse_state *state)
> >  {
> > -   return state->stage == MESA_SHADER_COMPUTE ||
> > +   return compute_shader(state) ||
> >state->stage == MESA_SHADER_TESS_CTRL;
> >  }
> >  
> > @@ -785,7 +791,9 @@ private:
> >  
> > ir_function_signature *_memory_barrier_intrinsic(
> >builtin_available_predicate avail);
> > -   ir_function_signature *_memory_barrier(
> > +   void add_memory_barrier_function(
> > +  const char *function_name,
> > +  const char *intrinsic_name,
> >builtin_available_predicate avail);
> >  
> Do we need a separate add_*_function() method for these?  The most
> consistent with the other builtin builder code would be to have a
> _memory_barrier() method to construct the ir_function_signature given
> the name of the intrinsic to call and an availability predicate, and
> then pass it to the usual add_function() method, kind of like you did in
> your v1.  Admittedly image load/store/atomic built-ins use a separate
> add_image_function() for this because they need such a ridiculous amount
> of overloads -- But that's the exception rather than the rule.

So, add the intrinsic name to lookup to _memory_barrier?

Like this?

   add_function("memoryBarrierAtomicCounter",
_memory_barrier("__intrinsic_memory_barrier_atomic_counter",
compute_shader),
NULL);

I thought add_memory_barrier_function looked a little better, but
either is fine with me.

-Jordan

> 
> > ir_function_signature 
> > *_shader_clock_intrinsic(builtin_available_predicate avail,
> > @@ -963,6 +971,21 @@ builtin_builder::create_intrinsics()
> > add_function("__intrinsic_memory_barrier",
> >  _memory_barrier_intrinsic(shader_image_load_store),
> >  NULL);
> > +   add_function("__intrinsic_group_memory_barrier",
> > +_memory_barrier_intrinsic(compute_shader),
> > +NULL);
> > +   add_function("__intrinsic_memory_barrier_atomic_counter",
> > +_memory_barrier_intrinsic(compute_shader),
> > +NULL);
> > +   add_function("__intrinsic_memory_barrier_buffer",
> > +_memory_barrier_intrinsic(compute_shader),
> > +NULL);
> > +   add_function("__intrinsic_memory_barrier_image",
> > +_memory_barrier_intrinsic(compute_shader),
> > +NULL);
> > +   add_function("__intrinsic_memory_barrier_shared",
> > +_memory_barrier_intrinsic(compute_shader),
> > +NULL);
> >  
> > add_function("__intrinsic_shader_clock",
> >  _shader_clock_intrinsic(shader_clock,
> > @@ -2753,9 +2776,24 @@ builtin_builder::create_builtins()
> >  
> > add_image_functions(true);
> >  
> > -   add_function("memoryBarrier",
> > -_memory_barrier(shader_image_load_store),
> > -NULL);
> > +   add_memory_barrier_function("memoryBarrier",
> > +   "__intrinsic_memory_barrier",
> > +

[Mesa-dev] [PATCH v3 2/3] nir: Add new barrier functions for compute shaders

2015-11-05 Thread Jordan Justen
When these functions are called in glsl-ir, we create a corresponding
nir intrinsic function call.

Signed-off-by: Jordan Justen 
Reviewed-by: Francisco Jerez 
---
 src/glsl/nir/glsl_to_nir.cpp  | 15 +++
 src/glsl/nir/nir_intrinsics.h | 11 +++
 2 files changed, 26 insertions(+)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 57aba5b..facb9fa 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -719,6 +719,16 @@ nir_visitor::visit(ir_call *ir)
  op = nir_intrinsic_ssbo_atomic_comp_swap;
   } else if (strcmp(ir->callee_name(), "__intrinsic_shader_clock") == 0) {
  op = nir_intrinsic_shader_clock;
+  } else if (strcmp(ir->callee_name(), "__intrinsic_group_memory_barrier") 
== 0) {
+ op = nir_intrinsic_group_memory_barrier;
+  } else if (strcmp(ir->callee_name(), 
"__intrinsic_memory_barrier_atomic_counter") == 0) {
+ op = nir_intrinsic_memory_barrier_atomic_counter;
+  } else if (strcmp(ir->callee_name(), 
"__intrinsic_memory_barrier_buffer") == 0) {
+ op = nir_intrinsic_memory_barrier_buffer;
+  } else if (strcmp(ir->callee_name(), "__intrinsic_memory_barrier_image") 
== 0) {
+ op = nir_intrinsic_memory_barrier_image;
+  } else if (strcmp(ir->callee_name(), 
"__intrinsic_memory_barrier_shared") == 0) {
+ op = nir_intrinsic_memory_barrier_shared;
   } else {
  unreachable("not reached");
   }
@@ -821,6 +831,11 @@ nir_visitor::visit(ir_call *ir)
  break;
   }
   case nir_intrinsic_memory_barrier:
+  case nir_intrinsic_group_memory_barrier:
+  case nir_intrinsic_memory_barrier_atomic_counter:
+  case nir_intrinsic_memory_barrier_buffer:
+  case nir_intrinsic_memory_barrier_image:
+  case nir_intrinsic_memory_barrier_shared:
  nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
  break;
   case nir_intrinsic_shader_clock:
diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index c2b6fe7..36fb286 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -91,6 +91,17 @@ BARRIER(memory_barrier)
  */
 INTRINSIC(shader_clock, 0, ARR(), true, 1, 0, 0, NIR_INTRINSIC_CAN_ELIMINATE)
 
+/*
+ * Memory barrier with semantics analogous to the compute shader
+ * groupMemoryBarrier(), memoryBarrierAtomicCounter(), memoryBarrierBuffer(),
+ * memoryBarrierImage() and memoryBarrierShared() GLSL intrinsics.
+ */
+BARRIER(group_memory_barrier)
+BARRIER(memory_barrier_atomic_counter)
+BARRIER(memory_barrier_buffer)
+BARRIER(memory_barrier_image)
+BARRIER(memory_barrier_shared)
+
 /** A conditional discard, with a single boolean source. */
 INTRINSIC(discard_if, 1, ARR(1), false, 0, 0, 0, 0)
 
-- 
2.6.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   3   4   5   6   7   8   9   10   >