[Mesa-dev] [PATCH 1/2] st/xlib: Fix XImage bytes-per-pixel calculation

2013-06-14 Thread Richard Sandiford
Fixes a crash seen while running gnome on a 16-bit screen over vnc.

Signed-off-by: Richard Sandiford 
---
 src/gallium/state_trackers/glx/xlib/xm_api.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c b/src/gallium/state_trackers/glx/xlib/xm_api.c
index b758c8e..36ebb46 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_api.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
@@ -1407,9 +1407,8 @@ XMesaBindTexImage(Display *dpy, XMesaBuffer drawable, int buffer,
  return;
   }
 
-  /* The pipe transfer has a pitch rounded up to the nearest 64 pixels.
- We assume 32 bit pixels. */
-  ximage_stride = w * 4;
+  /* The pipe transfer has a pitch rounded up to the nearest 64 pixels.  */
+  ximage_stride = w * ((img->bits_per_pixel + 7) / 8);
 
   for (line = 0; line < h; line++)
  memcpy(&map[line * tex_xfer->stride],
-- 
1.7.11.7

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] st/xlib: Fix XImage stride calculation

2013-06-14 Thread Richard Sandiford
Fixes window skew seen while running gnome on a 16-bit screen over vnc.

Signed-off-by: Richard Sandiford 
---
 src/gallium/state_trackers/glx/xlib/xm_api.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c b/src/gallium/state_trackers/glx/xlib/xm_api.c
index 36ebb46..ca717bd 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_api.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
@@ -1381,7 +1381,7 @@ XMesaBindTexImage(Display *dpy, XMesaBuffer drawable, int buffer,
   enum pipe_format internal_format = res->format;
   struct pipe_transfer *tex_xfer;
   char *map;
-  int line, ximage_stride;
+  int line, byte_width;
   XImage *img;
 
   internal_format = choose_pixel_format(drawable->xm_visual);
@@ -1408,12 +1408,12 @@ XMesaBindTexImage(Display *dpy, XMesaBuffer drawable, int buffer,
   }
 
   /* The pipe transfer has a pitch rounded up to the nearest 64 pixels.  */
-  ximage_stride = w * ((img->bits_per_pixel + 7) / 8);
+  byte_width = w * ((img->bits_per_pixel + 7) / 8);
 
   for (line = 0; line < h; line++)
  memcpy(&map[line * tex_xfer->stride],
-&img->data[line * ximage_stride],
-ximage_stride);
+&img->data[line * img->bytes_per_line],
+byte_width);
 
   pipe_transfer_unmap(pipe, tex_xfer);
 
-- 
1.7.11.7
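
Editorial note: the two xlib patches above separate three quantities that the old code conflated: the bytes per pixel (rounded up from bits_per_pixel), the number of visible bytes in a row (byte_width), and the distance between row starts in each buffer (img->bytes_per_line for the XImage, tex_xfer->stride for the transfer). The following is a minimal standalone sketch of that copy loop, not the Mesa code itself; the names copy_rows and bytes_per_pixel are hypothetical and plain byte buffers stand in for the XImage and the mapped transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes needed to store one pixel, rounding partial bytes up
 * (for example 15 or 16 bits per pixel -> 2 bytes). */
static size_t bytes_per_pixel(unsigned bits_per_pixel)
{
   return (bits_per_pixel + 7) / 8;
}

/* Copy h rows of w pixels between two buffers whose rows may be padded
 * differently.  Only the visible bytes of each row are copied, and each
 * buffer is stepped by its own stride.  (Hypothetical helper.) */
static void copy_rows(unsigned char *dst, size_t dst_stride,
                      const unsigned char *src, size_t src_stride,
                      unsigned w, unsigned h, unsigned bits_per_pixel)
{
   size_t byte_width = (size_t)w * bytes_per_pixel(bits_per_pixel);
   unsigned line;

   /* A row must fit inside the stride of both buffers. */
   assert(byte_width <= dst_stride && byte_width <= src_stride);

   for (line = 0; line < h; line++)
      memcpy(dst + line * dst_stride, src + line * src_stride, byte_width);
}
```

Using the source stride as the copy length, or the copy length as the source stride, only happens to work when rows carry no padding, which would explain why the skew showed up on a 16-bit screen.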



Re: [Mesa-dev] [PATCH 7/8] util: Expand the comment above the channel[] array

2013-06-14 Thread Richard Sandiford
Will Schmidt  writes:
> On Thu, 2013-06-13 at 14:50 +0100, Richard Sandiford wrote:
>>
>
> The entirety of the comment looks pretty good to me.  :-) One
> question, and this is mostly curiosity on my part, I'm not specifically
> asking for another revision. 
>
>> * (This is the same as C bitfield layout on most ABIs.)
>
> Do we have a handle on what 'most ABIs' are?   I.e. would this include
> X86* and PPC* ABIs as we know them today, or do we already clearly
> understand which ones would not match?

I think it includes all ABIs supported by GCC, including the various x86
and ppc ones like you say.  I should probably have dropped the bitfield
thing altogether though.

Thanks,
Richard



[Mesa-dev] [PATCH 1/3] glsl/linker: Use correct array length when linking inter-stage uniforms and varyings.

2013-06-14 Thread Fabian Bieler
Signed-off-by: Fabian Bieler 
---
 src/glsl/linker.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index cd8d680..e3a8ccd 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -1147,7 +1147,7 @@ update_array_sizes(struct gl_shader_program *prog)
}
 }
 
-if (size + 1 != var->type->fields.array->length) {
+if (size + 1 != var->type->length) {
/* If this is a built-in uniform (i.e., it's backed by some
 * fixed-function state), adjust the number of state slots to
 * match the new array size.  The number of slots per array entry
-- 
1.8.1.2



[Mesa-dev] [PATCH 2/3] glsl: Only call mark_whole_array_access for arrays.

2013-06-14 Thread Fabian Bieler
Otherwise the max_array_access field of scalar variables is set to 0x.
This doesn't lead to any errors since that field isn't used for scalar
variables but leaving it at zero is probably better.

Signed-off-by: Fabian Bieler 
---
 src/glsl/ast_to_hir.cpp | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index e918ade..f14a5b1 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -656,6 +656,8 @@ validate_assignment(struct _mesa_glsl_parse_state *state,
 static void
 mark_whole_array_access(ir_rvalue *access)
 {
+   assert(access->type->is_array());
+
ir_dereference_variable *deref = access->as_dereference_variable();
 
if (deref && deref->var) {
@@ -763,8 +765,10 @@ do_assignment(exec_list *instructions, struct _mesa_glsl_parse_state *state,
   rhs->type->array_size());
 d->type = var->type;
   }
-  mark_whole_array_access(rhs);
-  mark_whole_array_access(lhs);
+  if (rhs->type->is_array()) {
+mark_whole_array_access(rhs);
+mark_whole_array_access(lhs);
+  }
}
 
/* Most callers of do_assignment (assign, add_assign, pre_inc/dec,
-- 
1.8.1.2



[Mesa-dev] [PATCH 3/3] mesa/main: Check for 0 size draws after validation.

2013-06-14 Thread Fabian Bieler
When validating draw parameters, move the check for a zero draw count to the
end (drawing with a count of 0 is not an error), so that the other parameters
(e.g. the primitive type) are still validated and the correct errors, if any,
are generated.

From the OpenGL 3.3 spec page 33 (page 48 of the PDF):
"[Regarding DrawArraysOneInstance, in terms of which other draw operations
are defined:]
If count is negative, an INVALID_VALUE error is generated."

This patch also changes the behavior of MultiDrawElements so that the draw
operation is still performed if some of the primitives' index counts are zero.
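
Editorial note: the ordering described above can be sketched in isolation. This is not the Mesa validation code; validate_draw, the enums and MODE_MAX are hypothetical stand-ins. Unlike the patch, which returns GL_FALSE both for errors and for a zero count, the sketch uses a three-way result so the two cases are visibly distinct.

```c
#include <assert.h>

enum draw_status {
   DRAW_ERROR,   /* an error was recorded; do not draw */
   DRAW_NOOP,    /* parameters are valid but there is nothing to draw */
   DRAW_OK       /* parameters are valid; go ahead and draw */
};

enum draw_error { ERR_NONE, ERR_INVALID_ENUM, ERR_INVALID_VALUE };

#define MODE_MAX 9u  /* hypothetical: highest valid primitive mode */

/* Hypothetical validator: every error check runs before the
 * zero-count early-out, so a bad mode is reported even if count == 0. */
static enum draw_status
validate_draw(unsigned mode, int count, enum draw_error *err)
{
   *err = ERR_NONE;

   if (mode > MODE_MAX) {
      *err = ERR_INVALID_ENUM;
      return DRAW_ERROR;
   }

   if (count < 0) {
      *err = ERR_INVALID_VALUE;
      return DRAW_ERROR;
   }

   /* Only reached once all other parameters have been validated. */
   if (count == 0)
      return DRAW_NOOP;

   return DRAW_OK;
}
```

With the old ordering, a call with an invalid mode and a count of 0 would have returned early without recording any error.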

Signed-off-by: Fabian Bieler 
---
 src/mesa/main/api_validate.c  | 51 +--
 src/mesa/vbo/vbo_exec_array.c | 12 ++
 2 files changed, 42 insertions(+), 21 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 7ab8e30..0770f94 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -334,9 +334,8 @@ _mesa_validate_DrawElements(struct gl_context *ctx,
   return GL_FALSE;
}
 
-   if (count <= 0) {
-  if (count < 0)
-_mesa_error(ctx, GL_INVALID_VALUE, "glDrawElements(count)" );
+   if (count < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "glDrawElements(count)" );
   return GL_FALSE;
}
 
@@ -368,6 +367,9 @@ _mesa_validate_DrawElements(struct gl_context *ctx,
if (!check_index_bounds(ctx, count, type, indices, basevertex))
   return GL_FALSE;
 
+   if (count == 0)
+  return GL_FALSE;
+
return GL_TRUE;
 }
 
@@ -388,10 +390,9 @@ _mesa_validate_MultiDrawElements(struct gl_context *ctx,
FLUSH_CURRENT(ctx, 0);
 
for (i = 0; i < primcount; i++) {
-  if (count[i] <= 0) {
- if (count[i] < 0)
-_mesa_error(ctx, GL_INVALID_VALUE,
-"glMultiDrawElements(count)" );
+  if (count[i] < 0) {
+ _mesa_error(ctx, GL_INVALID_VALUE,
+ "glMultiDrawElements(count)" );
  return GL_FALSE;
   }
}
@@ -463,9 +464,8 @@ _mesa_validate_DrawRangeElements(struct gl_context *ctx, GLenum mode,
   return GL_FALSE;
}
 
-   if (count <= 0) {
-  if (count < 0)
-_mesa_error(ctx, GL_INVALID_VALUE, "glDrawRangeElements(count)" );
+   if (count < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "glDrawRangeElements(count)" );
   return GL_FALSE;
}
 
@@ -502,6 +502,9 @@ _mesa_validate_DrawRangeElements(struct gl_context *ctx, GLenum mode,
if (!check_index_bounds(ctx, count, type, indices, basevertex))
   return GL_FALSE;
 
+   if (count == 0)
+  return GL_FALSE;
+
return GL_TRUE;
 }
 
@@ -519,9 +522,8 @@ _mesa_validate_DrawArrays(struct gl_context *ctx,
   = ctx->TransformFeedback.CurrentObject;
FLUSH_CURRENT(ctx, 0);
 
-   if (count <= 0) {
-  if (count < 0)
- _mesa_error(ctx, GL_INVALID_VALUE, "glDrawArrays(count)" );
+   if (count < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE, "glDrawArrays(count)" );
   return GL_FALSE;
}
 
@@ -560,6 +562,9 @@ _mesa_validate_DrawArrays(struct gl_context *ctx,
   xfb_obj->GlesRemainingPrims -= prim_count;
}
 
+   if (count == 0)
+  return GL_FALSE;
+
return GL_TRUE;
 }
 
@@ -572,10 +577,9 @@ _mesa_validate_DrawArraysInstanced(struct gl_context *ctx, GLenum mode, GLint fi
   = ctx->TransformFeedback.CurrentObject;
FLUSH_CURRENT(ctx, 0);
 
-   if (count <= 0) {
-  if (count < 0)
- _mesa_error(ctx, GL_INVALID_VALUE,
- "glDrawArraysInstanced(count=%d)", count);
+   if (count < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  "glDrawArraysInstanced(count=%d)", count);
   return GL_FALSE;
}
 
@@ -628,6 +632,9 @@ _mesa_validate_DrawArraysInstanced(struct gl_context *ctx, GLenum mode, GLint fi
   xfb_obj->GlesRemainingPrims -= prim_count;
}
 
+   if (count == 0)
+  return GL_FALSE;
+
return GL_TRUE;
 }
 
@@ -653,10 +660,9 @@ _mesa_validate_DrawElementsInstanced(struct gl_context *ctx,
   return GL_FALSE;
}
 
-   if (count <= 0) {
-  if (count < 0)
-_mesa_error(ctx, GL_INVALID_VALUE,
- "glDrawElementsInstanced(count=%d)", count);
+   if (count < 0) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  "glDrawElementsInstanced(count=%d)", count);
   return GL_FALSE;
}
 
@@ -693,6 +699,9 @@ _mesa_validate_DrawElementsInstanced(struct gl_context *ctx,
  return GL_FALSE;
}
 
+   if (count == 0)
+  return GL_FALSE;
+
if (!check_index_bounds(ctx, count, type, indices, basevertex))
   return GL_FALSE;
 
diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
index 9dadd04..c08 100644
--- a/src/mesa/vbo/vbo_exec_array.c
+++ b/src/mesa/vbo/vbo_exec_array.c
@@ -1298,6 +1298,16 @@ vbo_validated_multidrawelements(struct gl_context *ctx, GLenum mode,
   }
}
 
+   /* Draw primitives individually if one count is zero, so we can easily skip
+* that primit

Re: [Mesa-dev] Pull request for 1.50 GS layout qualifiers

2013-06-14 Thread Fabian Bieler
Hello!

I gave your series a try and found two small nitpicks:

In "glsl: Parse the GLSL 1.50 GS layout qualifiers.":
- There are two debug printfs left in "ast_type_qualifier::merge_qualifier".
- max_vertices or VerticesOut is not checked against MaxGeometryOutputVertices.

Fabian

On 2013-06-14 03:15, Eric Anholt wrote:
> Hey Paul!  I got the layout qualifiers working.  It's unblocked things
> so I can finish off a bunch of testcases I've been working on, so I'd
> like to get it in your gs branch so we can all enjoy testcases together.
> 
> There's one not-for-upstream commit in here, and do note the TODO in the
> last commit (and we need tests for this feature still).  Oh, and there's
> one little prep commit for UBOs, too.
> 
> I'm planning on sending out these commits:
>   glsl: Make _mesa_print_ir() available from anything including ir.h.
>   glsl: Remove ir_print_visitor.h includes and usage
>   mesa: Use shared code for converting shader targets to short strings.
>   mesa: Move the common _mesa_glsl_compile_shader() code to glsl/.
> 
> for review, plus a port of your "Make files buildable from C" to the
> list, since they seem like a good cleanup, together.
> 
> The following changes since commit 4e6d6dbfab79d9e7aff5d26c585d6e77b36db0f2:
> 
>   !UPSTREAM: Handle GS_OPCODE_THREAD_END in implied_mrf_writes() (2013-06-12 11:09:01 -0700)
> 
> are available in the git repository at:
> 
>   git://people.freedesktop.org/~anholt/mesa gs-qualifiers
> 
> for you to fetch changes up to dbe3e86de06813ea0619dd9035f328372c9caab2:
> 
>   glsl: Cross-validate GS layout qualifiers while intrastage linking. (2013-06-13 18:04:29 -0700)
> 
> 
> Eric Anholt (11):
>   mesa: Expose uniform buffers in geometry shaders.
>   glsl: Make _mesa_print_ir() available from anything including ir.h.
>   glsl: Remove ir_print_visitor.h includes and usage
>   mesa: Use shared code for converting shader targets to short strings.
>   mesa: Move the common _mesa_glsl_compile_shader() code to glsl/.
>   glsl: Include EmitVertex() and EndPrimitive() prototypes for GLSL 1.50 GS.
>   glsl: !UPSTREAM: Spam in builtin 1.30 variables for 1.50 GSes.
>   glsl: Make sure that we don't put too many bitfields in ast_type_qualifier.
>   glsl: Parse the GLSL 1.50 GS layout qualifiers.
>   glsl: Export the compiler's GS layout qualifiers to the gl_shader.
>   glsl: Cross-validate GS layout qualifiers while intrastage linking.
> 
>  src/glsl/ast.h |  12 ++
>  src/glsl/ast_to_hir.cpp|   2 +
>  src/glsl/ast_type.cpp  |  23 
>  src/glsl/builtin_variables.cpp |   6 +
>  src/glsl/builtins/profiles/150.geom|   3 +
>  src/glsl/glsl_parser.yy|  69 +-
>  src/glsl/glsl_parser_extras.cpp| 153 
> -
>  src/glsl/glsl_parser_extras.h  |  11 ++
>  src/glsl/ir.h  |   8 ++
>  src/glsl/ir_print_visitor.cpp  |   3 +
>  src/glsl/ir_print_visitor.h|   3 -
>  src/glsl/ir_rvalue_visitor.cpp |   1 -
>  src/glsl/link_varyings.cpp |  12 +-
>  src/glsl/linker.cpp| 115 +---
>  src/glsl/linker.h  |   3 -
>  src/glsl/main.cpp  |  60 +---
>  src/glsl/opt_array_splitting.cpp   |   1 -
>  src/glsl/opt_noop_swizzle.cpp  |   1 -
>  src/glsl/opt_structure_splitting.cpp   |   1 -
>  src/glsl/program.h |  16 ++-
>  src/glsl/test_optpass.cpp  |   1 -
>  src/mesa/drivers/dri/i965/brw_fs.cpp   |   1 -
>  src/mesa/drivers/dri/i965/brw_fs_emit.cpp  |   1 -
>  src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp  |   1 -
>  .../drivers/dri/i965/brw_fs_vector_splitting.cpp   |   1 -
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |   5 +-
>  .../drivers/dri/i965/brw_schedule_instructions.cpp |   1 -
>  src/mesa/drivers/dri/i965/brw_shader.cpp   |  10 +-
>  src/mesa/drivers/dri/i965/brw_vec4.cpp |   1 -
>  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp  |   1 -
>  .../drivers/dri/i965/brw_vec4_reg_allocate.cpp |   1 -
>  src/mesa/main/ff_fragment_shader.cpp   |   1 -
>  src/mesa/main/mtypes.h |  18 +++
>  src/mesa/main/shaderapi.c  |  74 ++
>  src/mesa/main/uniform_query.cpp|   9 +-
>  src/mesa/program/ir_to_mesa.cpp| 102 +-
>  src/mesa/program/ir_to_mesa.h  |   1 -
>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  

Re: [Mesa-dev] [PATCH] draw: don't clear the so targets until we stream out

2013-06-14 Thread Brian Paul

On 06/13/2013 07:01 PM, Zack Rusin wrote:

Though I find stream output very confusing...


I agree. I was digging a bit more and I think I was correct the first time.
The D3D spec is very clear that "a buffer cannot be bound as both an input
and an output at the same time", so I think the current behavior is correct,
or at least one of the correct options given that the behavior is simply
undefined. So I think I'm going to skip this patch, especially since it is
subtly wrong (because it will clear so target buffers on each invocation of
the stream output stage which isn't correct behavior since the buffers
should only be cleared when new so targets are set).


Actually I'd just like to commit the attached patch. All it does is move
the clearing of the so targets from the drivers to the draw module. It fixes
a bug in softpipe, because softpipe would never clear the buffers and would
always append.




diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
index 22c0e9b..53f515e 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -809,12 +809,20 @@ draw_get_rasterizer_no_cull( struct draw_context *draw,
 void
 draw_set_mapped_so_targets(struct draw_context *draw,
int num_targets,
-   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS])
+   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
+   unsigned append_bitmask)
 {


Can you document how the append_bitmask works?

Otherwise, LGTM.

Reviewed-by: Brian Paul 
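
Editorial note on the append_bitmask question: judging only from the clean_so_buffers() hunk of the patch as quoted elsewhere in this thread, bit i of the mask appears to mean "keep appending to stream-output target i", and a clear bit means the target's write position is reset before streaming out. A minimal sketch under that assumption follows; struct so_target, MAX_SO_BUFFERS and reset_non_appending_targets are simplified, hypothetical stand-ins for the draw module's types, not its API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SO_BUFFERS 4   /* stand-in for PIPE_MAX_SO_BUFFERS */

/* Simplified stand-in for struct draw_so_target. */
struct so_target {
   unsigned internal_offset;   /* where the next write would land */
   unsigned emitted_vertices;  /* vertices written so far */
};

/* Assumed semantics: bit i set   -> append to target i, keep its state;
 *                    bit i clear -> restart target i from the beginning. */
static void
reset_non_appending_targets(struct so_target *targets[MAX_SO_BUFFERS],
                            unsigned num_targets, unsigned append_bitmask)
{
   unsigned i;

   assert(num_targets <= MAX_SO_BUFFERS);

   for (i = 0; i < num_targets; i++) {
      if (!(append_bitmask & (1u << i)) && targets[i]) {
         targets[i]->internal_offset = 0;
         targets[i]->emitted_vertices = 0;
      }
   }
}
```

For example, a mask of 0x2 with two bound targets would reset target 0 and leave target 1 appending.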



Re: [Mesa-dev] [PATCH 2/2] st/xlib: Fix XImage stride calculation

2013-06-14 Thread Brian Paul

On 06/14/2013 04:11 AM, Richard Sandiford wrote:

Fixes window skew seen while running gnome on a 16-bit screen over vnc.

Signed-off-by: Richard Sandiford 
---
  src/gallium/state_trackers/glx/xlib/xm_api.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/state_trackers/glx/xlib/xm_api.c b/src/gallium/state_trackers/glx/xlib/xm_api.c
index 36ebb46..ca717bd 100644
--- a/src/gallium/state_trackers/glx/xlib/xm_api.c
+++ b/src/gallium/state_trackers/glx/xlib/xm_api.c
@@ -1381,7 +1381,7 @@ XMesaBindTexImage(Display *dpy, XMesaBuffer drawable, int buffer,
enum pipe_format internal_format = res->format;
struct pipe_transfer *tex_xfer;
char *map;
-  int line, ximage_stride;
+  int line, byte_width;
XImage *img;

internal_format = choose_pixel_format(drawable->xm_visual);
@@ -1408,12 +1408,12 @@ XMesaBindTexImage(Display *dpy, XMesaBuffer drawable, int buffer,
}

/* The pipe transfer has a pitch rounded up to the nearest 64 pixels.  */
-  ximage_stride = w * ((img->bits_per_pixel + 7) / 8);
+  byte_width = w * ((img->bits_per_pixel + 7) / 8);

for (line = 0; line < h; line++)
   memcpy(&map[line * tex_xfer->stride],
-&img->data[line * ximage_stride],
-ximage_stride);
+&img->data[line * img->bytes_per_line],
+byte_width);

pipe_transfer_unmap(pipe, tex_xfer);




For both, Reviewed-by: Brian Paul 

Should probably be tagged as candidates for the stable branch.



[Mesa-dev] [PATCH] R600: Add SI load support for v[24]i32 and store for v2i32

2013-06-14 Thread Aaron Watry
Also add a separate vector lit test file, since r600 doesn't seem to handle
v2i32 load/store yet, but we can test both for SI.

Signed-off-by: Aaron Watry 
---
 lib/Target/R600/SIInstructions.td |  5 +
 test/CodeGen/R600/load.vec.ll | 19 +++
 2 files changed, 24 insertions(+)
 create mode 100644 test/CodeGen/R600/load.vec.ll

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index e8ed2dd..9c96c08 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -1638,6 +1638,10 @@ defm : MUBUFLoad_Pattern ;
 defm : MUBUFLoad_Pattern ;
+defm : MUBUFLoad_Pattern ;
+defm : MUBUFLoad_Pattern ;
 
 multiclass MUBUFStore_Pattern  {
 
@@ -1654,6 +1658,7 @@ multiclass MUBUFStore_Pattern  
{
 
 defm : MUBUFStore_Pattern ;
 defm : MUBUFStore_Pattern ;
+defm : MUBUFStore_Pattern ;
 defm : MUBUFStore_Pattern ;
 
 /** == **/
diff --git a/test/CodeGen/R600/load.vec.ll b/test/CodeGen/R600/load.vec.ll
new file mode 100644
index 000..08e034e
--- /dev/null
+++ b/test/CodeGen/R600/load.vec.ll
@@ -0,0 +1,19 @@
+; RUN: llc < %s -march=r600 -mcpu=SI | FileCheck --check-prefix=SI-CHECK  %s
+
+; load a v2i32 value from the global address space.
+; SI-CHECK: @load_v2i32
+; SI-CHECK: BUFFER_LOAD_DWORDX2 VGPR{{[0-9]+}}
+define void @load_v2i32(<2 x i32> addrspace(1)* %out, <2 x i32> addrspace(1)* %in) {
+  %a = load <2 x i32> addrspace(1) * %in
+  store <2 x i32> %a, <2 x i32> addrspace(1)* %out
+  ret void
+}
+
+; load a v4i32 value from the global address space.
+; SI-CHECK: @load_v4i32
+; SI-CHECK: BUFFER_LOAD_DWORDX4 VGPR{{[0-9]+}}
+define void @load_v4i32(<4 x i32> addrspace(1)* %out, <4 x i32> addrspace(1)* %in) {
+  %a = load <4 x i32> addrspace(1) * %in
+  store <4 x i32> %a, <4 x i32> addrspace(1)* %out
+  ret void
+}
-- 
1.8.1.2



Re: [Mesa-dev] [PATCH] draw: don't clear the so targets until we stream out

2013-06-14 Thread Jose Fonseca
Sounds good to me.

Jose

- Original Message -
> Since draw auto fetches the count from the buffers, we can't
> just clear them on bind, we need to wait until the actual
> stream out is performed. Otherwise the count for draw auto
> will be zero. Plus it is cleaner to have draw do it rather
> than drivers having to mess with draw's internals.
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/draw/draw_context.c |4 +++-
>  src/gallium/auxiliary/draw/draw_context.h |3 ++-
>  src/gallium/auxiliary/draw/draw_private.h |1 +
>  src/gallium/auxiliary/draw/draw_pt_so_emit.c  |   20 
>  src/gallium/drivers/llvmpipe/lp_context.h |1 +
>  src/gallium/drivers/llvmpipe/lp_draw_arrays.c |4 ++--
>  src/gallium/drivers/llvmpipe/lp_state_so.c|8 ++--
>  src/gallium/drivers/softpipe/sp_context.h |1 +
>  src/gallium/drivers/softpipe/sp_draw_arrays.c |4 ++--
>  src/gallium/drivers/softpipe/sp_state_so.c|1 +
>  10 files changed, 35 insertions(+), 12 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c
> index 4a08765..f463739 100644
> --- a/src/gallium/auxiliary/draw/draw_context.c
> +++ b/src/gallium/auxiliary/draw/draw_context.c
> @@ -810,7 +810,8 @@ draw_get_rasterizer_no_cull( struct draw_context *draw,
>  void
>  draw_set_mapped_so_targets(struct draw_context *draw,
> int num_targets,
> -   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS])
> +   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
> +   unsigned append_bitmask)
>  {
> int i;
>  
> @@ -820,6 +821,7 @@ draw_set_mapped_so_targets(struct draw_context *draw,
>draw->so.targets[i] = NULL;
>  
> draw->so.num_targets = num_targets;
> +   draw->so.append_bitmask = append_bitmask;
>  }
>  
>  void
> diff --git a/src/gallium/auxiliary/draw/draw_context.h b/src/gallium/auxiliary/draw/draw_context.h
> index 4a1b27e..ae63068 100644
> --- a/src/gallium/auxiliary/draw/draw_context.h
> +++ b/src/gallium/auxiliary/draw/draw_context.h
> @@ -231,7 +231,8 @@ draw_set_mapped_constant_buffer(struct draw_context *draw,
>  void
>  draw_set_mapped_so_targets(struct draw_context *draw,
> int num_targets,
> -   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS]);
> +   struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS],
> +   unsigned append_bitmask);
>  
>  
>  /***
> diff --git a/src/gallium/auxiliary/draw/draw_private.h b/src/gallium/auxiliary/draw/draw_private.h
> index fd52c2d..4dda90e 100644
> --- a/src/gallium/auxiliary/draw/draw_private.h
> +++ b/src/gallium/auxiliary/draw/draw_private.h
> @@ -290,6 +290,7 @@ struct draw_context
> struct {
>struct draw_so_target *targets[PIPE_MAX_SO_BUFFERS];
>uint num_targets;
> +  uint append_bitmask;
> } so;
>  
> /* Clip derived state:
> diff --git a/src/gallium/auxiliary/draw/draw_pt_so_emit.c b/src/gallium/auxiliary/draw/draw_pt_so_emit.c
> index d624a99..785aa34 100644
> --- a/src/gallium/auxiliary/draw/draw_pt_so_emit.c
> +++ b/src/gallium/auxiliary/draw/draw_pt_so_emit.c
> @@ -77,6 +77,24 @@ draw_has_so(const struct draw_context *draw)
> return FALSE;
>  }
>  
> +static void
> +clean_so_buffers(struct pt_so_emit *emit)
> +{
> +   struct draw_context *draw = emit->draw;
> +   unsigned i;
> +
> +   debug_assert(emit->has_so);
> +
> +   for (i = 0; i < draw->so.num_targets; i++) {
> +  /* if we're not appending then lets reset the internal
> + data of our so target */
> +  if (!(draw->so.append_bitmask & (1 << i)) && draw->so.targets[i]) {
> + draw->so.targets[i]->internal_offset = 0;
> + draw->so.targets[i]->emitted_vertices = 0;
> +  }
> +   }
> +}
> +
>  void draw_pt_so_emit_prepare(struct pt_so_emit *emit, boolean use_pre_clip_pos)
>  {
> struct draw_context *draw = emit->draw;
> @@ -257,6 +275,8 @@ void draw_pt_so_emit( struct pt_so_emit *emit,
> if (!draw->so.num_targets)
>return;
>  
> +   clean_so_buffers(emit);
> +
> emit->emitted_vertices = 0;
> emit->emitted_primitives = 0;
> emit->generated_primitives = 0;
> diff --git a/src/gallium/drivers/llvmpipe/lp_context.h b/src/gallium/drivers/llvmpipe/lp_context.h
> index abfe852..0515968 100644
> --- a/src/gallium/drivers/llvmpipe/lp_context.h
> +++ b/src/gallium/drivers/llvmpipe/lp_context.h
> @@ -91,6 +91,7 @@ struct llvmpipe_context {
>  
> struct draw_so_target *so_targets[PIPE_MAX_SO_BUFFERS];
> int num_so_targets;
> +   unsigned so_append_bitmask;
> struct pipe_query_data_so_statistics so_stats;
> unsigned num_primitives_generated;
>  
> diff --git a/src/gallium/drivers/llvmpipe/lp_draw_array

Re: [Mesa-dev] R600 Patches: Add support for the local address space

2013-06-14 Thread Vincent Lejeune
Hi,

Thanks for your work on this!
Patches 2, 4 and 5 have my Reviewed-by.


>diff --git a/lib/Target/R600/R600InstrInfo.cpp b/lib/Target/R600/R600InstrInfo.cpp
>index b9da74c..6de47f7 100644
>--- a/lib/Target/R600/R600InstrInfo.cpp
>+++ b/lib/Target/R600/R600InstrInfo.cpp
>@@ -133,6 +133,12 @@ bool R600InstrInfo::isCubeOp(unsigned Opcode) const {
> bool R600InstrInfo::isALUInstr(unsigned Opcode) const {
>   unsigned TargetFlags = get(Opcode).TSFlags;
>+  return (TargetFlags & R600_InstFlag::ALU_INST);
>+}
>+
>+bool R600InstrInfo::hasInstrModifiers(unsigned Opcode) const {
>+  unsigned TargetFlags = get(Opcode).TSFlags;
>+
>   return ((TargetFlags & R600_InstFlag::OP1) |
>   (TargetFlags & R600_InstFlag::OP2) |
>   (TargetFlags & R600_InstFlag::OP3));
Function prototype is not defined here (it is defined in patch 5).



>diff --git a/lib/Target/R600/R600MachineScheduler.cpp b/lib/Target/R600/R600MachineScheduler.cpp
>index a330d88..acc1b4d 100644
>--- a/lib/Target/R600/R600MachineScheduler.cpp
>+++ b/lib/Target/R600/R600MachineScheduler.cpp
>@@ -269,10 +269,14 @@ R600SchedStrategy::AluKind R600SchedStrategy::getAluKind(SUnit *SU) const {
> }
> 
> // Does the instruction take a whole IG ?
>+// XXX: Is it possible to add a helper function in R600InstrInfo that can
>+// be used here and in R600PacketizerList::isSoloInstruction() ?
> if(TII->isVector(*MI) ||
> TII->isCubeOp(MI->getOpcode()) ||
>-TII->isReductionOp(MI->getOpcode()))
>+TII->isReductionOp(MI->getOpcode()) ||
>+MI->getOpcode() == AMDGPU::GROUP_BARRIER) {
>   return AluT_XYZW;
>+}

I'm not sure it would factor out that much code: R600Packetizer runs after
cube/reduction ops have been lowered by the R600Expand pass, so the
isVector/isReductionOp check is useless there. I may have left some debug
code in isSoloInstruction though.



- Original Message -
> From: Tom Stellard 
> To: llvm-comm...@cs.uiuc.edu
> Cc: mesa-dev@lists.freedesktop.org
> Sent: Thursday, June 13, 2013, 02:42
> Subject: [Mesa-dev] R600 Patches: Add support for the local address space
> 
> Hi,
> 
> The attached patches add support for local address space on
> Evergreen / Northern Islands GPUs.
> 
> Please Review.
> 
> -Tom
> 


Re: [Mesa-dev] [PATCH] mesa: per-texture locking

2013-06-14 Thread Eric Anholt
Frank Henigman  writes:

> On Wed, Jun 12, 2013 at 1:33 PM, Eric Anholt  wrote:
>> glCopyTexSubImage was just an example of "reads from texture and writes
>> to texture, thanks to FBOs" -- you've also got the normal drawing path,
>> glCopyPixels, glDrawPixels, glBitmap, glBlitFramebuffer.  If zero-copy
>> PBOs are reintroduced, then glReadPixels() and glGetTexImage() are
>> concerns.
>>
>
> There seem to be two categories here: operations on one texture and
> operations on multiple textures.  For the former, I don't think my patch
> does any harm.  There may be places where locking is lacking but those
> have been there all along and should be fixable by adding locking.
> In the second category there's work to do for glCopyTexSubImage() and the
> other functions you mention, but they seem fixable, even for insane usage.
> As for "the normal drawing path" I don't think I've harmed that either.
> It doesn't look like any driver holds the lock during rendering (except the
> intel 915 driver does "lock all textures" in intelRunPipeline() for reasons
> I don't understand) therefore they don't become less safe in the way
> glCopyTexSubImage() did.

Just because drivers aren't doing locking doesn't mean they don't need
it.  For example, right now TexImage is done by freeing the old buffer,
then some time later making a new buffer.  When you're drawing from that
texture in another thread, you should receive either old or new
(depending on when you bound, or maybe new even when you bound with
old), but instead if you draw at the wrong time you just segfault
because the image storage was gone when it absolutely shouldn't have
been.

(And just dropping the FreeTextureImageBuffer isn't enough -- if you
draw with old with new's size or the other way around, you may scribble
over other GPU memory)

I don't want to see locking pushdown when we know our current global
locking is thoroughly broken, and the locking pushdown would make fixing
things more difficult (ABBA problems).
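
Editorial note: the "ABBA problems" mentioned above arise when one thread takes lock A and then B while another takes B and then A, which can happen as soon as an operation needs two per-texture locks at once. One common way to avoid it, shown here purely as an illustration and not as anything proposed in this thread, is to always acquire the pair in a fixed global order, for instance by address. struct texture and the helper names are hypothetical, not Mesa types.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical texture object carrying its own lock. */
struct texture {
   pthread_mutex_t lock;
   int width;
};

/* Lock two textures in a globally consistent order (lowest address
 * first), so two threads locking the same pair in opposite argument
 * order cannot deadlock against each other. */
static void lock_texture_pair(struct texture *a, struct texture *b)
{
   assert(a && b);

   if (a == b) {
      pthread_mutex_lock(&a->lock);
   } else if ((uintptr_t)a < (uintptr_t)b) {
      pthread_mutex_lock(&a->lock);
      pthread_mutex_lock(&b->lock);
   } else {
      pthread_mutex_lock(&b->lock);
      pthread_mutex_lock(&a->lock);
   }
}

static void unlock_texture_pair(struct texture *a, struct texture *b)
{
   pthread_mutex_unlock(&a->lock);
   if (a != b)
      pthread_mutex_unlock(&b->lock);
}

/* Example two-texture operation: read from src, write to dst. */
static void copy_width(struct texture *dst, struct texture *src)
{
   lock_texture_pair(dst, src);
   dst->width = src->width;
   unlock_texture_pair(dst, src);
}
```

This only addresses lock ordering; it does nothing about the lifetime problem described above, where image storage is freed while another thread is still drawing from it.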




[Mesa-dev] [PATCH] gallium: add condition parameter to render_condition

2013-06-14 Thread sroland
From: Roland Scheidegger 

For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).
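
Editorial note: one way to read the description above is that rendering is skipped when the predicate's boolean result equals the new condition argument, so condition = FALSE reproduces the old occlusion-predicate behaviour and condition = TRUE the old so_overflow behaviour. That reading is an assumption based on this message alone, and should_render below is a hypothetical helper rather than a Gallium function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of Gallium.
 * have_query:   whether a render condition is set at all
 * query_result: boolean result of the predicate query
 * condition:    the query result on which rendering is skipped (assumed) */
static bool should_render(bool have_query, bool query_result, bool condition)
{
   if (!have_query)
      return true;               /* no predicate: always render */

   return query_result != condition;
}
```

Under this reading a software driver would need only one extra comparison, which fits the remark that all drivers could presumably support it trivially.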
---
 src/gallium/auxiliary/cso_cache/cso_context.c |   13 ++---
 src/gallium/auxiliary/cso_cache/cso_context.h |3 ++-
 src/gallium/auxiliary/hud/hud_context.c   |2 +-
 src/gallium/auxiliary/postprocess/pp_run.c|2 +-
 src/gallium/auxiliary/util/u_blit.c   |2 +-
 src/gallium/auxiliary/util/u_blitter.c|3 ++-
 src/gallium/auxiliary/util/u_blitter.h|3 +++
 src/gallium/auxiliary/util/u_gen_mipmap.c |2 +-
 src/gallium/docs/source/context.rst   |   14 +-
 src/gallium/drivers/galahad/glhd_context.c|3 ++-
 src/gallium/drivers/ilo/ilo_3d.c  |4 +++-
 src/gallium/drivers/ilo/ilo_3d.h  |1 +
 src/gallium/drivers/llvmpipe/lp_context.c |2 ++
 src/gallium/drivers/llvmpipe/lp_context.h |1 +
 src/gallium/drivers/llvmpipe/lp_query.c   |3 ++-
 src/gallium/drivers/llvmpipe/lp_surface.c |2 +-
 src/gallium/drivers/nv30/nv30_context.h   |1 +
 src/gallium/drivers/nv30/nv30_miptree.c   |2 +-
 src/gallium/drivers/nv30/nv30_query.c |4 +++-
 src/gallium/drivers/nv50/nv50_context.h   |1 +
 src/gallium/drivers/nv50/nv50_query.c |4 +++-
 src/gallium/drivers/nv50/nv50_surface.c   |2 +-
 src/gallium/drivers/nvc0/nvc0_context.h   |1 +
 src/gallium/drivers/nvc0/nvc0_query.c |4 +++-
 src/gallium/drivers/nvc0/nvc0_surface.c   |2 +-
 src/gallium/drivers/r300/r300_query.c |7 ---
 src/gallium/drivers/r600/r600_blit.c  |1 +
 src/gallium/drivers/r600/r600_pipe.c  |6 --
 src/gallium/drivers/r600/r600_pipe.h  |1 +
 src/gallium/drivers/r600/r600_query.c |2 ++
 src/gallium/drivers/radeonsi/r600_blit.c  |4 +++-
 src/gallium/drivers/radeonsi/r600_query.c |4 +++-
 src/gallium/drivers/radeonsi/radeonsi_pipe.c  |6 --
 src/gallium/drivers/radeonsi/radeonsi_pipe.h  |2 ++
 src/gallium/drivers/softpipe/sp_context.c |2 ++
 src/gallium/drivers/softpipe/sp_context.h |1 +
 src/gallium/drivers/softpipe/sp_query.c   |2 +-
 src/gallium/drivers/softpipe/sp_surface.c |2 +-
 src/gallium/drivers/svga/svga_pipe_blit.c |2 +-
 src/gallium/drivers/trace/tr_context.c|4 +++-
 src/gallium/include/pipe/p_context.h  |2 ++
 src/mesa/state_tracker/st_cb_condrender.c |6 +++---
 42 files changed, 95 insertions(+), 40 deletions(-)

diff --git a/src/gallium/auxiliary/cso_cache/cso_context.c 
b/src/gallium/auxiliary/cso_cache/cso_context.c
index b06a070..6805427 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.c
+++ b/src/gallium/auxiliary/cso_cache/cso_context.c
@@ -111,6 +111,7 @@ struct cso_context {
void *velements, *velements_saved;
struct pipe_query *render_condition, *render_condition_saved;
uint render_condition_mode, render_condition_mode_saved;
+   boolean render_condition_cond, render_condition_cond_saved;
 
struct pipe_clip_state clip;
struct pipe_clip_state clip_saved;
@@ -723,13 +724,17 @@ void cso_restore_stencil_ref(struct cso_context *ctx)
 }
 
 void cso_set_render_condition(struct cso_context *ctx,
-  struct pipe_query *query, uint mode)
+  struct pipe_query *query,
+  boolean condition, uint mode)
 {
struct pipe_context *pipe = ctx->pipe;
 
-   if (ctx->render_condition != query || ctx->render_condition_mode != mode) {
-  pipe->render_condition(pipe, query, mode);
+   if (ctx->render_condition != query ||
+   ctx->render_condition_mode != mode ||
+   ctx->render_condition_cond != condition) {
+  pipe->render_condition(pipe, query, condition, mode);
   ctx->render_condition = query;
+  ctx->render_condition_cond = condition;
   ctx->render_condition_mode = mode;
}
 }
@@ -737,12 +742,14 @@ void cso_set_render_condition(struct cso_context *ctx,
 void cso_save_render_condition(struct cso_context *ctx)
 {
ctx->render_condition_saved = ctx->render_condition;
+   ctx->render_condition_cond_saved = ctx->render_condition_cond;
ctx->render_condition_mode_saved = ctx->render_condition_mode;
 }
 
 void cso_restore_render_condition(struct cso_context *ctx)
 {
cso_set_render_condition(ctx, ctx->render_condition_saved,
+ctx->render_condition_cond_saved,
   

[Mesa-dev] [PATCH] mesa: Fix ieee fp on Alpha

2013-06-14 Thread Sven Joachim
Commit 1f82bf12ed inadvertently broke it, checking for __IEEE_FLOAT on all
Alpha machines instead of only on VMS as before.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Sven Joachim 
---
 src/mesa/main/compiler.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
index ad719d4..fb7baf8 100644
--- a/src/mesa/main/compiler.h
+++ b/src/mesa/main/compiler.h
@@ -316,7 +316,7 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
 defined(__arm__) || \
 defined(__sh__) || defined(__m32r__) || \
 (defined(__sun) && defined(_IEEE_754)) || \
-(defined(__alpha__) && defined(__IEEE_FLOAT))
+defined(__alpha__)
 #define USE_IEEE
 #define IEEE_ONE 0x3f800000
 #endif
-- 
1.8.3.1
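
For readers unfamiliar with the constant in the hunk above: IEEE_ONE is meant to be the IEEE 754 single-precision bit pattern of 1.0f, which is 0x3f800000. A minimal sketch that checks this on the build machine follows; float_bits() is a hypothetical helper written for this note and is not part of Mesa.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a Mesa function): return the raw bits of a float.
 * memcpy is used instead of a pointer cast to avoid aliasing problems. */
static uint32_t
float_bits(float f)
{
   uint32_t u;

   assert(sizeof u == sizeof f);   /* assumes a 32-bit float */
   memcpy(&u, &f, sizeof u);
   return u;
}
```

On a platform with IEEE 754 floats, float_bits(1.0f) should compare equal to 0x3f800000; code that relies on USE_IEEE assumes exactly this representation.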



Re: [Mesa-dev] [PATCH] mesa: Fix ieee fp on Alpha

2013-06-14 Thread Matt Turner
On Fri, Jun 14, 2013 at 1:10 PM, Sven Joachim  wrote:
> Commit 1f82bf12ed inadvertently broke it, checking for __IEEE_FLOAT on all
> Alpha machines instead of only on VMS as before.
>
> NOTE: This is a candidate for the 9.1 branch.
>
> Signed-off-by: Sven Joachim 
> ---
>  src/mesa/main/compiler.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/compiler.h b/src/mesa/main/compiler.h
> index ad719d4..fb7baf8 100644
> --- a/src/mesa/main/compiler.h
> +++ b/src/mesa/main/compiler.h
> @@ -316,7 +316,7 @@ static INLINE GLuint CPU_TO_LE32(GLuint x)
>  defined(__arm__) || \
>  defined(__sh__) || defined(__m32r__) || \
>  (defined(__sun) && defined(_IEEE_754)) || \
> -(defined(__alpha__) && defined(__IEEE_FLOAT))
> +defined(__alpha__)
>  #define USE_IEEE
>  #define IEEE_ONE 0x3f800000
>  #endif
> --
> 1.8.3.1

Reviewed-by: Matt Turner 

Will commit.


[Mesa-dev] RFC: more changes to render_condition

2013-06-14 Thread Roland Scheidegger
Am 14.06.2013 19:49, schrieb srol...@vmware.com:
> From: Roland Scheidegger 
> 
> For conditional rendering this makes it possible to skip rendering
> if either the predicate is true or false, as supported by d3d10
> (in fact previously it was sort of implied skip rendering if predicate
> is false for occlusion predicate, and true for so_overflow predicate).
> There's no cap bit for this as presumably all drivers could do it trivially
> (but this patch does not implement it for the drivers using true
> hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
> functionality).
> ---
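
The semantics described in the quoted commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: mock_query, mock_render_cond and mock_should_render are hypothetical names, not gallium interfaces, and the sketch assumes that rendering is skipped when the boolean value of the query result equals the condition argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a finished query; not the real pipe_query. */
struct mock_query {
   uint64_t result;   /* e.g. number of samples that passed */
};

/* Hypothetical mirror of the state handed to render_condition(). */
struct mock_render_cond {
   const struct mock_query *query;   /* NULL disables conditional rendering */
   bool condition;                   /* the new boolean parameter */
};

/* Returns true if a draw or clear should be executed.
 * Assumption: rendering is skipped when the predicate equals `condition`. */
static bool
mock_should_render(const struct mock_render_cond *rc)
{
   bool predicate;

   assert(rc != NULL);
   if (rc->query == NULL)
      return true;                       /* no predication at all */

   predicate = rc->query->result != 0;   /* non-zero counts as TRUE */
   return predicate != rc->condition;
}
```

With condition = false this gives the occlusion behaviour mentioned above (skip if nothing passed), and with condition = true it gives the so_overflow style behaviour (skip if the predicate is true).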


FWIW there are some more changes which would be useful, but they are probably
more controversial and may require some more thought, so here goes:


diff --git a/src/gallium/docs/source/context.rst 
b/src/gallium/docs/source/context.rst
index ede89be..59403de 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -385,7 +385,8 @@ A drawing command can be skipped depending on the outcome 
of a query
 (typically an occlusion query, or streamout overflow predicate).
 The ``render_condition`` function specifies the query which should be checked
 prior to rendering anything. Functions honoring render_condition include
-(and are limited to) draw_vbo, clear, clear_render_target, clear_depth_stencil.
+(and are limited to) draw_vbo, clear, clear_render_target, clear_depth_stencil,
+resource_copy_region. Transfers may also be affected.

 If ``render_condition`` is called with ``query`` = NULL, conditional
 rendering is disabled and drawing takes place normally.
@@ -545,6 +546,13 @@ These flags control the behavior of a transfer object.
   Written ranges will be notified later with :ref:`transfer_flush_region`.
   Cannot be used with ``PIPE_TRANSFER_READ``.

+``PIPE_TRANSFER_HONOR_RENDER_CONDITION``
+  The transfer will honor the current render condition. This is only valid
+  essentially for ``transfer_inline_write`` (but since everyone implements
+  this with a fallback to ordinary transfer_map/transfer_unmap it is valid
+  for transfer_map too; however, the same restrictions apply: the transfer
+  must be write-only with either DISCARD_RANGE or DISCARD_WHOLE_RESOURCE set).
+

The reasoning for this is that d3d10 predicates CopyResource/CopySubresourceRegion
and UpdateSubresource.
For resource_copy_region, if it always honors render_condition,
then state trackers not wanting this can simply disable predication
when they call it. The opposite is not possible: if it never
honors predication, then a state tracker needing predication will
need to wait on the predicate, hence requiring a cpu/gpu sync (if
the result isn't available yet).
For transfers this is a bit weird, I admit: it essentially implies
a predicated gpu blit from a staging texture (if you implement this
fully in hardware). If that's too awkward, though, this one could be
emulated in the state tracker easily enough if resource_copy_region
honors predication (by just creating a temporary texture and doing
a predicated resource_copy_region), which is probably cleaner from
an API perspective.
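
To illustrate the first point (a copy that always honors the render condition can still be used unpredicated), here is a minimal sketch. Everything in it is hypothetical: mock_ctx and the mock_* functions only model the idea and are not the gallium interface, and the skip rule (skip when the predicate equals the condition) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context holding the current render condition. */
struct mock_ctx {
   const bool *predicate;   /* NULL = no render condition set */
   bool condition;
};

/* Assumption: work is skipped when the predicate equals the condition. */
static bool
mock_skip(const struct mock_ctx *ctx)
{
   return ctx->predicate != NULL && *ctx->predicate == ctx->condition;
}

/* A copy that always honors the render condition.
 * Returns true if the copy was actually performed. */
static bool
mock_copy_region(struct mock_ctx *ctx, void *dst, const void *src, size_t n)
{
   assert(ctx != NULL);
   if (mock_skip(ctx))
      return false;
   memcpy(dst, src, n);
   return true;
}

/* What a state tracker that does not want predication could do:
 * disable the condition around the call, then restore it. */
static bool
mock_copy_region_unpredicated(struct mock_ctx *ctx, void *dst,
                              const void *src, size_t n)
{
   struct mock_ctx saved = *ctx;
   bool done;

   ctx->predicate = NULL;
   done = mock_copy_region(ctx, dst, src, n);
   *ctx = saved;
   return done;
}
```

The reverse wrapper cannot be written without reading the predicate on the cpu, which is the sync the paragraph above wants to avoid.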

Roland


[Mesa-dev] [PATCH] prog_parameter.c ASAN Patch

2013-06-14 Thread Myles C. Maxfield
Sorry for the triple post; I received a bounce email the first time
and got sent to the spam folder the second time, so I'm trying a third
time.

Hello, all. I was running Mesa with Address Sanitizer [1] turned on,
and found one place where ASAN pointed out a read-before-initialize
problem. In particular, in _mesa_add_parameter, in prog_parameter.c,
|values| represents an array holding a variable number of values.
These values get copied out of the array 4 at a time with the COPY_4V
macro; however, the array might only contain a single element. In this
case, ASAN reports a read-before-initialize because the last 3 of the
4 elements haven't been written to yet. I was hoping to contribute a
patch that will silence this problem that ASAN reports. I'm happy to
incorporate any feedback anyone has into this patch.

Thanks,
Myles C. Maxfield

[1] https://code.google.com/p/address-sanitizer/
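
The idea described above (copy only the valid components of each 4-wide row and zero the rest) can be sketched in isolation like this. It is a simplified illustration, not the patch itself: it uses plain float instead of Mesa's parameter value type, and copy_row_padded / copy_values_padded are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `count` (1..4) valid floats into a 4-wide row and zero the rest,
 * so nothing past the end of the source array is ever read. */
static void
copy_row_padded(float dst[4], const float *src, size_t count)
{
   size_t j;

   assert(count >= 1 && count <= 4);
   for (j = 0; j < count; j++)
      dst[j] = src[j];
   for (; j < 4; j++)
      dst[j] = 0.0f;
}

/* Fill ceil(size / 4) rows from `size` source values.
 * Returns the number of rows written. */
static size_t
copy_values_padded(float (*rows)[4], const float *values, size_t size)
{
   size_t row = 0;

   while (size > 0) {
      size_t count = size < 4 ? size : 4;   /* valid values left for this row */

      copy_row_padded(rows[row], values, count);
      values += count;
      size -= count;
      row++;
   }
   return row;
}
```

Tracking the remaining count per row keeps the inner loop bounded by 4 even when size is larger than 4 but not a multiple of it.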

diff --git a/src/mesa/program/prog_parameter.c
b/src/mesa/program/prog_parameter.c
index 2018fa5..63915fb 100644
--- a/src/mesa/program/prog_parameter.c
+++ b/src/mesa/program/prog_parameter.c
@@ -158,7 +158,17 @@ _mesa_add_parameter(struct
gl_program_parameter_list *paramList,
  p->DataType = datatype;
  p->Flags = flags;
  if (values) {
-COPY_4V(paramList->ParameterValues[oldNum + i], values);
+if (size & 3) {
+  for (j = 0; j < size; j++) {
+paramList->ParameterValues[oldNum + i][j] = values[j];
+  }
+  /* silence asan */
+  for (j = size; j < 4; j++) {
+paramList->ParameterValues[oldNum + i][j].f = 0;
+  }
+} else {
+  COPY_4V(paramList->ParameterValues[oldNum + i], values);
+}
 values += 4;
 p->Initialized = GL_TRUE;
  }


Re: [Mesa-dev] [PATCH] clover: Don't segfault when compiling a program with no kernel

2013-06-14 Thread Tom Stellard
On Thu, Jun 06, 2013 at 10:29:21AM -0500, Aaron Watry wrote:
> Looks good to me.  Is there a piglit test for this?

I just sent a test for this to the list.

-Tom

> 
> --Aaron
> 
> On Wed, Jun 5, 2013 at 7:12 PM, Tom Stellard  wrote:
> > From: Tom Stellard 
> >
> > ---
> >  src/gallium/state_trackers/clover/llvm/invocation.cpp | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
> > b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > index 2d115ed..8ec089d 100644
> > --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> > @@ -209,6 +209,13 @@ namespace {
> > find_kernels(llvm::Module *mod, std::vector &kernels) 
> > {
> >const llvm::NamedMDNode *kernel_node =
> >   mod->getNamedMetadata("opencl.kernels");
> > +  // This means there are no kernels in the program.  The spec does not
> > +  // require that we return an error here, but there will be an error 
> > if
> > +  // the user tries to pass this program to a clCreateKernel() call.
> > +  if (!kernel_node) {
> > + return;
> > +  }
> > +
> >for (unsigned i = 0; i < kernel_node->getNumOperands(); ++i) {
> >   kernels.push_back(llvm::dyn_cast(
> >  
> > kernel_node->getOperand(i)->getOperand(0)));
> > --
> > 1.7.11.4
> >


Re: [Mesa-dev] [PATCH] R600: Add SI load support for v[24]i32 and store for v2i32

2013-06-14 Thread Tom Stellard
On Fri, Jun 14, 2013 at 08:40:38AM -0500, Aaron Watry wrote:
> Also add a separate vector lit test file, since r600 doesn't seem to handle
> v2i32 load/store yet, but we can test both for SI.
>

Pushed, thanks!

-Tom
> Signed-off-by: Aaron Watry 
> ---
>  lib/Target/R600/SIInstructions.td |  5 +
>  test/CodeGen/R600/load.vec.ll | 19 +++
>  2 files changed, 24 insertions(+)
>  create mode 100644 test/CodeGen/R600/load.vec.ll
> 
> diff --git a/lib/Target/R600/SIInstructions.td 
> b/lib/Target/R600/SIInstructions.td
> index e8ed2dd..9c96c08 100644
> --- a/lib/Target/R600/SIInstructions.td
> +++ b/lib/Target/R600/SIInstructions.td
> @@ -1638,6 +1638,10 @@ defm : MUBUFLoad_Pattern  i32,
>global_load, constant_load>;
>  defm : MUBUFLoad_Pattern zextloadi8_global, zextloadi8_constant>;
> +defm : MUBUFLoad_Pattern  +  global_load, constant_load>;
> +defm : MUBUFLoad_Pattern  +  global_load, constant_load>;
>  
>  multiclass MUBUFStore_Pattern  {
>  
> @@ -1654,6 +1658,7 @@ multiclass MUBUFStore_Pattern  vt> {
>  
>  defm : MUBUFStore_Pattern ;
>  defm : MUBUFStore_Pattern ;
> +defm : MUBUFStore_Pattern ;
>  defm : MUBUFStore_Pattern ;
>  
>  /** == **/
> diff --git a/test/CodeGen/R600/load.vec.ll b/test/CodeGen/R600/load.vec.ll
> new file mode 100644
> index 0000000..08e034e
> --- /dev/null
> +++ b/test/CodeGen/R600/load.vec.ll
> @@ -0,0 +1,19 @@
> +; RUN: llc < %s -march=r600 -mcpu=SI | FileCheck --check-prefix=SI-CHECK  %s
> +
> +; load a v2i32 value from the global address space.
> +; SI-CHECK: @load_v2i32
> +; SI-CHECK: BUFFER_LOAD_DWORDX2 VGPR{{[0-9]+}}
> +define void @load_v2i32(<2 x i32> addrspace(1)* %out, <2 x i32> 
> addrspace(1)* %in) {
> +  %a = load <2 x i32> addrspace(1) * %in
> +  store <2 x i32> %a, <2 x i32> addrspace(1)* %out
> +  ret void
> +}
> +
> +; load a v4i32 value from the global address space.
> +; SI-CHECK: @load_v4i32
> +; SI-CHECK: BUFFER_LOAD_DWORDX4 VGPR{{[0-9]+}}
> +define void @load_v4i32(<4 x i32> addrspace(1)* %out, <4 x i32> 
> addrspace(1)* %in) {
> +  %a = load <4 x i32> addrspace(1) * %in
> +  store <4 x i32> %a, <4 x i32> addrspace(1)* %out
> +  ret void
> +}
> -- 
> 1.8.1.2
> 


[Mesa-dev] [PATCH] llvmpipe: fixes for conditional rendering

2013-06-14 Thread sroland
From: Roland Scheidegger 

Honor render_condition for clear_render_target and clear_depth_stencil.
Also add minimal support for occlusion predicate, though it can't be active
at the same time as an occlusion query yet.
While here also switchify some large if-else (actually just mutually
exclusive if-if-if...) constructs.
---
 src/gallium/drivers/llvmpipe/lp_bld_depth.c |4 ++
 src/gallium/drivers/llvmpipe/lp_query.c |   78 +++
 src/gallium/drivers/llvmpipe/lp_rast.c  |2 +
 src/gallium/drivers/llvmpipe/lp_surface.c   |   42 ++-
 4 files changed, 90 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_bld_depth.c 
b/src/gallium/drivers/llvmpipe/lp_bld_depth.c
index a8bd15f..40ab7be 100644
--- a/src/gallium/drivers/llvmpipe/lp_bld_depth.c
+++ b/src/gallium/drivers/llvmpipe/lp_bld_depth.c
@@ -429,6 +429,10 @@ get_s_shift_and_mask(const struct util_format_description 
*format_desc,
  * Test the depth mask. Add the number of channel which has none zero mask
  * into the occlusion counter. e.g. maskvalue is {-1, -1, -1, -1}.
  * The counter will add 4.
+ * TODO: would be much easier if we'd just have a nx32bit counter
+ * and simply sub the masks here. Then add the individual values
+ * at query end. This however wouldn't work with 64bit counter values
+ * which we should also do.
  *
  * \param type holds element type of the mask vector.
  * \param maskvalue is the depth test mask.
diff --git a/src/gallium/drivers/llvmpipe/lp_query.c 
b/src/gallium/drivers/llvmpipe/lp_query.c
index 973c689..84910b8 100644
--- a/src/gallium/drivers/llvmpipe/lp_query.c
+++ b/src/gallium/drivers/llvmpipe/lp_query.c
@@ -125,6 +125,12 @@ llvmpipe_get_query_result(struct pipe_context *pipe,
  *result += pq->count[i];
   }
   break;
+   case PIPE_QUERY_OCCLUSION_PREDICATE:
+  for (i = 0; i < num_threads; i++) {
+ /* safer (still not guaranteed) when there's an overflow */
+ *result = *result || pq->count[i];
+  }
+  break;
case PIPE_QUERY_TIMESTAMP:
   for (i = 0; i < num_threads; i++) {
  if (pq->count[i] > *result) {
@@ -181,30 +187,28 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct 
pipe_query *q)
 
 
memset(pq->count, 0, sizeof(pq->count));
+   /* XXX do we really need to bin all queries */
lp_setup_begin_query(llvmpipe->setup, pq);
 
-   if (pq->type == PIPE_QUERY_PRIMITIVES_EMITTED) {
+   switch (pq->type) {
+   case PIPE_QUERY_PRIMITIVES_EMITTED:
   pq->num_primitives_written = 0;
   llvmpipe->so_stats.num_primitives_written = 0;
-   }
-
-   if (pq->type == PIPE_QUERY_PRIMITIVES_GENERATED) {
+  break;
+   case PIPE_QUERY_PRIMITIVES_GENERATED:
   pq->num_primitives_generated = 0;
   llvmpipe->num_primitives_generated = 0;
-   }
-
-   if (pq->type == PIPE_QUERY_SO_STATISTICS) {
+  break;
+   case PIPE_QUERY_SO_STATISTICS:
   pq->num_primitives_written = 0;
   llvmpipe->so_stats.num_primitives_written = 0;
   pq->num_primitives_generated = 0;
   llvmpipe->num_primitives_generated = 0;
-   }
-
-   if (pq->type == PIPE_QUERY_SO_OVERFLOW_PREDICATE) {
+  break;
+   case PIPE_QUERY_SO_OVERFLOW_PREDICATE:
   pq->so_has_overflown = FALSE;
-   }
-
-   if (pq->type == PIPE_QUERY_PIPELINE_STATISTICS) {
+  break;
+   case PIPE_QUERY_PIPELINE_STATISTICS:
   /* reset our cache */
   if (llvmpipe->active_statistics_queries == 0) {
  memset(&llvmpipe->pipeline_statistics, 0,
@@ -212,11 +216,16 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct 
pipe_query *q)
   }
   memcpy(&pq->stats, &llvmpipe->pipeline_statistics, sizeof(pq->stats));
   llvmpipe->active_statistics_queries++;
-   }
-
-   if (pq->type == PIPE_QUERY_OCCLUSION_COUNTER) {
-  llvmpipe->active_occlusion_query = TRUE;
+  break;
+   case PIPE_QUERY_OCCLUSION_COUNTER:
+   case PIPE_QUERY_OCCLUSION_PREDICATE:
+  /* Both active at same time will still fail all over the place.
+   * Then again several of each type can be active too... */
+  llvmpipe->active_occlusion_query++;
   llvmpipe->dirty |= LP_NEW_OCCLUSION_QUERY;
+  break;
+   default:
+  break;
}
 }
 
@@ -229,25 +238,23 @@ llvmpipe_end_query(struct pipe_context *pipe, struct 
pipe_query *q)
 
lp_setup_end_query(llvmpipe->setup, pq);
 
-   if (pq->type == PIPE_QUERY_PRIMITIVES_EMITTED) {
-  pq->num_primitives_written = llvmpipe->so_stats.num_primitives_written;
-   }
+   switch (pq->type) {
 
-   if (pq->type == PIPE_QUERY_PRIMITIVES_GENERATED) {
+   case PIPE_QUERY_PRIMITIVES_EMITTED:
+  pq->num_primitives_written = llvmpipe->so_stats.num_primitives_written;
+  break;
+   case PIPE_QUERY_PRIMITIVES_GENERATED:
   pq->num_primitives_generated = llvmpipe->num_primitives_generated;
-   }
-
-   if (pq->type == PIPE_QUERY_SO_STATISTICS) {
+  break;
+   case PIPE_QUERY_SO_STATISTICS:
   pq->num_primitives_written = llvmpipe->so_stats.num_p