Re: [Mesa-dev] [PATCH] swrast: Add glBlitFramebuffer to commands affected by conditional rendering

2014-04-23 Thread Carl Worth
Anuj Phogat  writes:
> +   /* Page 679 of OpenGL 4.4 spec says:
> +*"Added BlitFramebuffer to commands affected by conditional 
> rendering in
> +* section 10.10 (Bug 9562)."
> +*/
> +   if (!_mesa_check_conditional_render(ctx))
> +  return; /* Do not blit */
> +
> if (!_mesa_clip_blit(ctx, &srcX0, &srcY0, &srcX1, &srcY1,
>  &dstX0, &dstY0, &dstX1, &dstY1)) {
>return;

After I picked this to the 10.1 branch and tested I found the following
piglit regression:

$ ~/src/piglit/bin/nv_conditional_render-blitframebuffer -auto 
Probe color at (0,16)
  Expected: 0.00 1.00 0.00 0.00
  Observed: 1.00 0.00 0.00 0.00
PIGLIT: {'result': 'fail' }

So I'm moving this patch off of the "applied" queue and onto a new
"rejected" queue as can be seen here:

http://cworth.org/~cworth/mesa-stable-queue/

Feel free to follow up with any additional information that would be
useful, (such as an updated patch or a change to the relevant piglit
test).

-Carl

-- 
carl.d.wo...@intel.com


pgpns8sazsbCO.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swrast: Add glBlitFramebuffer to commands affected by conditional rendering

2014-04-23 Thread Carl Worth
Carl Worth  writes:
> After I picked this to the 10.1 branch and tested I found the following
> piglit regression:
...
> So I'm moving this patch off of the "applied" queue and onto a new
> "rejected" queue as can be seen here:

Actually, I replied to the wrong email here. The patch I bisected to was
the i965 patch with the same description.

But now I'm wondering if the swrast patch has the same problem (and
whether my testing is adequate...).

-Carl

-- 
carl.d.wo...@intel.com


pgp4kF1D6NVJr.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 0/5] util: Rework endian handling in python code

2014-04-23 Thread Richard Sandiford
Ping

Richard Sandiford  writes:
> Ping (with fixed subject)
>
> Richard Sandiford  writes:
>> This is a refresh of:
>>
>>http://lists.freedesktop.org/archives/mesa-dev/2013-June/040594.html
>>
>> At the moment the python code uses sys.byteorder to decide whether
>> u_format_table.c should be for big or little endian.  With this series
>> it instead generates both forms, using blocks like:
>>
>> #ifdef PIPE_ARCH_BIG_ENDIAN
>> ...
>> #else
>> ...
>> #endif
>>
>> in cases where endianness matters.
>>
>> Doing it this way is more cross-compiler-friendly.  It also means people
>> working on LE systems can see what the differences would be for BE.
>>
>> Tested on x86_64 and z.  I don't have commit access so please apply if OK.
>>
>> Thanks,
>> Richard
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] mesa: Add core support for the GL_INTEL_performance_query extension.

2014-04-23 Thread Petri Latvala
Like AMD_performance_monitor, this extension provides an interface for
applications (and OpenGL-based tools) to access GPU performance
counters. Since the exact performance counters available vary between
vendors and hardware generations, the extension provides an API the
application can use to get the names, types, and minimum/maximum
values of all available counters.

Applications create performance queries based on available query
types, and begin/end measurement collection. Multiple queries can be
measuring simultaneously.

v2: Whitespace changes
v3: src/mapi/glapi/gen/gl_API.xml: Also expose the functions to GLES2.
v4: Whitespace changes, static_dispatch="false" for all functions, fix
dispatch_sanity test for GLES2 functions

Signed-off-by: Petri Latvala 
Reviewed-by: Ian Romanick 
---
 src/mapi/glapi/gen/INTEL_performance_query.xml |  93 +++
 src/mapi/glapi/gen/Makefile.am |   1 +
 src/mapi/glapi/gen/gl_API.xml  |   2 +
 src/mesa/main/config.h |   8 +
 src/mesa/main/extensions.c |   1 +
 src/mesa/main/get.c|   1 +
 src/mesa/main/get_hash_params.py   |   6 +
 src/mesa/main/mtypes.h |   1 +
 src/mesa/main/performance_monitor.c| 219 +
 src/mesa/main/performance_monitor.h|  43 -
 src/mesa/main/tests/dispatch_sanity.cpp|  24 +++
 11 files changed, 398 insertions(+), 1 deletion(-)
 create mode 100644 src/mapi/glapi/gen/INTEL_performance_query.xml

diff --git a/src/mapi/glapi/gen/INTEL_performance_query.xml 
b/src/mapi/glapi/gen/INTEL_performance_query.xml
new file mode 100644
index 000..25cd181
--- /dev/null
+++ b/src/mapi/glapi/gen/INTEL_performance_query.xml
@@ -0,0 +1,93 @@
+
+
+
+
+
+
+
+  
+
+  
+
+  
+
+
+  
+
+  
+
+
+  
+
+  
+
+
+
+
+
+
+
+  
+
+  
+
+
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+  
+
+
+
+
+
+  
+
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+  
+
+
+
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 6b932e7..409d356 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -171,6 +171,7 @@ API_XML = \
EXT_texture_array.xml \
EXT_texture_integer.xml \
EXT_transform_feedback.xml \
+   INTEL_performance_query.xml \
KHR_debug.xml \
NV_conditional_render.xml \
NV_primitive_restart.xml \
diff --git a/src/mapi/glapi/gen/gl_API.xml b/src/mapi/glapi/gen/gl_API.xml
index 9200cd6..71b39a8 100644
--- a/src/mapi/glapi/gen/gl_API.xml
+++ b/src/mapi/glapi/gen/gl_API.xml
@@ -12808,6 +12808,8 @@
 
 
 
+http://www.w3.org/2001/XInclude"/>
+
 
 
 
diff --git a/src/mesa/main/config.h b/src/mesa/main/config.h
index 30da5d4..c96502a 100644
--- a/src/mesa/main/config.h
+++ b/src/mesa/main/config.h
@@ -281,6 +281,14 @@
 #define MAX_VERTEX_STREAMS  4
 /*@}*/
 
+/** For GL_INTEL_performance_query */
+/*@{*/
+#define MAX_PERFQUERY_QUERY_NAME_LENGTH 256
+#define MAX_PERFQUERY_COUNTER_NAME_LENGTH   256
+#define MAX_PERFQUERY_COUNTER_DESC_LENGTH   1024
+#define PERFQUERY_HAVE_GPA_EXTENDED_COUNTERS 0
+/*@}*/
+
 /*
  * Color channel component order
  * 
diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c
index a72284c..6c1c033 100644
--- a/src/mesa/main/extensions.c
+++ b/src/mesa/main/extensions.c
@@ -330,6 +330,7 @@ static const struct extension extension_table[] = {
{ "GL_IBM_rasterpos_clip",  o(dummy_true),  
GLL,1996 },
{ "GL_IBM_texture_mirrored_repeat", o(dummy_true),  
GLL,1998 },
{ "GL_INGR_blend_func_separate",o(EXT_blend_func_separate), 
GLL,1999 },
+   { "GL_INTEL_performance_query", o(INTEL_performance_query), 
  GL | ES2, 2013 },
{ "GL_MESA_pack_invert",o(MESA_pack_invert),
GL, 2002 },
{ "GL_MESA_texture_signed_rgba",o(EXT_texture_snorm),   
GL, 2009 },
{ "GL_MESA_window_pos", o(dummy_true),  
GLL,2000 },
diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c
index 6d95790..1897e8d 100644
--- a/src/mesa/main/get.c
+++ b/src/mesa/main/get.c
@@ -395,6 +395,7 @@ EXTRA_EXT(ARB_viewport_array);
 EXTRA_EXT(ARB_compute_shader);
 EXTRA_EXT(ARB_gpu_shader5);
 EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5);
+EXTRA_EXT(INTEL_performance_query);
 
 static const int
 extra_ARB_color_buffer_float_or_glcore[] = {
diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/g

[Mesa-dev] [PATCH 5/6] i965: Enable INTEL_performance_query for Gen5+.

2014-04-23 Thread Petri Latvala
Signed-off-by: Petri Latvala 
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index 892a048..d6e1494 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -311,8 +311,10 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.ARB_stencil_texturing = true;
}
 
-   if (brw->gen == 5 || can_write_oacontrol(brw))
+   if (brw->gen == 5 || can_write_oacontrol(brw)) {
   ctx->Extensions.AMD_performance_monitor = true;
+  ctx->Extensions.INTEL_performance_query = true;
+   }
 
if (ctx->API == API_OPENGL_CORE)
   ctx->Extensions.ARB_base_instance = true;
-- 
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] mesa: Add INTEL_performance_query enums to tests/enum_strings.cpp

2014-04-23 Thread Petri Latvala
Signed-off-by: Petri Latvala 
Reviewed-by: Ian Romanick 
---
 src/mesa/main/tests/enum_strings.cpp | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/src/mesa/main/tests/enum_strings.cpp 
b/src/mesa/main/tests/enum_strings.cpp
index 3795700..d16eb36 100644
--- a/src/mesa/main/tests/enum_strings.cpp
+++ b/src/mesa/main/tests/enum_strings.cpp
@@ -807,6 +807,9 @@ const struct enum_info everything[] = {
{ 0x83F1, "GL_COMPRESSED_RGBA_S3TC_DXT1_EXT" },
{ 0x83F2, "GL_COMPRESSED_RGBA_S3TC_DXT3_ANGLE" },
{ 0x83F3, "GL_COMPRESSED_RGBA_S3TC_DXT5_ANGLE" },
+   { 0x83F9, "GL_PERFQUERY_DONOT_FLUSH_INTEL" },
+   { 0x83FA, "GL_PERFQUERY_FLUSH_INTEL" },
+   { 0x83FB, "GL_PERFQUERY_WAIT_INTEL" },
{ 0x844D, "GL_NEAREST_CLIPMAP_NEAREST_SGIX" },
{ 0x844E, "GL_NEAREST_CLIPMAP_LINEAR_SGIX" },
{ 0x844F, "GL_LINEAR_CLIPMAP_NEAREST_SGIX" },
@@ -1843,6 +1846,21 @@ const struct enum_info everything[] = {
{ 0x9271, "GL_COMPRESSED_SIGNED_R11_EAC" },
{ 0x9272, "GL_COMPRESSED_RG11_EAC" },
{ 0x9273, "GL_COMPRESSED_SIGNED_RG11_EAC" },
+   { 0x94F0, "GL_PERFQUERY_COUNTER_EVENT_INTEL" },
+   { 0x94F1, "GL_PERFQUERY_COUNTER_DURATION_NORM_INTEL" },
+   { 0x94F2, "GL_PERFQUERY_COUNTER_DURATION_RAW_INTEL" },
+   { 0x94F3, "GL_PERFQUERY_COUNTER_THROUGHPUT_INTEL" },
+   { 0x94F4, "GL_PERFQUERY_COUNTER_RAW_INTEL" },
+   { 0x94F5, "GL_PERFQUERY_COUNTER_TIMESTAMP_INTEL" },
+   { 0x94F8, "GL_PERFQUERY_COUNTER_DATA_UINT32_INTEL" },
+   { 0x94F9, "GL_PERFQUERY_COUNTER_DATA_UINT64_INTEL" },
+   { 0x94FA, "GL_PERFQUERY_COUNTER_DATA_FLOAT_INTEL" },
+   { 0x94FB, "GL_PERFQUERY_COUNTER_DATA_DOUBLE_INTEL" },
+   { 0x94FC, "GL_PERFQUERY_COUNTER_DATA_BOOL32_INTEL" },
+   { 0x94FD, "GL_PERFQUERY_QUERY_NAME_LENGTH_MAX_INTEL" },
+   { 0x94FE, "GL_PERFQUERY_COUNTER_NAME_LENGTH_MAX_INTEL" },
+   { 0x94FF, "GL_PERFQUERY_COUNTER_DESC_LENGTH_MAX_INTEL" },
+   { 0x9500, "GL_PERFQUERY_GPA_EXTENDED_COUNTERS_INTEL" },
{ 0x19262, "GL_RASTER_POSITION_UNCLIPPED_IBM" },
{ 0, NULL }
 };
-- 
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] docs: update 10.2 release notes

2014-04-23 Thread Petri Latvala
Signed-off-by: Petri Latvala 
---
 docs/relnotes/10.2.html | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/relnotes/10.2.html b/docs/relnotes/10.2.html
index d7d557b..473739c 100644
--- a/docs/relnotes/10.2.html
+++ b/docs/relnotes/10.2.html
@@ -47,6 +47,7 @@ Note: some of the new features are only available with 
certain drivers.
 GL_ARB_buffer_storage on i965, r300, r600, and radeonsi
 GL_ARB_stencil_texturing on i965/gen8+
 GL_ARB_texture_view on i965/gen7
+GL_INTEL_performance_query on i965/gen5+
 
 
 
-- 
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/6] v3: Implement INTEL_performance_query

2014-04-23 Thread Petri Latvala
Third revision of the patch series. Changes:

- Rebased to current master
- Changes based on Ian's review
- Add the extension to 10.2 release notes

I didn't change patch 5/6 "Enable INTEL_performance_query for Gen5+"
along the review comments yet. It's true that currently drivers can
support both by just implementing the current driver hooks, but I'm
worried that the situation might change with the upcoming changes to
the driver functions to support semantic types and normalized
counters. I have no concrete examples that would make that happen, I
might just be too paranoid and wary.

Summarum, there's more changes for this extension incoming, and that
sort of cleanup can be part of it if so desired.

I will need someone to push this btw.

Petri Latvala (6):
  Regenerate gl_mangle.h.
  mesa: Add INTEL_performance_query enums to tests/enum_strings.cpp
  mesa: Add core support for the GL_INTEL_performance_query extension.
  mesa: Implement INTEL_performance_query.
  i965: Enable INTEL_performance_query for Gen5+.
  docs: update 10.2 release notes

 docs/relnotes/10.2.html|   1 +
 include/GL/gl_mangle.h | 371 ++-
 src/mapi/glapi/gen/INTEL_performance_query.xml |  93 
 src/mapi/glapi/gen/Makefile.am |   1 +
 src/mapi/glapi/gen/gl_API.xml  |   2 +
 src/mesa/drivers/dri/i965/intel_extensions.c   |   4 +-
 src/mesa/main/config.h |   8 +
 src/mesa/main/extensions.c |   1 +
 src/mesa/main/get.c|   1 +
 src/mesa/main/get_hash_params.py   |   6 +
 src/mesa/main/mtypes.h |   1 +
 src/mesa/main/performance_monitor.c| 614 +
 src/mesa/main/performance_monitor.h|  43 +-
 src/mesa/main/tests/dispatch_sanity.cpp|  24 +
 src/mesa/main/tests/enum_strings.cpp   |  18 +
 15 files changed, 1180 insertions(+), 8 deletions(-)
 create mode 100644 src/mapi/glapi/gen/INTEL_performance_query.xml

-- 
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] mesa: Implement INTEL_performance_query.

2014-04-23 Thread Petri Latvala
Using the existing driver hooks made for AMD_performance_monitor, implement
INTEL_performance_query functions.

v2: Whitespace changes.
v3: Whitespace changes, add a _mesa_warning()

Signed-off-by: Petri Latvala 
Reviewed-by: Ian Romanick 
---
 src/mesa/main/performance_monitor.c | 487 
 1 file changed, 441 insertions(+), 46 deletions(-)

diff --git a/src/mesa/main/performance_monitor.c 
b/src/mesa/main/performance_monitor.c
index 597f633..21b9423 100644
--- a/src/mesa/main/performance_monitor.c
+++ b/src/mesa/main/performance_monitor.c
@@ -137,6 +137,46 @@ get_counter(const struct gl_perf_monitor_group *group_obj, 
GLuint id)
return &group_obj->Counters[id];
 }
 
+/* For INTEL_performance_query, query id 0 is reserved to be invalid. We use
+ * index to Groups array + 1 as the query id. Same applies to counter id.
+ */
+static inline GLuint
+queryid_to_index(GLuint queryid)
+{
+   return queryid - 1;
+}
+
+static inline GLuint
+index_to_queryid(GLuint index)
+{
+   return index + 1;
+}
+
+static inline bool
+queryid_valid(const struct gl_context *ctx, GLuint queryid)
+{
+   return get_group(ctx, queryid_to_index(queryid)) != NULL;
+}
+
+static inline GLuint
+counterid_to_index(GLuint counterid)
+{
+   return counterid - 1;
+}
+
+static inline GLuint
+index_to_counterid(GLuint index)
+{
+   return index + 1;
+}
+
+static inline bool
+counterid_valid(const struct gl_perf_monitor_group *group_obj,
+GLuint counterid)
+{
+   return get_counter(group_obj, counterid_to_index(counterid)) != NULL;
+}
+
 /*/
 
 void GLAPIENTRY
@@ -644,6 +684,7 @@ extern void GLAPIENTRY
 _mesa_GetFirstPerfQueryIdINTEL(GLuint *queryId)
 {
GET_CURRENT_CONTEXT(ctx);
+   unsigned numGroups;
 
/* The GL_INTEL_performance_query spec says:
 *
@@ -655,16 +696,22 @@ _mesa_GetFirstPerfQueryIdINTEL(GLuint *queryId)
   return;
}
 
+   numGroups = ctx->PerfMonitor.NumGroups;
+
/* The GL_INTEL_performance_query spec says:
 *
 *"If the given hardware platform doesn't support any performance
 *queries, then the value of 0 is returned and INVALID_OPERATION error
 *is raised."
 */
+   if (numGroups == 0) {
+  *queryId = 0;
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "glGetFirstPerfQueryIdINTEL(no queries supported)");
+  return;
+   }
 
-   *queryId = 0;
-   _mesa_error(ctx, GL_INVALID_OPERATION,
-   "glGetFirstPerfQueryIdINTEL(no queries supported)");
+   *queryId = index_to_queryid(0);
 }
 
 extern void GLAPIENTRY
@@ -674,40 +721,66 @@ _mesa_GetNextPerfQueryIdINTEL(GLuint queryId, GLuint 
*nextQueryId)
 
/* The GL_INTEL_performance_query spec says:
 *
-*"If nextQueryId pointer is equal to 0, an INVALID_VALUE error is
-*generated."
+*"The result is passed in location pointed by nextQueryId. If query
+*identified by queryId is the last query available the value of 0 is
+*returned. If the specified performance query identifier is invalid
+*then INVALID_VALUE error is generated. If nextQueryId pointer is
+*equal to 0, an INVALID_VALUE error is generated.  Whenever error is
+*generated, the value of 0 is returned."
 */
+
if (!nextQueryId) {
   _mesa_error(ctx, GL_INVALID_VALUE,
   "glGetNextPerfQueryIdINTEL(nextQueryId == NULL)");
   return;
}
 
-   /* The GL_INTEL_performance_query spec says:
-*
-*"If the specified performance query identifier is invalid then
-*INVALID_VALUE error is generated. Whenever error is generated, the
-*value of 0 is returned."
-*
-* No queries are supported, so all queries are invalid.
-*/
-   *nextQueryId = 0;
-   _mesa_error(ctx, GL_INVALID_VALUE,
-   "glGetNextPerfQueryIdINTEL(invalid query)");
+   if (!queryid_valid(ctx, queryId)) {
+  *nextQueryId = 0;
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  "glGetNextPerfQueryIdINTEL(invalid query)");
+  return;
+   }
+
+   ++queryId;
+
+   if (!queryid_valid(ctx, queryId)) {
+  *nextQueryId = 0;
+   } else {
+  *nextQueryId = queryId;
+   }
 }
 
 extern void GLAPIENTRY
 _mesa_GetPerfQueryIdByNameINTEL(char *queryName, GLuint *queryId)
 {
GET_CURRENT_CONTEXT(ctx);
+   unsigned i;
 
/* The GL_INTEL_performance_query spec says:
 *
 *"If queryName does not reference a valid query name, an INVALID_VALUE
 *error is generated."
-*
-* No queries are supported, so all query names are invalid.
 */
+   if (!queryName) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  "glGetPerfQueryIdByNameINTEL(queryName == NULL)");
+  return;
+   }
+
+   /* The specification does not state that this produces an error. */
+   if (!queryId) {
+  _mesa_warning(ctx, "glGetPerfQueryIdByNameINTEL(queryId == NULL)");
+  return;
+   }
+
+

[Mesa-dev] Mixing Pixel Shaders and Compute Shaders

2014-04-23 Thread Dorrington, Albert
In trying to implement Image support in Clover, I have discovered that the 
existing CL image related calls result in the generation of Pixel Shader 
sequences for copies of images to and from the GPU.

I initially thought that this would be fine, and was able to implement image 
read tests that use clEnqueueWriteImage() to get an image into a kernel.
The clEnqueueWriteImage(), through the routines in clover/api/transfer.cpp 
generates a Pixel shader which copies the image to the GPU.
The Compute Shader then picks the image up from where the Pixel Shader left it.

I had some issues initially with mixing the Pixel and Compute Shaders, until I 
added a PS_PARTIAL_FLUSH event along with the CS_PARTIAL_FLUSH event at the 
start of evergreen_init_atom_start_compute_cs(). I think this helped because it 
made the Pixel Shader Execute before the Compute Shader (not entirely sure?)

When I try to call clEnqueueReadImage(), after a clEnqueueNDRangeKernel(); the 
clover/aop/transfer.cpp again generates a Pixel Shader, which gets integrated 
into the command stream after the Compute Shader entries (so I send up with:

Command Sequence
1  [Initial Configuration]
2  [PixelShader]
3  [ComputeShader]
4  [PixelShader]
5  [Final Configuration/Cleanup/Wait]

The problem is, now I am encountering GPU Lockup CP Stalls at the end of 
'section 4' and the start of 'section 5'
I am not sure I entirely understand why this is happening, but I know it has to 
do with the fact that the Pixel Shader is in the command stream after the 
Compute Shader commands.
I'm assuming something in how the flushes are configured for the Pixel Shader 
are not waiting for the Compute Shader to complete before executing, but again, 
I'm not entirely sure.

I figure there are two possible approaches to resolving this:

1.   Figure out the right way to get the Compute Shader and Pixel Shader to 
interact properly

2.   Do away with the need for the Pixel Shader by doing the image transfer 
entirely within the Compute Shader context. (Probably a lot of driver code to 
replace the existing routines that use the vbo and blitter draw routines?)

>From reviewing the R600/Evergreen register documentation, I see that the 
>CB_COLOR#_INFO registers have a RAT bit (bit 26 in GPU registers 
>0x28c70-0x28ea4)
I also found that if this flag is set, that the surface is treated as a RAT and 
can only be manipulated by Compute Shader operations. (Which I suppose is the 
cause of the conflict between the Pixel Shader and Compute Shader trying to 
manipulate the same Color buffer/Texture.)

My biggest issue with this, is I have not really found any documentation that 
describes how you are supposed to transfer buffers/textures within a compute 
shader, so I feel like I am missing something that might be a very basic 
foundation for understanding these routines, which is resulting in my 
overcomplicating the concepts and confusing myself...

If anyone is familiar with this area and is willing to provide some more 
insight, I would greatly appreciate it.

While our team's goal is to implement OpenCL capability in an alternate 
operating system, my hope is that once I understand all of this and get it 
working in that environment, I will be able to contribute back Clover image 
support to the main Mesa baseline.

Thanks,
Al Dorrington
Software Engineer Sr
Lockheed Martin, Mission Systems and Training

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mixing Pixel Shaders and Compute Shaders

2014-04-23 Thread Tom Stellard
On Wed, Apr 23, 2014 at 01:27:11PM +, Dorrington, Albert wrote:
> In trying to implement Image support in Clover, I have discovered that the 
> existing CL image related calls result in the generation of Pixel Shader 
> sequences for copies of images to and from the GPU.
> 
> I initially thought that this would be fine, and was able to implement image 
> read tests that use clEnqueueWriteImage() to get an image into a kernel.
> The clEnqueueWriteImage(), through the routines in clover/api/transfer.cpp 
> generates a Pixel shader which copies the image to the GPU.
> The Compute Shader then picks the image up from where the Pixel Shader left 
> it.
> 
> I had some issues initially with mixing the Pixel and Compute Shaders, until 
> I added a PS_PARTIAL_FLUSH event along with the CS_PARTIAL_FLUSH event at the 
> start of evergreen_init_atom_start_compute_cs(). I think this helped because 
> it made the Pixel Shader Execute before the Compute Shader (not entirely 
> sure?)
>

PS_PARTIAL_FLUSH guarantees all pixel shader have completed before the
next packet is processed, so we should be using this any place where we
may need the result of a pixel shader.
 
> When I try to call clEnqueueReadImage(), after a clEnqueueNDRangeKernel(); 
> the clover/aop/transfer.cpp again generates a Pixel Shader, which gets 
> integrated into the command stream after the Compute Shader entries (so I 
> send up with:
> 
> Command Sequence
> 1  [Initial Configuration]
> 2  [PixelShader]
Need PS_PARTIAL_FLUSH here
> 3  [ComputeShader]
Need CS_PARTIAL_FLUSH here
> 4  [PixelShader]
Need PS_PARTIAL_FLUSH here
> 5  [Final Configuration/Cleanup/Wait]
> 

You also may need to flush the various caches after the pixel shader
and compute shaders have completed.  See r600_flush_emit in r600_hw_context.c

> The problem is, now I am encountering GPU Lockup CP Stalls at the end of 
> 'section 4' and the start of 'section 5'
> I am not sure I entirely understand why this is happening, but I know it has 
> to do with the fact that the Pixel Shader is in the command stream after the 
> Compute Shader commands.
> I'm assuming something in how the flushes are configured for the Pixel Shader 
> are not waiting for the Compute Shader to complete before executing, but 
> again, I'm not entirely sure.
> 
> I figure there are two possible approaches to resolving this:
> 
> 1.   Figure out the right way to get the Compute Shader and Pixel Shader 
> to interact properly
> 
> 2.   Do away with the need for the Pixel Shader by doing the image 
> transfer entirely within the Compute Shader context. (Probably a lot of 
> driver code to replace the existing routines that use the vbo and blitter 
> draw routines?)
> 
> From reviewing the R600/Evergreen register documentation, I see that the 
> CB_COLOR#_INFO registers have a RAT bit (bit 26 in GPU registers 
> 0x28c70-0x28ea4)
> I also found that if this flag is set, that the surface is treated as a RAT 
> and can only be manipulated by Compute Shader operations. (Which I suppose is 
> the cause of the conflict between the Pixel Shader and Compute Shader trying 
> to manipulate the same Color buffer/Texture.)
>

Does the documentation say that setting the RAT bit means it can only be
used by compute shaders, or have you discovered this from your testing?
If this is the case, you may have to use a different CB_COLOR# for the
image when coying it with a pixel shader.

 
> My biggest issue with this, is I have not really found any documentation that 
> describes how you are supposed to transfer buffers/textures within a compute 
> shader, so I feel like I am missing something that might be a very basic 
> foundation for understanding these routines, which is resulting in my 
> overcomplicating the concepts and confusing myself...
> 
> If anyone is familiar with this area and is willing to provide some more 
> insight, I would greatly appreciate it.
> 
> While our team's goal is to implement OpenCL capability in an alternate 
> operating system, my hope is that once I understand all of this and get it 
> working in that environment, I will be able to contribute back Clover image 
> support to the main Mesa baseline.
> 

Looking forward to your contribution.  As always the sooner you can post the 
code the better
as that will make it easier to review and may help uncover some of your issues.

-Tom

> Thanks,
> Al Dorrington
> Software Engineer Sr
> Lockheed Martin, Mission Systems and Training
> 

> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] util/u_debug: Pass correct size to strncat.

2014-04-23 Thread jfonseca
From: José Fonseca 

Courtesy of Clang static analyzer.
---
 src/gallium/auxiliary/util/u_debug.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug.c 
b/src/gallium/auxiliary/util/u_debug.c
index fe51717..dc840e8 100644
--- a/src/gallium/auxiliary/util/u_debug.c
+++ b/src/gallium/auxiliary/util/u_debug.c
@@ -334,10 +334,10 @@ debug_dump_flags(const struct debug_named_value *names,
while(names->name) {
   if((names->value & value) == names->value) {
 if (!first)
-   util_strncat(output, "|", sizeof(output));
+   util_strncat(output, "|", sizeof(output) - strlen(output) - 1);
 else
first = 0;
-util_strncat(output, names->name, sizeof(output) - 1);
+util_strncat(output, names->name, sizeof(output) - strlen(output) - 1);
 output[sizeof(output) - 1] = '\0';
 value &= ~names->value;
   }
@@ -346,12 +346,12 @@ debug_dump_flags(const struct debug_named_value *names,

if (value) {
   if (!first)
-util_strncat(output, "|", sizeof(output));
+util_strncat(output, "|", sizeof(output) - strlen(output) - 1);
   else
 first = 0;
   
   util_snprintf(rest, sizeof(rest), "0x%08lx", value);
-  util_strncat(output, rest, sizeof(output) - 1);
+  util_strncat(output, rest, sizeof(output) - strlen(output) - 1);
   output[sizeof(output) - 1] = '\0';
}

-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] mesa/st: Fix pipe_framebuffer_state::height for PIPE_TEXTURE_1D_ARRAY.

2014-04-23 Thread jfonseca
From: José Fonseca 

This prevents buffer overflow w/ llvmpipe when running piglit

  bin/gl-3.2-layered-rendering-clear-color-all-types 1d_array single_level -fbo 
-auto

v2: Compute the framebuffer size as the minimum size, as pointed out by
Brian;  compacted code;  ran piglit quick test list (with no
regressions.)
---
 src/mesa/state_tracker/st_atom_framebuffer.c | 33 ++--
 1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c 
b/src/mesa/state_tracker/st_atom_framebuffer.c
index 4c4f839..f395ec7 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -31,6 +31,8 @@
   *   Brian Paul
   */
  
+#include 
+
 #include "st_context.h"
 #include "st_atom.h"
 #include "st_cb_bitmap.h"
@@ -44,6 +46,24 @@
 
 
 /**
+ * Update framebuffer size.
+ *
+ * framebuffer->width should match fb->Weight, but for PIPE_TEXTURE_1D_ARRAY
+ * textures fb->Height has the number of layers, and not the surface height.
+ */
+static void
+update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
+struct pipe_surface *surface)
+{
+   assert(surface);
+   assert(surface->width  < UINT_MAX);
+   assert(surface->height < UINT_MAX);
+   framebuffer->width  = MIN2(framebuffer->width,  surface->width);
+   framebuffer->height = MIN2(framebuffer->height, surface->height);
+}
+
+
+/**
  * Update framebuffer state (color, depth, stencil, etc. buffers)
  */
 static void
@@ -57,11 +77,12 @@ update_framebuffer_state( struct st_context *st )
st_flush_bitmap_cache(st);
 
st->state.fb_orientation = st_fb_orientation(fb);
-   framebuffer->width = fb->Width;
-   framebuffer->height = fb->Height;
 
/*printf("-- fb size %d x %d\n", fb->Width, fb->Height);*/
 
+   framebuffer->width  = UINT_MAX;
+   framebuffer->height = UINT_MAX;
+
/* Examine Mesa's ctx->DrawBuffer->_ColorDrawBuffers state
 * to determine which surfaces to draw to
 */
@@ -81,6 +102,7 @@ update_framebuffer_state( struct st_context *st )
 
  if (strb->surface) {
 pipe_surface_reference(&framebuffer->cbufs[i], strb->surface);
+update_framebuffer_size(framebuffer, strb->surface);
  }
  strb->defined = GL_TRUE; /* we'll be drawing something */
   }
@@ -100,12 +122,14 @@ update_framebuffer_state( struct st_context *st )
  st_update_renderbuffer_surface(st, strb);
   }
   pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
+  update_framebuffer_size(framebuffer, strb->surface);
}
else {
   strb = st_renderbuffer(fb->Attachment[BUFFER_STENCIL].Renderbuffer);
   if (strb) {
  assert(strb->surface);
  pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
+ update_framebuffer_size(framebuffer, strb->surface);
   }
   else
  pipe_surface_reference(&framebuffer->zsbuf, NULL);
@@ -122,6 +146,11 @@ update_framebuffer_state( struct st_context *st )
}
 #endif
 
+   /* _mesa_test_framebuffer_completeness refuses framebuffers with no
+* attachments, so this should never happen. */
+   assert(framebuffer->width  != UINT_MAX);
+   assert(framebuffer->height != UINT_MAX);
+
cso_set_framebuffer(st->cso_context, framebuffer);
 }
 
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] swrast: Fix vertex color in _swsetup_Translate()

2014-04-23 Thread ville . syrjala
From: nick 

Straightforward fix to properly load dest->color with color data, as
opposed to position data as previously implemented.

[vsyrjala: I noticed the patch languishing in bugzilla.
 It looks correct to me so I refreshed it for master.

 Looks like it's a regression introduced in:
  commit a164d3aee063580503e5e9a77980059d52c486d7
  Author: Brian 
  Date:   Fri Jul 6 16:51:19 2007 -0600

In _swsetup_Translate(), update dest->attrib[FRAG_ATTRIB_COL0].]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27499
Cc: Brian Paul 
Signed-off-by: Ville Syrjälä 
---
 src/mesa/swrast_setup/ss_context.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/swrast_setup/ss_context.c 
b/src/mesa/swrast_setup/ss_context.c
index 12a4735..237f74c 100644
--- a/src/mesa/swrast_setup/ss_context.c
+++ b/src/mesa/swrast_setup/ss_context.c
@@ -287,7 +287,8 @@ _swsetup_Translate( struct gl_context *ctx, const void 
*vertex, SWvertex *dest )
 
_tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR0,
   dest->attrib[VARYING_SLOT_COL0] );
-   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest->color, tmp );
+   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest->color,
+ dest->attrib[VARYING_SLOT_COL0] );
 
_tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR1,
   dest->attrib[VARYING_SLOT_COL1]);
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] util/u_debug: Pass correct size to strncat.

2014-04-23 Thread Brian Paul

On 04/23/2014 07:55 AM, jfons...@vmware.com wrote:

From: José Fonseca 

Courtesy of Clang static analyzer.
---
  src/gallium/auxiliary/util/u_debug.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_debug.c 
b/src/gallium/auxiliary/util/u_debug.c
index fe51717..dc840e8 100644
--- a/src/gallium/auxiliary/util/u_debug.c
+++ b/src/gallium/auxiliary/util/u_debug.c
@@ -334,10 +334,10 @@ debug_dump_flags(const struct debug_named_value *names,
 while(names->name) {
if((names->value & value) == names->value) {
 if (!first)
-   util_strncat(output, "|", sizeof(output));
+   util_strncat(output, "|", sizeof(output) - strlen(output) - 1);
 else
first = 0;
-util_strncat(output, names->name, sizeof(output) - 1);
+util_strncat(output, names->name, sizeof(output) - strlen(output) - 1);
 output[sizeof(output) - 1] = '\0';
 value &= ~names->value;
}
@@ -346,12 +346,12 @@ debug_dump_flags(const struct debug_named_value *names,

 if (value) {
if (!first)
-util_strncat(output, "|", sizeof(output));
+util_strncat(output, "|", sizeof(output) - strlen(output) - 1);
else
 first = 0;

util_snprintf(rest, sizeof(rest), "0x%08lx", value);
-  util_strncat(output, rest, sizeof(output) - 1);
+  util_strncat(output, rest, sizeof(output) - strlen(output) - 1);
output[sizeof(output) - 1] = '\0';
 }




Reviewed-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa/st: Fix pipe_framebuffer_state::height for PIPE_TEXTURE_1D_ARRAY.

2014-04-23 Thread Brian Paul

On 04/23/2014 07:55 AM, jfons...@vmware.com wrote:

From: José Fonseca 

This prevents buffer overflow w/ llvmpipe when running piglit

   bin/gl-3.2-layered-rendering-clear-color-all-types 1d_array single_level 
-fbo -auto

v2: Compute the framebuffer size as the minimum size, as pointed out by
Brian;  compacted code;  ran piglit quick test list (with no
regressions.)
---
  src/mesa/state_tracker/st_atom_framebuffer.c | 33 ++--
  1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c 
b/src/mesa/state_tracker/st_atom_framebuffer.c
index 4c4f839..f395ec7 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -31,6 +31,8 @@
*   Brian Paul
*/

+#include 
+
  #include "st_context.h"
  #include "st_atom.h"
  #include "st_cb_bitmap.h"
@@ -44,6 +46,24 @@


  /**
+ * Update framebuffer size.
+ *
+ * framebuffer->width should match fb->Weight, but for PIPE_TEXTURE_1D_ARRAY


"fb->Width"



+ * textures fb->Height has the number of layers, and not the surface height.
+ */


The comment seems a bit disconnected from the code. 
update_framebuffer_size() is used to find the size which is the min of 
the attached surfaces.  The comment about 1D array textures doesn't seem 
to matter in the code.  That just seems a little confusing.




+static void
+update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
+struct pipe_surface *surface)
+{
+   assert(surface);
+   assert(surface->width  < UINT_MAX);
+   assert(surface->height < UINT_MAX);
+   framebuffer->width  = MIN2(framebuffer->width,  surface->width);
+   framebuffer->height = MIN2(framebuffer->height, surface->height);
+}
+
+
+/**
   * Update framebuffer state (color, depth, stencil, etc. buffers)
   */
  static void
@@ -57,11 +77,12 @@ update_framebuffer_state( struct st_context *st )
 st_flush_bitmap_cache(st);

 st->state.fb_orientation = st_fb_orientation(fb);
-   framebuffer->width = fb->Width;
-   framebuffer->height = fb->Height;

 /*printf("-- fb size %d x %d\n", fb->Width, fb->Height);*/

+   framebuffer->width  = UINT_MAX;
+   framebuffer->height = UINT_MAX;
+
 /* Examine Mesa's ctx->DrawBuffer->_ColorDrawBuffers state
  * to determine which surfaces to draw to
  */
@@ -81,6 +102,7 @@ update_framebuffer_state( struct st_context *st )

   if (strb->surface) {
  pipe_surface_reference(&framebuffer->cbufs[i], strb->surface);
+update_framebuffer_size(framebuffer, strb->surface);
   }
   strb->defined = GL_TRUE; /* we'll be drawing something */
}
@@ -100,12 +122,14 @@ update_framebuffer_state( struct st_context *st )
   st_update_renderbuffer_surface(st, strb);
}
pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
+  update_framebuffer_size(framebuffer, strb->surface);
 }
 else {
strb = st_renderbuffer(fb->Attachment[BUFFER_STENCIL].Renderbuffer);
if (strb) {
   assert(strb->surface);
   pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
+ update_framebuffer_size(framebuffer, strb->surface);
}
else
   pipe_surface_reference(&framebuffer->zsbuf, NULL);
@@ -122,6 +146,11 @@ update_framebuffer_state( struct st_context *st )
 }
  #endif

+   /* _mesa_test_framebuffer_completeness refuses framebuffers with no
+* attachments, so this should never happen. */


Close */ on next line.



+   assert(framebuffer->width  != UINT_MAX);
+   assert(framebuffer->height != UINT_MAX);
+
 cso_set_framebuffer(st->cso_context, framebuffer);
  }




Otherwise, Reviewed-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] EXTERNAL: Re: Mixing Pixel Shaders and Compute Shaders

2014-04-23 Thread Dorrington, Albert


> -Original Message-
> From: Tom Stellard> 
> On Wed, Apr 23, 2014 at 01:27:11PM +, Dorrington, Albert wrote:

> > When I try to call clEnqueueReadImage(), after a
> clEnqueueNDRangeKernel(); the clover/aop/transfer.cpp again generates a
> Pixel Shader, which gets integrated into the command stream after the
> Compute Shader entries (so I send up with:

Command Sequence
1  [Initial Configuration]
2  [PixelShader]
Need PS_PARTIAL_FLUSH here <-- this one I think I have with what I added to 
evergren_init_atom_start_compute_cs()
3  [ComputeShader]
Need CS_PARTIAL_FLUSH here
4  [PixelShader]
Need PS_PARTIAL_FLUSH here
5  [Final Configuration/Cleanup/Wait]

The other two, I've been experimenting trying to add them in various places, 
but I haevn't seen a change in behavior yet (probably not putting them in the 
right place yet...)

> 
> You also may need to flush the various caches after the pixel shader and
> compute shaders have completed.  See r600_flush_emit in
> r600_hw_context.c

That could be why I'm not seeing any changes yet... I have been looking at 
r600_flush_emit() this morning and experimenting with its use.


> Does the documentation say that setting the RAT bit means it can only be
> used by compute shaders, or have you discovered this from your testing?
> If this is the case, you may have to use a different CB_COLOR# for the image
> when coying it with a pixel shader.

The documentation I'm referring to is "Radeon Evergreen/Northern Islands 
Acceleration" Rev 1.0 Dated May 24, 2011
Section 8 CB Programming
Section 8.6 Compute Shader

Compute shaders can perform atomic writes ("device reduction 
operations") to memory via the CB. The order of 
execution of the operations is not guaranteed, only that they are 
atomic. These writes can include simple operations
(min, max, add, and, or, exchange, compare-exchange) and can optionally 
return a value (pre-op) back to the shader.

The CF_export adds two new opcodes for RAT exports: EXPORT_RAT and 
EXPORT_RAT_CACHELESS.

If CB_COLOR_INFO.RAT is programmed, the surface is treated as a 
Random Access Target and can only be
drawn by Compute Shader operations. A set of MRTs can be configured for 
RATs and normal rendering. The only
stipulation is that all RAT MRTs must be assigned to higher number MRTs 
than normal rendering MRTs.

I take the statement in the 3rd paragraph to mean that if the RAT bit is set, 
that a CB setup within a Compute Shader
will not work in a Pixel Shader. However, looking at the command stream 
executed, the 'r600_draw_vbo()' function 
that gets called appears to reconfigure all of the Color buffers, so I don't 
think this is the issue causing the conflict.
I'm getting more convinced that the issue is that the Compute Shader needs to 
run completely, before the Pixel Shader runs.
(presumably the need for the CS_PARTIAL_FLUSH and PS_PARTIAL_FLUSH directives)

Frustrating thing is, sometimes DRM can recover from these stalls, other times 
my box locks up.

> >
> 
> Looking forward to your contribution.  As always the sooner you can post the
> code the better as that will make it easier to review and may help uncover
> some of your issues.
> 

I'd love to contribute some of what I've done back soon, unfortunately with our 
teams choice to use the AMD SDK OpenCL compiler rather than LLVM, I can't 
easily migrate my changes back to the Mesa baseline. We have quite a few 
work-arounds (hacks? haha) to setup the RATs and Vertex Buffers the way the AMD 
compiler needs them. Once the crazy hours at work slowdown, I'll be able to 
have some more free time at home where i can contribute. :)

After I reboot the hung box (yet again) I'll be taking another look to verify 
if the partial flushes are in the command stream or not.

Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swrast: Fix vertex color in _swsetup_Translate()

2014-04-23 Thread Brian Paul

On 04/23/2014 08:18 AM, ville.syrj...@linux.intel.com wrote:

From: nick 

Straightforward fix to properly load dest->color with color data, as
opposed to position data as previously implemented.

[vsyrjala: I noticed the patch languishing in bugzilla.
  It looks correct to me so I refreshed it for master.

  Looks like it's a regression introduced in:
   commit a164d3aee063580503e5e9a77980059d52c486d7
   Author: Brian 
   Date:   Fri Jul 6 16:51:19 2007 -0600

 In _swsetup_Translate(), update dest->attrib[FRAG_ATTRIB_COL0].]

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27499
Cc: Brian Paul 
Signed-off-by: Ville Syrjälä 
---
  src/mesa/swrast_setup/ss_context.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/swrast_setup/ss_context.c 
b/src/mesa/swrast_setup/ss_context.c
index 12a4735..237f74c 100644
--- a/src/mesa/swrast_setup/ss_context.c
+++ b/src/mesa/swrast_setup/ss_context.c
@@ -287,7 +287,8 @@ _swsetup_Translate( struct gl_context *ctx, const void 
*vertex, SWvertex *dest )

 _tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR0,
dest->attrib[VARYING_SLOT_COL0] );
-   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest->color, tmp );
+   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest->color,
+ dest->attrib[VARYING_SLOT_COL0] );

 _tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR1,
dest->attrib[VARYING_SLOT_COL1]);



Reviewed-by: Brian Paul 

Do you need someone to commit this for you?

Tag for stable branches?

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 77789] Account request for Andreas Hartmetz

2014-04-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77789

Brian Paul  changed:

   What|Removed |Added

   Assignee|mesa-dev@lists.freedesktop. |sitewranglers@lists.freedes
   |org |ktop.org
Product|Mesa|freedesktop.org
  Component|Other   |New Accounts

--- Comment #2 from Brian Paul  ---
Reassigning to fd.o admins.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 77749] PRAGMA_EXPORT_SUPPORTED defined incorrectly on OS X

2014-04-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77749

--- Comment #4 from Brian Paul  ---
The patch looks OK to me.  I'm always happy to get rid of old system-specific
crud.

But I can't apply the patch here to test it:

$ patch -p0 < ~/mesaport.diff 
patching file include/GL/gl.h
patch:  malformed patch at line 16: @@ -92,10 +87,6 @@

Do I need a special option for patch?

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swrast: Fix vertex color in _swsetup_Translate()

2014-04-23 Thread Ville Syrjälä
On Wed, Apr 23, 2014 at 08:40:22AM -0600, Brian Paul wrote:
> On 04/23/2014 08:18 AM, ville.syrj...@linux.intel.com wrote:
> > From: nick 
> >
> > Straightforward fix to properly load dest->color with color data, as
> > opposed to position data as previously implemented.
> >
> > [vsyrjala: I noticed the patch languishing in bugzilla.
> >   It looks correct to me so I refreshed it for master.
> >
> >   Looks like it's a regression introduced in:
> >commit a164d3aee063580503e5e9a77980059d52c486d7
> >Author: Brian 
> >Date:   Fri Jul 6 16:51:19 2007 -0600
> >
> >  In _swsetup_Translate(), update dest->attrib[FRAG_ATTRIB_COL0].]
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27499
> > Cc: Brian Paul 
> > Signed-off-by: Ville Syrjälä 
> > ---
> >   src/mesa/swrast_setup/ss_context.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/swrast_setup/ss_context.c 
> > b/src/mesa/swrast_setup/ss_context.c
> > index 12a4735..237f74c 100644
> > --- a/src/mesa/swrast_setup/ss_context.c
> > +++ b/src/mesa/swrast_setup/ss_context.c
> > @@ -287,7 +287,8 @@ _swsetup_Translate( struct gl_context *ctx, const void 
> > *vertex, SWvertex *dest )
> >
> >  _tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR0,
> > dest->attrib[VARYING_SLOT_COL0] );
> > -   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest->color, tmp );
> > +   UNCLAMPED_FLOAT_TO_RGBA_CHAN( dest->color,
> > + dest->attrib[VARYING_SLOT_COL0] );
> >
> >  _tnl_get_attr( ctx, vertex, _TNL_ATTRIB_COLOR1,
> > dest->attrib[VARYING_SLOT_COL1]);
> >
> 
> Reviewed-by: Brian Paul 
> 
> Do you need someone to commit this for you?

Yes please.

> 
> Tag for stable branches?

Yeah that would seem appropriate. The offending commit has been there
since 7.2, but I guess 10.1 is the only still active stable branch?

-- 
Ville Syrjälä
Intel OTC
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa/st: Fix pipe_framebuffer_state::height for PIPE_TEXTURE_1D_ARRAY.

2014-04-23 Thread Jose Fonseca
Thanks for the review.

- Original Message -
> On 04/23/2014 07:55 AM, jfons...@vmware.com wrote:
> > From: José Fonseca 
> >
> > This prevents buffer overflow w/ llvmpipe when running piglit
> >
> >bin/gl-3.2-layered-rendering-clear-color-all-types 1d_array single_level
> >-fbo -auto
> >
> > v2: Compute the framebuffer size as the minimum size, as pointed out by
> > Brian;  compacted code;  ran piglit quick test list (with no
> > regressions.)
> > ---
> >   src/mesa/state_tracker/st_atom_framebuffer.c | 33
> >   ++--
> >   1 file changed, 31 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c
> > b/src/mesa/state_tracker/st_atom_framebuffer.c
> > index 4c4f839..f395ec7 100644
> > --- a/src/mesa/state_tracker/st_atom_framebuffer.c
> > +++ b/src/mesa/state_tracker/st_atom_framebuffer.c
> > @@ -31,6 +31,8 @@
> > *   Brian Paul
> > */
> >
> > +#include 
> > +
> >   #include "st_context.h"
> >   #include "st_atom.h"
> >   #include "st_cb_bitmap.h"
> > @@ -44,6 +46,24 @@
> >
> >
> >   /**
> > + * Update framebuffer size.
> > + *
> > + * framebuffer->width should match fb->Weight, but for
> > PIPE_TEXTURE_1D_ARRAY
> 
> "fb->Width"
> 
> 
> > + * textures fb->Height has the number of layers, and not the surface
> > height.
> > + */
> 
> The comment seems a bit disconnected from the code.
> update_framebuffer_size() is used to find the size which is the min of
> the attached surfaces.  The comment about 1D array textures doesn't seem
> to matter in the code.  That just seems a little confusing.

Yes, the update_framebuffer_size finds the min size, which I think is obvious.  
This comment here was supposed to explain why we do it when gl_framebuffer has 
similar info, ie., the less obvious bit.

But I agree that the comment could be better phrased.  What about this?

   "We need to derive pipe_framebuffer size from the bound pipe_surfaces here 
instead of copying gl_framebuffer size because for certain target types (like 
PIPE_TEXTURE_1D_ARRAY) gl_framebuffer::Height has the number of layers instead 
of 1."

Jose



> > +static void
> > +update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
> > +struct pipe_surface *surface)
> > +{
> > +   assert(surface);
> > +   assert(surface->width  < UINT_MAX);
> > +   assert(surface->height < UINT_MAX);
> > +   framebuffer->width  = MIN2(framebuffer->width,  surface->width);
> > +   framebuffer->height = MIN2(framebuffer->height, surface->height);
> > +}
> > +
> > +
> > +/**
> >* Update framebuffer state (color, depth, stencil, etc. buffers)
> >*/
> >   static void
> > @@ -57,11 +77,12 @@ update_framebuffer_state( struct st_context *st )
> >  st_flush_bitmap_cache(st);
> >
> >  st->state.fb_orientation = st_fb_orientation(fb);
> > -   framebuffer->width = fb->Width;
> > -   framebuffer->height = fb->Height;
> >
> >  /*printf("-- fb size %d x %d\n", fb->Width, fb->Height);*/
> >
> > +   framebuffer->width  = UINT_MAX;
> > +   framebuffer->height = UINT_MAX;
> > +
> >  /* Examine Mesa's ctx->DrawBuffer->_ColorDrawBuffers state
> >   * to determine which surfaces to draw to
> >   */
> > @@ -81,6 +102,7 @@ update_framebuffer_state( struct st_context *st )
> >
> >if (strb->surface) {
> >   pipe_surface_reference(&framebuffer->cbufs[i],
> >   strb->surface);
> > +update_framebuffer_size(framebuffer, strb->surface);
> >}
> >strb->defined = GL_TRUE; /* we'll be drawing something */
> > }
> > @@ -100,12 +122,14 @@ update_framebuffer_state( struct st_context *st )
> >st_update_renderbuffer_surface(st, strb);
> > }
> > pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
> > +  update_framebuffer_size(framebuffer, strb->surface);
> >  }
> >  else {
> > strb =
> > st_renderbuffer(fb->Attachment[BUFFER_STENCIL].Renderbuffer);
> > if (strb) {
> >assert(strb->surface);
> >pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
> > + update_framebuffer_size(framebuffer, strb->surface);
> > }
> > else
> >pipe_surface_reference(&framebuffer->zsbuf, NULL);
> > @@ -122,6 +146,11 @@ update_framebuffer_state( struct st_context *st )
> >  }
> >   #endif
> >
> > +   /* _mesa_test_framebuffer_completeness refuses framebuffers with no
> > +* attachments, so this should never happen. */
> 
> Close */ on next line.
> 
> 
> > +   assert(framebuffer->width  != UINT_MAX);
> > +   assert(framebuffer->height != UINT_MAX);
> > +
> >  cso_set_framebuffer(st->cso_context, framebuffer);
> >   }
> >
> >
> 
> Otherwise, Reviewed-by: Brian Paul 
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa/st: Fix pipe_framebuffer_state::height for PIPE_TEXTURE_1D_ARRAY.

2014-04-23 Thread Brian Paul

On 04/23/2014 09:17 AM, Jose Fonseca wrote:

Thanks for the review.

- Original Message -

On 04/23/2014 07:55 AM, jfons...@vmware.com wrote:

From: José Fonseca 

This prevents buffer overflow w/ llvmpipe when running piglit

bin/gl-3.2-layered-rendering-clear-color-all-types 1d_array single_level
-fbo -auto

v2: Compute the framebuffer size as the minimum size, as pointed out by
Brian;  compacted code;  ran piglit quick test list (with no
regressions.)
---
   src/mesa/state_tracker/st_atom_framebuffer.c | 33
   ++--
   1 file changed, 31 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c
b/src/mesa/state_tracker/st_atom_framebuffer.c
index 4c4f839..f395ec7 100644
--- a/src/mesa/state_tracker/st_atom_framebuffer.c
+++ b/src/mesa/state_tracker/st_atom_framebuffer.c
@@ -31,6 +31,8 @@
 *   Brian Paul
 */

+#include 
+
   #include "st_context.h"
   #include "st_atom.h"
   #include "st_cb_bitmap.h"
@@ -44,6 +46,24 @@


   /**
+ * Update framebuffer size.
+ *
+ * framebuffer->width should match fb->Weight, but for
PIPE_TEXTURE_1D_ARRAY


"fb->Width"



+ * textures fb->Height has the number of layers, and not the surface
height.
+ */


The comment seems a bit disconnected from the code.
update_framebuffer_size() is used to find the size which is the min of
the attached surfaces.  The comment about 1D array textures doesn't seem
to matter in the code.  That just seems a little confusing.


Yes, the update_framebuffer_size finds the min size, which I think is obvious.  
This comment here was supposed to explain why we do it when gl_framebuffer has 
similar info, ie., the less obvious bit.

But I agree that the comment could be better phrased.  What about this?

"We need to derive pipe_framebuffer size from the bound pipe_surfaces here 
instead of copying gl_framebuffer size because for certain target types (like 
PIPE_TEXTURE_1D_ARRAY) gl_framebuffer::Height has the number of layers instead of 1."


That sounds great.

-Brian



Jose




+static void
+update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
+struct pipe_surface *surface)
+{
+   assert(surface);
+   assert(surface->width  < UINT_MAX);
+   assert(surface->height < UINT_MAX);
+   framebuffer->width  = MIN2(framebuffer->width,  surface->width);
+   framebuffer->height = MIN2(framebuffer->height, surface->height);
+}
+
+
+/**
* Update framebuffer state (color, depth, stencil, etc. buffers)
*/
   static void
@@ -57,11 +77,12 @@ update_framebuffer_state( struct st_context *st )
  st_flush_bitmap_cache(st);

  st->state.fb_orientation = st_fb_orientation(fb);
-   framebuffer->width = fb->Width;
-   framebuffer->height = fb->Height;

  /*printf("-- fb size %d x %d\n", fb->Width, fb->Height);*/

+   framebuffer->width  = UINT_MAX;
+   framebuffer->height = UINT_MAX;
+
  /* Examine Mesa's ctx->DrawBuffer->_ColorDrawBuffers state
   * to determine which surfaces to draw to
   */
@@ -81,6 +102,7 @@ update_framebuffer_state( struct st_context *st )

if (strb->surface) {
   pipe_surface_reference(&framebuffer->cbufs[i],
   strb->surface);
+update_framebuffer_size(framebuffer, strb->surface);
}
strb->defined = GL_TRUE; /* we'll be drawing something */
 }
@@ -100,12 +122,14 @@ update_framebuffer_state( struct st_context *st )
st_update_renderbuffer_surface(st, strb);
 }
 pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
+  update_framebuffer_size(framebuffer, strb->surface);
  }
  else {
 strb =
 st_renderbuffer(fb->Attachment[BUFFER_STENCIL].Renderbuffer);
 if (strb) {
assert(strb->surface);
pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
+ update_framebuffer_size(framebuffer, strb->surface);
 }
 else
pipe_surface_reference(&framebuffer->zsbuf, NULL);
@@ -122,6 +146,11 @@ update_framebuffer_state( struct st_context *st )
  }
   #endif

+   /* _mesa_test_framebuffer_completeness refuses framebuffers with no
+* attachments, so this should never happen. */


Close */ on next line.



+   assert(framebuffer->width  != UINT_MAX);
+   assert(framebuffer->height != UINT_MAX);
+
  cso_set_framebuffer(st->cso_context, framebuffer);
   }




Otherwise, Reviewed-by: Brian Paul 



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [cfe-dev] 3 element vectors in opencl 1.1+

2014-04-23 Thread Jan Vesely
On Tue, 2014-04-22 at 17:50 -0700, Matt Arsenault wrote:


> >> I think this is what v96:128 is for
> > according to [0], it specifies only alignment, not size. I could not
> > find an __attribute__ that would change size either.
> >
> > It should be possible to have ADMGPUDataLayout: public DataLayout class
> > that would intercept the call and fix the reported value, but I think it
> > would only move the hack to different place.
> >
> > I have added pocl-devel list as suggested.
> >
> > regards,
> > Jan
> >
> > [0]http://llvm.org/docs/LangRef.html#data-layout
> >
> 
> Only the size in memory matters, which is what the required alignment 
> specifies. DataLayout::getTypeAllocSize accounts for the alignment, but 
> getTypeStoreSize does not. I actually thought this was half of what 
> getTypeStoreSize was for, but it turns out it isn't.

hm, I always thought that alignment only puts restrictions on starting
address and using padding was just a tool to do the job.

anyway, thanks for the hint, using getTypeAllocSize works nicely.
since we are allocating space in the argument vector I think
getAllocSize is the right function to use.

I'll post a patch.

regards,
Jan


-- 
Jan Vesely 


signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/1] clover: use getTypeAllocSize() for kernel arguments

2014-04-23 Thread Jan Vesely
3 element vectors have the size of 4 element ones.
See Ch 6.1.5 of OCL 1.1+ specs

Thx Matt Arsenault for the hint

Signed-off-by: Jan Vesely 
---
 src/gallium/state_trackers/clover/llvm/invocation.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp 
b/src/gallium/state_trackers/clover/llvm/invocation.cpp
index a81bdf8..a2022b4 100644
--- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
+++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
@@ -308,7 +308,7 @@ namespace {
 #endif
 
 llvm::Type *arg_type = arg.getType();
-unsigned arg_size = TD.getTypeStoreSize(arg_type);
+unsigned arg_size = TD.getTypeAllocSize(arg_type);
 
 llvm::Type *target_type = arg_type->isIntegerTy() ?
TD.getSmallestLegalIntType(mod->getContext(), arg_size * 8) :
-- 
1.9.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] swrast: move texture_slices() calls out of loops

2014-04-23 Thread Brian Paul
---
 src/mesa/swrast/s_texture.c |   12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
index 9273e94..5fd80ca 100644
--- a/src/mesa/swrast/s_texture.c
+++ b/src/mesa/swrast/s_texture.c
@@ -269,7 +269,7 @@ _swrast_map_texture(struct gl_context *ctx, struct 
gl_texture_object *texObj)
   for (level = texObj->BaseLevel; level < MAX_TEXTURE_LEVELS; level++) {
  struct gl_texture_image *texImage = texObj->Image[face][level];
  struct swrast_texture_image *swImage = swrast_texture_image(texImage);
- unsigned int i;
+ unsigned int i, slices;
 
  if (!texImage)
 continue;
@@ -289,7 +289,9 @@ _swrast_map_texture(struct gl_context *ctx, struct 
gl_texture_object *texObj)
continue;
  }
 
- for (i = 0; i < texture_slices(texImage); i++) {
+ slices = texture_slices(texImage);
+
+ for (i = 0; i < slices; i++) {
 GLubyte *map;
 GLint rowStride;
 
@@ -327,7 +329,7 @@ _swrast_unmap_texture(struct gl_context *ctx, struct 
gl_texture_object *texObj)
   for (level = texObj->BaseLevel; level < MAX_TEXTURE_LEVELS; level++) {
  struct gl_texture_image *texImage = texObj->Image[face][level];
  struct swrast_texture_image *swImage = swrast_texture_image(texImage);
- unsigned int i;
+ unsigned int i, slices;
 
  if (!texImage)
 continue;
@@ -338,7 +340,9 @@ _swrast_unmap_texture(struct gl_context *ctx, struct 
gl_texture_object *texObj)
  if (!swImage->ImageSlices)
 continue;
 
- for (i = 0; i < texture_slices(texImage); i++) {
+ slices = texture_slices(texImage);
+
+ for (i = 0; i < slices; i++) {
 if (swImage->ImageSlices[i]) {
ctx->Driver.UnmapTextureImage(ctx, texImage, i);
swImage->ImageSlices[i] = NULL;
-- 
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] swrast: move null pointer check earlier in _swrast_map_teximage()

2014-04-23 Thread Brian Paul
There's no reason to compute texel size, stride, etc. if there's no
image data to map.
---
 src/mesa/swrast/s_texture.c |   15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
index cbfa26b..9273e94 100644
--- a/src/mesa/swrast/s_texture.c
+++ b/src/mesa/swrast/s_texture.c
@@ -218,6 +218,15 @@ _swrast_map_teximage(struct gl_context *ctx,
 
check_map_teximage(texImage, slice, x, y, w, h);
 
+   if (!swImage->Buffer) {
+  /* Either glTexImage was called with a NULL  argument or
+   * we ran out of memory when allocating texture memory,
+   */
+  *mapOut = NULL;
+  *rowStrideOut = 0;
+  return;
+   }
+
texelSize = _mesa_get_format_bytes(texImage->TexFormat);
stride = _mesa_format_row_stride(texImage->TexFormat, texImage->Width);
_mesa_get_format_block_size(texImage->TexFormat, &bw, &bh);
@@ -225,12 +234,6 @@ _swrast_map_teximage(struct gl_context *ctx,
assert(x % bw == 0);
assert(y % bh == 0);
 
-   if (!swImage->Buffer) {
-  /* probably ran out of memory when allocating tex mem */
-  *mapOut = NULL;
-  return;
-   }
-
/* This function can only be used with a swrast-allocated buffer, in which
 * case ImageSlices is populated with pointers into Buffer.
 */
-- 
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] swrast: allocate swrast_texture_image::ImageSlices array if needed

2014-04-23 Thread Brian Paul
Fixes a segmentation fault in conform divzero.c test.
This happens when glTexImage(level, width=0, height=0) is called.  We
don't allocate texture memory in that case so the ImageSlices array
was never allocated.

Cc: "10.1" 
---
 src/mesa/swrast/s_texture.c |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
index c08a4e9..1d449a2 100644
--- a/src/mesa/swrast/s_texture.c
+++ b/src/mesa/swrast/s_texture.c
@@ -279,6 +279,13 @@ _swrast_map_texture(struct gl_context *ctx, struct 
gl_texture_object *texObj)
 continue;
  }
 
+ if (!swImage->ImageSlices) {
+swImage->ImageSlices =
+   calloc(texture_slices(texImage), sizeof(void *));
+if (!swImage->ImageSlices)
+   continue;
+ }
+
  for (i = 0; i < texture_slices(texImage); i++) {
 GLubyte *map;
 GLint rowStride;
@@ -325,6 +332,9 @@ _swrast_unmap_texture(struct gl_context *ctx, struct 
gl_texture_object *texObj)
  if (swImage->Buffer)
 return;
 
+ if (!swImage->ImageSlices)
+continue;
+
  for (i = 0; i < texture_slices(texImage); i++) {
 if (swImage->ImageSlices[i]) {
ctx->Driver.UnmapTextureImage(ctx, texImage, i);
-- 
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/4] swrast: remove _mesa_ prefix from static function

2014-04-23 Thread Brian Paul
And add a const qualifier.
---
 src/mesa/swrast/s_texture.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/swrast/s_texture.c b/src/mesa/swrast/s_texture.c
index 1d449a2..cbfa26b 100644
--- a/src/mesa/swrast/s_texture.c
+++ b/src/mesa/swrast/s_texture.c
@@ -177,8 +177,8 @@ _swrast_free_texture_image_buffer(struct gl_context *ctx,
  * Error checking for debugging only.
  */
 static void
-_mesa_check_map_teximage(struct gl_texture_image *texImage,
- GLuint slice, GLuint x, GLuint y, GLuint w, GLuint h)
+check_map_teximage(const struct gl_texture_image *texImage,
+   GLuint slice, GLuint x, GLuint y, GLuint w, GLuint h)
 {
 
if (texImage->TexObject->Target == GL_TEXTURE_1D)
@@ -216,7 +216,7 @@ _swrast_map_teximage(struct gl_context *ctx,
GLint stride, texelSize;
GLuint bw, bh;
 
-   _mesa_check_map_teximage(texImage, slice, x, y, w, h);
+   check_map_teximage(texImage, slice, x, y, w, h);
 
texelSize = _mesa_get_format_bytes(texImage->TexFormat);
stride = _mesa_format_row_stride(texImage->TexFormat, texImage->Width);
-- 
1.7.10.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] mesa/st: Fix pipe_framebuffer_state::height for PIPE_TEXTURE_1D_ARRAY.

2014-04-23 Thread Roland Scheidegger
Am 23.04.2014 17:22, schrieb Brian Paul:
> On 04/23/2014 09:17 AM, Jose Fonseca wrote:
>> Thanks for the review.
>>
>> - Original Message -
>>> On 04/23/2014 07:55 AM, jfons...@vmware.com wrote:
 From: José Fonseca 

 This prevents buffer overflow w/ llvmpipe when running piglit

 bin/gl-3.2-layered-rendering-clear-color-all-types 1d_array
 single_level
 -fbo -auto

 v2: Compute the framebuffer size as the minimum size, as pointed out by
 Brian;  compacted code;  ran piglit quick test list (with no
 regressions.)
 ---
src/mesa/state_tracker/st_atom_framebuffer.c | 33
++--
1 file changed, 31 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/state_tracker/st_atom_framebuffer.c
 b/src/mesa/state_tracker/st_atom_framebuffer.c
 index 4c4f839..f395ec7 100644
 --- a/src/mesa/state_tracker/st_atom_framebuffer.c
 +++ b/src/mesa/state_tracker/st_atom_framebuffer.c
 @@ -31,6 +31,8 @@
  *   Brian Paul
  */

 +#include 
 +
#include "st_context.h"
#include "st_atom.h"
#include "st_cb_bitmap.h"
 @@ -44,6 +46,24 @@


/**
 + * Update framebuffer size.
 + *
 + * framebuffer->width should match fb->Weight, but for
 PIPE_TEXTURE_1D_ARRAY
>>>
>>> "fb->Width"
>>>
>>>
 + * textures fb->Height has the number of layers, and not the surface
 height.
 + */
>>>
>>> The comment seems a bit disconnected from the code.
>>> update_framebuffer_size() is used to find the size which is the min of
>>> the attached surfaces.  The comment about 1D array textures doesn't seem
>>> to matter in the code.  That just seems a little confusing.
>>
>> Yes, the update_framebuffer_size finds the min size, which I think is
>> obvious.  This comment here was supposed to explain why we do it when
>> gl_framebuffer has similar info, ie., the less obvious bit.
>>
>> But I agree that the comment could be better phrased.  What about this?
>>
>> "We need to derive pipe_framebuffer size from the bound
>> pipe_surfaces here instead of copying gl_framebuffer size because for
>> certain target types (like PIPE_TEXTURE_1D_ARRAY)
>> gl_framebuffer::Height has the number of layers instead of 1."
> 
> That sounds great.
> 
> -Brian
> 
> 
>> Jose
>>
>>
>>
 +static void
 +update_framebuffer_size(struct pipe_framebuffer_state *framebuffer,
 +struct pipe_surface *surface)
 +{
 +   assert(surface);
 +   assert(surface->width  < UINT_MAX);
 +   assert(surface->height < UINT_MAX);
 +   framebuffer->width  = MIN2(framebuffer->width,  surface->width);
 +   framebuffer->height = MIN2(framebuffer->height, surface->height);
 +}
 +
 +
 +/**
 * Update framebuffer state (color, depth, stencil, etc. buffers)
 */
static void
 @@ -57,11 +77,12 @@ update_framebuffer_state( struct st_context *st )
   st_flush_bitmap_cache(st);

   st->state.fb_orientation = st_fb_orientation(fb);
 -   framebuffer->width = fb->Width;
 -   framebuffer->height = fb->Height;

   /*printf("-- fb size %d x %d\n", fb->Width, fb->Height);*/

 +   framebuffer->width  = UINT_MAX;
 +   framebuffer->height = UINT_MAX;
 +
   /* Examine Mesa's ctx->DrawBuffer->_ColorDrawBuffers state
* to determine which surfaces to draw to
*/
 @@ -81,6 +102,7 @@ update_framebuffer_state( struct st_context *st )

 if (strb->surface) {
pipe_surface_reference(&framebuffer->cbufs[i],
strb->surface);
 +update_framebuffer_size(framebuffer, strb->surface);
 }
 strb->defined = GL_TRUE; /* we'll be drawing something */
  }
 @@ -100,12 +122,14 @@ update_framebuffer_state( struct st_context *st )
 st_update_renderbuffer_surface(st, strb);
  }
  pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
 +  update_framebuffer_size(framebuffer, strb->surface);
   }
   else {
  strb =
  st_renderbuffer(fb->Attachment[BUFFER_STENCIL].Renderbuffer);
  if (strb) {
 assert(strb->surface);
 pipe_surface_reference(&framebuffer->zsbuf, strb->surface);
 + update_framebuffer_size(framebuffer, strb->surface);
  }
  else
 pipe_surface_reference(&framebuffer->zsbuf, NULL);
 @@ -122,6 +146,11 @@ update_framebuffer_state( struct st_context *st )
   }
#endif

 +   /* _mesa_test_framebuffer_completeness refuses framebuffers with no
 +* attachments, so this should never happen. */
>>>
>>> Close */ on next line.
>>>
>>>
 +   assert(framebuffer->width  != UINT_MAX);
 +   assert(fra

Re: [Mesa-dev] Mesa build instructions

2014-04-23 Thread Matt Turner
On Fri, Apr 11, 2014 at 10:37 AM, Matt Turner  wrote:
> Someone asked about my Mesa build set up. Rather than sending it
> privately I figured I'd post it for posterity on mesa-dev.
>
> I build with
>
> CFLAGS="-O2 -march=native -pipe" CXXFLAGS="$CFLAGS" ./autogen.sh
> --with-dri-drivers=i965 --with-gallium-drivers=
> --with-egl-platforms=x11,drm --enable-glx-tls --enable-gles1
> --enable-gles2
>
> --enable-debug enables assertions and builds with -g, but also
> silently adds -O0 to your CFLAGS.
>
> My build set up consists of
>
> ~/projects/mesa - contains  mesa git checkout
> ~/projects/mesa-release - release build directory
> ~/projects/mesa-debug - debug build directory
>
> I do out-of-tree builds into the debug and release directories by
> running ../mesa/autogen.sh ... from each of them, with the only
> difference being the --enable-debug flag.
>
> Environment variables allow you to use your built mesa without
> installing it: LIBGL_DRIVERS_PATH, LD_LIBRARY_PATH. Mesa's build
> system links built files into $(builddir)/$(libdir), e.g.,
> ~/projects/mesa-release/lib/. I use some wrapper scripts to allow me
> to easily run programs against my Mesa builds:
>
> ~/bin/mesa-release:
> #!/bin/sh
> LIBGL_DRIVERS_PATH=~/projects/mesa-release/lib
> LD_LIBRARY_PATH=~/projects/mesa-release/lib:${LD_LIBRARY_PATH} $@

Note for anyone using this: gmail line wrapped this.
LIBGL_DRIVERS_PATH and LD_LIBRARY_PATH need to be set on the line that
executes $@.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] gallium/util: use ui[4] instead of ui in union util_color

2014-04-23 Thread sroland
From: Roland Scheidegger 

util_color often merely represents a collection of bytes, however it is
inconvenient if those bytes can only be accessed as floats/doubles for int
formats exceeding 32bits.
(Note that since rgba8 formats use one uint, not 4 bytes, hence the byte and
short member were left as is.)
---
 src/gallium/auxiliary/util/u_pack_color.h  |   38 
 src/gallium/auxiliary/util/u_surface.c |2 +-
 src/gallium/drivers/freedreno/a2xx/fd2_draw.c  |2 +-
 src/gallium/drivers/i915/i915_clear.c  |6 ++--
 src/gallium/drivers/i915/i915_surface.c|2 +-
 src/gallium/drivers/ilo/ilo_blitter_blt.c  |4 +--
 src/gallium/drivers/nouveau/nv30/nv30_clear.c  |2 +-
 src/gallium/drivers/r300/r300_blit.c   |4 +--
 src/gallium/drivers/r300/r300_state.c  |2 +-
 src/gallium/drivers/r300/r300_state_derived.c  |   12 
 src/gallium/drivers/svga/svga_pipe_clear.c |2 +-
 src/mesa/state_tracker/st_atom_pixeltransfer.c |2 +-
 12 files changed, 39 insertions(+), 39 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_pack_color.h 
b/src/gallium/auxiliary/util/u_pack_color.h
index 166c68b..e0c9018 100644
--- a/src/gallium/auxiliary/util/u_pack_color.h
+++ b/src/gallium/auxiliary/util/u_pack_color.h
@@ -51,7 +51,7 @@
 union util_color {
ubyte ub;
ushort us;
-   uint ui;
+   uint ui[4];
ushort h[4]; /* half float */
float f[4];
double d[4];
@@ -67,32 +67,32 @@ util_pack_color_ub(ubyte r, ubyte g, ubyte b, ubyte a,
switch (format) {
case PIPE_FORMAT_ABGR_UNORM:
   {
- uc->ui = (r << 24) | (g << 16) | (b << 8) | a;
+ uc->ui[0] = (r << 24) | (g << 16) | (b << 8) | a;
   }
   return;
case PIPE_FORMAT_XBGR_UNORM:
   {
- uc->ui = (r << 24) | (g << 16) | (b << 8) | 0xff;
+ uc->ui[0] = (r << 24) | (g << 16) | (b << 8) | 0xff;
   }
   return;
case PIPE_FORMAT_BGRA_UNORM:
   {
- uc->ui = (a << 24) | (r << 16) | (g << 8) | b;
+ uc->ui[0] = (a << 24) | (r << 16) | (g << 8) | b;
   }
   return;
case PIPE_FORMAT_BGRX_UNORM:
   {
- uc->ui = (0xff << 24) | (r << 16) | (g << 8) | b;
+ uc->ui[0] = (0xff << 24) | (r << 16) | (g << 8) | b;
   }
   return;
case PIPE_FORMAT_ARGB_UNORM:
   {
- uc->ui = (b << 24) | (g << 16) | (r << 8) | a;
+ uc->ui[0] = (b << 24) | (g << 16) | (r << 8) | a;
   }
   return;
case PIPE_FORMAT_XRGB_UNORM:
   {
- uc->ui = (b << 24) | (g << 16) | (r << 8) | 0xff;
+ uc->ui[0] = (b << 24) | (g << 16) | (r << 8) | 0xff;
   }
   return;
case PIPE_FORMAT_B5G6R5_UNORM:
@@ -168,7 +168,7 @@ util_unpack_color_ub(enum pipe_format format, union 
util_color *uc,
switch (format) {
case PIPE_FORMAT_ABGR_UNORM:
   {
- uint p = uc->ui;
+ uint p = uc->ui[0];
  *r = (ubyte) ((p >> 24) & 0xff);
  *g = (ubyte) ((p >> 16) & 0xff);
  *b = (ubyte) ((p >>  8) & 0xff);
@@ -177,7 +177,7 @@ util_unpack_color_ub(enum pipe_format format, union 
util_color *uc,
   return;
case PIPE_FORMAT_XBGR_UNORM:
   {
- uint p = uc->ui;
+ uint p = uc->ui[0];
  *r = (ubyte) ((p >> 24) & 0xff);
  *g = (ubyte) ((p >> 16) & 0xff);
  *b = (ubyte) ((p >>  8) & 0xff);
@@ -186,7 +186,7 @@ util_unpack_color_ub(enum pipe_format format, union 
util_color *uc,
   return;
case PIPE_FORMAT_BGRA_UNORM:
   {
- uint p = uc->ui;
+ uint p = uc->ui[0];
  *r = (ubyte) ((p >> 16) & 0xff);
  *g = (ubyte) ((p >>  8) & 0xff);
  *b = (ubyte) ((p >>  0) & 0xff);
@@ -195,7 +195,7 @@ util_unpack_color_ub(enum pipe_format format, union 
util_color *uc,
   return;
case PIPE_FORMAT_BGRX_UNORM:
   {
- uint p = uc->ui;
+ uint p = uc->ui[0];
  *r = (ubyte) ((p >> 16) & 0xff);
  *g = (ubyte) ((p >>  8) & 0xff);
  *b = (ubyte) ((p >>  0) & 0xff);
@@ -204,7 +204,7 @@ util_unpack_color_ub(enum pipe_format format, union 
util_color *uc,
   return;
case PIPE_FORMAT_ARGB_UNORM:
   {
- uint p = uc->ui;
+ uint p = uc->ui[0];
  *r = (ubyte) ((p >>  8) & 0xff);
  *g = (ubyte) ((p >> 16) & 0xff);
  *b = (ubyte) ((p >> 24) & 0xff);
@@ -213,7 +213,7 @@ util_unpack_color_ub(enum pipe_format format, union 
util_color *uc,
   return;
case PIPE_FORMAT_XRGB_UNORM:
   {
- uint p = uc->ui;
+ uint p = uc->ui[0];
  *r = (ubyte) ((p >>  8) & 0xff);
  *g = (ubyte) ((p >> 16) & 0xff);
  *b = (ubyte) ((p >> 24) & 0xff);
@@ -352,32 +352,32 @@ util_pack_color(const float rgba[4], enum pipe_format 
format, union util_color *
switch (format) {
case PIPE_FORMAT_ABGR_UNORM:
   {
- uc->ui = (r << 24) | (g << 16) | (b << 

[Mesa-dev] [PATCH 2/2] llvmpipe: fix clearing of individual color buffers in a fb

2014-04-23 Thread sroland
From: Roland Scheidegger 

GL (3.0) allows you to clear individual color buffers in a fb. In fact
for fbs containing both int and float/normalized color buffers this is
required (because the clearing values are otherwise undefined if applied
to all buffers). The gallium interface was changed a while ago, but llvmpipe
ignored it (hence doing such individual clears always resulted in clearing
all buffers, plus some assorted asserts due to the mixed fbs).
So change the clear command to indicate the buffer to be cleared. Also, because
indicating the buffer to be cleared would have made lp_rast_arg_cmd larger
which is unacceptable (we're trying to shrink it some day) allocate the clear
value in the scene and just pass a pointer.
There's several advantages and disadvantages here:
+ clearing individual buffers works (we could also actually bin such clears now
if they'd come through clear_render_target() if the surface is in the current
fb, though we didn't do this before for the single rb case and still don't try).
+ since there's one clear per rb, we do the format conversion in setup rather
than per bin. Aside from the (drop in the ocean...) performance advantage this
means that clearing to very small values (that is, denormal when converted to
the format) should work for small float (fp16 etc.) formats, as the util code
couldn't handle it correctly before (because cpu denorms are disabled when
executing the bin commands, screwing up the magic conversion and flushing
the values to 0, though this was not verified).
- there's some overhead for traditional old-style clear-all MRT cases, since
there's one rast clear command per rb instead of one for all rbs.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=76976.

v2: get rid of the ugly manual memcpy stuff and just use union util_color.
This is 32 bytes instead of 16 but as the allocation is per scene we can live
with those additional 16 bytes (and the additional 128 bytes in the setup
context), which makes the code much more obvious. Suggested by Brian.
---
 src/gallium/drivers/llvmpipe/lp_rast.c  |  122 +++--
 src/gallium/drivers/llvmpipe/lp_rast.h  |9 +-
 src/gallium/drivers/llvmpipe/lp_scene.c |1 -
 src/gallium/drivers/llvmpipe/lp_scene.h |1 -
 src/gallium/drivers/llvmpipe/lp_setup.c |  217 ---
 src/gallium/drivers/llvmpipe/lp_setup_context.h |3 +-
 6 files changed, 190 insertions(+), 163 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c 
b/src/gallium/drivers/llvmpipe/lp_rast.c
index 0ae5976..d50ee48 100644
--- a/src/gallium/drivers/llvmpipe/lp_rast.c
+++ b/src/gallium/drivers/llvmpipe/lp_rast.c
@@ -110,25 +110,6 @@ lp_rast_tile_begin(struct lp_rasterizer_task *task,
 
 
 /**
- * Examine a framebuffer object to determine if any of the colorbuffers
- * use a pure integer format.
- * XXX this could be a gallium utility function if useful elsewhere.
- */
-static boolean
-is_fb_pure_integer(const struct pipe_framebuffer_state *fb)
-{
-   unsigned i;
-   for (i = 0; i < fb->nr_cbufs; i++) {
-  if (fb->cbufs[i] &&
-  util_format_is_pure_integer(fb->cbufs[i]->format)) {
- return TRUE;
-  }
-   }
-   return FALSE;
-}
-
-
-/**
  * Clear the rasterizer's current color tile.
  * This is a bin command called during bin processing.
  * Clear commands always clear all bound layers.
@@ -138,87 +119,42 @@ lp_rast_clear_color(struct lp_rasterizer_task *task,
 const union lp_rast_cmd_arg arg)
 {
const struct lp_scene *scene = task->scene;
+   unsigned cbuf = arg.clear_rb->cbuf;
+   union util_color uc;
+   enum pipe_format format;
 
-   if (scene->fb.nr_cbufs) {
-  unsigned i;
-  union util_color uc;
-
-  if (is_fb_pure_integer(&scene->fb)) {
- /*
-  * We expect int/uint clear values here, though some APIs
-  * might disagree (but in any case util_pack_color()
-  * couldn't handle it)...
-  */
- LP_DBG(DEBUG_RAST, "%s pure int 0x%x,0x%x,0x%x,0x%x\n", __FUNCTION__,
-arg.clear_color.ui[0],
-arg.clear_color.ui[1],
-arg.clear_color.ui[2],
-arg.clear_color.ui[3]);
-
- for (i = 0; i < scene->fb.nr_cbufs; i++) {
-enum pipe_format format = scene->fb.cbufs[i]->format;
-
-if (util_format_is_pure_sint(format)) {
-   util_format_write_4i(format, arg.clear_color.i, 0, &uc, 0, 0, 
0, 1, 1);
-}
-else {
-   assert(util_format_is_pure_uint(format));
-   util_format_write_4ui(format, arg.clear_color.ui, 0, &uc, 0, 0, 
0, 1, 1);
-}
-
-util_fill_box(scene->cbufs[i].map,
-  format,
-  scene->cbufs[i].stride,
-  scene->cbufs[i].layer_stride,
-  task->x,
-  task->y,
-   

Re: [Mesa-dev] [PATCH 0/5] util: Rework endian handling in python code

2014-04-23 Thread Jose Fonseca
Richard, Michel,

Apologies for the long silence.  I flagged this thread as worth following, but 
I failed to noticed the RFC to me.

I glanced over it and the series looks good to me AFAICT.  I agree that it is a 
better to defer the endianess to C-preprocessing time.

Jose

- Original Message -
> Ping
> 
> Richard Sandiford  writes:
> > Ping (with fixed subject)
> >
> > Richard Sandiford  writes:
> >> This is a refresh of:
> >>
> >>
> >> https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/archives/mesa-dev/2013-June/040594.html&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=NMr9uy2iTjWVixC0wOcYCWEIYhfo80qKwRgdodpoDzA%3D%0A&m=qNJxBmC8T0UJrxq1t6ovCuRj4qWejx24a49GkVAHnuk%3D%0A&s=791bcf25d17824bc574aec7183fe264a488fd5b311f6726973a81f52cf580a02
> >>
> >> At the moment the python code uses sys.byteorder to decide whether
> >> u_format_table.c should be for big or little endian.  With this series
> >> it instead generates both forms, using blocks like:
> >>
> >> #ifdef PIPE_ARCH_BIG_ENDIAN
> >> ...
> >> #else
> >> ...
> >> #endif
> >>
> >> in cases where endianness matters.
> >>
> >> Doing it this way is more cross-compiler-friendly.  It also means people
> >> working on LE systems can see what the differences would be for BE.
> >>
> >> Tested on x86_64 and z.  I don't have commit access so please apply if OK.
> >>
> >> Thanks,
> >> Richard
> >>
> >> ___
> >> mesa-dev mailing list
> >> mesa-dev@lists.freedesktop.org
> >> https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=NMr9uy2iTjWVixC0wOcYCWEIYhfo80qKwRgdodpoDzA%3D%0A&m=qNJxBmC8T0UJrxq1t6ovCuRj4qWejx24a49GkVAHnuk%3D%0A&s=d03e30542dd6de0bfb0bb613d160a4128997a7be4c52d89f34a970092856ef93
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=NMr9uy2iTjWVixC0wOcYCWEIYhfo80qKwRgdodpoDzA%3D%0A&m=qNJxBmC8T0UJrxq1t6ovCuRj4qWejx24a49GkVAHnuk%3D%0A&s=d03e30542dd6de0bfb0bb613d160a4128997a7be4c52d89f34a970092856ef93
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://urldefense.proofpoint.com/v1/url?u=http://lists.freedesktop.org/mailman/listinfo/mesa-dev&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=NMr9uy2iTjWVixC0wOcYCWEIYhfo80qKwRgdodpoDzA%3D%0A&m=qNJxBmC8T0UJrxq1t6ovCuRj4qWejx24a49GkVAHnuk%3D%0A&s=d03e30542dd6de0bfb0bb613d160a4128997a7be4c52d89f34a970092856ef93
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] swrast: Add glBlitFramebuffer to commands affected by conditional rendering

2014-04-23 Thread Carl Worth
Carl Worth  writes:
> After I picked this to the 10.1 branch and tested I found the following
> piglit regression:
>
>   $ ~/src/piglit/bin/nv_conditional_render-blitframebuffer -auto 
>   Probe color at (0,16)
> Expected: 0.00 1.00 0.00 0.00
> Observed: 1.00 0.00 0.00 0.00
>   PIGLIT: {'result': 'fail' }

I went to file a bug on this issue and found one open already:

[i965 Bisected]Piglit spec/NV_conditional_render_blitframebuffer fails
https://bugs.freedesktop.org/show_bug.cgi?id=77702

The discussion there is leaning toward the test being incorrect.

Once that bug is resolved I can pick the patch back to the stable
branch.

-Carl

-- 
carl.d.wo...@intel.com


pgp4pOMtk0D_n.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [cfe-dev] 3 element vectors in opencl 1.1+

2014-04-23 Thread Francisco Jerez
Jan Vesely  writes:

> On Tue, 2014-04-22 at 17:50 -0700, Matt Arsenault wrote:
> 
>
>> >> I think this is what v96:128 is for
>> > according to [0], it specifies only alignment, not size. I could not
>> > find an __attribute__ that would change size either.
>> >
>> > It should be possible to have ADMGPUDataLayout: public DataLayout class
>> > that would intercept the call and fix the reported value, but I think it
>> > would only move the hack to different place.
>> >
>> > I have added pocl-devel list as suggested.
>> >
>> > regards,
>> > Jan
>> >
>> > [0]http://llvm.org/docs/LangRef.html#data-layout
>> >
>> 
>> Only the size in memory matters, which is what the required alignment 
>> specifies. DataLayout::getTypeAllocSize accounts for the alignment, but 
>> getTypeStoreSize does not. I actually thought this was half of what 
>> getTypeStoreSize was for, but it turns out it isn't.
>
> hm, I always thought that alignment only puts restrictions on starting
> address and using padding was just a tool to do the job.
>
> anyway, thanks for the hint, using getTypeAllocSize works nicely.
> since we are allocating space in the argument vector I think
> getAllocSize is the right function to use.
>
> I'll post a patch.
>

I don't think that using getTypeAllocSize() instead of
getTypeStoreSize() to calculate clover::argument::size would be a
satisfactory solution.  By doing that you'd redefine the API argument
size exposed to the host for *all* argument types to be the
device-dependent aligned size, which is definitely not what you want.

AFAIK 3-element vectors are an exception because they are the only types
that are defined to have a different API size from their actual usable
size, so they probably deserve special handling in invocation.cpp (as
you did in your first patch).  As the API size is target-independent I
don't think that the fix belongs in Clang or LLVM, Clover is likely at
fault.

Thanks.

> regards,
> Jan
>
>
> -- 
> Jan Vesely 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


pgpPhkfcwcCIy.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 77582] [r600g] ogl-samples GL3.2 and GL3.3 tests doesn't run without overriding GL/GLSL version

2014-04-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=77582

Benjamin Bellec  changed:

   What|Removed |Added

 Resolution|NOTOURBUG   |INVALID
   Assignee|mesa-dev@lists.freedesktop. |dri-devel@lists.freedesktop
   |org |.org
  Component|Mesa core   |Drivers/Gallium/r600

--- Comment #7 from Benjamin Bellec  ---
Old libGL in /usr/local/lib64

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] draw/llvm: reduce memory usage

2014-04-23 Thread Zack Rusin
Lets make draw_get_option_use_llvm function available unconditionally
and use it to avoid useless allocations when LLVM paths are active.
TGSI machine is never used when we're using LLVM.

Signed-off-by: Zack Rusin 
---
 src/gallium/auxiliary/draw/draw_context.c |  6 ++
 src/gallium/auxiliary/draw/draw_context.h |  2 --
 src/gallium/auxiliary/draw/draw_gs.c  | 26 --
 src/gallium/auxiliary/draw/draw_vs.c  | 11 +++
 src/gallium/auxiliary/draw/draw_vs_exec.c |  2 ++
 5 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index 0a67879..ddc305b 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -68,6 +68,12 @@ draw_get_option_use_llvm(void)
}
return value;
 }
+#else
+boolean
+draw_get_option_use_llvm(void)
+{
+   return FALSE;
+}
 #endif
 
 
diff --git a/src/gallium/auxiliary/draw/draw_context.h 
b/src/gallium/auxiliary/draw/draw_context.h
index f114f50..48549fe 100644
--- a/src/gallium/auxiliary/draw/draw_context.h
+++ b/src/gallium/auxiliary/draw/draw_context.h
@@ -288,9 +288,7 @@ draw_get_shader_param(unsigned shader, enum pipe_shader_cap 
param);
 int
 draw_get_shader_param_no_llvm(unsigned shader, enum pipe_shader_cap param);
 
-#ifdef HAVE_LLVM
 boolean
 draw_get_option_use_llvm(void);
-#endif
 
 #endif /* DRAW_CONTEXT_H */
diff --git a/src/gallium/auxiliary/draw/draw_gs.c 
b/src/gallium/auxiliary/draw/draw_gs.c
index 7de5e03..5e503ff 100644
--- a/src/gallium/auxiliary/draw/draw_gs.c
+++ b/src/gallium/auxiliary/draw/draw_gs.c
@@ -674,11 +674,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader 
*shader,
 void draw_geometry_shader_prepare(struct draw_geometry_shader *shader,
   struct draw_context *draw)
 {
-#ifdef HAVE_LLVM
boolean use_llvm = draw_get_option_use_llvm();
-#else
-   boolean use_llvm = FALSE;
-#endif
if (!use_llvm && shader && shader->machine->Tokens != shader->state.tokens) 
{
   tgsi_exec_machine_bind_shader(shader->machine,
 shader->state.tokens,
@@ -690,16 +686,18 @@ void draw_geometry_shader_prepare(struct 
draw_geometry_shader *shader,
 boolean
 draw_gs_init( struct draw_context *draw )
 {
-   draw->gs.tgsi.machine = tgsi_exec_machine_create();
-   if (!draw->gs.tgsi.machine)
-  return FALSE;
-
-   draw->gs.tgsi.machine->Primitives = align_malloc(
-  MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector), 16);
-   if (!draw->gs.tgsi.machine->Primitives)
-  return FALSE;
-   memset(draw->gs.tgsi.machine->Primitives, 0,
-  MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector));
+   if (!draw_get_option_use_llvm()) {
+  draw->gs.tgsi.machine = tgsi_exec_machine_create();
+  if (!draw->gs.tgsi.machine)
+ return FALSE;
+
+  draw->gs.tgsi.machine->Primitives = align_malloc(
+ MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector), 16);
+  if (!draw->gs.tgsi.machine->Primitives)
+ return FALSE;
+  memset(draw->gs.tgsi.machine->Primitives, 0,
+ MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector));
+   }
 
return TRUE;
 }
diff --git a/src/gallium/auxiliary/draw/draw_vs.c 
b/src/gallium/auxiliary/draw/draw_vs.c
index 55cbeb9..8bb9a7f 100644
--- a/src/gallium/auxiliary/draw/draw_vs.c
+++ b/src/gallium/auxiliary/draw/draw_vs.c
@@ -149,9 +149,11 @@ draw_vs_init( struct draw_context *draw )
 {
draw->dump_vs = debug_get_option_gallium_dump_vs();
 
-   draw->vs.tgsi.machine = tgsi_exec_machine_create();
-   if (!draw->vs.tgsi.machine)
-  return FALSE;
+   if (!draw_get_option_use_llvm()) {
+  draw->vs.tgsi.machine = tgsi_exec_machine_create();
+  if (!draw->vs.tgsi.machine)
+ return FALSE;
+   }
 
draw->vs.emit_cache = translate_cache_create();
if (!draw->vs.emit_cache) 
@@ -173,7 +175,8 @@ draw_vs_destroy( struct draw_context *draw )
if (draw->vs.emit_cache)
   translate_cache_destroy(draw->vs.emit_cache);
 
-   tgsi_exec_machine_destroy(draw->vs.tgsi.machine);
+   if (draw_get_option_use_llvm())
+  tgsi_exec_machine_destroy(draw->vs.tgsi.machine);
 }
 
 
diff --git a/src/gallium/auxiliary/draw/draw_vs_exec.c 
b/src/gallium/auxiliary/draw/draw_vs_exec.c
index 133b116..6a18d8c 100644
--- a/src/gallium/auxiliary/draw/draw_vs_exec.c
+++ b/src/gallium/auxiliary/draw/draw_vs_exec.c
@@ -63,6 +63,7 @@ vs_exec_prepare( struct draw_vertex_shader *shader,
 {
struct exec_vertex_shader *evs = exec_vertex_shader(shader);
 
+   debug_assert(!draw_get_option_use_llvm());
/* Specify the vertex program to interpret/execute.
 * Avoid rebinding when possible.
 */
@@ -96,6 +97,7 @@ vs_exec_run_linear( struct draw_vertex_shader *shader,
unsigned slot;
boolean clamp_vertex_color = shader->draw->rasterizer->clamp_vertex_color;
 
+   debug_assert(!draw_get_option_use_llvm());
tgsi_exec_set_constant_buffers

Re: [Mesa-dev] [PATCH] draw/llvm: reduce memory usage

2014-04-23 Thread Zack Rusin
>  
> -   tgsi_exec_machine_destroy(draw->vs.tgsi.machine);
> +   if (draw_get_option_use_llvm())
> +      tgsi_exec_machine_destroy(draw->vs.tgsi.machine);

That part should have used !draw_get_option_use_llvm() 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] draw/llvm: reduce memory usage

2014-04-23 Thread Roland Scheidegger
Am 23.04.2014 23:10, schrieb Zack Rusin:
> Lets make draw_get_option_use_llvm function available unconditionally
> and use it to avoid useless allocations when LLVM paths are active.
> TGSI machine is never used when we're using LLVM.
> 
> Signed-off-by: Zack Rusin 
> ---
>  src/gallium/auxiliary/draw/draw_context.c |  6 ++
>  src/gallium/auxiliary/draw/draw_context.h |  2 --
>  src/gallium/auxiliary/draw/draw_gs.c  | 26 --
>  src/gallium/auxiliary/draw/draw_vs.c  | 11 +++
>  src/gallium/auxiliary/draw/draw_vs_exec.c |  2 ++
>  5 files changed, 27 insertions(+), 20 deletions(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_context.c 
> b/src/gallium/auxiliary/draw/draw_context.c
> index 0a67879..ddc305b 100644
> --- a/src/gallium/auxiliary/draw/draw_context.c
> +++ b/src/gallium/auxiliary/draw/draw_context.c
> @@ -68,6 +68,12 @@ draw_get_option_use_llvm(void)
> }
> return value;
>  }
> +#else
> +boolean
> +draw_get_option_use_llvm(void)
> +{
> +   return FALSE;
> +}
>  #endif
>  
>  
> diff --git a/src/gallium/auxiliary/draw/draw_context.h 
> b/src/gallium/auxiliary/draw/draw_context.h
> index f114f50..48549fe 100644
> --- a/src/gallium/auxiliary/draw/draw_context.h
> +++ b/src/gallium/auxiliary/draw/draw_context.h
> @@ -288,9 +288,7 @@ draw_get_shader_param(unsigned shader, enum 
> pipe_shader_cap param);
>  int
>  draw_get_shader_param_no_llvm(unsigned shader, enum pipe_shader_cap param);
>  
> -#ifdef HAVE_LLVM
>  boolean
>  draw_get_option_use_llvm(void);
> -#endif
>  
>  #endif /* DRAW_CONTEXT_H */
> diff --git a/src/gallium/auxiliary/draw/draw_gs.c 
> b/src/gallium/auxiliary/draw/draw_gs.c
> index 7de5e03..5e503ff 100644
> --- a/src/gallium/auxiliary/draw/draw_gs.c
> +++ b/src/gallium/auxiliary/draw/draw_gs.c
> @@ -674,11 +674,7 @@ int draw_geometry_shader_run(struct draw_geometry_shader 
> *shader,
>  void draw_geometry_shader_prepare(struct draw_geometry_shader *shader,
>struct draw_context *draw)
>  {
> -#ifdef HAVE_LLVM
> boolean use_llvm = draw_get_option_use_llvm();
> -#else
> -   boolean use_llvm = FALSE;
> -#endif
> if (!use_llvm && shader && shader->machine->Tokens != 
> shader->state.tokens) {
>tgsi_exec_machine_bind_shader(shader->machine,
>  shader->state.tokens,
> @@ -690,16 +686,18 @@ void draw_geometry_shader_prepare(struct 
> draw_geometry_shader *shader,
>  boolean
>  draw_gs_init( struct draw_context *draw )
>  {
> -   draw->gs.tgsi.machine = tgsi_exec_machine_create();
> -   if (!draw->gs.tgsi.machine)
> -  return FALSE;
> -
> -   draw->gs.tgsi.machine->Primitives = align_malloc(
> -  MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector), 16);
> -   if (!draw->gs.tgsi.machine->Primitives)
> -  return FALSE;
> -   memset(draw->gs.tgsi.machine->Primitives, 0,
> -  MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector));
> +   if (!draw_get_option_use_llvm()) {
> +  draw->gs.tgsi.machine = tgsi_exec_machine_create();
> +  if (!draw->gs.tgsi.machine)
> + return FALSE;
> +
> +  draw->gs.tgsi.machine->Primitives = align_malloc(
> + MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector), 16);
> +  if (!draw->gs.tgsi.machine->Primitives)
> + return FALSE;
> +  memset(draw->gs.tgsi.machine->Primitives, 0,
> + MAX_PRIMITIVES * sizeof(struct tgsi_exec_vector));
> +   }
>  
> return TRUE;
>  }
> diff --git a/src/gallium/auxiliary/draw/draw_vs.c 
> b/src/gallium/auxiliary/draw/draw_vs.c
> index 55cbeb9..8bb9a7f 100644
> --- a/src/gallium/auxiliary/draw/draw_vs.c
> +++ b/src/gallium/auxiliary/draw/draw_vs.c
> @@ -149,9 +149,11 @@ draw_vs_init( struct draw_context *draw )
>  {
> draw->dump_vs = debug_get_option_gallium_dump_vs();
>  
> -   draw->vs.tgsi.machine = tgsi_exec_machine_create();
> -   if (!draw->vs.tgsi.machine)
> -  return FALSE;
> +   if (!draw_get_option_use_llvm()) {
> +  draw->vs.tgsi.machine = tgsi_exec_machine_create();
> +  if (!draw->vs.tgsi.machine)
> + return FALSE;
> +   }
>  
> draw->vs.emit_cache = translate_cache_create();
> if (!draw->vs.emit_cache) 
> @@ -173,7 +175,8 @@ draw_vs_destroy( struct draw_context *draw )
> if (draw->vs.emit_cache)
>translate_cache_destroy(draw->vs.emit_cache);
>  
> -   tgsi_exec_machine_destroy(draw->vs.tgsi.machine);
> +   if (draw_get_option_use_llvm())
> +  tgsi_exec_machine_destroy(draw->vs.tgsi.machine);
>  }
>  
>  
> diff --git a/src/gallium/auxiliary/draw/draw_vs_exec.c 
> b/src/gallium/auxiliary/draw/draw_vs_exec.c
> index 133b116..6a18d8c 100644
> --- a/src/gallium/auxiliary/draw/draw_vs_exec.c
> +++ b/src/gallium/auxiliary/draw/draw_vs_exec.c
> @@ -63,6 +63,7 @@ vs_exec_prepare( struct draw_vertex_shader *shader,
>  {
> struct exec_vertex_shader *evs = exec_vertex_shader(shader);
>  
> +   debug_assert(!draw_get_option_use_llvm());
> /* Specify the vertex program to inte

[Mesa-dev] [Bug 64386] [865G] White screen using Stellarium

2014-04-23 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=64386

--- Comment #3 from Götz  ---
If this is a Mesa bug, what can we do with this hardware to help debug this?

This error message doesn't appear anymore with the latest mesa version:
"Mesa 9.1.1 implementation error: unexpected format GL_DEPTH_COMPONENT in
_mesa_choose_tex_format()"

So, the error is a bit different than before (with an empty ~/.stellarium):

$ stellarium
Using default graphics system specified at build time:  raster 
User config directory does not exist:  "/home/test/.stellarium" 
Creating directory  "/home/test/.stellarium" 
 --- 
[ This is Stellarium 0.12.4 - http://www.stellarium.org ] 
[ Copyright (C) 2000-2013 Fabien Chereau et al  ] 
 --- 
Writing log file to: "/home/test/.stellarium/log.txt" 
File search paths: 
  0 .  "/home/test/.stellarium" 
  1 .  "/usr/share/stellarium" 
Config file  "/home/test/.stellarium/config.ini"  does not exist. Copying the
default file. 
Config file is:  "/home/test/.stellarium/config.ini" 
Going to initialize the OpenGL 2 renderer 
OpenGL supported version:  "1.3 Mesa 10.1.1" 
Qt GL paint engine is:  "OpenGL" 
StelQGL2Renderer::init : Failed because Qt paint engine is not OpenGL2 
If paint engine is OpenGL3 or higher, this code needs to be updated 
Failed to initialize the OpenGL 2 renderer, falling back to the OpenGL 1
renderer 
OpenGL supported version:  "1.3 Mesa 10.1.1" 
Qt GL paint engine is:  "OpenGL" 
GL vendor is  "Intel Open Source Technology Center" 
GL renderer is  "Mesa DRI Intel(R) 865G x86/MMX/SSE2" 
Cache directory is:  "/home/test/.cache/stellarium/stellarium" 
Sky language is  "en_US" 
Application language is  "en_US" 
Loading Solar System data ... 
Loaded 75 / 75 planet orbits from "/usr/share/stellarium/data/ssystem.ini" 
Could not find the starsConfig.json file: will copy the default one. 
Creating directory  "/home/test/.stellarium/stars/default" 
Creates file  "/home/test/.stellarium/stars/default/starsConfig.json" 
Loading star data ... 
"Loading "/usr/share/stellarium/stars/default/stars_0_0v0_3.cat": 0_0v0_2;
4963" 
"Loading "/usr/share/stellarium/stars/default/stars_1_0v0_3.cat": 1_0v0_2;
21598" 
"Loading "/usr/share/stellarium/stars/default/stars_2_0v0_3.cat": 2_0v0_2;
150090" 
"Loading "/usr/share/stellarium/stars/default/stars_3_1v0_2.cat": 3_1v0_1;
423540" 
Finished loading star catalogue data, max_geodesic_level:  3 
navigation/preset_sky_time is a double - treating as jday: 2.45151e+06 
Loaded 10051 NGC records 
Loading NGC name data ... 
Loaded 412 / 412 NGC name records successfully 
Loading star names from
"/usr/share/stellarium/skycultures/western/star_names.fab" 
Loaded 236 / 236 common star names 
Loading star names from "/usr/share/stellarium/stars/default/name.fab" 
Loaded 4359 / 4359 scientific star names 
Loading variable stars from
"/usr/share/stellarium/stars/default/gcvs_hip_part.dat" 
Loaded 6886 / 6886 variable stars 
Loaded 88 / 88 constellation records successfully for culture "western" 
Loaded 85 / 85 constellation art records successfully for culture "western" 
Loaded 89 / 89 constellation names 
Loading constellation boundary data ...  
Loaded 782 constellation boundary segments 
Not using any viewport effect 
Creating GUI ... 
Loading shortcuts... 
shortcuts.json doesn't exist, copying default... 
Creating directory  "/home/test/.stellarium/data/" 
Creating file "/home/test/.stellarium/data/shortcuts.json" 
Can't find plugin with id "RendererStatistics" 
Can't find plugin with id "RendererStatistics" 
libpng warning: iCCP: Not recognizing known sRGB profile that has been edited
Loaded plugin  "Oculars" . 
Ocular plugin - press Command-O to toggle eyepiece view mode. Press ALT-o for
configuration. 
Creating directory  "/home/test/.stellarium/modules/Oculars" 
Oculars::validateIniFile copied default_ocular.ini to 
"/home/test/.stellarium/modules/Oculars/ocular.ini" 
Loaded plugin  "Satellites" . 
Creating directory  "/home/test/.stellarium/modules/Satellites" 
Satellites::init satellites.json does not exist - copying default file to 
"/home/test/.stellarium/modules/Satellites/satellites.json" 
Satellites::init copied default satellites.json to 
"/home/test/.stellarium/modules/Satellites/satellites.json" 
Satellites: loading catalog file:
"/home/test/.stellarium/modules/Satellites/satellites.json" 
Satellite has invalid orbit: "TIANGONG 1" "37820" 
Loaded plugin  "SolarSystemEditor" . 
Trying to copy ssystem.ini to "/home/test/.stellarium/data/ssystem.ini" 
Loaded plugin  "TimeZoneConfiguration" . 
QGLFramebufferObject: Framebuffer incomplete attachment.
QGLFramebufferObject: Framebuffer incomplete attachment.
(this is 10 times repeated)
libpng warning: iCCP: Not recognizing known sRGB profile that has been edited
libpng warning: iCCP: Not recognizing known sRGB profile that has been edited
libpng warning: iCCP: Not recognizing known sRGB profi

Re: [Mesa-dev] [cfe-dev] 3 element vectors in opencl 1.1+

2014-04-23 Thread Jan Vesely
On Wed, 2014-04-23 at 19:49 +0200, Francisco Jerez wrote:
> Jan Vesely  writes:
> 
> > On Tue, 2014-04-22 at 17:50 -0700, Matt Arsenault wrote:
> > 
> >
> >> >> I think this is what v96:128 is for
> >> > according to [0], it specifies only alignment, not size. I could not
> >> > find an __attribute__ that would change size either.
> >> >
> >> > It should be possible to have ADMGPUDataLayout: public DataLayout class
> >> > that would intercept the call and fix the reported value, but I think it
> >> > would only move the hack to different place.
> >> >
> >> > I have added pocl-devel list as suggested.
> >> >
> >> > regards,
> >> > Jan
> >> >
> >> > [0]http://llvm.org/docs/LangRef.html#data-layout
> >> >
> >> 
> >> Only the size in memory matters, which is what the required alignment 
> >> specifies. DataLayout::getTypeAllocSize accounts for the alignment, but 
> >> getTypeStoreSize does not. I actually thought this was half of what 
> >> getTypeStoreSize was for, but it turns out it isn't.
> >
> > hm, I always thought that alignment only puts restrictions on starting
> > address and using padding was just a tool to do the job.
> >
> > anyway, thanks for the hint, using getTypeAllocSize works nicely.
> > since we are allocating space in the argument vector I think
> > getAllocSize is the right function to use.
> >
> > I'll post a patch.
> >
> 
> I don't think that using getTypeAllocSize() instead of
> getTypeStoreSize() to calculate clover::argument::size would be a
> satisfactory solution.  By doing that you'd redefine the API argument
> size exposed to the host for *all* argument types to be the
> device-dependent aligned size, which is definitely not what you want.

> AFAIK 3-element vectors are an exception because they are the only types
> that are defined to have a different API size from their actual usable
> size, so they probably deserve special handling in invocation.cpp (as
> you did in your first patch).  As the API size is target-independent I
> don't think that the fix belongs in Clang or LLVM, Clover is likely at
> fault.

The way I understood the ch 6.1.5 is that both API and OpenCL C 3
element vectors are required to be 4*sizeof(component). So a
sizeof(float3) == sizeof(cl_float3) == 16, and should be both host and
target independent. That's why clang (or more precisely libclc) looked
like a correct place.

I understand that target device can have stricter alignment rules, but I
don't see how it can have different type sizes (my reading of the specs
is that these are binding for the target as well).

I can resend the original patch with debug output replaced by a comment.

regards,
Jan

> 
> Thanks.
> 
> > regards,
> > Jan
> >
> >
> > -- 
> > Jan Vesely 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-- 
Jan Vesely 


signature.asc
Description: This is a digitally signed message part
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [cfe-dev] 3 element vectors in opencl 1.1+

2014-04-23 Thread Erik Schnetter
On Apr 24, 2014, at 0:30 , Jan Vesely  wrote:

> On Wed, 2014-04-23 at 19:49 +0200, Francisco Jerez wrote:
>> Jan Vesely  writes:
>> 
>>> On Tue, 2014-04-22 at 17:50 -0700, Matt Arsenault wrote:
>>> 
>>> 
>> I think this is what v96:128 is for
> according to [0], it specifies only alignment, not size. I could not
> find an __attribute__ that would change size either.
> 
> It should be possible to have ADMGPUDataLayout: public DataLayout class
> that would intercept the call and fix the reported value, but I think it
> would only move the hack to different place.
> 
> I have added pocl-devel list as suggested.
> 
> regards,
> Jan
> 
> [0]http://llvm.org/docs/LangRef.html#data-layout
> 
 
 Only the size in memory matters, which is what the required alignment 
 specifies. DataLayout::getTypeAllocSize accounts for the alignment, but 
 getTypeStoreSize does not. I actually thought this was half of what 
 getTypeStoreSize was for, but it turns out it isn't.
>>> 
>>> hm, I always thought that alignment only puts restrictions on starting
>>> address and using padding was just a tool to do the job.
>>> 
>>> anyway, thanks for the hint, using getTypeAllocSize works nicely.
>>> since we are allocating space in the argument vector I think
>>> getAllocSize is the right function to use.
>>> 
>>> I'll post a patch.
>>> 
>> 
>> I don't think that using getTypeAllocSize() instead of
>> getTypeStoreSize() to calculate clover::argument::size would be a
>> satisfactory solution.  By doing that you'd redefine the API argument
>> size exposed to the host for *all* argument types to be the
>> device-dependent aligned size, which is definitely not what you want.
> 
>> AFAIK 3-element vectors are an exception because they are the only types
>> that are defined to have a different API size from their actual usable
>> size, so they probably deserve special handling in invocation.cpp (as
>> you did in your first patch).  As the API size is target-independent I
>> don't think that the fix belongs in Clang or LLVM, Clover is likely at
>> fault.
> 
> The way I understood the ch 6.1.5 is that both API and OpenCL C 3
> element vectors are required to be 4*sizeof(component). So a
> sizeof(float3) == sizeof(cl_float3) == 16, and should be both host and
> target independent. That's why clang (or more precisely libclc) looked
> like a correct place.
> 
> I understand that target device can have stricter alignment rules, but I
> don't see how it can have different type sizes (my reading of the specs
> is that these are binding for the target as well).

OpenCL specifies the alignment of types; this is not up to the target. For the 
basic types, the alignment is their size.

-erik 

-- 
Erik Schnetter 
http://www.perimeterinstitute.ca/personal/eschnetter/

My email is as private as my paper mail. I therefore support encrypting
and signing email messages. Get my PGP key from http://pgp.mit.edu/.



signature.asc
Description: Message signed with OpenPGP using GPGMail
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/4] gallium, nv50, nvc0: add ARB_sample_shading

2014-04-23 Thread Ilia Mirkin
This is my latest iteration of the ARB_sample_shading implementation. The only
known defect is that gl_SampleMask doesn't appear to work on nv50 nor
nvc0. I'm fairly sure it's due to some bit of setup I'm missing, but it has
thus far eluded me. I believe I've addressed the various earlier review
comments either with replies or by adjusting my code. Let me know if I've
missed anything.

I reinstated the cso_cache thing, I think it's needed since blits/etc probably
should use min_samples == 1. I've used it in all the places I could find saved
the sample mask.

Both nv50 and nvc0 need to stick the sample positions into a constbuf. Perhaps
it would make sense to make that manipulation more generic, but if there's HW
that will pull the sample position of the current sample, then maybe not.

I've left all these things as system values because (a) they're system values
in mesa, and (b) it seems like a lot of trouble to expose the raster object as
an addressable item in TGSI for seemingly little benefit. A lowering pass in
the driver can just have it do the right thing.

Assuming that the generic bits get positive reviews, I'd like to push the
nv50/nvc0 bits as well despite the gl_SampleMask failure. That'll get resolved
in time, and I doubt there are too many users of that feature. (Admittedly,
probably not too many users of ARB_sample_shading in general.)

[ Another little short-coming for nv50/nvc0 is that there is no shader variant
  support, and the code is different for per-sample vs per-fragment due to the
  interpolation instructions. There are a few other things that need variant
  support to work, perhaps that'll be done at a later time. ]

Ilia Mirkin (4):
  gallium: add basic support for ARB_sample_shading
  mesa/st: add support for ARB_sample_shading
  nv50: add support for PIPE_CAP_SAMPLE_SHADING
  nvc0: add support for PIPE_CAP_SAMPLE_SHADING

 src/gallium/auxiliary/cso_cache/cso_context.c  | 19 
 src/gallium/auxiliary/cso_cache/cso_context.h  |  4 +++
 src/gallium/auxiliary/hud/hud_context.c|  3 ++
 src/gallium/auxiliary/postprocess/pp_run.c |  3 ++
 src/gallium/auxiliary/tgsi/tgsi_strings.c  |  5 ++-
 src/gallium/auxiliary/util/u_blit.c|  3 ++
 src/gallium/docs/source/context.rst|  3 ++
 src/gallium/docs/source/screen.rst |  3 ++
 src/gallium/docs/source/tgsi.rst   | 20 
 src/gallium/drivers/freedreno/freedreno_screen.c   |  1 +
 src/gallium/drivers/i915/i915_screen.c |  1 +
 src/gallium/drivers/ilo/ilo_screen.c   |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c   |  1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |  8 +
 .../drivers/nouveau/codegen/nv50_ir_driver.h   |  3 +-
 .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 13 
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 14 +
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 10 +-
 .../nouveau/codegen/nv50_ir_lowering_nv50.cpp  | 12 
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 21 +
 .../drivers/nouveau/codegen/nv50_ir_print.cpp  |  2 ++
 .../drivers/nouveau/codegen/nv50_ir_target.cpp |  6 ++--
 .../nouveau/codegen/nv50_ir_target_nv50.cpp|  2 ++
 .../nouveau/codegen/nv50_ir_target_nvc0.cpp|  2 ++
 src/gallium/drivers/nouveau/nv30/nv30_screen.c |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_context.h|  7 -
 src/gallium/drivers/nouveau/nv50/nv50_program.c|  6 +++-
 src/gallium/drivers/nouveau/nv50/nv50_program.h|  2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |  1 +
 .../drivers/nouveau/nv50/nv50_shader_state.c   | 13 
 src/gallium/drivers/nouveau/nv50/nv50_state.c  | 12 
 .../drivers/nouveau/nv50/nv50_state_validate.c | 36 --
 src/gallium/drivers/nouveau/nv50/nv50_surface.c| 11 ++-
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h|  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 17 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_program.h|  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  1 +
 .../drivers/nouveau/nvc0/nvc0_shader_state.c   |  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c  | 12 
 .../drivers/nouveau/nvc0/nvc0_state_validate.c | 34 +---
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c| 11 ++-
 src/gallium/drivers/r300/r300_screen.c |  1 +
 src/gallium/drivers/r600/r600_pipe.c   |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c |  1 +
 src/gallium/drivers/softpipe/sp_screen.c   |  1 +
 src/gallium/drivers/svga/svga_screen.c |  1 +
 src/gallium/include/pipe/p_context.h   |  3 ++
 src/gallium/include/pipe/p_defines.h   |  1 +
 src/gallium/include/pipe/p_shader_tokens.h |  5 ++-
 src/mesa/state_tracker/st_atom.c

[Mesa-dev] [PATCH 1/4] gallium: add basic support for ARB_sample_shading

2014-04-23 Thread Ilia Mirkin
---
 src/gallium/auxiliary/tgsi/tgsi_strings.c|  5 -
 src/gallium/docs/source/context.rst  |  1 +
 src/gallium/docs/source/screen.rst   |  3 +++
 src/gallium/docs/source/tgsi.rst | 20 
 src/gallium/drivers/freedreno/freedreno_screen.c |  1 +
 src/gallium/drivers/i915/i915_screen.c   |  1 +
 src/gallium/drivers/ilo/ilo_screen.c |  1 +
 src/gallium/drivers/llvmpipe/lp_screen.c |  1 +
 src/gallium/drivers/nouveau/nv30/nv30_screen.c   |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_screen.c   |  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c   |  1 +
 src/gallium/drivers/r300/r300_screen.c   |  1 +
 src/gallium/drivers/r600/r600_pipe.c |  1 +
 src/gallium/drivers/radeonsi/si_pipe.c   |  1 +
 src/gallium/drivers/softpipe/sp_screen.c |  1 +
 src/gallium/drivers/svga/svga_screen.c   |  1 +
 src/gallium/include/pipe/p_context.h |  3 +++
 src/gallium/include/pipe/p_defines.h |  1 +
 src/gallium/include/pipe/p_shader_tokens.h   |  5 -
 19 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index b0ba3ef..2be726c 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -81,7 +81,10 @@ const char *tgsi_semantic_names[TGSI_SEMANTIC_COUNT] =
"PCOORD",
"VIEWPORT_INDEX",
"LAYER",
-   "CULLDIST"
+   "CULLDIST",
+   "SAMPLEID",
+   "SAMPLEPOS",
+   "SAMPLEMASK"
 };
 
 const char *tgsi_texture_names[TGSI_TEXTURE_COUNT] =
diff --git a/src/gallium/docs/source/context.rst 
b/src/gallium/docs/source/context.rst
index 1fc8a3e..fc8dd16 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -67,6 +67,7 @@ objects. They all follow simple, one-method binding calls, 
e.g.
   which are used as comparison values in stencil test.
 * ``set_blend_color``
 * ``set_sample_mask``
+* ``set_min_samples`` sets the minimum number of samples that must be run.
 * ``set_clip_state``
 * ``set_polygon_stipple``
 * ``set_scissor_states`` sets the bounds for the scissor test, which culls
diff --git a/src/gallium/docs/source/screen.rst 
b/src/gallium/docs/source/screen.rst
index 89cbdbf..f5acebb 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -197,6 +197,9 @@ The integer capabilities:
   in conjunction with a texture gather opcode.
 * ``PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET``: The maximum offset that can be used
   in conjunction with a texture gather opcode.
+* ``PIPE_CAP_SAMPLE_SHADING``: Whether there is support for per-sample
+  shading. The context->set_min_samples function will be expected to be
+  implemented.
 
 
 .. _pipe_capf:
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index d5325f4..b7d016a 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2621,6 +2621,26 @@ distances and by the 
PIPE_MAX_CLIP_OR_CULL_DISTANCE_ELEMENT_COUNT
 which specifies the maximum number of registers which can be
 annotated with those semantics.
 
+TGSI_SEMANTIC_SAMPLEID
+""
+
+For fragment shaders, this semantic label indicates that a system value
+contains the current sample id (i.e. gl_SampleID). Only the X value is used.
+
+TGSI_SEMANTIC_SAMPLEPOS
+"""
+
+For fragment shaders, this semantic label indicates that a system value
+contains the current sample's position (i.e. gl_SamplePosition). Only the X
+and Y values are used.
+
+TGSI_SEMANTIC_SAMPLEMASK
+
+
+For fragment shaders, this semantic label indicates that an output contains
+the sample mask used to disable further sample processing
+(i.e. gl_SampleMask). Only the X value is used, up to 32x MS.
+
 
 Declaration Interpolate
 ^^^
diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c 
b/src/gallium/drivers/freedreno/freedreno_screen.c
index 08556a4..d62d4b6 100644
--- a/src/gallium/drivers/freedreno/freedreno_screen.c
+++ b/src/gallium/drivers/freedreno/freedreno_screen.c
@@ -208,6 +208,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum 
pipe_cap param)
 case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
 case PIPE_CAP_FAKE_SW_MSAA:
case PIPE_CAP_TEXTURE_QUERY_LOD:
+case PIPE_CAP_SAMPLE_SHADING:
return 0;
 
/* Stream output. */
diff --git a/src/gallium/drivers/i915/i915_screen.c 
b/src/gallium/drivers/i915/i915_screen.c
index b484d36..8ee4330 100644
--- a/src/gallium/drivers/i915/i915_screen.c
+++ b/src/gallium/drivers/i915/i915_screen.c
@@ -220,6 +220,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap 
cap)
case PIPE_CAP_TEXTURE_GATHER_SM5:
case PIPE_CAP_FAKE_SW_MSAA:
case PIPE_CAP_TEXTURE_QUERY_LOD:
+   case PIPE_CAP_SAMPLE_SHADING:
   return 0;
 
c

[Mesa-dev] [PATCH 2/4] mesa/st: add support for ARB_sample_shading

2014-04-23 Thread Ilia Mirkin
---
 src/gallium/auxiliary/cso_cache/cso_context.c | 19 +++
 src/gallium/auxiliary/cso_cache/cso_context.h |  4 
 src/gallium/auxiliary/hud/hud_context.c   |  3 +++
 src/gallium/auxiliary/postprocess/pp_run.c|  3 +++
 src/gallium/auxiliary/util/u_blit.c   |  3 +++
 src/mesa/state_tracker/st_atom.c  |  1 +
 src/mesa/state_tracker/st_atom.h  |  1 +
 src/mesa/state_tracker/st_atom_msaa.c | 23 +++
 src/mesa/state_tracker/st_cb_clear.c  |  3 +++
 src/mesa/state_tracker/st_extensions.c|  3 ++-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 16 ++--
 src/mesa/state_tracker/st_program.c   | 21 +
 12 files changed, 89 insertions(+), 11 deletions(-)

diff --git a/src/gallium/auxiliary/cso_cache/cso_context.c 
b/src/gallium/auxiliary/cso_cache/cso_context.c
index dd0e3df..47c40a5 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.c
+++ b/src/gallium/auxiliary/cso_cache/cso_context.c
@@ -120,6 +120,7 @@ struct cso_context {
struct pipe_viewport_state vp, vp_saved;
struct pipe_blend_color blend_color;
unsigned sample_mask, sample_mask_saved;
+   unsigned min_samples, min_samples_saved;
struct pipe_stencil_ref stencil_ref, stencil_ref_saved;
 };
 
@@ -716,6 +717,24 @@ void cso_restore_sample_mask(struct cso_context *ctx)
cso_set_sample_mask(ctx, ctx->sample_mask_saved);
 }
 
+void cso_set_min_samples(struct cso_context *ctx, unsigned min_samples)
+{
+   if (ctx->min_samples != min_samples && ctx->pipe->set_min_samples) {
+  ctx->min_samples = min_samples;
+  ctx->pipe->set_min_samples(ctx->pipe, min_samples);
+   }
+}
+
+void cso_save_min_samples(struct cso_context *ctx)
+{
+   ctx->min_samples_saved = ctx->min_samples;
+}
+
+void cso_restore_min_samples(struct cso_context *ctx)
+{
+   cso_set_min_samples(ctx, ctx->min_samples_saved);
+}
+
 void cso_set_stencil_ref(struct cso_context *ctx,
  const struct pipe_stencil_ref *sr)
 {
diff --git a/src/gallium/auxiliary/cso_cache/cso_context.h 
b/src/gallium/auxiliary/cso_cache/cso_context.h
index 1aa9998..f0a08bb 100644
--- a/src/gallium/auxiliary/cso_cache/cso_context.h
+++ b/src/gallium/auxiliary/cso_cache/cso_context.h
@@ -164,6 +164,10 @@ void cso_set_sample_mask(struct cso_context *cso, unsigned 
stencil_mask);
 void cso_save_sample_mask(struct cso_context *ctx);
 void cso_restore_sample_mask(struct cso_context *ctx);
 
+void cso_set_min_samples(struct cso_context *cso, unsigned min_samples);
+void cso_save_min_samples(struct cso_context *ctx);
+void cso_restore_min_samples(struct cso_context *ctx);
+
 void cso_set_stencil_ref(struct cso_context *cso,
  const struct pipe_stencil_ref *sr);
 void cso_save_stencil_ref(struct cso_context *cso);
diff --git a/src/gallium/auxiliary/hud/hud_context.c 
b/src/gallium/auxiliary/hud/hud_context.c
index ccf020b..b6e0184 100644
--- a/src/gallium/auxiliary/hud/hud_context.c
+++ b/src/gallium/auxiliary/hud/hud_context.c
@@ -412,6 +412,7 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
 
cso_save_framebuffer(cso);
cso_save_sample_mask(cso);
+   cso_save_min_samples(cso);
cso_save_blend(cso);
cso_save_depth_stencil_alpha(cso);
cso_save_fragment_shader(cso);
@@ -450,6 +451,7 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
 
cso_set_framebuffer(cso, &fb);
cso_set_sample_mask(cso, ~0);
+   cso_set_min_samples(cso, 1);
cso_set_blend(cso, &hud->alpha_blend);
cso_set_depth_stencil_alpha(cso, &hud->dsa);
cso_set_rasterizer(cso, &hud->rasterizer);
@@ -538,6 +540,7 @@ hud_draw(struct hud_context *hud, struct pipe_resource *tex)
/* restore states */
cso_restore_framebuffer(cso);
cso_restore_sample_mask(cso);
+   cso_restore_min_samples(cso);
cso_restore_blend(cso);
cso_restore_depth_stencil_alpha(cso);
cso_restore_fragment_shader(cso);
diff --git a/src/gallium/auxiliary/postprocess/pp_run.c 
b/src/gallium/auxiliary/postprocess/pp_run.c
index 7d9330c..06281c8 100644
--- a/src/gallium/auxiliary/postprocess/pp_run.c
+++ b/src/gallium/auxiliary/postprocess/pp_run.c
@@ -122,6 +122,7 @@ pp_run(struct pp_queue_t *ppq, struct pipe_resource *in,
cso_save_geometry_shader(cso);
cso_save_rasterizer(cso);
cso_save_sample_mask(cso);
+   cso_save_min_samples(cso);
cso_save_samplers(cso, PIPE_SHADER_FRAGMENT);
cso_save_sampler_views(cso, PIPE_SHADER_FRAGMENT);
cso_save_stencil_ref(cso);
@@ -136,6 +137,7 @@ pp_run(struct pp_queue_t *ppq, struct pipe_resource *in,
 
/* set default state */
cso_set_sample_mask(cso, ~0);
+   cso_set_min_samples(cso, 1);
cso_set_stream_outputs(cso, 0, NULL, NULL);
cso_set_geometry_shader_handle(cso, NULL);
cso_set_render_condition(cso, NULL, FALSE, 0);
@@ -187,6 +189,7 @@ pp_run(struct pp_queue_t *ppq, struct pipe_resource *in,
cso_restore_geometry_shader(cso);
cso_res

[Mesa-dev] [PATCH 4/4] nvc0: add support for PIPE_CAP_SAMPLE_SHADING

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |  7 +
 .../drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp | 13 +
 .../drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp  | 14 +
 .../nouveau/codegen/nv50_ir_lowering_nvc0.cpp  | 21 +
 .../drivers/nouveau/codegen/nv50_ir_print.cpp  |  1 +
 .../drivers/nouveau/codegen/nv50_ir_target.cpp |  6 ++--
 .../nouveau/codegen/nv50_ir_target_nvc0.cpp|  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h|  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 17 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_program.h|  1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_screen.c |  2 +-
 .../drivers/nouveau/nvc0/nvc0_shader_state.c   |  2 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c  | 12 
 .../drivers/nouveau/nvc0/nvc0_state_validate.c | 34 +++---
 src/gallium/drivers/nouveau/nvc0/nvc0_surface.c| 11 ++-
 15 files changed, 131 insertions(+), 14 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index 96071be..56b0115 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -136,6 +136,7 @@ enum operation
OP_DFDY,
OP_RDSV, // read system value
OP_WRSV, // write system value
+   OP_PIXLD, // get info about raster object or surfaces
OP_QUADOP,
OP_QUADON,
OP_QUADPOP,
@@ -214,6 +215,12 @@ enum operation
 #define NV50_IR_SUBOP_SUCLAMP_SD(r, d) (( 0 + (r)) | ((d == 2) ? 0x10 : 0))
 #define NV50_IR_SUBOP_SUCLAMP_PL(r, d) (( 5 + (r)) | ((d == 2) ? 0x10 : 0))
 #define NV50_IR_SUBOP_SUCLAMP_BL(r, d) ((10 + (r)) | ((d == 2) ? 0x10 : 0))
+#define NV50_IR_SUBOP_PIXLD_COUNT   0
+#define NV50_IR_SUBOP_PIXLD_COVMASK 1
+#define NV50_IR_SUBOP_PIXLD_COVERED 2
+#define NV50_IR_SUBOP_PIXLD_OFFSET  3
+#define NV50_IR_SUBOP_PIXLD_CENT_OFFSET 4
+#define NV50_IR_SUBOP_PIXLD_SAMPLEID5
 #define NV50_IR_SUBOP_MADSP_SD 0x
 // Yes, we could represent those with DataType.
 // Or put the type into operation and have a couple 1000 values in that enum.
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
index a4b50ee..c258b6b 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_gk110.cpp
@@ -113,6 +113,8 @@ private:
 
void emitQUADOP(const Instruction *, uint8_t qOp, uint8_t laneMask);
 
+   void emitPIXLD(const Instruction *);
+
void emitFlow(const Instruction *);
 
inline void defId(const ValueDef&, const int pos);
@@ -1130,6 +1132,14 @@ CodeEmitterGK110::emitQUADOP(const Instruction *i, 
uint8_t qOp, uint8_t laneMask
 }
 
 void
+CodeEmitterGK110::emitPIXLD(const Instruction *i)
+{
+   emitForm_L(i, 0x7f4, 2, Modifier(0));
+   code[1] |= i->subOp << 2;
+   code[1] |= 0x0007;
+}
+
+void
 CodeEmitterGK110::emitFlow(const Instruction *i)
 {
const FlowInstruction *f = i->asFlow();
@@ -1684,6 +1694,9 @@ CodeEmitterGK110::emitInstruction(Instruction *insn)
case OP_TEXBAR:
   emitTEXBAR(insn);
   break;
+   case OP_PIXLD:
+  emitPIXLD(insn);
+  break;
case OP_BRA:
case OP_CALL:
case OP_PRERET:
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index d486c8d..cef92cf 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -135,6 +135,8 @@ private:
void emitVSHL(const Instruction *);
void emitVectorSubOp(const Instruction *);
 
+   void emitPIXLD(const Instruction *);
+
inline void defId(const ValueDef&, const int pos);
inline void defId(const Instruction *, int d, const int pos);
inline void srcId(const ValueRef&, const int pos);
@@ -2141,6 +2143,15 @@ CodeEmitterNVC0::emitVSHL(const Instruction *i)
   code[1] |= 1 << 16;
 }
 
+void
+CodeEmitterNVC0::emitPIXLD(const Instruction *i)
+{
+   assert(i->encSize == 8);
+   emitForm_A(i, HEX64(1000, 0006));
+   code[0] |= i->subOp << 5;
+   code[1] |= 0x00e0;
+}
+
 bool
 CodeEmitterNVC0::emitInstruction(Instruction *insn)
 {
@@ -2390,6 +2401,9 @@ CodeEmitterNVC0::emitInstruction(Instruction *insn)
case OP_VSHL:
   emitVSHL(insn);
   break;
+   case OP_PIXLD:
+  emitPIXLD(insn);
+  break;
case OP_PHI:
case OP_UNION:
case OP_CONSTRAINT:
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index 44b5ecd..ebdeee4 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1426,6 +1426,27 @@ NVC0LoweringPass::handleRDSV(Instruction *i)
   bld.mkL

[Mesa-dev] [PATCH 3/4] nv50: add support for PIPE_CAP_SAMPLE_SHADING

2014-04-23 Thread Ilia Mirkin
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h  |  1 +
 .../drivers/nouveau/codegen/nv50_ir_driver.h   |  3 +-
 .../drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp  | 10 +-
 .../nouveau/codegen/nv50_ir_lowering_nv50.cpp  | 12 
 .../drivers/nouveau/codegen/nv50_ir_print.cpp  |  1 +
 .../nouveau/codegen/nv50_ir_target_nv50.cpp|  2 ++
 src/gallium/drivers/nouveau/nv50/nv50_context.h|  7 -
 src/gallium/drivers/nouveau/nv50/nv50_program.c|  6 +++-
 src/gallium/drivers/nouveau/nv50/nv50_program.h|  2 ++
 src/gallium/drivers/nouveau/nv50/nv50_screen.c |  2 +-
 .../drivers/nouveau/nv50/nv50_shader_state.c   | 13 
 src/gallium/drivers/nouveau/nv50/nv50_state.c  | 12 
 .../drivers/nouveau/nv50/nv50_state_validate.c | 36 --
 src/gallium/drivers/nouveau/nv50/nv50_surface.c| 11 ++-
 14 files changed, 110 insertions(+), 8 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index 286ac83..96071be 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -354,6 +354,7 @@ enum SVSemantic
SV_POINT_COORD,
SV_CLIP_DISTANCE,
SV_SAMPLE_INDEX,
+   SV_SAMPLE_POS,
SV_TESS_FACTOR,
SV_TESS_COORD,
SV_TID,
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
index f2f4ead..2fe5591 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_driver.h
@@ -74,7 +74,6 @@ struct nv50_ir_varying
 #define NV50_SEMANTIC_INVOCATIONID  (TGSI_SEMANTIC_COUNT + 6)
 #define NV50_SEMANTIC_TESSFACTOR(TGSI_SEMANTIC_COUNT + 7)
 #define NV50_SEMANTIC_TESSCOORD (TGSI_SEMANTIC_COUNT + 8)
-#define NV50_SEMANTIC_SAMPLEMASK(TGSI_SEMANTIC_COUNT + 9)
 #define NV50_SEMANTIC_COUNT (TGSI_SEMANTIC_COUNT + 10)
 
 #define NV50_TESS_PART_FRACT_ODD  0
@@ -181,12 +180,14 @@ struct nv50_ir_prog_info
   uint8_t edgeFlagOut;
   uint8_t fragDepth; /* output index of FragDepth */
   uint8_t sampleMask;/* output index of SampleMask */
+  boolean sampleInterp;  /* perform sample interp on all fp inputs */
   uint8_t backFaceColor[2];  /* input/output indices of back face colour */
   uint8_t globalAccess;  /* 1 for read, 2 for wr, 3 for rw */
   boolean nv50styleSurfaces; /* generate gX[] access for raw buffers */
   uint8_t resInfoCBSlot; /* cX[] used for tex handles, surface info */
   uint16_t texBindBase;  /* base address for tex handles (nve4) */
   uint16_t suInfoBase;   /* base address for surface info (nve4) */
+  uint16_t sampleInfoBase;   /* base address for sample positions */
   uint8_t msInfoCBSlot;  /* cX[] used for multisample info */
   uint16_t msInfoBase;   /* base address for multisample info */
} io;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 90820ea..1fc54cb 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -346,6 +346,8 @@ static nv50_ir::SVSemantic translateSysVal(uint sysval)
case TGSI_SEMANTIC_BLOCK_ID:   return nv50_ir::SV_CTAID;
case TGSI_SEMANTIC_BLOCK_SIZE: return nv50_ir::SV_NTID;
case TGSI_SEMANTIC_THREAD_ID:  return nv50_ir::SV_TID;
+   case TGSI_SEMANTIC_SAMPLEID:   return nv50_ir::SV_SAMPLE_INDEX;
+   case TGSI_SEMANTIC_SAMPLEPOS:  return nv50_ir::SV_SAMPLE_POS;
default:
   assert(0);
   return nv50_ir::SV_CLOCK;
@@ -925,7 +927,7 @@ bool Source::scanDeclaration(const struct 
tgsi_full_declaration *decl)
default:
   break;
}
-   if (decl->Interp.Centroid)
+   if (decl->Interp.Centroid || info->io.sampleInterp)
   info->in[i].centroid = 1;
 }
  }
@@ -956,6 +958,9 @@ bool Source::scanDeclaration(const struct 
tgsi_full_declaration *decl)
decl->Declaration.UsageMask << (si * 4);
 info->io.genUserClip = -1;
 break;
+ case TGSI_SEMANTIC_SAMPLEMASK:
+info->io.sampleMask = i;
+break;
  default:
 break;
  }
@@ -972,6 +977,9 @@ bool Source::scanDeclaration(const struct 
tgsi_full_declaration *decl)
   case TGSI_SEMANTIC_VERTEXID:
  info->io.vertexId = first;
  break;
+  case TGSI_SEMANTIC_SAMPLEMASK:
+ info->io.sampleMask = first;
+ break;
   default:
  break;
   }
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv50.cpp
index 29f85cf..69e88e6 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nv

[Mesa-dev] [PATCH 5/7] nvc0/ir: add support for SAMPLEMASK sysval

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 4 
 src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp | 1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp   | 1 +
 5 files changed, 8 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir.h 
b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
index 56b0115..c57729e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir.h
@@ -362,6 +362,7 @@ enum SVSemantic
SV_CLIP_DISTANCE,
SV_SAMPLE_INDEX,
SV_SAMPLE_POS,
+   SV_SAMPLE_MASK,
SV_TESS_FACTOR,
SV_TESS_COORD,
SV_TID,
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 0ec0b9a..2c302a7 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -348,6 +348,7 @@ static nv50_ir::SVSemantic translateSysVal(uint sysval)
case TGSI_SEMANTIC_THREAD_ID:  return nv50_ir::SV_TID;
case TGSI_SEMANTIC_SAMPLEID:   return nv50_ir::SV_SAMPLE_INDEX;
case TGSI_SEMANTIC_SAMPLEPOS:  return nv50_ir::SV_SAMPLE_POS;
+   case TGSI_SEMANTIC_SAMPLEMASK: return nv50_ir::SV_SAMPLE_MASK;
default:
   assert(0);
   return nv50_ir::SV_CLOCK;
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
index ebdeee4..c7e9063 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp
@@ -1447,6 +1447,10 @@ NVC0LoweringPass::handleRDSV(Instruction *i)
  off);
   break;
}
+   case SV_SAMPLE_MASK:
+  ld = bld.mkOp1(OP_PIXLD, TYPE_U32, i->getDef(0), bld.mkImm(0));
+  ld->subOp = NV50_IR_SUBOP_PIXLD_COVMASK;
+  break;
default:
   if (prog->getType() == Program::TYPE_TESSELLATION_EVAL)
  vtx = bld.mkOp1v(OP_PFETCH, TYPE_U32, bld.getSSA(), bld.mkImm(0));
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
index e74b25f..42013e5 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp
@@ -255,6 +255,7 @@ static const char *SemanticStr[SV_LAST + 1] =
"CLIP_DISTANCE",
"SAMPLE_INDEX",
"SAMPLE_POS",
+   "SAMPLE_MASK",
"TESS_FACTOR",
"TESS_COORD",
"TID",
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
index 95ed849..c747f3e 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp
@@ -284,6 +284,7 @@ TargetNVC0::getSVAddress(DataFile shaderFile, const Symbol 
*sym) const
case SV_GRIDID: return kepler ? 0x18 : ~0;
case SV_SAMPLE_INDEX:   return 0;
case SV_SAMPLE_POS: return 0;
+   case SV_SAMPLE_MASK:return 0;
default:
   return 0x;
}
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/7] gallium: add GS_INVOCATIONS property

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_ureg.c | 9 +
 src/gallium/include/pipe/p_shader_tokens.h | 3 ++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_ureg.c 
b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
index 38cce58..2bf93ee 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_ureg.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_ureg.c
@@ -1468,6 +1468,14 @@ static void emit_decls( struct ureg_program *ureg )
 ureg->property_gs_max_vertices);
}
 
+   if (ureg->property_gs_invocations != ~0) {
+  assert(ureg->processor == TGSI_PROCESSOR_GEOMETRY);
+
+  emit_property(ureg,
+TGSI_PROPERTY_GS_INVOCATIONS,
+ureg->property_gs_invocations);
+   }
+
if (ureg->property_fs_coord_origin) {
   assert(ureg->processor == TGSI_PROCESSOR_FRAGMENT);
 
@@ -1757,6 +1765,7 @@ struct ureg_program *ureg_create( unsigned processor )
ureg->property_gs_input_prim = ~0;
ureg->property_gs_output_prim = ~0;
ureg->property_gs_max_vertices = ~0;
+   ureg->property_gs_invocations = ~0;
 
ureg->free_temps = util_bitmask_create();
if (ureg->free_temps == NULL)
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 1903d53..b537166 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -236,7 +236,8 @@ union tgsi_immediate_data
 #define TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS 5
 #define TGSI_PROPERTY_FS_DEPTH_LAYOUT6
 #define TGSI_PROPERTY_VS_PROHIBIT_UCPS   7
-#define TGSI_PROPERTY_COUNT  8
+#define TGSI_PROPERTY_GS_INVOCATIONS 8
+#define TGSI_PROPERTY_COUNT  9
 
 struct tgsi_property {
unsigned Type : 4;  /**< TGSI_TOKEN_TYPE_PROPERTY */
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/7] mesa/st: translate gl_SampleMaskIn to SAMPLEMASK semantic

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 0d69c70..ee8c54a 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -4166,6 +4166,7 @@ static unsigned mesa_sysval_to_semantic[SYSTEM_VALUE_MAX] 
= {
TGSI_SEMANTIC_INSTANCEID,
TGSI_SEMANTIC_SAMPLEID,
TGSI_SEMANTIC_SAMPLEPOS,
+   TGSI_SEMANTIC_SAMPLEMASK,
 };
 
 /**
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] gallium: add INVOCATIONID semantic

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/auxiliary/tgsi/tgsi_strings.c  | 3 ++-
 src/gallium/docs/source/tgsi.rst   | 6 ++
 src/gallium/include/pipe/p_shader_tokens.h | 3 ++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c 
b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index 2be726c..5b6e47f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -84,7 +84,8 @@ const char *tgsi_semantic_names[TGSI_SEMANTIC_COUNT] =
"CULLDIST",
"SAMPLEID",
"SAMPLEPOS",
-   "SAMPLEMASK"
+   "SAMPLEMASK",
+   "INVOCATIONID",
 };
 
 const char *tgsi_texture_names[TGSI_TEXTURE_COUNT] =
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index b7d016a..170f83e 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2641,6 +2641,12 @@ For fragment shaders, this semantic label indicates that 
an output contains
 the sample mask used to disable further sample processing
 (i.e. gl_SampleMask). Only the X value is used, up to 32x MS.
 
+TGSI_SEMANTIC_INVOCATIONID
+""
+
+For geometry shaders, this semantic label indicates that a system value
+contains the current invocation id (i.e. gl_InvocationID). Only the X value is
+used.
 
 Declaration Interpolate
 ^^^
diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
b/src/gallium/include/pipe/p_shader_tokens.h
index 74e0475..1903d53 100644
--- a/src/gallium/include/pipe/p_shader_tokens.h
+++ b/src/gallium/include/pipe/p_shader_tokens.h
@@ -170,7 +170,8 @@ struct tgsi_declaration_interp
 #define TGSI_SEMANTIC_SAMPLEID   24
 #define TGSI_SEMANTIC_SAMPLEPOS  25
 #define TGSI_SEMANTIC_SAMPLEMASK 26
-#define TGSI_SEMANTIC_COUNT  27 /**< number of semantic values */
+#define TGSI_SEMANTIC_INVOCATIONID 27
+#define TGSI_SEMANTIC_COUNT  28 /**< number of semantic values */
 
 struct tgsi_declaration_semantic
 {
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/7] gallium: catch up with mesa's system values

2014-04-23 Thread Ilia Mirkin
This adds support for gl_SampleMaskIn and gl_InvocationID system values to
gallium, mesa/st, and nvc0. The relevant piglit tests pass, except 2
gl_InvocationID-related ones that fail due to the linking being delayed by
mesa/st (I think). I wrote a lame one for gl_SampleMaskIn that just makes sure
that it's == 1 in a non-ms scenario, as nothing better seemed to be available.

None of this functionality is actually available yet since it's all behind
ARB_gpu_shader5. However you can test it out by adding
MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5

Note that this patchset applies on top of my ARB_sample_shading patch because
that one adds the earlier system values.

Ilia Mirkin (7):
  gallium: add INVOCATIONID semantic
  gallium: add GS_INVOCATIONS property
  mesa/st: translate gl_SampleMaskIn to SAMPLEMASK semantic
  mesa/st: translate gl_InvocationID to INVOCATIONID semantic
  nvc0/ir: add support for SAMPLEMASK sysval
  nvc0/ir: add support for INVOCATIONID system value
  nvc0/ir: set instance count based on the GS_INSTANCES property

 src/gallium/auxiliary/tgsi/tgsi_strings.c | 3 ++-
 src/gallium/auxiliary/tgsi/tgsi_ureg.c| 9 +
 src/gallium/docs/source/tgsi.rst  | 6 ++
 src/gallium/drivers/nouveau/codegen/nv50_ir.h | 1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 6 +++---
 src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp | 4 
 src/gallium/drivers/nouveau/codegen/nv50_ir_print.cpp | 1 +
 src/gallium/drivers/nouveau/codegen/nv50_ir_target_nvc0.cpp   | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c   | 1 -
 src/gallium/include/pipe/p_shader_tokens.h| 6 --
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp| 2 ++
 11 files changed, 33 insertions(+), 7 deletions(-)

-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/7] nvc0/ir: add support for INVOCATIONID system value

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 1 +
 src/gallium/drivers/nouveau/nvc0/nvc0_program.c   | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index 2c302a7..e076e72 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -349,6 +349,7 @@ static nv50_ir::SVSemantic translateSysVal(uint sysval)
case TGSI_SEMANTIC_SAMPLEID:   return nv50_ir::SV_SAMPLE_INDEX;
case TGSI_SEMANTIC_SAMPLEPOS:  return nv50_ir::SV_SAMPLE_POS;
case TGSI_SEMANTIC_SAMPLEMASK: return nv50_ir::SV_SAMPLE_MASK;
+   case TGSI_SEMANTIC_INVOCATIONID: return nv50_ir::SV_INVOCATION_ID;
default:
   assert(0);
   return nv50_ir::SV_CLOCK;
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
index 1df84f3..1d59fc4 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
@@ -51,7 +51,6 @@ nvc0_shader_input_address(unsigned sn, unsigned si, unsigned 
ubase)
case TGSI_SEMANTIC_VERTEXID: return 0x2fc;
case TGSI_SEMANTIC_TEXCOORD: return 0x300 + si * 0x10;
case TGSI_SEMANTIC_FACE: return 0x3fc;
-   case NV50_SEMANTIC_INVOCATIONID: return ~0;
default:
   assert(!"invalid TGSI input semantic");
   return ~0;
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/7] mesa/st: translate gl_InvocationID to INVOCATIONID semantic

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index ee8c54a..e87e761 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -4167,6 +4167,7 @@ static unsigned mesa_sysval_to_semantic[SYSTEM_VALUE_MAX] 
= {
TGSI_SEMANTIC_SAMPLEID,
TGSI_SEMANTIC_SAMPLEPOS,
TGSI_SEMANTIC_SAMPLEMASK,
+   TGSI_SEMANTIC_INVOCATIONID,
 };
 
 /**
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/7] nvc0/ir: set instance count based on the GS_INVOCATIONS property

2014-04-23 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index e076e72..ef00a2f 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -829,11 +829,9 @@ void Source::scanProperty(const struct tgsi_full_property 
*prop)
case TGSI_PROPERTY_GS_MAX_OUTPUT_VERTICES:
   info->prop.gp.maxVertices = prop->u[0].Data;
   break;
-#if 0
-   case TGSI_PROPERTY_GS_INSTANCE_COUNT:
+   case TGSI_PROPERTY_GS_INVOCATIONS:
   info->prop.gp.instanceCount = prop->u[0].Data;
   break;
-#endif
case TGSI_PROPERTY_FS_COLOR0_WRITES_ALL_CBUFS:
   info->prop.fp.separateFragData = TRUE;
   break;
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/19] i965/fs: Lower LOAD_PAYLOAD and clean up.

2014-04-23 Thread Kenneth Graunke
On 04/18/2014 11:56 AM, Matt Turner wrote:
> Clean up with with register_coalesce()/dead_code_eliminate().
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 37 
> 
>  src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
>  2 files changed, 38 insertions(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index e963ee8..602fc4a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -2635,6 +2635,38 @@ fs_visitor::lower_uniform_pull_constant_loads()
> }
>  }
>  
> +bool
> +fs_visitor::lower_load_payload()
> +{
> +   bool progress = false;
> +
> +   foreach_list_safe(node, &instructions) {
> +  fs_inst *inst = (fs_inst *)node;
> +
> +  if (inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD) {
> + fs_reg dst = inst->dst;
> +

It would be great to have a comment here such as:

/* src[0] represents the (optional) message header. */

It might also be worth adding a comment above the opcode definition in
the previous patch explaining that src[0] is reserved for an optional
message header, and could be BAD_FILE, while the rest of the parameters
follow in src[1..n].  Notably, this is the first opcode where you can
have a BAD_FILE early, and real parameters later (AFAIK).

> + if (inst->src[0].file != BAD_FILE) {
> +inst->insert_before(MOV(dst, inst->src[0]));
> + }
> + dst.reg_offset++;
> +
> + for (int i = 1; i < inst->sources; i++) {
> +inst->insert_before(MOV(dst, inst->src[i]));
> +dst.reg_offset++;
> + }
> +
> + inst->remove();
> + progress = true;
> +  }
> +   }
> +
> +   if (progress)
> +  invalidate_live_intervals();
> +
> +   return progress;
> +}
> +
>  void
>  fs_visitor::dump_instructions()
>  {
> @@ -3085,6 +3117,11 @@ fs_visitor::run()
>progress = compute_to_mrf() || progress;
>} while (progress);
>  
> +  if (lower_load_payload()) {
> + register_coalesce();
> + dead_code_eliminate();
> +  }
> +
>lower_uniform_pull_constant_loads();
>  
>assign_curb_setup();
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 4f1bdc6..c1445b8 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -392,6 +392,7 @@ public:
> void fail(const char *msg, ...);
> void no16(const char *msg, ...);
> void lower_uniform_pull_constant_loads();
> +   bool lower_load_payload();
>  
> void push_force_uncompressed();
> void pop_force_uncompressed();
> 




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/19] i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().

2014-04-23 Thread Kenneth Graunke
On 04/18/2014 11:56 AM, Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 
> +++
>  1 file changed, 73 insertions(+), 62 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 2aa3acd..91bbe0a 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -1257,8 +1257,11 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg 
> dst, fs_reg coordinate,
> int reg_width = dispatch_width / 8;
> bool header_present = false;
>  
> -   fs_reg payload = fs_reg(this, glsl_type::float_type);
> -   fs_reg next = payload;
> +   fs_reg *sources = ralloc_array(mem_ctx, fs_reg, MAX_SAMPLER_MESSAGE_SIZE);
> +   for (int i = 0; i < MAX_SAMPLER_MESSAGE_SIZE; i++) {
> +  sources[i] = fs_reg(this, glsl_type::float_type);
> +   }
> +   int length = 0;
>  
> if (ir->op == ir_tg4 || (ir->offset && ir->op != ir_txf) || sampler >= 
> 16) {
>/* For general texture offsets (no txf workaround), we need a header to
> @@ -1272,12 +1275,13 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg 
> dst, fs_reg coordinate,
> * need to offset the Sampler State Pointer in the header.
> */
>header_present = true;
> -  next.reg_offset++;
> +  sources[length] = reg_undef;
> +  length++;
> }

So...if you don't take the header_present = true path...then it looks
like the next thing (say, shadow_c) will land in sources[0], rather than
leaving sources[0] as BAD_FILE and putting the first part of the payload
in sources[1].

Is sources[0] special and reserved for the header or isn't it?

>  
> if (ir->shadow_comparitor) {
> -  emit(MOV(next, shadow_c));
> -  next.reg_offset++;
> +  emit(MOV(sources[length], shadow_c));

I'm confused by the fact that these MOVs are still here.  I would have
expected:

sources[length] = sample_c;

When we visit expression trees, we generate results into various
registers.  The point of these MOVs is to put them into a large-VGRF we
can SEND from, at the right ref_offset.

Now, you have a new set of MAX_SAMPLER_MESSAGE_SIZE registers, and copy
from the original registers to these new temporaries, then LOAD_PAYLOAD
does a second copy from those into the new ones.  It seems like we could
just use the original registers and do a single copy.

I'm probably missing something here, but I can't think of what.

> +  length++;
> }
>  
> bool has_nonconstant_offset = ir->offset && !ir->offset->as_constant();
> @@ -1289,12 +1293,12 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg 
> dst, fs_reg coordinate,
> case ir_lod:
>break;
> case ir_txb:
> -  emit(MOV(next, lod));
> -  next.reg_offset++;
> +  emit(MOV(sources[length], lod));
> +  length++;
>break;
> case ir_txl:
> -  emit(MOV(next, lod));
> -  next.reg_offset++;
> +  emit(MOV(sources[length], lod));
> +  length++;
>break;
> case ir_txd: {
>no16("Gen7 does not support sample_d/sample_d_c in SIMD16 mode.");
> @@ -1303,21 +1307,21 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg 
> dst, fs_reg coordinate,
> * [hdr], [ref], x, dPdx.x, dPdy.x, y, dPdx.y, dPdy.y, z, dPdx.z, 
> dPdy.z
> */
>for (int i = 0; i < ir->coordinate->type->vector_elements; i++) {
> -  emit(MOV(next, coordinate));
> +  emit(MOV(sources[length], coordinate));
>coordinate.reg_offset++;
> -  next.reg_offset++;
> +  length++;
>  
>   /* For cube map array, the coordinate is (u,v,r,ai) but there are
>* only derivatives for (u, v, r).
>*/
>   if (i < ir->lod_info.grad.dPdx->type->vector_elements) {
> -emit(MOV(next, lod));
> +emit(MOV(sources[length], lod));
>  lod.reg_offset++;
> -next.reg_offset++;
> +length++;
>  
> -emit(MOV(next, lod2));
> +emit(MOV(sources[length], lod2));
>  lod2.reg_offset++;
> -next.reg_offset++;
> +length++;
>   }
>}
>  
> @@ -1325,45 +1329,45 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg 
> dst, fs_reg coordinate,
>break;
> }
> case ir_txs:
> -  emit(MOV(retype(next, BRW_REGISTER_TYPE_UD), lod));
> -  next.reg_offset++;
> +  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), lod));
> +  length++;
>break;
> case ir_query_levels:
> -  emit(MOV(retype(next, BRW_REGISTER_TYPE_UD), fs_reg(0u)));
> -  next.reg_offset++;
> +  emit(MOV(retype(sources[length], BRW_REGISTER_TYPE_UD), fs_reg(0u)));
> +  length++;
>break;
> case ir_txf:
>/* Unfortunately, the parameters for LD are intermixed: u, lod, v, r. 
> */
> -  emit(MOV(retype(next, BRW_REGISTER_TYPE_D), coordinate));
> +  emit(MOV(retype(sources[lengt

Re: [Mesa-dev] [PATCH 13/19] i965/fs: Only consider real sources when comparing instructions.

2014-04-23 Thread Kenneth Graunke
On 04/18/2014 11:56 AM, Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)

I originally thought this needed to go earlier in the patch series,
since by this point you're emitting opcodes with more than 3 sources.
However, CSE ignores SHADER_OPCODE_LOAD_PAYLOAD, so this code will never
run for opcodes that could break.  So, it's probably fine.

> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> index 94f657d..e40567f 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
> @@ -94,10 +94,20 @@ is_expression_commutative(enum opcode op)
>  }
>  
>  static bool
> -operands_match(enum opcode op, fs_reg *xs, fs_reg *ys)
> +operands_match(fs_inst *a, fs_inst *b)
>  {
> -   if (!is_expression_commutative(op)) {
> -  return xs[0].equals(ys[0]) && xs[1].equals(ys[1]) && 
> xs[2].equals(ys[2]);
> +   fs_reg *xs = a->src;
> +   fs_reg *ys = b->src;
> +
> +   if (!is_expression_commutative(a->opcode)) {
> +  bool match = true;
> +  for (int i = 0; i < a->sources; i++) {
> + if (!xs[i].equals(ys[i])) {
> +match = false;
> +break;
> + }
> +  }
> +  return match;
> } else {
>return (xs[0].equals(ys[0]) && xs[1].equals(ys[1])) ||
>   (xs[1].equals(ys[0]) && xs[0].equals(ys[1]));

It strikes me as a bit asymmetric to have the first block handle an
arbitrary number of sources, and the second only check 0 and 1.  It
makes sense, since our commutative opcodes are all binops, but...

How about adding

  /* All commutative opcodes are binary operations. */
  assert(a->sources == 2 && b->sources == 2);

here in the commutative case?

> @@ -113,7 +123,8 @@ instructions_match(fs_inst *a, fs_inst *b)
>a->predicate_inverse == b->predicate_inverse &&
>a->conditional_mod == b->conditional_mod &&
>a->dst.type == b->dst.type &&
> -  operands_match(a->opcode, a->src, b->src);
> +  a->sources == b->sources &&
> +  operands_match(a, b);
>  }
>  
>  bool
> 



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/19] i965/fs: Only consider real sources when comparing instructions.

2014-04-23 Thread Matt Turner
On Wed, Apr 23, 2014 at 11:25 PM, Kenneth Graunke  wrote:
> On 04/18/2014 11:56 AM, Matt Turner wrote:
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 19 +++
>>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> I originally thought this needed to go earlier in the patch series,
> since by this point you're emitting opcodes with more than 3 sources.
> However, CSE ignores SHADER_OPCODE_LOAD_PAYLOAD, so this code will never
> run for opcodes that could break.  So, it's probably fine.
>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>> index 94f657d..e40567f 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>> @@ -94,10 +94,20 @@ is_expression_commutative(enum opcode op)
>>  }
>>
>>  static bool
>> -operands_match(enum opcode op, fs_reg *xs, fs_reg *ys)
>> +operands_match(fs_inst *a, fs_inst *b)
>>  {
>> -   if (!is_expression_commutative(op)) {
>> -  return xs[0].equals(ys[0]) && xs[1].equals(ys[1]) && 
>> xs[2].equals(ys[2]);
>> +   fs_reg *xs = a->src;
>> +   fs_reg *ys = b->src;
>> +
>> +   if (!is_expression_commutative(a->opcode)) {
>> +  bool match = true;
>> +  for (int i = 0; i < a->sources; i++) {
>> + if (!xs[i].equals(ys[i])) {
>> +match = false;
>> +break;
>> + }
>> +  }
>> +  return match;
>> } else {
>>return (xs[0].equals(ys[0]) && xs[1].equals(ys[1])) ||
>>   (xs[1].equals(ys[0]) && xs[0].equals(ys[1]));
>
> It strikes me as a bit asymmetric to have the first block handle an
> arbitrary number of sources, and the second only check 0 and 1.  It
> makes sense, since our commutative opcodes are all binops, but...
>
> How about adding
>
>   /* All commutative opcodes are binary operations. */
>   assert(a->sources == 2 && b->sources == 2);
>
> here in the commutative case?

What is an example of a commutative non-binary operator?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/19] i965/fs: Combine fs_inst constructors using default parameters.

2014-04-23 Thread Kenneth Graunke
On 04/18/2014 11:56 AM, Matt Turner wrote:
> Wouldn't it be nice if case labels could be non-constant expressions.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 77 
> +---
>  src/mesa/drivers/dri/i965/brw_fs.h   | 10 ++---
>  2 files changed, 31 insertions(+), 56 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index b0d6e4e..bb2d290 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -74,69 +74,46 @@ fs_inst::fs_inst()
> this->opcode = BRW_OPCODE_NOP;
>  }
>  
> -fs_inst::fs_inst(enum opcode opcode)
> +fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst,
> + const fs_reg &src0, const fs_reg &src1, const fs_reg &src2)
>  {
> -   init();
> -   this->opcode = opcode;
> -}
> +   if (&dst == ®_undef)
> +  assert(&src0 == ®_undef);
> +   if (&src0 == ®_undef)
> +  assert(&src1 == ®_undef);
> +   if (&src1 == ®_undef)
> +  assert(&src2 == ®_undef);
>  
> -fs_inst::fs_inst(enum opcode opcode, fs_reg dst)
> -{
> init();
> -   this->opcode = opcode;
> -   this->dst = dst;
>  
> -   if (dst.file == GRF)
> -  assert(dst.reg_offset >= 0);
> -}
> +   if (&src2 != ®_undef) {
> +  goto src2;
> +   } else if (&src1 != ®_undef) {
> +  goto src1;
> +   } else if (&src0 != ®_undef) {
> +  goto src0;
> +   } else if (&dst != ®_undef) {
> +  goto dst;
> +   }
>  
> -fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0)
> -{
> -   init();
> -   this->opcode = opcode;
> -   this->dst = dst;
> +src2:
> +   this->src[2] = src2;
> +   if (src[2].file == GRF)
> +  assert(src[2].reg_offset >= 0);
> +src1:
> +   this->src[1] = src1;
> +   if (src[1].file == GRF)
> +  assert(src[1].reg_offset >= 0);
> +src0:
> this->src[0] = src0;
> -
> -   if (dst.file == GRF)
> -  assert(dst.reg_offset >= 0);
> if (src[0].file == GRF)
>assert(src[0].reg_offset >= 0);
> -}
> -
> -fs_inst::fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1)
> -{
> -   init();
> -   this->opcode = opcode;
> +dst:
> this->dst = dst;
> -   this->src[0] = src0;
> -   this->src[1] = src1;
> -
> if (dst.file == GRF)
>assert(dst.reg_offset >= 0);
> -   if (src[0].file == GRF)
> -  assert(src[0].reg_offset >= 0);
> -   if (src[1].file == GRF)
> -  assert(src[1].reg_offset >= 0);
> -}
>  
> -fs_inst::fs_inst(enum opcode opcode, fs_reg dst,
> -  fs_reg src0, fs_reg src1, fs_reg src2)
> -{
> -   init();
> this->opcode = opcode;
> -   this->dst = dst;
> -   this->src[0] = src0;
> -   this->src[1] = src1;
> -   this->src[2] = src2;
> -
> -   if (dst.file == GRF)
> -  assert(dst.reg_offset >= 0);
> -   if (src[0].file == GRF)
> -  assert(src[0].reg_offset >= 0);
> -   if (src[1].file == GRF)
> -  assert(src[1].reg_offset >= 0);
> -   if (src[2].file == GRF)
> -  assert(src[2].reg_offset >= 0);
>  }
>  
>  fs_inst::fs_inst(const fs_inst &that)
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index 8af4520..8e2af4f 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -192,12 +192,10 @@ public:
> void init();
>  
> fs_inst();
> -   fs_inst(enum opcode opcode);
> -   fs_inst(enum opcode opcode, fs_reg dst);
> -   fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0);
> -   fs_inst(enum opcode opcode, fs_reg dst, fs_reg src0, fs_reg src1);
> -   fs_inst(enum opcode opcode, fs_reg dst,
> -   fs_reg src0, fs_reg src1,fs_reg src2);
> +   fs_inst(enum opcode opcode, const fs_reg &dst = reg_undef,
> +   const fs_reg &src0 = reg_undef,
> +   const fs_reg &src1 = reg_undef,
> +   const fs_reg &src2 = reg_undef);
> fs_inst(const fs_inst &that);
>  
> bool equals(fs_inst *inst) const;
> 

I really don't like this code...lots if if, goto, and reliance on
pointer comparisons with global variables.

What about instead making init() look like your array constructor, and
building everything on top of that?  i.e.

void
fs_inst::init(enum opcode opcode, const fs_reg &dst, int sources = 1,
fs_reg *src = NULL)
{
   memset(this, 0, sizeof(*this));

   this->sources = sources;
   if (src) {
  this->src = src;
   } else {
  this->src = ralloc_array(this, fs_reg, sources);
  for (int i = 0; i < sources; i++)
 this->src[i] = reg_undef;
   }

   this->conditional_mod = BRW_CONDITIONAL_NONE;

   this->dst = dst;

   /* This will be the case for almost all instructions. */
   this->regs_written = 1;

   this->writes_accumulator = false;
}

fs_inst::fs_inst()
{
   init(BRW_OPCODE_NOP, reg_undef);
}

fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst)
{
   init(opcode, dst);
}

fs_inst::fs_inst(enum opcode opcode, const fs_reg &dst,
 const fs_reg &src0)
{
   fs_reg *src = ralloc_array(this, fs_reg, 1);
   src[0] = src0;
   init(opcode, d

Re: [Mesa-dev] [PATCH 13/19] i965/fs: Only consider real sources when comparing instructions.

2014-04-23 Thread Kenneth Graunke
On 04/23/2014 11:39 PM, Matt Turner wrote:
> On Wed, Apr 23, 2014 at 11:25 PM, Kenneth Graunke  
> wrote:
>> On 04/18/2014 11:56 AM, Matt Turner wrote:
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 19 +++
>>>  1 file changed, 15 insertions(+), 4 deletions(-)
>>
>> I originally thought this needed to go earlier in the patch series,
>> since by this point you're emitting opcodes with more than 3 sources.
>> However, CSE ignores SHADER_OPCODE_LOAD_PAYLOAD, so this code will never
>> run for opcodes that could break.  So, it's probably fine.
>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
>>> b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>>> index 94f657d..e40567f 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
>>> @@ -94,10 +94,20 @@ is_expression_commutative(enum opcode op)
>>>  }
>>>
>>>  static bool
>>> -operands_match(enum opcode op, fs_reg *xs, fs_reg *ys)
>>> +operands_match(fs_inst *a, fs_inst *b)
>>>  {
>>> -   if (!is_expression_commutative(op)) {
>>> -  return xs[0].equals(ys[0]) && xs[1].equals(ys[1]) && 
>>> xs[2].equals(ys[2]);
>>> +   fs_reg *xs = a->src;
>>> +   fs_reg *ys = b->src;
>>> +
>>> +   if (!is_expression_commutative(a->opcode)) {
>>> +  bool match = true;
>>> +  for (int i = 0; i < a->sources; i++) {
>>> + if (!xs[i].equals(ys[i])) {
>>> +match = false;
>>> +break;
>>> + }
>>> +  }
>>> +  return match;
>>> } else {
>>>return (xs[0].equals(ys[0]) && xs[1].equals(ys[1])) ||
>>>   (xs[1].equals(ys[0]) && xs[0].equals(ys[1]));
>>
>> It strikes me as a bit asymmetric to have the first block handle an
>> arbitrary number of sources, and the second only check 0 and 1.  It
>> makes sense, since our commutative opcodes are all binops, but...
>>
>> How about adding
>>
>>   /* All commutative opcodes are binary operations. */
>>   assert(a->sources == 2 && b->sources == 2);
>>
>> here in the commutative case?
> 
> What is an example of a commutative non-binary operator?

I don't think we have any in i965, but things like min3/max3/mid3 are
three-source operations that are "commutative".

At any rate, adding the assertion makes it obvious that this code is
correct by design, and we didn't just forget to update it for
arbitrary-length source lists...I don't think it's an unreasonable
request...



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev