Re: [Mesa-dev] [PATCH 1/2] tgsi: texture types

2015-06-10 Thread Jose Fonseca
I think it makes sense for us to start using SVIEW regardless, and 
uniformize things.



Even if GLSL will never support independent texture/samplers, D3D10, 
OpenCL, Metal, and potential SPIR-V all do.


Roland, could you prepare a patch for llvmpipe, so that it infers the 
sampler view count from the SAMPLE_* opcodes instead of from SVIEW 
declarations, and therefore unblock Rob?



BTW, even if you avoid intermediate TGSI on GLSL -> NIR, don't you still 
need to handle TGSI generated by the state tracker? (For things like 
blits, mipmap generation, identity shaders, clear shaders, etc.)



Jose


On 10/06/15 21:07, Rob Clark wrote:

that is starting to look more attractive, mostly just because
tgsi_transform stuff is so cumbersome..

(I did start thinking about just adding type to decl's in general,
since really it would be better to have type information for IN's
and OUT's too.. but then decided I'd probably rather spend my time on
support in mesa st to go straight from glsl to nir and bypass tgsi,
rather than going down that rabbit hole)

BR,
-R

On Wed, Jun 10, 2015 at 3:51 PM, Marek Olšák  wrote:

There is also the option of adding the sampler type to either the SAMP
declaration or texture instructions. This will move us further away
from adopting SVIEW, but I don't see that happening for OpenGL anyway.

Marek

On Wed, Jun 10, 2015 at 8:59 PM, Rob Clark  wrote:

So, afaiu, everything that might insert a sampler is using
tgsi_transform_shader()?  There aren't too many of those, and I think
I can fix them up to not violate that constraint.

(It does occur to me that I may end up needing to fix u_blitter to
differentiate between blitting float vs int to avoid some regressions
in freedreno, since I'd no longer be using shader variants based on
bound samplers.. but I guess that is unrelated and a separate patch)

BR,
-R

On Wed, Jun 10, 2015 at 2:55 PM, Roland Scheidegger  wrote:

My biggest problem with that is the initial case I raised:
draw is going to modify these shaders in some cases (the aaline stage, for
example), adding its own sampler, and it doesn't know anything about
distinguishing shaders with sampler views from those without.
The same goes for any other code which may modify shaders
similarly - it needs to be modified not just to always use sampler views,
but to use them based on whether the incoming shader already uses them.
Which conceptually looks worse to me. But otherwise I agree this should
work.

Roland


Am 10.06.2015 um 20:30 schrieb Rob Clark:

Hmm, at least tgsi_text_translate() doesn't appear to use tgsi_ureg..
and there are still a number of users of tgsi_text_translate().. I
guess handling this in tgsi_ureg would avoid fixing all the tgsi_ureg
users, but that still leaves a lot of others.  Changing them all still
seems to be too intrusive to me.

(And also, I have a large collection of saved tgsi shaders that I use
for standalone testing of my shader compiler and don't really like the
idea of fixing up 700 or 800 tgsi shaders by hand :-P)

That said, looking at code like llvmpipe, where Roland/Jose were
thinking we might have problems..  by making the assumption that we
never mix TEX* and SAMPLE* opcodes, I think we can loosen the
restriction to:

   for TEX* instructions, the tgsi must either *not* include SVIEW, or
*must* include a matching SVIEW[idx] for every SAMP[idx]

Which is a restriction that glsl_to_tgsi follows.

If you follow this restriction, then for TEX* shaders which have
SVIEW's, file_max[TGSI_FILE_SAMPLER_VIEW] ==
file_max[TGSI_FILE_SAMPLER] and file_mask[TGSI_FILE_SAMPLER_VIEW] ==
file_mask[TGSI_FILE_SAMPLER].. so the code takes a different but
equivalent path.   And for TEX* shaders which don't have SVIEW's,
everything continues to work as before.

With this approach, we don't have to fix up everything to create
SVIEW[idx] for every SAMP[idx], as long as glsl_to_tgsi always creates
SVIEW[idx] for each SAMP[idx], and any other tgsi generator that later
adds SVIEW support for TEX* instructions follows the same pattern.
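As an illustration of that convention, a TEX*-style fragment shader with a matching SVIEW[#] for each SAMP[#] might look like the following (a hypothetical hand-written sketch, not actual glsl_to_tgsi output):

```
FRAG
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL SVIEW[0], 2D, SINT
TEX OUT[0], IN[0], SAMP[0], 2D
END
```

Here the SVIEW[0] declaration carries the return type (SINT), which is what a driver needs to generate correct code for integer textures, while the TEX instruction still references only SAMP[0]; the view index is implied.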

So, tl;dr: I think all I really need to add to this patch is a blurb
in tgsi.rst to explain this restriction and the usage of SVIEW for TEX*

Thoughts?

BR,
-R

On Tue, Jun 9, 2015 at 1:20 PM, Marek Olšák  wrote:

If you only want to modify TGSI and not all the users, you only have
to fix tgsi_ureg. tgsi_ureg is a layer that can hide a lot of small
ugly details if needed, including sampler view declarations when the
users don't even know about them.

Marek






Re: [Mesa-dev] COMPSIZE function in OpenGL XML registry

2015-06-10 Thread Jose Fonseca

I'm not sure what you are trying to accomplish.

If you're doing some sort of serialization of OpenGL calls other than 
GLX, then it might be worthwhile to look at


  https://github.com/apitrace/apitrace/blob/master/specs/glapi.py
  https://github.com/apitrace/apitrace/blob/master/helpers/glsize.hpp

Jose

On 10/06/15 23:05, Shervin Sharifi wrote:

Thanks Ian.
If I want to implement the actual CompSize function, how should I figure
out the details?

Thanks,
Shervin

On Wed, Jun 10, 2015 at 2:56 PM, Ian Romanick <i...@freedesktop.org> wrote:

On 06/10/2015 11:25 AM, Shervin Sharifi wrote:
> Hi,
>
>  This may not be the right forum to ask this, but I didn't know of a
> better forum, so I thought I could ask here.
>
>  I'm new to OpenGL. I am looking at XML registry for OpenGL and there
> are some parameters with attributes containing a function COMPSIZE (I've
> pasted an example below).
>  I tried to find information on what the COMPSIZE function is, where/how
> it is used, etc, but couldn't find documentation or credible information
> on the Internet.
>  Any information or pointer to that would be really helpful.

It's a signal to code generation scripts that the size of the data
referenced by the "pointer" parameter depends on the values of "type"
and "stride".  For example, GLX protocol code uses this to know how much
image data to send to the server for glTexImage2D.
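To make that concrete, here is a hypothetical sketch (not Mesa or apitrace code; the function and enum-value names are illustrative, though the GL_* values are the real ones) of the kind of size computation a generated COMPSIZE(type,stride) helper has to perform for a 3-component attribute pointer such as glBinormalPointerEXT:

```c
#include <assert.h>
#include <stddef.h>

/* Real GLenum values for the relevant types. */
#define GL_BYTE   0x1400
#define GL_SHORT  0x1402
#define GL_FLOAT  0x1406
#define GL_DOUBLE 0x140A

static size_t
type_size(unsigned type)
{
   switch (type) {
   case GL_BYTE:   return 1;
   case GL_SHORT:  return 2;
   case GL_FLOAT:  return 4;
   case GL_DOUBLE: return 8;
   default:        return 0;
   }
}

/* Bytes referenced through `pointer` for `count` 3-component elements:
 * tightly packed when stride == 0, otherwise `stride` bytes apart.
 * This is what COMPSIZE(type,stride) must express for this command. */
static size_t
binormal_pointer_compsize(unsigned type, int stride, int count)
{
   size_t elem = 3 * type_size(type);
   size_t step = stride ? (size_t)stride : elem;
   return count > 0 ? (size_t)(count - 1) * step + elem : 0;
}
```

The point is that the byte size is not a constant: it depends on the runtime values of the "type" and "stride" parameters, which is exactly what the COMPSIZE attribute signals to code generators.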

>  Thanks,
> Shervin
>
>
>
> This example is from gl.xml in the OpenGL registry:
>
> <command>
>     <proto>void <name>glBinormalPointerEXT</name></proto>
>     <param group="BinormalPointerTypeEXT"><ptype>GLenum</ptype> <name>type</name></param>
>     <param group="SizeI"><ptype>GLsizei</ptype> <name>stride</name></param>
>     <param len="COMPSIZE(type,stride)">const void *<name>pointer</name></param>
> </command>
>
>



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev





Re: [Mesa-dev] [PATCH 2/3] mesa/main: avoid null access in format_array_table_init()

2015-06-11 Thread Jose Fonseca

On 05/05/15 11:50, Juha-Pekka Heikkila wrote:

If _mesa_hash_table_create failed we'd get a null pointer. Report the
error and go away.

Signed-off-by: Juha-Pekka Heikkila 
---
  src/mesa/main/formats.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c
index 8af44e9..f7c9402 100644
--- a/src/mesa/main/formats.c
+++ b/src/mesa/main/formats.c
@@ -397,6 +397,11 @@ format_array_format_table_init(void)
 format_array_format_table = _mesa_hash_table_create(NULL, NULL,
 array_formats_equal);

+   if (!format_array_format_table) {
+  _mesa_error_no_memory(__func__);
+  return;
+   }
+
 for (f = 1; f < MESA_FORMAT_COUNT; ++f) {
info = _mesa_get_format_info(f);
if (!info->ArrayFormat)
@@ -432,6 +437,11 @@ _mesa_format_from_array_format(uint32_t array_format)

     call_once(&format_array_format_table_exists, format_array_format_table_init);

+   if (!format_array_format_table) {
+  format_array_format_table_exists = ONCE_FLAG_INIT;


This is not portable, as ONCE_FLAG_INIT is meant to be an initializer 
expression.  In particular, it's defined as a structure initializer on 
Windows ("{ 0 }") and is not a valid rvalue expression.


I've just fixed the build, but this still looks like a bad idea:

- the idea of call_once is "calling once", not "keep trying" -- and 
this usage can easily lead to leaks, crashes, etc., depending on how 
call_once is implemented.


- touching format_array_format_table_exists introduces a race 
condition: imagine another call_once happens while 
format_array_format_table_exists is being overwritten, and catches a 
half-written value.
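A minimal sketch of the intended C11 call_once contract (hypothetical example, not Mesa code): the init function runs exactly once no matter how many times call_once is invoked, and ONCE_FLAG_INIT is only guaranteed to work in an initializer context, not as something you can assign back into the flag later.

```c
#include <assert.h>
#include <threads.h>

/* OK: ONCE_FLAG_INIT used as an initializer.  Assigning it afterwards
 * (flag = ONCE_FLAG_INIT;) is not portable, since it may expand to a
 * brace initializer such as { 0 }. */
static once_flag table_once = ONCE_FLAG_INIT;

static int init_runs = 0;

static void
init_table(void)
{
   ++init_runs;   /* stands in for building the hash table */
}

static int
table_init_runs_after_two_calls(void)
{
   call_once(&table_once, init_table);
   call_once(&table_once, init_table);   /* no-op: already called */
   return init_runs;
}
```

This is why resetting the flag to retry on failure fights the primitive: the whole point of call_once is that the second call must not re-run the initializer.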



I also don't understand what exactly the problem is here.  Why would 
_mesa_hash_table_create() fail?  I think it might be better to just 
abort(), rather than this sort of half-remedy.


Maybe we need a `_mesa_error_no_memory_fatal` or add a `fatal` parameter 
to `_mesa_error_no_memory`.


Jose


Re: [Mesa-dev] [PATCH] gallium: remove explicit values from PIPE_CAP_ enums

2015-06-11 Thread Jose Fonseca
-   PIPE_CAP_VIDEO_MEMORY = 106,
-   PIPE_CAP_UMA = 107,
-   PIPE_CAP_CONDITIONAL_RENDER_INVERTED = 108,
-   PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE = 109,
-   PIPE_CAP_SAMPLER_VIEW_TARGET = 110,
-   PIPE_CAP_CLIP_HALFZ = 111,
-   PIPE_CAP_VERTEXID_NOBASE = 112,
-   PIPE_CAP_POLYGON_OFFSET_CLAMP = 113,
-   PIPE_CAP_MULTISAMPLE_Z_RESOLVE = 114,
-   PIPE_CAP_RESOURCE_FROM_USER_MEMORY = 115,
-   PIPE_CAP_DEVICE_RESET_STATUS_QUERY = 116,
+   PIPE_CAP_INDEP_BLEND_FUNC,
+   PIPE_CAP_MAX_TEXTURE_ARRAY_LAYERS,
+   PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT,
+   PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT,
+   PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER,
+   PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER,
+   PIPE_CAP_DEPTH_CLIP_DISABLE,
+   PIPE_CAP_SHADER_STENCIL_EXPORT,
+   PIPE_CAP_TGSI_INSTANCEID,
+   PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR,
+   PIPE_CAP_FRAGMENT_COLOR_CLAMPED,
+   PIPE_CAP_MIXED_COLORBUFFER_FORMATS,
+   PIPE_CAP_SEAMLESS_CUBE_MAP,
+   PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE,
+   PIPE_CAP_MIN_TEXEL_OFFSET,
+   PIPE_CAP_MAX_TEXEL_OFFSET,
+   PIPE_CAP_CONDITIONAL_RENDER,
+   PIPE_CAP_TEXTURE_BARRIER,
+   PIPE_CAP_MAX_STREAM_OUTPUT_SEPARATE_COMPONENTS,
+   PIPE_CAP_MAX_STREAM_OUTPUT_INTERLEAVED_COMPONENTS,
+   PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME,
+   PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS,
+   PIPE_CAP_VERTEX_COLOR_UNCLAMPED,
+   PIPE_CAP_VERTEX_COLOR_CLAMPED,
+   PIPE_CAP_GLSL_FEATURE_LEVEL,
+   PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION,
+   PIPE_CAP_USER_VERTEX_BUFFERS,
+   PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY,
+   PIPE_CAP_VERTEX_BUFFER_STRIDE_4BYTE_ALIGNED_ONLY,
+   PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY,
+   PIPE_CAP_COMPUTE,
+   PIPE_CAP_USER_INDEX_BUFFERS,
+   PIPE_CAP_USER_CONSTANT_BUFFERS,
+   PIPE_CAP_CONSTANT_BUFFER_OFFSET_ALIGNMENT,
+   PIPE_CAP_START_INSTANCE,
+   PIPE_CAP_QUERY_TIMESTAMP,
+   PIPE_CAP_TEXTURE_MULTISAMPLE,
+   PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT,
+   PIPE_CAP_CUBE_MAP_ARRAY,
+   PIPE_CAP_TEXTURE_BUFFER_OBJECTS,
+   PIPE_CAP_TEXTURE_BUFFER_OFFSET_ALIGNMENT,
+   PIPE_CAP_TGSI_TEXCOORD,
+   PIPE_CAP_PREFER_BLIT_BASED_TEXTURE_TRANSFER,
+   PIPE_CAP_QUERY_PIPELINE_STATISTICS,
+   PIPE_CAP_TEXTURE_BORDER_COLOR_QUIRK,
+   PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE,
+   PIPE_CAP_MAX_VIEWPORTS,
+   PIPE_CAP_ENDIANNESS,
+   PIPE_CAP_MIXED_FRAMEBUFFER_SIZES,
+   PIPE_CAP_TGSI_VS_LAYER_VIEWPORT,
+   PIPE_CAP_MAX_GEOMETRY_OUTPUT_VERTICES,
+   PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS,
+   PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS,
+   PIPE_CAP_TEXTURE_GATHER_SM5,
+   PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT,
+   PIPE_CAP_FAKE_SW_MSAA,
+   PIPE_CAP_TEXTURE_QUERY_LOD,
+   PIPE_CAP_MIN_TEXTURE_GATHER_OFFSET,
+   PIPE_CAP_MAX_TEXTURE_GATHER_OFFSET,
+   PIPE_CAP_SAMPLE_SHADING,
+   PIPE_CAP_TEXTURE_GATHER_OFFSETS,
+   PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION,
+   PIPE_CAP_MAX_VERTEX_STREAMS,
+   PIPE_CAP_DRAW_INDIRECT,
+   PIPE_CAP_TGSI_FS_FINE_DERIVATIVE,
+   PIPE_CAP_VENDOR_ID,
+   PIPE_CAP_DEVICE_ID,
+   PIPE_CAP_ACCELERATED,
+   PIPE_CAP_VIDEO_MEMORY,
+   PIPE_CAP_UMA,
+   PIPE_CAP_CONDITIONAL_RENDER_INVERTED,
+   PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE,
+   PIPE_CAP_SAMPLER_VIEW_TARGET,
+   PIPE_CAP_CLIP_HALFZ,
+   PIPE_CAP_VERTEXID_NOBASE,
+   PIPE_CAP_POLYGON_OFFSET_CLAMP,
+   PIPE_CAP_MULTISAMPLE_Z_RESOLVE,
+   PIPE_CAP_RESOURCE_FROM_USER_MEMORY,
+   PIPE_CAP_DEVICE_RESET_STATUS_QUERY,
  };

  #define PIPE_QUIRK_TEXTURE_BORDER_COLOR_SWIZZLE_NV50 (1 << 0)



Reviewed-by: Jose Fonseca 




Re: [Mesa-dev] [PATCH 1/7] tgsi: update docs for SVIEW usage with TEX* instructions

2015-06-12 Thread Jose Fonseca

On 11/06/15 21:38, Rob Clark wrote:

From: Rob Clark 

Based on mailing list discussion here:

http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html

Signed-off-by: Rob Clark 
Reviewed-by: Roland Scheidegger 
---
  src/gallium/docs/source/tgsi.rst | 12 
  1 file changed, 12 insertions(+)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index f77702a..89ca172 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -2965,6 +2965,18 @@ resource can be one of BUFFER, 1D, 2D, 3D, 1DArray and 2DArray.
  type must be 1 or 4 entries (if specifying on a per-component
  level) out of UNORM, SNORM, SINT, UINT and FLOAT.

+For TEX\* style texture sample opcodes (as opposed to SAMPLE\* opcodes
+which take an explicit SVIEW[#] source register), SVIEW[#] declarations
+may optionally be present.  In this case, the SVIEW index is implied by
+the SAMP index, and there must be a corresponding SVIEW[#] declaration
+for each SAMP[#] declaration.  Drivers are free to ignore this if they
+wish.  But note in particular that some drivers need to know the sampler
+type (float/int/unsigned) in order to generate the correct code, so in
+cases where integer textures are sampled, SVIEW[#] declarations should
+be used.
+
+NOTE: It is NOT legal to mix SAMPLE\* style opcodes and TEX\* opcodes
+in the same shader.

  Declaration Resource
  



Series looks good to me AFAICT.  Thanks for doing this.

Reviewed-by: Jose Fonseca 

Jose


[Mesa-dev] [PATCH] glsl: Fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE.

2015-06-16 Thread Jose Fonseca
It's not totally clear whether other Mesa drivers can safely cope with
over-sized UBOs, but at least for llvmpipe receiving a UBO larger than
its limit causes problems, as it won't fit into its internal display
lists.

This fixes piglit "arb_uniform_buffer_object-maxuniformblocksize
fsexceed" without regressions for llvmpipe.

NVIDIA driver also fails to link the shader from
"arb_uniform_buffer_object-maxuniformblocksize fsexceed".

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65525

PS: I don't recommend cherry-picking this for Mesa stable, as some apps
might inadvertently have been relying on UBOs larger than
GL_MAX_UNIFORM_BLOCK_SIZE working on other drivers, so even if this
commit is universally accepted it's probably best to let it mature in
master for a while.
---
 src/glsl/linker.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 9978380..4a726d4 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -2355,6 +2355,13 @@ check_resources(struct gl_context *ctx, struct gl_shader_program *prog)
unsigned total_uniform_blocks = 0;
 
for (unsigned i = 0; i < prog->NumUniformBlocks; i++) {
+  if (prog->UniformBlocks[i].UniformBufferSize > ctx->Const.MaxUniformBlockSize) {
+ linker_error(prog, "Uniform block %s too big (%d/%d)\n",
+  prog->UniformBlocks[i].Name,
+  prog->UniformBlocks[i].UniformBufferSize,
+  ctx->Const.MaxUniformBlockSize);
+  }
+
   for (unsigned j = 0; j < MESA_SHADER_STAGES; j++) {
 if (prog->UniformBlockStageIndex[j][i] != -1) {
blocks[j]++;
-- 
2.1.0



Re: [Mesa-dev] [PATCH] glsl: Fail linkage when UBO exceeds GL_MAX_UNIFORM_BLOCK_SIZE.

2015-06-16 Thread Jose Fonseca

On 16/06/15 15:29, Ilia Mirkin wrote:

On Tue, Jun 16, 2015 at 10:22 AM, Roland Scheidegger  wrote:

This looks like a good idea to me.
That said, llvmpipe would still crash if the declared size in the shader
wouldn't exceed the max uniform block size, but the bound buffer does
IIRC (the test doesn't test this but could be easily modified to do so).
So, I'm wondering if we should do both - fail to link if the declared
size exceeds the limit, and just limit the size we copy in llvmpipe to
the limit, though it's possible this would require some more changes to
be really safe so we don't try to access such elements (with indirect
access, though we don't verify direct ones at all) in the shader.


Not to derail this too much, but just FWIW I just checked this patch
in to prevent gpu errors:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=8b24388647f626a5cad10fd48e61335ed26a8560

Didn't fix the trace, but at least it no longer complained about
illegal sizes. Trace available at

http://people.freedesktop.org/~imirkin/traces/gzdoom.trace

It just renders black now. The claim is that the game (but not
necessarily the trace) works OK on NVIDIA blob drivers. I haven't
analyzed the trace in much detail yet.

   -ilia



Interesting. I don't get errors from NVIDIA.  So it does look like 
binding large UBOs is treated differently.  (And understandably, since 
one can bind a range of a UBO too.)


I think Roland's right; in llvmpipe we'll need to handle that better by 
truncating the constant buffer copied into the display lists.


BTW, the trace uses PERSISTENT mappings, which aren't supported in 
apitrace.  So I suspect it will never render well.


Jose


[Mesa-dev] [PATCH] llvmpipe: Truncate the binned constants to max const buffer size.

2015-06-18 Thread Jose Fonseca
Tested with Ilia Mirkin's gzdoom.trace and
"arb_uniform_buffer_object-maxuniformblocksize fsexceed" piglit test
without my earlier fix to fail linkage when UBO exceeds
GL_MAX_UNIFORM_BLOCK_SIZE.
---
 src/gallium/auxiliary/gallivm/lp_bld_limits.h | 6 +-
 src/gallium/drivers/llvmpipe/lp_setup.c   | 5 -
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_limits.h b/src/gallium/auxiliary/gallivm/lp_bld_limits.h
index 49064fe..db50351 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_limits.h
+++ b/src/gallium/auxiliary/gallivm/lp_bld_limits.h
@@ -51,8 +51,12 @@
 
 #define LP_MAX_TGSI_PREDS 16
 
+#define LP_MAX_TGSI_CONSTS 4096
+
 #define LP_MAX_TGSI_CONST_BUFFERS 16
 
+#define LP_MAX_TGSI_CONST_BUFFER_SIZE (LP_MAX_TGSI_CONSTS * sizeof(float[4]))
+
 /*
  * For quick access we cache registers in statically
  * allocated arrays. Here we define the maximum size
@@ -100,7 +104,7 @@ gallivm_get_shader_param(enum pipe_shader_cap param)
case PIPE_SHADER_CAP_MAX_OUTPUTS:
   return 32;
case PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE:
-  return sizeof(float[4]) * 4096;
+  return LP_MAX_TGSI_CONST_BUFFER_SIZE;
case PIPE_SHADER_CAP_MAX_CONST_BUFFERS:
   return PIPE_MAX_CONSTANT_BUFFERS;
case PIPE_SHADER_CAP_MAX_TEMPS:
diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c b/src/gallium/drivers/llvmpipe/lp_setup.c
index 56292c6..4c8167a 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup.c
@@ -1069,10 +1069,13 @@ try_update_scene_state( struct lp_setup_context *setup )
if (setup->dirty & LP_SETUP_NEW_CONSTANTS) {
   for (i = 0; i < Elements(setup->constants); ++i) {
  struct pipe_resource *buffer = setup->constants[i].current.buffer;
- const unsigned current_size = setup->constants[i].current.buffer_size;
+ const unsigned current_size = MIN2(setup->constants[i].current.buffer_size,
+                                    LP_MAX_TGSI_CONST_BUFFER_SIZE);
  const ubyte *current_data = NULL;
  int num_constants;
 
+ STATIC_ASSERT(DATA_BLOCK_SIZE >= LP_MAX_TGSI_CONST_BUFFER_SIZE);
+
  if (buffer) {
 /* resource buffer */
 current_data = (ubyte *) llvmpipe_resource_data(buffer);
-- 
2.1.0



Re: [Mesa-dev] [PATCH 1/5] darwin: Suppress type conversion warnings for GLhandleARB

2015-06-19 Thread Jose Fonseca

On 19/06/15 04:46, Ian Romanick wrote:

On 06/17/2015 10:53 PM, Julien Isorce wrote:

From: Jon TURNEY 

On darwin, GLhandleARB is defined as a void *, not the unsigned int it is on
linux.

For the moment, apply a cast to suppress the warning.

Possibly this is safe, as for the mesa software renderer the shader program
handle is not a real pointer, but an integer handle.

Probably this is not the right thing to do, and we should pay closer attention
to how the GLhandleARB type is used.


In Mesa, glBindAttribLocation (which takes GLuint) and
glBindAttribLocationARB (which takes GLhandleARB) are *the same
function*.  The same applies to pretty much all the other GLhandleARB
functions.



Properly fixing this is a nightmare, but I think that a short-term 
workaround is feasible.


This is the generated glapitemp.h:

  KEYWORD1 void KEYWORD2 NAME(BindAttribLocationARB)(GLhandleARB program, GLuint index, const GLcharARB * name)
  {
     (void) program; (void) index; (void) name;
     DISPATCH(BindAttribLocation, (program, index, name), (F, "glBindAttribLocationARB(%d, %d, %p);\n", program, index, (const void *) name));
  }

Provided that GLhandleARB is defined as `unsigned long` during the Mesa 
build on MacOSX (to avoid these int <-> void * conversions) [1], the 
compiler should implicitly cast the 64-bit GLhandleARB program to a 
32-bit GLuint.


So, when an app calls glBindAttribLocationARB it will be dispatched to 
_mesa_BindAttribLocation, with the program handle truncated. So it should 
all just work.
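A self-contained sketch of the implicit conversion being relied on here (hypothetical typedefs and function names, mirroring a darwin build where GLhandleARB is made an integer type rather than a void pointer):

```c
#include <assert.h>

typedef unsigned int  GLuint;
typedef unsigned long GLhandleARB;   /* instead of void *, as proposed */

static GLuint last_program;

/* Stands in for the core entry point, _mesa_BindAttribLocation. */
static void
BindAttribLocation(GLuint program)
{
   last_program = program;
}

/* The ARB entry point takes GLhandleARB; passing the handle straight
 * through implicitly converts the (possibly 64-bit) integer to a 32-bit
 * GLuint, which is well-defined for any handle value that fits. */
static void
BindAttribLocationARB(GLhandleARB program)
{
   BindAttribLocation(program);
}
```

With GLhandleARB as `void *`, the same pass-through would be an int/pointer type mismatch; making it `unsigned long` turns it into an ordinary integer narrowing, which is the whole trick.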


Ditto for when GLhandleARB appears as a return value.


The only problem is when GLhandleARB appears as a pointer, and there is 
only one such instance:


  GLAPI void APIENTRY glGetAttachedObjectsARB (GLhandleARB containerObj, GLsizei maxCount, GLsizei *count, GLhandleARB *obj);


But we do have a separate entry point for this 
(_mesa_GetAttachedObjectsARB), so again, we're all good.



So, Jon/Julien's patch seems perfectly workable -- all that's really left 
to do is to silence the GLhandleARB <-> GLuint conversion warnings.



Jose


[1] Apitrace also defines GLhandleARB as unsigned long internally to 
avoid this 
https://github.com/apitrace/apitrace/blob/master/thirdparty/khronos/GL/glext.patch




Re: [Mesa-dev] [PATCH 2/2] glapi: remap_helper.py: remove unused argument 'es'

2015-06-19 Thread Jose Fonseca

I only did minor tweaks to these files, but the series LGTM.

Reviewed-by: Jose Fonseca 


On 19/06/15 13:21, Emil Velikov wrote:

Identical to the previous commit - not used by the Autotools,
Android, or SCons builds.

XXX: There are no more users of gl_api.filter_functions_by_api(). Should
we just nuke it?

Cc:  Dylan Baker 
Cc:  Jose Fonseca 
Signed-off-by: Emil Velikov 
---
  src/mapi/glapi/gen/remap_helper.py | 8 
  1 file changed, 8 deletions(-)

diff --git a/src/mapi/glapi/gen/remap_helper.py b/src/mapi/glapi/gen/remap_helper.py
index 94ae193..edc6c3e 100644
--- a/src/mapi/glapi/gen/remap_helper.py
+++ b/src/mapi/glapi/gen/remap_helper.py
@@ -174,12 +174,6 @@ def _parser():
  metavar="input_file_name",
  dest='file_name',
  help="An xml description file.")
-parser.add_argument('-c', '--es-version',
-choices=[None, 'es1', 'es2'],
-default=None,
-metavar='ver',
-dest='es',
-help='A GLES version to support')
  return parser.parse_args()


@@ -188,8 +182,6 @@ def main():
  args = _parser()

  api = gl_XML.parse_GL_API(args.file_name)
-if args.es is not None:
-api.filter_functions_by_api(args.es)

  printer = PrintGlRemap()
  printer.Print(api)





[Mesa-dev] [PATCH 2/2] glsl: Fix counting of varyings.

2015-06-19 Thread Jose Fonseca
When input and output varyings started to be counted separately (commit
42305fb5), the is_varying_var function wasn't updated to return true for
output varyings (or input varyings for stages other than the fragment
shader), effectively meaning the varying limit was never checked.

With this change, color, texture coord, and generic varyings are
counted, while others are ignored.  It is assumed the hardware will handle
special varyings internally (ie, optimistic rather than pessimistic), to
avoid causing regressions where things were working somehow.

This fixes `glsl-max-varyings --exceed-limits` with softpipe/llvmpipe,
which was asserting because we were getting varyings beyond
VARYING_SLOT_MAX in st_glsl_to_tgsi.cpp.

It also prevents the assertion failure with
https://bugs.freedesktop.org/show_bug.cgi?id=90539 but the tests still
fails due to the link error.

This change also adds a few assertions to catch this sort of error
earlier, and potentially prevent buffer overflows in the future (no
buffer overflow was detected here, though).

However, this change causes several tests to regress:

  spec/glsl-1.10/execution/varying-packing/simple ivec3 array
  spec/glsl-1.10/execution/varying-packing/simple ivec3 separate
  spec/glsl-1.10/execution/varying-packing/simple uvec3 array
  spec/glsl-1.10/execution/varying-packing/simple uvec3 separate
  spec/arb_gpu_shader_fp64/varying-packing/simple dmat3 array
  spec/glsl-1.50/execution/geometry/max-input-components
  spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd
  
spec/glsl-1.50/execution/variable-indexing/vs-output-array-vec4-index-wr-before-gs

But these all seem to be issues either in the way we count varyings
(e.g., geometry inputs get counted multiple times), in the tests
themselves, or in limitations of the varying packer, and they deserve
attention in their own right.
---
 src/glsl/link_varyings.cpp | 70 --
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +
 2 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 278a778..7649720 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -190,6 +190,8 @@ cross_validate_outputs_to_inputs(struct gl_shader_program *prog,
   */
  const unsigned idx = var->data.location - VARYING_SLOT_VAR0;
 
+ assert(idx < MAX_VARYING);
+
  if (explicit_locations[idx] != NULL) {
 linker_error(prog,
  "%s shader has multiple outputs explicitly "
@@ -1031,25 +1033,63 @@ varying_matches::match_comparator(const void *x_generic, const void *y_generic)
 /**
  * Is the given variable a varying variable to be counted against the
  * limit in ctx->Const.MaxVarying?
- * This includes variables such as texcoords, colors and generic
- * varyings, but excludes variables such as gl_FrontFacing and gl_FragCoord.
+ *
+ * OpenGL specification states:
+ *
+ *   Each output variable component used as either a vertex shader output or
+ *   fragment shader input counts against this limit, except for the components
+ *   of gl_Position. A program containing only a vertex and fragment shader
+ *   that accesses more than this limit's worth of components of outputs may
+ *   fail to link, unless device-dependent optimizations are able to make the
+ *   program fit within available hardware resources.
+ *
  */
 static bool
 var_counts_against_varying_limit(gl_shader_stage stage, const ir_variable *var)
 {
-   /* Only fragment shaders will take a varying variable as an input */
-   if (stage == MESA_SHADER_FRAGMENT &&
-   var->data.mode == ir_var_shader_in) {
-  switch (var->data.location) {
-  case VARYING_SLOT_POS:
-  case VARYING_SLOT_FACE:
-  case VARYING_SLOT_PNTC:
- return false;
-  default:
- return true;
-  }
+   assert(var->data.mode == ir_var_shader_in || var->data.mode == ir_var_shader_out);
+
+   /* FIXME: It looks like we're currently counting each input multiple times
+* on geometry shaders.  See piglit
+* spec/glsl-1.50/execution/geometry/max-input-components */
+   if (stage == MESA_SHADER_GEOMETRY) {
+  return false;
+   }
+
+   switch (var->data.location) {
+   case VARYING_SLOT_POS:
+  return false;
+   case VARYING_SLOT_COL0:
+   case VARYING_SLOT_COL1:
+   case VARYING_SLOT_FOGC:
+   case VARYING_SLOT_TEX0:
+   case VARYING_SLOT_TEX1:
+   case VARYING_SLOT_TEX2:
+   case VARYING_SLOT_TEX3:
+   case VARYING_SLOT_TEX4:
+   case VARYING_SLOT_TEX5:
+   case VARYING_SLOT_TEX6:
+   case VARYING_SLOT_TEX7:
+  return true;
+   case VARYING_SLOT_PSIZ:
+   case VARYING_SLOT_BFC0:
+   case VARYING_SLOT_BFC1:
+   case VARYING_SLOT_EDGE:
+   case VARYING_SLOT_CLIP_VERTEX:
+   case VARYING_SLOT_CLIP_DIST0:
+   case VARYING_SLOT_CLIP_DIST1:
+   case VARYING_SLOT_PRIMITIVE_ID:
+   case VARYING_SLOT_LAYER:
+   case VARYING_SLOT_VIEWPORT:
+   case VARYING_SLOT_FACE:

[Mesa-dev] [PATCH 1/2] glsl: Specify the shader stage in linker errors due to too many in/outputs.

2015-06-19 Thread Jose Fonseca
---
 src/glsl/link_varyings.cpp | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 7b2d4bd..278a778 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -1540,13 +1540,15 @@ check_against_output_limit(struct gl_context *ctx,
const unsigned output_components = output_vectors * 4;
if (output_components > max_output_components) {
   if (ctx->API == API_OPENGLES2 || prog->IsES)
- linker_error(prog, "shader uses too many output vectors "
+ linker_error(prog, "%s shader uses too many output vectors "
   "(%u > %u)\n",
+  _mesa_shader_stage_to_string(producer->Stage),
   output_vectors,
   max_output_components / 4);
   else
- linker_error(prog, "shader uses too many output components "
+ linker_error(prog, "%s shader uses too many output components "
   "(%u > %u)\n",
+  _mesa_shader_stage_to_string(producer->Stage),
   output_components,
   max_output_components);
 
@@ -1579,13 +1581,15 @@ check_against_input_limit(struct gl_context *ctx,
const unsigned input_components = input_vectors * 4;
if (input_components > max_input_components) {
   if (ctx->API == API_OPENGLES2 || prog->IsES)
- linker_error(prog, "shader uses too many input vectors "
+ linker_error(prog, "%s shader uses too many input vectors "
   "(%u > %u)\n",
+  _mesa_shader_stage_to_string(consumer->Stage),
   input_vectors,
   max_input_components / 4);
   else
- linker_error(prog, "shader uses too many input components "
+ linker_error(prog, "%s shader uses too many input components "
   "(%u > %u)\n",
+  _mesa_shader_stage_to_string(consumer->Stage),
   input_components,
   max_input_components);
 
-- 
2.1.0



Re: [Mesa-dev] [PATCH 00/11] glapi fixes - build whole of mesa with

2015-06-19 Thread Jose Fonseca

On 19/06/15 20:56, Emil Velikov wrote:

Hi all,

A lovely series inspired (more like 'was awaken to send these out') by
Pal Rohár, who was having issues when building xlib-libgl (plus the now
enabled gles*)

So here, we teach the final two static glapi users about shared-glapi,
plus some related fixes. After this is done we can finally start
transitioning to shared-only glapi, with some more details as mentioned
in one of the patches:

 XXX: With this one done, we can finally transition with enforcing
 shared-glapi, and

  - link the dri modules against libglapi.so, add --no-undefined to
 the LDFLAGS
  - drop the dlopen(libglapi.so/libGL.so, RTLD_GLOBAL) workarounds
 in the loaders - libGL, libEGL and libgbm.
  - start killing off/cleaning up the dispatch ?

 The caveats:
 1) up to what stage do we care about static libraries
  - libgl (either dri or xlib based)
  - osmesa
  - libEGL

 2) how about other platforms (scons) ?
  - currently the scons uses static glapi,
  - would we need the dlopen(...) on windows ?

Hope everyone is excited about this one as I am :-)


Maybe I missed the context of these changes, but why does this matter, and 
how is it an improvement?



I understand the rationale for EGL and DRI.  But I'm asking specifically 
about xlib libgl, osmesa, and Windows ICD drivers.



At a glance, for osmesa and xlib-libgl, forcing a split into multiple 
.so files seems a step backwards.  Rather than making these easy to use and 
embed, it adds complexity, and potentially prevents using osmesa 
and libgl-xlib alongside the system's true libGL.so.



Finally, it's not clear whether this would force Windows OpenGL ICD 
drivers to be split into multiple DLLs, but I'm afraid that would be a big 
show stopper.



In summary, having the ability to use a shared glapi sounds great, but 
forcing shared glapi everywhere sounds like a bad idea.



Jose



Re: [Mesa-dev] [PATCH 00/11] glapi fixes - build whole of mesa with

2015-06-22 Thread Jose Fonseca

On 19/06/15 23:09, Emil Velikov wrote:

On 19 June 2015 at 21:26, Jose Fonseca  wrote:

On 19/06/15 20:56, Emil Velikov wrote:


Hi all,

A lovely series inspired (more like 'was awaken to send these out') by
Pal Rohár, who was having issues when building xlib-libgl (plus the now
enabled gles*)

So here, we teach the final two static glapi users about shared-glapi,
plus some related fixes. After this is done we can finally start
transitioning to shared-only glapi, with some more details as mentioned
in one of the patches:

  XXX: With this one done, we can finally transition with enforcing
  shared-glapi, and

   - link the dri modules against libglapi.so, add --no-undefined to
  the LDFLAGS
   - drop the dlopen(libglapi.so/libGL.so, RTLD_GLOBAL) workarounds
  in the loaders - libGL, libEGL and libgbm.
   - start killing off/cleaning up the dispatch ?

  The caveats:
  1) up to what stage do we care about static libraries
   - libgl (either dri or xlib based)
   - osmesa
   - libEGL

  2) how about other platforms (scons) ?
   - currently the scons uses static glapi,
   - would we need the dlopen(...) on windows ?

Hope everyone is excited about this one as I am :-)



Maybe I missed the context of these changes, but why does this matter, and
how is it an improvement?


If one goes the extra mile (which this series doesn't) - one configure
option less, substantial code de-duplication and consistent use of the
code amongst all components provided. This way any improvements/cleanups
made to the shared glapi will be available to osmesa/xlib-libgl.


I'm perfectly happy with removing the configure option.

And I understand the benefits of unified code paths, but I believe that 
for this particular case the difference in requirements really demands 
separate code paths.



In summary, having the ability to use a shared glapi sounds great, but
forcing shared glapi everywhere sounds like a bad idea.


I'm suspecting that people might be keen on the following idea - use
static glapi for osmesa/xlib-libgl and shared one everywhere else?


Yes, that sounds reasonable to me.  (Needs libgl-gdi too.)



I fear that this will lead to further separation/bit-rot between the
different implementations, but it seems like the best compromise.


I don't feel strongly about either: a) using the same source code for both 
static/shared glapi (switched by a pre-processor define), or b) only 
sharing the interface but having separate shared/static glapi 
implementations.  I'm actually not that familiar with that code.



Either way, we can have two glapi build targets (a shared-glapi and a 
static-glapi) side-by-side, so that there are no more source-wide 
configure flags.



I believe a lot of the complexity of that code comes from assembly.  I 
wonder if it's really justified nowadays (and even if it is, whether it 
would be better served with GNU C inline assembly.) Furthermore, I believe on 
Windows we don't use any assembly, so if we split the shared/static glapi 
source code, we could probably abandon assembly from static-glapi.



Jose


Re: [Mesa-dev] Building Mesa/LLVMpipe on Windows

2015-06-22 Thread Jose Fonseca

On 22/06/15 19:40, Florian Link wrote:

Hi everyone,

I spent some time building Mesa/llvmpipe on Windows and created a Python
script
that implements all the required steps (downloading/extracting all
prerequisites and sources,
configuring and building LLVM and Mesa).

The script is available at:

https://github.com/florianlink/MesaOnWindows


Given you're building for MSVC, you could avoid MinGW by using 
http://winflexbison.sourceforge.net/ .


BTW, I've been playing with AppVeyor for building Mesa with MSVC. 
You can see the build logs at


https://ci.appveyor.com/project/jrfonseca/mesa

It doesn't build everything -- it uses pre-compiled LLVM binaries --, 
and it also leverages a lot of software that is pre-installed in the 
AppVeyor build images.


>
> I hope it helps some people struggling with the build details on Windows!
> If you are interested, feel free to incorporate it into Mesa,

Maybe this sort of script wouldn't be a bad idea indeed.

> I placed the script into the public domain.

Didn't know about unlicense.org. Interesting.  A bit off-topic, but I 
actually have been considering public domain for future personal pet 
projects, because when




Best regards,
Florian

P.S. Is there any reason why there are no prebuilt Mesa opengl32.dll
files available on the web? I considered putting a current dll onto
Github as well, are there any reasons why I should not do that?


No particular reason other than nobody could be bothered.  Mesa doesn't 
ship compiled binaries for any OS, not just Windows.


Personally I don't have the time to prepare binaries.  If this ever was to 
happen it would have to be fully automated via something like AppVeyor 
(MSVC) or Travis CI (MinGW cross-compilers).


I also worry about people just downloading opengl32.dll without 
understanding what they are doing, running into all sorts of trouble, 
and flooding us with bug reports / support requests.


Jose


Re: [Mesa-dev] [PATCH 00/11] glapi fixes - build whole of mesa with

2015-06-23 Thread Jose Fonseca

On 22/06/15 19:51, Emil Velikov wrote:

On 22 June 2015 at 15:01, Jose Fonseca  wrote:

On 19/06/15 23:09, Emil Velikov wrote:


On 19 June 2015 at 21:26, Jose Fonseca  wrote:


On 19/06/15 20:56, Emil Velikov wrote:



Hi all,

A lovely series inspired (more like 'was awaken to send these out') by
Pal Rohár, who was having issues when building xlib-libgl (plus the now
enabled gles*)

So here, we teach the final two static glapi users about shared-glapi,
plus some related fixes. After this is done we can finally start
transitioning to shared-only glapi, with some more details as mentioned
in one of the patches:

   XXX: With this one done, we can finally transition with enforcing
   shared-glapi, and

- link the dri modules against libglapi.so, add --no-undefined to
   the LDFLAGS
- drop the dlopen(libglapi.so/libGL.so, RTLD_GLOBAL) workarounds
   in the loaders - libGL, libEGL and libgbm.
- start killing off/cleaning up the dispatch ?

   The caveats:
   1) up to what stage do we care about static libraries
- libgl (either dri or xlib based)
- osmesa
- libEGL

   2) how about other platforms (scons) ?
- currently the scons uses static glapi,
- would we need the dlopen(...) on windows ?

Hope everyone is excited about this one as I am :-)




Maybe I missed the context of these changes, but why does this matter, and
how is it an improvement?


If one goes the extra mile (which this series doesn't) - one configure
option less, substantial code de-duplication and consistent use of the
code amongst all components provided. This way any improvements/cleanups
made to the shared glapi will be available to osmesa/xlib-libgl.



I'm perfectly happy with removing the configure option.

And I understand the benefits of unified code paths, but I believe that for
this particular case the difference in requirements really demands
separate code paths.


In summary, having the ability to use a shared glapi sounds great, but
forcing shared glapi everywhere sounds like a bad idea.


I'm suspecting that people might be keen on the following idea - use
static glapi for osmesa/xlib-libgl and shared one everywhere else?



Yes, that sounds reasonable to me.  (Needs libgl-gdi too.)


Indeed. Everything GDI is built only via scons, so we'll touch it only if needed.



I fear that this will lead to further separation/bit-rot between the
different implementations, but it seems like the best compromise.



I don't feel strongly about either: a) using the same source code for both
static/shared glapi (switched by a pre-processor define), or b) only sharing
the interface but having separate shared/static glapi implementations.  I'm
actually not that familiar with that code.


Either way, we can have two glapi build targets (a shared-glapi and a
static-glapi) side-by-side, so that there are no more source-wide
configure flags.


In theory it should be fine; in practice... I'm rather cautious, as
mapi is the most convoluted part of mesa, and with the
"subdir-objects" option being toggled soon, things may go (albeit
unlikely) subtly haywire.



I believe a lot of the complexity of that code comes from assembly.  I
wonder if it's really justified nowadays (and even if it is, whether it
would be better served with GNU C inline assembly.) Furthermore, I believe on
Windows we don't use any assembly, so if we split the shared/static glapi
source code, we could probably abandon assembly from static-glapi.


I'm not 100% sure, but I'd suspect that Cygwin might use it when
combined with swrast_dri. Don't know what others use - iirc some of
the BSD folks are moving over to llvm. That aside, there is a massive
amount of #ifdef spaghetti apart from the assembly code.

Can I have your ack/nack on the idea of having shared-glapi available
for xlib-libgl (patches 2, 3 and 4), until we have both glapi's built
in parallel?  As mentioned originally, we currently fail to build
if one enables gles* and xlib-libgl, and adding another hack to
configure.ac feels like flogging a dead horse.


I rarely use auto conf myself, but the mentioned 2-4 patches look OK to me.

Acked-by: Jose Fonseca 



Re: [Mesa-dev] [PATCH 2/2] glsl: Fix counting of varyings.

2015-06-23 Thread Jose Fonseca

On 22/06/15 17:14, Ian Romanick wrote:

On 06/19/2015 06:08 AM, Jose Fonseca wrote:

When input and output varyings started to be counted separately (commit
42305fb5) the is_varying_var function wasn't updated to return true for
output varyings (or input varyings for stages other than the fragment
shader), effectively causing the varying limit to never be checked.


Without SSO, counting the varying inputs used by, say, the fragment
shader, should be sufficient.  With SSO, it's more difficult.


With this change, color, texture coord, and generic varyings are now
counted, but others are ignored.  It is assumed the hardware will handle
special varyings internally (i.e., optimistic rather than pessimistic), to
avoid causing regressions where things were somehow working.

This fixes `glsl-max-varyings --exceed-limits` with softpipe/llvmpipe,
which was asserting because we were getting varyings beyond
VARYING_SLOT_MAX in st_glsl_to_tgsi.cpp.

It also prevents the assertion failure with
https://bugs.freedesktop.org/show_bug.cgi?id=90539 but the test still
fails due to the link error.

This change also adds a few assertions to catch this sort of error
earlier, and potentially prevent buffer overflows in the future (no
buffer overflow was detected here though).

However, this change causes several tests to regress:

   spec/glsl-1.10/execution/varying-packing/simple ivec3 array
   spec/glsl-1.10/execution/varying-packing/simple ivec3 separate
   spec/glsl-1.10/execution/varying-packing/simple uvec3 array
   spec/glsl-1.10/execution/varying-packing/simple uvec3 separate


Wait... so the ivec3 and uvec3 tests fail, but the vec3 test passes?


Correct.  This is a partial diff of vec3's vs ivec3's GLSL:

 : GLSL source for vertex shader 1:
-: #version 110
-varying vec3 var000[42];
-varying float var001;
-varying float var002;
+: #version 130
+flat out ivec3 var000[42];
+out float var001;
+out float var002;
 uniform int i;

And it looks like the varying packer refuses to pack together variables 
of different types.


Not sure if this is a bug in the test or a limitation in the varying 
packing pass.  Either way, it's a bug that was being hidden and needs to 
be addressed.



   spec/arb_gpu_shader_fp64/varying-packing/simple dmat3 array
   spec/glsl-1.50/execution/geometry/max-input-components
   spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd
   
spec/glsl-1.50/execution/variable-indexing/vs-output-array-vec4-index-wr-before-gs

But these all seem to be issues either in the way we count varyings
(e.g., geometry inputs get counted multiple times), in the tests
themselves, or limitations in the varying packer, and they deserve
attention in their own right.


Do you have a feeling for which tests are which sorts of problems?


Only a rough idea:

- The "varying-packing/simple" failures all look similar in nature to 
what I described above, i.e., int, uint, or doubles not being packed with 
floats.


- the geometry related ones are because the code to count GS varyings 
over-estimates the varyings (it counts the varyings for the whole 
primitive, not just a single vertex)


  but I work around this for now in my change, by returning 0 for GS 
(i.e., no change for GS).





I'd like to run this through GLES3 conformance before it gets pushed.
I'm not too worried about the geometry shader issues, but the ivec /
uvec tests seem more problematic.


Sure.




---
  src/glsl/link_varyings.cpp | 70 --
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +
  2 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 278a778..7649720 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -190,6 +190,8 @@ cross_validate_outputs_to_inputs(struct gl_shader_program 
*prog,
*/
   const unsigned idx = var->data.location - VARYING_SLOT_VAR0;

+ assert(idx < MAX_VARYING);
+
   if (explicit_locations[idx] != NULL) {
  linker_error(prog,
   "%s shader has multiple outputs explicitly "
@@ -1031,25 +1033,63 @@ varying_matches::match_comparator(const void 
*x_generic, const void *y_generic)
  /**
   * Is the given variable a varying variable to be counted against the
   * limit in ctx->Const.MaxVarying?
- * This includes variables such as texcoords, colors and generic
- * varyings, but excludes variables such as gl_FrontFacing and gl_FragCoord.
+ *
+ * OpenGL specification states:


Please use the canonical format.

 * Section A.B (Foo Bar) of the OpenGL X.Y Whichever Profile spec
 * says:

That enables later readers to more easily find the text in the spec.
Also, the language changes from time to time.


+ *
+ *   Each output variable component used as either a vertex shader output or
+ *   fragment shader input counts against this limit, except for the componen

Re: [Mesa-dev] [PATCH 2/2] glsl: Fix counting of varyings.

2015-06-23 Thread Jose Fonseca

On 23/06/15 15:36, Jose Fonseca wrote:

On 22/06/15 17:14, Ian Romanick wrote:

On 06/19/2015 06:08 AM, Jose Fonseca wrote:

When input and output varyings started to be counted separately (commit
42305fb5) the is_varying_var function wasn't updated to return true for
output varyings (or input varyings for stages other than the fragment
shader), effectively causing the varying limit to never be checked.


Without SSO, counting the varying inputs used by, say, the fragment
shader, should be sufficient.  With SSO, it's more difficult.


With this change, color, texture coord, and generic varyings are now
counted, but others are ignored.  It is assumed the hardware will handle
special varyings internally (i.e., optimistic rather than pessimistic), to
avoid causing regressions where things were somehow working.

This fixes `glsl-max-varyings --exceed-limits` with softpipe/llvmpipe,
which was asserting because we were getting varyings beyond
VARYING_SLOT_MAX in st_glsl_to_tgsi.cpp.

It also prevents the assertion failure with
https://bugs.freedesktop.org/show_bug.cgi?id=90539 but the test still
fails due to the link error.

This change also adds a few assertions to catch this sort of error
earlier, and potentially prevent buffer overflows in the future (no
buffer overflow was detected here though).

However, this change causes several tests to regress:

   spec/glsl-1.10/execution/varying-packing/simple ivec3 array
   spec/glsl-1.10/execution/varying-packing/simple ivec3 separate
   spec/glsl-1.10/execution/varying-packing/simple uvec3 array
   spec/glsl-1.10/execution/varying-packing/simple uvec3 separate


Wait... so the ivec3 and uvec3 tests fail, but the vec3 test passes?


Correct.  This is a partial diff of vec3's vs ivec3's GLSL:

  : GLSL source for vertex shader 1:
-: #version 110
-varying vec3 var000[42];
-varying float var001;
-varying float var002;
+: #version 130
+flat out ivec3 var000[42];
+out float var001;
+out float var002;
  uniform int i;

And it looks like the varying packer refuses to pack together variables
of different types.

Not sure if this is a bug in the test or a limitation in the varying
packing pass.  Either way, it's a bug that was being hidden and needs to
be addressed.


   spec/arb_gpu_shader_fp64/varying-packing/simple dmat3 array
   spec/glsl-1.50/execution/geometry/max-input-components

spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec4-index-rd

spec/glsl-1.50/execution/variable-indexing/vs-output-array-vec4-index-wr-before-gs


But these all seem to be issues either in the way we count varyings
(e.g., geometry inputs get counted multiple times), in the tests
themselves, or limitations in the varying packer, and they deserve
attention in their own right.


Do you have a feeling for which tests are which sorts of problems?


Only a rough idea:

- The "varying-packing/simple" failures all look similar in nature to
what I described above, i.e., int, uint, or doubles not being packed with
floats.

- the geometry related ones are because the code to count GS varyings
over-estimates the varyings (it counts the varyings for the whole
primitive, not just a single vertex)

   but I work around this for now in my change, by returning 0 for GS
(i.e., no change for GS).




I'd like to run this through GLES3 conformance before it gets pushed.
I'm not too worried about the geometry shader issues, but the ivec /
uvec tests seem more problematic.


Sure.




---
  src/glsl/link_varyings.cpp | 70
--
  src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +
  2 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp
index 278a778..7649720 100644
--- a/src/glsl/link_varyings.cpp
+++ b/src/glsl/link_varyings.cpp
@@ -190,6 +190,8 @@ cross_validate_outputs_to_inputs(struct
gl_shader_program *prog,
*/
   const unsigned idx = var->data.location - VARYING_SLOT_VAR0;

+ assert(idx < MAX_VARYING);
+
   if (explicit_locations[idx] != NULL) {
  linker_error(prog,
   "%s shader has multiple outputs explicitly "
@@ -1031,25 +1033,63 @@ varying_matches::match_comparator(const void
*x_generic, const void *y_generic)
  /**
   * Is the given variable a varying variable to be counted against the
   * limit in ctx->Const.MaxVarying?
- * This includes variables such as texcoords, colors and generic
- * varyings, but excludes variables such as gl_FrontFacing and
gl_FragCoord.
+ *
+ * OpenGL specification states:


Please use the canonical format.

 * Section A.B (Foo Bar) of the OpenGL X.Y Whichever Profile spec
 * says:

That enables later readers to more easily find the text in the spec.
Also, the language changes from time to time.


+ *
+ *   Each output variable component used as either a vertex shader
output or
+ *   fragment shader input counts against this limit

Re: [Mesa-dev] [PATCH] st/mesa: remove unneeded pipe_surface_release() in st_render_texture()

2015-06-23 Thread Jose Fonseca

On 23/06/15 17:30, Brian Paul wrote:

This caused us to always free the pipe_surface for the renderbuffer.
The subsequent call to st_update_renderbuffer_surface() would typically
just recreate it.  Remove the call to pipe_surface_release() and let
st_update_renderbuffer_surface() take care of freeing the old surface
if it needs to be replaced (because of change to mipmap level, etc).

This can save quite a few calls to pipe_context::create_surface() and
surface_destroy().
---
  src/mesa/state_tracker/st_cb_fbo.c | 2 --
  1 file changed, 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_fbo.c 
b/src/mesa/state_tracker/st_cb_fbo.c
index 0399eef..5707590 100644
--- a/src/mesa/state_tracker/st_cb_fbo.c
+++ b/src/mesa/state_tracker/st_cb_fbo.c
@@ -511,8 +511,6 @@ st_render_texture(struct gl_context *ctx,
 strb->rtt_layered = att->Layered;
 pipe_resource_reference(&strb->texture, pt);

-   pipe_surface_release(pipe, &strb->surface);
-
 st_update_renderbuffer_surface(st, strb);

 strb->Base.Format = st_pipe_format_to_mesa_format(pt->format);



Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [PATCH] gallium/os: add os_wait_until_zero

2015-06-26 Thread Jose Fonseca

On 26/06/15 12:05, Marek Olšák wrote:

From: Marek Olšák 

This will be used by radeon and amdgpu winsyses.
Copied from the amdgpu winsys.
---
  src/gallium/auxiliary/os/os_time.c | 36 +++-
  src/gallium/auxiliary/os/os_time.h | 10 ++
  2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/os/os_time.c 
b/src/gallium/auxiliary/os/os_time.c
index f7e4ca4..63b6879 100644
--- a/src/gallium/auxiliary/os/os_time.c
+++ b/src/gallium/auxiliary/os/os_time.c
@@ -33,11 +33,12 @@
   */


-#include "pipe/p_config.h"
+#include "pipe/p_defines.h"

  #if defined(PIPE_OS_UNIX)
  #  include  /* timeval */
  #  include  /* timeval */
+#  include  /* sched_yield */
  #elif defined(PIPE_SUBSYSTEM_WINDOWS_USER)
  #  include 
  #else
@@ -92,3 +93,36 @@ os_time_sleep(int64_t usecs)
  }

  #endif
+
+
+bool os_wait_until_zero(int *var, uint64_t timeout)


should var be a volatile pointer? I'm surprised it works without it.

Maybe it just works on Unixes thanks to the sched_yield call, and the 
assumption that it might have side effects.


Jose


+{
+   if (!*var)
+  return true;
+
+   if (!timeout)
+  return false;
+
+   if (timeout == PIPE_TIMEOUT_INFINITE) {
+  while (*var) {
+#if defined(PIPE_OS_UNIX)
+ sched_yield();
+#endif
+  }
+  return true;
+   }
+   else {
+  int64_t start_time = os_time_get_nano();
+  int64_t end_time = start_time + timeout;
+
+  while (*var) {
+ if (os_time_timeout(start_time, end_time, os_time_get_nano()))
+return false;
+
+#if defined(PIPE_OS_UNIX)
+ sched_yield();
+#endif
+  }
+  return true;
+   }
+}
diff --git a/src/gallium/auxiliary/os/os_time.h 
b/src/gallium/auxiliary/os/os_time.h
index 4fab03c..fdc8040 100644
--- a/src/gallium/auxiliary/os/os_time.h
+++ b/src/gallium/auxiliary/os/os_time.h
@@ -94,6 +94,16 @@ os_time_timeout(int64_t start,
  }


+/**
+ * Wait until the variable at the given memory location is zero.
+ *
+ * \param var   variable
+ * \param timeout   timeout, can be anything from 0 (no wait) to
+ *  PIPE_TIMEOUT_INFINITE (wait forever)
+ * \return true if the variable is zero
+ */
+bool os_wait_until_zero(int *var, uint64_t timeout);
+
  #ifdef __cplusplus
  }
  #endif





Re: [Mesa-dev] [PATCH] gallium/os: add os_wait_until_zero

2015-06-26 Thread Jose Fonseca
As others pointed out, volatile and atomic are slightly different things, 
but you have a point: atomic operations should probably take volatile 
pointers as arguments.


This is what C11 did

  http://en.cppreference.com/w/c/atomic/atomic_load

so I do believe that it makes sense to update p_atomic helpers to match 
(as one day hopefully we'll replace everything with stdatomic.h)


Jose
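To make the volatile-vs-atomic point concrete, here is a minimal sketch (not Mesa code; it assumes a C11 compiler with <stdatomic.h>, and the function name is hypothetical) of a bounded wait-until-zero loop built on atomic_load, which guarantees a fresh load on every iteration without any volatile qualifier at the call site:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sketch: atomic_load() must re-read memory on every call,
 * so the compiler cannot hoist the load out of the loop the way it
 * legally could with a plain (non-volatile, non-atomic) int read. */
static bool wait_until_zero_bounded(atomic_int *var, unsigned max_polls)
{
   while (max_polls--) {
      if (atomic_load(var) == 0)
         return true;            /* the variable reached zero */
   }
   return false;                 /* gave up after max_polls reads */
}
```

A real implementation would still want the sched_yield()/timeout logic of os_wait_until_zero; the point here is only where the re-load guarantee comes from.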


On 26/06/15 16:33, Marek Olšák wrote:

I expect the variable will be changed using an atomic operation by the
CPU, or using a coherent store instruction by the GPU.

If this is wrong and volatile is really required here, then
p_atomic_read is wrong too. Should we fix it? For example:

#define p_atomic_read(_v) (*(volatile int*)(_v))

Then, os_wait_until_zero can use p_atomic_read.

Marek

On Fri, Jun 26, 2015 at 4:48 PM, Ilia Mirkin  wrote:

On Fri, Jun 26, 2015 at 7:05 AM, Marek Olšák  wrote:

From: Marek Olšák 

This will be used by radeon and amdgpu winsyses.
Copied from the amdgpu winsys.
---
  src/gallium/auxiliary/os/os_time.c | 36 +++-
  src/gallium/auxiliary/os/os_time.h | 10 ++
  2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/os/os_time.c 
b/src/gallium/auxiliary/os/os_time.c
index f7e4ca4..63b6879 100644
--- a/src/gallium/auxiliary/os/os_time.c
+++ b/src/gallium/auxiliary/os/os_time.c
@@ -33,11 +33,12 @@
   */


-#include "pipe/p_config.h"
+#include "pipe/p_defines.h"

  #if defined(PIPE_OS_UNIX)
  #  include  /* timeval */
  #  include  /* timeval */
+#  include  /* sched_yield */
  #elif defined(PIPE_SUBSYSTEM_WINDOWS_USER)
  #  include 
  #else
@@ -92,3 +93,36 @@ os_time_sleep(int64_t usecs)
  }

  #endif
+
+
+bool os_wait_until_zero(int *var, uint64_t timeout)


Does this need to be volatile?


+{
+   if (!*var)
+  return true;
+
+   if (!timeout)
+  return false;
+
+   if (timeout == PIPE_TIMEOUT_INFINITE) {
+  while (*var) {
+#if defined(PIPE_OS_UNIX)
+ sched_yield();
+#endif
+  }
+  return true;
+   }
+   else {
+  int64_t start_time = os_time_get_nano();
+  int64_t end_time = start_time + timeout;
+
+  while (*var) {
+ if (os_time_timeout(start_time, end_time, os_time_get_nano()))
+return false;
+
+#if defined(PIPE_OS_UNIX)
+ sched_yield();
+#endif
+  }
+  return true;
+   }
+}
diff --git a/src/gallium/auxiliary/os/os_time.h 
b/src/gallium/auxiliary/os/os_time.h
index 4fab03c..fdc8040 100644
--- a/src/gallium/auxiliary/os/os_time.h
+++ b/src/gallium/auxiliary/os/os_time.h
@@ -94,6 +94,16 @@ os_time_timeout(int64_t start,
  }


+/**
+ * Wait until the variable at the given memory location is zero.
+ *
+ * \param var   variable
+ * \param timeout   timeout, can be anything from 0 (no wait) to
+ *  PIPE_TIMEOUT_INFINITE (wait forever)
+ * \return true if the variable is zero
+ */
+bool os_wait_until_zero(int *var, uint64_t timeout);
+
  #ifdef __cplusplus
  }
  #endif
--
2.1.0



Re: [Mesa-dev] [PATCH 1/5] darwin: Suppress type conversion warnings for GLhandleARB

2015-06-28 Thread Jose Fonseca

On 25/06/15 23:18, Julien Isorce wrote:



On 19 June 2015 at 10:24, Jose Fonseca wrote:

On 19/06/15 04:46, Ian Romanick wrote:

On 06/17/2015 10:53 PM, Julien Isorce wrote:

From: Jon TURNEY 

On darwin, GLhandleARB is defined as a void *, not the
unsigned int it is on
linux.

For the moment, apply a cast to supress the warning

Possibly this is safe, as for the mesa software renderer the
shader program
handle is not a real pointer, but a integer handle

Probably this is not the right thing to do, and we should
pay closer attention
to how the GLhandleARB type is used.


In Mesa, glBindAttribLocation (which takes GLuint) and
glBindAttribLocationARB (which takes GLhandleARB) are *the same
function*.  The same applies to pretty much all the other
GLhandleARB
functions.



Properly fixing this is a nightmare, but I think a short-term
workaround is feasible.

This is the generated glapitemp.h:

   KEYWORD1 void KEYWORD2 NAME(BindAttribLocationARB)(GLhandleARB
program, GLuint index, const GLcharARB * name)
   {
   (void) program; (void) index; (void) name;
  DISPATCH(BindAttribLocation, (program, index, name), (F,
"glBindAttribLocationARB(%d, %d, %p);\n", program, index, (const
void *) name));
   }

Provided that GLhandleARB is defined as `unsigned long` during the Mesa
build on Mac OS X


Hi, where exactly ? or do you mean we just need to apply the patch [1]
you pointed ?

(to avoid these int <-> void * conversions [1]), the compiler should
implicitly cast the 64-bit GLhandleARB program to a 32-bit GLuint.

So, when an app calls glBindAttribLocationARB it will be dispatched
to _mesa_BindAttribLocation, and the program truncated. So it should
all just work.

Ditto for when GLhandleARB appears as return value.


The only problem is when GLhandleARB appears as a pointer, as there
is only one such instance:

   GLAPI void APIENTRY glGetAttachedObjectsARB (GLhandleARB
containerObj, GLsizei maxCount, GLsizei *count, GLhandleARB *obj);

But we do have a separate entry-point for this
(_mesa_GetAttachedObjectsARB) so again, we're all good.


So, Jon/Julien's patch seems perfectly workable -- all that's really left
to do is to silence GLhandleARB <-> GLuint conversions.


That's a good news.
So Jose concretely what needs to be done ? Just apply the patch [1] you
pointed or apply cast everywhere ?

All conversions are in the 3 files, src/mesa/main/dlist.c,
src/mesa/main/shaderapi.c and src/mesa/main/shader_query.cpp, am I right ?

I did not notice before, but without any change to upstream code, it
gives an error only when compiling C++ files. For C files it is a warning:

main/shaderapi.c:1148:23: warning: incompatible pointer to integer
conversion passing 'GLhandleARB' (aka 'void *') to parameter of type
'GLuint' (aka 'unsigned int') [-Wint-conversion]
attach_shader(ctx, program, shader);

main/shader_query.cpp:72:7: error: no matching function for call to
'_mesa_lookup_shader_program_err'
  "glBindAttribLocation");
../../src/mesa/main/shaderobj.h:89:1: note: candidate function not viable:
cannot convert argument of incomplete type 'GLhandleARB' (aka 'void *') to
'GLuint' (aka 'unsigned int')
_mesa_lookup_shader_program_err(struct gl_context *ctx, GLuint name,

Thx



Jose


[1] Apitrace also defines GLhandleARB as unsigned long internally to
avoid this

https://github.com/apitrace/apitrace/blob/master/thirdparty/khronos/GL/glext.patch




I mean something like this:

diff --git a/configure.ac b/configure.ac
index af61aa2..afcfbf6 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1353,7 +1353,7 @@ if test "x$enable_dri" = xyes; then
 fi
 ;;
 darwin*)
-DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED"
+DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED -DBUILDING_MESA"
 if test "x$with_dri_drivers" = "xyes"; then
 with_dri_drivers="swrast"
 fi
diff --git a/include/GL/glext.h b/include/GL/glext.h
index a3873a6..1f52871 100644
--- a/include/GL/glext.h
+++ b/include/GL/glext.h
@@ -3879,7 +3879,12 @@ GLAPI void APIENTRY glMinSampleShadingARB 
(GLfloat value);

 #ifndef GL_ARB_shader_objects
 #define GL_ARB_shader_objects 1
 #ifdef __APPLE__
+#ifdef BUILDING_MESA
+// Avoid uint <-> void *  warnings
+typedef unsigned long GLhandleARB;
+#else
 typedef void *GLhandleARB;
+#endif
 #else
 typedef unsigned int GLhandleARB;
 #endif


Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-01 Thread Jose Fonseca
On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment # 14 


> on bug 91173  from
> Ilia Mirkin  *
>
> Erm... ok...
>
> MOV R0.zw, c[A0.x + 9];
> MOV R1.x, c[0].w;
> ADD R0.x, c[A0.x + 9].y, R1;
> FLR R0.y, R0.x;
>
> vs
>
>0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
>3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
>7: FLR TEMP[0].y, CONST[0].
>
> Could be that I'm matching the wrong shaders. But this seems highly 
suspect.
> Need to see if there's a good way of dumping mesa ir... I wonder if 
it doesn't
> notice the write-mask on the MOV R0.zw and thinks that R0 contains 
the value it

> wants.

Nice detective work on this bug, Ilia.

> Could be that I'm matching the wrong shaders.

I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa-specific extension to
extract a text representation of the currently bound GLSL, TGSI,
hardware-specific IRs, etc., exclusively for debugging purposes.


It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see all the IRs side by side at a given call in a
trace will probably save some time / grief for us developers in
similar situations.



I did something akin to this for NVIDIA proprietary drivers in
https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is appropriate for Mesa (it only allows one format).



Instead, for Mesa we could have something like

   GLint n, i;
   // this will trigger IRs being collected into an array internally
   glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

   for (i = 0; i < n; ++i) {
       GLint nameLength;
       char *name;
       GLint sourceLength;
       char *source;
       glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
       name = malloc(nameLength);
       source = malloc(sourceLength);
       glGetActiveInternalRepr(NULL, name, NULL, source);
       /* ... use name/source ... */
       free(name);
       free(source);
   }

And this would need to be plumbed all the way through the drivers;
each layer would advertise additional IRs.


And the information here would only be obtainable/valid immediately 
after a draw call.



A completely different tack is that apitrace's glretrace would
advertise a unique environment variable (e.g., MESA_IR_DUMP_ALL=fd), and
all drivers/layers would write shader representations when they are
bound/unbound/destroyed, in a pre-established format:


CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if it's a little more work at the start, will be more
robust in the long term, with less runtime overhead.  GL extensions also
give us a mechanism to revise/deprecate this functionality in the future.



Jose


Re: [Mesa-dev] [PATCH] mesa/prog: relative offsets into constbufs are not constant

2015-07-01 Thread Jose Fonseca

On 02/07/15 06:55, Matt Turner wrote:

On Wed, Jul 1, 2015 at 3:22 PM, Ilia Mirkin  wrote:

The optimization logic relies on being able to read out constbuf values
from program parameters. However that only works if there's no relative
addressing involved.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91173
Signed-off-by: Ilia Mirkin 
---


It would be a pretty neat project to make i915 and r200 consume NIR.

We'd get to delete Mesa IR and these optimization passes and probably
generate better code.


Or maybe have IR -> NIR -> IR, allowing us to remove the optimization passes.

Jose


Re: [Mesa-dev] [PATCH v2] darwin: Suppress type conversion warnings for GLhandleARB

2015-07-02 Thread Jose Fonseca

On 02/07/15 13:16, Emil Velikov wrote:

On 1 July 2015 at 00:33, Julien Isorce  wrote:

darwin: silence GLhandleARB conversions from and to GLuint

This patch and its description are inspired by Jose Fonseca's
explanations and suggestions.

With this patch the following logic applies, and only if __APPLE__:

When building mesa, GLhandleARB is defined as unsigned long and
at some point cast to GLuint in GL function implementations.
These exact points are where these errors and warnings appear.

When building an application GLhandleARB is defined as void*.
Later when calling a GL function, for example glBindAttribLocationARB,
it will be dispatched to _mesa_BindAttribLocation. So internally
void* will be treated as unsigned long which has the same size.
So the same truncation happens when casting it to GLuint.

Same when GLhandleARB appears as return value.
For mesa it will be GLuint -> unsigned long.
For an application it will be GLuint -> unsigned long -> void*.
Note that the value will be preserved when casting back to GLuint.

When GLhandleARB appears as a pointer there are also separate
entry-points, i.e. _mesa_FuncNameARB. So the same logic can
be applied.

https://bugs.freedesktop.org/show_bug.cgi?id=66346
Signed-off-by: Julien Isorce 
---
  configure.ac   | 2 +-
  include/GL/glext.h | 5 +
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 7661bd9..1cd8e77 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1357,7 +1357,7 @@ if test "x$enable_dri" = xyes; then
  fi
  ;;
  darwin*)
-DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED"
+DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED -DBUILDING_MESA"
  if test "x$with_dri_drivers" = "xyes"; then
  with_dri_drivers="swrast"
  fi
diff --git a/include/GL/glext.h b/include/GL/glext.h
index a3873a6..e5f1d89 100644
--- a/include/GL/glext.h
+++ b/include/GL/glext.h
@@ -3879,7 +3879,12 @@ GLAPI void APIENTRY glMinSampleShadingARB (GLfloat 
value);
  #ifndef GL_ARB_shader_objects
  #define GL_ARB_shader_objects 1
  #ifdef __APPLE__
+#ifdef BUILDING_MESA
+/* Avoid uint <-> void* warnings */
+typedef unsigned long GLhandleARB;
+#else
  typedef void *GLhandleARB;
+#endif

Ideally we'll ship a header without this change, but that involves
adding a hook at the build/install stage into the autotools build.


I don't think that complexity is justifiable.  There is even a 
precedent, e.g., from the top of mesa/include/GL/gl.h:


#  if (defined(_MSC_VER) || defined(__MINGW32__)) && defined(BUILD_GL32) 
/* tag specify we're building mesa as a DLL */


Many other open-source projects have public headers that behave
differently when building vs. external use, and I have yet to see one
that has install hooks to filter that stuff out.


Jose


Re: [Mesa-dev] [PATCH v2] darwin: Suppress type conversion warnings for GLhandleARB

2015-07-02 Thread Jose Fonseca

On 01/07/15 00:33, Julien Isorce wrote:

darwin: silence GLhandleARB conversions from and to GLuint

This patch and its description are inspired by Jose Fonseca's
explanations and suggestions.

With this patch the following logic applies, and only if __APPLE__:

When building mesa, GLhandleARB is defined as unsigned long and
at some point cast to GLuint in GL function implementations.
These exact points are where these errors and warnings appear.

When building an application GLhandleARB is defined as void*.
Later when calling a GL function, for example glBindAttribLocationARB,
it will be dispatched to _mesa_BindAttribLocation. So internally
void* will be treated as unsigned long which has the same size.
So the same truncation happens when casting it to GLuint.

Same when GLhandleARB appears as return value.
For mesa it will be GLuint -> unsigned long.
For an application it will be GLuint -> unsigned long -> void*.
Note that the value will be preserved when casting back to GLuint.

When GLhandleARB appears as a pointer there are also separate
entry-points, i.e. _mesa_FuncNameARB. So the same logic can
be applied.

https://bugs.freedesktop.org/show_bug.cgi?id=66346
Signed-off-by: Julien Isorce 
---
  configure.ac   | 2 +-
  include/GL/glext.h | 5 +
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 7661bd9..1cd8e77 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1357,7 +1357,7 @@ if test "x$enable_dri" = xyes; then
  fi
  ;;
  darwin*)
-DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED"
+DEFINES="$DEFINES -DGLX_ALIAS_UNSUPPORTED -DBUILDING_MESA"
  if test "x$with_dri_drivers" = "xyes"; then
  with_dri_drivers="swrast"
  fi
diff --git a/include/GL/glext.h b/include/GL/glext.h
index a3873a6..e5f1d89 100644
--- a/include/GL/glext.h
+++ b/include/GL/glext.h
@@ -3879,7 +3879,12 @@ GLAPI void APIENTRY glMinSampleShadingARB (GLfloat 
value);
  #ifndef GL_ARB_shader_objects
  #define GL_ARB_shader_objects 1
  #ifdef __APPLE__
+#ifdef BUILDING_MESA
+/* Avoid uint <-> void* warnings */
+typedef unsigned long GLhandleARB;
+#else
  typedef void *GLhandleARB;
+#endif
  #else
  typedef unsigned int GLhandleARB;
  #endif



The only snafu here is that this change can get dropped when updating
glext.h.  On the other hand, I believe there's also precedent for
modifying Khronos headers in Mesa.


Reviewed-by: Jose Fonseca 

Jose


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 16:34, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca  wrote:

On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment # 14
<https://bugs.freedesktop.org/show_bug.cgi?id=91173#c14>

on bug 91173 <https://bugs.freedesktop.org/show_bug.cgi?id=91173> from
Ilia Mirkin <mailto:imir...@alum.mit.edu> *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.


Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.


I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa-specific extension to extract a
text representation of the currently bound GLSL, TGSI, hardware-specific
IRs, etc., exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see all the IRs side by side at a given call in a
trace will probably save some time / grief for us developers in similar
situations.


I did something akin to this for NVIDIA proprietary drivers in
https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is appropriate for Mesa (it only allows one format).


Instead, for Mesa we could have something like

GLint n;
// this will trigger IRs being collected into an array internally
glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

for (i=0; i < n; ++i) {
GLint nameLength;
char *name;
GLint sourceLength;
char *source;
glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
name = malloc(nameLength)
source = malloc(sourceLength)
glGetActiveInternalRepr(NULL, name, NULL, source);
}

And this would need to be plumbed all the way through the drivers;
each layer would advertise additional IRs.

And the information here would only be obtainable/valid immediately after a
draw call.


A completely different tack is that apitrace's glretrace would advertise a
unique environment variable (e.g., MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shader representations when they are
bound/unbound/destroyed, in a pre-established format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if it's a little more work at the start, will be more robust
in the long term, with less runtime overhead.  GL extensions also give us a
mechanism to revise/deprecate this functionality in the future.


This would still require fairly extensive changes as you'd have to
track all the bindings together.


Really? I don't think so.  Which alternative are you referring to?

Yet another option would be to provide a callback

  typedef void (*GLircallbackMESA)(const char *name, const char *body);

  void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream 
layers with the same callback.




Anyways, *something* would be fantastic. It's also incredibly
difficult to tell what shader is being used for a particular draw...
I've resorted to taking mmt traces of nouveau to see what it's
actually drawing with. Sometimes qapitrace doesn't show a shader,
sometimes it's a fixed function shader, etc. Note that among other
things, this has to account for any shader keys that might exist at
any of the levels. And since bugs are often in optimization passes,
being able to see both pre- and post-opt shaders would be *really*
nice.



That's doable. Each layer just needs to keep the intermediate versions 
around.  We could have glretrace invoke a 
glEnable(GL_KEEP_ALL_IR_MESAX), so that the driver would have not have 
to do this all the time.



Jose



Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:08, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:

On 02/07/15 16:34, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca  wrote:


On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment # 14
<https://bugs.freedesktop.org/show_bug.cgi?id=91173#c14>


on bug 91173 <https://bugs.freedesktop.org/show_bug.cgi?id=91173> from
Ilia Mirkin <mailto:imir...@alum.mit.edu> *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

 0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
 3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
 7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.



Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.



I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa specific extension to extract
a
text representation of the currently bound GLSL, TGSI, hardware-specific,
etc,
exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see next to each other all the IRs at a given call
in a
trace, will probably save some time / grief for us developers on similar
situations.


I did something akin to this for NVIDIA proprietary drivers on

https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is appropriate for Mesa (only one
format.)


Instead, for Mesa we could have something like

 GLint n;
 // this will trigger IRs being collected into an array internally
 glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

 for (i=0; i < n; ++i) {
 GLint nameLength;
 char *name;
 GLint sourceLength;
 char *source;
 glGetActiveInternalRepr(&nameLength, NULL, &sourceLength, NULL);
 name = malloc(nameLength)
 source = malloc(sourceLength)
 glGetActiveInternalRepr(NULL, name, NULL, source);
 }

And this would need to be plumbed through all the way inside the drivers,
each layer would  advertise additional IRs.

And the information here would only be obtainable/valid immediately after
a
draw call.


A completely different tack, is that apitrace's glretrace would advertise
an
unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shaders repres, and when they are
bound/unbound/destroyed on  a preestablished format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if a little more work at start, will be more robust on
the
long term.  And less runtime overhead.  GL extensions also give a
mechanism
to revise/deprecate this functionality in the future.



This would still require fairly extensive changes as you'd have to
track all the bindings together.



Really? I don't think so.  Which alternative are you referring to?


The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
TGSI/HW as it might change based on other states. By the time you get
it sufficiently robust, you might as well do the GL extension.



Yet another option would be to provide a callback

   typedef void (*GLircallbackMESA)(const char *name, const char *body);

   void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream
layers with the same callback.


What "name" would the driver supply here? And how would you link
things up together?


Giving the llvmpipe example, which I'm more familiar with:

 - src/mesa/state_tracker would invoke with "state_tracker/tgsi/{vs,fs}"
and "glsl-ir/{vs,fs}"
 - and invoke pipe_context::get_active_ir (callback) if the pipe driver
implements it

 - src/gallium/drivers/llvmpipe would invoke with
   - "llvmpipe/tgsi/{vs,fs}" (which might differ from the state tracker
due to the draw module)
   - "llvmpipe/llvm/{vs,fs,setup}_{full,partial}"
   - and maybe even "llvmpipe/x86/{vs,fs}"

The idea is that this glGetActiveInternalReprMesa() call dumps what's 
active _now_, which only makes sense immediately after draw calls. So 
the only thing the drivers need to do is dump what they see bound.


That is, there's no need to "link things up together".

Jose



Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:24, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:

On 02/07/15 17:08, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:


On 02/07/15 16:34, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca 
wrote:



On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment #
14
<https://bugs.freedesktop.org/show_bug.cgi?id=91173#c14>



on bug 91173 <https://bugs.freedesktop.org/show_bug.cgi?id=91173> from
Ilia Mirkin <mailto:imir...@alum.mit.edu> *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

  0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
  3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
  7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.




Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.




I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa specific extension to
extract
a
text representation of the currently bound GLSL, TGSI, hardware-specific,
etc,
exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see next to each other all the IRs at a given call
in a
trace, will probably save some time / grief for us developers on
similar
situations.


I did something akin to this for NVIDIA proprietary drivers on


https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is appropriate for Mesa (only one
format.)


Instead, for Mesa we could have something like

  GLint n;
  // this will trigger IRs being collected into an array internally
  glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

  for (i=0; i < n; ++i) {
  GLint nameLength;
  char *name;
  GLint sourceLength;
  char *source;
  glGetActiveInternalRepr(&nameLength, NULL, &sourceLength,
NULL);
  name = malloc(nameLength)
  source = malloc(sourceLength)
  glGetActiveInternalRepr(NULL, name, NULL, source);
  }

And this would need to be plumbed through all the way inside the
drivers,
each layer would  advertise additional IRs.

And the information here would only be obtainable/valid immediately
after
a
draw call.


A completely different tack, is that apitrace's glretrace would
advertise
an
unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shaders repres, and when they are
bound/unbound/destroyed on  a preestablished format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if a little more work at start, will be more robust on
the
long term.  And less runtime overhead.  GL extensions also give a
mechanism
to revise/deprecate this functionality in the future.




This would still require fairly extensive changes as you'd have to
track all the bindings together.




Really? I don't think so.  Which alternative are you referring to?



The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
TGSI/HW as it might change based on other states. By the time you get
it sufficiently robust, you might as well do the GL extension.



Yet another option would be to provide a callback

typedef void (*GLircallbackMESA)(const char *name, const char *body);

void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream
layers with the same callback.



What "name" would the driver supply here? And how would you link
things up together?



Giving llvmpipe example, which I'm more familiar,

  - src/mesa/state_tracker would invoke with "state_tracker/tgsi/{vs,fs}"
and "glsl-ir/{vs,fs}"
  - and invoke pipe_context::get_active_ir (callback) if the pipe driver
implements it
  - src/gallium/drivers/llvmpipe would invoke with
- "llvmpipe/tgsi/{vs,fs}" (which might differ from the state tracker due
to draw module
- "llvmpipe/llvm/{vs,fs,setup}_{full,partial}"
- and maybe even "llvmpipe/x86/{vs,fs}

The idea is that this glGetActiveInternalReprMesa() call dumps what's active
_now_, which only makes sense immediately after draw calls. So the only
thing the drivers need to do is dump what they see bound.


Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off... 

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:39, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 12:24 PM, Ilia Mirkin  wrote:

On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:

On 02/07/15 17:08, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 11:57 AM, Jose Fonseca  wrote:


On 02/07/15 16:34, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 1:55 AM, Jose Fonseca 
wrote:



On 01/07/15 22:30, bugzilla-dae...@freedesktop.org wrote:> *Comment #
14
<https://bugs.freedesktop.org/show_bug.cgi?id=91173#c14>



on bug 91173 <https://bugs.freedesktop.org/show_bug.cgi?id=91173> from
Ilia Mirkin <mailto:imir...@alum.mit.edu> *

Erm... ok...

MOV R0.zw, c[A0.x + 9];
MOV R1.x, c[0].w;
ADD R0.x, c[A0.x + 9].y, R1;
FLR R0.y, R0.x;

vs

  0: MAD TEMP[0].xy, IN[1], CONST[7]., CONST[7].
  3: MOV TEMP[0].zw, CONST[ADDR[0].x+9]
  7: FLR TEMP[0].y, CONST[0].

Could be that I'm matching the wrong shaders. But this seems highly
suspect.
Need to see if there's a good way of dumping mesa ir... I wonder if it
doesn't
notice the write-mask on the MOV R0.zw and thinks that R0 contains the
value it
wants.




Nice detective work on this bug, Ilia.


Could be that I'm matching the wrong shaders.




I think it could be quite useful if there was a
"GL_MESAX_get_internal_representation" Mesa specific extension to
extract
a
text representation of the currently bound GLSL, TGSI, hardware-specific,
etc,
exclusively for debugging purposes.

It doesn't even need to be advertised on non-debug builds of Mesa.  But
merely being able to see next to each other all the IRs at a given call
in a
trace, will probably save some time / grief for us developers on
similar
situations.


I did something akin to this for NVIDIA proprietary drivers on


https://github.com/apitrace/apitrace/commit/49192a4e48d080e44a0d66f059e6897f07cf67f8
but I don't think GetProgramBinary is appropriate for Mesa (only one
format.)


Instead, for Mesa we could have something like

  GLint n;
  // this will trigger IRs being collected into an array internally
  glGetIntegerv(GL_NUM_ACTIVE_IRS, &n);

  for (i=0; i < n; ++i) {
  GLint nameLength;
  char *name;
  GLint sourceLength;
  char *source;
  glGetActiveInternalRepr(&nameLength, NULL, &sourceLength,
NULL);
  name = malloc(nameLength)
  source = malloc(sourceLength)
  glGetActiveInternalRepr(NULL, name, NULL, source);
  }

And this would need to be plumbed through all the way inside the
drivers,
each layer would  advertise additional IRs.

And the information here would only be obtainable/valid immediately
after
a
draw call.


A completely different tack, is that apitrace's glretrace would
advertise
an
unique environment variable (e.g,MESA_IR_DUMP_ALL=fd), and all
drivers/layers would write shaders repres, and when they are
bound/unbound/destroyed on  a preestablished format:

CREATE "GLSL/123"
...
EOF

CREATE TGSI/456
EOF

BIND GLSL/123
BIND TGSI/456
BIND HW/789

UNBIND GLSL/123
UNBIND TGSI/456
UNBIND HW/789

DESTROY GLSL/123
DESTROY TGSI/456
DESTROY HW/789


I don't feel strongly either way, but I suspect that having a proper
extension, even if a little more work at start, will be more robust on
the
long term.  And less runtime overhead.  GL extensions also give a
mechanism
to revise/deprecate this functionality in the future.




This would still require fairly extensive changes as you'd have to
track all the bindings together.




Really? I don't think so.  Which alternative are you referring to?



The MESA_IR_DUMP_ALL=fd thing. You can't just have a single ID for the
TGSI/HW as it might change based on other states. By the time you get
it sufficiently robust, you might as well do the GL extension.



Yet another option would be to provide a callback

typedef void (*GLircallbackMESA)(const char *name, const char *body);

void glGetActiveInternalReprMesa(GLircallbackMESA callback);

and basically each layer would dump the IRs, and invoke the downstream
layers with the same callback.



What "name" would the driver supply here? And how would you link
things up together?



Giving llvmpipe example, which I'm more familiar,

  - src/mesa/state_tracker would invoke with "state_tracker/tgsi/{vs,fs}"
and "glsl-ir/{vs,fs}"
  - and invoke pipe_context::get_active_ir (callback) if the pipe driver
implements it
  - src/gallium/drivers/llvmpipe would invoke with
- "llvmpipe/tgsi/{vs,fs}" (which might differ from the state tracker due
to draw module
- "llvmpipe/llvm/{vs,fs,setup}_{full,partial}"
- and maybe even "llvmpipe/x86/{vs,fs}

The idea is that this glGetActiveInternalReprMesa() call dumps what's active
_now_, which only makes sense immediately after draw calls. So the only
thing the drivers need to do is dump what they see bound.


Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...

Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 17:49, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:

On 02/07/15 17:24, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca  wrote:



Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...



I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any
difference.


It just won't have compiled the shaders, I think. I guess this could force it.


AFAIK, tilers defer the _rendering_, not the compilation. At least 
llvmpipe compiles everything at draw time.





That said, glretrace already does glReadPixels when dumping state, so one
way or the other, when inspecting state in qapitrace, everything will be
flushed and synced.


But that's too late -- you said the glGetActiveBla would go right
after the draw call. Presumably if you did it right after glReadPixels
it'd end up seeing the state left over from a blit or something?


Fair enough. It's the first thing after glDraw. Forget about glReadPixels.

I guess I still don't understand what's special about tilers.  But I 
don't think it's pertinent now.




Perhaps the API should instead be

glEnable(GL_PROGRAM_SAVE_DUMP)
glProgramDumpDebugInfo(progid, callback)

which would then optionally dump any info associated with that
program. That way it doesn't even have to be internally active (due to
a subsequent blit or who-knows-what). But it would rely on that
program having been previously-drawn-with which would have generated
the relevant data.



Doing this immediately after a draw call is no problem at all. I don't 
think it's worth complicating things by allowing a lag between draw and 
shader extraction. It just makes things more unreliable, which defeats 
the point.


Jose



Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-02 Thread Jose Fonseca

On 02/07/15 19:45, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 2:31 PM, Jose Fonseca  wrote:

On 02/07/15 17:49, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca  wrote:


On 02/07/15 17:24, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca 
wrote:




Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...




I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any
difference.



It just won't have compiled the shaders, I think. I guess this could force
it.



AFAIK, tilers defer the _rendering_, not the compilation. At least llvmpipe
compiles everything at draw time.




That said, glretrace already does glReadPixels when dumping state, so one
way or the other, when inspecting state in qapitrace, everything will be
flushed and synced.



But that's too late -- you said the glGetActiveBla would go right
after the draw call. Presumably if you did it right after glReadPixels
it'd end up seeing the state left over from a blit or something?



Fair enough. It's the first thing after glDraw. Forget about glReadPixels.

I guess I still don't understand what's special about tilers.  But I
don't think it's pertinent now.


What's special about tilers is that they defer renders. Compiling the
program can similarly get deferred because they can. (And sometimes
entire renders get dropped due to clears, etc.) Should it get
deferred? Dunno. I don't even remember if freedreno defers
compilation, and never knew what vc4 did.





Perhaps the API should instead be

glEnable(GL_PROGRAM_SAVE_DUMP)
glProgramDumpDebugInfo(progid, callback)

which would then optionally dump any info associated with that
program. That way it doesn't even have to be internally active (due to
a subsequent blit or who-knows-what). But it would rely on that
program having been previously-drawn-with which would have generated
the relevant data.




Doing this immediately after the draw call is no problem at all. I don't think
it's worth complicating things by allowing a lag between draw and shader
extraction. It just makes things more unreliable, which defeats the point.


Would it really complicate things though? Internally, it can never
drop the debug info since a program might later be reused wholesale
and there won't be another compilation, so it has to store the info on
the program object.


Program object might not exist (e.g. when debugging fixed-function).

And the concept of a program object loses meaning in the downstream 
layers (e.g. inside gallium pipe drivers, where TGSI can come from all 
sorts of utility modules and not just GLSL).



I have little doubt: for this to be feasible, it's imperative that this 
applies to the immediately validated state.  Our stack has too many 
layers to do anything else: it would be complex and buggy.



Jose


Re: [Mesa-dev] [PATCH] gallivm: fix lp_build_compare_ext

2015-07-06 Thread Jose Fonseca

On 04/07/15 07:15, Vinson Lee wrote:

On Fri, Jul 3, 2015 at 6:05 PM,   wrote:

From: Roland Scheidegger 

The expansion should always be to the same width as the input arguments
no matter what, since these functions should work with any bit width of
the arguments (the sext is a no-op on any sane simd architecture).
Thus, fix the caller expecting differently.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=91222 (not tested
otherwise)
---
  src/gallium/auxiliary/gallivm/lp_bld_logic.c   | 2 +-
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 3 +++
  2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_logic.c 
b/src/gallium/auxiliary/gallivm/lp_bld_logic.c
index f724cfa..80b53e5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_logic.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_logic.c
@@ -81,7 +81,7 @@ lp_build_compare_ext(struct gallivm_state *gallivm,
   boolean ordered)
  {
 LLVMBuilderRef builder = gallivm->builder;
-   LLVMTypeRef int_vec_type = lp_build_int_vec_type(gallivm, 
lp_type_int_vec(32, 32 * type.length));
+   LLVMTypeRef int_vec_type = lp_build_int_vec_type(gallivm, type);
 LLVMValueRef zeros = LLVMConstNull(int_vec_type);
 LLVMValueRef ones = LLVMConstAllOnes(int_vec_type);
 LLVMValueRef cond;
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index 1f2af85..0ad78b0 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -1961,8 +1961,11 @@ dset_emit_cpu(
 struct lp_build_emit_data * emit_data,
 unsigned pipe_func)
  {
+   LLVMBuilderRef builder = bld_base->base.gallivm->builder;
 LLVMValueRef cond = lp_build_cmp(&bld_base->dbl_bld, pipe_func,
  emit_data->args[0], emit_data->args[1]);
+   /* arguments were 64 bit but store as 32 bit */
+   cond = LLVMBuildTrunc(builder, cond, bld_base->int_bld.int_vec_type, "");
 emit_data->output[emit_data->chan] = cond;
  }

--
1.9.1




Tested-by: Vinson Lee 



Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-07 Thread Jose Fonseca
I'm not experienced with the semantics around resources that can be 
read/written by shaders, so I can't really make educated comments.


But overall this looks good to me FWIW.

On 05/07/15 14:25, Marek Olšák wrote:

From: Marek Olšák 

Other approaches are being considered:

1) Don't use resource wrappers (views) and pass all view parameters
(format, layer range, level) to set_shader_images just like
set_vertex_buffers, set_constant_buffer, or even glBindImageTexture do.


I don't know how much pipe drivers leverage this nowadays, but these 
structures are convenient placeholders for driver data, particularly when 
they don't support something natively (e.g., a certain format, or one 
that needs swizzling).




2) Use pipe_sampler_view instead of pipe_image_view,
and maybe even use set_sampler_views instead of set_shader_images.
set_sampler_views would have to use start_slot >= PIPE_MAX_SAMPLERS for
all writable images to allow for OpenGL textures in the lower slots.


If pipe_sampler_view  and pipe_image_view are the same, we could indeed 
use one structure for both.  While still keeping the separate 
create/bind/destroy functions.


This would enable drivers to treat them uniformly internally if they 
wanted (e.g., by concatenating all view bindings into a single array as 
you described), or to keep separate internal objects if they preferred.


This seems the best of both worlds.

There is even a precedent: {create,bind,delete}_{fs,vs,gs}_state. 
These all use the same template structure, but drivers are free to 
create joint or disjoint private structures for each kind.  And in fact 
llvmpipe (and all draw-based drivers) ends up using different private 
objects.


Jose


Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-07 Thread Jose Fonseca

On 07/07/15 21:28, Ilia Mirkin wrote:

On Tue, Jul 7, 2015 at 4:24 PM, Jose Fonseca  wrote:

I'm not experienced with the semantics around resources that can be
read/written by shaders, so I can't really make educated comments.

But overall this looks good to me FWIW.

On 05/07/15 14:25, Marek Olšák wrote:


From: Marek Olšák 

Other approaches are being considered:

1) Don't use resource wrappers (views) and pass all view parameters
 (format, layer range, level) to set_shader_images just like
 set_vertex_buffers, set_constant_buffer, or even glBindImageTexture
do.



I don't know how much pipe drivers leverage this nowadays, but these
structures are convenient placeholders for driver data, particularly when they
don't support something natively (e.g., a certain format, or one that needs
swizzling).



2) Use pipe_sampler_view instead of pipe_image_view,
 and maybe even use set_sampler_views instead of set_shader_images.
 set_sampler_views would have to use start_slot >= PIPE_MAX_SAMPLERS
for
 all writable images to allow for OpenGL textures in the lower slots.



If pipe_sampler_view  and pipe_image_view are the same, we could indeed use
one structure for both.  While still keeping the separate
create/bind/destroy functions.


The big difference is that a sampler view has a first/last layer and
first/last level, while image views are more like surfaces which just
have the one of each. But they also need a byte range for buffer
images.


D3D11_TEX2D_ARRAY_UAV allows specifying a first/last layer 
(https://msdn.microsoft.com/en-us/library/windows/desktop/ff476242.aspx), 
so it sounds like once pipe_image_view is updated to handle D3D11, the 
difference would reduce to the absence of last_level.



Of course we could just ignore that and guarantee that first==last for images.


Yes, it might not be a bad idea.

Jose


Re: [Mesa-dev] Extension to get Mesa IRs (Was: [Bug 91173])

2015-07-13 Thread Jose Fonseca

On 12/07/15 01:49, Ilia Mirkin wrote:

On Thu, Jul 2, 2015 at 4:54 PM, Jose Fonseca  wrote:

On 02/07/15 19:45, Ilia Mirkin wrote:


On Thu, Jul 2, 2015 at 2:31 PM, Jose Fonseca  wrote:


On 02/07/15 17:49, Ilia Mirkin wrote:



On Thu, Jul 2, 2015 at 12:40 PM, Jose Fonseca 
wrote:



On 02/07/15 17:24, Ilia Mirkin wrote:




On Thu, Jul 2, 2015 at 12:17 PM, Jose Fonseca 
wrote:





Ah OK. So I guess tilers will have to disable their render queues for
this one. Which seems like a reasonable trade-off...





I don't see why.

This is a purely SW query. So I don't see why the HW needs to see any
difference.




It just won't have compiled the shaders, I think. I guess this could
force
it.




AFAIK, tilers defer the _rendering_, not the compilation. At least
llvmpipe
compiles everything at draw time.




That said, glretrace already does glReadPixels when dumping state, so
one
way or the other, when inspecting state in qapitrace, everything will
be
flushed and synced.




But that's too late -- you said the glGetActiveBla would go right
after the draw call. Presumably if you did it right after glReadPixels
it'd end up seeing the state left over from a blit or something?




Fair enough. It's the first thing after glDraw. Forget about
glReadPixels.

I guess I just still don't understand what's special about tilers.  But I
don't think it's pertinent now.



What's special about tilers is that they defer renders. Compiling the
program can similarly get deferred because they can. (And sometimes
entire renders get dropped due to clears, etc.) Should it get
deferred? Dunno. I don't even remember if freedreno defers
compilation, and never knew what vc4 did.





Perhaps the API should instead be

glEnable(GL_PROGRAM_SAVE_DUMP)
glProgramDumpDebugInfo(progid, callback)

which would then optionally dump any info associated with that
program. That way it doesn't even have to be internally active (due to
a subsequent blit or who-knows-what). But it would rely on that
program having been previously-drawn-with which would have generated
the relevant data.





Doing this immediately after the draw call is no problem at all. I don't
think
it's worth complicating things by allowing a lag between draw and shader
extraction. It just makes things more unreliable, which defeats the point.



Would it really complicate things though? Internally, it can never
drop the debug info since a program might later be reused wholesale
and there won't be another compilation, so it has to store the info on
the program object.



Program object might not exist (e.g. when debugging fixed-function).

And the concept of a program object loses meaning in the downstream layers
(e.g. inside gallium pipe drivers, where TGSI can come from all sorts of
utility modules and not just GLSL).


I have little doubt: for this to be feasible, it's imperative that this applies
to the immediately validated state.  Our stack has too many layers to do
anything else: it would be complex and buggy.


Jose


Jose,

Were you planning on working on something like this? I could _really_
use it for some bugs I'm tracking down (and failing thus far),
unfortunately the shaders are unreadable and get compiled very far
away from time of use, which makes it harder to track.


No, I'm afraid I don't have the time myself.  It's not directly useful 
to anything I'm working on at the moment.  My goal was only to help come 
up with a good design for this, so that if/when somebody couldn't resist 
the urge to scratch this itch, there was a tentative design/plan in 
place already.


Jose


Re: [Mesa-dev] [RFC] gallium: add interface for writable shader images

2015-07-13 Thread Jose Fonseca

On 09/07/15 22:05, Marek Olšák wrote:

I'd like to discuss one more thing that will affect whether image
slots will be global (shared by all shaders) or not.

Which image unit an image uniform uses is not a compile-time thing,
but it's specified later using glUniform1i. That means we need a
per-shader table that maps image uniforms to global image units. One
possible solution is to add this pipe_context function:

void (*set_shader_image_mapping)(
struct pipe_context *, unsigned shader,
unsigned start_decl_index, unsigned count,
unsigned *decl_to_slot_mapping);

This is only required if the shader image slots are global and not
per-shader. (if they are per-shader, st/mesa can reorder the slots for
each shader independently just like it already does for textures and
UBOs) Shader storage buffer objects suffer from the same issue. Atomic
counters don't.

Therefore, image slots must be per-shader (like sampler slots) to
avoid this craziness and keep things simple.


Sounds OK to me too.

D3D11's UAV binding points are global, but that can be easily 
accommodated by binding the same UAV array on all stages.


Jose




Re: [Mesa-dev] [PATCH 01/14] egl: remove the non-haiku scons build

2015-07-14 Thread Jose Fonseca

Reviewed-by: Jose Fonseca 


On 14/07/15 16:02, Emil Velikov wrote:

It has been broken since 2011 with commit c98ea26e16b(egl: Make
egl_dri2 and egl_glx built-in drivers.). When the backends got merged
into the main library each entry point was guarded by a
_EGL_BUILT_IN_DRIVER_* define.

As the define was missing, the linker kindly removed the whole of the
dri2 backend, thus we did not notice any errors due to the unresolved
link to xcb and friends.

Cc: Chia-I Wu 
Cc: Jose Fonseca 
Signed-off-by: Emil Velikov 
---
  src/SConscript   |  4 
  src/egl/drivers/dri2/Makefile.am |  2 --
  src/egl/drivers/dri2/SConscript  | 40 
  src/egl/main/SConscript  | 31 ---
  4 files changed, 8 insertions(+), 69 deletions(-)
  delete mode 100644 src/egl/drivers/dri2/SConscript

diff --git a/src/SConscript b/src/SConscript
index b0578e8..46482fb 100644
--- a/src/SConscript
+++ b/src/SConscript
@@ -31,10 +31,6 @@ SConscript('mesa/SConscript')
  if not env['embedded']:
  if env['platform'] not in ('cygwin', 'darwin', 'freebsd', 'haiku', 
'windows'):
  SConscript('glx/SConscript')
-if env['platform'] not in ['darwin', 'haiku', 'sunos', 'windows']:
-if env['dri']:
-SConscript('egl/drivers/dri2/SConscript')
-SConscript('egl/main/SConscript')
  if env['platform'] == 'haiku':
  SConscript('egl/drivers/haiku/SConscript')
  SConscript('egl/main/SConscript')
diff --git a/src/egl/drivers/dri2/Makefile.am b/src/egl/drivers/dri2/Makefile.am
index 55be4a7..f4649de 100644
--- a/src/egl/drivers/dri2/Makefile.am
+++ b/src/egl/drivers/dri2/Makefile.am
@@ -69,5 +69,3 @@ if HAVE_EGL_PLATFORM_SURFACELESS
  libegl_dri2_la_SOURCES += platform_surfaceless.c
  AM_CFLAGS += -DHAVE_SURFACELESS_PLATFORM
  endif
-
-EXTRA_DIST = SConscript
diff --git a/src/egl/drivers/dri2/SConscript b/src/egl/drivers/dri2/SConscript
deleted file mode 100644
index 5b03107..000
--- a/src/egl/drivers/dri2/SConscript
+++ /dev/null
@@ -1,40 +0,0 @@
-Import('*')
-
-env = env.Clone()
-
-env.Append(CPPDEFINES = [
-   'DEFAULT_DRIVER_DIR=\\"\\"'
-])
-
-env.Append(CPPPATH = [
-   '#/include',
-   '#/src/egl/main',
-   '#/src/loader',
-])
-
-sources = [
-   'egl_dri2.c',
-]
-
-if env['x11']:
-   sources.append('platform_x11.c')
-   env.Append(CPPDEFINES = [
-   'HAVE_X11_PLATFORM',
-   ])
-   #env.Append(CPPPATH = [
-   #   'XCB_DRI2_CFLAGS',
-   #])
-
-if env['drm']:
-   env.PkgUseModules('DRM')
-
-env.Prepend(LIBS = [
-   libloader,
-])
-
-egl_dri2 = env.ConvenienceLibrary(
-   target = 'egl_dri2',
-   source = sources,
-)
-
-Export('egl_dri2')
diff --git a/src/egl/main/SConscript b/src/egl/main/SConscript
index c001283..6fc1341 100644
--- a/src/egl/main/SConscript
+++ b/src/egl/main/SConscript
@@ -10,29 +10,14 @@ env.Append(CPPDEFINES = [
  '_EGL_DRIVER_SEARCH_DIR=\\"\\"',
  ])

-if env['platform'] == 'haiku':
-env.Append(CPPDEFINES = [
-'_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_HAIKU',
-'_EGL_OS_UNIX',
-'_EGL_BUILT_IN_DRIVER_HAIKU',
-])
-env.Prepend(LIBS = [
-egl_haiku,
-libloader,
-])
-else:
-env.Append(CPPDEFINES = [
-'_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_X11',
-'_EGL_OS_UNIX',
-])
-if env['dri']:
-env.Prepend(LIBS = [
-egl_dri2,
-libloader,
-])
-# Disallow undefined symbols
-if env['platform'] != 'darwin':
-env.Append(SHLINKFLAGS = ['-Wl,-z,defs'])
+env.Append(CPPDEFINES = [
+'_EGL_NATIVE_PLATFORM=_EGL_PLATFORM_HAIKU',
+'_EGL_OS_UNIX',
+'_EGL_BUILT_IN_DRIVER_HAIKU',
+])
+env.Prepend(LIBS = [
+egl_haiku,
+])

  env.Append(CPPPATH = [
  '#/include',





[Mesa-dev] [PATCH] Match swrast modes more loosely.

2015-07-15 Thread Jose Fonseca
From: Tom Hughes 

https://bugs.freedesktop.org/show_bug.cgi?id=90817

Signed-off-by: Jose Fonseca 
---
 src/glx/dri_common.c | 59 +++-
 1 file changed, 58 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
index 63c8de3..1a62ee2 100644
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -266,6 +266,36 @@ scalarEqual(struct glx_config *mode, unsigned int attrib, 
unsigned int value)
 }
 
 static int
+scalarGreaterEqual(struct glx_config *mode, unsigned int attrib, unsigned int 
value)
+{
+   unsigned int glxValue;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(attribMap); i++)
+  if (attribMap[i].attrib == attrib) {
+ glxValue = *(unsigned int *) ((char *) mode + attribMap[i].offset);
+ return glxValue == GLX_DONT_CARE || glxValue >= value;
+  }
+
+   return GL_TRUE;  /* Is a non-existing attribute greater than or 
equal to value? */
+}
+
+static int
+booleanSupported(struct glx_config *mode, unsigned int attrib, unsigned int 
value)
+{
+   unsigned int glxValue;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(attribMap); i++)
+  if (attribMap[i].attrib == attrib) {
+ glxValue = *(unsigned int *) ((char *) mode + attribMap[i].offset);
+ return glxValue == GLX_DONT_CARE || glxValue;
+  }
+
+   return GL_TRUE;  /* Is a non-existing attribute supported? */
+}
+
+static int
 driConfigEqual(const __DRIcoreExtension *core,
struct glx_config *config, const __DRIconfig *driConfig)
 {
@@ -313,10 +343,37 @@ driConfigEqual(const __DRIcoreExtension *core,
  if (value & __DRI_ATTRIB_TEXTURE_RECTANGLE_BIT)
 glxValue |= GLX_TEXTURE_RECTANGLE_BIT_EXT;
  if (config->bindToTextureTargets != GLX_DONT_CARE &&
- glxValue != config->bindToTextureTargets)
+ glxValue != (config->bindToTextureTargets & glxValue))
+return GL_FALSE;
+ break;
+
+  case __DRI_ATTRIB_STENCIL_SIZE:
+  case __DRI_ATTRIB_ACCUM_RED_SIZE:
+  case __DRI_ATTRIB_ACCUM_GREEN_SIZE:
+  case __DRI_ATTRIB_ACCUM_BLUE_SIZE:
+  case __DRI_ATTRIB_ACCUM_ALPHA_SIZE:
+ if (value != 0 && !scalarEqual(config, attrib, value))
 return GL_FALSE;
  break;
 
+  case __DRI_ATTRIB_DOUBLE_BUFFER:
+  case __DRI_ATTRIB_BIND_TO_TEXTURE_RGB:
+  case __DRI_ATTRIB_BIND_TO_TEXTURE_RGBA:
+  case __DRI_ATTRIB_BIND_TO_MIPMAP_TEXTURE:
+  case __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE:
+  if (value && !booleanSupported(config, attrib, value))
+return GL_FALSE;
+  break;
+
+  case __DRI_ATTRIB_SAMPLE_BUFFERS:
+  case __DRI_ATTRIB_SAMPLES:
+  case __DRI_ATTRIB_AUX_BUFFERS:
+  case __DRI_ATTRIB_MAX_PBUFFER_WIDTH:
+  case __DRI_ATTRIB_MAX_PBUFFER_HEIGHT:
+  case __DRI_ATTRIB_MAX_PBUFFER_PIXELS:
+ if (!scalarGreaterEqual(config, attrib, value))
+return GL_FALSE;
+
   default:
  if (!scalarEqual(config, attrib, value))
 return GL_FALSE;
-- 
2.1.4



Re: [Mesa-dev] [PATCH] c99_math: Implement exp2f for MSVC.

2015-07-16 Thread Jose Fonseca

On 16/07/15 05:30, Matt Turner wrote:

---
This will go in before my double promotion series which uses exp2f.

  include/c99_math.h | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/include/c99_math.h b/include/c99_math.h
index 7ed7cc2..0ca5a73 100644
--- a/include/c99_math.h
+++ b/include/c99_math.h
@@ -140,6 +140,12 @@ llrintf(float f)
 return rounded;
  }

+static inline float
+exp2f(float f)
+{
+   return powf(2.0f, f);
+}
+
  #endif /* C99 */





Looks good. Thanks.

Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-18 Thread Jose Fonseca

On 18/07/15 01:38, Eric Anholt wrote:

Emil Velikov  writes:


On 14/07/15 19:45, Eric Anholt wrote:

These are really useful hints to the compiler in the absence of link-time
optimization, and I'm going to use them in VC4.

I've made the const attribute be ATTRIBUTE_CONST unlike other function
attributes, because we have other things in the tree #defining CONST for
their own unrelated purposes.

Mildly related: how do people feel about making these macros less screamy,
by following the approach used in the kernel: PURE -> __pure and so on ?


I'd love it.


Less screamy is fine, but beware prefixing a double underscore: the C 
standard stipulates that its use is reserved for the C/C++ runtime. [1]


Look at the libstdc++ implementation: every internal variable has a double 
underscore prefix.


Maybe the kernel gets away with it on glibc (and because it doesn't use 
C++), but there's no guarantee it will work on other C runtimes, and 
even if it does, it could start failing at any time.


Jose

[1] http://stackoverflow.com/a/224420


Re: [Mesa-dev] [PATCH 1/3] gallivm: Don't use raw_debug_ostream for dissasembling

2015-07-20 Thread Jose Fonseca

On 20/07/15 18:35, Tom Stellard wrote:

All LLVM API calls that require an ostream object have been removed from
the disassemble() function, so we don't need to use this class to wrap
_debug_printf() we can just call this function directly.
---
  src/gallium/auxiliary/gallivm/lp_bld_debug.cpp | 27 +-
  1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
index 405e648..ec88f33 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
@@ -123,7 +123,7 @@ lp_debug_dump_value(LLVMValueRef value)
   * - http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html
   */
  static size_t
-disassemble(const void* func, llvm::raw_ostream & Out)
+disassemble(const void* func)
  {
 const uint8_t *bytes = (const uint8_t *)func;

@@ -141,7 +141,8 @@ disassemble(const void* func, llvm::raw_ostream & Out)
 char outline[1024];

 if (!D) {
-  Out << "error: couldn't create disassembler for triple " << Triple << 
"\n";
+  _debug_printf("error: couldn't create disassembler for triple %s\n",
+Triple.c_str());
return 0;
 }

@@ -155,13 +156,13 @@ disassemble(const void* func, llvm::raw_ostream & Out)
 * so that between runs.
 */

-  Out << llvm::format("%6lu:\t", (unsigned long)pc);
+  _debug_printf("%6lu:\t", (unsigned long)pc);

Size = LLVMDisasmInstruction(D, (uint8_t *)bytes + pc, extent - pc, 0, 
outline,
 sizeof outline);

if (!Size) {
- Out << "invalid\n";
+ _debug_printf("invalid\n");
   pc += 1;
   break;
}
@@ -173,10 +174,10 @@ disassemble(const void* func, llvm::raw_ostream & Out)
if (0) {
   unsigned i;
   for (i = 0; i < Size; ++i) {
-Out << llvm::format("%02x ", bytes[pc + i]);
+_debug_printf("%02x ", bytes[pc + i]);
   }
   for (; i < 16; ++i) {
-Out << "   ";
+_debug_printf("   ");
   }
}

@@ -184,9 +185,9 @@ disassemble(const void* func, llvm::raw_ostream & Out)
 * Print the instruction.
 */

-  Out << outline;
+  _debug_printf("%*s", Size, outline);

-  Out << "\n";
+  _debug_printf("\n");

/*
 * Stop disassembling on return statements, if there is no record of a
@@ -206,13 +207,12 @@ disassemble(const void* func, llvm::raw_ostream & Out)
pc += Size;

if (pc >= extent) {
- Out << "disassembly larger than " << extent << "bytes, aborting\n";
+ _debug_printf("disassembly larger than %ull bytes, aborting\n", 
extent);
   break;
}
 }

-   Out << "\n";
-   Out.flush();
+   _debug_printf("\n");

 LLVMDisasmDispose(D);

@@ -229,9 +229,8 @@ disassemble(const void* func, llvm::raw_ostream & Out)

  extern "C" void
  lp_disassemble(LLVMValueRef func, const void *code) {
-   raw_debug_ostream Out;
-   Out << LLVMGetValueName(func) << ":\n";
-   disassemble(code, Out);
+   _debug_printf("%s:\n", LLVMGetValueName(func));
+   disassemble(code);
  }





Series looks good AFAICT.

Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-22 Thread Jose Fonseca

On 21/07/15 15:57, Emil Velikov wrote:

On 18 July 2015 at 08:13, Jose Fonseca  wrote:

On 18/07/15 01:38, Eric Anholt wrote:


Emil Velikov  writes:


On 14/07/15 19:45, Eric Anholt wrote:


These are really useful hints to the compiler in the absence of
link-time
optimization, and I'm going to use them in VC4.

I've made the const attribute be ATTRIBUTE_CONST unlike other function
attributes, because we have other things in the tree #defining CONST for
their own unrelated purposes.


Mildly related: how do people feel about making these macros less screamy,
by following the approach used in the kernel: PURE -> __pure and so on ?



I'd love it.



Less screamy is fine, but beware prefixing a double underscore: the C standard
stipulates that its use is reserved for the C/C++ runtime. [1]


I thought about it before posting, although I've seen others define
those, even doing so in their public headers.
Here are some examples from my current /usr/include:

Searching for __pure
dwarves/dutil.h:#define __pure __attribute__ ((pure))

Searching for __attribute_const__
sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__))
sys/cdefs.h:# define __attribute_const__ /* Ignore */

Searching for __printf

Searching for __always_unused

Searching for __noreturn

Searching for __packed
libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__ ((packed))
libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */
bsd/sys/cdefs.h:#  define __packed __attribute__((__packed__))
bsd/sys/cdefs.h:#  define __packed

Searching for __deprecated
pciaccess.h:#define __deprecated __attribute__((deprecated))
pciaccess.h:#define __deprecated

Searching for __weak

Searching for __alias

With a handful of other headers defining more double underscore prefixed macros.


Look at the libstdc++ implementation: every internal variable has a double
underscore prefix.


Unless we're talking about the STL or another template library, we don't care
what library foo uses in its internal implementation, do we? After
all, these will be resolved at compile time.


Maybe the kernel gets away with it on glibc (and because it doesn't use C++), but
there's no guarantee it will work on other C runtimes, and even if it does,
it could start failing at any time.


True, it's not the best of ideas. Just worth pointing out that "the
cat is already out" for other projects.



>  There are already more than 12K "#define __foo" cases on my system.

These defines are reserved for system headers, so it's natural for there 
to be lots of them in /usr/include.



MacOSX also defines some of these on its sys/cdefs.h:

  http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h

The question is: can we expect that most systems will define these 
__foo, or at least not use them for other purposes.


I don't know the answer.  At a glance MSVC doesn't seem to rely on them 
for anything.  So it might work.  I don't oppose it if you want to give 
it a shot.



Jose


[Mesa-dev] [PATCH] gallium/util: Stop bundling our snprintf implementation.

2015-07-22 Thread Jose Fonseca
Use MSVCRT functions instead.  Their semantics are slightly
different but they can be made to work as expected.

Also, use the same code paths for both MSVCRT and MinGW.

No testing yet.  Just built.

https://bugs.freedesktop.org/show_bug.cgi?id=91418
---
 src/gallium/auxiliary/Makefile.sources  |1 -
 src/gallium/auxiliary/util/u_snprintf.c | 1480 ---
 src/gallium/auxiliary/util/u_string.h   |   33 +-
 3 files changed, 30 insertions(+), 1484 deletions(-)
 delete mode 100644 src/gallium/auxiliary/util/u_snprintf.c

diff --git a/src/gallium/auxiliary/Makefile.sources 
b/src/gallium/auxiliary/Makefile.sources
index 62e6b94..3616d88 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -274,7 +274,6 @@ C_SOURCES := \
util/u_simple_shaders.h \
util/u_slab.c \
util/u_slab.h \
-   util/u_snprintf.c \
util/u_split_prim.h \
util/u_sse.h \
util/u_staging.c \
diff --git a/src/gallium/auxiliary/util/u_snprintf.c 
b/src/gallium/auxiliary/util/u_snprintf.c
deleted file mode 100644
index 39e9b70..000
--- a/src/gallium/auxiliary/util/u_snprintf.c
+++ /dev/null
@@ -1,1480 +0,0 @@
-/*
- * Copyright (c) 1995 Patrick Powell.
- *
- * This code is based on code written by Patrick Powell .
- * It may be used for any purpose as long as this notice remains intact on all
- * source code distributions.
- */
-
-/*
- * Copyright (c) 2008 Holger Weiss.
- *
- * This version of the code is maintained by Holger Weiss .
- * My changes to the code may freely be used, modified and/or redistributed for
- * any purpose.  It would be nice if additions and fixes to this file 
(including
- * trivial code cleanups) would be sent back in order to let me include them in
- * the version available at .
- * However, this is not a requirement for using or redistributing (possibly
- * modified) versions of this file, nor is leaving this notice intact 
mandatory.
- */
-
-/*
- * History
- *
- * 2008-01-20 Holger Weiss  for C99-snprintf 1.1:
- *
- * Fixed the detection of infinite floating point values on IRIX (and
- * possibly other systems) and applied another few minor cleanups.
- *
- * 2008-01-06 Holger Weiss  for C99-snprintf 1.0:
- *
- * Added a lot of new features, fixed many bugs, and incorporated various
- * improvements done by Andrew Tridgell , Russ Allbery
- * , Hrvoje Niksic , Damien Miller
- * , and others for the Samba, INN, Wget, and OpenSSH
- * projects.  The additions include: support the "e", "E", "g", "G", and
- * "F" conversion specifiers (and use conversion style "f" or "F" for the
- * still unsupported "a" and "A" specifiers); support the "hh", "ll", "j",
- * "t", and "z" length modifiers; support the "#" flag and the (non-C99)
- * "'" flag; use localeconv(3) (if available) to get both the current
- * locale's decimal point character and the separator between groups of
- * digits; fix the handling of various corner cases of field width and
- * precision specifications; fix various floating point conversion bugs;
- * handle infinite and NaN floating point values; don't attempt to write to
- * the output buffer (which may be NULL) if a size of zero was specified;
- * check for integer overflow of the field width, precision, and return
- * values and during the floating point conversion; use the OUTCHAR() macro
- * instead of a function for better performance; provide asprintf(3) and
- * vasprintf(3) functions; add new test cases.  The replacement functions
- * have been renamed to use an "rpl_" prefix, the function calls in the
- * main project (and in this file) must be redefined accordingly for each
- * replacement function which is needed (by using Autoconf or other means).
- * Various other minor improvements have been applied and the coding style
- * was cleaned up for consistency.
- *
- * 2007-07-23 Holger Weiss  for Mutt 1.5.13:
- *
- * C99 compliant snprintf(3) and vsnprintf(3) functions return the number
- * of characters that would have been written to a sufficiently sized
- * buffer (excluding the '\0').  The original code simply returned the
- * length of the resulting output string, so that's been fixed.
- *
- * 1998-03-05 Michael Elkins  for Mutt 0.90.8:
- *
- * The original code assumed that both snprintf(3) and vsnprintf(3) were
- * missing.  Some systems only have snprintf(3) but not vsnprintf(3), so
- * the code is now broken down under HAVE_SNPRINTF and HAVE_VSNPRINTF.
- *
- * 1998-01-27 Thomas Roessler  for Mutt 0.89i:
- *
- * The PGP code was using unsigned hexadecimal formats.  Unfortunately,
- * unsigned formats simply didn't work.
- *
- * 1997-10-22 Brandon Long  for Mutt 0.87.1:
- *
- * Ok, added some minimal floating point support, which means this probably
- * requires libm on most operating 

Re: [Mesa-dev] [PATCH] targets/dri: scons: add missing link against libdrm

2015-07-22 Thread Jose Fonseca

On 22/07/15 16:04, Emil Velikov wrote:

Otherwise the final dri module will have (additional) unresolved
symbols.

Cc: Brian Paul 
Cc: Jose Fonseca 
Signed-off-by: Emil Velikov 
---

We can only fix the remaining unresolved symbols (_glapi_foo), as we
remove the non-shared glapi when building with DRI.

With this we at least match the autotools build.

-Emil

  src/gallium/targets/dri/SConscript | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/gallium/targets/dri/SConscript 
b/src/gallium/targets/dri/SConscript
index 8d29f3b..2fb0da0 100644
--- a/src/gallium/targets/dri/SConscript
+++ b/src/gallium/targets/dri/SConscript
@@ -25,6 +25,8 @@ if env['llvm']:
  env.Append(CPPDEFINES = 'GALLIUM_LLVMPIPE')
  env.Prepend(LIBS = [llvmpipe])

+env.PkgUseModules('DRM')
+
  env.Append(CPPDEFINES = [
  'GALLIUM_VMWGFX',
  'GALLIUM_SOFTPIPE',



Reviewed-by: Jose Fonseca 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-22 Thread Jose Fonseca

On 22/07/15 17:13, Jose Fonseca wrote:

On 21/07/15 15:57, Emil Velikov wrote:

On 18 July 2015 at 08:13, Jose Fonseca  wrote:

On 18/07/15 01:38, Eric Anholt wrote:


Emil Velikov  writes:


On 14/07/15 19:45, Eric Anholt wrote:


These are really useful hints to the compiler in the absence of
link-time
optimization, and I'm going to use them in VC4.

I've made the const attribute be ATTRIBUTE_CONST unlike other
function
attributes, because we have other things in the tree #defining
CONST for
their own unrelated purposes.


Mildly related: how do people feel about making these macros less
screamy,
by following the approach used in the kernel: PURE -> __pure and so
on ?



I'd love it.



Less screamy is fine, but beware prefixing double underscore: the C
standard
stipulates that its use is reserved for the C/C++ runtime. [1]


I thought about it before posting, although I've seen others define
those, even do so in their public headers.
Now that I have some examples from my current /usr/include

Searching for __pure
dwarves/dutil.h:#define __pure __attribute__ ((pure))

Searching for __attribute_const__
sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__))
sys/cdefs.h:# define __attribute_const__ /* Ignore */

Searching for __printf

Searching for __always_unused

Searching for __noreturn

Searching for __packed
libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__
((packed))
libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */
bsd/sys/cdefs.h:#  define __packed __attribute__((__packed__))
bsd/sys/cdefs.h:#  define __packed

Searching for __deprecated
pciaccess.h:#define __deprecated __attribute__((deprecated))
pciaccess.h:#define __deprecated

Searching for __weak

Searching for __alias

With a handful of other headers defining more double underscore
prefixed macros.


Look at stdlibc++ implementation: every internal variable has a double
underscore prefix.


Unless we're talking about STL/other template library we don't care
what library foo uses in its internal implementation, do we? After
all these will be resolved at compile time.


Maybe the kernel gets away with it on GLIBC (and because it doesn't use C++), but
there's no guarantee it will work on other C runtimes, and even if it
does,
it could start failing anytime.


True, it's not the best of ideas. Just worth pointing out that "the
cat is already out", for other projects.



There are already more than 12K "#define __foo" cases on my system.

These defines are reserved for system headers, so it's natural that there
are lots of them in /usr/include.


MacOSX also defines some of these on its sys/cdefs.h:

   http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h

The question is: can we expect that most systems will define these
__foo, or at least not use them for other purposes.

I don't know the answer.  At a glance MSVC doesn't seem to rely on them
for anything.  So it might work.  I don't oppose if you want to give it
a shot.


Jose


Ironically, Windows headers already define PURE (it's used in COM 
interfaces).  Just realized this now after this was merged.


So we really need a different name...

Jose


Re: [Mesa-dev] [PATCH 1/2] mesa: Detect and provide macros for function attributes pure and const.

2015-07-22 Thread Jose Fonseca

On 22/07/15 21:01, Jose Fonseca wrote:

On 22/07/15 17:13, Jose Fonseca wrote:

On 21/07/15 15:57, Emil Velikov wrote:

On 18 July 2015 at 08:13, Jose Fonseca  wrote:

On 18/07/15 01:38, Eric Anholt wrote:


Emil Velikov  writes:


On 14/07/15 19:45, Eric Anholt wrote:


These are really useful hints to the compiler in the absence of
link-time
optimization, and I'm going to use them in VC4.

I've made the const attribute be ATTRIBUTE_CONST unlike other
function
attributes, because we have other things in the tree #defining
CONST for
their own unrelated purposes.


Mildly related: how do people feel about making these macros less
screamy,
by following the approach used in the kernel: PURE -> __pure and so
on ?



I'd love it.



Less screamy is fine, but beware prefixing double underscore: the C
standard
stipulates that its use is reserved for the C/C++ runtime. [1]


I thought about it before posting, although I've seen others define
those, even do so in their public headers.
Now that I have some examples from my current /usr/include

Searching for __pure
dwarves/dutil.h:#define __pure __attribute__ ((pure))

Searching for __attribute_const__
sys/cdefs.h:# define __attribute_const__ __attribute__ ((__const__))
sys/cdefs.h:# define __attribute_const__ /* Ignore */

Searching for __printf

Searching for __always_unused

Searching for __noreturn

Searching for __packed
libvisual-0.4/libvisual/lv_defines.h:# define __packed __attribute__
((packed))
libvisual-0.4/libvisual/lv_defines.h:# define __packed /* no packed */
bsd/sys/cdefs.h:#  define __packed __attribute__((__packed__))
bsd/sys/cdefs.h:#  define __packed

Searching for __deprecated
pciaccess.h:#define __deprecated __attribute__((deprecated))
pciaccess.h:#define __deprecated

Searching for __weak

Searching for __alias

With a handful of other headers defining more double underscore
prefixed macros.


Look at stdlibc++ implementation: every internal variable has a double
underscore prefix.


Unless we're talking about STL/other template library we don't care
what library foo uses in its internal implementation, do we? After
all these will be resolved at compile time.


Maybe the kernel gets away with it on GLIBC (and because it doesn't use C++), but
there's no guarantee it will work on other C runtimes, and even if it
does,
it could start failing anytime.


True, it's not the best of ideas. Just worth pointing out that "the
cat is already out", for other projects.



There are already more than 12K "#define __foo" cases on my system.

These defines are reserved for system headers, so it's natural that there
are lots of them in /usr/include.


MacOSX also defines some of these on its sys/cdefs.h:


http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/sys/cdefs.h

The question is: can we expect that most systems will define these
__foo, or at least not use them for other purposes.

I don't know the answer.  At a glance MSVC doesn't seem to rely on them
for anything.  So it might work.  I don't oppose if you want to give it
a shot.


Jose


Ironically, Windows headers already define PURE (it's used in COM
interfaces).  Just realized this now after this was merged.

So we really need a different name...


The question is which name: "ATTRIBUTE_PURE", "__pure", or something else?

As I said, I don't oppose __pure, but some logic will be necessary to 
avoid redefinition when system headers (like sys/cdefs.h) already define 
it, which is not trivial.  It will need something like



   #if HAVE_SYS_CDEFS_H
   #include <sys/cdefs.h>   // for __pure
   #endif
   #ifndef __pure
   #  define __pure __attribute__((pure))
   #endif


Jose






Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

2015-07-22 Thread Jose Fonseca

On 22/07/15 23:32, Brian Paul wrote:

Hi Marek,

This is regarding your commit "st/mesa: use pipe_sampler_view_release
for releasing sampler views" from last October.

Basically, we have:

void
st_texture_release_all_sampler_views(struct st_context *st,
  struct st_texture_object *stObj)
{
GLuint i;

/* XXX This should use sampler_views[i]->pipe, not st->pipe */
for (i = 0; i < stObj->num_sampler_views; ++i)
   pipe_sampler_view_release(st->pipe, &stObj->sampler_views[i]);
}

Our VMware/svga driver has an issue when
pipe_context::sampler_view_destroy() is called with one context and a
sampler view which was created with another context.  I can hack around
it in our driver code, but it would be nice to fix this in the state
tracker.

Ideally, the above code should be something like:

for (i = 0; i < stObj->num_sampler_views; ++i)
   pipe_sampler_view_reference(&stObj->sampler_views[i], NULL);

The current code which uses the st->pipe context came from the bug
https://bugs.freedesktop.org/show_bug.cgi?id=81680

AFAICT, you were just working around an R600 driver issue.  Any chance
we could fix the state tracker and re-test Firefox on R600?


Freeing a view from a different context with another context is wrong.

But freeing a view with a context that might be current on another 
thread is also wrong, as pipe_context are not thread safe.



If that was the previous behavior, then maybe Firefox crashed due to the 
race conditions.  Looking at the stack trace, I suspect that what 
happened was that one context was freeing the st_texture, and all its 
views, while the other context was destroying itself, and all its own 
views.  In short, st_texture_release_sampler_view was being called 
simultaneously from two different threads.



The proper fix IMO is to let the pipe context that owns the view destroy 
it from whatever thread it is current on (or whenever it is made 
current).



Jose


Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

2015-07-22 Thread Jose Fonseca

On 23/07/15 01:00, Brian Paul wrote:

On 07/22/2015 05:31 PM, Jose Fonseca wrote:

On 22/07/15 23:32, Brian Paul wrote:

Hi Marek,

This is regarding your commit "st/mesa: use pipe_sampler_view_release
for releasing sampler views" from last October.

Basically, we have:

void
st_texture_release_all_sampler_views(struct st_context *st,
  struct st_texture_object *stObj)
{
GLuint i;

/* XXX This should use sampler_views[i]->pipe, not st->pipe */
for (i = 0; i < stObj->num_sampler_views; ++i)
   pipe_sampler_view_release(st->pipe, &stObj->sampler_views[i]);
}

Our VMware/svga driver has an issue when
pipe_context::sampler_view_destroy() is called with one context and a
sampler view which was created with another context.  I can hack around
it in our driver code, but it would be nice to fix this in the state
tracker.

Ideally, the above code should be something like:

for (i = 0; i < stObj->num_sampler_views; ++i)
   pipe_sampler_view_reference(&stObj->sampler_views[i], NULL);

The current code which uses the st->pipe context came from the bug
https://bugs.freedesktop.org/show_bug.cgi?id=81680

AFAICT, you were just working around an R600 driver issue.  Any chance
we could fix the state tracker and re-test Firefox on R600?


Freeing a view from a different context with another context is wrong.

But freeing a view with a context that might be current on another
thread is also wrong, as pipe_context are not thread safe.


If that was the previous behavior, then maybe Firefox crashed due to the
race conditions.  Looking at the stack trace, I suspect that what
happened was that one context was freeing the st_texture, and all its
views, while the other context was destroying itself, and all its own
views.  In short, st_texture_release_sampler_view was being called
simultaneously from two different threads.


The proper fix IMO is to let the pipe context that owns the view destroy
it from whatever thread it is current on (or whenever it is made
current).


OK, in our off-list discussion it wasn't clear to me that
multi-threading was your main concern until your last message.  I see
what you're saying now.

So, when we destroy a texture object which may be shared by multiple
contexts, we theoretically need to move the texture's sampler views to
new, per-context lists.  At some later time, when the context is used,
we'd check if the list of sampler views to be destroyed was non-empty
and free them.

Unfortunately, that involves one context/thread reaching into another
context.  And that sounds messy.


Yes.  We could have a separate mutex for this list.

Still, it's hard to prevent dead-locks (e.g., two contexts trying to add 
a deferred view to each other's deferred list).



The ultimate solution may be to get rid of the per-texture list of
sampler views and instead store sampler views in a context-private data
structure.  I'll have to think that through.


Yes, that's another possibility. I haven't thought it through either.

One way or another, you'll still want to notify a context to "garbage 
collect" its views, even if it's just an atomic flag on each context, 
otherwise a texture that was shared by two contexts might end up 
lingering around indefinitely because contexts are holding on to its views.



Jose



Re: [Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

2015-07-23 Thread Jose Fonseca

On 23/07/15 01:08, Jose Fonseca wrote:

On 23/07/15 01:00, Brian Paul wrote:

On 07/22/2015 05:31 PM, Jose Fonseca wrote:

On 22/07/15 23:32, Brian Paul wrote:

Hi Marek,

This is regarding your commit "st/mesa: use pipe_sampler_view_release
for releasing sampler views" from last October.

Basically, we have:

void
st_texture_release_all_sampler_views(struct st_context *st,
  struct st_texture_object *stObj)
{
GLuint i;

/* XXX This should use sampler_views[i]->pipe, not st->pipe */
for (i = 0; i < stObj->num_sampler_views; ++i)
   pipe_sampler_view_release(st->pipe, &stObj->sampler_views[i]);
}

Our VMware/svga driver has an issue when
pipe_context::sampler_view_destroy() is called with one context and a
sampler view which was created with another context.  I can hack around
it in our driver code, but it would be nice to fix this in the state
tracker.

Ideally, the above code should be something like:

for (i = 0; i < stObj->num_sampler_views; ++i)
   pipe_sampler_view_reference(&stObj->sampler_views[i], NULL);

The current code which uses the st->pipe context came from the bug
https://bugs.freedesktop.org/show_bug.cgi?id=81680

AFAICT, you were just working around an R600 driver issue.  Any chance
we could fix the state tracker and re-test Firefox on R600?


Freeing a view from a different context with another context is wrong.

But freeing a view with a context that might be current on another
thread is also wrong, as pipe_context are not thread safe.


If that was the previous behavior, then maybe Firefox crashed due to the
race conditions.  Looking at the stack trace, I suspect that what
happened was that one context was freeing the st_texture, and all its
views, while the other context was destroying itself, and all its own
views.  In short, st_texture_release_sampler_view was being called
simultaneously from two different threads.


The proper fix IMO is to let the pipe context that owns the view destroy
it from whatever thread it is current on (or whenever it is made
current).


OK, in our off-list discussion it wasn't clear to me that
multi-threading was your main concern until your last message.  I see
what you're saying now.

So, when we destroy a texture object which may be shared by multiple
contexts, we theoretically need to move the texture's sampler views to
new, per-context lists.  At some later time, when the context is used,
we'd check if the list of sampler views to be destroyed was non-empty
and free them.

Unfortunately, that involves one context/thread reaching into another
context.  And that sounds messy.


Yes.  We could have a separate mutex for this list.

Still, it's hard to prevent dead-locks (e.g., two contexts trying to add
a deferred view to each other's deferred list).


The ultimate solution may be to get rid of the per-texture list of
sampler views and instead store sampler views in a context-private data
structure.  I'll have to think that through.


Yes, that's another possibility. I haven't thought it through either.

One way or another, you'll still want to notify a context to "garbage
collect" its views, even if it's just an atomic flag on each context,
otherwise a texture that was shared by two contexts might end up
lingering around indefinitely because contexts are holding on to its views.


What about this:

- we keep the surface views in an LRU-ordered linked list in each context 
(whenever a view is unbound, it's moved to the end of the list and the 
time-stamp/frame-sequence number is noted)


- and we unreference the old views whenever they haven't been referred to 
for a while -- either in time (e.g., 1 second) or frames (e.g., 1 or 2 
frames).


This has several benefits:

- the context that destroys the texture object doesn't need to worry 
about views from other contexts -- those context will automatically 
destroy them on their time


- there's no need for any context to ever need to refer other context's 
data.


- drivers for which views are heavy-weight  (e.g, contain auxiliary 
resources with their own storage) will reclaim that storage earlier if 
those views stop being needed



The aspect I'm less clear on is how to efficiently find the views -- in 
addition to the LRU list, we'll probably need an additional data 
structure (a hash, or a table of lists indexed by texture object) to 
quickly find them.



Jose


Re: [Mesa-dev] [PATCH 1/3] gallivm: Don't use raw_debug_ostream for disassembling

2015-07-23 Thread Jose Fonseca

On 20/07/15 21:39, Jose Fonseca wrote:

On 20/07/15 18:35, Tom Stellard wrote:

All LLVM API calls that require an ostream object have been removed from
the disassemble() function, so we don't need to use this class to wrap
_debug_printf() we can just call this function directly.
---
  src/gallium/auxiliary/gallivm/lp_bld_debug.cpp | 27
+-
  1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
index 405e648..ec88f33 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_debug.cpp
@@ -123,7 +123,7 @@ lp_debug_dump_value(LLVMValueRef value)
   * - http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html
   */
  static size_t
-disassemble(const void* func, llvm::raw_ostream & Out)
+disassemble(const void* func)
  {
 const uint8_t *bytes = (const uint8_t *)func;

@@ -141,7 +141,8 @@ disassemble(const void* func, llvm::raw_ostream &
Out)
 char outline[1024];

 if (!D) {
-  Out << "error: couldn't create disassembler for triple " <<
Triple << "\n";
+  _debug_printf("error: couldn't create disassembler for triple
%s\n",
+Triple.c_str());
return 0;
 }

@@ -155,13 +156,13 @@ disassemble(const void* func, llvm::raw_ostream
& Out)
 * so that between runs.
 */

-  Out << llvm::format("%6lu:\t", (unsigned long)pc);
+  _debug_printf("%6lu:\t", (unsigned long)pc);

Size = LLVMDisasmInstruction(D, (uint8_t *)bytes + pc, extent
- pc, 0, outline,
 sizeof outline);

if (!Size) {
- Out << "invalid\n";
+ _debug_printf("invalid\n");
   pc += 1;
   break;
}
@@ -173,10 +174,10 @@ disassemble(const void* func, llvm::raw_ostream
& Out)
if (0) {
   unsigned i;
   for (i = 0; i < Size; ++i) {
-Out << llvm::format("%02x ", bytes[pc + i]);
+_debug_printf("%02x ", bytes[pc + i]);
   }
   for (; i < 16; ++i) {
-Out << "   ";
+_debug_printf("   ");
   }
}

@@ -184,9 +185,9 @@ disassemble(const void* func, llvm::raw_ostream &
Out)
 * Print the instruction.
 */

-  Out << outline;
+  _debug_printf("%*s", Size, outline);

-  Out << "\n";
+  _debug_printf("\n");

/*
 * Stop disassembling on return statements, if there is no
record of a
@@ -206,13 +207,12 @@ disassemble(const void* func, llvm::raw_ostream
& Out)
pc += Size;

if (pc >= extent) {
- Out << "disassembly larger than " << extent << "bytes,
aborting\n";
+ _debug_printf("disassembly larger than %ull bytes,
aborting\n", extent);
   break;
}
 }

-   Out << "\n";
-   Out.flush();
+   _debug_printf("\n");

 LLVMDisasmDispose(D);

@@ -229,9 +229,8 @@ disassemble(const void* func, llvm::raw_ostream &
Out)

  extern "C" void
  lp_disassemble(LLVMValueRef func, const void *code) {
-   raw_debug_ostream Out;
-   Out << LLVMGetValueName(func) << ":\n";
-   disassemble(code, Out);
+   _debug_printf("%s:\n", LLVMGetValueName(func));
+   disassemble(code);
  }





Series looks good AFAICT.

Reviewed-by: Jose Fonseca 


Tom,

This broke the profile build.  I fixed it with commit 
d6b50ba980b733a82fefe2a0f115635a359c445f, but only then did I realize that 
this change actually causes a regression in functionality.


The debug stream was being used so that we would write the assembly to 
/tmp/perf-X.map.asm files, so that we could later annotate the 
assembly instructions with the profile hits (via the bin/perf-annotate-jit).


But now the assembly is being redirected into the stderr, hence 
perf-annotate-jit won't get anything from /tmp/perf-X.map.asm.


What was your motivation for this change? General cleanup, or is the 
llvm::raw_ostream causing you problems?


I think that, if I don't reinstate the debug_stream, then I'll need to 
replace it with STL streams (std::stringstream and std::ostream).


Jose


Re: [Mesa-dev] [PATCH] Match swrast modes more loosely.

2015-07-23 Thread Jose Fonseca
Sure.  It's not easy to grasp the side effects of this, so it doesn't 
surprise me.


Do you know which hunk caused problems?

Also, I wonder if it would be possible to make the relaxed matching 
specific to swrast. (Because for HW renderer it's pretty much guaranteed 
that the X visuals will match -- the problem is SW rendering with X 
servers running something else.)


Jose

On 23/07/15 20:54, Marek Olšák wrote:

Hi Jose,

FYI, I had to revert this, because it broke glxgears on radeonsi.

Marek

On Wed, Jul 15, 2015 at 3:25 PM, Jose Fonseca  wrote:

From: Tom Hughes 

https://bugs.freedesktop.org/show_bug.cgi?id=90817

Signed-off-by: Jose Fonseca 
---
  src/glx/dri_common.c | 59 +++-
  1 file changed, 58 insertions(+), 1 deletion(-)

diff --git a/src/glx/dri_common.c b/src/glx/dri_common.c
index 63c8de3..1a62ee2 100644
--- a/src/glx/dri_common.c
+++ b/src/glx/dri_common.c
@@ -266,6 +266,36 @@ scalarEqual(struct glx_config *mode, unsigned int attrib, 
unsigned int value)
  }

  static int
+scalarGreaterEqual(struct glx_config *mode, unsigned int attrib, unsigned int 
value)
+{
+   unsigned int glxValue;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(attribMap); i++)
+  if (attribMap[i].attrib == attrib) {
+ glxValue = *(unsigned int *) ((char *) mode + attribMap[i].offset);
+ return glxValue == GLX_DONT_CARE || glxValue >= value;
+  }
+
+   return GL_TRUE;  /* Is a non-existing attribute greater than or 
equal to value? */
+}
+
+static int
+booleanSupported(struct glx_config *mode, unsigned int attrib, unsigned int 
value)
+{
+   unsigned int glxValue;
+   int i;
+
+   for (i = 0; i < ARRAY_SIZE(attribMap); i++)
+  if (attribMap[i].attrib == attrib) {
+ glxValue = *(unsigned int *) ((char *) mode + attribMap[i].offset);
+ return glxValue == GLX_DONT_CARE || glxValue;
+  }
+
+   return GL_TRUE;  /* Is a non-existing attribute supported? */
+}
+
+static int
  driConfigEqual(const __DRIcoreExtension *core,
 struct glx_config *config, const __DRIconfig *driConfig)
  {
@@ -313,10 +343,37 @@ driConfigEqual(const __DRIcoreExtension *core,
   if (value & __DRI_ATTRIB_TEXTURE_RECTANGLE_BIT)
  glxValue |= GLX_TEXTURE_RECTANGLE_BIT_EXT;
   if (config->bindToTextureTargets != GLX_DONT_CARE &&
- glxValue != config->bindToTextureTargets)
+ glxValue != (config->bindToTextureTargets & glxValue))
+return GL_FALSE;
+ break;
+
+  case __DRI_ATTRIB_STENCIL_SIZE:
+  case __DRI_ATTRIB_ACCUM_RED_SIZE:
+  case __DRI_ATTRIB_ACCUM_GREEN_SIZE:
+  case __DRI_ATTRIB_ACCUM_BLUE_SIZE:
+  case __DRI_ATTRIB_ACCUM_ALPHA_SIZE:
+ if (value != 0 && !scalarEqual(config, attrib, value))
  return GL_FALSE;
   break;

+  case __DRI_ATTRIB_DOUBLE_BUFFER:
+  case __DRI_ATTRIB_BIND_TO_TEXTURE_RGB:
+  case __DRI_ATTRIB_BIND_TO_TEXTURE_RGBA:
+  case __DRI_ATTRIB_BIND_TO_MIPMAP_TEXTURE:
+  case __DRI_ATTRIB_FRAMEBUFFER_SRGB_CAPABLE:
+  if (value && !booleanSupported(config, attrib, value))
+return GL_FALSE;
+  break;
+
+  case __DRI_ATTRIB_SAMPLE_BUFFERS:
+  case __DRI_ATTRIB_SAMPLES:
+  case __DRI_ATTRIB_AUX_BUFFERS:
+  case __DRI_ATTRIB_MAX_PBUFFER_WIDTH:
+  case __DRI_ATTRIB_MAX_PBUFFER_HEIGHT:
+  case __DRI_ATTRIB_MAX_PBUFFER_PIXELS:
+ if (!scalarGreaterEqual(config, attrib, value))
+return GL_FALSE;
+
default:
   if (!scalarEqual(config, attrib, value))
  return GL_FALSE;
--
2.1.4





Re: [Mesa-dev] [PATCH 02/10] gallivm: implement the correct version of LRP

2015-10-15 Thread Jose Fonseca

Roland was on PTO.

IMO, the change makes sense from a numeric accuracy POV.

I fear this might cause some slowdown with llvmpipe (two muls instead of 
one), but hopefully it won't be significant.  The accuracy issue could 
cause glitches to llvmpipe too.


Jose

On 15/10/15 15:44, Marek Olšák wrote:

Any comment, or is this okay with people? Given "(1-t)*a + t*b", the
original code didn't return b for t=1 because it's "floating-point".

Marek

On Sun, Oct 11, 2015 at 3:29 AM, Marek Olšák  wrote:

From: Marek Olšák 

The previous version has precision issues. This can be a problem
with tessellation. Sadly, I can't find the article where I read it
anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert
this.
---
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index 0ad78b0..512558b 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -538,12 +538,13 @@ lrp_emit(
 struct lp_build_tgsi_context * bld_base,
 struct lp_build_emit_data * emit_data)
  {
-   LLVMValueRef tmp;
-   tmp = lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_SUB,
-   emit_data->args[1],
-   emit_data->args[2]);
-   emit_data->output[emit_data->chan] = lp_build_emit_llvm_ternary(bld_base,
-TGSI_OPCODE_MAD, emit_data->args[0], tmp, 
emit_data->args[2]);
+   struct lp_build_context *bld = &bld_base->base;
+   LLVMValueRef inv, a, b;
+
+   inv = lp_build_sub(bld, bld_base->base.one, emit_data->args[0]);
+   a = lp_build_mul(bld, emit_data->args[1], emit_data->args[0]);
+   b = lp_build_mul(bld, emit_data->args[2], inv);
+   emit_data->output[emit_data->chan] = lp_build_add(bld, a, b);
  }

  /* TGSI_OPCODE_MAD */
--
2.1.4







Re: [Mesa-dev] [PATCH 02/10] gallivm: implement the correct version of LRP

2015-10-15 Thread Jose Fonseca

On 15/10/15 16:20, Roland Scheidegger wrote:

On 15.10.2015 at 16:44, Marek Olšák wrote:

Any comment, or is this okay with people? Given "(1-t)*a + t*b", the
original code didn't return b for t=1 because it's "floating-point".

Marek

On Sun, Oct 11, 2015 at 3:29 AM, Marek Olšák  wrote:

From: Marek Olšák 

The previous version has precision issues. This can be a problem
with tessellation. Sadly, I can't find the article where I read it
anymore. I'm not sure if the unsafe-fp-math flag would be enough to revert
this.
---
  src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c 
b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
index 0ad78b0..512558b 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_action.c
@@ -538,12 +538,13 @@ lrp_emit(
 struct lp_build_tgsi_context * bld_base,
 struct lp_build_emit_data * emit_data)
  {
-   LLVMValueRef tmp;
-   tmp = lp_build_emit_llvm_binary(bld_base, TGSI_OPCODE_SUB,
-   emit_data->args[1],
-   emit_data->args[2]);
-   emit_data->output[emit_data->chan] = lp_build_emit_llvm_ternary(bld_base,
-TGSI_OPCODE_MAD, emit_data->args[0], tmp, 
emit_data->args[2]);
+   struct lp_build_context *bld = &bld_base->base;
+   LLVMValueRef inv, a, b;
+
+   inv = lp_build_sub(bld, bld_base->base.one, emit_data->args[0]);
+   a = lp_build_mul(bld, emit_data->args[1], emit_data->args[0]);
+   b = lp_build_mul(bld, emit_data->args[2], inv);
+   emit_data->output[emit_data->chan] = lp_build_add(bld, a, b);
  }

  /* TGSI_OPCODE_MAD */
--


Please add a comment why it's using t*a + (1-t)*b and not (a-b)*t + b.
Though it is yet another thing we should have some more control over in
tgsi.


Because if you're willing to allow unsafe-fp-math, then you should
also be willing to accept the simpler formula (I'm quite sure
unsafe-fp-math would be allowed to turn one formula into the other).


Yep, that's my understanding of "unsafe fp math" too.

Jose


Re: [Mesa-dev] [PATCH 2/2] mesa: fix incorrect opcode in save_BlendFunci()

2015-10-16 Thread Jose Fonseca

On 15/10/15 15:51, Brian Paul wrote:

Fixes assertion failure with new piglit
arb_draw_buffers_blend-state_set_get test.

Cc: mesa-sta...@lists.freedesktop.org
---
  src/mesa/main/dlist.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index fdb839c..2b65b2e 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -1400,7 +1400,7 @@ save_BlendFunci(GLuint buf, GLenum sfactor, GLenum 
dfactor)
 GET_CURRENT_CONTEXT(ctx);
 Node *n;
 ASSERT_OUTSIDE_SAVE_BEGIN_END_AND_FLUSH(ctx);
-   n = alloc_instruction(ctx, OPCODE_BLEND_FUNC_SEPARATE_I, 3);
+   n = alloc_instruction(ctx, OPCODE_BLEND_FUNC_I, 3);
 if (n) {
n[1].ui = buf;
n[2].e = sfactor;



Series is

Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [PATCH 2/4] st/mesa: check of out-of-memory in st_DrawPixels()

2015-10-16 Thread Jose Fonseca

On 15/10/15 20:01, Brian Paul wrote:

Before, if make_texture() or st_create_texture_sampler_view() failed
we silently no-op'd the glDrawPixels.  Now, set GL_OUT_OF_MEMORY.
This also allows us to un-nest a bunch of code.
---
  src/mesa/state_tracker/st_cb_drawpixels.c | 74 +--
  1 file changed, 40 insertions(+), 34 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index e4d3580..05f6e6b 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -975,6 +975,7 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
 int num_sampler_view = 1;
 struct gl_pixelstore_attrib clippedUnpack;
 struct st_fp_variant *fpv = NULL;
+   struct pipe_resource *pt;

 /* Mesa state should be up to date by now */
 assert(ctx->NewState == 0x0);
@@ -1030,42 +1031,47 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT);
 }

-   /* draw with textured quad */
-   {
-  struct pipe_resource *pt
- = make_texture(st, width, height, format, type, unpack, pixels);
-  if (pt) {
- sv[0] = st_create_texture_sampler_view(st->pipe, pt);
-
- if (sv[0]) {
-/* Create a second sampler view to read stencil.
- * The stencil is written using the shader stencil export
- * functionality. */
-if (write_stencil) {
-   enum pipe_format stencil_format =
- util_format_stencil_only(pt->format);
-   /* we should not be doing pixel map/transfer (see above) */
-   assert(num_sampler_view == 1);
-   sv[1] = st_create_texture_sampler_view_format(st->pipe, pt,
- stencil_format);
-   num_sampler_view++;
-}
+   /* Put glDrawPixels image into a texture */
+   pt = make_texture(st, width, height, format, type, unpack, pixels);
+   if (!pt) {
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glDrawPixels");
+  return;
+   }

-draw_textured_quad(ctx, x, y, ctx->Current.RasterPos[2],
-   width, height,
-   ctx->Pixel.ZoomX, ctx->Pixel.ZoomY,
-   sv,
-   num_sampler_view,
-   driver_vp,
-   driver_fp, fpv,
-   color, GL_FALSE, write_depth, write_stencil);
-pipe_sampler_view_reference(&sv[0], NULL);
-if (num_sampler_view > 1)
-   pipe_sampler_view_reference(&sv[1], NULL);
- }
- pipe_resource_reference(&pt, NULL);
-  }
+   /* create sampler view for the image */
+   sv[0] = st_create_texture_sampler_view(st->pipe, pt);
+   if (!sv[0]) {
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glDrawPixels");
+  pipe_resource_reference(&pt, NULL);
+  return;
 }
+
+   /* Create a second sampler view to read stencil.  The stencil is
+* written using the shader stencil export functionality.
+*/
+   if (write_stencil) {
+  enum pipe_format stencil_format =
+ util_format_stencil_only(pt->format);
+  /* we should not be doing pixel map/transfer (see above) */
+  assert(num_sampler_view == 1);
+  sv[1] = st_create_texture_sampler_view_format(st->pipe, pt,
+stencil_format);


Should check null sv[1] here too.


+  num_sampler_view++;
+   }
+
+   draw_textured_quad(ctx, x, y, ctx->Current.RasterPos[2],
+  width, height,
+  ctx->Pixel.ZoomX, ctx->Pixel.ZoomY,
+  sv,
+  num_sampler_view,
+  driver_vp,
+  driver_fp, fpv,
+  color, GL_FALSE, write_depth, write_stencil);
+   pipe_sampler_view_reference(&sv[0], NULL);
+   if (num_sampler_view > 1)
+  pipe_sampler_view_reference(&sv[1], NULL);
+
+   pipe_resource_reference(&pt, NULL);
  }






Jose


Re: [Mesa-dev] [PATCH 4/4] st/mesa: optimize 4-component ubyte glDrawPixels

2015-10-16 Thread Jose Fonseca

On 15/10/15 20:01, Brian Paul wrote:

If we didn't find a gallium surface format that exactly matched the
glDrawPixels format/type combination, we used some other 32-bit packed
RGBA format and swizzled the whole image in the mesa texstore/format code.

That slow path can be avoided in some common cases by using the
pipe_sampler_view's swizzle terms to do the swizzling at texture sampling
time instead.

For now, only GL_RGBA/ubyte and GL_BGRA/ubyte combinations are supported.
In the future other formats and types like GL_UNSIGNED_INT_8_8_8_8 could
be added.
---
  src/mesa/state_tracker/st_cb_drawpixels.c | 73 +++
  1 file changed, 64 insertions(+), 9 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index 05f6e6b..a135761 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -395,15 +395,35 @@ make_texture(struct st_context *st,
 * Note that the image is actually going to be upside down in
 * the texture.  We deal with that with texcoords.
 */
-  success = _mesa_texstore(ctx, 2,   /* dims */
-   baseInternalFormat, /* baseInternalFormat */
-   mformat,  /* mesa_format */
-   transfer->stride, /* dstRowStride, bytes */
-   &dest,/* destSlices */
-   width, height, 1, /* size */
-   format, type, /* src format/type */
-   pixels,   /* data source */
-   unpack);
+  if ((format == GL_RGBA || format == GL_BGRA)
+  && type == GL_UNSIGNED_BYTE) {
+ /* Use a memcpy-based texstore to avoid software pixel swizzling.
+  * We'll do the necessary swizzling with the pipe_sampler_view to
+  * give much better performance.
+  * XXX in the future, expand this to accommodate more format and
+  * type combinations.
+  */
+ _mesa_memcpy_texture(ctx, 2,
+  mformat,  /* mesa_format */
+  transfer->stride, /* dstRowStride, bytes */
+  &dest,/* destSlices */
+  width, height, 1, /* size */
+  format, type, /* src format/type */
+  pixels,   /* data source */
+  unpack);
+ success = GL_TRUE;
+  }
+  else {
+ success = _mesa_texstore(ctx, 2,   /* dims */
+  baseInternalFormat, /* baseInternalFormat */
+  mformat,  /* mesa_format */
+  transfer->stride, /* dstRowStride, bytes */
+  &dest,/* destSlices */
+  width, height, 1, /* size */
+  format, type, /* src format/type */
+  pixels,   /* data source */
+  unpack);
+  }

/* unmap */
pipe_transfer_unmap(pipe, transfer);
@@ -958,6 +978,38 @@ clamp_size(struct pipe_context *pipe, GLsizei *width, 
GLsizei *height,


  /**
+ * Set the sampler view's swizzle terms.  This is used to handle RGBA
+ * swizzling when the incoming image format isn't an exact match for
+ * the actual texture format.  For example, if we have glDrawPixels(
+ * GL_RGBA, GL_UNSIGNED_BYTE) and we chose the texture format
+ * PIPE_FORMAT_B8G8R8A8 then we can use the sampler view swizzle to
+ * avoid swizzling all the pixels in software in the texstore code.
+ */
+static void
+setup_sampler_swizzle(struct pipe_sampler_view *sv, GLenum format, GLenum type)
+{
+   if ((format == GL_RGBA || format == GL_BGRA) && type == GL_UNSIGNED_BYTE) {
+  const struct util_format_description *desc =
+ util_format_description(sv->texture->format);
+  /* Every gallium driver supports at least one 32-bit packed RGBA format.
+   * We must have chosen one for (GL_RGBA, GL_UNSIGNED_BYTE).
+   */
+  assert(desc->block.bits == 32);
+  /* use the format's swizzle to setup the sampler swizzle */
+  sv->swizzle_r = desc->swizzle[0];
+  sv->swizzle_g = desc->swizzle[1];
+  sv->swizzle_b = desc->swizzle[2];
+  sv->swizzle_a = desc->swizzle[3];


I think it should be the other way around: the sampler view's swizzle 
should _undo_ the format swizzle, not apply it again.


This indeed works for RGBA8_UNORM / BGRA8_UNORM, but by mere 
coincidence.  It will fail for something like ABGR8_UNORM.


If you don't want to deal with the swizzle inversion now, it might be 
better to explicitly check that the texture->format is RGBA8_UNORM / 
BGRA8_UNORM.


Jose

Re: [Mesa-dev] [PATCH] st/mesa: check for out-of-memory in st_DrawPixels()

2015-10-16 Thread Jose Fonseca

On 16/10/15 23:24, Brian Paul wrote:

Before, if make_texture() or st_create_texture_sampler_view() failed
we silently no-op'd the glDrawPixels.  Now, set GL_OUT_OF_MEMORY.
This also allows us to un-nest a bunch of code.

v2: also check if allocation of sv[1] fails, per Jose.
---
  src/mesa/state_tracker/st_cb_drawpixels.c | 76 ++-
  1 file changed, 44 insertions(+), 32 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index e4d3580..79fb9ec 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -975,6 +975,7 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
 int num_sampler_view = 1;
 struct gl_pixelstore_attrib clippedUnpack;
 struct st_fp_variant *fpv = NULL;
+   struct pipe_resource *pt;

 /* Mesa state should be up to date by now */
 assert(ctx->NewState == 0x0);
@@ -1030,42 +1031,53 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint y,
st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT);
 }

-   /* draw with textured quad */
-   {
-  struct pipe_resource *pt
- = make_texture(st, width, height, format, type, unpack, pixels);
-  if (pt) {
- sv[0] = st_create_texture_sampler_view(st->pipe, pt);
-
- if (sv[0]) {
-/* Create a second sampler view to read stencil.
- * The stencil is written using the shader stencil export
- * functionality. */
-if (write_stencil) {
-   enum pipe_format stencil_format =
- util_format_stencil_only(pt->format);
-   /* we should not be doing pixel map/transfer (see above) */
-   assert(num_sampler_view == 1);
-   sv[1] = st_create_texture_sampler_view_format(st->pipe, pt,
- stencil_format);
-   num_sampler_view++;
-}
+   /* Put glDrawPixels image into a texture */
+   pt = make_texture(st, width, height, format, type, unpack, pixels);
+   if (!pt) {
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glDrawPixels");
+  return;
+   }

-draw_textured_quad(ctx, x, y, ctx->Current.RasterPos[2],
-   width, height,
-   ctx->Pixel.ZoomX, ctx->Pixel.ZoomY,
-   sv,
-   num_sampler_view,
-   driver_vp,
-   driver_fp, fpv,
-   color, GL_FALSE, write_depth, write_stencil);
-pipe_sampler_view_reference(&sv[0], NULL);
-if (num_sampler_view > 1)
-   pipe_sampler_view_reference(&sv[1], NULL);
- }
+   /* create sampler view for the image */
+   sv[0] = st_create_texture_sampler_view(st->pipe, pt);
+   if (!sv[0]) {
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, "glDrawPixels");
+  pipe_resource_reference(&pt, NULL);
+  return;
+   }
+
+   /* Create a second sampler view to read stencil.  The stencil is
+* written using the shader stencil export functionality.
+*/
+   if (write_stencil) {
+  enum pipe_format stencil_format =
+ util_format_stencil_only(pt->format);
+  /* we should not be doing pixel map/transfer (see above) */
+  assert(num_sampler_view == 1);
+  sv[1] = st_create_texture_sampler_view_format(st->pipe, pt,
+stencil_format);
+  if (!sv[1]) {
+ _mesa_error(ctx, GL_OUT_OF_MEMORY, "glDrawPixels");
   pipe_resource_reference(&pt, NULL);
+ pipe_sampler_view_reference(&sv[0], NULL);
+ return;
}
+  num_sampler_view++;
 }
+
+   draw_textured_quad(ctx, x, y, ctx->Current.RasterPos[2],
+  width, height,
+  ctx->Pixel.ZoomX, ctx->Pixel.ZoomY,
+  sv,
+  num_sampler_view,
+  driver_vp,
+  driver_fp, fpv,
+  color, GL_FALSE, write_depth, write_stencil);
+   pipe_sampler_view_reference(&sv[0], NULL);
+   if (num_sampler_view > 1)
+  pipe_sampler_view_reference(&sv[1], NULL);
+
+   pipe_resource_reference(&pt, NULL);
  }





Looks good. Patch 1-3 of this series is

Reviewed-by: Jose Fonseca 



Re: [Mesa-dev] [PATCH] st/mesa: optimize 4-component ubyte glDrawPixels

2015-10-19 Thread Jose Fonseca

On 17/10/15 03:56, Brian Paul wrote:

If we didn't find a gallium surface format that exactly matched the
glDrawPixels format/type combination, we used some other 32-bit packed
RGBA format and swizzled the whole image in the mesa texstore/format code.

That slow path can be avoided in some common cases by using the
pipe_sampler_view's swizzle terms to do the swizzling at texture sampling
time instead.

For now, only GL_RGBA/ubyte and GL_BGRA/ubyte combinations are supported.
In the future other formats and types like GL_UNSIGNED_INT_8_8_8_8 could
be added.

v2: fix incorrect swizzle setup (need to invert the tex format's swizzle)
---
  src/mesa/state_tracker/st_cb_drawpixels.c | 104 +++---
  1 file changed, 95 insertions(+), 9 deletions(-)

diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c 
b/src/mesa/state_tracker/st_cb_drawpixels.c
index 79fb9ec..000d4f2 100644
--- a/src/mesa/state_tracker/st_cb_drawpixels.c
+++ b/src/mesa/state_tracker/st_cb_drawpixels.c
@@ -395,15 +395,35 @@ make_texture(struct st_context *st,
 * Note that the image is actually going to be upside down in
 * the texture.  We deal with that with texcoords.
 */
-  success = _mesa_texstore(ctx, 2,   /* dims */
-   baseInternalFormat, /* baseInternalFormat */
-   mformat,  /* mesa_format */
-   transfer->stride, /* dstRowStride, bytes */
-   &dest,/* destSlices */
-   width, height, 1, /* size */
-   format, type, /* src format/type */
-   pixels,   /* data source */
-   unpack);
+  if ((format == GL_RGBA || format == GL_BGRA)
+  && type == GL_UNSIGNED_BYTE) {
+ /* Use a memcpy-based texstore to avoid software pixel swizzling.
+  * We'll do the necessary swizzling with the pipe_sampler_view to
+  * give much better performance.
+  * XXX in the future, expand this to accommodate more format and
+  * type combinations.
+  */
+ _mesa_memcpy_texture(ctx, 2,
+  mformat,  /* mesa_format */
+  transfer->stride, /* dstRowStride, bytes */
+  &dest,/* destSlices */
+  width, height, 1, /* size */
+  format, type, /* src format/type */
+  pixels,   /* data source */
+  unpack);
+ success = GL_TRUE;
+  }
+  else {
+ success = _mesa_texstore(ctx, 2,   /* dims */
+  baseInternalFormat, /* baseInternalFormat */
+  mformat,  /* mesa_format */
+  transfer->stride, /* dstRowStride, bytes */
+  &dest,/* destSlices */
+  width, height, 1, /* size */
+  format, type, /* src format/type */
+  pixels,   /* data source */
+  unpack);
+  }

/* unmap */
pipe_transfer_unmap(pipe, transfer);
@@ -958,6 +978,69 @@ clamp_size(struct pipe_context *pipe, GLsizei *width, 
GLsizei *height,


  /**
+ * Search the array of 4 swizzle components for the named component and return
+ * its position.
+ */
+static int


Return type and `i` should be unsigned to avoid sign conversion. 
Otherwise looks great.


Reviewed-by: Jose Fonseca 



+search_swizzle(const unsigned char swizzle[4], unsigned component)
+{
+   int i;
+   for (i = 0; i < 4; i++) {
+  if (swizzle[i] == component)
+ return i;
+   }
+   assert(!"search_swizzle() failed");
+   return 0;
+}
+
+
+/**
+ * Set the sampler view's swizzle terms.  This is used to handle RGBA
+ * swizzling when the incoming image format isn't an exact match for
+ * the actual texture format.  For example, if we have glDrawPixels(
+ * GL_RGBA, GL_UNSIGNED_BYTE) and we chose the texture format
+ * PIPE_FORMAT_B8G8R8A8 then we can use the sampler view swizzle to
+ * avoid swizzling all the pixels in software in the texstore code.
+ */
+static void
+setup_sampler_swizzle(struct pipe_sampler_view *sv, GLenum format, GLenum type)
+{
+   if ((format == GL_RGBA || format == GL_BGRA) && type == GL_UNSIGNED_BYTE) {
+  const struct util_format_description *desc =
+ util_format_description(sv->texture->format);
+  unsigned c0, c1, c2, c3;
+
+  /* Every gallium driver supports at least one 32-bit packed RGBA format.
+   * We must have chosen one for (GL_RGBA, GL_UNSIGNED_BYTE).
+   */
+

Re: [Mesa-dev] [PATCH 10/10] vbo: convert display list GL_LINE_LOOP prims to GL_LINE_STRIP

2015-10-19 Thread Jose Fonseca

On 16/10/15 22:25, Brian Paul wrote:

When a long GL_LINE_LOOP prim was split across primitives we drew
stray lines.  See previous commit for details.

This patch converts GL_LINE_LOOP prims into GL_LINE_STRIP prims so
that drivers don't have to worry about the _mesa_prim::begin/end flags.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81174
---
  src/mesa/vbo/vbo_save_api.c | 53 +
  1 file changed, 53 insertions(+)

diff --git a/src/mesa/vbo/vbo_save_api.c b/src/mesa/vbo/vbo_save_api.c
index 6688ba0..d49aa15 100644
--- a/src/mesa/vbo/vbo_save_api.c
+++ b/src/mesa/vbo/vbo_save_api.c
@@ -360,6 +360,51 @@ merge_prims(struct _mesa_prim *prim_list,
 *prim_count = prev_prim - prim_list + 1;
  }

+
+/**
+ * Convert GL_LINE_LOOP primitive into GL_LINE_STRIP so that drivers
+ * don't have to worry about handling the _mesa_prim::begin/end flags.
+ * See https://bugs.freedesktop.org/show_bug.cgi?id=81174
+ */
+static void
+convert_line_loop_to_strip(struct vbo_save_context *save,
+   struct vbo_save_vertex_list *node)
+{
+   struct _mesa_prim *prim = &node->prim[node->prim_count - 1];
+
+   assert(prim->mode == GL_LINE_LOOP);
+
+   if (prim->end) {
+  /* Copy the 0th vertex to end of the buffer and extend the
+   * vertex count by one to finish the line loop.
+   */
+  const GLuint sz = save->vertex_size;
+  /* 0th vertex: */
+  const fi_type *src = save->buffer + prim->start * sz;
+  /* end of buffer: */
+  fi_type *dst = save->buffer + (prim->start + prim->count) * sz;
+
+  memcpy(dst, src, sz * sizeof(float));
+
+  prim->count++;
+  node->count++;
+  save->vert_count++;
+  save->buffer_ptr += sz;
+  save->vertex_store->used += sz;
+   }
+
+   if (!prim->begin) {
+  /* Drawing the second or later section of a long line loop.
+   * Skip the 0th vertex.
+   */
+  prim->start++;
+  prim->count--;
+   }
+
+   prim->mode = GL_LINE_STRIP;
+}
+
+
  /**
   * Insert the active immediate struct onto the display list currently
   * being built.
@@ -441,6 +486,10 @@ _save_compile_vertex_list(struct gl_context *ctx)
  */
 save->copied.nr = _save_copy_vertices(ctx, node, save->buffer);

+   if (node->prim[node->prim_count - 1].mode == GL_LINE_LOOP) {
+  convert_line_loop_to_strip(save, node);
+   }
+
 merge_prims(node->prim, &node->prim_count);

 /* Deal with GL_COMPILE_AND_EXECUTE:
@@ -482,6 +531,10 @@ _save_compile_vertex_list(struct gl_context *ctx)
save->buffer_ptr = vbo_save_map_vertex_store(ctx, save->vertex_store);
save->out_of_memory = save->buffer_ptr == NULL;
 }
+   else {
+  /* update buffer_ptr for next vertex */
+  save->buffer_ptr = save->vertex_store->buffer + save->vertex_store->used;
+   }

 if (save->prim_store->used > VBO_SAVE_PRIM_SIZE - 6) {
save->prim_store->refcount--;



Nice catch.

I'm not very familiar with this code, but FWIW, other than the issues I 
mentioned separately, the series looks good to me.


Reviewed-by: Jose Fonseca 

Jose


[Mesa-dev] [PATCH] scons: Build nir/glsl_types.cpp once.

2015-10-19 Thread Jose Fonseca
Undoes early hacks, and ensures nir/glsl_types.cpp is built once, and
only once.

The root problem is that SCons doesn't know about NIR nor any source
file in the NIR_FILES source list.

Tested with libgl-gdi and libgl-xlib scons targets.
---
 src/gallium/targets/libgl-gdi/SConscript   | 10 +-
 src/gallium/targets/libgl-gdi/glsl_types_hack.cpp  |  3 ---
 src/gallium/targets/libgl-xlib/SConscript  |  3 ---
 src/gallium/targets/libgl-xlib/glsl_types_hack.cpp |  3 ---
 src/gallium/targets/osmesa/SConscript  |  7 +--
 src/gallium/targets/osmesa/glsl_types_hack.cpp |  3 ---
 src/glsl/SConscript|  7 ++-
 src/mesa/drivers/x11/SConscript|  1 -
 8 files changed, 8 insertions(+), 29 deletions(-)
 delete mode 100644 src/gallium/targets/libgl-gdi/glsl_types_hack.cpp
 delete mode 100644 src/gallium/targets/libgl-xlib/glsl_types_hack.cpp
 delete mode 100644 src/gallium/targets/osmesa/glsl_types_hack.cpp

diff --git a/src/gallium/targets/libgl-gdi/SConscript 
b/src/gallium/targets/libgl-gdi/SConscript
index eb777a8..594f34d 100644
--- a/src/gallium/targets/libgl-gdi/SConscript
+++ b/src/gallium/targets/libgl-gdi/SConscript
@@ -7,10 +7,6 @@ env = env.Clone()
 
 env.Append(CPPPATH = [
 '#src',
-'#src/mesa',
-'#src/mapi',
-'#src/glsl',
-'#src/glsl/nir',
 '#src/gallium/state_trackers/wgl',
 '#src/gallium/winsys/sw',
 ])
@@ -24,11 +20,7 @@ env.Append(LIBS = [
 
 env.Prepend(LIBS = [mesautil])
 
-sources = [
-'libgl_gdi.c',
-'glsl_types_hack.cpp'
-]
-
+sources = ['libgl_gdi.c']
 drivers = []
 
 if True:
diff --git a/src/gallium/targets/libgl-gdi/glsl_types_hack.cpp 
b/src/gallium/targets/libgl-gdi/glsl_types_hack.cpp
deleted file mode 100644
index 5c042f2..000
--- a/src/gallium/targets/libgl-gdi/glsl_types_hack.cpp
+++ /dev/null
@@ -1,3 +0,0 @@
-/* errrg scons.. otherwise "scons: *** Two environments with different actions 
were specified for the same target: 
$mesa/build/linux-x86_64-debug/glsl/nir/glsl_types.os" */
-#include "glsl_types.cpp"
-
diff --git a/src/gallium/targets/libgl-xlib/SConscript 
b/src/gallium/targets/libgl-xlib/SConscript
index fedc522..df5a220 100644
--- a/src/gallium/targets/libgl-xlib/SConscript
+++ b/src/gallium/targets/libgl-xlib/SConscript
@@ -6,8 +6,6 @@ Import('*')
 env = env.Clone()
 
 env.Append(CPPPATH = [
-'#/src/glsl',
-'#/src/glsl/nir',
 '#/src/mapi',
 '#/src/mesa',
 '#/src/mesa/main',
@@ -38,7 +36,6 @@ env.Prepend(LIBS = [
 
 sources = [
 'xlib.c',
-'glsl_types_hack.cpp',
 ]
 
 if True:
diff --git a/src/gallium/targets/libgl-xlib/glsl_types_hack.cpp 
b/src/gallium/targets/libgl-xlib/glsl_types_hack.cpp
deleted file mode 100644
index 5c042f2..000
--- a/src/gallium/targets/libgl-xlib/glsl_types_hack.cpp
+++ /dev/null
@@ -1,3 +0,0 @@
-/* errrg scons.. otherwise "scons: *** Two environments with different actions 
were specified for the same target: 
$mesa/build/linux-x86_64-debug/glsl/nir/glsl_types.os" */
-#include "glsl_types.cpp"
-
diff --git a/src/gallium/targets/osmesa/SConscript 
b/src/gallium/targets/osmesa/SConscript
index 78930a9..4a9115b 100644
--- a/src/gallium/targets/osmesa/SConscript
+++ b/src/gallium/targets/osmesa/SConscript
@@ -5,8 +5,6 @@ env = env.Clone()
 env.Prepend(CPPPATH = [
 '#src/mapi',
 '#src/mesa',
-'#src/glsl',
-'#src/glsl/nir',
 #Dir('../../../mapi'), # src/mapi build path for python-generated GL API 
files/headers
 ])
 
@@ -24,10 +22,7 @@ env.Prepend(LIBS = [
 
 env.Append(CPPDEFINES = ['GALLIUM_TRACE', 'GALLIUM_SOFTPIPE'])
 
-sources = [
-'target.c',
-'glsl_types_hack.cpp'
-]
+sources = ['target.c']
 
 if env['llvm']:
 env.Append(CPPDEFINES = 'GALLIUM_LLVMPIPE')
diff --git a/src/gallium/targets/osmesa/glsl_types_hack.cpp 
b/src/gallium/targets/osmesa/glsl_types_hack.cpp
deleted file mode 100644
index 5c042f2..000
--- a/src/gallium/targets/osmesa/glsl_types_hack.cpp
+++ /dev/null
@@ -1,3 +0,0 @@
-/* errrg scons.. otherwise "scons: *** Two environments with different actions 
were specified for the same target: 
$mesa/build/linux-x86_64-debug/glsl/nir/glsl_types.os" */
-#include "glsl_types.cpp"
-
diff --git a/src/glsl/SConscript b/src/glsl/SConscript
index 927cbdc..70bf5b0 100644
--- a/src/glsl/SConscript
+++ b/src/glsl/SConscript
@@ -61,6 +61,12 @@ source_lists = env.ParseSourceList('Makefile.sources')
 for l in ('LIBGLCPP_FILES', 'LIBGLSL_FILES'):
 glsl_sources += source_lists[l]
 
+# add nir/glsl_types.cpp manually, because SCons still doesn't know about NIR.
+# XXX: Remove this once we build NIR and NIR_FILES.
+glsl_sources += [
+'nir/glsl_types.cpp',
+]
+
 if env['msvc']:
 env.Prepend(CPPPATH = ['#/src/getopt'])
 env.PrependUnique(LIBS = [getopt])
@@ -81,7 +87,6 @@ mesa_objs = env.StaticObject([
 'prog_hash_table.c',
 'symbol_table.c',
 'dummy_errors.c',
-'nir/glsl_types.cpp',
 ])
 
 compiler_objs += mesa_objs
diff --git a

Re: [Mesa-dev] MSVC, MinGW build break

2015-10-19 Thread Jose Fonseca

On 17/10/15 17:44, Rob Clark wrote:

On Sat, Oct 17, 2015 at 12:36 PM, Brian Paul  wrote:

On 10/17/2015 10:07 AM, Brian Paul wrote:


On 10/17/2015 07:04 AM, Rob Clark wrote:


On Fri, Oct 16, 2015 at 11:11 PM, Brian Paul  wrote:


Hi Rob,

Your recent commit "nir: remove dependency on glsl" broke the build
for MSVC
and MinGW.

For MSVC:


[...]



Hopefully it's something simple to fix.



these types should all be coming from glsl_types.cpp which moved into
NIR..

I've no idea about MSVC or MinGW builds.. (I did at least fix up the
scons build, although in not a very pretty way..).  I guess the best
thing I could suggest is to:

git show -M b9b40ef9b7644ea24768bc8b7464b1719efe99bf

and make equivalent changes in whatever build files MSVC/MinGW uses??



Yeah, that's what I did.



Actually, I'm kind of hoping we can find a cleaner fix for scons.  Maybe
Jose can take a look when he has time.



yeah, current solution is not pretty..  Emil had suggested introducing
a libnir for scons build, but that was well beyond my understanding of
scons and the scons build setup.

That would be nice if someone who knew what they were doing could have a look.



The problem was actually quite simple.

I'll post a review request now.


> v3: I f***ing hate scons.. but at least it builds

Please CC me when there are difficulties related to the scons build 
before or during code review.


There's no need for people to be struggling with SCons-specific issues 
all alone.  But there's too much traffic in Mesa for me to keep up with 
all of it (especially because lately my focus has been elsewhere), so 
unless I'm explicitly CC'ed the odds are that it will go past me, as 
happened here.  (E-mail that CCs me won't be automatically moved into a 
folder -- it will stay in my top inbox folder, where it's bound to catch 
my attention.)



BTW, the root cause is not so much SCons' "quirkiness", but merely the 
fact that SCons hasn't yet been updated to build NIR, hence adding 
nir/glsl_types.cpp to src/glsl/Makefile.sources' NIR_FILES variable had 
no effect for scons.  All the trouble resulted from that.


And we are probably likely to run into issues again, until we integrate 
NIR into scons.  We should make a bit of time for that as soon as we can.



Jose


Re: [Mesa-dev] Introducing OpenSWR: High performance software rasterizer

2015-10-20 Thread Jose Fonseca

On 20/10/15 18:11, Rowley, Timothy O wrote:

Hi.  I'd like to introduce the Mesa3D community to a software project
that we hope to upstream.  We're a small team at Intel working on
software defined visualization (http://sdvis.org/), and have
opensource projects in both the raytracing (Embree, OSPRay) and
rasterization (OpenSWR) realms.

We're a different Intel team from that of i965 fame, with a different
type of customer and workloads.  Our customers have large clusters of
compute nodes that for various reasons do not have GPUs, and are
working with extremely large geometry models.

We've been working on a high performance, highly scalable rasterizer
and driver to interface with Mesa3D.  Our rasterizer functions as a
"software gpu", relying on the mature well-supported Mesa3D to provide
API and state tracking layers.

We would like to contribute this code to Mesa3D and continue doing
active development in your source repository.  We welcome discussion
about how this will happen and questions about the project itself.
Below are some answers to what we think might be frequently asked
questions.

Bruce and I will be the public contacts for this project, but this
project isn't solely our work - there's a dedicated group of people
working on the core SWR code.

   Tim Rowley
   Bruce Cherniak

   Intel Corporation

Why another software rasterizer?


Good question, given there are already three (swrast, softpipe,
llvmpipe) in the Mesa3D tree. Two important reasons for this:

  * Architecture - given our focus on scientific visualization, our
workloads are much different than the typical game; we have heavy
vertex load and relatively simple shaders.  In addition, the core
counts of machines we run on are much higher.  These parameters led
to design decisions much different than llvmpipe.

  * Historical - Intel had developed a high performance software
graphics stack for internal purposes.  Later we adapted this
graphics stack for use in visualization and decided to move forward
with Mesa3D to provide a high quality API layer while at the same
time benefiting from the excellent performance the software
rasterizerizer gives us.


It wouldn't be too difficult to distribute llvmpipe's vertex shading 
across threads.



What's the architecture?


SWR is a tile based immediate mode renderer with a sort-free threading
model which is arranged as a ring of queues.  Each entry in the ring
represents a draw context that contains all of the draw state and work
queues.  An API thread sets up each draw context and worker threads
will execute both the frontend (vertex/geometry processing) and
backend (fragment) work as required.  The ring allows for backend
threads to pull work in order.  Large draws are split into chunks to
allow vertex processing to happen in parallel, with the backend work
pickup preserving draw ordering.

Our pipeline uses just-in-time compiled code for the fetch shader that
does vertex attribute gathering and AOS to SOA conversions, the vertex
shader and fragment shaders, streamout, and fragment blending. SWR
core also supports geometry and compute shaders but we haven't exposed
them through our driver yet. The fetch shader, streamout, and blend is
built internally to swr core using LLVM directly, while for the vertex
and pixel shaders we reuse bits of llvmpipe from
gallium/auxiliary/gallivm to build the kernels, which we wrap
differently than llvmpipe's auxiliary/draw code.

What's the performance?
---

For the types of high-geometry workloads we're interested in, we are
significantly faster than llvmpipe.  This is to be expected, as
llvmpipe only threads the fragment processing and not the geometry
frontend.

The linked slide below shows some performance numbers from a benchmark
dataset and application.  On a 36 total core dual E5-2699v3 we see
performance 29x to 51x that of llvmpipe.

http://openswr.org/slides/SWR_Sept15.pdf

While our current performance is quite good, we know there is more
potential in this architecture.  When we switched from a prototype
OpenGL driver to Mesa we regressed performance severely, some due to
interface issues that need tuning, some due to differences in shader
code generation, and some due to conformance and feature additions to
the core swr.  We are looking to recover most of this performance.


I tried it on my i7-5500U, but I run into two issues:

- OpenSWR seems to only use 2 threads (even though my system supports 4 
threads)


- and even when I limit llvmpipe to only 2 rasterizer threads to 
compensate, I still only get half the framerate of llvmpipe with the 
"gloss" Mesa demo (a very simple texturing demo):


$ ./gloss
SWR create screen!
This processor supports AVX2.
720 frames in 5.004 seconds = 143.885 FPS
737 frames in 5.005 seconds = 147.253 FPS
729 frames in 5.004 seconds = 145.683 FPS
732 frames in 5.002 seconds = 146.341 FPS
735 frames

Re: [Mesa-dev] Introducing OpenSWR: High performance software rasterizer

2015-10-20 Thread Jose Fonseca

On 20/10/15 23:16, Rowley, Timothy O wrote:



On Oct 20, 2015, at 4:23 PM, Jose Fonseca  wrote:

I tried it on my i7-5500U, but I run into two issues:

- OpenSWR seems to only use 2 threads (even though my system supports 4 threads)

- and even when I compensate llvmpipe to only use 2 rasterizer threads, I still only get 
half the framerate of llvmpipe with the "gloss" Mesa demo (a very simple 
texturing demo):

$ ./gloss
SWR create screen!
This processor supports AVX2.
720 frames in 5.004 seconds = 143.885 FPS
737 frames in 5.005 seconds = 147.253 FPS
729 frames in 5.004 seconds = 145.683 FPS
732 frames in 5.002 seconds = 146.341 FPS
735 frames in 5.001 seconds = 146.971 FPS
[...]
$ GALLIUM_DRIVER=llvmpipe LP_NUM_THREADS=2 ./gloss
1539 frames in 5.002 seconds = 307.677 FPS
1719 frames in 5 seconds = 343.8 FPS
1780 frames in 5.002 seconds = 355.858 FPS
1497 frames in 5.002 seconds = 299.28 FPS
1548 frames in 5.001 seconds = 309.538 FPS
[..]

I see a similar ratio with a more complex workload, using the trace from:

  http://people.freedesktop.org/~jrfonseca/traces/furmark-1.8.2-svga.trace

(you'll need to download https://github.com/apitrace/apitrace and build it)

My questions are:

- Is this the expected performance when texturing is used? Or is there 
something wrong with my setup?



Two things are happening here to cause the behavior you’re seeing.  First, 
OpenSWR only generates threads equal to the number of physical cores.  On our 
workloads, going beyond that and using hyperthreads gave a minimal or even 
negative performance gain.  Second, one thread is reserved for the API thread, which 
does not participate in either frontend (geometry) or backend (fragment) work.  
Thus on your two core 5500U OpenSWR only had one raster thread versus 
llvmpipe’s two, giving half the performance.  If you want to switch OpenSWR to 
using hyperthreads, set the environment variable KNOB_MAX_THREADS_PER_CORE=0.


Thanks for the explanations.  It's closer now, but still a bit of gap:

$ KNOB_MAX_THREADS_PER_CORE=0 ./gloss
SWR create screen!
This processor supports AVX2.
--> numThreads = 3
1102 frames in 5.002 seconds = 220.312 FPS
1133 frames in 5.001 seconds = 226.555 FPS
1130 frames in 5.002 seconds = 225.91 FPS
^C
$ GALLIUM_DRIVER=llvmpipe LP_NUM_THREADS=2 ./gloss
1456 frames in 5 seconds = 291.2 FPS
1617 frames in 5.003 seconds = 323.206 FPS
1571 frames in 5.002 seconds = 314.074 FPS


One final question: you said that one thread is reserved for the API, 
but I see all threads (with top `H`) maxing out the CPU.  So if the 
thread reserved for the API is not doing vertex/fragment processing, 
then what is it using 100% of a CPU thread for?



Final thoughts: I understand this project has its own history, but I 
echo what Roland said -- it would be nice to unify with llvmpipe at one 
point, in some way or fashion.  Our (VMware's) focus has been desktop 
composition, but there's no reason why a single SW renderer can't 
satisfy both ends of the spectrum, especially for JIT-enabled renderers, 
since they can emit at runtime the code most suited for the workload.


That said, it's really nice seeing Mesa and Gallium enabling this sort 
of experiment with SW rendering.



Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallivm: Translate all util_cpu_caps bits to LLVM attributes.

2015-10-21 Thread Jose Fonseca
This should prevent disparity between features Mesa and LLVM
believe are supported by the CPU.

http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990

Tested on an i7-3720QM w/ LLVM 3.3 and 3.6.
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 34 ++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 72fab8c..7073956 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -498,6 +498,32 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
}
 
llvm::SmallVector MAttrs;
+   if (util_cpu_caps.has_sse) {
+  MAttrs.push_back("+sse");
+   }
+   if (util_cpu_caps.has_sse2) {
+  MAttrs.push_back("+sse2");
+   }
+   if (util_cpu_caps.has_sse3) {
+  MAttrs.push_back("+sse3");
+   }
+   if (util_cpu_caps.has_ssse3) {
+  MAttrs.push_back("+ssse3");
+   }
+   if (util_cpu_caps.has_sse4_1) {
+#if HAVE_LLVM >= 0x0304
+  MAttrs.push_back("+sse4.1");
+#else
+  MAttrs.push_back("+sse41");
+#endif
+   }
+   if (util_cpu_caps.has_sse4_2) {
+#if HAVE_LLVM >= 0x0304
+  MAttrs.push_back("+sse4.2");
+#else
+  MAttrs.push_back("+sse42");
+#endif
+   }
if (util_cpu_caps.has_avx) {
   /*
* AVX feature is not automatically detected from CPUID by the X86 target
@@ -509,8 +535,14 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
   if (util_cpu_caps.has_f16c) {
  MAttrs.push_back("+f16c");
   }
-  builder.setMAttrs(MAttrs);
+  if (util_cpu_caps.has_avx2) {
+ MAttrs.push_back("+avx2");
+  }
+   }
+   if (util_cpu_caps.has_altivec) {
+  MAttrs.push_back("+altivec");
}
+   builder.setMAttrs(MAttrs);
 
 #if HAVE_LLVM >= 0x0305
StringRef MCPU = llvm::sys::getHostCPUName();
-- 
2.1.4



Re: [Mesa-dev] [PATCH] gallivm: Translate all util_cpu_caps bits to LLVM attributes.

2015-10-22 Thread Jose Fonseca

On 21/10/15 17:35, Gustaw Smolarczyk wrote:

I am just a bystander, but I have one suggestion to this patch.

2015-10-21 18:25 GMT+02:00 Jose Fonseca :

This should prevent disparity between features Mesa and LLVM
believe are supported by the CPU.

http://lists.freedesktop.org/archives/mesa-dev/2015-October/thread.html#96990

Tested on an i7-3720QM w/ LLVM 3.3 and 3.6.
---
  src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 34 ++-
  1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index 72fab8c..7073956 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -498,6 +498,32 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
 }

 llvm::SmallVector MAttrs;


Maybe increase the size of the SmallVector here?

Gustaw


Good point. Will do. Thanks.

Jose



Re: [Mesa-dev] Introducing OpenSWR: High performance software rasterizer

2015-10-22 Thread Jose Fonseca

On 22/10/15 00:43, Rowley, Timothy O wrote:



On Oct 20, 2015, at 5:58 PM, Jose Fonseca  wrote:

Thanks for the explanations.  It's closer now, but still a bit of gap:

$ KNOB_MAX_THREADS_PER_CORE=0 ./gloss
SWR create screen!
This processor supports AVX2.
--> numThreads = 3
1102 frames in 5.002 seconds = 220.312 FPS
1133 frames in 5.001 seconds = 226.555 FPS
1130 frames in 5.002 seconds = 225.91 FPS
^C
$ GALLIUM_DRIVER=llvmpipe LP_NUM_THREADS=2 ./gloss
1456 frames in 5 seconds = 291.2 FPS
1617 frames in 5.003 seconds = 323.206 FPS
1571 frames in 5.002 seconds = 314.074 FPS


A bit more of an apples to apples comparison might be single-threaded llvmpipe 
(LP_NUM_THREADS=1) and single-threaded swr (KNOB_SINGLE_THREADED=1).  Running 
gloss and glxgears (another favorite “benchmark” :) ) under these conditions 
show swr running a bit slower, though a little closer than your numbers.



Indeed that seems a better comparison.

$ KNOB_SINGLE_THREADED=1 ./gloss
SWR create screen!
This processor supports AVX2.
733 frames in 5.003 seconds = 146.512 FPS
787 frames in 5.004 seconds = 157.274 FPS
793 frames in 5.005 seconds = 158.442 FPS
799 frames in 5.001 seconds = 159.768 FPS
787 frames in 5.005 seconds = 157.243 FPS
$ GALLIUM_DRIVER=llvmpipe LP_NUM_THREADS=0 ./gloss
939 frames in 5.002 seconds = 187.725 FPS
1032 frames in 5.001 seconds = 206.359 FPS
1017 frames in 5.002 seconds = 203.319 FPS
1021 frames in 5 seconds = 204.2 FPS
1039 frames in 5.002 seconds = 207.717 FPS

Examining performance traces, we think swr’s concept of hot-tiles, 
the working memory representation of the render target, and the 
associated load/store functions contribute to most of the difference. 
We might be able to optimize those conversions; additionally fast clear 
would help these demos.  For larger workloads this small per-frame cost 
doesn’t really affect the performance.



These initial observations from you and others regarding performance have been 
interesting.  Our performance work has been with large workloads on high core 
count configurations, where while some of the decisions such as a dedicated 
core for the application/API might have cost performance a bit, the percentage 
is much less than on the dual and quad core processors.  We’ll look into some 
changes/tuning that will benefit both extremes, though we might have to end up 
conceding that llvmpipe will be faster at glxgears. :-)


I don't care for gears -- it practically measures present/blit rate -- 
but gloss, despite being simple, is sensitive to texturing performance.



Final thoughts: I understand this project has its own history, but I echo what 
Roland said -- it would be nice to unify with llvmpipe at one point, in some 
way or fashion.  Our (VMware's) focus has been desktop composition, but there's 
no reason why a single SW renderer can't satisfy both ends of the spectrum, 
especially for JIT-enabled renderers, since they can emit at runtime the code 
most suited for the workload.


We would be happy for someone to take some of the ideas from swr to speed up 
llvmpipe, but for now our development will continue on the swr core and driver. 
 We’re not planning on replacing llvmpipe - its intent of working on any 
architecture is admirable.  In the ideal world the solution would be something 
that combines the best traits of both rasterizers, but at this point the 
shortest path to having a performant solution for our customers is with swr.


Fair enough.

They do share a lot already: Mesa, the gallium state tracker, and gallivm. 
If further development of openswr is planned, it might require jumping 
through a few hoops, but I think it's worth figuring out what it would 
take to get this merged into master so that, whenever there are 
interface changes, openswr won't get the short stick.



That said, it's really nice seeing Mesa and Gallium enabling this sort of 
experiment with SW rendering.


Yes, we were quite happy with how fast we were able to get a new driver 
functioning with gallium.  The major thing slowing us was the documentation, 
which is not uniform in coverage.  There was a lot of reading other drivers’ 
source to figure out how things were supposed to work.


Yes, that's a fair comment.

Jose


[Mesa-dev] [PATCH] gallivm: Explicitly disable unsupported CPU features.

2015-10-23 Thread Jose Fonseca
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92214
---
 src/gallium/auxiliary/gallivm/lp_bld_misc.cpp | 69 ---
 1 file changed, 31 insertions(+), 38 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp 
b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
index e70a75f..0781e36 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
+++ b/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp
@@ -498,50 +498,43 @@ 
lp_build_create_jit_compiler_for_module(LLVMExecutionEngineRef *OutJIT,
}
 
llvm::SmallVector MAttrs;
-   if (util_cpu_caps.has_sse) {
-  MAttrs.push_back("+sse");
-   }
-   if (util_cpu_caps.has_sse2) {
-  MAttrs.push_back("+sse2");
-   }
-   if (util_cpu_caps.has_sse3) {
-  MAttrs.push_back("+sse3");
-   }
-   if (util_cpu_caps.has_ssse3) {
-  MAttrs.push_back("+ssse3");
-   }
-   if (util_cpu_caps.has_sse4_1) {
+
+#if defined(PIPE_ARCH_X86) || defined(PIPE_ARCH_X86_64)
+   /*
+* We need to unset attributes because sometimes LLVM mistakenly assumes
+* certain features are present given the processor name.
+*
+* https://bugs.freedesktop.org/show_bug.cgi?id=92214
+*/
+   MAttrs.push_back(util_cpu_caps.has_sse? "+sse": "-sse"   );
+   MAttrs.push_back(util_cpu_caps.has_sse2   ? "+sse2"   : "-sse2"  );
+   MAttrs.push_back(util_cpu_caps.has_sse3   ? "+sse3"   : "-sse3"  );
+   MAttrs.push_back(util_cpu_caps.has_ssse3  ? "+ssse3"  : "-ssse3" );
 #if HAVE_LLVM >= 0x0304
-  MAttrs.push_back("+sse4.1");
+   MAttrs.push_back(util_cpu_caps.has_sse4_1 ? "+sse4.1" : "-sse4.1");
 #else
-  MAttrs.push_back("+sse41");
+   MAttrs.push_back(util_cpu_caps.has_sse4_1 ? "+sse41"  : "-sse41" );
 #endif
-   }
-   if (util_cpu_caps.has_sse4_2) {
 #if HAVE_LLVM >= 0x0304
-  MAttrs.push_back("+sse4.2");
+   MAttrs.push_back(util_cpu_caps.has_sse4_2 ? "+sse4.2" : "-sse4.2");
 #else
-  MAttrs.push_back("+sse42");
+   MAttrs.push_back(util_cpu_caps.has_sse4_2 ? "+sse42"  : "-sse42" );
 #endif
-   }
-   if (util_cpu_caps.has_avx) {
-  /*
-   * AVX feature is not automatically detected from CPUID by the X86 target
-   * yet, because the old (yet default) JIT engine is not capable of
-   * emitting the opcodes. On newer llvm versions it is and at least some
-   * versions (tested with 3.3) will emit avx opcodes without this anyway.
-   */
-  MAttrs.push_back("+avx");
-  if (util_cpu_caps.has_f16c) {
- MAttrs.push_back("+f16c");
-  }
-  if (util_cpu_caps.has_avx2) {
- MAttrs.push_back("+avx2");
-  }
-   }
-   if (util_cpu_caps.has_altivec) {
-  MAttrs.push_back("+altivec");
-   }
+   /*
+* AVX feature is not automatically detected from CPUID by the X86 target
+* yet, because the old (yet default) JIT engine is not capable of
+* emitting the opcodes. On newer llvm versions it is and at least some
+* versions (tested with 3.3) will emit avx opcodes without this anyway.
+*/
+   MAttrs.push_back(util_cpu_caps.has_avx  ? "+avx"  : "-avx");
+   MAttrs.push_back(util_cpu_caps.has_f16c ? "+f16c" : "-f16c");
+   MAttrs.push_back(util_cpu_caps.has_avx2 ? "+avx2" : "-avx2");
+#endif
+
+#if defined(PIPE_ARCH_PPC)
+   MAttrs.push_back(util_cpu_caps.has_altivec ? "+altivec" : "-altivec");
+#endif
+
builder.setMAttrs(MAttrs);
 
 #if HAVE_LLVM >= 0x0305
-- 
2.1.4



Re: [Mesa-dev] [PATCH 4/4] gallivm: fix tex offsets with mirror repeat linear

2015-10-23 Thread Jose Fonseca

On 22/10/15 23:42, srol...@vmware.com wrote:

From: Roland Scheidegger 

Can't see why anyone would ever want to use this, but it was clearly broken.
This fixes the piglit texwrap offset test using this combination.
---
  src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c | 9 +
  1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
index 125505e..26bfa0d 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample_soa.c
@@ -405,16 +405,17 @@ lp_build_sample_wrap_linear(struct 
lp_build_sample_context *bld,
break;

 case PIPE_TEX_WRAP_MIRROR_REPEAT:
+  if (offset) {
+ offset = lp_build_int_to_float(coord_bld, offset);
+ offset = lp_build_div(coord_bld, offset, length_f);
+ coord = lp_build_add(coord_bld, coord, offset);
+  }
/* compute mirror function */
coord = lp_build_coord_mirror(bld, coord);

/* scale coord to length */
coord = lp_build_mul(coord_bld, coord, length_f);
coord = lp_build_sub(coord_bld, coord, half);
-  if (offset) {
- offset = lp_build_int_to_float(coord_bld, offset);
- coord = lp_build_add(coord_bld, coord, offset);
-  }

/* convert to int, compute lerp weight */
lp_build_ifloor_fract(coord_bld, coord, &coord0, &weight);



Nice finds.  Series is

Reviewed-by: Jose Fonseca 


Re: [Mesa-dev] [PATCH] gallivm: disable f16c when not using AVX

2015-10-26 Thread Jose Fonseca

On 23/10/15 22:26, srol...@vmware.com wrote:

From: Roland Scheidegger 

f16c intrinsic can only be emitted when AVX is used. So when we disable AVX
due to forcing 128bit vectors we must not use this intrinsic (depending on
llvm version, this worked previously because llvm used AVX even when we didn't
tell it to, however I've seen this fail with llvm 3.3 since
718249843b915decf8fccec92e466ac1a6219934 which seems to have the side effect
of disabling avx in llvm albeit it only touches sse flags really).


Good catch.


Possibly one day should actually try to use avx even with 128bit vectors...


In the past we needed to override util_cpu_caps.has_avx on AVX capable 
machines but where old-JIT code.  But that's no longer the case: the min 
supported LLVM version is 3.3, which supports AVX both with MCJIT and 
old-JIT.



There, the only point of this code is to enable a developer to test SSE2 
code paths on an AVX-capable machine.


There's no other reason for someone to go out of his way to override 
LP_NATIVE_VECTOR_WIDTH of 256 with 128.



So maybe it's worth making this comment clear: the sole point is to 
enable SSE2 testing on AVX machines, and all avx flags, and flags which 
depend on avx, need to be masked out.



BTW the "For simulating less capable machines" code needs to be updated 
too (it's missing has_avx2=0).




---
  src/gallium/auxiliary/gallivm/lp_bld_init.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index 017d075..e6eede8 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -427,6 +427,7 @@ lp_build_init(void)
 */
util_cpu_caps.has_avx = 0;
util_cpu_caps.has_avx2 = 0;
+  util_cpu_caps.has_f16c = 0;
 }

  #ifdef PIPE_ARCH_PPC_64



Reviewed-by: Jose Fonseca 

Jose


Re: [Mesa-dev] [PATCH] gallivm: disable f16c when not using AVX

2015-10-26 Thread Jose Fonseca

On 26/10/15 14:58, Roland Scheidegger wrote:

Am 26.10.2015 um 10:02 schrieb Jose Fonseca:

On 23/10/15 22:26, srol...@vmware.com wrote:

From: Roland Scheidegger 

f16c intrinsic can only be emitted when AVX is used. So when we
disable AVX
due to forcing 128bit vectors we must not use this intrinsic
(depending on
llvm version, this worked previously because llvm used AVX even when
we didn't
tell it to, however I've seen this fail with llvm 3.3 since
718249843b915decf8fccec92e466ac1a6219934 which seems to have the side
effect
of disabling avx in llvm albeit it only touches sse flags really).


Good catch.


Possibly one day should actually try to use avx even with 128bit
vectors...


In the past we needed to override util_cpu_caps.has_avx on AVX capable
machines but where old-JIT code.  But that's no longer the case: the min
supported LLVM version is 3.3, which supports AVX both with MCJIT and
old-JIT.


There, the only point of this code is to enable a developer to test SSE2
code paths on an AVX-capable machine.

There's no other reason for someone to go out of his way to override
LP_NATIVE_VECTOR_WIDTH of 256 with 128.


So maybe it's worth making this comment clear: the sole point is to
enable SSE2 testing on AVX machines, and all avx flags, and flags which
depend on avx, need to be masked out.


Well that's not quite true. Forcing 128bit wide vectors will get you
faster shader compiles and less llvm memory usage. And in some odd cases
the compiled shaders aren't even slower. Disabling AVX on top of that
doesn't really change much there though things might be minimally slower
(of course, if you hit the things which actually depend on avx, like
f16c, that's a different story).

.


Though it's not really exposed much as a feature, I find it at least
as interesting for development to figure out whether shaders using 256bit
vectors are scaling appropriately compared to 128bit, rather than
SSE2 emulation.  And for the former case it makes sense to leave avx
enabled. You are right though that the initial idea was to essentially
force llvm for the compiled shader to look like it was compiled on a
less capable machine (albeit since we're setting the cpu type,
instruction scheduling will still be different).


So it sounds like LP_NATIVE_VECTOR_WIDTH is not expressive enough for all 
development test cases.  We probably want another env var to fake SSE2 
machines etc., and let LP_NATIVE_VECTOR_WIDTH be something orthogonal 
(that will, however, default to 128/256 based on the machine features).


Jose



Re: [Mesa-dev] MSVC (2015) builds

2015-11-03 Thread Jose Fonseca

On 03/11/15 15:48, Brian Paul wrote:

On 11/02/2015 08:42 PM, Janusz Ganczarski wrote:

Hello,
Attached are fixes for Visual C++ 2015 (VC 14) builds of Mesa.
Currently only the gallium softpipe driver is supported; gallium
llvmpipe driver support is a work in progress.


I'm not sure we're interested in MSVC project files (I'm certainly not).
  Unless we had several people who wanted to actively develop Mesa with
MSVC and promise to maintain MSVC support, I don't think this would be
used much and would eventually be dropped.  We've gone through this
several times in the past.

-Brian


I agree with Brian.  Maintaining MSVC projects would be a waste of time 
for us.  Even when we use MSVC we just use scons and MSVC command line 
compilers.  I never use MSVC IDE to develop or debug Mesa -- I do most 
of development in Linux, mostly with MinGW, and just build Mesa w/ MSVC 
when necessary.


Jose



[Mesa-dev] [PATCH] st/mesa: Destroy buffer object's mutex.

2015-11-09 Thread Jose Fonseca
Ideally we should have a _mesa_cleanup_buffer_object function in
src/mesa/bufferobj.c so that the destruction logic resided in a single
place.
---
 src/mesa/state_tracker/st_cb_bufferobjects.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/state_tracker/st_cb_bufferobjects.c 
b/src/mesa/state_tracker/st_cb_bufferobjects.c
index 8afd336..5d20b26 100644
--- a/src/mesa/state_tracker/st_cb_bufferobjects.c
+++ b/src/mesa/state_tracker/st_cb_bufferobjects.c
@@ -83,6 +83,7 @@ st_bufferobj_free(struct gl_context *ctx, struct 
gl_buffer_object *obj)
if (st_obj->buffer)
   pipe_resource_reference(&st_obj->buffer, NULL);
 
+   mtx_destroy(&st_obj->Base.Mutex);
free(st_obj->Base.Label);
free(st_obj);
 }
-- 
2.5.0



Re: [Mesa-dev] [PATCH] RFC: llvmpipe map scene buffers outside thread.

2015-11-10 Thread Jose Fonseca

On 09/11/15 03:58, Dave Airlie wrote:

From: Dave Airlie 

There might be a reason we do this inside the thread, but I'm not aware of it
yet; move stuff around and see if this jogs anyone's memory.


It might be a relic from the time when we had swizzled tiles.

Jose



Doing this outside the thread, at least with front buffer rendering, avoids
problems with XGetImage failing in the thread and deadlocking; now things
crash instead, which is a lot nicer from a piglit point of view.
---
  src/gallium/drivers/llvmpipe/lp_scene.c | 21 -
  src/gallium/drivers/llvmpipe/lp_scene.h |  5 +++--
  src/gallium/drivers/llvmpipe/lp_setup.c |  1 +
  3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_scene.c 
b/src/gallium/drivers/llvmpipe/lp_scene.c
index 2441b3c..1a6fe5c 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.c
+++ b/src/gallium/drivers/llvmpipe/lp_scene.c
@@ -147,7 +147,7 @@ lp_scene_bin_reset(struct lp_scene *scene, unsigned x, 
unsigned y)


  void
-lp_scene_begin_rasterization(struct lp_scene *scene)
+lp_scene_map_buffers(struct lp_scene *scene)
  {
 const struct pipe_framebuffer_state *fb = &scene->fb;
 int i;
@@ -200,16 +200,20 @@ lp_scene_begin_rasterization(struct lp_scene *scene)
 }
  }

-
+void
+lp_scene_begin_rasterization(struct lp_scene *scene)
+{
+   scene->started = true;
+}


  /**
   * Free all the temporary data in a scene.
   */
-void
-lp_scene_end_rasterization(struct lp_scene *scene )
+static void
+lp_scene_unmap_buffers(struct lp_scene *scene )
  {
-   int i, j;
+   int i;

 /* Unmap color buffers */
 for (i = 0; i < scene->fb.nr_cbufs; i++) {
@@ -232,7 +236,14 @@ lp_scene_end_rasterization(struct lp_scene *scene )
zsbuf->u.tex.first_layer);
scene->zsbuf.map = NULL;
 }
+}

+void
+lp_scene_end_rasterization(struct lp_scene *scene )
+{
+   int i, j;
+   lp_scene_unmap_buffers(scene);
+   scene->started = false;
 /* Reset all command lists:
  */
 for (i = 0; i < scene->tiles_x; i++) {
diff --git a/src/gallium/drivers/llvmpipe/lp_scene.h 
b/src/gallium/drivers/llvmpipe/lp_scene.h
index b1464bb..7ed38c9 100644
--- a/src/gallium/drivers/llvmpipe/lp_scene.h
+++ b/src/gallium/drivers/llvmpipe/lp_scene.h
@@ -178,6 +178,7 @@ struct lp_scene {

 struct cmd_bin tile[TILES_X][TILES_Y];
 struct data_block_list data;
+   boolean started;
  };


@@ -405,8 +406,8 @@ lp_scene_begin_rasterization(struct lp_scene *scene);
  void
  lp_scene_end_rasterization(struct lp_scene *scene );

-
-
+void
+lp_scene_map_buffers(struct lp_scene *scene);


  #endif /* LP_BIN_H */
diff --git a/src/gallium/drivers/llvmpipe/lp_setup.c 
b/src/gallium/drivers/llvmpipe/lp_setup.c
index 1778b13..df2c323 100644
--- a/src/gallium/drivers/llvmpipe/lp_setup.c
+++ b/src/gallium/drivers/llvmpipe/lp_setup.c
@@ -176,6 +176,7 @@ lp_setup_rasterize_scene( struct lp_setup_context *setup )
  * Certainly, lp_scene_end_rasterization() would need to be deferred too
  * and there's probably other bits why this doesn't actually work.
  */
+   lp_scene_map_buffers(scene);
 lp_rast_queue_scene(screen->rast, scene);
 lp_rast_finish(screen->rast);
 pipe_mutex_unlock(screen->rast_mutex);





Re: [Mesa-dev] [PATCH] egl: Pass the correct X visual depth to xcb_put_image().

2015-01-23 Thread Jose Fonseca

It looks like nobody really cares, so I'll take it as consent.

This only happens if X requires 24-bit visuals.  Maybe a minority of 
drivers do that; at least the Intel X driver does.


BTW, this should go to stable branches too.

Jose

On 19/01/15 23:09, Jose Fonseca wrote:

From: José Fonseca 

The dri2_x11_add_configs_for_visuals() function happily matches a 32-bit
EGLConfig with a 24-bit X visual.  However, it was passing a 32-bit
depth to xcb_put_image(), making the X server unhappy:

   https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

PS: I rarely use the Mesa DRI software rasterizers (I usually use the
non-DRI Xlib SW renderers), but every time I try them they seem broken
at some fundamental level.  I wonder if it's just me or if nobody truly
uses them on a daily basis.
---
  src/egl/drivers/dri2/platform_x11.c | 24 +---
  1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/src/egl/drivers/dri2/platform_x11.c 
b/src/egl/drivers/dri2/platform_x11.c
index dd88e90..cbcf6a7 100644
--- a/src/egl/drivers/dri2/platform_x11.c
+++ b/src/egl/drivers/dri2/platform_x11.c
@@ -49,8 +49,7 @@ dri2_x11_swap_interval(_EGLDriver *drv, _EGLDisplay *disp, 
_EGLSurface *surf,

  static void
  swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
- struct dri2_egl_surface * dri2_surf,
- int depth)
+ struct dri2_egl_surface * dri2_surf)
  {
 uint32_t   mask;
 const uint32_t function = GXcopy;
@@ -66,8 +65,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
 valgc[0] = function;
 valgc[1] = False;
 xcb_create_gc(dri2_dpy->conn, dri2_surf->swapgc, dri2_surf->drawable, 
mask, valgc);
-   dri2_surf->depth = depth;
-   switch (depth) {
+   switch (dri2_surf->depth) {
case 32:
case 24:
   dri2_surf->bytes_per_pixel = 4;
@@ -82,7 +80,7 @@ swrastCreateDrawable(struct dri2_egl_display * dri2_dpy,
   dri2_surf->bytes_per_pixel = 0;
   break;
default:
- _eglLog(_EGL_WARNING, "unsupported depth %d", depth);
+ _eglLog(_EGL_WARNING, "unsupported depth %d", dri2_surf->depth);
 }
  }

@@ -257,12 +255,6 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,
_eglError(EGL_BAD_ALLOC, "dri2->createNewDrawable");
goto cleanup_pixmap;
 }
-
-   if (dri2_dpy->dri2) {
-  xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
-   } else {
-  swrastCreateDrawable(dri2_dpy, dri2_surf, _eglGetConfigKey(conf, 
EGL_BUFFER_SIZE));
-   }

 if (type != EGL_PBUFFER_BIT) {
cookie = xcb_get_geometry (dri2_dpy->conn, dri2_surf->drawable);
@@ -275,9 +267,19 @@ dri2_x11_create_surface(_EGLDriver *drv, _EGLDisplay 
*disp, EGLint type,

dri2_surf->base.Width = reply->width;
dri2_surf->base.Height = reply->height;
+  dri2_surf->depth = reply->depth;
free(reply);
 }

+   if (dri2_dpy->dri2) {
+  xcb_dri2_create_drawable (dri2_dpy->conn, dri2_surf->drawable);
+   } else {
+  if (type == EGL_PBUFFER_BIT) {
+ dri2_surf->depth = _eglGetConfigKey(conf, EGL_BUFFER_SIZE);
+  }
+  swrastCreateDrawable(dri2_dpy, dri2_surf);
+   }
+
 /* we always copy the back buffer to front */
 dri2_surf->base.PostSubBufferSupportedNV = EGL_TRUE;






Re: [Mesa-dev] [PATCH] gallium/docs: fix docs wrt ARL/ARR/FLR

2015-01-29 Thread Jose Fonseca

On 29/01/15 19:40, srol...@vmware.com wrote:

From: Roland Scheidegger 

since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.
---
  src/gallium/docs/source/tgsi.rst | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index ff322e8..84b0ed6 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -48,13 +48,13 @@ used.

  .. math::

-  dst.x = \lfloor src.x\rfloor
+  dst.x = (int) \lfloor src.x\rfloor

-  dst.y = \lfloor src.y\rfloor
+  dst.y = (int) \lfloor src.y\rfloor

-  dst.z = \lfloor src.z\rfloor
+  dst.z = (int) \lfloor src.z\rfloor

-  dst.w = \lfloor src.w\rfloor
+  dst.w = (int) \lfloor src.w\rfloor


  .. opcode:: MOV - Move
@@ -313,8 +313,6 @@ This instruction replicates its result.

  .. opcode:: FLR - Floor

-This is identical to :opcode:`ARL`.
-
  .. math::

dst.x = \lfloor src.x\rfloor
@@ -637,13 +635,13 @@ This instruction replicates its result.

  .. math::

-  dst.x = round(src.x)
+  dst.x = (int) round(src.x)

-  dst.y = round(src.y)
+  dst.y = (int) round(src.y)

-  dst.z = round(src.z)
+  dst.z = (int) round(src.z)

-  dst.w = round(src.w)
+  dst.w = (int) round(src.w)


  .. opcode:: SSG - Set Sign



Looks good.

Jose


Re: [Mesa-dev] [PATCH] mesa: fix display list 8-byte alignment issue

2015-01-30 Thread Jose Fonseca

Looks good to me.


Just one minor suggestion: if we replaced

  sizeof(void *) == 8

with

  sizeof(void *) > sizeof(GLuint)

we would avoid the magic number 8 and make the code correct for any 
pointer size.


Jose



On 28/01/15 03:06, Brian Paul wrote:

The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new  _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.

Bugzilla: 
https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.freedesktop.org_show-5Fbug.cgi-3Fid-3D88662&d=AwIGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=zfmBZnnVGHeYde45pMKNnVyzeaZbdIqVLprmZCM2zzE&m=DaIFyU2hnmYCL1EaelhvLHTsOdyZV8y6pEVRwRkcp8Q&s=g3pos-5bc_Uu5plvnuQvIjeEJXLDgTr5eOznhu6o-Fo&e=
Cc: "10.4" 
---
  src/mesa/main/dlist.c   | 72 +
  src/mesa/main/dlist.h   |  3 ++
  src/mesa/vbo/vbo_save_api.c |  5 +++-
  3 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 138d360..dc6070b 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -487,6 +487,7 @@ typedef enum
 /* The following three are meta instructions */
 OPCODE_ERROR,/* raise compiled-in error */
 OPCODE_CONTINUE,
+   OPCODE_NOP,  /* No-op (used for 8-byte alignment) */
 OPCODE_END_OF_LIST,
 OPCODE_EXT_0
  } OpCode;
@@ -1012,13 +1013,16 @@ memdup(const void *src, GLsizei bytes)
   * Allocate space for a display list instruction (opcode + payload space).
   * \param opcode  the instruction opcode (OPCODE_* value)
   * \param bytes   instruction payload size (not counting opcode)
- * \return pointer to allocated memory (the opcode space)
+ * \param align8  does the payload need to be 8-byte aligned?
+ *This is only relevant in 64-bit environments.
+ * \return pointer to allocated memory (the payload will be at pointer+1)
   */
  static Node *
-dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes)
+dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes, bool align8)
  {
 const GLuint numNodes = 1 + (bytes + sizeof(Node) - 1) / sizeof(Node);
 const GLuint contNodes = 1 + POINTER_DWORDS;  /* size of continue info */
+   GLuint nopNode;
 Node *n;

 if (opcode < OPCODE_EXT_0) {
@@ -1032,7 +1036,19 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, 
GLuint bytes)
}
 }

-   if (ctx->ListState.CurrentPos + numNodes + contNodes > BLOCK_SIZE) {
+   if (sizeof(void *) == 8 && align8 && ctx->ListState.CurrentPos % 2 == 0) {
+  /* The opcode would get placed at node[0] and the payload would start
+   * at node[1].  But the payload needs to be at an even offset (8-byte
+   * multiple).
+   */
+  nopNode = 1;
+   }
+   else {
+  nopNode = 0;
+   }
+
+   if (ctx->ListState.CurrentPos + nopNode + numNodes + contNodes
+   > BLOCK_SIZE) {
/* This block is full.  Allocate a new block and chain to it */
Node *newblock;
n = ctx->ListState.CurrentBlock + ctx->ListState.CurrentPos;
@@ -1042,13 +1058,34 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, 
GLuint bytes)
   _mesa_error(ctx, GL_OUT_OF_MEMORY, "Building display list");
   return NULL;
}
+
+  /* a fresh block should be 8-byte aligned on 64-bit systems */
+  assert(((GLintptr) newblock) % sizeof(void *) == 0);
+
save_pointer(&n[1], newblock);
ctx->ListState.CurrentBlock = newblock;
ctx->ListState.CurrentPos = 0;
+
+  /* Display list nodes are always 4 bytes.  If we need 8-byte alignment
+   * we have to insert a NOP so that the payload of the real opcode lands
+   * on an even location:
+   *   node[0] = OPCODE_NOP
+   *   node[1] = OPCODE_x;
+   *   node[2] = start of payload
+   */
+  nopNode = sizeof(void *) == 8 && align8;
 }

 n = ctx->ListState.CurrentBlock + ctx->ListState.CurrentPos;
-   ctx->ListState.CurrentPos += numNodes;
+   if (nopNode) {
+  assert(ctx->ListState.CurrentPos % 2 == 0); /* even value */
+  n[0].opcode = OPCODE_NOP;
+  n++;
+  /* The "real" opcode will now be at an odd location and the payload
+   * will be at an even location.
+   */
+   }
+   ctx->ListState.CurrentPos += nopNode + numNodes;

 n[0].opcode = opcode;

@@ -1069,7 +1106,22 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, 
GLuint bytes)
  void *
  _mesa_dlist_alloc(struct 

Re: [Mesa-dev] [PATCH] mesa: Add new fast mtx_t mutex type for basic use cases

2015-01-30 Thread Jose Fonseca

On 29/01/15 17:14, Kristian Høgsberg wrote:

On Thu, Jan 29, 2015 at 6:36 AM, Emil Velikov  wrote:

On 28/01/15 05:08, Kristian Høgsberg wrote:

While modern pthread mutexes are very fast, they still incur a call to an
external DSO and overhead of the generality and features of pthread mutexes.
Most mutexes in mesa only need lock/unlock, and the idea here is that we can
inline the atomic operation and make the fast case just two instructions.
Mutexes are subtle and finicky to implement, so we carefully copy the
implementation from Ulrich Drepper's well-written and well-reviewed paper:

   "Futexes Are Tricky"
   
   http://www.akkadia.org/drepper/futex.pdf

We implement "mutex3", which gives us a mutex that has no syscalls on
uncontended lock or unlock.  Further, the uncontended case boils down to a
cmpxchg and an untaken branch and the uncontended unlock is just a locked decr
and an untaken branch.  We use __builtin_expect() to indicate that contention
is unlikely so that gcc will put the contention code out of the main code
flow.


I don't oppose the idea of a faster mutex.  But do you have some 
performance figures with this patch?  (It doesn't need to be a real-life 
app -- an artificial demo/benchmark would suffice.)


What I'd like to know is, is the performance improvement significant 
enough to at least justify the complexity of maintaining multiple 
mutex types across our code?


Because I never had the impression that mutexes were a bottleneck. 
Atomic reference counting is probably more of a problem.



A fast mutex only supports lock/unlock, can't be recursive or used with
condition variables.  We keep the pthread mutex implementation around as
full_mtx_t for the few places where we use condition variables or recursive
locking.  For platforms or compilers where futex and atomics aren't available,
mtx_t falls back to the pthread mutex.





The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
applications.  Most CPU bound cases are helped and some of our internal
bind_buffer_object heavy benchmarks gain up to 10%.


Hi Kristian,

Can I humbly ask that you split this into two patches - one that
introduces the new functions/struct and another one that uses them?
This way it'll be easier if/when things go crazy.

Also the patch seems to wonder between posix and win32
+ typedef full_mtx_t mtx_t;

and
+ typedef mtx_t fast_mtx_t;

Looks like a leftover from the "should I rename XX variables to fast*
or just one to full*" moment :)


Yeah, that's how it progressed :)  At first I called it fast_mtx_t and
planned on replacing simple uses of mtx_t one by one. Jordan suggested
that it'd be easier to make the regular mutex fast and then rename the
couple of places that use more feature than we provide.




I'm however strongly against having a non-standard mutex using a 
standard name like `mtx_t`.


The point of using C11 names for threading primitives was to enable us 
to implement Mesa using standard-looking C code.  The idea was that at 
one point we'd only use our C11 threads.h emulation where needed. 
Please keep in mind that if/when platforms start providing C11 threads.h 
we might be forced to use them instead of our own, as system/3rd party 
headers might start depending on them in their ABIs.


It is imperative that any non-standard mutexes use names that don't 
collide with C11 threads names.



Jose



Re: [Mesa-dev] killing off the address reg in tgsi

2015-01-30 Thread Jose Fonseca

On 29/01/15 21:20, Roland Scheidegger wrote:

Hi,

the address reg in tgsi is quite a nuisance. glsl-to-tgsi code assumes
that indirections can only be done through the address reg and has quite
some extra code to deal with this. Even though hardware and apis which
worked like that are definitely old by now.
Thus, I'm proposing the address reg be nuked. I am however not quite
sure what the implications for drivers are, other than I'm certain
llvmpipe can handle that already.
For that reason, I suspect at least initially a new cap bit would be
required so glsl-to-tgsi would skip the extra code.


A new cap might not be necessary -- supporting integer opcodes 
(PIPE_SHADER_CAP_INTEGERS) might suffice.


glsl-to-tgsi already varies its output depending on it.


I tend to think
longer term it would be great if it could be nuked completely, I am
however not sure if that is easily done with drivers for old hw (such as
r300) - I guess if necessary we could keep operations such as ARL (or
even ARR though clearly not UARL!) and just define them to be usable
with temp regs.

Opinions?

Roland


Jose



Re: [Mesa-dev] Fixes to build Mesa 10.4 against LLVM 3.5 on windows

2015-02-03 Thread Jose Fonseca

On 16/12/14 20:08, Rob Conde wrote:

I built Mesa 10.4 with LLVM 3.5 today and I had to make a couple of
fixes to get it to work:


 1. root\src\mesa\compiler.h

After:

#include "c99_compat.h" /* inline, __func__, etc. */

Add:

#ifdef _MSC_VER
#define __attribute__(a)
#endif

Otherwise you get a compile error in
root\build\windows-x86_64\mesa\program\program_parse.tab.c


This looks like a bug in the particular bison version you have.  I 
haven't seen this issue.


I'm using bison 2.4.1 and flex 2.5.35 on my MSVC builds.  And bison 
3.0.2 on Linux.  AFAICT, all generated program_parse.tab.c I've seen 
don't rely on __attribute__ for non-GNU compilers.


Which bison/flex version did you use?



 2. root\scons\llvm.py

Add 'LLVMBitReader' to libs list (i.e. 'LLVMBitWriter',
'LLVMBitReader', 'LLVMX86Disassembler', ...)

Otherwise you get a linker error on opengl32.dll


Thanks.  I pushed this change.

Jose



[Mesa-dev] Rename mesa/src/util (Was: gallium/util: add u_bit_scan64)

2015-02-04 Thread Jose Fonseca

This change broke MinGW/MSVC builds because ffsll is not available there.


There is a ffsll C fallback, but it's in src/mesa/main/imports.[ch].  So 
rather than duplicating it in src/gallium/auxiliary/util/u_math.h I'd 
prefer to move it to src/util.



And here lies the problem: what header name should be used for math helpers?


I think the filenames in src/util, and the directory itself, are poorly 
named for something that is meant to be included by so many other 
components:

- there is no unique prefix in most headers
- util/ clashes with src/gallium/auxiliary/util/


Hence I'd like to propose to:

- rename src/util to something unique (e.g., cgrt, for Common Graphics 
RunTime)


And maybe:

- prefix all header/source files in there with a cgrt_* unique prefix too

And maybe in the future

- use cgrt_* prefix for symbols too.


Jose



On 01/02/15 17:15, Marek Olšák wrote:

From: Marek Olšák 

Same as u_bit_scan, but for uint64_t.
---
  src/gallium/auxiliary/util/u_math.h | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_math.h 
b/src/gallium/auxiliary/util/u_math.h
index 19c7343..f5d3487 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/src/gallium/auxiliary/util/u_math.h
@@ -587,6 +587,13 @@ u_bit_scan(unsigned *mask)
 return i;
  }

+static INLINE int
+u_bit_scan64(uint64_t *mask)
+{
+   int i = ffsll(*mask) - 1;
+   *mask &= ~(1llu << i);
+   return i;
+}

  /**
   * Return float bits.





Re: [Mesa-dev] Rename mesa/src/util (Was: gallium/util: add u_bit_scan64)

2015-02-04 Thread Jose Fonseca

On 04/02/15 20:18, Kenneth Graunke wrote:

On Wednesday, February 04, 2015 02:04:38 PM Jose Fonseca wrote:

This change broke MinGW/MSVC builds because ffsll is not available there.


There is a ffsll C fallback, but it's in src/mesa/main/imports.[ch].  So
rather than duplicating it in src/gallium/auxiliary/util/u_math.h I'd
prefer move it to src/util.


And here lies the problem: what header name should be used for math helpers?


I think the filenames in src/util and the directory itself is poorly
named for something that is meant to be included by some many other
components:
- there is no unique prefix in most headers
- util/ clashes with src/gallium/auxiliary/util/


Hence I'd like to propose to:

- rename src/util to something unique (e.g, cgrt, for Common Graphics
RunTime

And maybe:

- prefix all header/source files in there with a cgrt_* unique prefix too

And maybe in the future

- use cgrt_* prefix for symbols too.


Jose


"util" is meant to be for shared utility across the entire code base -
both Mesa and Gallium.  It's been growing slowly as people move things
there.  It might make sense to move a lot of src/gallium/auxiliary/util
there, in fact - there's always been a lot of duplication between Mesa
and Gallium's utility code.  But that's up to the Gallium developers.

I think that "util" is precisely the right name.  If a new contributor
wants to find a hash table, or a set, or some macros...they're going to
look for utility code.  src/util is obviously named and easy to find.

I think any acronym like "cgrt" is going to confuse people.  src/cgrt
sounds like "some obscure part of the system I can ignore for now" -
easily overlooked, and what does the acronym mean anyway...

We chose not to add the "u_" prefix, partly for historical reasons
(Mesa never used one), but also specifically to avoid clashing with
src/gallium/auxiliary/util.  Most people don't put src/util in their
include path, and instead use #include "util/ralloc.h" - which already
is a prefix of sorts.  What additional value does "u_" provide?


We had src/util in the include path. But now we don't, so yes, header 
collision is less likely.


There's a problem with the symbols though -- gallium can (and is) 
embedded in other software systems -- it's not just used to make OpenGL 
drivers.


And if src/util is to be a dependency of gallium, it means it ends up 
being statically linked against other stuff too.   And if everybody just 
uses the most obvious header names, and the most obvious symbol names, 
it's just a matter of time until a collision happens.


That said, it looks the symbols so far in u_mesa



I think you should just invent a header name and put it there.  "math.h"
does sound fairly generic.  If you're just reimplementing things like
ffsll that are usually provided by your system, it might make sense to
call it something like "os_compat.h" (along the lines of c99_compat.h).

Or maybe Brian is right - we could just move Gallium's utility code to
src/util and use it everywhere.  It'd be nice to not have two sets.


To be clear: I'm all for moving as much code from src/gallium/auxiliary 
to src/util -- that's my objective here.


But I believe that not all code in src/gallium/auxiliary/util can be 
moved into src/util as some is gallium specific (depends on gallium 
types, helpers, etc), so merely moving files won't work generally: the 
gallium-specific stuff needs to stay behind and, therefore, must 
co-exist without colliding with the stuff that gets moved into src/util.


Even u_math.[ch] can't be trivially moved -- it depends on u_debug.[ch] 
which has a bunch of gallium specific stuff.


Moving all this in one go will be tricky.  Doing it piece by piece seems 
safer and more reliable.



I don't feel strongly about this.  But it's a matter of practicality: I 
can't afford take a week off my main work to move the bulk of 
src/gallium/auxiliary/util into src/util, but I can take a couple of 
hours to get a sub-module, or a subset of it.


Maybe I can approach it from a different angle: if I get things that 
are depended on by pretty much everything else in 
src/gallium/auxiliary/util, like p_config.h and u_debug.h, out of the 
way, then the rest will be easier to migrate...





Jose


[Mesa-dev] [PATCH] llvmpipe: Trivially advertise PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT.

2015-02-05 Thread Jose Fonseca
Nothing special needs to be done.

Even though llvmpipe copies constant (i.e. uniform) buffers internally, the
application is supposed to flush and sync, so all should work.

All bufferstorage piglit tests pass.
---
 src/gallium/drivers/llvmpipe/lp_screen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 3b31656..507cfcf 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -246,9 +246,10 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
   return PIPE_ENDIAN_NATIVE;
case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
   return 1;
+   case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
+  return 1;
case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
case PIPE_CAP_TEXTURE_GATHER_SM5:
-   case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_SAMPLE_SHADING:
case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
-- 
2.1.0



Re: [Mesa-dev] [PATCH] llvmpipe: Trivially advertise PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT.

2015-02-05 Thread Jose Fonseca

On 05/02/15 15:07, Roland Scheidegger wrote:

Am 05.02.2015 um 15:33 schrieb Jose Fonseca:

Nothing special needs to be done.

Even though llvmpipe copies constant (i.e. uniform) buffers internally, the
application is supposed to flush and sync, so all should work.

All bufferstorage piglit tests pass.
---
  src/gallium/drivers/llvmpipe/lp_screen.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c 
b/src/gallium/drivers/llvmpipe/lp_screen.c
index 3b31656..507cfcf 100644
--- a/src/gallium/drivers/llvmpipe/lp_screen.c
+++ b/src/gallium/drivers/llvmpipe/lp_screen.c
@@ -246,9 +246,10 @@ llvmpipe_get_param(struct pipe_screen *screen, enum 
pipe_cap param)
return PIPE_ENDIAN_NATIVE;
 case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
return 1;
+   case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
+  return 1;
 case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 case PIPE_CAP_TEXTURE_GATHER_SM5:
-   case PIPE_CAP_BUFFER_MAP_PERSISTENT_COHERENT:
 case PIPE_CAP_TEXTURE_QUERY_LOD:
 case PIPE_CAP_SAMPLE_SHADING:
 case PIPE_CAP_TEXTURE_GATHER_OFFSETS:



Looks good to me. I vaguely remember I didn't enable that because I
wasn't sure if it would really always work (things like using buffers as
render targets or as textures for instance). It's quite possible though
this isn't a problem (and the former isn't possible in GL in any case).

Roland



Thanks for the review.

Now that we don't swizzle textures/rendertargets it should be OK -- once 
the fences are signalled all buffer and even texture contents should be 
up to date.



My immediate interest is being able to run tests for apitrace and 
coherent mappings [1] [2].


If there is some bug in some obscure corner case we can deal with it 
when we come across it.



Jose

[1] 
https://github.com/apitrace/apitrace-tests/blob/master/apps/gl/map_coherent.cpp
[2] 
https://github.com/apitrace/apitrace/blob/master/docs/VMWX_map_buffer_debug.txt



Re: [Mesa-dev] Rename mesa/src/util (Was: gallium/util: add u_bit_scan64)

2015-02-07 Thread Jose Fonseca

On 07/02/15 00:10, Matt Turner wrote:

On Fri, Feb 6, 2015 at 3:58 PM, Emil Velikov  wrote:

"util" is meant to be for shared utility across the entire code base -
both Mesa and Gallium.  It's been growing slowly as people move things
there.  It might make sense to move a lot of src/gallium/auxiliary/util
there, in fact - there's always been a lot of duplication between Mesa
and Gallium's utility code.  But that's up to the Gallium developers.


Imho currently the util library is growing on the basis of "we can
share X let's throw it in there" rather than putting much thought
about the structure/"design" of it - unlike in gallium.


Are you serious? Let's be honest with ourselves. It probably would have 
been a better plan not to put commonly useful code deep in Gallium in 
the first place.


Historic reasons, as Brian explained.  Gallium was supposed to become a 
dependency of Mesa but it didn't pan out.



Apparently this is what I get for trying to do the right thing and pull
the atomics code out into a place the rest of the Mesa project can use
it.


I really appreciate you went the extra mile there.  And for me it's way 
more important that we start sharing code than the naming structure.


Especially when naming is subject to taste/style whereas code reuse is 
something everybody can readily agree on.


If the outcome of this email thread were to discourage you from sharing 
more code, then that would be the worst outcome indeed.


Anyway, let's get out of this criticism spiral, and instead focus on how 
we can solve the issues to everybody's satisfaction.



How about instead of an annoying bikeshed thread we just finish moving
bits of Gallium's util directory to src/util and be done with it?


If renaming src/util is not something we can agree on, fine.  Let's forget 
about it.



But I don't think I (or anybody) has the time to move 
src/gallium/auxiliary/util to src/util in one go.  The code is entangled 
with src/gallium/include.


That is, moving the whole src/gallium/auxiliary/util to src/util amounts 
to adding gallium as a dependency of the whole of Mesa.  If that's OK, then 
I agree with Brian's suggestion: might as well do that (leave util in 
src/gallium/auxiliary) and add src/gallium/* as an include/dependency 
everywhere.


I think for Mesa (src/mesa) this is fine.  I'm not sure about src/glsl.

Again, I suspect this won't be something we'll agree on either.



So I'm back to the beginning: I want to move some math helpers from 
src/gallium/auxiliary/util/u_math to somewhere inside src/util.  I need 
_some_ name: cgrt_*.h is no good, math.h would collide with standard C 
headers, u_math.h would collide with src/gallium/auxiliary/util, so it 
must be something else.  I'm open to suggestions.  If none I'll go with 
mathhelpers.h




Jose


Re: [Mesa-dev] mesa-10.4.4: BROKEN TLS support in GLX with llvm-toolchain v3.6.0rc2

2015-02-07 Thread Jose Fonseca

I think we decided not to support unreleased LLVM builds on stable releases.

This is because building without errors is not enough -- there are often 
other changes that need to go with this. Furthermore it's often a moving 
target.


In short, if you want to use bleeding edge LLVM, you must use bleeding 
edge Mesa too.


What I think might be worthwhile is to have a ceiling on the 
maximum supported LLVM version on a stable branch.  That is, fail 
configure if the user attempts to build with an LLVM that's too new.
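A ceiling like that could be a few lines in configure. An illustrative POSIX-sh sketch, with made-up function names rather than anything from the actual build system:

```shell
#!/bin/sh
# Illustrative sketch only: compare a detected LLVM version against a
# per-branch ceiling and report when it is too new.

version_to_int() {
    # "3.6.0" -> 306 (major * 100 + minor); enough for ordering here.
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    echo $((major * 100 + minor))
}

llvm_too_new() {
    # $1 = detected LLVM version, $2 = maximum supported on this branch
    [ "$(version_to_int "$1")" -gt "$(version_to_int "$2")" ]
}

if llvm_too_new "3.6.0" "3.5"; then
    echo "configure: error: LLVM 3.6.0 is newer than this stable branch supports (max 3.5)"
fi
```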


Jose

On 07/02/15 22:53, Sedat Dilek wrote:

Just as a hint:

You need to cherry-pick...

commit ef7e0b39a24966526b102643523feac765771842
"gallivm: Update for RTDyldMemoryManager becoming an unique_ptr."

...from mesa upstream to build v10.4.4 successfully.

- Sedat -


On Sat, Feb 7, 2015 at 11:42 PM, Sedat Dilek  wrote:

[ Please CC me I am not subscribed to mesa-dev and llvmdev MLs ]

Hi,

I already reported this when playing 1st time with my llvm-toolchain
v3.6.0rc2 and mesa v10.3.7 [1].
The issue still remains in mesa v10.4.4.

So, this is a field test to see if LLVM/Clang v3.6.0rc2 fits my needs.

I see the following build-error...
...

make[4]: Entering directory `/home/wearefam/src/mesa/mesa-git/src/mapi'
   CC shared_glapi_libglapi_la-entry.lo
clang version 3.6.0 (tags/RELEASE_360/rc2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.2
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
  "/opt/llvm-toolchain-3.6.0rc2/bin/clang" -cc1 -triple
x86_64-unknown-linux-gnu -emit-obj -mrelax-all -disable-free
-main-file-name entry.c -mrelocation-model static -mthread-model posix
-mdisable-fp-elim -relaxed-aliasing -fmath-errno -masm-verbose
-mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu
x86-64 -target-linker-version 2.22 -v -g -dwarf-column-info
-coverage-file /home/wearefam/src/mesa/mesa-git/src/mapi/entry.c
-resource-dir /opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0
-dependency-file .deps/shared_glapi_libglapi_la-entry.Tpo
-sys-header-deps -MP -MT shared_glapi_libglapi_la-entry.lo -D
"PACKAGE_NAME=\"Mesa\"" -D "PACKAGE_TARNAME=\"mesa\"" -D
"PACKAGE_VERSION=\"10.4.4\"" -D "PACKAGE_STRING=\"Mesa 10.4.4\"" -D
"PACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\""
-D "PACKAGE_URL=\"\"" -D "PACKAGE=\"mesa\"" -D "VERSION=\"10.4.4\"" -D
STDC_HEADERS=1 -D HAVE_SYS_TYPES_H=1 -D HAVE_SYS_STAT_H=1 -D
HAVE_STDLIB_H=1 -D HAVE_STRING_H=1 -D HAVE_MEMORY_H=1 -D
HAVE_STRINGS_H=1 -D HAVE_INTTYPES_H=1 -D HAVE_STDINT_H=1 -D
HAVE_UNISTD_H=1 -D HAVE_DLFCN_H=1 -D "LT_OBJDIR=\".libs/\"" -D
YYTEXT_POINTER=1 -D HAVE___BUILTIN_BSWAP32=1 -D
HAVE___BUILTIN_BSWAP64=1 -D HAVE___BUILTIN_CLZ=1 -D
HAVE___BUILTIN_CLZLL=1 -D HAVE___BUILTIN_CTZ=1 -D
HAVE___BUILTIN_EXPECT=1 -D HAVE___BUILTIN_FFS=1 -D
HAVE___BUILTIN_FFSLL=1 -D HAVE___BUILTIN_POPCOUNT=1 -D
HAVE___BUILTIN_POPCOUNTLL=1 -D HAVE___BUILTIN_UNREACHABLE=1 -D
HAVE_DLADDR=1 -D HAVE_PTHREAD=1 -D HAVE_LIBEXPAT=1 -D
USE_EXTERNAL_DXTN_LIB=1 -D _GNU_SOURCE -D USE_SSE41 -D DEBUG -D
USE_X86_64_ASM -D HAVE_XLOCALE_H -D HAVE_STRTOF -D HAVE_DLOPEN -D
HAVE_POSIX_MEMALIGN -D HAVE_LIBDRM -D GLX_USE_DRM -D HAVE_LIBUDEV -D
GLX_INDIRECT_RENDERING -D GLX_DIRECT_RENDERING -D GLX_USE_TLS -D
HAVE_ALIAS -D HAVE_MINCORE -D HAVE_LLVM=0x0306 -D LLVM_VERSION_PATCH=0
-D MAPI_MODE_GLAPI -D
"MAPI_ABI_HEADER=\"shared-glapi/glapi_mapi_tmp.h\"" -I . -I
../../include -I ../../src/mapi -I ../../src/mapi -I /opt/xorg/include
-internal-isystem /usr/local/include -internal-isystem
/opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0/include
-internal-externc-isystem /usr/include/x86_64-linux-gnu
-internal-externc-isystem /include -internal-externc-isystem
/usr/include -O0 -Wall -Werror=implicit-function-declaration
-Werror=missing-prototypes -std=c99 -fdebug-compilation-dir
/home/wearefam/src/mesa/mesa-git/src/mapi -ferror-limit 19
-fmessage-length 0 -pthread -mstackrealign -fobjc-runtime=gcc
-fdiagnostics-show-option -o entry.o -x c ../../src/mapi/entry.c
clang -cc1 version 3.6.0 based upon LLVM 3.6.0 default target
x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring duplicate directory "."
ignoring duplicate directory "."
#include "..." search starts here:
#include <...> search starts here:
  .
  ../../include
  /opt/xorg/include
  /usr/local/include
  /opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0/include
  /u

Re: [Mesa-dev] Rename mesa/src/util (Was: gallium/util: add u_bit_scan64)

2015-02-08 Thread Jose Fonseca
As I said earlier on this thread, it's not that simple: u_math.c depends on 
u_cpu_detect.c and more.  I wish it was a matter of merely moving u_math.[ch] 
to src/util, but it's not.

And because I don't believe I have the time to untangle u_math.[ch] from 
everything else, I'm restricted to more incremental solutions.  That is, add a 
new header to src/util with an unique name, and cherry-pick bits and pieces 
from u_math.[ch] as deemed necessary/practical.

Jose



From: Marek Olšák 
Sent: 08 February 2015 11:27
To: Jose Fonseca
Cc: Matt Turner; Emil Velikov; ML mesa-dev
Subject: Re: [Mesa-dev] Rename mesa/src/util (Was: gallium/util: add 
u_bit_scan64)

I kind of like the "util_" prefix everywhere. u_math only depends on
p_config.h and p_compiler.h. I don't think it would be hard to move
those two into src/util as well. We have always wanted Mesa to use
more of Gallium. This might be a good start.

Just my 2 cents.

Marek

On Sat, Feb 7, 2015 at 3:46 PM, Jose Fonseca  wrote:
> On 07/02/15 00:10, Matt Turner wrote:
>>
>> On Fri, Feb 6, 2015 at 3:58 PM, Emil Velikov 
>> wrote:
>>>>
>>>> "util" is meant to be for shared utility across the entire code base -
>>>> both Mesa and Gallium.  It's been growing slowly as people move things
>>>> there.  It might make sense to move a lot of src/gallium/auxiliary/util
>>>> there, in fact - there's always been a lot of duplication between Mesa
>>>> and Gallium's utility code.  But that's up to the Gallium developers.
>>>>
>>> Imho currently the util library is growing on the basis of "we can
>>> share X let's throw it in there" rather than putting much thought
>>> about the structure/"design" of it - unlike in gallium.
>>
>>
>> Are you serious? Let's be honest with ourselves. I probably would have
>> been a better plan to not put commonly useful code deep in Gallium in
>> the first place.
>
>
> Historic reasons, as Brian explained.  Gallium was supposed to become a
> dependency of Mesa but it didn't panned out.
>
>> Apparently this is what I get for trying to do the right thing an pull
>> the atomics code out into a place the rest of the Mesa project can use
>> it.
>
>
> I really appreciate you went the extra mile there.  And for me it's way more
> important that we start sharing code than the naming structure.
>
> Especially when naming is subject to test/style whereas code reuse is
> something everybody can readily agree on.
>
> If the outcome of this email thread would be to dicentivate you to share
> more code, then that would be worst outcome indeed.
>
> Anyway, let's get out of this criticism spiral, and instead focus on how we
> can solve the issues to everybody's satisfaction.
>
>> How about instead of an annoying bikeshed thread we just finish moving
>> bits of Gallium's util directory to src/util and be done with it?
>
>
> If renaming src/util is not something we can agree fine.  Let's forget about
> it.
>
>
> But I don't think I (or anybody) has the time to move
> src/gallium/auxiliary/util to src/util in one go.  The code is entangled
> with src/gallium/include .
>
> That is, moving the whole src/gaullium/auxiliary/util to src/util equals to
> add gallium as dependency to whole mesa.  If that's OK, then I agree with
> Brian's suggestion: might as well do that (leave util in
> src/gallium/axuliary ) and add src/gallium/* as includes/dependency
> everwhere.
>
> I think for Mesa (src/mesa) this is fine.  I'm not sure about src/glsl.
>
> Again, I suspect this won't be something we'll agree neither.
>
>
>
> So I'm back to the beginning: I want to move some math helpers from
> src/gallium/auxiliary/util/u_math to somewhere inside src/util.  I need
> _some_ name: cgrt_*.h is no good, math.h would collide with standard C
> headers, u_math.h would collide with src/gallium/auxiliary/util, so it must
> be something else.  I'm open to suggestions.  If none I'll go with
> mathhelpers.h
>
>
>
> Jose
>


Re: [Mesa-dev] [PATCH] util/u_atomic: Add new macro p_atomic_add

2015-02-09 Thread Jose Fonseca

On 06/02/15 22:39, Carl Worth wrote:

On Fri, Feb 06 2015, Aaron Watry wrote:

Ignore me if this is a stupid question, but should those both be
sizeof(short)?  I'd expect the first to be sizeof(char).


Not a stupid question. That was a copy-and-paste (kill-and-yank ?) bug
of mine.

Thanks for your attention to detail. I've fixed this in my tree.

-Carl


Hi Carl,

Just one more tweak to InterlockedExchangeAdd64 as per the patch attached. 
(The 64-bit intrinsics are only available on 64-bit builds, but the 
non-intrinsic version is available everywhere.)


With that u_test_atomic builds and passes for me on both 32- and 64-bit builds.

Sorry for the delay. And thanks for your help in keeping MSVC support on 
par.


Jose

diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
index 4eb0ec6..0c43410 100644
--- a/src/util/u_atomic.h
+++ b/src/util/u_atomic.h
@@ -147,10 +147,10 @@ char _InterlockedCompareExchange8(char volatile *Destination8, char Exchange8, c
  (assert(!"should not get here"), 0))
 
 #define p_atomic_add(_v, _i) (\
-   sizeof *(_v) == sizeof(short)   ? _InterlockedExchangeAdd8 ((char *)   (_v), (_i)) : \
+   sizeof *(_v) == sizeof(char)? _InterlockedExchangeAdd8 ((char *)   (_v), (_i)) : \
sizeof *(_v) == sizeof(short)   ? _InterlockedExchangeAdd16((short *)  (_v), (_i)) : \
sizeof *(_v) == sizeof(long)? _InterlockedExchangeAdd  ((long *)   (_v), (_i)) : \
-   sizeof *(_v) == sizeof(__int64) ? _InterlockedExchangeAdd64((__int64 *)(_v), (_i)) : \
+   sizeof *(_v) == sizeof(__int64) ? InterlockedExchangeAdd64 ((__int64 *)(_v), (_i)) : \
  (assert(!"should not get here"), 0))
 
 #define p_atomic_cmpxchg(_v, _old, _new) (\


Re: [Mesa-dev] [PATCH] gallium/util: Define ffsll on OpenBSD.

2015-02-09 Thread Jose Fonseca

On 09/02/15 16:59, Jon TURNEY wrote:

On 06/02/2015 19:58, Matt Turner wrote:

On Fri, Feb 6, 2015 at 3:38 AM, Jonathan Gray  wrote:

OpenBSD has ffs in libc but does not have ffsll so use the compiler
builtin.  PIPE_OS_BSD isn't suitable here as FreeBSD has ffsll in libc.

Signed-off-by: Jonathan Gray 
---
  src/gallium/auxiliary/util/u_math.h | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/src/gallium/auxiliary/util/u_math.h
b/src/gallium/auxiliary/util/u_math.h
index 5db5b66..ec282f3 100644
--- a/src/gallium/auxiliary/util/u_math.h
+++ b/src/gallium/auxiliary/util/u_math.h
@@ -531,6 +531,8 @@ unsigned ffs( unsigned u )
  #elif defined(__MINGW32__) || defined(PIPE_OS_ANDROID)
  #define ffs __builtin_ffs
  #define ffsll __builtin_ffsll
+#elif defined(__OpenBSD__)
+#define ffsll __builtin_ffsll
  #endif


Autoconf checks for presence of a bunch of builtins. Please use those
instead (in this case, HAVE___BUILTIN_FFSLL).


Yes, please.

This has just been 'fixed' for MinGW, now for OpenBSD, and also needs
fixing for Cygwin.




Attached is a patch which attempts to do this using autoconf checks.


The issue is that this will break SCons builds unless these checks are 
replicated there.  And SCons' implementation of configure checks is not 
great, to be honest -- the results are either cached (but in such a way 
that multiple builds from the same source tree pick up wrong values) or 
they need to be re-checked on every build (wasting time for incremental 
builds).


This is why, within reason, I personally prefer to avoid configure checks 
when practical.



So for now I'd prefer to leave MinGW 'fixed' as you put it.

But feel free to fix the other platforms as you propose.


BTW, isn't there a standard include that defines ffsll as a macro or 
inline function on top of __builtin_ffsll for systems that support it?



Jose



Re: [Mesa-dev] mesa-10.4.4: BROKEN TLS support in GLX with llvm-toolchain v3.6.0rc2

2015-02-09 Thread Jose Fonseca

On 09/02/15 17:44, Emil Velikov wrote:

Hi Sedat,

On 07/02/15 22:42, Sedat Dilek wrote:

[ Please CC me I am not subscribed to mesa-dev and llvmdev MLs ]

Hi,

I already reported this when playing 1st time with my llvm-toolchain
v3.6.0rc2 and mesa v10.3.7 [1].
The issue still remains in mesa v10.4.4.

So, this is a field test to see if LLVM/Clang v3.6.0rc2 fits my needs.

I see the following build-error...
...

make[4]: Entering directory `/home/wearefam/src/mesa/mesa-git/src/mapi'
   CC shared_glapi_libglapi_la-entry.lo
clang version 3.6.0 (tags/RELEASE_360/rc2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.2
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
  "/opt/llvm-toolchain-3.6.0rc2/bin/clang" -cc1 -triple
x86_64-unknown-linux-gnu -emit-obj -mrelax-all -disable-free
-main-file-name entry.c -mrelocation-model static -mthread-model posix
-mdisable-fp-elim -relaxed-aliasing -fmath-errno -masm-verbose
-mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu
x86-64 -target-linker-version 2.22 -v -g -dwarf-column-info
-coverage-file /home/wearefam/src/mesa/mesa-git/src/mapi/entry.c
-resource-dir /opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0
-dependency-file .deps/shared_glapi_libglapi_la-entry.Tpo
-sys-header-deps -MP -MT shared_glapi_libglapi_la-entry.lo -D
"PACKAGE_NAME=\"Mesa\"" -D "PACKAGE_TARNAME=\"mesa\"" -D
"PACKAGE_VERSION=\"10.4.4\"" -D "PACKAGE_STRING=\"Mesa 10.4.4\"" -D
"PACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\""
-D "PACKAGE_URL=\"\"" -D "PACKAGE=\"mesa\"" -D "VERSION=\"10.4.4\"" -D
STDC_HEADERS=1 -D HAVE_SYS_TYPES_H=1 -D HAVE_SYS_STAT_H=1 -D
HAVE_STDLIB_H=1 -D HAVE_STRING_H=1 -D HAVE_MEMORY_H=1 -D
HAVE_STRINGS_H=1 -D HAVE_INTTYPES_H=1 -D HAVE_STDINT_H=1 -D
HAVE_UNISTD_H=1 -D HAVE_DLFCN_H=1 -D "LT_OBJDIR=\".libs/\"" -D
YYTEXT_POINTER=1 -D HAVE___BUILTIN_BSWAP32=1 -D
HAVE___BUILTIN_BSWAP64=1 -D HAVE___BUILTIN_CLZ=1 -D
HAVE___BUILTIN_CLZLL=1 -D HAVE___BUILTIN_CTZ=1 -D
HAVE___BUILTIN_EXPECT=1 -D HAVE___BUILTIN_FFS=1 -D
HAVE___BUILTIN_FFSLL=1 -D HAVE___BUILTIN_POPCOUNT=1 -D
HAVE___BUILTIN_POPCOUNTLL=1 -D HAVE___BUILTIN_UNREACHABLE=1 -D
HAVE_DLADDR=1 -D HAVE_PTHREAD=1 -D HAVE_LIBEXPAT=1 -D
USE_EXTERNAL_DXTN_LIB=1 -D _GNU_SOURCE -D USE_SSE41 -D DEBUG -D
USE_X86_64_ASM -D HAVE_XLOCALE_H -D HAVE_STRTOF -D HAVE_DLOPEN -D
HAVE_POSIX_MEMALIGN -D HAVE_LIBDRM -D GLX_USE_DRM -D HAVE_LIBUDEV -D
GLX_INDIRECT_RENDERING -D GLX_DIRECT_RENDERING -D GLX_USE_TLS -D
HAVE_ALIAS -D HAVE_MINCORE -D HAVE_LLVM=0x0306 -D LLVM_VERSION_PATCH=0
-D MAPI_MODE_GLAPI -D
"MAPI_ABI_HEADER=\"shared-glapi/glapi_mapi_tmp.h\"" -I . -I
../../include -I ../../src/mapi -I ../../src/mapi -I /opt/xorg/include
-internal-isystem /usr/local/include -internal-isystem
/opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0/include
-internal-externc-isystem /usr/include/x86_64-linux-gnu
-internal-externc-isystem /include -internal-externc-isystem
/usr/include -O0 -Wall -Werror=implicit-function-declaration
-Werror=missing-prototypes -std=c99 -fdebug-compilation-dir
/home/wearefam/src/mesa/mesa-git/src/mapi -ferror-limit 19
-fmessage-length 0 -pthread -mstackrealign -fobjc-runtime=gcc
-fdiagnostics-show-option -o entry.o -x c ../../src/mapi/entry.c
clang -cc1 version 3.6.0 based upon LLVM 3.6.0 default target
x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring duplicate directory "."
ignoring duplicate directory "."
#include "..." search starts here:
#include <...> search starts here:
  .
  ../../include
  /opt/xorg/include
  /usr/local/include
  /opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0/include
  /usr/include/x86_64-linux-gnu
  /usr/include
End of search list.
In file included from ../../src/mapi/entry.c:49:
./entry_x86-64_tls.h:66:1: warning: tentative array definition assumed
to have one element
x86_64_entry_start[];
^
fatal error: error in backend: symbol 'x86_64_entry_start' is already defined
clang: error: clang frontend command failed with exit code 70 (use -v
to see invocation)
clang version 3.6.0 (tags/RELEASE_360/rc2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
clang: note: diagnostic msg: PLEASE submit a bug report to
http://llvm.org/bugs/

Re: [Mesa-dev] mesa-10.4.4: BROKEN TLS support in GLX with llvm-toolchain v3.6.0rc2

2015-02-10 Thread Jose Fonseca

On 09/02/15 23:25, Sedat Dilek wrote:

On Mon, Feb 9, 2015 at 9:51 PM, Jose Fonseca  wrote:

On 09/02/15 17:44, Emil Velikov wrote:


Hi Sedat,

On 07/02/15 22:42, Sedat Dilek wrote:


[ Please CC me I am not subscribed to mesa-dev and llvmdev MLs ]

Hi,

I already reported this when playing 1st time with my llvm-toolchain
v3.6.0rc2 and mesa v10.3.7 [1].
The issue still remains in mesa v10.4.4.

So, this is a field test to see if LLVM/Clang v3.6.0rc2 fits my needs.

I see the following build-error...
...

make[4]: Entering directory `/home/wearefam/src/mesa/mesa-git/src/mapi'
CC shared_glapi_libglapi_la-entry.lo
clang version 3.6.0 (tags/RELEASE_360/rc2)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.2
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
   "/opt/llvm-toolchain-3.6.0rc2/bin/clang" -cc1 -triple
x86_64-unknown-linux-gnu -emit-obj -mrelax-all -disable-free
-main-file-name entry.c -mrelocation-model static -mthread-model posix
-mdisable-fp-elim -relaxed-aliasing -fmath-errno -masm-verbose
-mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu
x86-64 -target-linker-version 2.22 -v -g -dwarf-column-info
-coverage-file /home/wearefam/src/mesa/mesa-git/src/mapi/entry.c
-resource-dir /opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0
-dependency-file .deps/shared_glapi_libglapi_la-entry.Tpo
-sys-header-deps -MP -MT shared_glapi_libglapi_la-entry.lo -D
"PACKAGE_NAME=\"Mesa\"" -D "PACKAGE_TARNAME=\"mesa\"" -D
"PACKAGE_VERSION=\"10.4.4\"" -D "PACKAGE_STRING=\"Mesa 10.4.4\"" -D

"PACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa\""

-D "PACKAGE_URL=\"\"" -D "PACKAGE=\"mesa\"" -D "VERSION=\"10.4.4\"" -D
STDC_HEADERS=1 -D HAVE_SYS_TYPES_H=1 -D HAVE_SYS_STAT_H=1 -D
HAVE_STDLIB_H=1 -D HAVE_STRING_H=1 -D HAVE_MEMORY_H=1 -D
HAVE_STRINGS_H=1 -D HAVE_INTTYPES_H=1 -D HAVE_STDINT_H=1 -D
HAVE_UNISTD_H=1 -D HAVE_DLFCN_H=1 -D "LT_OBJDIR=\".libs/\"" -D
YYTEXT_POINTER=1 -D HAVE___BUILTIN_BSWAP32=1 -D
HAVE___BUILTIN_BSWAP64=1 -D HAVE___BUILTIN_CLZ=1 -D
HAVE___BUILTIN_CLZLL=1 -D HAVE___BUILTIN_CTZ=1 -D
HAVE___BUILTIN_EXPECT=1 -D HAVE___BUILTIN_FFS=1 -D
HAVE___BUILTIN_FFSLL=1 -D HAVE___BUILTIN_POPCOUNT=1 -D
HAVE___BUILTIN_POPCOUNTLL=1 -D HAVE___BUILTIN_UNREACHABLE=1 -D
HAVE_DLADDR=1 -D HAVE_PTHREAD=1 -D HAVE_LIBEXPAT=1 -D
USE_EXTERNAL_DXTN_LIB=1 -D _GNU_SOURCE -D USE_SSE41 -D DEBUG -D
USE_X86_64_ASM -D HAVE_XLOCALE_H -D HAVE_STRTOF -D HAVE_DLOPEN -D
HAVE_POSIX_MEMALIGN -D HAVE_LIBDRM -D GLX_USE_DRM -D HAVE_LIBUDEV -D
GLX_INDIRECT_RENDERING -D GLX_DIRECT_RENDERING -D GLX_USE_TLS -D
HAVE_ALIAS -D HAVE_MINCORE -D HAVE_LLVM=0x0306 -D LLVM_VERSION_PATCH=0
-D MAPI_MODE_GLAPI -D
"MAPI_ABI_HEADER=\"shared-glapi/glapi_mapi_tmp.h\"" -I . -I
../../include -I ../../src/mapi -I ../../src/mapi -I /opt/xorg/include
-internal-isystem /usr/local/include -internal-isystem
/opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0/include
-internal-externc-isystem /usr/include/x86_64-linux-gnu
-internal-externc-isystem /include -internal-externc-isystem
/usr/include -O0 -Wall -Werror=implicit-function-declaration
-Werror=missing-prototypes -std=c99 -fdebug-compilation-dir
/home/wearefam/src/mesa/mesa-git/src/mapi -ferror-limit 19
-fmessage-length 0 -pthread -mstackrealign -fobjc-runtime=gcc
-fdiagnostics-show-option -o entry.o -x c ../../src/mapi/entry.c
clang -cc1 version 3.6.0 based upon LLVM 3.6.0 default target
x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
ignoring duplicate directory "."
ignoring duplicate directory "."
#include "..." search starts here:
#include <...> search starts here:
   .
   ../../include
   /opt/xorg/include
   /usr/local/include
   /opt/llvm-toolchain-3.6.0rc2/bin/../lib/clang/3.6.0/include
   /usr/include/x86_64-linux-gnu
   /usr/include
End of search list.
In file included from ../../src/mapi/entry.c:49:
./entry_x86-64_tls.h:66:1: warning: tentative array definition assumed
to have one element
x86_64_entry_start[];
^
fatal error: error in backend: symbol 'x86_64_entry_start' is already
defined
clang: error: clang frontend command failed with exit cod

[Mesa-dev] [PATCH 1/2] util/u_atomic: Test p_atomic_add() for 8bit integers.

2015-02-12 Thread Jose Fonseca
---
 src/util/u_atomic_test.c | 32 +---
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/src/util/u_atomic_test.c b/src/util/u_atomic_test.c
index c506275..8bddf8d 100644
--- a/src/util/u_atomic_test.c
+++ b/src/util/u_atomic_test.c
@@ -37,8 +37,9 @@
 #include "u_atomic.h"
 
 
-#define test_atomic_cmpxchg(type, ones) \
-   static void test_atomic_cmpxchg_##type (void) { \
+/* Test operations that are supported for all types, including 8 bits types */
+#define test_atomic_8bits(type, ones) \
+   static void test_atomic_8bits_##type (void) { \
   type v, r; \
   \
   p_atomic_set(&v, ones); \
@@ -55,18 +56,24 @@
   assert(v == 0 && "p_atomic_cmpxchg"); \
   assert(r == ones && "p_atomic_cmpxchg"); \
   \
+  v = 23; \
+  p_atomic_add(&v, 42); \
+  r = p_atomic_read(&v); \
+  assert(r == 65 && "p_atomic_add"); \
+  \
   (void) r; \
}
 
 
+/* Test operations that are not supported for 8 bits types */
 #define test_atomic(type, ones) \
-   test_atomic_cmpxchg(type, ones) \
+   test_atomic_8bits(type, ones) \
\
static void test_atomic_##type (void) { \
   type v, r; \
   bool b; \
   \
-  test_atomic_cmpxchg_##type(); \
+  test_atomic_8bits_##type(); \
   \
   v = 2; \
   b = p_atomic_dec_zero(&v); \
@@ -97,11 +104,6 @@
   assert(v == ones && "p_atomic_dec_return"); \
   assert(r == v && "p_atomic_dec_return"); \
   \
-  v = 23; \
-  p_atomic_add(&v, 42); \
-  r = p_atomic_read(&v); \
-  assert(r == 65 && "p_atomic_add"); \
-  \
   (void) r; \
   (void) b; \
}
@@ -117,9 +119,9 @@ test_atomic(uint32_t, UINT32_C(0x))
 test_atomic(int64_t, INT64_C(-1))
 test_atomic(uint64_t, UINT64_C(0x))
 
-test_atomic_cmpxchg(int8_t, INT8_C(-1))
-test_atomic_cmpxchg(uint8_t, UINT8_C(0xff))
-test_atomic_cmpxchg(bool, true)
+test_atomic_8bits(int8_t, INT8_C(-1))
+test_atomic_8bits(uint8_t, UINT8_C(0xff))
+test_atomic_8bits(bool, true)
 
 int
 main()
@@ -134,9 +136,9 @@ main()
test_atomic_int64_t();
test_atomic_uint64_t();
 
-   test_atomic_cmpxchg_int8_t();
-   test_atomic_cmpxchg_uint8_t();
-   test_atomic_cmpxchg_bool();
+   test_atomic_8bits_int8_t();
+   test_atomic_8bits_uint8_t();
+   test_atomic_8bits_bool();
 
return 0;
 }
-- 
2.1.0



[Mesa-dev] [PATCH 2/2] util/u_atomic: Add _InterlockedExchangeAdd8/16 for older MSVC.

2015-02-12 Thread Jose Fonseca
We need to build certain parts of Mesa (namely gallium, llvmpipe, and
therefore util) with Windows SDK 7.0.7600, which includes MSVC 2008.
---
 src/util/u_atomic.h | 32 ++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
index e123e17..f54def3 100644
--- a/src/util/u_atomic.h
+++ b/src/util/u_atomic.h
@@ -88,7 +88,7 @@
 
 #if _MSC_VER < 1600
 
-/* Implement _InterlockedCompareExchange8 in terms of InterlockedCompareExchange16 */
+/* Implement _InterlockedCompareExchange8 in terms of _InterlockedCompareExchange16 */
 static __inline
char _InterlockedCompareExchange8(char volatile *Destination8, char Exchange8, char Comparand8)
 {
@@ -103,7 +103,7 @@ char _InterlockedCompareExchange8(char volatile 
*Destination8, char Exchange8, c
* neighboring byte untouched */
   short Exchange16 = (Initial16 & ~Mask8) | ((short)Exchange8 << Shift8);
   short Comparand16 = Initial16;
-  short Initial16 = InterlockedCompareExchange16(Destination16, Exchange16, Comparand16);
+  short Initial16 = _InterlockedCompareExchange16(Destination16, Exchange16, Comparand16);
   if (Initial16 == Comparand16) {
  /* succeeded */
  return Comparand8;
@@ -114,6 +114,34 @@ char _InterlockedCompareExchange8(char volatile 
*Destination8, char Exchange8, c
return Initial8;
 }
 
+/* Implement _InterlockedExchangeAdd16 in terms of _InterlockedCompareExchange16 */
+static __inline
+short _InterlockedExchangeAdd16(short volatile *Addend, short Value)
+{
+   short Initial = *Addend;
+   short Comparand;
+   do {
+  short Exchange = Initial + Value;
+  Comparand = Initial;
+  Initial = _InterlockedCompareExchange16(Addend, Exchange, Comparand);
+   } while(Initial != Comparand);
+   return Comparand;
+}
+
+/* Implement _InterlockedExchangeAdd8 in terms of _InterlockedCompareExchange8 */
+static __inline
+char _InterlockedExchangeAdd8(char volatile *Addend, char Value)
+{
+   char Initial = *Addend;
+   char Comparand;
+   do {
+  char Exchange = Initial + Value;
+  Comparand = Initial;
+  Initial = _InterlockedCompareExchange8(Addend, Exchange, Comparand);
+   } while(Initial != Comparand);
+   return Comparand;
+}
+
 #endif /* _MSC_VER < 1600 */
 
 /* MSVC supports decltype keyword, but it's only supported on C++ and doesn't
-- 
2.1.0



Re: [Mesa-dev] [PATCH 2/2] util/u_atomic: Add _InterlockedExchangeAdd8/16 for older MSVC.

2015-02-12 Thread Jose Fonseca

On 12/02/15 17:03, Brian Paul wrote:

On 02/12/2015 09:27 AM, Jose Fonseca wrote:

We need to build certain parts of Mesa (namely gallium, llvmpipe, and
therefore util) with Windows SDK 7.0.7600, which includes MSVC 2008.
---
  src/util/u_atomic.h | 32 ++--
  1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/util/u_atomic.h b/src/util/u_atomic.h
index e123e17..f54def3 100644
--- a/src/util/u_atomic.h
+++ b/src/util/u_atomic.h
@@ -88,7 +88,7 @@

  #if _MSC_VER < 1600

-/* Implement _InterlockedCompareExchange8 in terms of InterlockedCompareExchange16 */
+/* Implement _InterlockedCompareExchange8 in terms of _InterlockedCompareExchange16 */
  static __inline
  char _InterlockedCompareExchange8(char volatile *Destination8, char
Exchange8, char Comparand8)


static __inline char
_InterlockedCompareExchange8(...)



  {
@@ -103,7 +103,7 @@ char _InterlockedCompareExchange8(char volatile
*Destination8, char Exchange8, c
 * neighboring byte untouched */
short Exchange16 = (Initial16 & ~Mask8) | ((short)Exchange8 <<
Shift8);
short Comparand16 = Initial16;
-  short Initial16 = InterlockedCompareExchange16(Destination16, Exchange16, Comparand16);
+  short Initial16 = _InterlockedCompareExchange16(Destination16, Exchange16, Comparand16);
if (Initial16 == Comparand16) {
   /* succeeded */
   return Comparand8;
@@ -114,6 +114,34 @@ char _InterlockedCompareExchange8(char volatile
*Destination8, char Exchange8, c
 return Initial8;
  }

+/* Implement _InterlockedExchangeAdd16 in terms of _InterlockedCompareExchange16 */
+static __inline
+short _InterlockedExchangeAdd16(short volatile *Addend, short Value)


same thing.


+{
+   short Initial = *Addend;
+   short Comparand;
+   do {
+  short Exchange = Initial + Value;
+  Comparand = Initial;
+  Initial = _InterlockedCompareExchange16(Addend, Exchange,
Comparand);
+   } while(Initial != Comparand);
+   return Comparand;
+}


I had to stare at this for quite a while.  It's kind of a mind bender
(at least to me).  I found it helpful to add a comment:

   /* if *Addend==Comparand then *Addend=Exchange, return original
*Addend */

before the _InterlockedCompareExchange16() call.



+
+/* Implement _InterlockedExchangeAdd8 in terms of _InterlockedCompareExchange8 */
+static __inline
+char _InterlockedExchangeAdd8(char volatile *Addend, char Value)
+{
+   char Initial = *Addend;
+   char Comparand;
+   do {
+  char Exchange = Initial + Value;
+  Comparand = Initial;
+  Initial = _InterlockedCompareExchange8(Addend, Exchange,
Comparand);
+   } while(Initial != Comparand);
+   return Comparand;
+}
+
  #endif /* _MSC_VER < 1600 */

  /* MSVC supports decltype keyword, but it's only supported on C++
and doesn't



Thanks for the review.  I'll change as you suggested.


Why do the local variables start with upper case letters?  That's not
our usual style.


It matches the prototype documentation on MSDN -- whenever I work with 
Windows APIs I instinctively switch to Microsoft's style.


But I agree the readability of capitalized vars is not great.  I'll push a 
follow-on change switching the vars to lowercase.



Anyway,
Reviewed-by: Brian Paul 




Jose


Re: [Mesa-dev] [PATCH 1/7] mesa: Add gallium include dirs to more parts of the tree.

2015-02-12 Thread Jose Fonseca

Thanks for doing this. I appreciate it.

I have no objection to the series.  I'm happy to see more reuse.  We 
can always move things around later, and it will be much easier once 
things are less entangled/duplicated.


We'll need to update the SCons include paths too.  If you have a git repo 
with your series that I can pull from, I'll give it a go tomorrow and 
provide a patch for it.


Jose

On 12/02/15 00:48, Eric Anholt wrote:

---
  src/glsl/Makefile.am | 2 ++
  src/mesa/drivers/dri/common/Makefile.am  | 2 ++
  src/mesa/drivers/dri/i915/Makefile.am| 2 ++
  src/mesa/drivers/dri/i965/Makefile.am| 2 ++
  src/mesa/drivers/dri/nouveau/Makefile.am | 2 ++
  src/mesa/drivers/dri/r200/Makefile.am| 2 ++
  src/mesa/drivers/dri/radeon/Makefile.am  | 2 ++
  src/mesa/drivers/dri/swrast/Makefile.am  | 2 ++
  src/util/Makefile.am | 2 ++
  9 files changed, 18 insertions(+)

diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index e89a9ad..75a5b13 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -26,6 +26,8 @@ AM_CPPFLAGS = \
-I$(top_srcdir)/src \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/glsl/glcpp \
-I$(top_srcdir)/src/glsl/nir \
-I$(top_srcdir)/src/gtest/include \
diff --git a/src/mesa/drivers/dri/common/Makefile.am b/src/mesa/drivers/dri/common/Makefile.am
index af6f742..da8f97a 100644
--- a/src/mesa/drivers/dri/common/Makefile.am
+++ b/src/mesa/drivers/dri/common/Makefile.am
@@ -30,6 +30,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/ \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
$(DEFINES) \
$(EXPAT_CFLAGS) \
$(VISIBILITY_CFLAGS)
diff --git a/src/mesa/drivers/dri/i915/Makefile.am b/src/mesa/drivers/dri/i915/Makefile.am
index ac49360..822f74c 100644
--- a/src/mesa/drivers/dri/i915/Makefile.am
+++ b/src/mesa/drivers/dri/i915/Makefile.am
@@ -28,6 +28,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/ \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
-I$(top_builddir)/src/mesa/drivers/dri/common \
diff --git a/src/mesa/drivers/dri/i965/Makefile.am b/src/mesa/drivers/dri/i965/Makefile.am
index 07eefce..5d33159 100644
--- a/src/mesa/drivers/dri/i965/Makefile.am
+++ b/src/mesa/drivers/dri/i965/Makefile.am
@@ -28,6 +28,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/ \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/intel/server \
-I$(top_srcdir)/src/gtest/include \
diff --git a/src/mesa/drivers/dri/nouveau/Makefile.am b/src/mesa/drivers/dri/nouveau/Makefile.am
index f302864..61af95a 100644
--- a/src/mesa/drivers/dri/nouveau/Makefile.am
+++ b/src/mesa/drivers/dri/nouveau/Makefile.am
@@ -33,6 +33,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/ \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
$(DEFINES) \
$(VISIBILITY_CFLAGS) \
diff --git a/src/mesa/drivers/dri/r200/Makefile.am b/src/mesa/drivers/dri/r200/Makefile.am
index a156728..137d3c8 100644
--- a/src/mesa/drivers/dri/r200/Makefile.am
+++ b/src/mesa/drivers/dri/r200/Makefile.am
@@ -32,6 +32,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/ \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/r200/server \
-I$(top_builddir)/src/mesa/drivers/dri/common \
diff --git a/src/mesa/drivers/dri/radeon/Makefile.am b/src/mesa/drivers/dri/radeon/Makefile.am
index 25c4884..b236aa6 100644
--- a/src/mesa/drivers/dri/radeon/Makefile.am
+++ b/src/mesa/drivers/dri/radeon/Makefile.am
@@ -33,6 +33,8 @@ AM_CFLAGS = \
-I$(top_srcdir)/src/ \
-I$(top_srcdir)/src/mapi \
-I$(top_srcdir)/src/mesa/ \
+   -I$(top_srcdir)/src/gallium/include \
+   -I$(top_srcdir)/src/gallium/auxiliary \
-I$(top_srcdir)/src/mesa/drivers/dri/common \
-I$(top_srcdir)/src/mesa/drivers/dri/radeon/server \
-I$(top_builddir)/src/mesa/drivers/dri/common \
diff --git a/src/mesa/drivers/dri/swrast/Makefile.am 
b/src/mesa/drivers/

Re: [Mesa-dev] [PATCH 1/2] gallium: include util/macros.h

2015-02-12 Thread Jose Fonseca

LGTM.

On 12/02/15 17:31, Tobias Klausmann wrote:

The most common macros are defined there; no use duplicating these.
Clean up the already redefined macros.

Signed-off-by: Tobias Klausmann 
---
  src/gallium/include/pipe/p_compiler.h | 57 ++-
  1 file changed, 2 insertions(+), 55 deletions(-)

diff --git a/src/gallium/include/pipe/p_compiler.h b/src/gallium/include/pipe/p_compiler.h
index fb018bf..cc4f444 100644
--- a/src/gallium/include/pipe/p_compiler.h
+++ b/src/gallium/include/pipe/p_compiler.h
@@ -33,6 +33,8 @@

  #include "p_config.h"

+#include "util/macros.h"
+
  #include 
  #include 
  #include 
@@ -204,61 +206,6 @@ void _ReadWriteBarrier(void);

  #endif

-
-/* You should use these macros to mark if blocks where the if condition
- * is either likely to be true, or unlikely to be true.
- *
- * This will inform human readers of this fact, and will also inform
- * the compiler, who will in turn inform the CPU.
- *
- * CPUs often start executing code inside the if or the else blocks
- * without knowing whether the condition is true or not, and will have
- * to throw the work away if they find out later they executed the
- * wrong part of the if.
- *
- * If these macros are used, the CPU is more likely to correctly predict
- * the right path, and will avoid speculatively executing the wrong branch,
- * thus not throwing away work, resulting in better performance.
- *
- * In light of this, it is also a good idea to mark as "likely" a path
- * which is not necessarily always more likely, but that will benefit much
- * more from performance improvements since it is already much faster than
- * the other path, or viceversa with "unlikely".
- *
- * Example usage:
- * if(unlikely(do_we_need_a_software_fallback()))
- *do_software_fallback();
- * else
- *render_with_gpu();
- *
- * The macros follow the Linux kernel convention, and more examples can
- * be found there.
- *
- * Note that profile guided optimization can offer better results, but
- * needs an appropriate coverage suite and does not inform human readers.
- */
-#ifndef likely
-#  if defined(__GNUC__)
-#define likely(x)   __builtin_expect(!!(x), 1)
-#define unlikely(x) __builtin_expect(!!(x), 0)
-#  else
-#define likely(x)   (x)
-#define unlikely(x) (x)
-#  endif
-#endif
-
-
-/**
- * Static (compile-time) assertion.
- * Basically, use COND to dimension an array.  If COND is false/zero the
- * array size will be -1 and we'll get a compilation error.
- */
-#define STATIC_ASSERT(COND) \
-   do { \
-  (void) sizeof(char [1 - 2*!(COND)]); \
-   } while (0)
-
-
  #if defined(__cplusplus)
  }
  #endif





[Mesa-dev] [PATCH] util/u_atomic: Don't test p_atomic_add with booleans.

2015-02-13 Thread Jose Fonseca
Add another class of tests.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=89112

I failed to spot this in my previous change, because bool was a typedef
for char on the system I tested.
---
 src/util/u_atomic_test.c | 25 +++--
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/src/util/u_atomic_test.c b/src/util/u_atomic_test.c
index 8bddf8d..ffe4703 100644
--- a/src/util/u_atomic_test.c
+++ b/src/util/u_atomic_test.c
@@ -37,9 +37,9 @@
 #include "u_atomic.h"
 
 
-/* Test operations that are supported for all types, including 8 bits types */
-#define test_atomic_8bits(type, ones) \
-   static void test_atomic_8bits_##type (void) { \
+/* Test only assignment-like operations, which can be supported on all types */
+#define test_atomic_assign(type, ones) \
+   static void test_atomic_assign_##type (void) { \
   type v, r; \
   \
   p_atomic_set(&v, ones); \
@@ -56,6 +56,19 @@
   assert(v == 0 && "p_atomic_cmpxchg"); \
   assert(r == ones && "p_atomic_cmpxchg"); \
   \
+  (void) r; \
+   }
+
+
+/* Test arithmetic operations that are supported on 8bit integer types */
+#define test_atomic_8bits(type, ones) \
+   test_atomic_assign(type, ones) \
+   \
+   static void test_atomic_8bits_##type (void) { \
+  type v, r; \
+  \
+  test_atomic_assign_##type(); \
+  \
   v = 23; \
   p_atomic_add(&v, 42); \
   r = p_atomic_read(&v); \
@@ -65,7 +78,7 @@
}
 
 
-/* Test operations that are not supported for 8 bits types */
+/* Test all operations */
 #define test_atomic(type, ones) \
test_atomic_8bits(type, ones) \
\
@@ -121,7 +134,7 @@ test_atomic(uint64_t, UINT64_C(0x))
 
 test_atomic_8bits(int8_t, INT8_C(-1))
 test_atomic_8bits(uint8_t, UINT8_C(0xff))
-test_atomic_8bits(bool, true)
+test_atomic_assign(bool, true)
 
 int
 main()
@@ -138,7 +151,7 @@ main()
 
test_atomic_8bits_int8_t();
test_atomic_8bits_uint8_t();
-   test_atomic_8bits_bool();
+   test_atomic_assign_bool();
 
return 0;
 }
-- 
2.1.4



[Mesa-dev] [PATCH] os, llvmpipe: Set rasterizer thread names on Linux.

2015-02-13 Thread Jose Fonseca
To help identify llvmpipe rasterizer threads -- especially when there
can be so many.

We can eventually generalize this to other OSes, but for that we must
restrict the function to be called from the current thread.  See also
http://stackoverflow.com/a/7989973
---
 src/gallium/auxiliary/os/os_thread.h   | 11 +++
 src/gallium/drivers/llvmpipe/lp_rast.c |  8 +++-
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/os/os_thread.h b/src/gallium/auxiliary/os/os_thread.h
index ff46a89..d3f13d4 100644
--- a/src/gallium/auxiliary/os/os_thread.h
+++ b/src/gallium/auxiliary/os/os_thread.h
@@ -85,6 +85,17 @@ static INLINE int pipe_thread_destroy( pipe_thread thread )
return thrd_detach( thread );
 }
 
+static INLINE void pipe_thread_setname( const char *name )
+{
+#if defined(HAVE_PTHREAD)
+#  if defined(__GNU_LIBRARY__) && defined(__GLIBC__) && defined(__GLIBC_MINOR__) && \
+(__GLIBC__ >= 3 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 11))
+   pthread_setname_np(pthread_self(), name);
+#  endif
+#endif
+   (void)name;
+}
+
 
 /* pipe_mutex
  */
diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c b/src/gallium/drivers/llvmpipe/lp_rast.c
index e168766..903e7c5 100644
--- a/src/gallium/drivers/llvmpipe/lp_rast.c
+++ b/src/gallium/drivers/llvmpipe/lp_rast.c
@@ -31,6 +31,7 @@
 #include "util/u_rect.h"
 #include "util/u_surface.h"
 #include "util/u_pack_color.h"
+#include "util/u_string.h"
 
 #include "os/os_time.h"
 
@@ -747,11 +748,16 @@ static PIPE_THREAD_ROUTINE( thread_function, init_data )
struct lp_rasterizer_task *task = (struct lp_rasterizer_task *) init_data;
struct lp_rasterizer *rast = task->rast;
boolean debug = false;
-   unsigned fpstate = util_fpstate_get();
+   char thread_name[16];
+   unsigned fpstate;
+
+   util_snprintf(thread_name, sizeof thread_name, "llvmpipe-%u", task->thread_index);
+   pipe_thread_setname(thread_name);
 
/* Make sure that denorms are treated like zeros. This is 
 * the behavior required by D3D10. OpenGL doesn't care.
 */
+   fpstate = util_fpstate_get();
util_fpstate_set_denorms_to_zero(fpstate);
 
while (1) {
-- 
2.1.0



Re: [Mesa-dev] [PATCH] os, llvmpipe: Set rasterizer thread names on Linux.

2015-02-13 Thread Jose Fonseca

On 13/02/15 15:23, Roland Scheidegger wrote:

Just one trivial issue, otherwise

Reviewed-by: Roland Scheidegger 


On 13.02.2015 at 15:05, Jose Fonseca wrote:

To help identify llvmpipe rasterizer threads -- especially when there
can be so many.

We can eventually generalize this to other OSes, but for that we must
restrict the function to be called from the current thread.  See also
http://stackoverflow.com/a/7989973
---
  src/gallium/auxiliary/os/os_thread.h   | 11 +++
  src/gallium/drivers/llvmpipe/lp_rast.c |  8 +++-
  2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/os/os_thread.h b/src/gallium/auxiliary/os/os_thread.h
index ff46a89..d3f13d4 100644
--- a/src/gallium/auxiliary/os/os_thread.h
+++ b/src/gallium/auxiliary/os/os_thread.h
@@ -85,6 +85,17 @@ static INLINE int pipe_thread_destroy( pipe_thread thread )
 return thrd_detach( thread );
  }

+static INLINE void pipe_thread_setname( const char *name )
+{
+#if defined(HAVE_PTHREAD)
+#  if defined(__GNU_LIBRARY__) && defined(__GLIBC__) && defined(__GLIBC_MINOR__) && \
+(__GLIBC__ >= 3 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 11))

Your link is saying glibc needs to be 2.12, not 2.11.


Good catch. Thanks.

Jose




+   pthread_setname_np(pthread_self(), name);
+#  endif
+#endif
+   (void)name;
+}
+

  /* pipe_mutex
   */
diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c b/src/gallium/drivers/llvmpipe/lp_rast.c
index e168766..903e7c5 100644
--- a/src/gallium/drivers/llvmpipe/lp_rast.c
+++ b/src/gallium/drivers/llvmpipe/lp_rast.c
@@ -31,6 +31,7 @@
  #include "util/u_rect.h"
  #include "util/u_surface.h"
  #include "util/u_pack_color.h"
+#include "util/u_string.h"

  #include "os/os_time.h"

@@ -747,11 +748,16 @@ static PIPE_THREAD_ROUTINE( thread_function, init_data )
 struct lp_rasterizer_task *task = (struct lp_rasterizer_task *) init_data;
 struct lp_rasterizer *rast = task->rast;
 boolean debug = false;
-   unsigned fpstate = util_fpstate_get();
+   char thread_name[16];
+   unsigned fpstate;
+
+   util_snprintf(thread_name, sizeof thread_name, "llvmpipe-%u", task->thread_index);
+   pipe_thread_setname(thread_name);

 /* Make sure that denorms are treated like zeros. This is
  * the behavior required by D3D10. OpenGL doesn't care.
  */
+   fpstate = util_fpstate_get();
 util_fpstate_set_denorms_to_zero(fpstate);

 while (1) {







Re: [Mesa-dev] make check failure in u_atomic_test

2015-02-13 Thread Jose Fonseca

On 13/02/15 18:52, Ian Romanick wrote:

Starting this morning I'm seeing 'make check' failures in
u_atomic_test.  It looks like José was the last person to touch that
area.  I haven't investigated any further.

../../bin/test-driver: line 107: 11024 Aborted (core dumped) "$@" > $log_file 2>&1
FAIL: u_atomic_test



I posted the fix for review earlier today, but had no chance to commit it after Roland's review. I'll commit it shortly.


Jose

