Re: [Mesa-dev] [RFC PATCH] egl/android: Dequeue buffers inside EGL calls

2017-04-03 Thread Tomasz Figa
Hi Mauro,

On Mon, Apr 3, 2017 at 2:48 AM, Mauro Rossi  wrote:
>
>
> 2017-03-31 13:05 GMT+02:00 Tapani Pälli :
>>
>>
>>
>> On 03/31/2017 10:12 AM, Tapani Pälli wrote:
>>>
>>>
>>>
>>> On 03/31/2017 09:06 AM, Tapani Pälli wrote:



 On 03/31/2017 08:24 AM, Rob Clark wrote:
>
> On Fri, Mar 31, 2017 at 12:22 AM, Tapani Pälli
>  wrote:
>>
>>
>>
>> On 03/30/2017 05:57 PM, Emil Velikov wrote:
>>>
>>>
>>> On 30 March 2017 at 15:30, Tomasz Figa  wrote:


 On Thu, Mar 30, 2017 at 11:17 PM, Emil Velikov
 
 wrote:
>
>
>
> On 30 March 2017 at 11:55, Tomasz Figa  wrote:
>>
>>
>> Android buffer queues can be abandoned, which results in failing
>> to
>> dequeue next buffer. Currently this would fail somewhere deep
>> within
>> the DRI stack calling loader's getBuffers*(), without any error
>> reporting to the client app. However Android framework code
>> relies on
>> proper signaling of this event, so we move buffer dequeue to
>> createWindowSurface() and swapBuffers() call, which can generate
>> proper
>> EGL errors. To keep the performance benefits of delayed buffer
>> handling,
>> if any, fence wait and DRI image creation is kept delayed until
>> getBuffers*() is called by the DRI driver.
>>
> Thank you Tomasz.
>
> I'm fairly confident that this should resolve the crash [in
> swap_buffers] that Mauro was seeing.
> Mauro can you give it a test ?



 Ah, I actually noticed a problem with existing code, supposedly
 fixed
 by [1], but I'm afraid it's still wrong.

 Current swap_buffers calls get_back_bo(), but doesn't call
 update_buffers(), which is the function that should be called before
 to actually dequeue a buffer from Android's buffer queue. Given
 that,
 get_back_bo() would simply fail with !dri2_surf->buffer, because no
 buffer was dequeued.

>>> Right - I was wondering why we don't hit that on EGL/GBM or
>>> EGL/Wayland.
>>> From a quick look - may be because EGL/Android drops the dpy mutex in
>>> droid_window_enqueue_buffer().
>>>
 My patch removes update_buffers() and changes the buffer
 management so
 that there is always a buffer dequeued, starting from surface
 creation, unless there was an error somewhere.

>>> Of the top of your head - is there something stopping us from using
>>> the same method on $other platforms?
>>>
 [1]

 https://cgit.freedesktop.org/mesa/mesa/commit/src/egl/drivers/dri2/platform_android.c?id=4d4558411db166d2d66f8cec9cb581149dbe1597



>
>
> Not that huge of an expert on the Android specifics, so just a
> humble
> request:
> Can we seek the code resuffle (droid_{alloc,free}_local_buffer,
>>>
>>>
>>> Oops silly typo - s/seek/split/.
>>>
> other?) separate from the functionality changes ?



 Sure. Thanks for suggestion.

>>> Please give it a day or two for others to comment.
>>
>>
>>
>> I'm trying to debug why this causes our homescreen (wallpaper) to be
>> black.
>> Otherwise I haven't seen any issues with these changes.
>>
>
> wallpaper seems to be a special sorta hell..  I wonder if there is
> somehow some sort of interaction with what I fixed / worked-around in
> a5e733c6b52e93de3000647d075f5ca2f55fcb71 ??
>
> Maybe at least try commenting out the temp-pbuffer thing to get max
> texture size, and see if that "fixes" things


 Can you give more details, I still live in la la land and don't know
 about 'temp-pbuffer thing'?

>>>
>>> aa I did not recall the problem, you mean the 'dummy pbuffer' in
>>> SurfaceFlinger .. yes I will check if this is related.
>>>
>>
>> If I take away that dummy pbuffer usage (which is useless anyway), couple
>> of errors disappear from the log. They are:
>>
>> SurfaceFlinger: releasePendingBuffer failed: Unknown error -1 (1)
>> SurfaceFlinger: releasePendingBuffer failed: Unknown error -1 (1)
>>
>> but otherwise the desktop still stays black, live wallpapers seem to work
>> so there is something special about this default wallpaper. Will continue
>> digging ..
>>
>> // Tapani
>
>
>
> Hi,
>
> about the black wallpaper the only sign found in logcat is the following:
>
> - beginning of main
> 04-02 15:09:43.148
> ...
> 04-02 15:10:41.710  1414  1414 E Layer   :
> [com.android.systemui.ImageWallpaper] rejecting buffer: bufWidth=1024,
> bufHeight=768, front.active.{w=1, h=1}
> ...
> 04-02 15:10:41.726  1414  1414 E Layer   :
> [com.andr

Re: [Mesa-dev] [PATCH] st/omx/dec: Properly undefine DEBUG macro

2017-04-03 Thread Christian König

Am 03.04.2017 um 06:35 schrieb Shaleen Jain:

---
  src/gallium/state_trackers/omx/vid_dec.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/omx/vid_dec.c 
b/src/gallium/state_trackers/omx/vid_dec.c
index 9a6efb8e28..94664eba04 100644
--- a/src/gallium/state_trackers/omx/vid_dec.c
+++ b/src/gallium/state_trackers/omx/vid_dec.c
@@ -37,7 +37,7 @@
  #include 
  
  /* bellagio defines a DEBUG macro that we don't want */

-#ifndef DEBUG
+#ifdef DEBUG


NAK, the logic locked correct as it was before.

Why do you want to invert it?

Regards,
Christian.


  #include 
  #undef DEBUG
  #else



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add null pointer check in print_without_declaration()

2017-04-03 Thread Nicolai Hähnle

On 31.03.2017 20:14, Brian Paul wrote:

On 03/31/2017 12:01 AM, Nicolai Hähnle wrote:

On 31.03.2017 05:21, Brian Paul wrote:

To avoid/fix a segmentation fault when running the stand-alone GLSL
compiler utility for cases such as the Mesa demos toyball test:

glsl_compiler --dump-builder --version 120 CH11-toyball.vert
CH11-toyball.frag
---
 src/compiler/glsl/ir_builder_print_visitor.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_builder_print_visitor.cpp
b/src/compiler/glsl/ir_builder_print_visitor.cpp
index 825dbe1..164a237 100644
--- a/src/compiler/glsl/ir_builder_print_visitor.cpp
+++ b/src/compiler/glsl/ir_builder_print_visitor.cpp
@@ -581,7 +581,9 @@
ir_builder_print_visitor::print_without_declaration(const
ir_expression *ir)
  const struct hash_entry *const he =
 _mesa_hash_table_search(index_map, ir->operands[i]);

- print_without_indent("r%04X", (unsigned)(uintptr_t) he->data);
+ if (he) {
+print_without_indent("r%04X", (unsigned)(uintptr_t)
he->data);
+ }


Is the output still usable in this case? I don't quite understand the
use case and how this case can happen.


I didn't dig either and would appreciate any feedback.  I'm merely
trying to avoid a segfault (found while debugging the MinGW optimization
issue).


I'm not really familiar with that code and couldn't figure out from 
looking at it when/why it happens, but at least looking at the 
surrounding code, I'd say that this cannot possibly produce a good 
output when 'he' is NULL. The point is to generate well-formed C++ to be 
compiled later, and skipping the print will just lead to writing out a 
function call with a malformed parameter list.


Maybe the answer is that the --dump-builder simply isn't made for this 
type of use, in which case the question becomes: why do you want to use 
it this way? And if there's a good reason for using it this way, the bug 
needs to be understood properly. Maybe Ian knows more? I believe he 
wrote most of that code.


Nicolai
--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] egl/android: Dequeue buffers inside EGL calls

2017-04-03 Thread Tapani Pälli



On 03/31/2017 04:04 PM, Rob Clark wrote:

On Fri, Mar 31, 2017 at 2:06 AM, Tapani Pälli  wrote:



On 03/31/2017 08:24 AM, Rob Clark wrote:


On Fri, Mar 31, 2017 at 12:22 AM, Tapani Pälli 
wrote:




On 03/30/2017 05:57 PM, Emil Velikov wrote:



On 30 March 2017 at 15:30, Tomasz Figa  wrote:



On Thu, Mar 30, 2017 at 11:17 PM, Emil Velikov

wrote:




On 30 March 2017 at 11:55, Tomasz Figa  wrote:



Android buffer queues can be abandoned, which results in failing to
dequeue next buffer. Currently this would fail somewhere deep within
the DRI stack calling loader's getBuffers*(), without any error
reporting to the client app. However Android framework code relies on
proper signaling of this event, so we move buffer dequeue to
createWindowSurface() and swapBuffers() call, which can generate
proper
EGL errors. To keep the performance benefits of delayed buffer
handling,
if any, fence wait and DRI image creation is kept delayed until
getBuffers*() is called by the DRI driver.


Thank you Tomasz.

I'm fairly confident that this should resolve the crash [in
swap_buffers] that Mauro was seeing.
Mauro can you give it a test ?




Ah, I actually noticed a problem with existing code, supposedly fixed
by [1], but I'm afraid it's still wrong.

Current swap_buffers calls get_back_bo(), but doesn't call
update_buffers(), which is the function that should be called before
to actually dequeue a buffer from Android's buffer queue. Given that,
get_back_bo() would simply fail with !dri2_surf->buffer, because no
buffer was dequeued.


Right - I was wondering why we don't hit that on EGL/GBM or EGL/Wayland.
From a quick look - may be because EGL/Android drops the dpy mutex in
droid_window_enqueue_buffer().


My patch removes update_buffers() and changes the buffer management so
that there is always a buffer dequeued, starting from surface
creation, unless there was an error somewhere.


Of the top of your head - is there something stopping us from using
the same method on $other platforms?


[1]

https://cgit.freedesktop.org/mesa/mesa/commit/src/egl/drivers/dri2/platform_android.c?id=4d4558411db166d2d66f8cec9cb581149dbe1597




Not that huge of an expert on the Android specifics, so just a humble
request:
Can we seek the code resuffle (droid_{alloc,free}_local_buffer,



Oops silly typo - s/seek/split/.


other?) separate from the functionality changes ?




Sure. Thanks for suggestion.


Please give it a day or two for others to comment.




I'm trying to debug why this causes our homescreen (wallpaper) to be
black.
Otherwise I haven't seen any issues with these changes.



wallpaper seems to be a special sorta hell..  I wonder if there is
somehow some sort of interaction with what I fixed / worked-around in
a5e733c6b52e93de3000647d075f5ca2f55fcb71 ??

Maybe at least try commenting out the temp-pbuffer thing to get max
texture size, and see if that "fixes" things



Can you give more details, I still live in la la land and don't know about
'temp-pbuffer thing'?



Sorry, this is something in the wallpaper app java code.. I forget
exactly where it lives now, but there was some code matching what I
pasted in the commit msg which was creating and then destroying a
temporary pbuffer..



OK yeah found it, it's very similar to what surfaceflinger does also on 
startup before initializing it's rendering engine. It creates temporary 
pbuffer only to query different things before actual init.


// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] android: amd/addrlib: trivial fix for gfx9 support

2017-04-03 Thread Nicolai Hähnle

On 01.04.2017 12:48, Mauro Rossi wrote:

Fixes the following build error:

external/mesa/src/amd/addrlib/gfx9/gfx9addrlib.cpp:36:10: fatal error: 
'gfx9_gb_reg.h' file not found
 ^
1 error generated.

Fixes: 7f160ef "amd/addrlib: import gfx9 support"


Reviewed-by: Nicolai Hähnle 


---
 src/amd/Android.addrlib.mk | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/amd/Android.addrlib.mk b/src/amd/Android.addrlib.mk
index d296ce0431c..540de5554bd 100644
--- a/src/amd/Android.addrlib.mk
+++ b/src/amd/Android.addrlib.mk
@@ -37,7 +37,9 @@ LOCAL_C_INCLUDES := \
$(MESA_TOP)/src/amd/common \
$(MESA_TOP)/src/amd/addrlib \
$(MESA_TOP)/src/amd/addrlib/core \
+   $(MESA_TOP)/src/amd/addrlib/inc/chip/gfx9 \
$(MESA_TOP)/src/amd/addrlib/inc/chip/r800 \
+   $(MESA_TOP)/src/amd/addrlib/gfx9/chip \
$(MESA_TOP)/src/amd/addrlib/r800/chip

 include $(MESA_COMMON_MK)




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 9/9] gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes

2017-04-03 Thread Nicolai Hähnle

Some of these are a bit subtle, but I think they're fine. Series is:

Reviewed-by: Nicolai Hähnle 

On 02.04.2017 20:00, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/auxiliary/indices/u_primconvert.c | 10 --
 src/gallium/include/pipe/p_state.h|  2 +-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/indices/u_primconvert.c 
b/src/gallium/auxiliary/indices/u_primconvert.c
index 2bdfade..1ffca4b 100644
--- a/src/gallium/auxiliary/indices/u_primconvert.c
+++ b/src/gallium/auxiliary/indices/u_primconvert.c
@@ -121,39 +121,45 @@ util_primconvert_draw_vbo(struct primconvert_context *pc,
util_draw_init_info(&new_info);
new_info.indexed = true;
new_info.min_index = info->min_index;
new_info.max_index = info->max_index;
new_info.index_bias = info->index_bias;
new_info.start_instance = info->start_instance;
new_info.instance_count = info->instance_count;
new_info.primitive_restart = info->primitive_restart;
new_info.restart_index = info->restart_index;
if (info->indexed) {
+  enum pipe_prim_type mode = 0;
+
   u_index_translator(pc->primtypes_mask,
  info->mode, pc->saved_ib.index_size, info->count,
  pc->api_pv, pc->api_pv,
  info->primitive_restart ? PR_ENABLE : PR_DISABLE,
- &new_info.mode, &new_ib.index_size, &new_info.count,
+ &mode, &new_ib.index_size, &new_info.count,
  &trans_func);
+  new_info.mode = mode;
   src = ib->user_buffer;
   if (!src) {
  src = pipe_buffer_map(pc->pipe, ib->buffer,
PIPE_TRANSFER_READ, &src_transfer);
   }
   src = (const uint8_t *)src + ib->offset;
}
else {
+  enum pipe_prim_type mode = 0;
+
   u_index_generator(pc->primtypes_mask,
 info->mode, info->start, info->count,
 pc->api_pv, pc->api_pv,
-&new_info.mode, &new_ib.index_size, &new_info.count,
+&mode, &new_ib.index_size, &new_info.count,
 &gen_func);
+  new_info.mode = mode;
}

u_upload_alloc(pc->pipe->stream_uploader, 0, new_ib.index_size * 
new_info.count, 4,
   &new_ib.offset, &new_ib.buffer, &dst);

if (info->indexed) {
   trans_func(src, info->start, info->count, new_info.count, 
info->restart_index, dst);
}
else {
   gen_func(info->start, new_info.count, dst);
diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index d68a4d4..eeabf8b 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -634,21 +634,21 @@ struct pipe_index_buffer
const void *user_buffer;  /**< pointer to a user buffer if buffer == NULL */
 };


 /**
  * Information to describe a draw_vbo call.
  */
 struct pipe_draw_info
 {
boolean indexed;  /**< use index buffer */
-   enum pipe_prim_type mode;  /**< the mode of the primitive */
+   enum pipe_prim_type mode:8;  /**< the mode of the primitive */
boolean primitive_restart;
ubyte vertices_per_patch; /**< the number of vertices per patch */

unsigned start;  /**< the index of the first vertex */
unsigned count;  /**< number of vertices */

unsigned start_instance; /**< first instance id */
unsigned instance_count; /**< number of instances */

unsigned drawid; /**< id of this draw in a multidraw */




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium: fix some math formulas to display better

2017-04-03 Thread Nicolai Hähnle

On 02.04.2017 18:02, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 


Reviewed-by: Nicolai Hähnle 


---
 src/gallium/docs/source/tgsi.rst | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index 05b06ce..ca31924 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -1204,13 +1204,13 @@ Support for these opcodes indicated by 
PIPE_SHADER_CAP_INTEGERS (all of them?)

 .. math::

-  dst.x = src0.x \ src1.x
+  dst.x = \frac{src0.x}{src1.x}

-  dst.y = src0.y \ src1.y
+  dst.y = \frac{src0.y}{src1.y}

-  dst.z = src0.z \ src1.z
+  dst.z = \frac{src0.z}{src1.z}

-  dst.w = src0.w \ src1.w
+  dst.w = \frac{src0.w}{src1.w}


 .. opcode:: UDIV - Unsigned Integer Division
@@ -1219,13 +1219,13 @@ Support for these opcodes indicated by 
PIPE_SHADER_CAP_INTEGERS (all of them?)

 .. math::

-  dst.x = src0.x \ src1.x
+  dst.x = \frac{src0.x}{src1.x}

-  dst.y = src0.y \ src1.y
+  dst.y = \frac{src0.y}{src1.y}

-  dst.z = src0.z \ src1.z
+  dst.z = \frac{src0.z}{src1.z}

-  dst.w = src0.w \ src1.w
+  dst.w = \frac{src0.w}{src1.w}


 .. opcode:: UMOD - Unsigned Integer Remainder
@@ -1234,13 +1234,13 @@ Support for these opcodes indicated by 
PIPE_SHADER_CAP_INTEGERS (all of them?)

 .. math::

-  dst.x = src0.x \ src1.x
+  dst.x = src0.x \bmod src1.x

-  dst.y = src0.y \ src1.y
+  dst.y = src0.y \bmod src1.y

-  dst.z = src0.z \ src1.z
+  dst.z = src0.z \bmod src1.z

-  dst.w = src0.w \ src1.w
+  dst.w = src0.w \bmod src1.w


 .. opcode:: NOT - Bitwise Not
@@ -2259,17 +2259,17 @@ two-component vectors with 64-bits in each component.

 .. math::

-  dst.xy = src0.xy \ src1.xy
+  dst.xy = \frac{src0.xy}{src1.xy}

-  dst.zw = src0.zw \ src1.zw
+  dst.zw = \frac{src0.zw}{src1.zw}

 .. opcode:: U64DIV - 64-bit Unsigned Integer Division

 .. math::

-  dst.xy = src0.xy \ src1.xy
+  dst.xy = \frac{src0.xy}{src1.xy}

-  dst.zw = src0.zw \ src1.zw
+  dst.zw = \frac{src0.zw}{src1.zw}

 .. opcode:: U64MOD - 64-bit Unsigned Integer Remainder





--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: fix broken texture filtering on SIK-CIK since GFX9 changes

2017-04-03 Thread Nicolai Hähnle

On 03.04.2017 02:01, Marek Olšák wrote:

From: Marek Olšák 

Don't clear state[7] on SI-CIK, and only do the meta stuff on VI+.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100531


Makes sense. Please add

Fixes: 5abf60076ce4 ("radeonsi/gfx9: image descriptor changes in mutable 
fields")


Reviewed-by: Nicolai Hähnle 


---
 src/gallium/drivers/radeonsi/si_descriptors.c | 32 ++-
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c 
b/src/gallium/drivers/radeonsi/si_descriptors.c
index 2b91158..8f5a16b 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -397,40 +397,42 @@ void si_set_mutable_tex_desc_fields(struct si_screen 
*sscreen,
if (sscreen->b.chip_class >= GFX9) {
/* Only stencil_offset needs to be added here. */
if (is_stencil)
va += tex->surface.u.gfx9.stencil_offset;
else
va += tex->surface.u.gfx9.surf_offset;
} else {
va += base_level_info->offset;
}

-   if (vi_dcc_enabled(tex, first_level)) {
-   meta_va = (!tex->dcc_separate_buffer ? 
tex->resource.gpu_address : 0) +
- tex->dcc_offset;
-
-   if (sscreen->b.chip_class <= VI)
-   meta_va += base_level_info->dcc_offset;
-   } else if (tex->tc_compatible_htile && !is_stencil) {
-   meta_va = tex->htile_buffer->gpu_address;
-   }
-
state[0] = va >> 8;
state[1] &= C_008F14_BASE_ADDRESS_HI;
state[1] |= S_008F14_BASE_ADDRESS_HI(va >> 40);

-   state[6] &= C_008F28_COMPRESSION_EN;
-   state[7] = 0;
+   if (sscreen->b.chip_class >= VI) {
+   state[6] &= C_008F28_COMPRESSION_EN;
+   state[7] = 0;

-   if (meta_va) {
-   state[6] |= S_008F28_COMPRESSION_EN(1);
-   state[7] = meta_va >> 8;
+   if (vi_dcc_enabled(tex, first_level)) {
+   meta_va = (!tex->dcc_separate_buffer ? 
tex->resource.gpu_address : 0) +
+ tex->dcc_offset;
+
+   if (sscreen->b.chip_class <= VI)
+   meta_va += base_level_info->dcc_offset;
+   } else if (tex->tc_compatible_htile && !is_stencil) {
+   meta_va = tex->htile_buffer->gpu_address;
+   }
+
+   if (meta_va) {
+   state[6] |= S_008F28_COMPRESSION_EN(1);
+   state[7] = meta_va >> 8;
+   }
}

if (sscreen->b.chip_class >= GFX9) {
state[3] &= C_008F1C_SW_MODE;
state[4] &= C_008F20_PITCH_GFX9;

if (is_stencil) {
state[3] |= 
S_008F1C_SW_MODE(tex->surface.u.gfx9.stencil.swizzle_mode);
state[4] |= 
S_008F20_PITCH_GFX9(tex->surface.u.gfx9.stencil.epitch);
} else {




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] vulkan/wsi/wayland: Pass damage through to the compositor

2017-04-03 Thread Daniel Stone
Hi Jason,

On 1 April 2017 at 06:37, Jason Ekstrand  wrote:
> @@ -594,7 +595,19 @@ wsi_wl_swapchain_queue_present(struct wsi_swapchain 
> *wsi_chain,
>
> assert(image_index < chain->base.image_count);
> wl_surface_attach(chain->surface, chain->images[image_index].buffer, 0, 
> 0);
> -   wl_surface_damage(chain->surface, 0, 0, INT32_MAX, INT32_MAX);
> +
> +   if (chain->surface_version >= 4 && damage &&
> +   damage->pRectangles && damage->rectangleCount > 0) {
> +  for (unsigned i = 0; i < damage->rectangleCount; i++) {
> + const VkRectLayerKHR *rect = &damage->pRectangles[i];
> + assert(rect->layer == 0);
> + wl_surface_damage(chain->surface,
> +   rect->offset.x, rect->offset.y,
> +   rect->extent.width, rect->extent.height);

Very scrupulous version check, but you forgot to actually use
wl_surface_damage_buffer. ;)

Cheers,
Daniel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] radeonsi: clean up 'radeon_bld' references

2017-04-03 Thread Nicolai Hähnle

On 03.04.2017 02:01, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c | 108 ++-
 1 file changed, 48 insertions(+), 60 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 3876010..2da00f9 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -293,37 +293,35 @@ get_tcs_out_current_patch_data_offset(struct 
si_shader_context *ctx)
LLVMValueRef patch_stride = get_tcs_out_patch_stride(ctx);
LLVMValueRef rel_patch_id = get_rel_patch_id(ctx);

return LLVMBuildAdd(gallivm->builder, patch0_patch_data_offset,
LLVMBuildMul(gallivm->builder, patch_stride,
 rel_patch_id, ""),
"");
 }

 static LLVMValueRef get_instance_index_for_fetch(
-   struct si_shader_context *radeon_bld,
+   struct si_shader_context *ctx,
unsigned param_start_instance, unsigned divisor)
 {
-   struct si_shader_context *ctx =
-   si_shader_context(&radeon_bld->bld_base);
-   struct gallivm_state *gallivm = radeon_bld->bld_base.base.gallivm;
+   struct gallivm_state *gallivm = ctx->bld_base.base.gallivm;


This could just be &ctx->gallivm. Apart from that, the series is

Reviewed-by: Nicolai Hähnle 




-   LLVMValueRef result = LLVMGetParam(radeon_bld->main_fn,
+   LLVMValueRef result = LLVMGetParam(ctx->main_fn,
   ctx->param_instance_id);

/* The division must be done before START_INSTANCE is added. */
if (divisor > 1)
result = LLVMBuildUDiv(gallivm->builder, result,
lp_build_const_int32(gallivm, divisor), "");

return LLVMBuildAdd(gallivm->builder, result,
-   LLVMGetParam(radeon_bld->main_fn, param_start_instance), 
"");
+   LLVMGetParam(ctx->main_fn, param_start_instance), 
"");
 }

 /* Bitcast <4 x float> to <2 x double>, extract the component, and convert
  * to float. */
 static LLVMValueRef extract_double_to_float(struct si_shader_context *ctx,
LLVMValueRef vec4,
unsigned double_index)
 {
LLVMBuilderRef builder = ctx->gallivm.builder;
LLVMTypeRef f64 = LLVMDoubleTypeInContext(ctx->gallivm.context);
@@ -1279,30 +1277,28 @@ static void interp_fs_input(struct si_shader_context 
*ctx,
} else {
result[chan] = ac_build_fs_interp_mov(&ctx->ac,
lp_build_const_int32(gallivm, 2), /* P0 
*/
llvm_chan, attr_number, prim_mask);
}
}
}
 }

 static void declare_input_fs(
-   struct si_shader_context *radeon_bld,
+   struct si_shader_context *ctx,
unsigned input_index,
const struct tgsi_full_declaration *decl,
LLVMValueRef out[4])
 {
-   struct lp_build_context *base = &radeon_bld->bld_base.base;
-   struct si_shader_context *ctx =
-   si_shader_context(&radeon_bld->bld_base);
+   struct lp_build_context *base = &ctx->bld_base.base;
struct si_shader *shader = ctx->shader;
-   LLVMValueRef main_fn = radeon_bld->main_fn;
+   LLVMValueRef main_fn = ctx->main_fn;
LLVMValueRef interp_param = NULL;
int interp_param_idx;

/* Get colors from input VGPRs (set by the prolog). */
if (decl->Semantic.Name == TGSI_SEMANTIC_COLOR) {
unsigned i = decl->Semantic.Index;
unsigned colors_read = shader->selector->info.colors_read;
unsigned mask = colors_read >> (i * 4);
unsigned offset = SI_PARAM_POS_FIXED_PT + 1 +
  (i ? util_bitcount(colors_read & 0xf) : 0);
@@ -1328,171 +1324,165 @@ static void declare_input_fs(
interp_param = NULL; /* load the constant color */

interp_fs_input(ctx, input_index, decl->Semantic.Name,
decl->Semantic.Index, shader->selector->info.num_inputs,
shader->selector->info.colors_read, interp_param,
LLVMGetParam(main_fn, SI_PARAM_PRIM_MASK),
LLVMGetParam(main_fn, SI_PARAM_FRONT_FACE),
&out[0]);
 }

-static LLVMValueRef get_sample_id(struct si_shader_context *radeon_bld)
+static LLVMValueRef get_sample_id(struct si_shader_context *ctx)
 {
-   return unpack_param(si_shader_context(&radeon_bld->bld_base),
-   SI_PARAM_ANCILLARY, 8, 4);
+   return unpack_param(ctx, SI_PARAM_ANCILLARY, 8, 4);
 }


 /**
  * Load a dword from a constant buffer.
  */
 static LLVMValueRef buffer_load_const(struct si_shader_co

Re: [Mesa-dev] [PATCH 1/3] radeonsi: clean up 'radeon_bld' references

2017-04-03 Thread Marek Olšák
On Mon, Apr 3, 2017 at 11:00 AM, Nicolai Hähnle  wrote:
> On 03.04.2017 02:01, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/drivers/radeonsi/si_shader.c | 108
>> ++-
>>  1 file changed, 48 insertions(+), 60 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_shader.c
>> b/src/gallium/drivers/radeonsi/si_shader.c
>> index 3876010..2da00f9 100644
>> --- a/src/gallium/drivers/radeonsi/si_shader.c
>> +++ b/src/gallium/drivers/radeonsi/si_shader.c
>> @@ -293,37 +293,35 @@ get_tcs_out_current_patch_data_offset(struct
>> si_shader_context *ctx)
>> LLVMValueRef patch_stride = get_tcs_out_patch_stride(ctx);
>> LLVMValueRef rel_patch_id = get_rel_patch_id(ctx);
>>
>> return LLVMBuildAdd(gallivm->builder, patch0_patch_data_offset,
>> LLVMBuildMul(gallivm->builder, patch_stride,
>>  rel_patch_id, ""),
>> "");
>>  }
>>
>>  static LLVMValueRef get_instance_index_for_fetch(
>> -   struct si_shader_context *radeon_bld,
>> +   struct si_shader_context *ctx,
>> unsigned param_start_instance, unsigned divisor)
>>  {
>> -   struct si_shader_context *ctx =
>> -   si_shader_context(&radeon_bld->bld_base);
>> -   struct gallivm_state *gallivm = radeon_bld->bld_base.base.gallivm;
>> +   struct gallivm_state *gallivm = ctx->bld_base.base.gallivm;
>
>
> This could just be &ctx->gallivm. Apart from that, the series is

That will be a separate patch.

>
> Reviewed-by: Nicolai Hähnle 

Thanks.

Marek
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: add si_init_descriptor_list() helper

2017-04-03 Thread Samuel Pitoiset



On 04/01/2017 09:11 AM, Nicolai Hähnle wrote:

On 31.03.2017 16:16, Samuel Pitoiset wrote:

This will be used by bindless to initialize the descriptor for
both samplers and images.

Signed-off-by: Samuel Pitoiset 


I'd prefer to see these patches in a larger context, to be honest.

As for this particular change, "init" to me implies also state/metadata
initialization. As a stand-alone function, I think the new function
should be called something else, perhaps si_fill_descriptors -- because
that's what it does -- and the null_descriptor parameter should be
renamed (to just descriptor) and made non-optional, so that the
NULL-check happens in the caller.

This way, the function would make more sense to me as a stand-alone unit
that does one thing.


Okay, this one can wait.
Thanks.



Thanks,
Nicolai



---
 src/gallium/drivers/radeonsi/si_descriptors.c | 26
+-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
b/src/gallium/drivers/radeonsi/si_descriptors.c
index d106351c85..84da830c11 100644
--- a/src/gallium/drivers/radeonsi/si_descriptors.c
+++ b/src/gallium/drivers/radeonsi/si_descriptors.c
@@ -96,6 +96,21 @@ static uint32_t null_image_descriptor[8] = {
  * descriptor */
 };

+static void si_init_descriptor_list(uint32_t *desc_list,
+unsigned element_dw_size,
+unsigned num_elements,
+const uint32_t *null_descriptor)
+{
+int i;
+
+/* Initialize the array to NULL descriptors if the element size
is 8. */
+if (null_descriptor) {
+assert(element_dw_size % 8 == 0);
+for (i = 0; i < num_elements * element_dw_size / 8; i++)
+memcpy(desc_list + i * 8, null_descriptor, 8 * 4);
+}
+}
+
 static void si_init_descriptors(struct si_descriptors *desc,
 unsigned shader_userdata_index,
 unsigned element_dw_size,
@@ -103,8 +118,6 @@ static void si_init_descriptors(struct
si_descriptors *desc,
 const uint32_t *null_descriptor,
 unsigned *ce_offset)
 {
-int i;
-
 assert(num_elements <= sizeof(desc->dirty_mask)*8);

 desc->list = CALLOC(num_elements, element_dw_size * 4);
@@ -121,13 +134,8 @@ static void si_init_descriptors(struct
si_descriptors *desc,
 *ce_offset += align(element_dw_size * num_elements * 4, 32);
 }

-/* Initialize the array to NULL descriptors if the element size
is 8. */
-if (null_descriptor) {
-assert(element_dw_size % 8 == 0);
-for (i = 0; i < num_elements * element_dw_size / 8; i++)
-memcpy(desc->list + i * 8, null_descriptor,
-   8 * 4);
-}
+si_init_descriptor_list(desc->list, element_dw_size, num_elements,
+null_descriptor);
 }

 static void si_release_descriptors(struct si_descriptors *desc)





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places

2017-04-03 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c   | 88 ++
 .../drivers/radeonsi/si_shader_tgsi_setup.c| 14 ++--
 2 files changed, 47 insertions(+), 55 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 21efd9a..a5d370b 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -538,38 +538,38 @@ static void declare_input_vs(
break;
}
 }
 
 static LLVMValueRef get_primitive_id(struct lp_build_tgsi_context *bld_base,
 unsigned swizzle)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
 
if (swizzle > 0)
-   return bld_base->uint_bld.zero;
+   return ctx->i32_0;
 
switch (ctx->type) {
case PIPE_SHADER_VERTEX:
return LLVMGetParam(ctx->main_fn,
ctx->param_vs_prim_id);
case PIPE_SHADER_TESS_CTRL:
return LLVMGetParam(ctx->main_fn,
SI_PARAM_PATCH_ID);
case PIPE_SHADER_TESS_EVAL:
return LLVMGetParam(ctx->main_fn,
ctx->param_tes_patch_id);
case PIPE_SHADER_GEOMETRY:
return LLVMGetParam(ctx->main_fn,
SI_PARAM_PRIMITIVE_ID);
default:
assert(0);
-   return bld_base->uint_bld.zero;
+   return ctx->i32_0;
}
 }
 
 /**
  * Return the value of tgsi_ind_register for indexing.
  * This is the indirect index with the constant offset added to it.
  */
 static LLVMValueRef get_indirect_index(struct si_shader_context *ctx,
   const struct tgsi_ind_register *ind,
   int rel_index)
@@ -1096,28 +1096,28 @@ static LLVMValueRef fetch_input_gs(
vtx_offset_param += SI_PARAM_VTX2_OFFSET - 2;
}
vtx_offset = lp_build_mul_imm(uint,
  LLVMGetParam(ctx->main_fn,
   vtx_offset_param),
  4);
 
param = si_shader_io_get_unique_index(semantic_name, semantic_index);
soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle) * 256, 0);
 
-   value = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1, uint->zero,
+   value = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1, ctx->i32_0,
 vtx_offset, soffset, 0, 1, 0, true);
if (tgsi_type_is_64bit(type)) {
LLVMValueRef value2;
soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle + 1) * 
256, 0);
 
value2 = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1,
- uint->zero, vtx_offset, soffset,
+ ctx->i32_0, vtx_offset, soffset,
  0, 1, 0, true);
return si_llvm_emit_fetch_64bit(bld_base, type,
value, value2);
}
return LLVMBuildBitCast(gallivm->builder,
value,
tgsi2llvmtype(bld_base, type), "");
 }
 
 static int lookup_interp_param_index(unsigned interpolate, unsigned location)
@@ -1169,21 +1169,20 @@ static void interp_fs_input(struct si_shader_context 
*ctx,
unsigned semantic_index,
unsigned num_interp_inputs,
unsigned colors_read_mask,
LLVMValueRef interp_param,
LLVMValueRef prim_mask,
LLVMValueRef face,
LLVMValueRef result[4])
 {
struct lp_build_tgsi_context *bld_base = &ctx->bld_base;
struct lp_build_context *base = &bld_base->base;
-   struct lp_build_context *uint = &bld_base->uint_bld;
struct gallivm_state *gallivm = base->gallivm;
LLVMValueRef attr_number;
LLVMValueRef i, j;
 
unsigned chan;
 
/* fs.constant returns the param from the middle vertex, so it's not
 * really useful for flat shading. It's meant to be used for custom
 * interpolation (but the intrinsic can't fetch from the other two
 * vertices).
@@ -1198,41 +1197,41 @@ static void interp_fs_input(struct si_shader_context 
*ctx,
 */
bool interp = interp_param != NULL;
 
attr_number = LLVMConstInt(ctx->i32, input_index, 0);
 
if (interp) {
interp_param = LLVMBuildBitCast(gallivm->builder, interp_param,
LLVMVectorType(ctx->f32, 2), 
"");
 
i = LLVMBuildExtractElement(gallivm->builder,

[Mesa-dev] [PATCH 2/3] radeonsi: use ctx->types instead of bld->types etc.

2017-04-03 Thread Marek Olšák
From: Marek Olšák 

even vec_type is f32.
---
 src/gallium/drivers/radeonsi/si_shader.c   | 28 ++
 .../drivers/radeonsi/si_shader_tgsi_setup.c| 16 ++---
 2 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index a5d370b..0200172 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -3306,21 +3306,21 @@ static LLVMValueRef image_fetch_coords(
struct gallivm_state *gallivm = bld_base->base.gallivm;
LLVMBuilderRef builder = gallivm->builder;
unsigned target = inst->Memory.Texture;
unsigned num_coords = tgsi_util_get_texture_coord_dim(target);
LLVMValueRef coords[4];
LLVMValueRef tmp;
int chan;
 
for (chan = 0; chan < num_coords; ++chan) {
tmp = lp_build_emit_fetch(bld_base, inst, src, chan);
-   tmp = LLVMBuildBitCast(builder, tmp, 
bld_base->uint_bld.elem_type, "");
+   tmp = LLVMBuildBitCast(builder, tmp, ctx->i32, "");
coords[chan] = tmp;
}
 
/* 1D textures are allocated and used as 2D on GFX9. */
if (ctx->screen->b.chip_class >= GFX9) {
if (target == TGSI_TEXTURE_1D) {
coords[1] = ctx->i32_0;
num_coords++;
} else if (target == TGSI_TEXTURE_1D_ARRAY) {
coords[2] = coords[1];
@@ -3414,31 +3414,31 @@ static void buffer_append_args(
 static void load_fetch_args(
struct lp_build_tgsi_context * bld_base,
struct lp_build_emit_data * emit_data)
 {
struct si_shader_context *ctx = si_shader_context(bld_base);
struct gallivm_state *gallivm = bld_base->base.gallivm;
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned target = inst->Memory.Texture;
LLVMValueRef rsrc;
 
-   emit_data->dst_type = LLVMVectorType(bld_base->base.elem_type, 4);
+   emit_data->dst_type = ctx->v4f32;
 
if (inst->Src[0].Register.File == TGSI_FILE_BUFFER) {
LLVMBuilderRef builder = gallivm->builder;
LLVMValueRef offset;
LLVMValueRef tmp;
 
rsrc = shader_buffer_fetch_rsrc(ctx, &inst->Src[0]);
 
tmp = lp_build_emit_fetch(bld_base, inst, 1, 0);
-   offset = LLVMBuildBitCast(builder, tmp, 
bld_base->uint_bld.elem_type, "");
+   offset = LLVMBuildBitCast(builder, tmp, ctx->i32, "");
 
buffer_append_args(ctx, emit_data, rsrc, ctx->i32_0,
   offset, false, false);
} else if (inst->Src[0].Register.File == TGSI_FILE_IMAGE) {
LLVMValueRef coords;
 
image_fetch_rsrc(bld_base, &inst->Src[0], false, target, &rsrc);
coords = image_fetch_coords(bld_base, inst, 1);
 
if (target == TGSI_TEXTURE_BUFFER) {
@@ -3522,32 +3522,31 @@ static LLVMValueRef get_memory_ptr(struct 
si_shader_context *ctx,
ptr = LLVMBuildBitCast(builder, ptr, LLVMPointerType(type, addr_space), 
"");
 
return ptr;
 }
 
 static void load_emit_memory(
struct si_shader_context *ctx,
struct lp_build_emit_data *emit_data)
 {
const struct tgsi_full_instruction *inst = emit_data->inst;
-   struct lp_build_context *base = &ctx->bld_base.base;
struct gallivm_state *gallivm = &ctx->gallivm;
LLVMBuilderRef builder = gallivm->builder;
unsigned writemask = inst->Dst[0].Register.WriteMask;
LLVMValueRef channels[4], ptr, derived_ptr, index;
int chan;
 
-   ptr = get_memory_ptr(ctx, inst, base->elem_type, 1);
+   ptr = get_memory_ptr(ctx, inst, ctx->f32, 1);
 
for (chan = 0; chan < 4; ++chan) {
if (!(writemask & (1 << chan))) {
-   channels[chan] = LLVMGetUndef(base->elem_type);
+   channels[chan] = LLVMGetUndef(ctx->f32);
continue;
}
 
index = LLVMConstInt(ctx->i32, chan, 0);
derived_ptr = LLVMBuildGEP(builder, ptr, &index, 1, "");
channels[chan] = LLVMBuildLoad(builder, derived_ptr, "");
}
emit_data->output[emit_data->chan] = lp_build_gather_values(gallivm, 
channels, 4);
 }
 
@@ -3692,21 +3691,21 @@ static void store_fetch_args(
 
memory = tgsi_full_src_register_from_dst(&inst->Dst[0]);
 
if (inst->Dst[0].Register.File == TGSI_FILE_BUFFER) {
LLVMValueRef offset;
LLVMValueRef tmp;
 
rsrc = shader_buffer_fetch_rsrc(ctx, &memory);
 
tmp = lp_build_emit_fetch(bld_base, inst, 0, 0);
-   offset = LLVMBuildBitCast(builder, tmp, 
bld_base->uint_bld.elem_type, "");
+   offset = LLVMBui

[Mesa-dev] [PATCH 3/3] radeonsi: access gallivm through ctx in most places

2017-04-03 Thread Marek Olšák
From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_shader.c   | 116 ++---
 src/gallium/drivers/radeonsi/si_shader_tgsi_alu.c  |   4 +-
 .../drivers/radeonsi/si_shader_tgsi_setup.c|  46 
 3 files changed, 79 insertions(+), 87 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 0200172..29d3dd4 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -296,21 +296,21 @@ get_tcs_out_current_patch_data_offset(struct 
si_shader_context *ctx)
return LLVMBuildAdd(gallivm->builder, patch0_patch_data_offset,
LLVMBuildMul(gallivm->builder, patch_stride,
 rel_patch_id, ""),
"");
 }
 
 static LLVMValueRef get_instance_index_for_fetch(
struct si_shader_context *ctx,
unsigned param_start_instance, unsigned divisor)
 {
-   struct gallivm_state *gallivm = ctx->bld_base.base.gallivm;
+   struct gallivm_state *gallivm = &ctx->gallivm;
 
LLVMValueRef result = LLVMGetParam(ctx->main_fn,
   ctx->param_instance_id);
 
/* The division must be done before START_INSTANCE is added. */
if (divisor > 1)
result = LLVMBuildUDiv(gallivm->builder, result,
LLVMConstInt(ctx->i32, divisor, 0), "");
 
return LLVMBuildAdd(gallivm->builder, result,
@@ -331,22 +331,21 @@ static LLVMValueRef extract_double_to_float(struct 
si_shader_context *ctx,
LLVMValueRef value = LLVMBuildExtractElement(builder, dvec2, index, "");
return LLVMBuildFPTrunc(builder, value, ctx->f32, "");
 }
 
 static void declare_input_vs(
struct si_shader_context *ctx,
unsigned input_index,
const struct tgsi_full_declaration *decl,
LLVMValueRef out[4])
 {
-   struct lp_build_context *base = &ctx->bld_base.base;
-   struct gallivm_state *gallivm = base->gallivm;
+   struct gallivm_state *gallivm = &ctx->gallivm;
 
unsigned chan;
unsigned fix_fetch;
unsigned num_fetches;
unsigned fetch_stride;
 
LLVMValueRef t_list_ptr;
LLVMValueRef t_offset;
LLVMValueRef t_list;
LLVMValueRef vertex_index;
@@ -567,21 +566,21 @@ static LLVMValueRef get_primitive_id(struct 
lp_build_tgsi_context *bld_base,
 }
 
 /**
  * Return the value of tgsi_ind_register for indexing.
  * This is the indirect index with the constant offset added to it.
  */
 static LLVMValueRef get_indirect_index(struct si_shader_context *ctx,
   const struct tgsi_ind_register *ind,
   int rel_index)
 {
-   struct gallivm_state *gallivm = ctx->bld_base.base.gallivm;
+   struct gallivm_state *gallivm = &ctx->gallivm;
LLVMValueRef result;
 
result = ctx->addrs[ind->Index][ind->Swizzle];
result = LLVMBuildLoad(gallivm->builder, result, "");
result = LLVMBuildAdd(gallivm->builder, result,
  LLVMConstInt(ctx->i32, rel_index, 0), "");
return result;
 }
 
 /**
@@ -607,21 +606,21 @@ static LLVMValueRef get_bounded_indirect_index(struct 
si_shader_context *ctx,
 
 /**
  * Calculate a dword address given an input or output register and a stride.
  */
 static LLVMValueRef get_dw_address(struct si_shader_context *ctx,
   const struct tgsi_full_dst_register *dst,
   const struct tgsi_full_src_register *src,
   LLVMValueRef vertex_dw_stride,
   LLVMValueRef base_addr)
 {
-   struct gallivm_state *gallivm = ctx->bld_base.base.gallivm;
+   struct gallivm_state *gallivm = &ctx->gallivm;
struct tgsi_shader_info *info = &ctx->shader->selector->info;
ubyte *name, *index, *array_first;
int first, param;
struct tgsi_full_dst_register reg;
 
/* Set the register description. The address computation is the same
 * for sources and destinations. */
if (src) {
reg.Register.File = src->Register.File;
reg.Register.Index = src->Register.Index;
@@ -706,21 +705,21 @@ static LLVMValueRef get_dw_address(struct 
si_shader_context *ctx,
  * - per patch attribute 0 of patch 1
  *   ...
  *
  * Note that every attribute has 4 components.
  */
 static LLVMValueRef get_tcs_tes_buffer_address(struct si_shader_context *ctx,
   LLVMValueRef rel_patch_id,
LLVMValueRef vertex_index,
LLVMValueRef param_index)
 {
-   struct gallivm_state *gallivm = ctx->bld_base.base.gallivm;
+   struct gallivm_state *gallivm = &ctx->gallivm;
LLVMValueRe

Re: [Mesa-dev] [PATCH] st/omx/dec: Properly undefine DEBUG macro

2017-04-03 Thread Shaleen
Yes, that is correct. I wasn't thinking clearly.

On Mon, Apr 3, 2017 at 1:57 PM Christian König 
wrote:

> Am 03.04.2017 um 06:35 schrieb Shaleen Jain:
> > ---
> >   src/gallium/state_trackers/omx/vid_dec.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/src/gallium/state_trackers/omx/vid_dec.c
> b/src/gallium/state_trackers/omx/vid_dec.c
> > index 9a6efb8e28..94664eba04 100644
> > --- a/src/gallium/state_trackers/omx/vid_dec.c
> > +++ b/src/gallium/state_trackers/omx/vid_dec.c
> > @@ -37,7 +37,7 @@
> >   #include 
> >
> >   /* bellagio defines a DEBUG macro that we don't want */
> > -#ifndef DEBUG
> > +#ifdef DEBUG
>
> NAK, the logic locked correct as it was before.
>
> Why do you want to invert it?
>
> Regards,
> Christian.
>
> >   #include 
> >   #undef DEBUG
> >   #else
>
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] targets/va: export radeon winsys_create functions

2017-04-03 Thread Marek Olšák
From: Marek Olšák 

It silences the following radeonsi LLVM warning due to a previous
commit adding an LLVM workaround:
  "mesa: for the -simplifycfg-sink-common option: may only occur zero or one
   times!"

Cc: 17.0 
---
 src/gallium/targets/omx/omx.sym  | 5 +
 src/gallium/targets/pipe-loader/pipe.sym | 5 +
 src/gallium/targets/va/va.sym| 5 +
 3 files changed, 15 insertions(+)

diff --git a/src/gallium/targets/omx/omx.sym b/src/gallium/targets/omx/omx.sym
index af22aed..e8a2876 100644
--- a/src/gallium/targets/omx/omx.sym
+++ b/src/gallium/targets/omx/omx.sym
@@ -1,6 +1,11 @@
 {
global:
omx_component_library_Setup;
+
+   # Workaround for an LLVM warning with -simplifycfg-sink-common
+   # due to LLVM being initialized multiple times.
+   radeon_drm_winsys_create;
+   amdgpu_winsys_create;
local:
*;
 };
diff --git a/src/gallium/targets/pipe-loader/pipe.sym 
b/src/gallium/targets/pipe-loader/pipe.sym
index b2fa619..605cb83 100644
--- a/src/gallium/targets/pipe-loader/pipe.sym
+++ b/src/gallium/targets/pipe-loader/pipe.sym
@@ -1,7 +1,12 @@
 {
global:
driver_descriptor;
swrast_driver_descriptor;
+
+   # Workaround for an LLVM warning with -simplifycfg-sink-common
+   # due to LLVM being initialized multiple times.
+   radeon_drm_winsys_create;
+   amdgpu_winsys_create;
local:
*;
 };
diff --git a/src/gallium/targets/va/va.sym b/src/gallium/targets/va/va.sym
index c925b2e..917c3d3 100644
--- a/src/gallium/targets/va/va.sym
+++ b/src/gallium/targets/va/va.sym
@@ -1,6 +1,11 @@
 {
global:
__vaDriverInit_*_*;
+
+   # Workaround for an LLVM warning with -simplifycfg-sink-common
+   # due to LLVM being initialized multiple times.
+   radeon_drm_winsys_create;
+   amdgpu_winsys_create;
local:
*;
 };
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] targets/va: export radeon winsys_create functions

2017-04-03 Thread Marek Olšák
I've changed the commit message to "targets: export radeon
winsys_create functions to silence LLVM warning".

Marek

On Mon, Apr 3, 2017 at 12:08 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> It silences the following radeonsi LLVM warning due to a previous
> commit adding an LLVM workaround:
>   "mesa: for the -simplifycfg-sink-common option: may only occur zero or one
>times!"
>
> Cc: 17.0 
> ---
>  src/gallium/targets/omx/omx.sym  | 5 +
>  src/gallium/targets/pipe-loader/pipe.sym | 5 +
>  src/gallium/targets/va/va.sym| 5 +
>  3 files changed, 15 insertions(+)
>
> diff --git a/src/gallium/targets/omx/omx.sym b/src/gallium/targets/omx/omx.sym
> index af22aed..e8a2876 100644
> --- a/src/gallium/targets/omx/omx.sym
> +++ b/src/gallium/targets/omx/omx.sym
> @@ -1,6 +1,11 @@
>  {
> global:
> omx_component_library_Setup;
> +
> +   # Workaround for an LLVM warning with -simplifycfg-sink-common
> +   # due to LLVM being initialized multiple times.
> +   radeon_drm_winsys_create;
> +   amdgpu_winsys_create;
> local:
> *;
>  };
> diff --git a/src/gallium/targets/pipe-loader/pipe.sym 
> b/src/gallium/targets/pipe-loader/pipe.sym
> index b2fa619..605cb83 100644
> --- a/src/gallium/targets/pipe-loader/pipe.sym
> +++ b/src/gallium/targets/pipe-loader/pipe.sym
> @@ -1,7 +1,12 @@
>  {
> global:
> driver_descriptor;
> swrast_driver_descriptor;
> +
> +   # Workaround for an LLVM warning with -simplifycfg-sink-common
> +   # due to LLVM being initialized multiple times.
> +   radeon_drm_winsys_create;
> +   amdgpu_winsys_create;
> local:
> *;
>  };
> diff --git a/src/gallium/targets/va/va.sym b/src/gallium/targets/va/va.sym
> index c925b2e..917c3d3 100644
> --- a/src/gallium/targets/va/va.sym
> +++ b/src/gallium/targets/va/va.sym
> @@ -1,6 +1,11 @@
>  {
> global:
> __vaDriverInit_*_*;
> +
> +   # Workaround for an LLVM warning with -simplifycfg-sink-common
> +   # due to LLVM being initialized multiple times.
> +   radeon_drm_winsys_create;
> +   amdgpu_winsys_create;
> local:
> *;
>  };
> --
> 2.7.4
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/4] r600g: check rasterizer primitive states like in radeonsi

2017-04-03 Thread Marek Olšák
For the series:

Reviewed-by: Marek Olšák 

Marek

On Sun, Apr 2, 2017 at 7:33 PM, Constantine Kharlamov
 wrote:
> Specifically, non-line primitives skipped, and defaulting to reset on
> each packet.
>
> The skip of non-line primitives saves ≈110 resetting of
> PA_SC_LINE_STIPPLE register per frame in Kane&Lynch2.
>
> Signed-off-by: Constantine Kharlamov 
> ---
>  src/gallium/drivers/r600/r600_state_common.c | 21 +
>  1 file changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/src/gallium/drivers/r600/r600_state_common.c 
> b/src/gallium/drivers/r600/r600_state_common.c
> index e4d1660933..4de2a7344b 100644
> --- a/src/gallium/drivers/r600/r600_state_common.c
> +++ b/src/gallium/drivers/r600/r600_state_common.c
> @@ -1670,19 +1670,24 @@ void r600_emit_clip_misc_state(struct r600_context 
> *rctx, struct r600_atom *atom
>  static inline void r600_emit_rasterizer_prim_state(struct r600_context *rctx)
>  {
> struct radeon_winsys_cs *cs = rctx->b.gfx.cs;
> -   unsigned ls_mask = 0;
> enum pipe_prim_type rast_prim = rctx->current_rast_prim;
> -   if (rast_prim == rctx->last_rast_prim)
> +
> +   /* Skip this if not rendering lines. */
> +   if (rast_prim != PIPE_PRIM_LINES &&
> +   rast_prim != PIPE_PRIM_LINE_LOOP &&
> +   rast_prim != PIPE_PRIM_LINE_STRIP &&
> +   rast_prim != PIPE_PRIM_LINES_ADJACENCY &&
> +   rast_prim != PIPE_PRIM_LINE_STRIP_ADJACENCY)
> return;
>
> -   if (rast_prim == PIPE_PRIM_LINES)
> -   ls_mask = 1;
> -   else if (rast_prim == PIPE_PRIM_LINE_STRIP ||
> -rast_prim == PIPE_PRIM_LINE_LOOP)
> -   ls_mask = 2;
> +   if (rast_prim == rctx->last_rast_prim)
> +   return;
>
> +   /* For lines, reset the stipple pattern at each primitive. Otherwise,
> +* reset the stipple pattern at each packet (line strips, line loops).
> +*/
> radeon_set_context_reg(cs, R_028A0C_PA_SC_LINE_STIPPLE,
> -  S_028A0C_AUTO_RESET_CNTL(ls_mask) |
> +  S_028A0C_AUTO_RESET_CNTL(rast_prim == 
> PIPE_PRIM_LINES ? 1 : 2) |
>(rctx->rasterizer ? 
> rctx->rasterizer->pa_sc_line_stipple : 0));
> rctx->last_rast_prim = rast_prim;
>  }
> --
> 2.12.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] targets/va: export radeon winsys_create functions

2017-04-03 Thread Christian König

Am 03.04.2017 um 12:08 schrieb Marek Olšák:

From: Marek Olšák 

It silences the following radeonsi LLVM warning due to a previous
commit adding an LLVM workaround:
   "mesa: for the -simplifycfg-sink-common option: may only occur zero or one
times!"

Cc: 17.0 


Reviewed-by: Christian König 


---
  src/gallium/targets/omx/omx.sym  | 5 +
  src/gallium/targets/pipe-loader/pipe.sym | 5 +
  src/gallium/targets/va/va.sym| 5 +
  3 files changed, 15 insertions(+)

diff --git a/src/gallium/targets/omx/omx.sym b/src/gallium/targets/omx/omx.sym
index af22aed..e8a2876 100644
--- a/src/gallium/targets/omx/omx.sym
+++ b/src/gallium/targets/omx/omx.sym
@@ -1,6 +1,11 @@
  {
global:
omx_component_library_Setup;
+
+   # Workaround for an LLVM warning with -simplifycfg-sink-common
+   # due to LLVM being initialized multiple times.
+   radeon_drm_winsys_create;
+   amdgpu_winsys_create;
local:
*;
  };
diff --git a/src/gallium/targets/pipe-loader/pipe.sym 
b/src/gallium/targets/pipe-loader/pipe.sym
index b2fa619..605cb83 100644
--- a/src/gallium/targets/pipe-loader/pipe.sym
+++ b/src/gallium/targets/pipe-loader/pipe.sym
@@ -1,7 +1,12 @@
  {
global:
driver_descriptor;
swrast_driver_descriptor;
+
+   # Workaround for an LLVM warning with -simplifycfg-sink-common
+   # due to LLVM being initialized multiple times.
+   radeon_drm_winsys_create;
+   amdgpu_winsys_create;
local:
*;
  };
diff --git a/src/gallium/targets/va/va.sym b/src/gallium/targets/va/va.sym
index c925b2e..917c3d3 100644
--- a/src/gallium/targets/va/va.sym
+++ b/src/gallium/targets/va/va.sym
@@ -1,6 +1,11 @@
  {
global:
__vaDriverInit_*_*;
+
+   # Workaround for an LLVM warning with -simplifycfg-sink-common
+   # due to LLVM being initialized multiple times.
+   radeon_drm_winsys_create;
+   amdgpu_winsys_create;
local:
*;
  };



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 100402] [d3d9 bisected] Diablo III fails to start after commit 0630d3600bfb770cf3b23761c45b3add3b277c6b

2017-04-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100402

Vedran Miletić  changed:

   What|Removed |Added

 CC||ved...@miletic.net

-- 
You are receiving this mail because:
You are the assignee for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] docs: mark GL_ARB_vertex_attrib_64bit and OpenGL 4.2 as supported by i965/gen7+

2017-04-03 Thread Juan A. Suarez Romero
v2 (Andreas Boll):
- Mark GL 4.1 as supported by i965/gen7+
- Mark GL_ARB_shader_precision as supported by i965/gen7+
- Update release notes
---
 docs/features.txt | 4 ++--
 docs/relnotes/17.1.0.html | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/docs/features.txt b/docs/features.txt
index 5c22acf..e71b197 100644
--- a/docs/features.txt
+++ b/docs/features.txt
@@ -136,12 +136,12 @@ GL 4.0, GLSL 4.00 --- all DONE: i965/gen7+, nvc0, r600, 
radeonsi
   GL_ARB_transform_feedback3DONE (i965/gen7+, 
llvmpipe, softpipe, swr)
 
 
-GL 4.1, GLSL 4.10 --- all DONE: i965/hsw+, nvc0, r600, radeonsi
+GL 4.1, GLSL 4.10 --- all DONE: i965/gen7+, nvc0, r600, radeonsi
 
   GL_ARB_ES2_compatibility  DONE (i965, nv50, 
llvmpipe, softpipe, swr)
   GL_ARB_get_program_binary DONE (0 binary formats)
   GL_ARB_separate_shader_objectsDONE (all drivers)
-  GL_ARB_shader_precision   DONE (i965/hsw+, all 
drivers that support GLSL 4.10)
+  GL_ARB_shader_precision   DONE (i965/gen7+, all 
drivers that support GLSL 4.10)
   GL_ARB_vertex_attrib_64bitDONE (i965/gen7+, 
llvmpipe, softpipe)
   GL_ARB_viewport_array DONE (i965, nv50, 
llvmpipe, softpipe)
 
diff --git a/docs/relnotes/17.1.0.html b/docs/relnotes/17.1.0.html
index ada1e38..dbf8379 100644
--- a/docs/relnotes/17.1.0.html
+++ b/docs/relnotes/17.1.0.html
@@ -44,9 +44,12 @@ Note: some of the new features are only available with 
certain drivers.
 
 
 
+OpenGL 4.2 on i965/ivb
 GL_ARB_gpu_shader_int64 on i965/gen8+, nvc0, radeonsi, softpipe, 
llvmpipe
+GL_ARB_shader_precision on i965/ivb
 GL_ARB_transform_feedback2 on i965/gen6
 GL_ARB_transform_feedback_overflow_query on i965/gen6+
+GL_ARB_vertex_attrib_64bit on i965/ivb
 Geometry shaders enabled on swr
 
 
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory

2017-04-03 Thread Juan A. Suarez Romero
On Wed, 2017-03-29 at 12:06 -0700, Jason Ekstrand wrote:
> Looking over the patch, I think I've convinced myself that it's correct.  (I 
> honestly wasn't expecting to come to that conclusion without more iteration.) 
>  That said, this raises some interesting questions.  I added Kristian to the 
> Cc in case he has any input.
> 
>  1. Should we do powers of two or linear.  I'm still a fan of powers of two.

In which specific part do you mean? In free block lists? or in the
allocator_new? 

> 
>  2. Should block pools even have a block size at all? We could just make 
> every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB and 
> then make the block size part of the state pool or stream that's allocating 
> from it.  At the moment, I like this idea, but I've given it very little 
> thought.
> 
So IIUC, the idea would be the block pool is just a flat chunk of
memory, where we later fetch blocks of memory from, as requested by
applications. Is that correct?



>  3. If we go with the idea in 2. should we still call it block_pool?  I think 
> we can keep the name but it doesn't it as well as it once did.
> 
> Thanks for working on this!  I'm sorry it's taken so long to respond.  Every 
> time I've looked at it, my brain hasn't been in the right state to think 
> about lock-free code. :-/
> 
> On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero  
> wrote:
> > Current Anv allocator assign memory in terms of a fixed block size.
> > 
> > But there can be cases where this block is not enough for a memory
> > request, and thus several blocks must be assigned in a row.
> > 
> > This commit adds support for specifying how many blocks of memory must
> > be assigned.
> > 
> > This fixes a number dEQP-VK.pipeline.render_to_image.* tests that crash.
> > 
> > v2: lock-free free-list is not handled correctly (Jason)
> > ---
> >  src/intel/vulkan/anv_allocator.c   | 81 
> > +++---
> >  src/intel/vulkan/anv_batch_chain.c |  4 +-
> >  src/intel/vulkan/anv_private.h     |  7 +++-
> >  3 files changed, 66 insertions(+), 26 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/anv_allocator.c 
> > b/src/intel/vulkan/anv_allocator.c
> > index 45c663b..3924551 100644
> > --- a/src/intel/vulkan/anv_allocator.c
> > +++ b/src/intel/vulkan/anv_allocator.c
> > @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool,
> >     pool->device = device;
> >     anv_bo_init(&pool->bo, 0, 0);
> >     pool->block_size = block_size;
> > -   pool->free_list = ANV_FREE_LIST_EMPTY;
> > +   for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++)
> > +      pool->free_list[i] = ANV_FREE_LIST_EMPTY;
> >     pool->back_free_list = ANV_FREE_LIST_EMPTY;
> > 
> >     pool->fd = memfd_create("block pool", MFD_CLOEXEC);
> > @@ -500,30 +501,35 @@ fail:
> > 
> >  static uint32_t
> >  anv_block_pool_alloc_new(struct anv_block_pool *pool,
> > -                         struct anv_block_state *pool_state)
> > +                         struct anv_block_state *pool_state,
> > +                         uint32_t n_blocks)
> 
> Maybe have this take a size rather than n_blocks?  It's only ever called by 
> stuff in the block pool so the caller can do the multiplication.  It would 
> certainly make some of the math below easier.
>  
> >  {
> >     struct anv_block_state state, old, new;
> > 
> >     while (1) {
> > -      state.u64 = __sync_fetch_and_add(&pool_state->u64, pool->block_size);
> > -      if (state.next < state.end) {
> > +      state.u64 = __sync_fetch_and_add(&pool_state->u64, n_blocks * 
> > pool->block_size);
> > +      if (state.next > state.end) {
> > +         futex_wait(&pool_state->end, state.end);
> > +         continue;
> > +      } else if ((state.next + (n_blocks - 1) * pool->block_size) < 
> > state.end) {
> 
> First off, please keep the if's in the same order unless we have a reason to 
> re-arrange them.  It would make this way easier to review. :-)
> 
> Second, I think this would be much easier to read as:
> 
> if (state.next + size <= state.end) {
>    /* Success */
> } else if (state.next <= state.end) {
>    /* Our block is the one that crosses the line */
> } else {
>    /* Wait like everyone else */
> }
>  
> >           assert(pool->map);
> >           return state.next;
> > -      } else if (state.next == state.end) {
> > -         /* We allocated the first block outside the pool, we have to grow 
> > it.
> > -          * pool_state->next acts a mutex: threads who try to allocate now 
> > will
> > -          * get block indexes above the current limit and hit futex_wait
> > -          * below. */
> > -         new.next = state.next + pool->block_size;
> > +      } else {
> > +         /* We allocated the firsts blocks outside the pool, we have to 
> > grow
> > +          * it. pool_state->next acts a mutex: threads who try to allocate
> > +          * now will get block indexes above the current limit and hit
> > +          * futex_wait below.
> > +          */
> > +         new.n

Re: [Mesa-dev] [PATCH 9/9] gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes

2017-04-03 Thread Brian Paul
I need to test this series on Windows and with MinGW first.  I'm worried
about enums with bitfields.

-Brian


On Mon, Apr 3, 2017 at 2:46 AM, Nicolai Hähnle  wrote:

> Some of these are a bit subtle, but I think they're fine. Series is:
>
> Reviewed-by: Nicolai Hähnle 
>
>
> On 02.04.2017 20:00, Marek Olšák wrote:
>
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/auxiliary/indices/u_primconvert.c | 10 --
>>  src/gallium/include/pipe/p_state.h|  2 +-
>>  2 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/indices/u_primconvert.c
>> b/src/gallium/auxiliary/indices/u_primconvert.c
>> index 2bdfade..1ffca4b 100644
>> --- a/src/gallium/auxiliary/indices/u_primconvert.c
>> +++ b/src/gallium/auxiliary/indices/u_primconvert.c
>> @@ -121,39 +121,45 @@ util_primconvert_draw_vbo(struct
>> primconvert_context *pc,
>> util_draw_init_info(&new_info);
>> new_info.indexed = true;
>> new_info.min_index = info->min_index;
>> new_info.max_index = info->max_index;
>> new_info.index_bias = info->index_bias;
>> new_info.start_instance = info->start_instance;
>> new_info.instance_count = info->instance_count;
>> new_info.primitive_restart = info->primitive_restart;
>> new_info.restart_index = info->restart_index;
>> if (info->indexed) {
>> +  enum pipe_prim_type mode = 0;
>> +
>>u_index_translator(pc->primtypes_mask,
>>   info->mode, pc->saved_ib.index_size,
>> info->count,
>>   pc->api_pv, pc->api_pv,
>>   info->primitive_restart ? PR_ENABLE :
>> PR_DISABLE,
>> - &new_info.mode, &new_ib.index_size,
>> &new_info.count,
>> + &mode, &new_ib.index_size, &new_info.count,
>>   &trans_func);
>> +  new_info.mode = mode;
>>src = ib->user_buffer;
>>if (!src) {
>>   src = pipe_buffer_map(pc->pipe, ib->buffer,
>> PIPE_TRANSFER_READ, &src_transfer);
>>}
>>src = (const uint8_t *)src + ib->offset;
>> }
>> else {
>> +  enum pipe_prim_type mode = 0;
>> +
>>u_index_generator(pc->primtypes_mask,
>>  info->mode, info->start, info->count,
>>  pc->api_pv, pc->api_pv,
>> -&new_info.mode, &new_ib.index_size,
>> &new_info.count,
>> +&mode, &new_ib.index_size, &new_info.count,
>>  &gen_func);
>> +  new_info.mode = mode;
>> }
>>
>> u_upload_alloc(pc->pipe->stream_uploader, 0, new_ib.index_size *
>> new_info.count, 4,
>>&new_ib.offset, &new_ib.buffer, &dst);
>>
>> if (info->indexed) {
>>trans_func(src, info->start, info->count, new_info.count,
>> info->restart_index, dst);
>> }
>> else {
>>gen_func(info->start, new_info.count, dst);
>> diff --git a/src/gallium/include/pipe/p_state.h
>> b/src/gallium/include/pipe/p_state.h
>> index d68a4d4..eeabf8b 100644
>> --- a/src/gallium/include/pipe/p_state.h
>> +++ b/src/gallium/include/pipe/p_state.h
>> @@ -634,21 +634,21 @@ struct pipe_index_buffer
>> const void *user_buffer;  /**< pointer to a user buffer if buffer ==
>> NULL */
>>  };
>>
>>
>>  /**
>>   * Information to describe a draw_vbo call.
>>   */
>>  struct pipe_draw_info
>>  {
>> boolean indexed;  /**< use index buffer */
>> -   enum pipe_prim_type mode;  /**< the mode of the primitive */
>> +   enum pipe_prim_type mode:8;  /**< the mode of the primitive */
>> boolean primitive_restart;
>> ubyte vertices_per_patch; /**< the number of vertices per patch */
>>
>> unsigned start;  /**< the index of the first vertex */
>> unsigned count;  /**< number of vertices */
>>
>> unsigned start_instance; /**< first instance id */
>> unsigned instance_count; /**< number of instances */
>>
>> unsigned drawid; /**< id of this draw in a multidraw */
>>
>>
>
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 5:02 AM, Juan A. Suarez Romero 
wrote:

> On Wed, 2017-03-29 at 12:06 -0700, Jason Ekstrand wrote:
> > Looking over the patch, I think I've convinced myself that it's
> correct.  (I honestly wasn't expecting to come to that conclusion without
> more iteration.)  That said, this raises some interesting questions.  I
> added Kristian to the Cc in case he has any input.
> >
> >  1. Should we do powers of two or linear.  I'm still a fan of powers of
> two.
>
> In which specific part do you mean? In free block lists? or in the
> allocator_new?
>

In the state pool, we just round all allocation sizes up to the nearest
power-of-two, and then the index into the array of free lists isn't "size -
base", it's "ilog2(size) - base".


> >
> >  2. Should block pools even have a block size at all? We could just make
> every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB
> and then make the block size part of the state pool or stream that's
> allocating from it.  At the moment, I like this idea, but I've given it
> very little thought.
> >
> So IIUC, the idea would be the block pool is just a flat chunk of
> memory, where we later fetch blocks of memory from, as requested by
> applications. Is that correct?
>

Sorry, but this patch gave me some sudden revelations and things are still
in the process of reforming in my brain.  In the past, we assumed a
two-layered allocation strategy where we had a block pool which was the
base and then the state pool and state stream allocators sat on top of it.
Originally, the state pool allocator was just for a single size as well.

Now that the block pool is going to be capable of allocating multiple
sizes, the whole mental model of the separation falls apart.  The new
future that I see is one where the block pool and state pool aren't
separate.  Instead, we have a single thing which I'll call state_pool (we
have to pick one of the names) which lets you allocate a state of any size
from 32B to 2MB.  The state stream will then allocate off of a state_pool
instead of a block_pool since they're now the same.  For smaller states, we
still want to allocate reasonably large chunks (probably 4K) so that we
ensure that things are nicely aligned.  I think I've got a pretty good idea
of how it should work at this point and can write more if you'd like.

Before we dive in and do a pile of refactoring, I think this patch is
pretty much good-to-go assuming we fix the power-of-two thing and it fixes
a bug so let's focus there.  Are you  interested in doing the refactoring?
If not, that's ok, I'm happy to do it and it wouldn't take me long now that
I've gotten a chance to think about it.  If you are interested, go for it!


> >  3. If we go with the idea in 2. should we still call it block_pool?  I
> think we can keep the name but it doesn't it as well as it once did.
> >
> > Thanks for working on this!  I'm sorry it's taken so long to respond.
> Every time I've looked at it, my brain hasn't been in the right state to
> think about lock-free code. :-/
> >
> > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero <
> jasua...@igalia.com> wrote:
> > > Current Anv allocator assign memory in terms of a fixed block size.
> > >
> > > But there can be cases where this block is not enough for a memory
> > > request, and thus several blocks must be assigned in a row.
> > >
> > > This commit adds support for specifying how many blocks of memory must
> > > be assigned.
> > >
> > > This fixes a number dEQP-VK.pipeline.render_to_image.* tests that
> crash.
> > >
> > > v2: lock-free free-list is not handled correctly (Jason)
> > > ---
> > >  src/intel/vulkan/anv_allocator.c   | 81
> +++---
> > >  src/intel/vulkan/anv_batch_chain.c |  4 +-
> > >  src/intel/vulkan/anv_private.h |  7 +++-
> > >  3 files changed, 66 insertions(+), 26 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_allocator.c b/src/intel/vulkan/anv_
> allocator.c
> > > index 45c663b..3924551 100644
> > > --- a/src/intel/vulkan/anv_allocator.c
> > > +++ b/src/intel/vulkan/anv_allocator.c
> > > @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool,
> > > pool->device = device;
> > > anv_bo_init(&pool->bo, 0, 0);
> > > pool->block_size = block_size;
> > > -   pool->free_list = ANV_FREE_LIST_EMPTY;
> > > +   for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++)
> > > +  pool->free_list[i] = ANV_FREE_LIST_EMPTY;
> > > pool->back_free_list = ANV_FREE_LIST_EMPTY;
> > >
> > > pool->fd = memfd_create("block pool", MFD_CLOEXEC);
> > > @@ -500,30 +501,35 @@ fail:
> > >
> > >  static uint32_t
> > >  anv_block_pool_alloc_new(struct anv_block_pool *pool,
> > > - struct anv_block_state *pool_state)
> > > + struct anv_block_state *pool_state,
> > > + uint32_t n_blocks)
> >
> > Maybe have this take a size rather than n_blocks?  It's only ever called
> by stuff in the block pool 

Re: [Mesa-dev] [PATCH 4/5] vulkan/wsi/wayland: Pass damage through to the compositor

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 1:54 AM, Daniel Stone  wrote:

> Hi Jason,
>
> On 1 April 2017 at 06:37, Jason Ekstrand  wrote:
> > @@ -594,7 +595,19 @@ wsi_wl_swapchain_queue_present(struct
> wsi_swapchain *wsi_chain,
> >
> > assert(image_index < chain->base.image_count);
> > wl_surface_attach(chain->surface, chain->images[image_index].buffer,
> 0, 0);
> > -   wl_surface_damage(chain->surface, 0, 0, INT32_MAX, INT32_MAX);
> > +
> > +   if (chain->surface_version >= 4 && damage &&
> > +   damage->pRectangles && damage->rectangleCount > 0) {
> > +  for (unsigned i = 0; i < damage->rectangleCount; i++) {
> > + const VkRectLayerKHR *rect = &damage->pRectangles[i];
> > + assert(rect->layer == 0);
> > + wl_surface_damage(chain->surface,
> > +   rect->offset.x, rect->offset.y,
> > +   rect->extent.width, rect->extent.height);
>
> Very scrupulous version check, but you forgot to actually use
> wl_surface_damage_buffer. ;)
>

Gah!!!  I'll get that fixed.  Assuming that change, now that you've looked
at it all, would you mind reviewing at least this patch?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/pipeline: do not disable depth writes if depth testing is disabled

2017-04-03 Thread Jason Ekstrand
Sorry.  I thought I had.

Reviewed-by: Jason Ekstrand 

On Sun, Apr 2, 2017 at 11:02 PM, Iago Toral  wrote:

> Can anyone review this one?
>
> On Wed, 2017-03-29 at 08:58 +0200, Iago Toral Quiroga wrote:
> > Writing and testing are two different things and they can be set
> > separately
> > by the application. If an application wants to record depth data
> > without
> > caring for the depth test, it can enable depth test and set the depth
> > compare function to VK_COMPARE_OP_ALWAYS or it can simply disable
> > depth testing altogether. Some CTS tests do the latter.
> >
> > Fixes all multisample tests with depth-only formats in:
> > dEQP-VK.renderpass.multisample.*
> > ---
> >  src/intel/vulkan/genX_pipeline.c | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/src/intel/vulkan/genX_pipeline.c
> > b/src/intel/vulkan/genX_pipeline.c
> > index 85a9e4f..dc393cb 100644
> > --- a/src/intel/vulkan/genX_pipeline.c
> > +++ b/src/intel/vulkan/genX_pipeline.c
> > @@ -728,10 +728,6 @@
> > sanitize_ds_state(VkPipelineDepthStencilStateCreateInfo *state,
> >  {
> > *stencilWriteEnable = state->stencilTestEnable;
> >
> > -   /* If the depth test is disabled, we won't be writing anything.
> > */
> > -   if (!state->depthTestEnable)
> > -  state->depthWriteEnable = false;
> > -
> > /* The Vulkan spec requires that if either depth or stencil is
> > not present,
> >  * the pipeline is to act as if the test silently passes.
> >  */
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/5] vulkan/wsi/wayland: Pass damage through to the compositor

2017-04-03 Thread Daniel Stone
Hi,

On 3 April 2017 at 15:45, Jason Ekstrand  wrote:
> On Mon, Apr 3, 2017 at 1:54 AM, Daniel Stone  wrote:
>> Very scrupulous version check, but you forgot to actually use
>> wl_surface_damage_buffer. ;)
>
> Gah!!!  I'll get that fixed.  Assuming that change, now that you've looked
> at it all, would you mind reviewing at least this patch?

Yeah, with a trivial change to use surface_buffer, the series is:
Reviewed-by: Daniel Stone 

Cheers,
Daniel
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes

2017-04-03 Thread Brian Paul

On 04/02/2017 12:00 PM, Marek Olšák wrote:

From: Marek Olšák 

Also:

pipe_transfer: 48 -> 40 bytes.
pipe_blit_info = 176 -> 160 bytes.
---
  src/gallium/include/pipe/p_state.h | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/include/pipe/p_state.h 
b/src/gallium/include/pipe/p_state.h
index 392bb8b..6a147ef 100644
--- a/src/gallium/include/pipe/p_state.h
+++ b/src/gallium/include/pipe/p_state.h
@@ -472,25 +472,25 @@ struct pipe_image_view
 } u;
  };


  /**
   * Subregion of 1D/2D/3D image resource.
   */
  struct pipe_box
  {
 int x;
-   int y;
-   int z;
+   int16_t y;
+   int16_t z;
 int width;
-   int height;
-   int depth;
+   int16_t height;
+   int16_t depth;


I think a comment explaining why x/width are int but y/z/height/depth 
are int16_t (texture buffer objects, right?) would be good.  Otherwise, 
someone's going to wonder about that and maybe submit a patch to change it.


Same thing for the pipe_resource change.

I tested the series on Windows/MSVC and MinGW and didn't see any problems.

With the above comments, Reviewed-by: Brian Paul 

BTW, some years ago I looked at using bitfields in gl_texture_image and 
gl_texture_object and it looked like a substantial size reduction was 
possible.  Maybe you or someone else would like to look into that.


-Brian



  };


  /**
   * A memory object/resource such as a vertex buffer or texture.
   */
  struct pipe_resource
  {
 struct pipe_reference reference;
 struct pipe_screen *screen; /**< screen that this texture belongs to */



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes

2017-04-03 Thread Alex Deucher
On Sun, Apr 2, 2017 at 2:00 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Also:
>
> pipe_transfer: 48 -> 40 bytes.
> pipe_blit_info = 176 -> 160 bytes.
> ---
>  src/gallium/include/pipe/p_state.h | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/include/pipe/p_state.h 
> b/src/gallium/include/pipe/p_state.h
> index 392bb8b..6a147ef 100644
> --- a/src/gallium/include/pipe/p_state.h
> +++ b/src/gallium/include/pipe/p_state.h
> @@ -472,25 +472,25 @@ struct pipe_image_view
> } u;
>  };
>
>
>  /**
>   * Subregion of 1D/2D/3D image resource.
>   */
>  struct pipe_box
>  {
> int x;
> -   int y;
> -   int z;
> +   int16_t y;
> +   int16_t z;
> int width;
> -   int height;
> -   int depth;
> +   int16_t height;
> +   int16_t depth;
>  };

Not specific to this patch per se, but will this cause any issues in
the future as max surface sizes supported by hw increase?  I feel like
we'll need to change this back at some point.

Alex

>
>
>  /**
>   * A memory object/resource such as a vertex buffer or texture.
>   */
>  struct pipe_resource
>  {
> struct pipe_reference reference;
> struct pipe_screen *screen; /**< screen that this texture belongs to */
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/pipeline: do not disable depth writes if depth testing is disabled

2017-04-03 Thread Nanley Chery
On Mon, Apr 03, 2017 at 08:02:54AM +0200, Iago Toral wrote:
> Can anyone review this one?
> 
> On Wed, 2017-03-29 at 08:58 +0200, Iago Toral Quiroga wrote:
> > Writing and testing are two different things and they can be set
> > separately
> > by the application. If an application wants to record depth data
> > without
> > caring for the depth test, it can enable depth test and set the depth
> > compare function to VK_COMPARE_OP_ALWAYS or it can simply disable
> > depth testing altogether. Some CTS tests do the latter.
> > 

Doesn't disabling the depth test prevent any updates to the depth
buffer?

Section 25.10 Depth Test from the Vulkan spec says:

   When disabled, the depth comparison and subsequent possible updates
   to the value of the depth component of the depth/stencil attachment
   are bypassed and the fragment is passed to the next operation.

-Nanley

> > Fixes all multisample tests with depth-only formats in:
> > dEQP-VK.renderpass.multisample.*
> > ---
> >  src/intel/vulkan/genX_pipeline.c | 4 
> >  1 file changed, 4 deletions(-)
> > 
> > diff --git a/src/intel/vulkan/genX_pipeline.c
> > b/src/intel/vulkan/genX_pipeline.c
> > index 85a9e4f..dc393cb 100644
> > --- a/src/intel/vulkan/genX_pipeline.c
> > +++ b/src/intel/vulkan/genX_pipeline.c
> > @@ -728,10 +728,6 @@
> > sanitize_ds_state(VkPipelineDepthStencilStateCreateInfo *state,
> >  {
> > *stencilWriteEnable = state->stencilTestEnable;
> >  
> > -   /* If the depth test is disabled, we won't be writing anything.
> > */
> > -   if (!state->depthTestEnable)
> > -  state->depthWriteEnable = false;
> > -
> > /* The Vulkan spec requires that if either depth or stencil is
> > not present,
> >  * the pipeline is to act as if the test silently passes.
> >  */
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] nv50/ir: run some passes multiple times

2017-04-03 Thread Karol Herbst
From: Karol Herbst 

With the shader cache, compilation time matters less.

As a side effect we can write more optimizations to produce better optimized
code.

total instructions in shared programs : 3931743 -> 3917512 (-0.36%)
total gprs used in shared programs: 481460 -> 481680 (0.05%)
total local used in shared programs   : 27481 -> 26761 (-2.62%)
total bytes used in shared programs   : 36032672 -> 35902648 (-0.36%)

localgpr   inst  bytes
helped  48 13338433843
  hurt   1 295  75  75

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp| 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 0de84fe9fc..505de08573 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -3729,12 +3729,17 @@ Program::optimizeSSA(int level)
RUN_PASS(1, CopyPropagation, run);
RUN_PASS(1, MergeSplits, run);
RUN_PASS(2, GlobalCSE, run);
-   RUN_PASS(1, LocalCSE, run);
-   RUN_PASS(2, AlgebraicOpt, run);
-   RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
-   RUN_PASS(1, ConstantFolding, foldAll);
-   RUN_PASS(2, LateAlgebraicOpt, run);
-   RUN_PASS(1, Split64BitOpPreRA, run);
+   for (int i = 0; i < 2; ++i) {
+  RUN_PASS(1, LocalCSE, run);
+  RUN_PASS(2, AlgebraicOpt, run);
+  RUN_PASS(2, ModifierFolding, run); // before load propagation -> less 
checks
+  RUN_PASS(1, ConstantFolding, foldAll);
+  RUN_PASS(2, LateAlgebraicOpt, run);
+  // only once
+  if (i == 0)
+ RUN_PASS(1, Split64BitOpPreRA, run);
+  RUN_PASS(1, DeadCodeElim, buryAll);
+   }
RUN_PASS(1, LoadPropagation, run);
RUN_PASS(1, IndirectPropagation, run);
RUN_PASS(2, MemoryOpt, run);
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods

2017-04-03 Thread Karol Herbst
From: Karol Herbst 

Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 4c92a1efb5..bd60a84998 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1797,10 +1797,10 @@ AlgebraicOpt::handleSLCT(Instruction *slct)
   if (slct->getSrc(2)->asImm()->compare(slct->asCmp()->setCond, 0.0f))
  slct->setSrc(0, slct->getSrc(1));
} else
-   if (slct->getSrc(0) != slct->getSrc(1)) {
+   if (slct->getSrc(0) != slct->getSrc(1) || slct->src(0).mod != 
slct->src(1).mod)
   return;
-   }
-   slct->op = OP_MOV;
+   slct->op = slct->src(0).mod.getOp();
+   slct->src(0).mod = slct->src(0).mod ^ Modifier(slct->op);
slct->setSrc(1, NULL);
slct->setSrc(2, NULL);
 }
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/3] nv50/ir: Preapre for running Opts inside a loop

2017-04-03 Thread Karol Herbst
Slowly we are getting to the point, that we miss enough optimization
opportunities as the result of our own passes.

For this we need to fix AlgebraicOpt to be able to handle mods on sources
without creating new issues.

The last patch enables looping opts.

Karol Herbst (3):
  nv50/ir: fix AlgebraicOpt for slcts with mods
  nv50/ir: handle logops with NOT in AlgebraicOpt
  nv50/ir: run some passes multiple times

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 29 +++---
 1 file changed, 20 insertions(+), 9 deletions(-)

-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt

2017-04-03 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index bd60a84998..0de84fe9fc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1856,6 +1856,12 @@ AlgebraicOpt::handleLOGOP(Instruction *logop)
 
   set0 = cloneForward(func, set0);
   set1 = cloneShallow(func, set1);
+
+  if (logop->src(0).mod == Modifier(NV50_IR_MOD_NOT))
+ set0->asCmp()->setCond = inverseCondCode(set0->asCmp()->setCond);
+  if (logop->src(1).mod == Modifier(NV50_IR_MOD_NOT))
+ set1->asCmp()->setCond = inverseCondCode(set1->asCmp()->setCond);
+
   logop->bb->insertAfter(logop, set1);
   logop->bb->insertAfter(logop, set0);
 
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 0/3] nv50/ir: Preapre for running Opts inside a loop

2017-04-03 Thread Karol Herbst
Slowly we are getting to the point, that we miss enough optimization
opportunities as the result of our own passes.

For this we need to fix AlgebraicOpt to be able to handle mods on sources
without creating new issues.

The last patch enables looping opts.

v2: update commit author

Karol Herbst (3):
  nv50/ir: fix AlgebraicOpt for slcts with mods
  nv50/ir: handle logops with NOT in AlgebraicOpt
  nv50/ir: run some passes multiple times

 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp   | 29 +++---
 1 file changed, 20 insertions(+), 9 deletions(-)

-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 3/3] nv50/ir: run some passes multiple times

2017-04-03 Thread Karol Herbst
With the shader cache, compilation time matters less.

As a side effect we can write more optimizations to produce better optimized
code.

total instructions in shared programs : 3931743 -> 3917512 (-0.36%)
total gprs used in shared programs: 481460 -> 481680 (0.05%)
total local used in shared programs   : 27481 -> 26761 (-2.62%)
total bytes used in shared programs   : 36032672 -> 35902648 (-0.36%)

localgpr   inst  bytes
helped  48 13338433843
  hurt   1 295  75  75

Signed-off-by: Karol Herbst 
---
 .../drivers/nouveau/codegen/nv50_ir_peephole.cpp| 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 0de84fe9fc..505de08573 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -3729,12 +3729,17 @@ Program::optimizeSSA(int level)
RUN_PASS(1, CopyPropagation, run);
RUN_PASS(1, MergeSplits, run);
RUN_PASS(2, GlobalCSE, run);
-   RUN_PASS(1, LocalCSE, run);
-   RUN_PASS(2, AlgebraicOpt, run);
-   RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
-   RUN_PASS(1, ConstantFolding, foldAll);
-   RUN_PASS(2, LateAlgebraicOpt, run);
-   RUN_PASS(1, Split64BitOpPreRA, run);
+   for (int i = 0; i < 2; ++i) {
+  RUN_PASS(1, LocalCSE, run);
+  RUN_PASS(2, AlgebraicOpt, run);
+  RUN_PASS(2, ModifierFolding, run); // before load propagation -> less 
checks
+  RUN_PASS(1, ConstantFolding, foldAll);
+  RUN_PASS(2, LateAlgebraicOpt, run);
+  // only once
+  if (i == 0)
+ RUN_PASS(1, Split64BitOpPreRA, run);
+  RUN_PASS(1, DeadCodeElim, buryAll);
+   }
RUN_PASS(1, LoadPropagation, run);
RUN_PASS(1, IndirectPropagation, run);
RUN_PASS(2, MemoryOpt, run);
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 2/3] nv50/ir: handle logops with NOT in AlgebraicOpt

2017-04-03 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index bd60a84998..0de84fe9fc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1856,6 +1856,12 @@ AlgebraicOpt::handleLOGOP(Instruction *logop)
 
   set0 = cloneForward(func, set0);
   set1 = cloneShallow(func, set1);
+
+  if (logop->src(0).mod == Modifier(NV50_IR_MOD_NOT))
+ set0->asCmp()->setCond = inverseCondCode(set0->asCmp()->setCond);
+  if (logop->src(1).mod == Modifier(NV50_IR_MOD_NOT))
+ set1->asCmp()->setCond = inverseCondCode(set1->asCmp()->setCond);
+
   logop->bb->insertAfter(logop, set1);
   logop->bb->insertAfter(logop, set0);
 
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/3] nv50/ir: fix AlgebraicOpt for slcts with mods

2017-04-03 Thread Karol Herbst
Signed-off-by: Karol Herbst 
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
index 4c92a1efb5..bd60a84998 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@@ -1797,10 +1797,10 @@ AlgebraicOpt::handleSLCT(Instruction *slct)
   if (slct->getSrc(2)->asImm()->compare(slct->asCmp()->setCond, 0.0f))
  slct->setSrc(0, slct->getSrc(1));
} else
-   if (slct->getSrc(0) != slct->getSrc(1)) {
+   if (slct->getSrc(0) != slct->getSrc(1) || slct->src(0).mod != 
slct->src(1).mod)
   return;
-   }
-   slct->op = OP_MOV;
+   slct->op = slct->src(0).mod.getOp();
+   slct->src(0).mod = slct->src(0).mod ^ Modifier(slct->op);
slct->setSrc(1, NULL);
slct->setSrc(2, NULL);
 }
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/nir: Delete the apply_dynamic_offsets prototype

2017-04-03 Thread Nanley Chery
On Wed, Mar 22, 2017 at 03:13:56PM -0700, Jason Ekstrand wrote:
> That pass hasn't existed since dd4db84640bbb694f180dd50850c3388f67228be
> but the prototype stuck around for no reason.
> ---
>  src/intel/vulkan/anv_nir.h | 3 ---
>  1 file changed, 3 deletions(-)
> 

This patch is
Reviewed-by: Nanley Chery 

> diff --git a/src/intel/vulkan/anv_nir.h b/src/intel/vulkan/anv_nir.h
> index a929dd9..3f97701 100644
> --- a/src/intel/vulkan/anv_nir.h
> +++ b/src/intel/vulkan/anv_nir.h
> @@ -35,9 +35,6 @@ void anv_nir_lower_input_attachments(nir_shader *shader);
>  
>  void anv_nir_lower_push_constants(nir_shader *shader);
>  
> -void anv_nir_apply_dynamic_offsets(struct anv_pipeline *pipeline,
> -   nir_shader *shader,
> -   struct brw_stage_prog_data *prog_data);
>  void anv_nir_apply_pipeline_layout(struct anv_pipeline *pipeline,
> nir_shader *shader,
> struct brw_stage_prog_data *prog_data,
> -- 
> 2.5.0.400.gff86faf
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] util: #include "c99_compat.h" to fix Windows build

2017-04-03 Thread Brian Paul
Otherwise, we were getting the definition for 'inline' by chance from
some other preceeding #include.
---
 src/util/list.h  | 1 +
 src/util/mesa-sha1.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/util/list.h b/src/util/list.h
index 07eb9f3..6edb750 100644
--- a/src/util/list.h
+++ b/src/util/list.h
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include "c99_compat.h"
 
 
 struct list_head
diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h
index d3f7aff..bde50ba 100644
--- a/src/util/mesa-sha1.h
+++ b/src/util/mesa-sha1.h
@@ -24,6 +24,7 @@
 #define MESA_SHA1_H
 
 #include 
+#include "c99_compat.h"
 #include "sha1/sha1.h"
 
 #ifdef __cplusplus
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] util: s/SHA1_H/MESA_SHA1_H/

2017-04-03 Thread Brian Paul
To follow the convention of other header include guards.
---
 src/util/mesa-sha1.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h
index a81aba9..d3f7aff 100644
--- a/src/util/mesa-sha1.h
+++ b/src/util/mesa-sha1.h
@@ -20,8 +20,8 @@
  * DEALINGS IN THE SOFTWARE.
  */
 
-#ifndef SHA1_H
-#define SHA1_H
+#ifndef MESA_SHA1_H
+#define MESA_SHA1_H
 
 #include 
 #include "sha1/sha1.h"
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] util: fix MSVC warning in u_align_u32()

2017-04-03 Thread Brian Paul
To silence
C:\Users\Brian\projects\mesa\src\util/u_vector.h(41) : warning C4146: unary
minus operator applied to unsigned type, result still unsigned
---
 src/util/u_vector.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/util/u_vector.h b/src/util/u_vector.h
index f97a8b4..c7fcb37 100644
--- a/src/util/u_vector.h
+++ b/src/util/u_vector.h
@@ -38,7 +38,7 @@
 static inline uint32_t
 u_align_u32(uint32_t v, uint32_t a)
 {
-   assert(a != 0 && a == (a & -a));
+   assert(a != 0 && a == (a & -((int32_t) a)));
return (v + a - 1) & ~(a - 1);
 }
 
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/pipeline: do not disable depth writes if depth testing is disabled

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 8:55 AM, Nanley Chery  wrote:

> On Mon, Apr 03, 2017 at 08:02:54AM +0200, Iago Toral wrote:
> > Can anyone review this one?
> >
> > On Wed, 2017-03-29 at 08:58 +0200, Iago Toral Quiroga wrote:
> > > Writing and testing are two different things and they can be set
> > > separately
> > > by the application. If an application wants to record depth data
> > > without
> > > caring for the depth test, it can enable depth test and set the depth
> > > compare function to VK_COMPARE_OP_ALWAYS or it can simply disable
> > > depth testing altogether. Some CTS tests do the latter.
> > >
>
> Doesn't disabling the depth test prevent any updates to the depth
> buffer?
>
> Section 25.10 Depth Test from the Vulkan spec says:
>
>When disabled, the depth comparison and subsequent possible updates
>to the value of the depth component of the depth/stencil attachment
>are bypassed and the fragment is passed to the next operation.
>

Thanks!  Maybe I wasn't crazy when I wrote that line of code.  Consider my
review retracted until we're sure this isn't a test bug.


> -Nanley
>
> > > Fixes all multisample tests with depth-only formats in:
> > > dEQP-VK.renderpass.multisample.*
> > > ---
> > >  src/intel/vulkan/genX_pipeline.c | 4 
> > >  1 file changed, 4 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/genX_pipeline.c
> > > b/src/intel/vulkan/genX_pipeline.c
> > > index 85a9e4f..dc393cb 100644
> > > --- a/src/intel/vulkan/genX_pipeline.c
> > > +++ b/src/intel/vulkan/genX_pipeline.c
> > > @@ -728,10 +728,6 @@
> > > sanitize_ds_state(VkPipelineDepthStencilStateCreateInfo *state,
> > >  {
> > > *stencilWriteEnable = state->stencilTestEnable;
> > >
> > > -   /* If the depth test is disabled, we won't be writing anything.
> > > */
> > > -   if (!state->depthTestEnable)
> > > -  state->depthWriteEnable = false;
> > > -
> > > /* The Vulkan spec requires that if either depth or stencil is
> > > not present,
> > >  * the pipeline is to act as if the test silently passes.
> > >  */
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] util: #include "c99_compat.h" to fix Windows build

2017-04-03 Thread Jose Fonseca

On 03/04/17 17:09, Brian Paul wrote:

Otherwise, we were getting the definition for 'inline' by chance from
some other preceeding #include.
---
 src/util/list.h  | 1 +
 src/util/mesa-sha1.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/src/util/list.h b/src/util/list.h
index 07eb9f3..6edb750 100644
--- a/src/util/list.h
+++ b/src/util/list.h
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include "c99_compat.h"


 struct list_head
diff --git a/src/util/mesa-sha1.h b/src/util/mesa-sha1.h
index d3f7aff..bde50ba 100644
--- a/src/util/mesa-sha1.h
+++ b/src/util/mesa-sha1.h
@@ -24,6 +24,7 @@
 #define MESA_SHA1_H

 #include 
+#include "c99_compat.h"
 #include "sha1/sha1.h"

 #ifdef __cplusplus



Looks good.

Series is

Reviewed-by: Jose Fonseca 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 9/9] gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes

2017-04-03 Thread Marek Olšák
On Apr 3, 2017 4:04 PM, "Brian Paul"  wrote:

I need to test this series on Windows and with MinGW first.  I'm worried
about enums with bitfields.


Hopefully it'll work. Enums without bitfields always occupy 4 bytes. If
bitfields don't work with those, we'll have to stop using enums in these
cases.

Marek



-Brian


On Mon, Apr 3, 2017 at 2:46 AM, Nicolai Hähnle  wrote:

> Some of these are a bit subtle, but I think they're fine. Series is:
>
> Reviewed-by: Nicolai Hähnle 
>
>
> On 02.04.2017 20:00, Marek Olšák wrote:
>
>> From: Marek Olšák 
>>
>> ---
>>  src/gallium/auxiliary/indices/u_primconvert.c | 10 --
>>  src/gallium/include/pipe/p_state.h|  2 +-
>>  2 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/gallium/auxiliary/indices/u_primconvert.c
>> b/src/gallium/auxiliary/indices/u_primconvert.c
>> index 2bdfade..1ffca4b 100644
>> --- a/src/gallium/auxiliary/indices/u_primconvert.c
>> +++ b/src/gallium/auxiliary/indices/u_primconvert.c
>> @@ -121,39 +121,45 @@ util_primconvert_draw_vbo(struct
>> primconvert_context *pc,
>> util_draw_init_info(&new_info);
>> new_info.indexed = true;
>> new_info.min_index = info->min_index;
>> new_info.max_index = info->max_index;
>> new_info.index_bias = info->index_bias;
>> new_info.start_instance = info->start_instance;
>> new_info.instance_count = info->instance_count;
>> new_info.primitive_restart = info->primitive_restart;
>> new_info.restart_index = info->restart_index;
>> if (info->indexed) {
>> +  enum pipe_prim_type mode = 0;
>> +
>>u_index_translator(pc->primtypes_mask,
>>   info->mode, pc->saved_ib.index_size,
>> info->count,
>>   pc->api_pv, pc->api_pv,
>>   info->primitive_restart ? PR_ENABLE :
>> PR_DISABLE,
>> - &new_info.mode, &new_ib.index_size,
>> &new_info.count,
>> + &mode, &new_ib.index_size, &new_info.count,
>>   &trans_func);
>> +  new_info.mode = mode;
>>src = ib->user_buffer;
>>if (!src) {
>>   src = pipe_buffer_map(pc->pipe, ib->buffer,
>> PIPE_TRANSFER_READ, &src_transfer);
>>}
>>src = (const uint8_t *)src + ib->offset;
>> }
>> else {
>> +  enum pipe_prim_type mode = 0;
>> +
>>u_index_generator(pc->primtypes_mask,
>>  info->mode, info->start, info->count,
>>  pc->api_pv, pc->api_pv,
>> -&new_info.mode, &new_ib.index_size,
>> &new_info.count,
>> +&mode, &new_ib.index_size, &new_info.count,
>>  &gen_func);
>> +  new_info.mode = mode;
>> }
>>
>> u_upload_alloc(pc->pipe->stream_uploader, 0, new_ib.index_size *
>> new_info.count, 4,
>>&new_ib.offset, &new_ib.buffer, &dst);
>>
>> if (info->indexed) {
>>trans_func(src, info->start, info->count, new_info.count,
>> info->restart_index, dst);
>> }
>> else {
>>gen_func(info->start, new_info.count, dst);
>> diff --git a/src/gallium/include/pipe/p_state.h
>> b/src/gallium/include/pipe/p_state.h
>> index d68a4d4..eeabf8b 100644
>> --- a/src/gallium/include/pipe/p_state.h
>> +++ b/src/gallium/include/pipe/p_state.h
>> @@ -634,21 +634,21 @@ struct pipe_index_buffer
>> const void *user_buffer;  /**< pointer to a user buffer if buffer ==
>> NULL */
>>  };
>>
>>
>>  /**
>>   * Information to describe a draw_vbo call.
>>   */
>>  struct pipe_draw_info
>>  {
>> boolean indexed;  /**< use index buffer */
>> -   enum pipe_prim_type mode;  /**< the mode of the primitive */
>> +   enum pipe_prim_type mode:8;  /**< the mode of the primitive */
>> boolean primitive_restart;
>> ubyte vertices_per_patch; /**< the number of vertices per patch */
>>
>> unsigned start;  /**< the index of the first vertex */
>> unsigned count;  /**< number of vertices */
>>
>> unsigned start_instance; /**< first instance id */
>> unsigned instance_count; /**< number of instances */
>>
>> unsigned drawid; /**< id of this draw in a multidraw */
>>
>>
>
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 100259] [EGL] [GBM] undefined reference to `gbm_bo_create_with_modifiers'

2017-04-03 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=100259

--- Comment #14 from Emil Velikov  ---
Ouch that is some nasty bug in Slackware packaging.

Note you want /usr/local/ and /usr/ in the same order across
--with-pkg-config-dir and --with-system-libdir.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory

2017-04-03 Thread Kristian Høgsberg
On Wed, Mar 29, 2017 at 12:06 PM, Jason Ekstrand  wrote:
> Looking over the patch, I think I've convinced myself that it's correct.  (I
> honestly wasn't expecting to come to that conclusion without more
> iteration.)  That said, this raises some interesting questions.  I added
> Kristian to the Cc in case he has any input.

I haven't looked closely, and I agree it's increasingly tricky code to
review. I'd be careful about making this a fully generic any-block
size allocator. The premise, when we first designed this, was that for
something like a fixed-size, power-of-two pool, we could write a fast,
lock-less and fragmentation free allocator without getting in over our
heads. However, if this evolves (devolves?) into "malloc, but for
bos", it may be time to take a step back and look if something like
jemalloc with bo backing is a better choice.

Kristian

>
>  1. Should we do powers of two or linear.  I'm still a fan of powers of two.
>
>  2. Should block pools even have a block size at all? We could just make
> every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB
> and then make the block size part of the state pool or stream that's
> allocating from it.  At the moment, I like this idea, but I've given it very
> little thought.
>
>  3. If we go with the idea in 2. should we still call it block_pool?  I
> think we can keep the name but it doesn't it as well as it once did.
>
> Thanks for working on this!  I'm sorry it's taken so long to respond.  Every
> time I've looked at it, my brain hasn't been in the right state to think
> about lock-free code. :-/
>
> On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero 
> wrote:
>>
>> Current Anv allocator assign memory in terms of a fixed block size.
>>
>> But there can be cases where this block is not enough for a memory
>> request, and thus several blocks must be assigned in a row.
>>
>> This commit adds support for specifying how many blocks of memory must
>> be assigned.
>>
>> This fixes a number dEQP-VK.pipeline.render_to_image.* tests that crash.
>>
>> v2: lock-free free-list is not handled correctly (Jason)
>> ---
>>  src/intel/vulkan/anv_allocator.c   | 81
>> +++---
>>  src/intel/vulkan/anv_batch_chain.c |  4 +-
>>  src/intel/vulkan/anv_private.h |  7 +++-
>>  3 files changed, 66 insertions(+), 26 deletions(-)
>>
>> diff --git a/src/intel/vulkan/anv_allocator.c
>> b/src/intel/vulkan/anv_allocator.c
>> index 45c663b..3924551 100644
>> --- a/src/intel/vulkan/anv_allocator.c
>> +++ b/src/intel/vulkan/anv_allocator.c
>> @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool,
>> pool->device = device;
>> anv_bo_init(&pool->bo, 0, 0);
>> pool->block_size = block_size;
>> -   pool->free_list = ANV_FREE_LIST_EMPTY;
>> +   for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++)
>> +  pool->free_list[i] = ANV_FREE_LIST_EMPTY;
>> pool->back_free_list = ANV_FREE_LIST_EMPTY;
>>
>> pool->fd = memfd_create("block pool", MFD_CLOEXEC);
>> @@ -500,30 +501,35 @@ fail:
>>
>>  static uint32_t
>>  anv_block_pool_alloc_new(struct anv_block_pool *pool,
>> - struct anv_block_state *pool_state)
>> + struct anv_block_state *pool_state,
>> + uint32_t n_blocks)
>
>
> Maybe have this take a size rather than n_blocks?  It's only ever called by
> stuff in the block pool so the caller can do the multiplication.  It would
> certainly make some of the math below easier.
>
>>
>>  {
>> struct anv_block_state state, old, new;
>>
>> while (1) {
>> -  state.u64 = __sync_fetch_and_add(&pool_state->u64,
>> pool->block_size);
>> -  if (state.next < state.end) {
>> +  state.u64 = __sync_fetch_and_add(&pool_state->u64, n_blocks *
>> pool->block_size);
>> +  if (state.next > state.end) {
>> + futex_wait(&pool_state->end, state.end);
>> + continue;
>> +  } else if ((state.next + (n_blocks - 1) * pool->block_size) <
>> state.end) {
>
>
> First off, please keep the if's in the same order unless we have a reason to
> re-arrange them.  It would make this way easier to review. :-)
>
> Second, I think this would be much easier to read as:
>
> if (state.next + size <= state.end) {
>/* Success */
> } else if (state.next <= state.end) {
>/* Our block is the one that crosses the line */
> } else {
>/* Wait like everyone else */
> }
>
>>
>>   assert(pool->map);
>>   return state.next;
>> -  } else if (state.next == state.end) {
>> - /* We allocated the first block outside the pool, we have to
>> grow it.
>> -  * pool_state->next acts a mutex: threads who try to allocate
>> now will
>> -  * get block indexes above the current limit and hit futex_wait
>> -  * below. */
>> - new.next = state.next + pool->block_size;
>> +  } else {
>> + /* We allocated the firsts blocks outside the pool, we have to
>> grow
>> +  * it. pool_state-

Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes

2017-04-03 Thread Marek Olšák
On Apr 3, 2017 5:11 PM, "Alex Deucher"  wrote:

On Sun, Apr 2, 2017 at 2:00 PM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> Also:
>
> pipe_transfer: 48 -> 40 bytes.
> pipe_blit_info = 176 -> 160 bytes.
> ---
>  src/gallium/include/pipe/p_state.h | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/include/pipe/p_state.h
b/src/gallium/include/pipe/p_state.h
> index 392bb8b..6a147ef 100644
> --- a/src/gallium/include/pipe/p_state.h
> +++ b/src/gallium/include/pipe/p_state.h
> @@ -472,25 +472,25 @@ struct pipe_image_view
> } u;
>  };
>
>
>  /**
>   * Subregion of 1D/2D/3D image resource.
>   */
>  struct pipe_box
>  {
> int x;
> -   int y;
> -   int z;
> +   int16_t y;
> +   int16_t z;
> int width;
> -   int height;
> -   int depth;
> +   int16_t height;
> +   int16_t depth;
>  };

Not specific to this patch per se, but will this cause any issues in
the future as max surface sizes supported by hw increase?  I feel like
we'll need to change this back at some point.


We should be fine as long as OpenGL and/or games don't require bigger sizes.

Marek


Alex

>
>
>  /**
>   * A memory object/resource such as a vertex buffer or texture.
>   */
>  struct pipe_resource
>  {
> struct pipe_reference reference;
> struct pipe_screen *screen; /**< screen that this texture belongs to
*/
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] bin/get-fixes-pick-list.sh: fix typo

2017-04-03 Thread Juan A. Suarez Romero
Replace "nore" by "more".
---
 bin/get-fixes-pick-list.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bin/get-fixes-pick-list.sh b/bin/get-fixes-pick-list.sh
index 59bcae4..75242a2 100755
--- a/bin/get-fixes-pick-list.sh
+++ b/bin/get-fixes-pick-list.sh
@@ -27,7 +27,7 @@ do
# For each one try to extract the tag
fixes_count=`git show $sha | grep -i "fixes:" | wc -l`
if [ "x$fixes_count" != x1 ] ; then
-   echo WARNING: Commit $sha has nore than one Fixes tag
+   echo WARNING: Commit $sha has more than one Fixes tag
fi
fixes=`git show $sha | grep -i "fixes:" | head -n 1`
# The following sed/cut combination is borrowed from GregKH
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 9:40 AM, Kristian Høgsberg 
wrote:

> On Wed, Mar 29, 2017 at 12:06 PM, Jason Ekstrand 
> wrote:
> > Looking over the patch, I think I've convinced myself that it's
> correct.  (I
> > honestly wasn't expecting to come to that conclusion without more
> > iteration.)  That said, this raises some interesting questions.  I added
> > Kristian to the Cc in case he has any input.
>
> I haven't looked closely, and I agree it's increasingly tricky code to
> review. I'd be careful about making this a fully generic any-block
> size allocator. The premise, when we first designed this, was that for
> something like a fixed-size, power-of-two pool, we could write a fast,
> lock-less and fragmentation free allocator without getting in over our
> heads.


Yeah, it's getting tricky but I don't think it's outside the realm of
humans yet. :-)


> However, if this evolves (devolves?) into "malloc, but for
> bos", it may be time to take a step back and look if something like
> jemalloc with bo backing is a better choice.
>

Yeah, it may be time to start considering that.  Unfortunately, I don't
think we can use jemalloc for this in a safe way.  jemalloc does provide a
way to manage pools but it does so, not with a pool pointer, but with an
arena number in a global namespace.  If an app were to use jemalloc (I
wouldn't be surprised in the gaming world if jemalloc is fairly
common-place) and uses arenas, we could end up with silent collisions.

At some point (probably later this year), I hope to look into pulling in a
foreign memory allocator and use that for BO placement and start
softpinning the universe (no relocations!).  That may be a good time to
revisit allocation strategies a bit.

Kristian
>
> >
> >  1. Should we do powers of two or linear.  I'm still a fan of powers of
> two.
> >
> >  2. Should block pools even have a block size at all? We could just make
> > every block pool allow any power-of-two size from 4 KiB up to. say, 1 MiB
> > and then make the block size part of the state pool or stream that's
> > allocating from it.  At the moment, I like this idea, but I've given it
> very
> > little thought.
> >
> >  3. If we go with the idea in 2. should we still call it block_pool?  I
> > think we can keep the name but it doesn't it as well as it once did.
> >
> > Thanks for working on this!  I'm sorry it's taken so long to respond.
> Every
> > time I've looked at it, my brain hasn't been in the right state to think
> > about lock-free code. :-/
> >
> > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero <
> jasua...@igalia.com>
> > wrote:
> >>
> >> Current Anv allocator assign memory in terms of a fixed block size.
> >>
> >> But there can be cases where this block is not enough for a memory
> >> request, and thus several blocks must be assigned in a row.
> >>
> >> This commit adds support for specifying how many blocks of memory must
> >> be assigned.
> >>
> >> This fixes a number dEQP-VK.pipeline.render_to_image.* tests that
> crash.
> >>
> >> v2: lock-free free-list is not handled correctly (Jason)
> >> ---
> >>  src/intel/vulkan/anv_allocator.c   | 81
> >> +++---
> >>  src/intel/vulkan/anv_batch_chain.c |  4 +-
> >>  src/intel/vulkan/anv_private.h |  7 +++-
> >>  3 files changed, 66 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/src/intel/vulkan/anv_allocator.c
> >> b/src/intel/vulkan/anv_allocator.c
> >> index 45c663b..3924551 100644
> >> --- a/src/intel/vulkan/anv_allocator.c
> >> +++ b/src/intel/vulkan/anv_allocator.c
> >> @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool,
> >> pool->device = device;
> >> anv_bo_init(&pool->bo, 0, 0);
> >> pool->block_size = block_size;
> >> -   pool->free_list = ANV_FREE_LIST_EMPTY;
> >> +   for (uint32_t i = 0; i < ANV_MAX_BLOCKS; i++)
> >> +  pool->free_list[i] = ANV_FREE_LIST_EMPTY;
> >> pool->back_free_list = ANV_FREE_LIST_EMPTY;
> >>
> >> pool->fd = memfd_create("block pool", MFD_CLOEXEC);
> >> @@ -500,30 +501,35 @@ fail:
> >>
> >>  static uint32_t
> >>  anv_block_pool_alloc_new(struct anv_block_pool *pool,
> >> - struct anv_block_state *pool_state)
> >> + struct anv_block_state *pool_state,
> >> + uint32_t n_blocks)
> >
> >
> > Maybe have this take a size rather than n_blocks?  It's only ever called
> by
> > stuff in the block pool so the caller can do the multiplication.  It
> would
> > certainly make some of the math below easier.
> >
> >>
> >>  {
> >> struct anv_block_state state, old, new;
> >>
> >> while (1) {
> >> -  state.u64 = __sync_fetch_and_add(&pool_state->u64,
> >> pool->block_size);
> >> -  if (state.next < state.end) {
> >> +  state.u64 = __sync_fetch_and_add(&pool_state->u64, n_blocks *
> >> pool->block_size);
> >> +  if (state.next > state.end) {
> >> + futex_wait(&pool_state->end, state.end);
> >> + continue;
> >> +  } else if ((stat

Re: [Mesa-dev] [PATCH v2] anv: add support for allocating more than 1 block of memory

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 7:44 AM, Jason Ekstrand  wrote:

> On Mon, Apr 3, 2017 at 5:02 AM, Juan A. Suarez Romero  > wrote:
>
>> On Wed, 2017-03-29 at 12:06 -0700, Jason Ekstrand wrote:
>> > Looking over the patch, I think I've convinced myself that it's
>> correct.  (I honestly wasn't expecting to come to that conclusion without
>> more iteration.)  That said, this raises some interesting questions.  I
>> added Kristian to the Cc in case he has any input.
>> >
>> >  1. Should we do powers of two or linear.  I'm still a fan of powers of
>> two.
>>
>> In which specific part do you mean? In free block lists? or in the
>> allocator_new?
>>
>
> In the state pool, we just round all allocation sizes up to the nearest
> power-of-two, and then the index into the array of free lists isn't "size -
> base", it's "ilog2(size) - base".
>
>
>> >
>> >  2. Should block pools even have a block size at all? We could just
>> make every block pool allow any power-of-two size from 4 KiB up to. say, 1
>> MiB and then make the block size part of the state pool or stream that's
>> allocating from it.  At the moment, I like this idea, but I've given it
>> very little thought.
>> >
>> So IIUC, the idea would be the block pool is just a flat chunk of
>> memory, where we later fetch blocks of memory from, as requested by
>> applications. Is that correct?
>>
>
Thinking about this again, I think your statement may have been more
correct than I thought.  If we make the state_stream chain off of a
state_pool rather than the block_pool, we could make the block pool
structure a simple bi-directional growing BO and trust in the state pool
for 100% of the re-use.  That would probably be a way simpler structure.
For that matter, the state pool could just own the block_pool and
setup/teardown would be easier.


> Sorry, but this patch gave me some sudden revelations and things are still
> in the process of reforming in my brain.  In the past, we assumed a
> two-layered allocation strategy where we had a block pool which was the
> base and then the state pool and state stream allocators sat on top of it.
> Originally, the state pool allocator was just for a single size as well.
>
> Now that the block pool is going to be capable of allocating multiple
> sizes, the whole mental model of the separation falls apart.  The new
> future that I see is one where the block pool and state pool aren't
> separate.  Instead, we have a single thing which I'll call state_pool (we
> have to pick one of the names) which lets you allocate a state of any size
> from 32B to 2MB.  The state stream will then allocate off of a state_pool
> instead of a block_pool since they're now the same.  For smaller states, we
> still want to allocate reasonably large chunks (probably 4K) so that we
> ensure that things are nicely aligned.  I think I've got a pretty good idea
> of how it should work at this point and can write more if you'd like.
>
> Before we dive in and do a pile of refactoring, I think this patch is
> pretty much good-to-go assuming we fix the power-of-two thing and it fixes
> a bug so let's focus there.  Are you  interested in doing the refactoring?
> If not, that's ok, I'm happy to do it and it wouldn't take me long now that
> I've gotten a chance to think about it.  If you are interested, go for it!
>
>
>> >  3. If we go with the idea in 2. should we still call it block_pool?  I
>> think we can keep the name but it doesn't it as well as it once did.
>> >
>> > Thanks for working on this!  I'm sorry it's taken so long to respond.
>> Every time I've looked at it, my brain hasn't been in the right state to
>> think about lock-free code. :-/
>> >
>> > On Wed, Mar 15, 2017 at 5:05 AM, Juan A. Suarez Romero <
>> jasua...@igalia.com> wrote:
>> > > Current Anv allocator assign memory in terms of a fixed block size.
>> > >
>> > > But there can be cases where this block is not enough for a memory
>> > > request, and thus several blocks must be assigned in a row.
>> > >
>> > > This commit adds support for specifying how many blocks of memory must
>> > > be assigned.
>> > >
>> > > This fixes a number dEQP-VK.pipeline.render_to_image.* tests that
>> crash.
>> > >
>> > > v2: lock-free free-list is not handled correctly (Jason)
>> > > ---
>> > >  src/intel/vulkan/anv_allocator.c   | 81
>> +++---
>> > >  src/intel/vulkan/anv_batch_chain.c |  4 +-
>> > >  src/intel/vulkan/anv_private.h |  7 +++-
>> > >  3 files changed, 66 insertions(+), 26 deletions(-)
>> > >
>> > > diff --git a/src/intel/vulkan/anv_allocator.c
>> b/src/intel/vulkan/anv_allocator.c
>> > > index 45c663b..3924551 100644
>> > > --- a/src/intel/vulkan/anv_allocator.c
>> > > +++ b/src/intel/vulkan/anv_allocator.c
>> > > @@ -257,7 +257,8 @@ anv_block_pool_init(struct anv_block_pool *pool,
>> > > pool->device = device;
>> > > anv_bo_init(&pool->bo, 0, 0);
>> > > pool->block_size = block_size;
>> > > -   pool->free_list = ANV_FREE_LIST_EMPTY;
>> > > +   for (uint32_t i =

Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Thomas Hellstrom
Hi, Rob,

On 03/24/2017 10:21 PM, Rob Clark wrote:
> It's kinda sad that (a) we don't have debug_backtrace support on !X86
> and that (b) we re-invent our own crude backtrace support in the first
> place.  If available, use libunwind instead.  The backtrace format is
> based on what xserver and weston use, since it is nice not to have to
> figure out a different format.
>
> Signed-off-by: Rob Clark 

Did you consider glibc "backtrace()", I think it's also available on ARM...

Also is the output format the same as before, or at least compatible with
gallium/tools/addr2line.sh?

Thanks,
Thomas

> ---
>  configure.ac   | 24 
>  src/gallium/Automake.inc   |  1 +
>  src/gallium/auxiliary/util/u_debug_stack.c | 91 
> ++
>  src/gallium/auxiliary/util/u_debug_stack.h | 15 -
>  4 files changed, 129 insertions(+), 2 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index a99684b..5046acb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1025,6 +1025,30 @@ AC_SUBST([LLVM_LIBS])
>  AC_SUBST([LLVM_LDFLAGS])
>  AC_SUBST([LLVM_INCLUDEDIR])
>  
> +dnl
> +dnl libunwind
> +dnl
> +AC_ARG_ENABLE([libunwind],
> +[AS_HELP_STRING([--enable-libunwind],
> +[Use libunwind for backtracing (default: auto)])],
> +[LIBUNWIND="$enableval"],
> +[LIBUNWIND="auto"])
> +
> +PKG_CHECK_MODULES(LIBUNWIND, libunwind, [HAVE_LIBUNWIND=yes], 
> [HAVE_LIBUNWIND=no])
> +if test "x$LIBUNWIND" = "xauto"; then
> +LIBUNWIND="$HAVE_LIBUNWIND"
> +fi
> +
> +if test "x$LIBUNWIND" = "xyes"; then
> +if test "x$HAVE_LIBUNWIND" != "xyes"; then
> +AC_MSG_ERROR([libunwind requested but not installed.])
> +fi
> +AC_DEFINE(HAVE_LIBUNWIND, 1, [Have libunwind support])
> +fi
> +
> +AM_CONDITIONAL(HAVE_LIBUNWIND, [test "x$LIBUNWIND" = xyes])
> +
> +
>  dnl Options for APIs
>  AC_ARG_ENABLE([opengl],
>  [AS_HELP_STRING([--disable-opengl],
> diff --git a/src/gallium/Automake.inc b/src/gallium/Automake.inc
> index a01fa54..48b5a44 100644
> --- a/src/gallium/Automake.inc
> +++ b/src/gallium/Automake.inc
> @@ -46,6 +46,7 @@ GALLIUM_TARGET_CFLAGS = \
>  
>  GALLIUM_COMMON_LIB_DEPS = \
>   -lm \
> + $(LIBUNWIND_LIBS) \
>   $(LIBSENSORS_LIBS) \
>   $(CLOCK_LIB) \
>   $(PTHREAD_LIBS) \
> diff --git a/src/gallium/auxiliary/util/u_debug_stack.c 
> b/src/gallium/auxiliary/util/u_debug_stack.c
> index f941234..cf05f13 100644
> --- a/src/gallium/auxiliary/util/u_debug_stack.c
> +++ b/src/gallium/auxiliary/util/u_debug_stack.c
> @@ -36,6 +36,95 @@
>  #include "u_debug_symbol.h"
>  #include "u_debug_stack.h"
>  
> +#if defined(HAVE_LIBUNWIND)
> +
> +#ifndef _GNU_SOURCE
> +#define _GNU_SOURCE
> +#endif
> +#include 
> +
> +void
> +debug_backtrace_capture(struct debug_stack_frame *backtrace,
> +unsigned start_frame,
> +unsigned nr_frames)
> +{
> +   unw_cursor_t cursor;
> +   unw_context_t context;
> +   unw_proc_info_t pip;
> +   unsigned i = 0;
> +   int ret;
> +
> +   pip.unwind_info = NULL;
> +
> +   unw_getcontext(&context);
> +   unw_init_local(&cursor, &context);
> +
> +   while ((start_frame > 0) && (unw_step(&cursor) > 0))
> +  start_frame--;
> +
> +   while (unw_step(&cursor) > 0) {
> +  char procname[256];
> +  const char *filename;
> +  unw_word_t off;
> +  Dl_info dlinfo;
> +
> +  unw_get_proc_info(&cursor, &pip);
> +
> +  ret = unw_get_proc_name(&cursor, procname, 256, &off);
> +  if (ret && ret != -UNW_ENOMEM) {
> + procname[0] = '?';
> + procname[1] = 0;
> +  }
> +
> +   if (dladdr((void *)(uintptr_t)(pip.start_ip + off), &dlinfo) && 
> dlinfo.dli_fname &&
> +   *dlinfo.dli_fname)
> +   filename = dlinfo.dli_fname;
> +   else
> +   filename = "?";
> +
> +  snprintf(backtrace[i].buf, sizeof(backtrace[i].buf),
> +"%u: %s (%s%s+0x%x) [%p]", i, filename, procname,
> +ret == -UNW_ENOMEM ? "..." : "", (int)off,
> +(void *)(uintptr_t)(pip.start_ip + off));
> +
> +  i++;
> +   }
> +
> +   while (i < nr_frames) {
> +  backtrace[i].buf[0] = '\0';
> +  i++;
> +   }
> +}
> +
> +void
> +debug_backtrace_dump(const struct debug_stack_frame *backtrace,
> + unsigned nr_frames)
> +{
> +   unsigned i;
> +
> +   for (i = 0; i < nr_frames; ++i) {
> +  if (backtrace[i].buf[0] == '\0')
> + break;
> +  debug_printf("\t%s\n", backtrace[i].buf);
> +   }
> +}
> +
> +void
> +debug_backtrace_print(FILE *f,
> +  const struct debug_stack_frame *backtrace,
> +  unsigned nr_frames)
> +{
> +   unsigned i;
> +
> +   for (i = 0; i < nr_frames; ++i) {
> +  if (backtrace[i].buf[0] == '\0')
> + break;
> +  fprintf(f, "\t%s\n", backtrace[i].buf);
> +   }
> +}
> +
> +#else /* ! HAVE_LIBUNWIND */
> +
>  #if defined(PIPE_OS_WINDOWS)
>  #include 
>  #endif
> @@ -1

Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Rob Clark
On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom  wrote:
> Hi, Rob,
>
> On 03/24/2017 10:21 PM, Rob Clark wrote:
>> It's kinda sad that (a) we don't have debug_backtrace support on !X86
>> and that (b) we re-invent our own crude backtrace support in the first
>> place.  If available, use libunwind instead.  The backtrace format is
>> based on what xserver and weston use, since it is nice not to have to
>> figure out a different format.
>>
>> Signed-off-by: Rob Clark 
>
> Did you consider glibc "backtrace()", I think it's also available on ARM...

I had not.. although xserver and weston are already using libunwind.
I'm not sure about portability of libunwind to other libc
implementations (but I guess it is at least not worse than using a
glibc specific API).

I suppose we could always add a fallback to backtrace().

> Also is the output format the same as before, or at least compatible with
> gallium/tools/addr2line.sh?

quite possibly not.. I chose to align to the format that xserver and
weston was already using.  Otoh, not sure if you would need to use
addr2line.sh since it already decodes things to human readable
fxn/file names.

BR,
-R

> Thanks,
> Thomas
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Kristian Høgsberg
On Mon, Apr 3, 2017 at 10:13 AM, Rob Clark  wrote:
> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom  
> wrote:
>> Hi, Rob,
>>
>> On 03/24/2017 10:21 PM, Rob Clark wrote:
>>> It's kinda sad that (a) we don't have debug_backtrace support on !X86
>>> and that (b) we re-invent our own crude backtrace support in the first
>>> place.  If available, use libunwind instead.  The backtrace format is
>>> based on what xserver and weston use, since it is nice not to have to
>>> figure out a different format.
>>>
>>> Signed-off-by: Rob Clark 
>>
>> Did you consider glibc "backtrace()", I think it's also available on ARM...
>
> I had not.. although xserver and weston are already using libunwind.
> I'm not sure about portability of libunwind to other libc
> implementations (but I guess it is at least not worse than using a
> glibc specific API).

We did use glibc backtrace initially in weston, but switched to
libbacktrace to get better support for symbols in dlopened modules.
Here's the commit message:

compositor: Use libunwind if available for better backtraces

libunwind has a dwarf parser and automatically queries the dlinfo
for location of dlopened modules.  The resulting backtrace is much
better and includes stack frames in dynamically loaded modules.

krh: Originally submitted for Xorg, adapted for weston:

  http://lists.x.org/archives/xorg-devel/2013-February/035493.html

Note this require libunwind at least 1.1 to get the pkg-config files.

Signed-off-by: Marcin Slusarz 

See 
https://cgit.freedesktop.org/wayland/weston/commit/?id=554a0da74a3f6fc945503a3eb1bfcc9038441b39

Kristian

> I suppose we could always add a fallback to backtrace().
>
>> Also is the output format the same as before, or at least compatible with
>> gallium/tools/addr2line.sh?
>
> quite possibly not.. I chose to align to the format that xserver and
> weston was already using.  Otoh, not sure if you would need to use
> addr2line.sh since it already decodes things to human readable
> fxn/file names.
>
> BR,
> -R
>
>> Thanks,
>> Thomas
>>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Thomas Hellstrom
On 04/03/2017 07:13 PM, Rob Clark wrote:
> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom  
> wrote:
>> Hi, Rob,
>>
>> On 03/24/2017 10:21 PM, Rob Clark wrote:
>>> It's kinda sad that (a) we don't have debug_backtrace support on !X86
>>> and that (b) we re-invent our own crude backtrace support in the first
>>> place.  If available, use libunwind instead.  The backtrace format is
>>> based on what xserver and weston use, since it is nice not to have to
>>> figure out a different format.
>>>
>>> Signed-off-by: Rob Clark 
>> Did you consider glibc "backtrace()", I think it's also available on ARM...
> I had not.. although xserver and weston are already using libunwind.
> I'm not sure about portability of libunwind to other libc
> implementations (but I guess it is at least not worse than using a
> glibc specific API).
>
> I suppose we could always add a fallback to backtrace().
>
>> Also is the output format the same as before, or at least compatible with
>> gallium/tools/addr2line.sh?
> quite possibly not.. I chose to align to the format that xserver and
> weston was already using.  Otoh, not sure if you would need to use
> addr2line.sh since it already decodes things to human readable
> fxn/file names.
>
> BR,
> -R

backtrace() (or the homebrew i386 implementation) + addr2line.sh gives
you both function name, source file and line number, which IMO is pretty
useful. I'll give the new format a shot. Perhaps if needed we could have
a config option to choose the method. I guess this is mostly used in
DEBUG builds anyway, so a developer can choose the desired method...

/Thomas



>
>> Thanks,
>> Thomas
>>


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Fix blob memory leak

2017-04-03 Thread Bartosz Tomczyk
---
 src/compiler/glsl/blob.h | 11 +++
 src/compiler/glsl/shader_cache.cpp   |  2 +-
 src/compiler/glsl/tests/blob_test.c  |  8 
 src/mesa/state_tracker/st_shader_cache.c |  2 +-
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/src/compiler/glsl/blob.h b/src/compiler/glsl/blob.h
index 9de12e6eb8..940c81e13b 100644
--- a/src/compiler/glsl/blob.h
+++ b/src/compiler/glsl/blob.h
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef __cplusplus
 extern "C" {
@@ -80,6 +81,16 @@ struct blob *
 blob_create(void);
 
 /**
+ * Destroy a blob and free its memory.
+ */
+static inline void
+blob_destroy(struct blob *blob)
+{
+   free(blob->data);
+   free(blob);
+}
+
+/**
  * Add some unstructured, fixed-size data to a blob.
  *
  * \return True unless allocation failed.
diff --git a/src/compiler/glsl/shader_cache.cpp 
b/src/compiler/glsl/shader_cache.cpp
index ea1bc01f02..f5e6a22bb9 100644
--- a/src/compiler/glsl/shader_cache.cpp
+++ b/src/compiler/glsl/shader_cache.cpp
@@ -1273,7 +1273,7 @@ shader_cache_write_program_metadata(struct gl_context 
*ctx,
 
disk_cache_put(cache, prog->data->sha1, metadata->data, metadata->size);
 
-   free(metadata);
+   blob_destroy(metadata);
 
if (ctx->_Shader->Flags & GLSL_CACHE_INFO) {
   _mesa_sha1_format(sha1_buf, prog->data->sha1);
diff --git a/src/compiler/glsl/tests/blob_test.c 
b/src/compiler/glsl/tests/blob_test.c
index 01b5ef0b2f..df0e91af35 100644
--- a/src/compiler/glsl/tests/blob_test.c
+++ b/src/compiler/glsl/tests/blob_test.c
@@ -184,7 +184,7 @@ test_write_and_read_functions (void)
 "read_consumes_all_bytes");
expect_equal(false, reader.overrun, "read_does_not_overrun");
 
-   free(blob);
+   blob_destroy(blob);
 }
 
 /* Test that data values are written and read with proper alignment. */
@@ -242,7 +242,7 @@ test_alignment(void)
"aligned read of intptr_t");
}
 
-   free(blob);
+   blob_destroy(blob);
 }
 
 /* Test that we detect overrun. */
@@ -264,7 +264,7 @@ test_overrun(void)
expect_equal(0, blob_read_uint32(&reader), "read at overrun");
expect_equal(true, reader.overrun, "overrun flag set");
 
-   free(blob);
+   blob_destroy(blob);
 }
 
 /* Test that we can read and write some large objects, (exercising the code in
@@ -308,7 +308,7 @@ test_big_objects(void)
expect_equal(false, reader.overrun,
 "overrun flag not set reading large objects");
 
-   free(blob);
+   blob_destroy(blob);
ralloc_free(ctx);
 }
 
diff --git a/src/mesa/state_tracker/st_shader_cache.c 
b/src/mesa/state_tracker/st_shader_cache.c
index e8c7289ec6..1a11f1135d 100644
--- a/src/mesa/state_tracker/st_shader_cache.c
+++ b/src/mesa/state_tracker/st_shader_cache.c
@@ -135,7 +135,7 @@ st_store_tgsi_in_disk_cache(struct st_context *st, struct 
gl_program *prog,
   _mesa_shader_stage_to_string(prog->info.stage), sha1_buf);
}
 
-   free(blob);
+   blob_destroy(blob);
 }
 
 static void
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: Increase descriptor limits.

2017-04-03 Thread Bas Nieuwenhuizen
We supported more generally. Decreased the dynamic buffers though, as
we only support 16 for uniform+storage.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_device.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 5c48be1d11a..d3aac90468c 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -545,21 +545,21 @@ void radv_GetPhysicalDeviceProperties(
.bufferImageGranularity   = 64, /* A cache line 
*/
.sparseAddressSpaceSize   = 0xu, /* 
buffer max size */
.maxBoundDescriptorSets   = MAX_SETS,
-   .maxPerStageDescriptorSamplers= 64,
-   .maxPerStageDescriptorUniformBuffers  = 64,
-   .maxPerStageDescriptorStorageBuffers  = 64,
-   .maxPerStageDescriptorSampledImages   = 64,
-   .maxPerStageDescriptorStorageImages   = 64,
-   .maxPerStageDescriptorInputAttachments= 64,
-   .maxPerStageResources = 128,
+   .maxPerStageDescriptorSamplers= (1u << 31) / 16,
+   .maxPerStageDescriptorUniformBuffers  = (1u << 31) / 16,
+   .maxPerStageDescriptorStorageBuffers  = (1u << 31) / 16,
+   .maxPerStageDescriptorSampledImages   = (1u << 31) / 96,
+   .maxPerStageDescriptorStorageImages   = (1u << 31) / 64,
+   .maxPerStageDescriptorInputAttachments= (1u << 31) / 64,
+   .maxPerStageResources = (1u << 31) / 32,
.maxDescriptorSetSamplers = 256,
-   .maxDescriptorSetUniformBuffers   = 256,
-   .maxDescriptorSetUniformBuffersDynamic= 256,
-   .maxDescriptorSetStorageBuffers   = 256,
-   .maxDescriptorSetStorageBuffersDynamic= 256,
-   .maxDescriptorSetSampledImages= 256,
-   .maxDescriptorSetStorageImages= 256,
-   .maxDescriptorSetInputAttachments = 256,
+   .maxDescriptorSetUniformBuffers   = (1u << 31) / 16,
+   .maxDescriptorSetUniformBuffersDynamic= 8,
+   .maxDescriptorSetStorageBuffers   = (1u << 31) / 16,
+   .maxDescriptorSetStorageBuffersDynamic= 8,
+   .maxDescriptorSetSampledImages= (1u << 31) / 96,
+   .maxDescriptorSetStorageImages= (1u << 31) / 64,
+   .maxDescriptorSetInputAttachments = (1u << 31) / 64,
.maxVertexInputAttributes = 32,
.maxVertexInputBindings   = 32,
.maxVertexInputAttributeOffset= 2047,
-- 
2.12.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing

2017-04-03 Thread Matt Turner
On Thu, Mar 30, 2017 at 3:47 PM, Matt Turner  wrote:
> On Thu, Mar 30, 2017 at 3:26 PM, Grazvydas Ignotas 
wrote:
>> There are still some distributions trying to support unfortunate people
>> with old or exotic CPUs that don't have 64bit atomic operations. When
>> compiling for such a machine, gcc conveniently inserts a library call to
>> a helper, but it's implementation is missing and we get a linker error.
>> This allows us to provide our own implementation, which is marked weak
>> to prefer a better implementation, should one exist.
>>
>> v2: changed copyright, some style adjustments
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
>> Signed-off-by: Grazvydas Ignotas 
>> Reviewed-by: Matt Turner 
>
> Thanks, I'll commit this.
>
>> ---
>>  no commit access, but request sent:
>>  https://bugs.freedesktop.org/show_bug.cgi?id=100467
>
> Thanks. I commented on the bug and said I approve :)

I modified the patch slightly to print the results of the test with
AC_MSG_CHECKING/AC_MSG_RESULT and pushed it as commit a6a38a038bd62e.

I'll do some build tests on various architectures. I think I like the idea
of this going to the stable 17.0 branch.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv/winsys: only workout color/depth levels if we have color/depth

2017-04-03 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Mon, Apr 3, 2017 at 7:17 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This fixes an old bug that seems to get triggered by
> dEQP-VK.memory.requirements.image.sparse_tiling_optimal
>
> We return early when allocating S8_UINT due to there being
> no color or depth, and end up with image size of 0.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c | 48 
> --
>  1 file changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c 
> b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c
> index 0433952..53ca6ff 100644
> --- a/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c
> +++ b/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_surface.c
> @@ -481,28 +481,32 @@ static int radv_amdgpu_winsys_surface_init(struct 
> radeon_winsys *_ws,
> surf->htile_size = surf->htile_slice_size = 0;
> surf->htile_alignment = 1;
>
> -   /* Calculate texture layout information. */
> -   for (level = 0; level <= surf->last_level; level++) {
> -   r = radv_compute_level(ws->addrlib, surf, false, level, type, 
> compressed,
> -  &AddrSurfInfoIn, &AddrSurfInfoOut, 
> &AddrDccIn, &AddrDccOut);
> -   if (r)
> -   return r;
> -
> -   if (level == 0) {
> -   surf->bo_alignment = AddrSurfInfoOut.baseAlign;
> -   surf->pipe_config = 
> AddrSurfInfoOut.pTileInfo->pipeConfig - 1;
> -   radv_set_micro_tile_mode(surf, &ws->info);
> -
> -   /* For 2D modes only. */
> -   if (AddrSurfInfoOut.tileMode >= 
> ADDR_TM_2D_TILED_THIN1) {
> -   surf->bankw = 
> AddrSurfInfoOut.pTileInfo->bankWidth;
> -   surf->bankh = 
> AddrSurfInfoOut.pTileInfo->bankHeight;
> -   surf->mtilea = 
> AddrSurfInfoOut.pTileInfo->macroAspectRatio;
> -   surf->tile_split = 
> AddrSurfInfoOut.pTileInfo->tileSplitBytes;
> -   surf->num_banks = 
> AddrSurfInfoOut.pTileInfo->banks;
> -   surf->macro_tile_index = 
> AddrSurfInfoOut.macroModeIndex;
> -   } else {
> -   surf->macro_tile_index = 0;
> +   if (AddrSurfInfoIn.flags.color || AddrSurfInfoIn.flags.depth) {
> +   /* Calculate texture layout information. */
> +   for (level = 0; level <= surf->last_level; level++) {
> +   r = radv_compute_level(ws->addrlib, surf, false, 
> level, type, compressed,
> +  &AddrSurfInfoIn, 
> &AddrSurfInfoOut, &AddrDccIn, &AddrDccOut);
> +   if (r) {
> +   assert(0);
> +   return r;
> +   }
> +
> +   if (level == 0) {
> +   surf->bo_alignment = 
> AddrSurfInfoOut.baseAlign;
> +   surf->pipe_config = 
> AddrSurfInfoOut.pTileInfo->pipeConfig - 1;
> +   radv_set_micro_tile_mode(surf, &ws->info);
> +
> +   /* For 2D modes only. */
> +   if (AddrSurfInfoOut.tileMode >= 
> ADDR_TM_2D_TILED_THIN1) {
> +   surf->bankw = 
> AddrSurfInfoOut.pTileInfo->bankWidth;
> +   surf->bankh = 
> AddrSurfInfoOut.pTileInfo->bankHeight;
> +   surf->mtilea = 
> AddrSurfInfoOut.pTileInfo->macroAspectRatio;
> +   surf->tile_split = 
> AddrSurfInfoOut.pTileInfo->tileSplitBytes;
> +   surf->num_banks = 
> AddrSurfInfoOut.pTileInfo->banks;
> +   surf->macro_tile_index = 
> AddrSurfInfoOut.macroModeIndex;
> +   } else {
> +   surf->macro_tile_index = 0;
> +   }
> }
> }
> }
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] util: fix MSVC warning in u_align_u32()

2017-04-03 Thread Emil Velikov
On 3 April 2017 at 17:09, Brian Paul  wrote:
> To silence
> C:\Users\Brian\projects\mesa\src\util/u_vector.h(41) : warning C4146: unary
> minus operator applied to unsigned type, result still unsigned

For the series:
Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] bin/get-fixes-pick-list.sh: fix typo

2017-04-03 Thread Emil Velikov
Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation

2017-04-03 Thread Bartosz Tomczyk
---
 src/mesa/main/glthread.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 3f07c420d4..aa14292e59 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx)
 }
 
 static void
-glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch)
+glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch,
+ const bool release_batch)
 {
size_t pos = 0;
 
@@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, struct 
glthread_batch *batch)
 
assert(pos == batch->used);
 
-   free(batch);
+   if (release_batch)
+  free(batch);
+   else
+  memset(batch, 0, offsetof(struct glthread_batch, buffer));
 }
 
 static void *
@@ -103,7 +107,7 @@ glthread_worker(void *data)
   glthread->busy = true;
   pthread_mutex_unlock(&glthread->mutex);
 
-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);
 
   pthread_mutex_lock(&glthread->mutex);
   glthread->busy = false;
@@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context *ctx)
 * need to restore it when it returns.
 */
if (false) {
-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);
   _glapi_set_dispatch(ctx->CurrentClientDispatch);
   return;
}
@@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (!(glthread->batch_queue || glthread->busy)) {
   if (glthread->batch && glthread->batch->used) {
  struct _glapi_table *dispatch = _glapi_get_dispatch();
- glthread_unmarshal_batch(ctx, glthread->batch);
+ glthread_unmarshal_batch(ctx, glthread->batch, false);
  _glapi_set_dispatch(dispatch);
- glthread_allocate_batch(ctx);
   }
} else {
   _mesa_glthread_flush_batch_locked(ctx);
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] intel: gen_decoder: store pointer to current decoded field in iterator

2017-04-03 Thread Matt Turner
Both are

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation

2017-04-03 Thread Nicolai Hähnle

On 03.04.2017 20:38, Bartosz Tomczyk wrote:

---
 src/mesa/main/glthread.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 3f07c420d4..aa14292e59 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx)
 }

 static void
-glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch)
+glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch,
+ const bool release_batch)
 {
size_t pos = 0;

@@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, struct 
glthread_batch *batch)

assert(pos == batch->used);

-   free(batch);
+   if (release_batch)
+  free(batch);
+   else
+  memset(batch, 0, offsetof(struct glthread_batch, buffer));


Hmm. Why do we memset the batch? Seems like a trivial optimization to 
just not do that...


Apart from that, the patch looks good to me.

Cheers,
Nicolai



 }

 static void *
@@ -103,7 +107,7 @@ glthread_worker(void *data)
   glthread->busy = true;
   pthread_mutex_unlock(&glthread->mutex);

-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);

   pthread_mutex_lock(&glthread->mutex);
   glthread->busy = false;
@@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context *ctx)
 * need to restore it when it returns.
 */
if (false) {
-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);
   _glapi_set_dispatch(ctx->CurrentClientDispatch);
   return;
}
@@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (!(glthread->batch_queue || glthread->busy)) {
   if (glthread->batch && glthread->batch->used) {
  struct _glapi_table *dispatch = _glapi_get_dispatch();
- glthread_unmarshal_batch(ctx, glthread->batch);
+ glthread_unmarshal_batch(ctx, glthread->batch, false);
  _glapi_set_dispatch(dispatch);
- glthread_allocate_batch(ctx);
   }
} else {
   _mesa_glthread_flush_batch_locked(ctx);




--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Emil Velikov
Hi Rob,

On 24 March 2017 at 21:21, Rob Clark  wrote:
> It's kinda sad that (a) we don't have debug_backtrace support on !X86
> and that (b) we re-invent our own crude backtrace support in the first
> place.  If available, use libunwind instead.  The backtrace format is
> based on what xserver and weston use, since it is nice not to have to
> figure out a different format.
>
> Signed-off-by: Rob Clark 
> ---
>  configure.ac   | 24 
>  src/gallium/Automake.inc   |  1 +
>  src/gallium/auxiliary/util/u_debug_stack.c | 91 
> ++
>  src/gallium/auxiliary/util/u_debug_stack.h | 15 -
>  4 files changed, 129 insertions(+), 2 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index a99684b..5046acb 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1025,6 +1025,30 @@ AC_SUBST([LLVM_LIBS])
>  AC_SUBST([LLVM_LDFLAGS])
>  AC_SUBST([LLVM_INCLUDEDIR])
>
> +dnl
> +dnl libunwind
> +dnl
> +AC_ARG_ENABLE([libunwind],
> +[AS_HELP_STRING([--enable-libunwind],
> +[Use libunwind for backtracing (default: auto)])],
> +[LIBUNWIND="$enableval"],
> +[LIBUNWIND="auto"])
> +
> +PKG_CHECK_MODULES(LIBUNWIND, libunwind, [HAVE_LIBUNWIND=yes], 
> [HAVE_LIBUNWIND=no])
> +if test "x$LIBUNWIND" = "xauto"; then
> +LIBUNWIND="$HAVE_LIBUNWIND"
> +fi
> +
> +if test "x$LIBUNWIND" = "xyes"; then
> +if test "x$HAVE_LIBUNWIND" != "xyes"; then
> +AC_MSG_ERROR([libunwind requested but not installed.])
> +fi
> +AC_DEFINE(HAVE_LIBUNWIND, 1, [Have libunwind support])
> +fi
> +
> +AM_CONDITIONAL(HAVE_LIBUNWIND, [test "x$LIBUNWIND" = xyes])
> +
> +
>  dnl Options for APIs
>  AC_ARG_ENABLE([opengl],
>  [AS_HELP_STRING([--disable-opengl],
> diff --git a/src/gallium/Automake.inc b/src/gallium/Automake.inc
> index a01fa54..48b5a44 100644
> --- a/src/gallium/Automake.inc
> +++ b/src/gallium/Automake.inc
> @@ -46,6 +46,7 @@ GALLIUM_TARGET_CFLAGS = \
>
>  GALLIUM_COMMON_LIB_DEPS = \
> -lm \
> +   $(LIBUNWIND_LIBS) \

We're using the LIBUNWIND_LIBS but LIBUNWIND_CFLAGS is missing. Please
it to src/gallium/auxiliary/Makefile.am's AM_CFLAGS


> diff --git a/src/gallium/auxiliary/util/u_debug_stack.h 
> b/src/gallium/auxiliary/util/u_debug_stack.h
> index 04eba08..0effcbe 100644
> --- a/src/gallium/auxiliary/util/u_debug_stack.h
> +++ b/src/gallium/auxiliary/util/u_debug_stack.h
> @@ -30,6 +30,11 @@
>
>  #include 
>
> +#ifdef HAVE_LIBUNWIND
> +#define UNW_LOCAL_ONLY
> +#include 
> +#endif
> +
This hunk is not needed in the header. Please move it to the C file.

With the above
Reviewed-by: Emil Velikov 

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] travis: Support LLVM 3.8+ on Trusty-based Travis-CI via apt-get not apt addon

2017-04-03 Thread Emil Velikov
On 2 April 2017 at 21:48, Rhys Kidd  wrote:
> Per comments by Travis-CI, the apt addon is only really needed for the
> container-based Precise builds, as they don't yet support Trusty on that 
> platform.
>
> Mesa currently uses Trusty fully-virtualized environment (due to sudo: 
> required).
>
> See further:
> https://docs.travis-ci.com/user/trusty-ci-environment/#Fully-virtualized-via-sudo%3A-required
> https://github.com/travis-ci/apt-source-whitelist/pull/205#issuecomment-216054237
>
> Signed-off-by: Rhys Kidd 
You're a start Rhys, thank you !

Reviewed-by: Emil Velikov  and pushed to master.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/3] radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places

2017-04-03 Thread Dave Airlie
On 3 April 2017 at 19:52, Marek Olšák  wrote:
> From: Marek Olšák 

radv uses i32zero and i32one, it might be nice to be consistent, but I
don't mind which way.

Dave.
>
> ---
>  src/gallium/drivers/radeonsi/si_shader.c   | 88 
> ++
>  .../drivers/radeonsi/si_shader_tgsi_setup.c| 14 ++--
>  2 files changed, 47 insertions(+), 55 deletions(-)
>
> diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
> b/src/gallium/drivers/radeonsi/si_shader.c
> index 21efd9a..a5d370b 100644
> --- a/src/gallium/drivers/radeonsi/si_shader.c
> +++ b/src/gallium/drivers/radeonsi/si_shader.c
> @@ -538,38 +538,38 @@ static void declare_input_vs(
> break;
> }
>  }
>
>  static LLVMValueRef get_primitive_id(struct lp_build_tgsi_context *bld_base,
>  unsigned swizzle)
>  {
> struct si_shader_context *ctx = si_shader_context(bld_base);
>
> if (swizzle > 0)
> -   return bld_base->uint_bld.zero;
> +   return ctx->i32_0;
>
> switch (ctx->type) {
> case PIPE_SHADER_VERTEX:
> return LLVMGetParam(ctx->main_fn,
> ctx->param_vs_prim_id);
> case PIPE_SHADER_TESS_CTRL:
> return LLVMGetParam(ctx->main_fn,
> SI_PARAM_PATCH_ID);
> case PIPE_SHADER_TESS_EVAL:
> return LLVMGetParam(ctx->main_fn,
> ctx->param_tes_patch_id);
> case PIPE_SHADER_GEOMETRY:
> return LLVMGetParam(ctx->main_fn,
> SI_PARAM_PRIMITIVE_ID);
> default:
> assert(0);
> -   return bld_base->uint_bld.zero;
> +   return ctx->i32_0;
> }
>  }
>
>  /**
>   * Return the value of tgsi_ind_register for indexing.
>   * This is the indirect index with the constant offset added to it.
>   */
>  static LLVMValueRef get_indirect_index(struct si_shader_context *ctx,
>const struct tgsi_ind_register *ind,
>int rel_index)
> @@ -1096,28 +1096,28 @@ static LLVMValueRef fetch_input_gs(
> vtx_offset_param += SI_PARAM_VTX2_OFFSET - 2;
> }
> vtx_offset = lp_build_mul_imm(uint,
>   LLVMGetParam(ctx->main_fn,
>vtx_offset_param),
>   4);
>
> param = si_shader_io_get_unique_index(semantic_name, semantic_index);
> soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle) * 256, 0);
>
> -   value = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1, uint->zero,
> +   value = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1, ctx->i32_0,
>  vtx_offset, soffset, 0, 1, 0, true);
> if (tgsi_type_is_64bit(type)) {
> LLVMValueRef value2;
> soffset = LLVMConstInt(ctx->i32, (param * 4 + swizzle + 1) * 
> 256, 0);
>
> value2 = ac_build_buffer_load(&ctx->ac, ctx->esgs_ring, 1,
> - uint->zero, vtx_offset, soffset,
> + ctx->i32_0, vtx_offset, soffset,
>   0, 1, 0, true);
> return si_llvm_emit_fetch_64bit(bld_base, type,
> value, value2);
> }
> return LLVMBuildBitCast(gallivm->builder,
> value,
> tgsi2llvmtype(bld_base, type), "");
>  }
>
>  static int lookup_interp_param_index(unsigned interpolate, unsigned location)
> @@ -1169,21 +1169,20 @@ static void interp_fs_input(struct si_shader_context 
> *ctx,
> unsigned semantic_index,
> unsigned num_interp_inputs,
> unsigned colors_read_mask,
> LLVMValueRef interp_param,
> LLVMValueRef prim_mask,
> LLVMValueRef face,
> LLVMValueRef result[4])
>  {
> struct lp_build_tgsi_context *bld_base = &ctx->bld_base;
> struct lp_build_context *base = &bld_base->base;
> -   struct lp_build_context *uint = &bld_base->uint_bld;
> struct gallivm_state *gallivm = base->gallivm;
> LLVMValueRef attr_number;
> LLVMValueRef i, j;
>
> unsigned chan;
>
> /* fs.constant returns the param from the middle vertex, so it's not
>  * really useful for flat shading. It's meant to be used for custom
>  * interpolation (but the intrinsic can't fetch from the other two
>  * vertices).
> @@ -1198,41 +1197,41 @@ static void interp_fs_input(struct si_shader_context 
> *ctx,
>  */
>   

Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation

2017-04-03 Thread Bartosz Tomczyk
Actually, I can just set only batch->used to 0, but it seems to error
prone. When someone adds some fields to batch struct, it will be easy to
miss that it should be initialized in glthread_unmarshal_batch.

Anyway I can change it if you want to.

On Mon, Apr 3, 2017 at 8:42 PM, Nicolai Hähnle  wrote:

> On 03.04.2017 20:38, Bartosz Tomczyk wrote:
>
>> ---
>>  src/mesa/main/glthread.c | 15 +--
>>  1 file changed, 9 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
>> index 3f07c420d4..aa14292e59 100644
>> --- a/src/mesa/main/glthread.c
>> +++ b/src/mesa/main/glthread.c
>> @@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx)
>>  }
>>
>>  static void
>> -glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch
>> *batch)
>> +glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch
>> *batch,
>> + const bool release_batch)
>>  {
>> size_t pos = 0;
>>
>> @@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx,
>> struct glthread_batch *batch)
>>
>> assert(pos == batch->used);
>>
>> -   free(batch);
>> +   if (release_batch)
>> +  free(batch);
>> +   else
>> +  memset(batch, 0, offsetof(struct glthread_batch, buffer));
>>
>
> Hmm. Why do we memset the batch? Seems like a trivial optimization to just
> not do that...
>
> Apart from that, the patch looks good to me.
>
> Cheers,
> Nicolai
>
>
>
>  }
>>
>>  static void *
>> @@ -103,7 +107,7 @@ glthread_worker(void *data)
>>glthread->busy = true;
>>pthread_mutex_unlock(&glthread->mutex);
>>
>> -  glthread_unmarshal_batch(ctx, batch);
>> +  glthread_unmarshal_batch(ctx, batch, true);
>>
>>pthread_mutex_lock(&glthread->mutex);
>>glthread->busy = false;
>> @@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context
>> *ctx)
>>  * need to restore it when it returns.
>>  */
>> if (false) {
>> -  glthread_unmarshal_batch(ctx, batch);
>> +  glthread_unmarshal_batch(ctx, batch, true);
>>_glapi_set_dispatch(ctx->CurrentClientDispatch);
>>return;
>> }
>> @@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx)
>> if (!(glthread->batch_queue || glthread->busy)) {
>>if (glthread->batch && glthread->batch->used) {
>>   struct _glapi_table *dispatch = _glapi_get_dispatch();
>> - glthread_unmarshal_batch(ctx, glthread->batch);
>> + glthread_unmarshal_batch(ctx, glthread->batch, false);
>>   _glapi_set_dispatch(dispatch);
>> - glthread_allocate_batch(ctx);
>>}
>> } else {
>>_mesa_glthread_flush_batch_locked(ctx);
>>
>>
>
> --
> Lerne, wie die Welt wirklich ist,
> Aber vergiss niemals, wie sie sein sollte.
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: move to using common buffer load format.

2017-04-03 Thread Dave Airlie
From: Dave Airlie 

Get rid of usage of SI.vs.load.input.

Signed-off-by: Dave Airlie 
---
 src/amd/common/ac_nir_to_llvm.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 520e4cf..da38331 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -4564,7 +4564,6 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
LLVMValueRef t_list_ptr = ctx->vertex_buffers;
LLVMValueRef t_offset;
LLVMValueRef t_list;
-   LLVMValueRef args[3];
LLVMValueRef input;
LLVMValueRef buffer_index;
int index = variable->data.location - VERT_ATTRIB_GENERIC0;
@@ -4586,13 +4585,11 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
t_offset = LLVMConstInt(ctx->i32, index + i, false);
 
t_list = ac_build_indexed_load_const(&ctx->ac, t_list_ptr, 
t_offset);
-   args[0] = t_list;
-   args[1] = LLVMConstInt(ctx->i32, 0, false);
-   args[2] = buffer_index;
-   input = ac_build_intrinsic(&ctx->ac,
-   "llvm.SI.vs.load.input", ctx->v4f32, args, 3,
-   AC_FUNC_ATTR_READNONE | AC_FUNC_ATTR_NOUNWIND |
-   AC_FUNC_ATTR_LEGACY);
+
+   input = ac_build_buffer_load_format(&ctx->ac, t_list,
+   buffer_index,
+   LLVMConstInt(ctx->i32, 0, 
false),
+   true);
 
for (unsigned chan = 0; chan < 4; chan++) {
LLVMValueRef llvm_chan = LLVMConstInt(ctx->i32, chan, 
false);
-- 
2.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation

2017-04-03 Thread Nicolai Hähnle

On 03.04.2017 20:53, Bartosz Tomczyk wrote:

Actually, I can just set only batch->used to 0, but it seems to error
prone. When someone adds some fields to batch struct, it will be easy to
miss that it should be initialized in glthread_unmarshal_batch.


Better to have it fail early and loudly with garbage data, rather than 
silently if 0 happens to be an accepted value. I think it should be 
changed (unless there's a good reason to rely on it on purpose, but I 
don't see one...).


Cheers,
Nicolai




Anyway I can change it if you want to.

On Mon, Apr 3, 2017 at 8:42 PM, Nicolai Hähnle mailto:nhaeh...@gmail.com>> wrote:

On 03.04.2017 20:38, Bartosz Tomczyk wrote:

---
 src/mesa/main/glthread.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 3f07c420d4..aa14292e59 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx)
 }

 static void
-glthread_unmarshal_batch(struct gl_context *ctx, struct
glthread_batch *batch)
+glthread_unmarshal_batch(struct gl_context *ctx, struct
glthread_batch *batch,
+ const bool release_batch)
 {
size_t pos = 0;

@@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context
*ctx, struct glthread_batch *batch)

assert(pos == batch->used);

-   free(batch);
+   if (release_batch)
+  free(batch);
+   else
+  memset(batch, 0, offsetof(struct glthread_batch, buffer));


Hmm. Why do we memset the batch? Seems like a trivial optimization
to just not do that...

Apart from that, the patch looks good to me.

Cheers,
Nicolai



 }

 static void *
@@ -103,7 +107,7 @@ glthread_worker(void *data)
   glthread->busy = true;
   pthread_mutex_unlock(&glthread->mutex);

-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);

   pthread_mutex_lock(&glthread->mutex);
   glthread->busy = false;
@@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct
gl_context *ctx)
 * need to restore it when it returns.
 */
if (false) {
-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);
   _glapi_set_dispatch(ctx->CurrentClientDispatch);
   return;
}
@@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (!(glthread->batch_queue || glthread->busy)) {
   if (glthread->batch && glthread->batch->used) {
  struct _glapi_table *dispatch = _glapi_get_dispatch();
- glthread_unmarshal_batch(ctx, glthread->batch);
+ glthread_unmarshal_batch(ctx, glthread->batch, false);
  _glapi_set_dispatch(dispatch);
- glthread_allocate_batch(ctx);
   }
} else {
   _mesa_glthread_flush_batch_locked(ctx);



--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.





--
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] [RFC] mesa/glthread: misaligned address access

2017-04-03 Thread Bartosz Tomczyk
Address sanitizer reports lot of misaligned access:
SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in
main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 
for type
'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment
0x631000104866: note: pointer points here
 92 88 00 00 00 00  00 00 4a 03 0c 00 93 88  00 00 00 00 00 00 02 01  0c 00 40 
8d 00 00 00 00  00 00
 ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 
in
main/marshal_generated.c:28726:12: runtime error: member access within 
misaligned address 0x6310003fc874 for type
'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment
0x6310003fc874: note: pointer points here
  01 00 00 00 7a 02 20 00  00 00 00 00 be be be be  be be be be be be be be  be 
be be be be be be be
  ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 
in
main/marshal_generated.c:28726:12: runtime error: store to misaligned address 
0x6310003fc87c for type
'GLint' (aka 'int'), which requires 8 byte alignment
0x6310003fc87c: note: pointer points here
  00 00 00 00 be be be be  be be be be be be be be  be be be be be be be be  be 
be be be be be be be

This probably isn't issue for x86 as misaligned access shoul be as fast
as aligned ones. But this could be issue for different architectures
like ARM/MIPS, any exert here ?

Any idea how to get correct aligment on different platforms ?
---
 src/mesa/main/marshal.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/marshal.h b/src/mesa/main/marshal.h
index 2f1509b2d5..4842d27eeb 100644
--- a/src/mesa/main/marshal.h
+++ b/src/mesa/main/marshal.h
@@ -32,6 +32,7 @@
 
 #include "main/glthread.h"
 #include "main/context.h"
+#include "main/macros.h"
 
 struct marshal_cmd_base
 {
@@ -55,15 +56,16 @@ _mesa_glthread_allocate_command(struct gl_context *ctx,
 {
struct glthread_state *glthread = ctx->GLThread;
struct marshal_cmd_base *cmd_base;
+   const size_t aligned_size = ALIGN(size, 8);
 
if (unlikely(glthread->batch->used + size > MARSHAL_MAX_CMD_SIZE))
   _mesa_glthread_flush_batch(ctx);
 
cmd_base = (struct marshal_cmd_base *)
   &glthread->batch->buffer[glthread->batch->used];
-   glthread->batch->used += size;
+   glthread->batch->used += aligned_size;
cmd_base->cmd_id = cmd_id;
-   cmd_base->cmd_size = size;
+   cmd_base->cmd_size = aligned_size;
return cmd_base;
 }
 
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: move to using common buffer load format.

2017-04-03 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen 

On Mon, Apr 3, 2017 at 8:57 PM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> Get rid of usage of SI.vs.load.input.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 13 +
>  1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 520e4cf..da38331 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -4564,7 +4564,6 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
> LLVMValueRef t_list_ptr = ctx->vertex_buffers;
> LLVMValueRef t_offset;
> LLVMValueRef t_list;
> -   LLVMValueRef args[3];
> LLVMValueRef input;
> LLVMValueRef buffer_index;
> int index = variable->data.location - VERT_ATTRIB_GENERIC0;
> @@ -4586,13 +4585,11 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
> t_offset = LLVMConstInt(ctx->i32, index + i, false);
>
> t_list = ac_build_indexed_load_const(&ctx->ac, t_list_ptr, 
> t_offset);
> -   args[0] = t_list;
> -   args[1] = LLVMConstInt(ctx->i32, 0, false);
> -   args[2] = buffer_index;
> -   input = ac_build_intrinsic(&ctx->ac,
> -   "llvm.SI.vs.load.input", ctx->v4f32, args, 3,
> -   AC_FUNC_ATTR_READNONE | AC_FUNC_ATTR_NOUNWIND |
> -   AC_FUNC_ATTR_LEGACY);
> +
> +   input = ac_build_buffer_load_format(&ctx->ac, t_list,
> +   buffer_index,
> +   LLVMConstInt(ctx->i32, 0, 
> false),
> +   true);
>
> for (unsigned chan = 0; chan < 4; chan++) {
> LLVMValueRef llvm_chan = LLVMConstInt(ctx->i32, chan, 
> false);
> --
> 2.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa/glthread: Avoid unnecessary batch reallocation

2017-04-03 Thread Bartosz Tomczyk
---
 src/mesa/main/glthread.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/glthread.c b/src/mesa/main/glthread.c
index 3f07c420d4..c4d3f4a434 100644
--- a/src/mesa/main/glthread.c
+++ b/src/mesa/main/glthread.c
@@ -53,7 +53,8 @@ glthread_allocate_batch(struct gl_context *ctx)
 }
 
 static void
-glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch)
+glthread_unmarshal_batch(struct gl_context *ctx, struct glthread_batch *batch,
+ const bool release_batch)
 {
size_t pos = 0;
 
@@ -64,7 +65,10 @@ glthread_unmarshal_batch(struct gl_context *ctx, struct 
glthread_batch *batch)
 
assert(pos == batch->used);
 
-   free(batch);
+   if (release_batch)
+  free(batch);
+   else
+  batch->used = 0;
 }
 
 static void *
@@ -103,7 +107,7 @@ glthread_worker(void *data)
   glthread->busy = true;
   pthread_mutex_unlock(&glthread->mutex);
 
-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);
 
   pthread_mutex_lock(&glthread->mutex);
   glthread->busy = false;
@@ -214,7 +218,7 @@ _mesa_glthread_flush_batch_locked(struct gl_context *ctx)
 * need to restore it when it returns.
 */
if (false) {
-  glthread_unmarshal_batch(ctx, batch);
+  glthread_unmarshal_batch(ctx, batch, true);
   _glapi_set_dispatch(ctx->CurrentClientDispatch);
   return;
}
@@ -269,9 +273,8 @@ _mesa_glthread_finish(struct gl_context *ctx)
if (!(glthread->batch_queue || glthread->busy)) {
   if (glthread->batch && glthread->batch->used) {
  struct _glapi_table *dispatch = _glapi_get_dispatch();
- glthread_unmarshal_batch(ctx, glthread->batch);
+ glthread_unmarshal_batch(ctx, glthread->batch, false);
  _glapi_set_dispatch(dispatch);
- glthread_allocate_batch(ctx);
   }
} else {
   _mesa_glthread_flush_batch_locked(ctx);
-- 
2.12.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radv/ac: round cube array coordinate before fixup.

2017-04-03 Thread Bas Nieuwenhuizen
Series is:

Reviewed-by: Bas Nieuwenhuizen 

On Mon, Apr 3, 2017 at 5:46 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> This fixes:
> dEQP-VK.glsl.texture_functions.texture.samplercubearray*
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 6985371..adba539 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -4237,6 +4237,8 @@ static void visit_tex(struct nir_to_llvm_context *ctx, 
> nir_tex_instr *instr)
> }
>
> if (instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE && coord) {
> +   if (instr->is_array && instr->op != nir_texop_lod)
> +   coords[3] = apply_round_slice(ctx, coords[3]);
> for (chan = 0; chan < instr->coord_components; chan++)
> coords[chan] = to_float(ctx, coords[chan]);
> if (instr->coord_components == 3)
> @@ -4264,7 +4266,9 @@ static void visit_tex(struct nir_to_llvm_context *ctx, 
> nir_tex_instr *instr)
> }
> if (instr->coord_components > 2) {
> /* This seems like a bit of a hack - but it passes 
> Vulkan CTS with it */
> -   if (instr->sampler_dim != GLSL_SAMPLER_DIM_3D && 
> instr->op != nir_texop_txf) {
> +   if (instr->sampler_dim != GLSL_SAMPLER_DIM_3D &&
> +   instr->sampler_dim != GLSL_SAMPLER_DIM_CUBE &&
> +   instr->op != nir_texop_txf) {
> coords[2] = apply_round_slice(ctx, coords[2]);
> }
> address[count++] = coords[2];
> --
> 2.7.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache

2017-04-03 Thread Chad Versace
On Fri 31 Mar 2017, Chad Versace wrote:
> On Wed 15 Mar 2017, Jason Ekstrand wrote:
> > This cache allows us to easily ensure that we have a unique anv_bo for
> > each gem handle.  We'll need this in order to support multiple-import of
> > memory objects and semaphores.
> > 
> > v2 (Jason Ekstrand):
> >  - Reject BO imports if the size doesn't match the prime fd size as
> >reported by lseek().
> > 
> > v3 (Jason Ekstrand):
> >  - Fix reference counting around cache_release (Chris Willson)
> >  - Move the mutex_unlock() later in cache_release
> > ---
> >  src/intel/vulkan/anv_allocator.c | 261 
> > +++
> >  src/intel/vulkan/anv_private.h   |  26 
> >  2 files changed, 287 insertions(+)
> 
> 
> > +static uint32_t
> > +hash_uint32_t(const void *key)
> > +{
> > +   return (uint32_t)(uintptr_t)key;
> > +}
> 
> This hash function does not appear hashy.
> 
> If I correctly understand the details of Mesa's struct hash_table,
> choosing the identify function for the hash function causes unwanted
> clustering when inserting consecutive gem handles.  Since the kernel does
> allocate gem handles consecutively, the problem is real.
> 
> For proof, consider the following:
> 
>- Suppose a long-running process (maybe the compositor) has thrashed on the
>  hash table long enough that its bucket count
>  is ht->size = hash_sizes[7].size = 283. Suppose a spike of
>  compositor activity raises the hash table's density to about 0.5.
>  And suppose the hash table buckets are filled with the consecutive gem
>  handles
>  
>  {0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 }
>  
>  The exact density is (128 - 4 + 1) / 283 = 0.4417.
> 
>- Next, some other in-process activity (maybe OpenGL) generated
>  a lot of gem handles after Vulkan's most recently imported
>  gem handle, 128.

This point in the example---the reason why the gem handles in the
anv_bo_cache skip from 128 to 287---is bogus in Vulkan. The problem *is*
real for multiple in-process OpenGL contexts derived from the same
EGLDisplay, using EGL_EXT_image_dma_buf_import, because each context
shares the same intel_screen, and therefore the same drm device fd. But
in Vulkan, each VkDevice opens its own drm device id. So, bogus example.

BUT, that leads to a new question...

Since each VkDevice has a unique drm device fd, and since the kernel
allocates gem handles consecutively on the fd, and since struct
hash_table only grows and never shrinks, and since patch 8/18 inserts
every VkDeviceMemory into the cache... I believe no collisions are
possible in anv_bo_cache.

If there are no collisions, then the hash table is only adding overhead,
and we should use a direct-addressing lookup table. The bo cache should
look like this:

struct anv_bo_cache {
   /* The array indices are gem handles. Null entries are legal. */
   struct anv_bo **bos;

   /* Length of the array. Because the array can have holes, this
* is *not* the number of gem handles in the array. 
*/
   size_t len; 

   pthread_mutex_t mutex;
};

struct anv_bo *
anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t gem_handle)
{
   struct anv_bo *bo = NULL;

   pthread_mutex_lock(&cache->mutex);

   if (gem_handle < cache->len)
  bo = cache->entries[gem_handle] == NULL)

   pthread_mutex_unlock(&cache->mutex);

   return bo;

}

BUT, that leads to yet another question...

Why is patch 8/18 inserting every VkDeviceMemory into the cache? If
I understand things correctly, we only *need* to insert a VkDeviceMemory
into the cache if, in vkAllocateMemory, either (1)
VkExportMemoryAllocateInfoKHX::handleTypes != 0 or (2)
VkMemoryAllocateInfo's pNext chain contains an import structure.

If we insert into the cache only those VkDeviceMemories that are
imported or that will be exported, then the bo cache remains small, and
we *should* use a hash table.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radv: Enable VK_KHR_incremental_present.

2017-04-03 Thread Bas Nieuwenhuizen
Just enabling the driver-independent implementation that Jason did.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_device.c   |  4 
 src/amd/vulkan/radv_entrypoints_gen.py |  1 +
 src/amd/vulkan/radv_wsi.c  | 11 ++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 5c48be1d11a..6456eb17f56 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -92,6 +92,10 @@ static const VkExtensionProperties instance_extensions[] = {
 
 static const VkExtensionProperties common_device_extensions[] = {
{
+   .extensionName = VK_KHR_INCREMENTAL_PRESENT_EXTENSION_NAME,
+   .specVersion = 1,
+   },
+   {
.extensionName = VK_KHR_MAINTENANCE1_EXTENSION_NAME,
.specVersion = 1,
},
diff --git a/src/amd/vulkan/radv_entrypoints_gen.py 
b/src/amd/vulkan/radv_entrypoints_gen.py
index b7b2bcf97e4..bee29dc1202 100644
--- a/src/amd/vulkan/radv_entrypoints_gen.py
+++ b/src/amd/vulkan/radv_entrypoints_gen.py
@@ -31,6 +31,7 @@ supported_extensions = [
'VK_AMD_draw_indirect_count',
'VK_NV_dedicated_allocation',
'VK_KHR_get_physical_device_properties2',
+   'VK_KHR_incremental_present',
'VK_KHR_maintenance1',
'VK_KHR_sampler_mirror_clamp_to_edge',
'VK_KHR_shader_draw_parameters',
diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index 8b66095b263..b8999f4eb02 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -26,6 +26,7 @@
 #include "radv_private.h"
 #include "radv_meta.h"
 #include "wsi_common.h"
+#include "util/vk_util.h"
 
 static const struct wsi_callbacks wsi_cbs = {
.get_phys_device_format_properties = radv_GetPhysicalDeviceFormatProperties,
@@ -452,9 +453,14 @@ VkResult radv_QueuePresentKHR(
RADV_FROM_HANDLE(radv_queue, queue, _queue);
VkResult result = VK_SUCCESS;
 
+   const VkPresentRegionsKHR *regions =
+vk_find_struct_const(pPresentInfo->pNext, PRESENT_REGIONS_KHR);
+
for (uint32_t i = 0; i < pPresentInfo->swapchainCount; i++) {
RADV_FROM_HANDLE(wsi_swapchain, swapchain, 
pPresentInfo->pSwapchains[i]);
struct radeon_winsys_cs *cs;
+   const VkPresentRegionKHR *region = NULL;
+
assert(radv_device_from_handle(swapchain->device) == 
queue->device);
if (swapchain->fences[0] == VK_NULL_HANDLE) {
result = 
radv_CreateFence(radv_device_to_handle(queue->device),
@@ -484,9 +490,12 @@ VkResult radv_QueuePresentKHR(
 pPresentInfo->waitSemaphoreCount, 
NULL, 0, false, base_fence);
fence->submitted = true;
 
+   if (regions && regions->pRegions)
+   region = ®ions->pRegions[i];
+
result = swapchain->queue_present(swapchain,
  
pPresentInfo->pImageIndices[i],
- NULL);
+ region);
/* TODO: What if one of them returns OUT_OF_DATE? */
if (result != VK_SUCCESS)
return result;
-- 
2.12.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radv: overhaul fragment shader sample positions.

2017-04-03 Thread Bas Nieuwenhuizen
Series is

Reviewed-by: Bas Nieuwenhuizen 

Not really a fan of yet another flag in the command buffer, but not
sure what would be the best solution.

On Mon, Apr 3, 2017 at 5:44 AM, Dave Airlie  wrote:
> From: Dave Airlie 
>
> The current code was broken, and I decided to redesign it instead.
>
> This puts the sample positions for all samples into the queue
> constant descriptor buffer after all the spill/ring descriptors.
>
> It then uses a single offset register to point how far into the
> samples the samples for num_samples are. This saves one user sgpr
> and means we only generate the sample position data in the rare
> single case where we need it currently.
>
> This doesn't fix the failing CTS tests without the followup
> fix.
>
> Signed-off-by: Dave Airlie 
> ---
>  src/amd/common/ac_nir_to_llvm.c  | 31 +++-
>  src/amd/common/ac_nir_to_llvm.h  |  4 ++-
>  src/amd/vulkan/radv_cmd_buffer.c | 61 
> 
>  src/amd/vulkan/radv_device.c | 40 ++
>  src/amd/vulkan/radv_private.h|  2 ++
>  5 files changed, 87 insertions(+), 51 deletions(-)
>
> diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
> index 520e4cf..aabcabe 100644
> --- a/src/amd/common/ac_nir_to_llvm.c
> +++ b/src/amd/common/ac_nir_to_llvm.c
> @@ -109,7 +109,7 @@ struct nir_to_llvm_context {
> LLVMValueRef hs_ring_tess_factor;
>
> LLVMValueRef prim_mask;
> -   LLVMValueRef sample_positions;
> +   LLVMValueRef sample_pos_offset;
> LLVMValueRef persp_sample, persp_center, persp_centroid;
> LLVMValueRef linear_sample, linear_center, linear_centroid;
> LLVMValueRef front_face;
> @@ -573,6 +573,7 @@ static void create_function(struct nir_to_llvm_context 
> *ctx)
> ctx->stage == MESA_SHADER_VERTEX ||
> ctx->stage == MESA_SHADER_TESS_CTRL ||
> ctx->stage == MESA_SHADER_TESS_EVAL ||
> +   ctx->stage == MESA_SHADER_FRAGMENT ||
> ctx->is_gs_copy_shader)
> need_ring_offsets = true;
>
> @@ -584,7 +585,7 @@ static void create_function(struct nir_to_llvm_context 
> *ctx)
> need_push_constants = false;
>
> if (need_ring_offsets && !ctx->options->supports_spill) {
> -   arg_types[arg_idx++] = const_array(ctx->v16i8, 8); /* address 
> of rings */
> +   arg_types[arg_idx++] = const_array(ctx->v16i8, 16); /* 
> address of rings */
> }
>
> /* 1 for each descriptor set */
> @@ -679,7 +680,7 @@ static void create_function(struct nir_to_llvm_context 
> *ctx)
> arg_types[arg_idx++] = ctx->i32; // GS instance id
> break;
> case MESA_SHADER_FRAGMENT:
> -   arg_types[arg_idx++] = const_array(ctx->f32, 32); /* sample 
> positions */
> +   arg_types[arg_idx++] = ctx->i32; /* sample position offset */
> user_sgpr_count = arg_idx;
> arg_types[arg_idx++] = ctx->i32; /* prim mask */
> sgpr_count = arg_idx;
> @@ -735,7 +736,7 @@ static void create_function(struct nir_to_llvm_context 
> *ctx)
>
> LLVMPointerType(ctx->i8, CONST_ADDR_SPACE),
>NULL, 0, 
> AC_FUNC_ATTR_READNONE);
> ctx->ring_offsets = LLVMBuildBitCast(ctx->builder, 
> ctx->ring_offsets,
> -
> const_array(ctx->v16i8, 8), "");
> +
> const_array(ctx->v16i8, 16), "");
> } else
> ctx->ring_offsets = LLVMGetParam(ctx->main_function, 
> arg_idx++);
> }
> @@ -844,9 +845,9 @@ static void create_function(struct nir_to_llvm_context 
> *ctx)
> ctx->gs_invocation_id = LLVMGetParam(ctx->main_function, 
> arg_idx++);
> break;
> case MESA_SHADER_FRAGMENT:
> -   set_userdata_location_shader(ctx, AC_UD_PS_SAMPLE_POS, 
> user_sgpr_idx, 2);
> -   user_sgpr_idx += 2;
> -   ctx->sample_positions = LLVMGetParam(ctx->main_function, 
> arg_idx++);
> +   set_userdata_location_shader(ctx, AC_UD_PS_SAMPLE_POS_OFFSET, 
> user_sgpr_idx, 1);
> +   user_sgpr_idx += 1;
> +   ctx->sample_pos_offset = LLVMGetParam(ctx->main_function, 
> arg_idx++);
> ctx->prim_mask = LLVMGetParam(ctx->main_function, arg_idx++);
> ctx->persp_sample = LLVMGetParam(ctx->main_function, 
> arg_idx++);
> ctx->persp_center = LLVMGetParam(ctx->main_function, 
> arg_idx++);
> @@ -3518,15 +3519,17 @@ static LLVMValueRef lookup_interp_param(struct 
> nir_to_llvm_context *ctx,
>  static LLVMValueRef load_sample_position(struct nir_to_llvm_context *ctx,
>  LLVMValueRef s

Re: [Mesa-dev] [PATCH] i965: oa-counters: keep on reading reports until delimiting timestamp

2017-04-03 Thread Robert Bragg
On Mar 30, 2017 16:16, "Lionel Landwerlin" 
wrote:

While exercising reading report with moderate load, we might have to
wait for all the reports to land in the OA buffer, otherwise we might
miss some reports. That means we need to keep on reading the OA stream
until the last report we read has a timestamp older that the timestamp
recorded by the MI_REPORT_PERF_COUNT at the end of the performance
query.


The expectation is that since we have received the second
MI_REPORT_PERF_COUNT report that implies any periodic/automatic reports
that the could possibly relate to a query must have been written to the
OABUFFER. Since we read() as much as is available from i915 perf when we
come to finish processing a query then we shouldn't miss anything.

We shouldn't synchronously wait in a busy loop until we've found a report
that has a timestamp >= our end MI_RPC because that could be a really
significant amount of time (e.g. 50+ milliseconds on HSW). NB. the periodic
samples are just to account for counter overflow. On Gen8+ the loop would
likely block until the next context switch happens.

Was this proposal based on a code inspection or did you maybe hit a problem
correctly measuring something?

Maybe if based on inspection we can find somewhere to put a comment to
clarify the assumptions above?

Br,
- Robert


Signed-off-by: Lionel Landwerlin 
Cc: Robert Bragg 
---
 src/mesa/drivers/dri/i965/brw_performance_query.c | 51
++-
 1 file changed, 32 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_performance_query.c
b/src/mesa/drivers/dri/i965/brw_performance_query.c
index 4a94e4b3cc..076c59e633 100644
--- a/src/mesa/drivers/dri/i965/brw_performance_query.c
+++ b/src/mesa/drivers/dri/i965/brw_performance_query.c
@@ -647,38 +647,50 @@ discard_all_queries(struct brw_context *brw)
 }

 static bool
-read_oa_samples(struct brw_context *brw)
+read_oa_samples_until(struct brw_context *brw,
+  uint32_t start_timestamp,
+  uint32_t end_timestamp)
 {
-   while (1) {
+   uint32_t last_timestamp = start_timestamp;
+
+   while (last_timestamp < end_timestamp) {
   struct brw_oa_sample_buf *buf = get_free_sample_buf(brw);
+  uint32_t offset;
   int len;

   while ((len = read(brw->perfquery.oa_stream_fd, buf->buf,
- sizeof(buf->buf))) < 0 && errno == EINTR)
+ sizeof(buf->buf))) < 0 &&
+ (errno == EINTR || errno == EAGAIN))
  ;

   if (len <= 0) {
  exec_list_push_tail(&brw->perfquery.free_sample_buffers,
&buf->link);

- if (len < 0) {
-if (errno == EAGAIN)
-   return true;
-else {
-   DBG("Error reading i915 perf samples: %m\n");
-   return false;
-}
- } else {
+ if (len < 0)
+DBG("Error reading i915 perf samples: %m\n");
+ else
 DBG("Spurious EOF reading i915 perf samples\n");
-return false;
- }
+
+ return false;
   }

   buf->len = len;
   exec_list_push_tail(&brw->perfquery.sample_buffers, &buf->link);
+
+  /* Go through the reports and update the last timestamp. */
+  while (offset < buf->len) {
+ const struct drm_i915_perf_record_header *header =
+(const struct drm_i915_perf_record_header *) &buf->buf[offset];
+ uint32_t *report = (uint32_t *) (header + 1);
+
+ if (header->type == DRM_I915_PERF_RECORD_SAMPLE)
+last_timestamp = report[1];
+
+ offset += header->size;
+  }
}

-   unreachable("not reached");
-   return false;
+   return true;
 }

 /**
@@ -709,10 +721,6 @@ accumulate_oa_reports(struct brw_context *brw,

assert(o->Ready);

-   /* Collect the latest periodic OA reports from i915 perf */
-   if (!read_oa_samples(brw))
-  goto error;
-
drm_intel_bo_map(obj->oa.bo, false);
query_buffer = obj->oa.bo->virtual;

@@ -728,6 +736,11 @@ accumulate_oa_reports(struct brw_context *brw,
   goto error;
}

+   /* Collect the latest periodic OA reports from i915 perf */
+   if (!read_oa_samples_until(brw, start[1], end[1]))
+  goto error;
+
+
/* See if we have any periodic reports to accumulate too... */

/* N.B. The oa.samples_head was set when the query began and
--
2.11.0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Thomas Hellstrom
On 04/03/2017 07:33 PM, Thomas Hellstrom wrote:
> On 04/03/2017 07:13 PM, Rob Clark wrote:
>> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom  
>> wrote:
>>> Hi, Rob,
>>>
>>> On 03/24/2017 10:21 PM, Rob Clark wrote:
 It's kinda sad that (a) we don't have debug_backtrace support on !X86
 and that (b) we re-invent our own crude backtrace support in the first
 place.  If available, use libunwind instead.  The backtrace format is
 based on what xserver and weston use, since it is nice not to have to
 figure out a different format.

 Signed-off-by: Rob Clark 
>>> Did you consider glibc "backtrace()", I think it's also available on ARM...
>> I had not.. although xserver and weston are already using libunwind.
>> I'm not sure about portability of libunwind to other libc
>> implementations (but I guess it is at least not worse than using a
>> glibc specific API).
>>
>> I suppose we could always add a fallback to backtrace().
>>
Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode:

*** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 ***
Aborted (core dumped)

The svga linux winsys makes extensive use of the backtrace functionality
using u_debug_flush.c

/Thomas

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v3] intel: tools: add aubinator_error_decode tool

2017-04-03 Thread Lionel Landwerlin
This is pretty much the same tool as what i-g-t has, only with a more
fancy decoding of the instructions/registers. It also doesn't support
anything before gen4.

v2 (from Matt): Drop authors
Remove undefined automake variable

v3: Fix incorrect offsets for dword > 1 (Jordan)

Signed-off-by: Lionel Landwerlin 
Acked-by: Matt Turner 
---
 src/intel/Makefile.tools.am  |  19 +-
 src/intel/common/gen_decoder.c   |  11 +
 src/intel/common/gen_decoder.h   |   1 +
 src/intel/tools/.gitignore   |   1 +
 src/intel/tools/aubinator_error_decode.c | 733 +++
 5 files changed, 764 insertions(+), 1 deletion(-)
 create mode 100644 src/intel/tools/aubinator_error_decode.c

diff --git a/src/intel/Makefile.tools.am b/src/intel/Makefile.tools.am
index dc073e8daa..576beea4f2 100644
--- a/src/intel/Makefile.tools.am
+++ b/src/intel/Makefile.tools.am
@@ -19,7 +19,9 @@
 # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 # IN THE SOFTWARE.

-noinst_PROGRAMS += tools/aubinator
+noinst_PROGRAMS += \
+   tools/aubinator \
+   tools/aubinator_error_decode

 tools_aubinator_SOURCES = \
tools/aubinator.c \
@@ -42,3 +44,18 @@ tools_aubinator_LDADD = \
$(EXPAT_LIBS) \
$(ZLIB_LIBS) \
-lm
+
+
+tools_aubinator_error_decode_SOURCES = \
+   tools/aubinator_error_decode.c
+
+tools_aubinator_error_decode_LDADD = \
+   common/libintel_common.la \
+   $(top_builddir)/src/util/libmesautil.la \
+   $(EXPAT_LIBS) \
+   $(ZLIB_LIBS)
+
+tools_aubinator_error_decode_CFLAGS = \
+   $(AM_CFLAGS) \
+   $(EXPAT_CFLAGS) \
+   $(ZLIB_CFLAGS)
diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 6cc6896b05..1ae78c88e3 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -108,6 +108,17 @@ gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset)
return NULL;
 }

+struct gen_group *
+gen_spec_find_register_by_name(struct gen_spec *spec, const char *name)
+{
+   for (int i = 0; i < spec->nregisters; i++) {
+  if (strcmp(spec->registers[i]->name, name) == 0)
+ return spec->registers[i];
+   }
+
+   return NULL;
+}
+
 struct gen_enum *
 gen_spec_find_enum(struct gen_spec *spec, const char *name)
 {
diff --git a/src/intel/common/gen_decoder.h b/src/intel/common/gen_decoder.h
index be37e8a542..870bd7f784 100644
--- a/src/intel/common/gen_decoder.h
+++ b/src/intel/common/gen_decoder.h
@@ -45,6 +45,7 @@ struct gen_spec *gen_spec_load_from_path(const struct 
gen_device_info *devinfo,
 uint32_t gen_spec_get_gen(struct gen_spec *spec);
 struct gen_group *gen_spec_find_instruction(struct gen_spec *spec, const 
uint32_t *p);
 struct gen_group *gen_spec_find_register(struct gen_spec *spec, uint32_t 
offset);
+struct gen_group *gen_spec_find_register_by_name(struct gen_spec *spec, const 
char *name);
 int gen_group_get_length(struct gen_group *group, const uint32_t *p);
 const char *gen_group_get_name(struct gen_group *group);
 uint32_t gen_group_get_opcode(struct gen_group *group);
diff --git a/src/intel/tools/.gitignore b/src/intel/tools/.gitignore
index 0c80a6fed2..27437f9eef 100644
--- a/src/intel/tools/.gitignore
+++ b/src/intel/tools/.gitignore
@@ -1 +1,2 @@
 /aubinator
+/aubinator_error_decode
diff --git a/src/intel/tools/aubinator_error_decode.c 
b/src/intel/tools/aubinator_error_decode.c
new file mode 100644
index 00..b55cc33761
--- /dev/null
+++ b/src/intel/tools/aubinator_error_decode.c
@@ -0,0 +1,733 @@
+/*
+ * Copyright © 2007-2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "common/gen_decoder.h"
+#include "util/macros.h"
+
+#

Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Rob Clark
On Mon, Apr 3, 2017 at 4:06 PM, Thomas Hellstrom  wrote:
> On 04/03/2017 07:33 PM, Thomas Hellstrom wrote:
>> On 04/03/2017 07:13 PM, Rob Clark wrote:
>>> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom  
>>> wrote:
 Hi, Rob,

 On 03/24/2017 10:21 PM, Rob Clark wrote:
> It's kinda sad that (a) we don't have debug_backtrace support on !X86
> and that (b) we re-invent our own crude backtrace support in the first
> place.  If available, use libunwind instead.  The backtrace format is
> based on what xserver and weston use, since it is nice not to have to
> figure out a different format.
>
> Signed-off-by: Rob Clark 
 Did you consider glibc "backtrace()", I think it's also available on ARM...
>>> I had not.. although xserver and weston are already using libunwind.
>>> I'm not sure about portability of libunwind to other libc
>>> implementations (but I guess it is at least not worse than using a
>>> glibc specific API).
>>>
>>> I suppose we could always add a fallback to backtrace().
>>>
> Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode:
>
> *** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 ***
> Aborted (core dumped)
>
> The svga linux winsys makes extensive use of the backtrace functionality
> using u_debug_flush.c

hmm, do you think I should be able to reproduce by sprinkling some
calls to debug_flush_alert()?

I was using it mostly w/ the refcnt logging before..

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: Enable VK_KHR_incremental_present.

2017-04-03 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand 


On April 3, 2017 12:36:16 PM Bas Nieuwenhuizen  wrote:


Just enabling the driver-independent implementation that Jason did.

Signed-off-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_device.c   |  4 
 src/amd/vulkan/radv_entrypoints_gen.py |  1 +
 src/amd/vulkan/radv_wsi.c  | 11 ++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 5c48be1d11a..6456eb17f56 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vulkan/radv_device.c
@@ -92,6 +92,10 @@ static const VkExtensionProperties instance_extensions[] = {

 static const VkExtensionProperties common_device_extensions[] = {
{
+   .extensionName = VK_KHR_INCREMENTAL_PRESENT_EXTENSION_NAME,
+   .specVersion = 1,
+   },
+   {
.extensionName = VK_KHR_MAINTENANCE1_EXTENSION_NAME,
.specVersion = 1,
},
diff --git a/src/amd/vulkan/radv_entrypoints_gen.py 
b/src/amd/vulkan/radv_entrypoints_gen.py

index b7b2bcf97e4..bee29dc1202 100644
--- a/src/amd/vulkan/radv_entrypoints_gen.py
+++ b/src/amd/vulkan/radv_entrypoints_gen.py
@@ -31,6 +31,7 @@ supported_extensions = [
'VK_AMD_draw_indirect_count',
'VK_NV_dedicated_allocation',
'VK_KHR_get_physical_device_properties2',
+   'VK_KHR_incremental_present',
'VK_KHR_maintenance1',
'VK_KHR_sampler_mirror_clamp_to_edge',
'VK_KHR_shader_draw_parameters',
diff --git a/src/amd/vulkan/radv_wsi.c b/src/amd/vulkan/radv_wsi.c
index 8b66095b263..b8999f4eb02 100644
--- a/src/amd/vulkan/radv_wsi.c
+++ b/src/amd/vulkan/radv_wsi.c
@@ -26,6 +26,7 @@
 #include "radv_private.h"
 #include "radv_meta.h"
 #include "wsi_common.h"
+#include "util/vk_util.h"

 static const struct wsi_callbacks wsi_cbs = {
.get_phys_device_format_properties = radv_GetPhysicalDeviceFormatProperties,
@@ -452,9 +453,14 @@ VkResult radv_QueuePresentKHR(
RADV_FROM_HANDLE(radv_queue, queue, _queue);
VkResult result = VK_SUCCESS;

+   const VkPresentRegionsKHR *regions =
+vk_find_struct_const(pPresentInfo->pNext, PRESENT_REGIONS_KHR);
+
for (uint32_t i = 0; i < pPresentInfo->swapchainCount; i++) {
RADV_FROM_HANDLE(wsi_swapchain, swapchain, 
pPresentInfo->pSwapchains[i]);
struct radeon_winsys_cs *cs;
+   const VkPresentRegionKHR *region = NULL;
+
assert(radv_device_from_handle(swapchain->device) == 
queue->device);
if (swapchain->fences[0] == VK_NULL_HANDLE) {
result = 
radv_CreateFence(radv_device_to_handle(queue->device),
@@ -484,9 +490,12 @@ VkResult radv_QueuePresentKHR(
 pPresentInfo->waitSemaphoreCount, 
NULL, 0, false, base_fence);
fence->submitted = true;

+   if (regions && regions->pRegions)
+   region = ®ions->pRegions[i];
+
result = swapchain->queue_present(swapchain,
  
pPresentInfo->pImageIndices[i],
- NULL);
+ region);
/* TODO: What if one of them returns OUT_OF_DATE? */
if (result != VK_SUCCESS)
return result;
--
2.12.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing

2017-04-03 Thread Mark Janes
This commit appears to intermittently provoke gpu hangs on 32-bit Intel
systems when running

piglit.shaders.glsl-max-varyings >max_varying_components

Since the behavior is intermittent, I may have identified the wrong
patch.  Please let me know if this patch seems unlikely to affect the
test.

Matt Turner  writes:

> On Thu, Mar 30, 2017 at 3:47 PM, Matt Turner  wrote:
>> On Thu, Mar 30, 2017 at 3:26 PM, Grazvydas Ignotas 
> wrote:
>>> There are still some distributions trying to support unfortunate people
>>> with old or exotic CPUs that don't have 64bit atomic operations. When
>>> compiling for such a machine, gcc conveniently inserts a library call to
>>> a helper, but it's implementation is missing and we get a linker error.
>>> This allows us to provide our own implementation, which is marked weak
>>> to prefer a better implementation, should one exist.
>>>
>>> v2: changed copyright, some style adjustments
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
>>> Signed-off-by: Grazvydas Ignotas 
>>> Reviewed-by: Matt Turner 
>>
>> Thanks, I'll commit this.
>>
>>> ---
>>>  no commit access, but request sent:
>>>  https://bugs.freedesktop.org/show_bug.cgi?id=100467
>>
>> Thanks. I commented on the bug and said I approve :)
>
> I modified the patch slightly to print the results of the test with
> AC_MSG_CHECKING/AC_MSG_RESULT and pushed it as commit a6a38a038bd62e.
>
> I'll do some build tests on various architectures. I think I like the idea
> of this going to the stable 17.0 branch.
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Rob Clark
On Mon, Apr 3, 2017 at 4:06 PM, Thomas Hellstrom  wrote:
> On 04/03/2017 07:33 PM, Thomas Hellstrom wrote:
>> On 04/03/2017 07:13 PM, Rob Clark wrote:
>>> On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom  
>>> wrote:
 Hi, Rob,

 On 03/24/2017 10:21 PM, Rob Clark wrote:
> It's kinda sad that (a) we don't have debug_backtrace support on !X86
> and that (b) we re-invent our own crude backtrace support in the first
> place.  If available, use libunwind instead.  The backtrace format is
> based on what xserver and weston use, since it is nice not to have to
> figure out a different format.
>
> Signed-off-by: Rob Clark 
 Did you consider glibc "backtrace()", I think it's also available on ARM...
>>> I had not.. although xserver and weston are already using libunwind.
>>> I'm not sure about portability of libunwind to other libc
>>> implementations (but I guess it is at least not worse than using a
>>> glibc specific API).
>>>
>>> I suppose we could always add a fallback to backtrace().
>>>
> Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode:
>
> *** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 ***
> Aborted (core dumped)
>
> The svga linux winsys makes extensive use of the backtrace functionality
> using u_debug_flush.c

ok, I can reproduce.. hopefully patch following shortly..

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 12:31 PM, Chad Versace 
wrote:

> On Fri 31 Mar 2017, Chad Versace wrote:
> > On Wed 15 Mar 2017, Jason Ekstrand wrote:
> > > This cache allows us to easily ensure that we have a unique anv_bo for
> > > each gem handle.  We'll need this in order to support multiple-import
> of
> > > memory objects and semaphores.
> > >
> > > v2 (Jason Ekstrand):
> > >  - Reject BO imports if the size doesn't match the prime fd size as
> > >reported by lseek().
> > >
> > > v3 (Jason Ekstrand):
> > >  - Fix reference counting around cache_release (Chris Willson)
> > >  - Move the mutex_unlock() later in cache_release
> > > ---
> > >  src/intel/vulkan/anv_allocator.c | 261 ++
> +
> > >  src/intel/vulkan/anv_private.h   |  26 
> > >  2 files changed, 287 insertions(+)
> >
> >
> > > +static uint32_t
> > > +hash_uint32_t(const void *key)
> > > +{
> > > +   return (uint32_t)(uintptr_t)key;
> > > +}
> >
> > This hash function does not appear hashy.
> >
> > If I correctly understand the details of Mesa's struct hash_table,
> > choosing the identify function for the hash function causes unwanted
> > clustering when inserting consecutive gem handles.  Since the kernel does
> > allocate gem handles consecutively, the problem is real.
> >
> > For proof, consider the following:
> >
> >- Suppose a long-running process (maybe the compositor) has thrashed
> on the
> >  hash table long enough that its bucket count
> >  is ht->size = hash_sizes[7].size = 283. Suppose a spike of
> >  compositor activity raises the hash table's density to about 0.5.
> >  And suppose the hash table buckets are filled with the consecutive
> gem
> >  handles
> >
> >  {0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 }
> >
> >  The exact density is (128 - 4 + 1) / 283 = 0.4417.
> >
> >- Next, some other in-process activity (maybe OpenGL) generated
> >  a lot of gem handles after Vulkan's most recently imported
> >  gem handle, 128.
>
> This point in the example---the reason why the gem handles in the
> anv_bo_cache skip from 128 to 287---is bogus in Vulkan. The problem *is*
> real for multiple in-process OpenGL contexts derived from the same
> EGLDisplay, using EGL_EXT_image_dma_buf_import, because each context
> shares the same intel_screen, and therefore the same drm device fd. But
> in Vulkan, each VkDevice opens its own drm device id. So, bogus example.
>
> BUT, that leads to a new question...
>
> Since each VkDevice has a unique drm device fd, and since the kernel
> allocates gem handles consecutively on the fd, and since struct
> hash_table only grows and never shrinks, and since patch 8/18 inserts
> every VkDeviceMemory into the cache... I believe no collisions are
> possible in anv_bo_cache.
>

Does this fall under the category of unbreakable kernel ABI or is it just a
side-effect of the implementation?  If not, then I'm reluctant to trust it.


> If there are no collisions, then the hash table is only adding overhead,
>

Sure, but a no-collision hash table is pretty cheap...

and we should use a direct-addressing lookup table. The bo cache should
> look like this:
>
> struct anv_bo_cache {
>/* The array indices are gem handles. Null entries are legal. */
>struct anv_bo **bos;
>
>/* Length of the array. Because the array can have holes, this
> * is *not* the number of gem handles in the array.
> */
>size_t len;
>
>pthread_mutex_t mutex;
> };
>
> struct anv_bo *
> anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t gem_handle)
> {
>struct anv_bo *bo = NULL;
>
>pthread_mutex_lock(&cache->mutex);
>
>if (gem_handle < cache->len)
>   bo = cache->entries[gem_handle] == NULL)
>
>pthread_mutex_unlock(&cache->mutex);
>
>return bo;
>
> }
>
> BUT, that leads to yet another question...
>
> Why is patch 8/18 inserting every VkDeviceMemory into the cache? If
> I understand things correctly, we only *need* to insert a VkDeviceMemory
> into the cache if, in vkAllocateMemory, either (1)
> VkExportMemoryAllocateInfoKHX::handleTypes != 0 or (2)
> VkMemoryAllocateInfo's pNext chain contains an import structure.
>

Because I'm lazy.  In order to start using the bo cache,
anv_device_memory::bo needs to be a pointer (well, it's makes the BO cache
API simpler and more efficient if it's a pointer).  This would mean that we
would have to allocate an additional chunk of memory or go through some
other hoops in order to make it work.  At the end of the day, just stuffing
everything in the cache was simpler and kept us to a single path.


> If we insert into the cache only those VkDeviceMemories that are
> imported or that will be exported, then the bo cache remains small, and
> we *should* use a hash table.
>

Maybe.  But the client isn't supposed to be allocating hundreds of
VkDeviceMemory objects.  It's supposed to allocate a few 

Re: [Mesa-dev] [PATCH 2/2] gallium/util: libunwind support

2017-04-03 Thread Rob Clark
On Mon, Apr 3, 2017 at 4:57 PM, Rob Clark  wrote:
> On Mon, Apr 3, 2017 at 4:06 PM, Thomas Hellstrom  
> wrote:
>> On 04/03/2017 07:33 PM, Thomas Hellstrom wrote:
>>> On 04/03/2017 07:13 PM, Rob Clark wrote:
 On Mon, Apr 3, 2017 at 12:56 PM, Thomas Hellstrom  
 wrote:
> Hi, Rob,
>
> On 03/24/2017 10:21 PM, Rob Clark wrote:
>> It's kinda sad that (a) we don't have debug_backtrace support on !X86
>> and that (b) we re-invent our own crude backtrace support in the first
>> place.  If available, use libunwind instead.  The backtrace format is
>> based on what xserver and weston use, since it is nice not to have to
>> figure out a different format.
>>
>> Signed-off-by: Rob Clark 
> Did you consider glibc "backtrace()", I think it's also available on 
> ARM...
 I had not.. although xserver and weston are already using libunwind.
 I'm not sure about portability of libunwind to other libc
 implementations (but I guess it is at least not worse than using a
 glibc specific API).

 I suppose we could always add a fallback to backtrace().

>> Hmm. This commit (bisected) appears to break svga/vmwgfx in DEBUG mode:
>>
>> *** Error in `glxgears': malloc(): memory corruption: 0x025c09c0 ***
>> Aborted (core dumped)
>>
>> The svga linux winsys makes extensive use of the backtrace functionality
>> using u_debug_flush.c
>
> ok, I can reproduce.. hopefully patch following shortly..
>

So, I found one other minor issue, but the big problem is
debug_stack_frame size difference, because u_debug_flush.c doesn't
#include config.h..  adding:

#ifdef HAVE_CONFIG_H
#include 
#endif

in u_debug_stack.h solves that..  although I think usually folks
prefer to have that in .c files.  (Not sure if I remember the reason.)

So we should either add that in u_debug_stack.h or track down all the
.c files that #include it and add there.

(Or is there a reason we don't just do -include config.h globally in
the build system?)

BR,
-R
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] util/u_atomic: provide 64bit atomics where they're missing

2017-04-03 Thread Matt Turner
On Mon, Apr 3, 2017 at 1:37 PM, Mark Janes  wrote:
>
> This commit appears to intermittently provoke gpu hangs on 32-bit Intel
> systems when running
>
> piglit.shaders.glsl-max-varyings >max_varying_components
>
> Since the behavior is intermittent, I may have identified the wrong
> patch.  Please let me know if this patch seems unlikely to affect the
> test.

I think it's not this patch, but I haven't figured out which it is yet.

Applying this patch doesn't change i965_dri.so (md5sum is the same).
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 6/9] gallium: decrease the size of pipe_box - 24 -> 16 bytes

2017-04-03 Thread Roland Scheidegger
Am 03.04.2017 um 17:11 schrieb Alex Deucher:
> On Sun, Apr 2, 2017 at 2:00 PM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> Also:
>>
>> pipe_transfer: 48 -> 40 bytes.
>> pipe_blit_info = 176 -> 160 bytes.
>> ---
>>  src/gallium/include/pipe/p_state.h | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/gallium/include/pipe/p_state.h 
>> b/src/gallium/include/pipe/p_state.h
>> index 392bb8b..6a147ef 100644
>> --- a/src/gallium/include/pipe/p_state.h
>> +++ b/src/gallium/include/pipe/p_state.h
>> @@ -472,25 +472,25 @@ struct pipe_image_view
>> } u;
>>  };
>>
>>
>>  /**
>>   * Subregion of 1D/2D/3D image resource.
>>   */
>>  struct pipe_box
>>  {
>> int x;
>> -   int y;
>> -   int z;
>> +   int16_t y;
>> +   int16_t z;
>> int width;
>> -   int height;
>> -   int depth;
>> +   int16_t height;
>> +   int16_t depth;
>>  };
> 
> Not specific to this patch per se, but will this cause any issues in
> the future as max surface sizes supported by hw increase?  I feel like
> we'll need to change this back at some point.
I don't think this can increase easily. With 8 subpixel bits for
rasterization (which is the standard nowadays) the absolute maximum you
could theoretically support would be 64k when using float math, and there's
probably reasons why you don't want to go quite to the aboluste limit.
That said, with int16 the largest pot size possible would be 16k, which
indeed is what everybody already supports, not leaving even a factor 2
increase... So I'm not sure it's really a good idea.


> Alex
> 
>>
>>
>>  /**
>>   * A memory object/resource such as a vertex buffer or texture.
>>   */
>>  struct pipe_resource
>>  {
>> struct pipe_reference reference;
>> struct pipe_screen *screen; /**< screen that this texture belongs to */
>> --
>> 2.7.4
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache

2017-04-03 Thread Chad Versace
On Mon 03 Apr 2017, Jason Ekstrand wrote:
> On Mon, Apr 3, 2017 at 12:31 PM, Chad Versace 
> wrote:
> 
> > On Fri 31 Mar 2017, Chad Versace wrote:
> > > On Wed 15 Mar 2017, Jason Ekstrand wrote:
> > > > This cache allows us to easily ensure that we have a unique anv_bo for
> > > > each gem handle.  We'll need this in order to support multiple-import
> > of
> > > > memory objects and semaphores.
> > > >
> > > > v2 (Jason Ekstrand):
> > > >  - Reject BO imports if the size doesn't match the prime fd size as
> > > >reported by lseek().
> > > >
> > > > v3 (Jason Ekstrand):
> > > >  - Fix reference counting around cache_release (Chris Willson)
> > > >  - Move the mutex_unlock() later in cache_release
> > > > ---
> > > >  src/intel/vulkan/anv_allocator.c | 261 ++
> > +
> > > >  src/intel/vulkan/anv_private.h   |  26 
> > > >  2 files changed, 287 insertions(+)
> > >
> > >
> > > > +static uint32_t
> > > > +hash_uint32_t(const void *key)
> > > > +{
> > > > +   return (uint32_t)(uintptr_t)key;
> > > > +}
> > >
> > > This hash function does not appear hashy.
> > >
> > > If I correctly understand the details of Mesa's struct hash_table,
> > > choosing the identify function for the hash function causes unwanted
> > > clustering when inserting consecutive gem handles.  Since the kernel does
> > > allocate gem handles consecutively, the problem is real.
> > >
> > > For proof, consider the following:
> > >
> > >- Suppose a long-running process (maybe the compositor) has thrashed
> > on the
> > >  hash table long enough that its bucket count
> > >  is ht->size = hash_sizes[7].size = 283. Suppose a spike of
> > >  compositor activity raises the hash table's density to about 0.5.
> > >  And suppose the hash table buckets are filled with the consecutive
> > gem
> > >  handles
> > >
> > >  {0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 }
> > >
> > >  The exact density is (128 - 4 + 1) / 283 = 0.4417.
> > >
> > >- Next, some other in-process activity (maybe OpenGL) generated
> > >  a lot of gem handles after Vulkan's most recently imported
> > >  gem handle, 128.
> >
> > This point in the example---the reason why the gem handles in the
> > anv_bo_cache skip from 128 to 287---is bogus in Vulkan. The problem *is*
> > real for multiple in-process OpenGL contexts derived from the same
> > EGLDisplay, using EGL_EXT_image_dma_buf_import, because each context
> > shares the same intel_screen, and therefore the same drm device fd. But
> > in Vulkan, each VkDevice opens its own drm device id. So, bogus example.
> >
> > BUT, that leads to a new question...
> >
> > Since each VkDevice has a unique drm device fd, and since the kernel
> > allocates gem handles consecutively on the fd, and since struct
> > hash_table only grows and never shrinks, and since patch 8/18 inserts
> > every VkDeviceMemory into the cache... I believe no collisions are
> > possible in anv_bo_cache.
> >
> 
> Does this fall under the category of unbreakable kernel ABI or is it just a
> side-effect of the implementation?  If not, then I'm reluctant to trust it.

I'm not certain. But krh, sitting beside me, says "It's ABI at this point.
The kernel uses a 'idr' structure which guarantees that behavior".

> > If there are no collisions, then the hash table is only adding overhead,
> 
> Sure, but a no-collision hash table is pretty cheap...
> 
> > and we should use a direct-addressing lookup table. The bo cache should
> > look like this:
> >
> > struct anv_bo_cache {
> >/* The array indices are gem handles. Null entries are legal. */
> >struct anv_bo **bos;
> >
> >/* Length of the array. Because the array can have holes, this
> > * is *not* the number of gem handles in the array.
> > */
> >size_t len;
> >
> >pthread_mutex_t mutex;
> > };
> >
> > struct anv_bo *
> > anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t gem_handle)
> > {
> >struct anv_bo *bo = NULL;
> >
> >pthread_mutex_lock(&cache->mutex);
> >
> >if (gem_handle < cache->len)
> >   bo = cache->entries[gem_handle];
> >
> >pthread_mutex_unlock(&cache->mutex);
> >
> >return bo;
> >
> > }
> >
> > BUT, that leads to yet another question...
> >
> > Why is patch 8/18 inserting every VkDeviceMemory into the cache? If
> > I understand things correctly, we only *need* to insert a VkDeviceMemory
> > into the cache if, in vkAllocateMemory, either (1)
> > VkExportMemoryAllocateInfoKHX::handleTypes != 0 or (2)
> > VkMemoryAllocateInfo's pNext chain contains an import structure.
> >
> 
> Because I'm lazy.  In order to start using the bo cache,
> anv_device_memory::bo needs to be a pointer (well, it's makes the BO cache
> API simpler and more efficient if it's a pointer).  This would mean that we
> would have to allocate an additional chunk of memory or go through some
> other hoops in order t

Re: [Mesa-dev] [RFC PATCH] egl/android: Dequeue buffers inside EGL calls

2017-04-03 Thread Mauro Rossi
Hi Tomasz,

2017-04-03 10:00 GMT+02:00 Tomasz Figa :

> Hi Mauro,
>
> On Mon, Apr 3, 2017 at 2:48 AM, Mauro Rossi  wrote:
> >
> >
> > 2017-03-31 13:05 GMT+02:00 Tapani Pälli :
> >>
> >>
> >>
> >> On 03/31/2017 10:12 AM, Tapani Pälli wrote:
> >>>
> >>>
> >>>
> >>> On 03/31/2017 09:06 AM, Tapani Pälli wrote:
> 
> 
> 
>  On 03/31/2017 08:24 AM, Rob Clark wrote:
> >
> > On Fri, Mar 31, 2017 at 12:22 AM, Tapani Pälli
> >  wrote:
> >>
> >>
> >>
> >> On 03/30/2017 05:57 PM, Emil Velikov wrote:
> >>>
> >>>
> >>> On 30 March 2017 at 15:30, Tomasz Figa  wrote:
> 
> 
>  On Thu, Mar 30, 2017 at 11:17 PM, Emil Velikov
>  
>  wrote:
> >
> >
> >
> > On 30 March 2017 at 11:55, Tomasz Figa 
> wrote:
> >>
> >>
> >> Android buffer queues can be abandoned, which results in failing
> >> to
> >> dequeue next buffer. Currently this would fail somewhere deep
> >> within
> >> the DRI stack calling loader's getBuffers*(), without any error
> >> reporting to the client app. However Android framework code
> >> relies on
> >> proper signaling of this event, so we move buffer dequeue to
> >> createWindowSurface() and swapBuffers() call, which can generate
> >> proper
> >> EGL errors. To keep the performance benefits of delayed buffer
> >> handling,
> >> if any, fence wait and DRI image creation is kept delayed until
> >> getBuffers*() is called by the DRI driver.
> >>
> > Thank you Tomasz.
> >
> > I'm fairly confident that this should resolve the crash [in
> > swap_buffers] that Mauro was seeing.
> > Mauro can you give it a test ?
> 
> 
> 
>  Ah, I actually noticed a problem with existing code, supposedly
>  fixed
>  by [1], but I'm afraid it's still wrong.
> 
>  Current swap_buffers calls get_back_bo(), but doesn't call
>  update_buffers(), which is the function that should be called
> before
>  to actually dequeue a buffer from Android's buffer queue. Given
>  that,
>  get_back_bo() would simply fail with !dri2_surf->buffer, because
> no
>  buffer was dequeued.
> 
> >>> Right - I was wondering why we don't hit that on EGL/GBM or
> >>> EGL/Wayland.
> >>> From a quick look - may be because EGL/Android drops the dpy mutex
> in
> >>> droid_window_enqueue_buffer().
> >>>
>  My patch removes update_buffers() and changes the buffer
>  management so
>  that there is always a buffer dequeued, starting from surface
>  creation, unless there was an error somewhere.
> 
> >>> Of the top of your head - is there something stopping us from using
> >>> the same method on $other platforms?
> >>>
>  [1]
> 
>  https://cgit.freedesktop.org/mesa/mesa/commit/src/egl/
> drivers/dri2/platform_android.c?id=4d4558411db166d2d66f8cec9cb581
> 149dbe1597
> 
> 
> 
> >
> >
> > Not that huge of an expert on the Android specifics, so just a
> > humble
> > request:
> > Can we seek the code resuffle (droid_{alloc,free}_local_buffer,
> >>>
> >>>
> >>> Oops silly typo - s/seek/split/.
> >>>
> > other?) separate from the functionality changes ?
> 
> 
> 
>  Sure. Thanks for suggestion.
> 
> >>> Please give it a day or two for others to comment.
> >>
> >>
> >>
> >> I'm trying to debug why this causes our homescreen (wallpaper) to be
> >> black.
> >> Otherwise I haven't seen any issues with these changes.
> >>
> >
> > wallpaper seems to be a special sorta hell..  I wonder if there is
> > somehow some sort of interaction with what I fixed / worked-around in
> > a5e733c6b52e93de3000647d075f5ca2f55fcb71 ??
> >
> > Maybe at least try commenting out the temp-pbuffer thing to get max
> > texture size, and see if that "fixes" things
> 
> 
>  Can you give more details, I still live in la la land and don't know
>  about 'temp-pbuffer thing'?
> 
> >>>
> >>> aa I did not recall the problem, you mean the 'dummy pbuffer' in
> >>> SurfaceFlinger .. yes I will check if this is related.
> >>>
> >>
> >> If I take away that dummy pbuffer usage (which is useless anyway),
> couple
> >> of errors disappear from the log. They are:
> >>
> >> SurfaceFlinger: releasePendingBuffer failed: Unknown error -1 (1)
> >> SurfaceFlinger: releasePendingBuffer failed: Unknown error -1 (1)
> >>
> >> but otherwise the desktop still stays black, live wallpapers seem to
> work
> >> so there is something special about this default wallpaper. Will
> continue
> >> digging ..
> >>
> >> // Tapani
> >
> >
>

[Mesa-dev] [PATCH 3/3] intel/aubinator: Stop searching after a custom handler is found

2017-04-03 Thread Jordan Justen
Signed-off-by: Jordan Justen 
---
 src/intel/tools/aubinator.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index a64bce7a536..0d33c392384 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -724,8 +724,10 @@ parse_commands(struct gen_spec *spec, uint32_t *cmds, int 
size, int engine)
  decode_group(inst, p, 0);
 
  for (i = 0; i < ARRAY_LENGTH(custom_handlers); i++) {
-if (gen_group_get_opcode(inst) == custom_handlers[i].opcode)
+if (gen_group_get_opcode(inst) == custom_handlers[i].opcode) {
custom_handlers[i].handle(spec, p);
+   break;
+}
  }
   }
 
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/3] intel/gen_decoder: Fix length for Media State/Object commands

2017-04-03 Thread Jordan Justen
From BDW PRM, Volume 6: Command Stream Programming, 'Render Command
Header Format'.

Signed-off-by: Jordan Justen 
---
 src/intel/common/gen_decoder.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 1ae78c88e3f..1244f4c4480 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -697,8 +697,16 @@ gen_group_get_length(struct gen_group *group, const 
uint32_t *p)
  return field(h, 0, 7) + 2;
   case 1:
  return 1;
-  case 2:
- return 2;
+  case 2: {
+ uint32_t opcode = field(h, 24, 26);
+ assert(opcode < 3 /* 3 and above currently reserved */);
+ if (opcode == 0)
+return field(h, 0, 7) + 2;
+ else if (opcode < 3)
+return field(h, 0, 15) + 2;
+ else
+return 1; /* FIXME: if more opcodes are added */
+  }
   case 3:
  return field(h, 0, 7) + 2;
   }
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Fix blob memory leak

2017-04-03 Thread Timothy Arceri

Pushed, Thanks!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/3] intel/gen_decoder: return -1 for unknown command formats

2017-04-03 Thread Jordan Justen
Decoding with aubinator encountered a command of 0x. With the
previous code, it caused aubinator to jump 255 + 2 dwords to start
decoding again.

Instead we can attempt to detect the known instruction formats. If the
format is not recognized, then we can advance just 1 dword.

Signed-off-by: Jordan Justen 
---
 src/intel/common/gen_decoder.c| 21 +++--
 src/intel/tools/aubinator.c   |  4 ++--
 src/mesa/drivers/dri/i965/intel_batchbuffer.c |  2 ++
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c
index 1244f4c4480..d5ba3e6265e 100644
--- a/src/intel/common/gen_decoder.c
+++ b/src/intel/common/gen_decoder.c
@@ -692,28 +692,37 @@ gen_group_get_length(struct gen_group *group, const 
uint32_t *p)
 
case 3: /* Render */ {
   uint32_t subtype = field(h, 27, 28);
+  uint32_t opcode = field(h, 24, 26);
   switch (subtype) {
   case 0:
- return field(h, 0, 7) + 2;
+ if (opcode < 2)
+return field(h, 0, 7) + 2;
+ else
+return -1;
   case 1:
- return 1;
+ if (opcode < 2)
+return 1;
+ else
+return -1;
   case 2: {
- uint32_t opcode = field(h, 24, 26);
- assert(opcode < 3 /* 3 and above currently reserved */);
  if (opcode == 0)
 return field(h, 0, 7) + 2;
  else if (opcode < 3)
 return field(h, 0, 15) + 2;
  else
-return 1; /* FIXME: if more opcodes are added */
+return -1;
   }
   case 3:
- return field(h, 0, 7) + 2;
+ if (opcode < 4)
+return field(h, 0, 7) + 2;
+ else
+return -1;
   }
}
}
 
unreachable("bad opcode");
+   return -1;
 }
 
 void
diff --git a/src/intel/tools/aubinator.c b/src/intel/tools/aubinator.c
index cae578babac..a64bce7a536 100644
--- a/src/intel/tools/aubinator.c
+++ b/src/intel/tools/aubinator.c
@@ -687,12 +687,12 @@ parse_commands(struct gen_spec *spec, uint32_t *cmds, int 
size, int engine)
 
for (p = cmds; p < end; p += length) {
   inst = gen_spec_find_instruction(spec, p);
+  length = gen_group_get_length(inst, p);
   if (inst == NULL) {
  fprintf(outfile, "unknown instruction %08x\n", p[0]);
- length = (p[0] & 0xff) + 2;
+ length = MIN2(1, length);
  continue;
   }
-  length = gen_group_get_length(inst, p);
 
   const char *color, *reset_color = NORMAL;
   uint64_t offset;
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 54bab9efb02..d16b4622456 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -318,6 +318,8 @@ do_batch_dump(struct brw_context *brw)
   }
 
   length = gen_group_get_length(inst, p);
+  assert(length > 0);
+  length = MAX2(length, 1);
}
 
if (ret == 0) {
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] intel/vec4: Add some fall through comments

2017-04-03 Thread Jason Ekstrand
---
 src/intel/compiler/brw_vec4_nir.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/intel/compiler/brw_vec4_nir.cpp 
b/src/intel/compiler/brw_vec4_nir.cpp
index 2384265..613c695 100644
--- a/src/intel/compiler/brw_vec4_nir.cpp
+++ b/src/intel/compiler/brw_vec4_nir.cpp
@@ -1310,6 +1310,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
 
case nir_op_iadd:
   assert(nir_dest_bit_size(instr->dest.dest) < 64);
+  /* fall through */
case nir_op_fadd:
   inst = emit(ADD(dst, op[0], op[1]));
   inst->saturate = instr->dest.saturate;
@@ -1539,6 +1540,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
 
case nir_op_imin:
case nir_op_umin:
+  /* fall through */
   assert(nir_dest_bit_size(instr->dest.dest) < 64);
case nir_op_fmin:
   inst = emit_minmax(BRW_CONDITIONAL_L, dst, op[0], op[1]);
@@ -1548,6 +1550,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
case nir_op_imax:
case nir_op_umax:
   assert(nir_dest_bit_size(instr->dest.dest) < 64);
+  /* fall through */
case nir_op_fmax:
   inst = emit_minmax(BRW_CONDITIONAL_GE, dst, op[0], op[1]);
   inst->saturate = instr->dest.saturate;
@@ -2054,6 +2057,7 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
case nir_op_iabs:
case nir_op_ineg:
   assert(nir_dest_bit_size(instr->dest.dest) < 64);
+  /* fall through */
case nir_op_fabs:
case nir_op_fneg:
case nir_op_fsat:
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 3:45 PM, Chad Versace 
wrote:

> On Mon 03 Apr 2017, Jason Ekstrand wrote:
> > On Mon, Apr 3, 2017 at 12:31 PM, Chad Versace 
> > wrote:
> >
> > > On Fri 31 Mar 2017, Chad Versace wrote:
> > > > On Wed 15 Mar 2017, Jason Ekstrand wrote:
> > > > > This cache allows us to easily ensure that we have a unique anv_bo
> for
> > > > > each gem handle.  We'll need this in order to support
> multiple-import
> > > of
> > > > > memory objects and semaphores.
> > > > >
> > > > > v2 (Jason Ekstrand):
> > > > >  - Reject BO imports if the size doesn't match the prime fd size as
> > > > >reported by lseek().
> > > > >
> > > > > v3 (Jason Ekstrand):
> > > > >  - Fix reference counting around cache_release (Chris Willson)
> > > > >  - Move the mutex_unlock() later in cache_release
> > > > > ---
> > > > >  src/intel/vulkan/anv_allocator.c | 261
> ++
> > > +
> > > > >  src/intel/vulkan/anv_private.h   |  26 
> > > > >  2 files changed, 287 insertions(+)
> > > >
> > > >
> > > > > +static uint32_t
> > > > > +hash_uint32_t(const void *key)
> > > > > +{
> > > > > +   return (uint32_t)(uintptr_t)key;
> > > > > +}
> > > >
> > > > This hash function does not appear hashy.
> > > >
> > > > If I correctly understand the details of Mesa's struct hash_table,
> > > > choosing the identify function for the hash function causes unwanted
> > > > clustering when inserting consecutive gem handles.  Since the kernel
> does
> > > > allocate gem handles consecutively, the problem is real.
> > > >
> > > > For proof, consider the following:
> > > >
> > > >- Suppose a long-running process (maybe the compositor) has
> thrashed
> > > on the
> > > >  hash table long enough that its bucket count
> > > >  is ht->size = hash_sizes[7].size = 283. Suppose a spike of
> > > >  compositor activity raises the hash table's density to about
> 0.5.
> > > >  And suppose the hash table buckets are filled with the
> consecutive
> > > gem
> > > >  handles
> > > >
> > > >  {0, 0, 0, 0, 4, 5, 6, 7, 8, 9, ..., 127, 128, 0, 0, 0, ..., 0 }
> > > >
> > > >  The exact density is (128 - 4 + 1) / 283 = 0.4417.
> > > >
> > > >- Next, some other in-process activity (maybe OpenGL) generated
> > > >  a lot of gem handles after Vulkan's most recently imported
> > > >  gem handle, 128.
> > >
> > > This point in the example---the reason why the gem handles in the
> > > anv_bo_cache skip from 128 to 287---is bogus in Vulkan. The problem
> *is*
> > > real for multiple in-process OpenGL contexts derived from the same
> > > EGLDisplay, using EGL_EXT_image_dma_buf_import, because each context
> > > shares the same intel_screen, and therefore the same drm device fd. But
> > > in Vulkan, each VkDevice opens its own drm device id. So, bogus
> example.
> > >
> > > BUT, that leads to a new question...
> > >
> > > Since each VkDevice has a unique drm device fd, and since the kernel
> > > allocates gem handles consecutively on the fd, and since struct
> > > hash_table only grows and never shrinks, and since patch 8/18 inserts
> > > every VkDeviceMemory into the cache... I believe no collisions are
> > > possible in anv_bo_cache.
> > >
> >
> > Does this fall under the category of unbreakable kernel ABI or is it
> just a
> > side-effect of the implementation?  If not, then I'm reluctant to trust
> it.
>
> I'm not certain. But krh, sitting beside me, says "It's ABI at this point.
> The kernel uses a 'idr' structure which guarantees that behavior".
>

If we think we can rely on it, I'm happy to make it into a flat array but
it'll basically be a simplified hash table at that point.


> > > If there are no collisions, then the hash table is only adding
> overhead,
> >
> > Sure, but a no-collision hash table is pretty cheap...
> >
> > > and we should use a direct-addressing lookup table. The bo cache should
> > > look like this:
> > >
> > > struct anv_bo_cache {
> > >/* The array indices are gem handles. Null entries are legal. */
> > >struct anv_bo **bos;
> > >
> > >/* Length of the array. Because the array can have holes, this
> > > * is *not* the number of gem handles in the array.
> > > */
> > >size_t len;
> > >
> > >pthread_mutex_t mutex;
> > > };
> > >
> > > struct anv_bo *
> > > anv_bo_cache_lookup(struct anv_bo_cache *cache, uint32_t
> gem_handle)
> > > {
> > >struct anv_bo *bo = NULL;
> > >
> > >pthread_mutex_lock(&cache->mutex);
> > >
> > >if (gem_handle < cache->len)
> > >   bo = cache->entries[gem_handle];
> > >
> > >pthread_mutex_unlock(&cache->mutex);
> > >
> > >return bo;
> > >
> > > }
> > >
> > > BUT, that leads to yet another question...
> > >
> > > Why is patch 8/18 inserting every VkDeviceMemory into the cache? If
> > > I understand things correctly, we only *need* to insert a
> VkDeviceMemory
> > > into the cache if, in vkAllocateMemory, either (1

Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache

2017-04-03 Thread Chad Versace
On Wed 15 Mar 2017, Jason Ekstrand wrote:
> This cache allows us to easily ensure that we have a unique anv_bo for
> each gem handle.  We'll need this in order to support multiple-import of
> memory objects and semaphores.
> 
> v2 (Jason Ekstrand):
>  - Reject BO imports if the size doesn't match the prime fd size as
>reported by lseek().
> 
> v3 (Jason Ekstrand):
>  - Fix reference counting around cache_release (Chris Willson)
>  - Move the mutex_unlock() later in cache_release
> ---
>  src/intel/vulkan/anv_allocator.c | 261 
> +++
>  src/intel/vulkan/anv_private.h   |  26 
>  2 files changed, 287 insertions(+)



> +VkResult
> +anv_bo_cache_import(struct anv_device *device,
> +struct anv_bo_cache *cache,
> +int fd, uint64_t size, struct anv_bo **bo_out,
> +VkAllocationCallbacks *alloc)
> +{
> +   pthread_mutex_lock(&cache->mutex);
> +
> +   /* The kernel is going to give us whole pages anyway */
> +   size = align_u64(size, 4096);
> +
> +   uint32_t gem_handle = anv_gem_fd_to_handle(device, fd);
> +   if (!gem_handle) {
> +  pthread_mutex_unlock(&cache->mutex);
> +  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> +   }
> +
> +   struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache, gem_handle);
> +   if (bo) {
> +  assert(bo->bo.size == size);

For security's sake, we must always check the size, even if the bo is in
the cache. An assertion isn't enough. A malicious client may tell the
truth the first time when sending the fd to the trusted app. The
malicious client may then dup the fd and send it to the trusted app with
a false size.

> +  __sync_fetch_and_add(&bo->refcount, 1);
> +   } else {
> +  /* For security purposes, we reject BO imports where the size does not
> +   * match exactly.  This prevents a malicious client from passing a
> +   * buffer to a trusted client, lying about the size, and telling the
> +   * trusted client to try and texture from an image that goes
> +   * out-of-bounds.  This sort of thing could lead to GPU hangs or worse
> +   * in the trusted client.  The trusted client can protect itself 
> against
> +   * this sort of attack but only if it can trust the buffer size.
> +   */

> +  off_t import_size = lseek(fd, 0, SEEK_END);
> +  if (import_size == (off_t)-1 || import_size != size) {
> + anv_gem_close(device, gem_handle);
> + pthread_mutex_unlock(&cache->mutex);
> + return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> +  }
> +
> +  struct anv_cached_bo *bo =
> + vk_alloc(alloc, size, 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> +  if (!bo) {
> + anv_gem_close(device, gem_handle);
> + pthread_mutex_unlock(&cache->mutex);
> + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> +  }
> +
> +  bo->refcount = 1;
> +
> +  anv_bo_init(&bo->bo, gem_handle, size);
> +
> +  _mesa_hash_table_insert(cache->bo_map, (void *)(uintptr_t)gem_handle, 
> bo);
> +   }
> +
> +   pthread_mutex_unlock(&cache->mutex);
> +
> +   /* From the Vulkan spec:
> +*
> +*"Importing memory from a file descriptor transfers ownership of
> +*the file descriptor from the application to the Vulkan
> +*implementation. The application must not perform any operations on
> +*the file descriptor after a successful import."
> +*
> +* If the import fails, we leave the file descriptor open.
> +*/
> +   close(fd);
> +
> +   *bo_out = &bo->bo;
> +
> +   return VK_SUCCESS;
> +}
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 07/18] anv/allocator: Add a BO cache

2017-04-03 Thread Jason Ekstrand
On Mon, Apr 3, 2017 at 4:26 PM, Chad Versace 
wrote:

> On Wed 15 Mar 2017, Jason Ekstrand wrote:
> > This cache allows us to easily ensure that we have a unique anv_bo for
> > each gem handle.  We'll need this in order to support multiple-import of
> > memory objects and semaphores.
> >
> > v2 (Jason Ekstrand):
> >  - Reject BO imports if the size doesn't match the prime fd size as
> >reported by lseek().
> >
> > v3 (Jason Ekstrand):
> >  - Fix reference counting around cache_release (Chris Willson)
> >  - Move the mutex_unlock() later in cache_release
> > ---
> >  src/intel/vulkan/anv_allocator.c | 261 ++
> +
> >  src/intel/vulkan/anv_private.h   |  26 
> >  2 files changed, 287 insertions(+)
>
>
>
> > +VkResult
> > +anv_bo_cache_import(struct anv_device *device,
> > +struct anv_bo_cache *cache,
> > +int fd, uint64_t size, struct anv_bo **bo_out,
> > +VkAllocationCallbacks *alloc)
> > +{
> > +   pthread_mutex_lock(&cache->mutex);
> > +
> > +   /* The kernel is going to give us whole pages anyway */
> > +   size = align_u64(size, 4096);
> > +
> > +   uint32_t gem_handle = anv_gem_fd_to_handle(device, fd);
> > +   if (!gem_handle) {
> > +  pthread_mutex_unlock(&cache->mutex);
> > +  return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> > +   }
> > +
> > +   struct anv_cached_bo *bo = anv_bo_cache_lookup_locked(cache,
> gem_handle);
> > +   if (bo) {
> > +  assert(bo->bo.size == size);
>
> For security's sake, we must always check the size, even if the bo is in
> the cache. An assertion isn't enough. A malicious client may tell the
> truth the first time when sending the fd to the trusted app. The
> malicious client may then dup the fd and send it to the trusted app with
> a false size.
>

Yup.  Fixed locally.

--Jason


> > +  __sync_fetch_and_add(&bo->refcount, 1);
> > +   } else {
> > +  /* For security purposes, we reject BO imports where the size
> does not
> > +   * match exactly.  This prevents a malicious client from passing a
> > +   * buffer to a trusted client, lying about the size, and telling
> the
> > +   * trusted client to try and texture from an image that goes
> > +   * out-of-bounds.  This sort of thing could lead to GPU hangs or
> worse
> > +   * in the trusted client.  The trusted client can protect itself
> against
> > +   * this sort of attack but only if it can trust the buffer size.
> > +   */
>
> > +  off_t import_size = lseek(fd, 0, SEEK_END);
> > +  if (import_size == (off_t)-1 || import_size != size) {
> > + anv_gem_close(device, gem_handle);
> > + pthread_mutex_unlock(&cache->mutex);
> > + return vk_error(VK_ERROR_INVALID_EXTERNAL_HANDLE_KHX);
> > +  }
> > +
> > +  struct anv_cached_bo *bo =
> > + vk_alloc(alloc, size, 8, VK_SYSTEM_ALLOCATION_SCOPE_OBJECT);
> > +  if (!bo) {
> > + anv_gem_close(device, gem_handle);
> > + pthread_mutex_unlock(&cache->mutex);
> > + return vk_error(VK_ERROR_OUT_OF_HOST_MEMORY);
> > +  }
> > +
> > +  bo->refcount = 1;
> > +
> > +  anv_bo_init(&bo->bo, gem_handle, size);
> > +
> > +  _mesa_hash_table_insert(cache->bo_map, (void
> *)(uintptr_t)gem_handle, bo);
> > +   }
> > +
> > +   pthread_mutex_unlock(&cache->mutex);
> > +
> > +   /* From the Vulkan spec:
> > +*
> > +*"Importing memory from a file descriptor transfers ownership of
> > +*the file descriptor from the application to the Vulkan
> > +*implementation. The application must not perform any
> operations on
> > +*the file descriptor after a successful import."
> > +*
> > +* If the import fails, we leave the file descriptor open.
> > +*/
> > +   close(fd);
> > +
> > +   *bo_out = &bo->bo;
> > +
> > +   return VK_SUCCESS;
> > +}
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] [RFC] mesa/glthread: misaligned address access

2017-04-03 Thread Timothy Arceri


On 04/04/17 05:12, Bartosz Tomczyk wrote:

Address sanitizer reports lot of misaligned access:
SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in
main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 
for type
'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment
0x631000104866: note: pointer points here
 92 88 00 00 00 00  00 00 4a 03 0c 00 93 88  00 00 00 00 00 00 02 01  0c 00 40 
8d 00 00 00 00  00 00
 ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 
in
main/marshal_generated.c:28726:12: runtime error: member access within 
misaligned address 0x6310003fc874 for type
'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment
0x6310003fc874: note: pointer points here
  01 00 00 00 7a 02 20 00  00 00 00 00 be be be be  be be be be be be be be  be 
be be be be be be be
  ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 
in
main/marshal_generated.c:28726:12: runtime error: store to misaligned address 
0x6310003fc87c for type
'GLint' (aka 'int'), which requires 8 byte alignment
0x6310003fc87c: note: pointer points here
  00 00 00 00 be be be be  be be be be be be be be  be be be be be be be be  be 
be be be be be be be

This probably isn't issue for x86 as misaligned access shoul be as fast
as aligned ones. But this could be issue for different architectures
like ARM/MIPS, any exert here ?

Any idea how to get correct aligment on different platforms ?


Thanks. I've had the same patch locally for a while, I think it makes 
sense to push this as is for now (which I've done). We can adjust for 
other platforms later if needed.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


  1   2   >