Re: [Mesa-dev] [PATCH v2 1/2] vl: add a lanczos interpolation filter v2

2016-07-25 Thread Andy Furniss

Nayan Deshmukh wrote:

Thanks for testing :)

On Monday, July 25, 2016, Andy Furniss  wrote:


Nayan Deshmukh wrote:


Hi Christian,

I have sent the new patches, they should fix all the artifacts. :)



I have briefly tried these over time and v3 1/2 + v2 2/2 still show
artifacts for me.



What are these artifacts? Can you please describe them and, if possible,
send me the videos where they occur?


Most videos will show it, though it varies with level and scaling amount.

The Pendulum vid I uploaded in the bicubic thread will show it.

https://drive.google.com/file/d/0BxP5-S1t9VEEaHZEM203RFpyNEE/view?usp=sharing

The artifacts vary: sometimes they are similar to what Christian posted,
sometimes half the image is missing. Fullscreen looks different from
unscaled with Pendulum.

Here's a screen recording going through the levels, windowed and then
fullscreen, starting at bicubic to give a "good" reference.

https://drive.google.com/file/d/0BxP5-S1t9VEEYXBFZ3dNeVJoRnc/view?usp=sharing





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/15] ddebug: use a debug context for GPU hang debugging only

2016-07-25 Thread Nicolai Hähnle

On 23.07.2016 02:14, Marek Olšák wrote:

From: Marek Olšák 

---
 src/gallium/drivers/ddebug/dd_screen.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/ddebug/dd_screen.c b/src/gallium/drivers/ddebug/dd_screen.c
index 46869ab..d4a50ac 100644
--- a/src/gallium/drivers/ddebug/dd_screen.c
+++ b/src/gallium/drivers/ddebug/dd_screen.c
@@ -116,7 +116,8 @@ dd_screen_context_create(struct pipe_screen *_screen, void *priv,
struct dd_screen *dscreen = dd_screen(_screen);
struct pipe_screen *screen = dscreen->screen;

-   flags |= PIPE_CONTEXT_DEBUG;
+   if (dscreen->mode == DD_DETECT_HANGS)
+  flags |= PIPE_CONTEXT_DEBUG;


I don't like this change. Dumping command buffers with the 
GALLIUM_DDEBUG=always option has helped me find state tracking bugs.


Nicolai



return dd_context_create(dscreen,
 screen->context_create(screen, priv, flags));




Re: [Mesa-dev] [PATCH 3/9] st/va: add conversion for yv12 to nv12 in putimage v2

2016-07-25 Thread Christian König

On 23.07.2016 at 01:51, Andy Furniss wrote:

Christian König wrote:

From: Boyuan Zhang 

For putimage call, if image format is yv12 (or IYUV with U V field swap)


This comment confuses me.
AIUI, and after checking on fourcc.org: yv12 is YVU, IYUV/I420 is YUV, and
nv12 is UVUVUV... So, compared to the normal way of writing yuv/YCbCr,
I wouldn't call IYUV the one "with U V field swap".


I have to confess I didn't understand the comment either, but I also
didn't spend too much time on it.


To get this settled, I have just gone ahead and committed the patchset :)

Please open bug reports for any remaining issues.

Cheers,
Christian.




and surface format is nv12, then we need to convert yv12 to nv12 and then copy
the converted data from image to surface. We can't use the existing logic
where surface is destroyed and re-created with yv12 format.
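
For readers following the plane-order discussion above: YV12 stores the planes as Y, V, U; IYUV/I420 stores them as Y, U, V; NV12 stores Y followed by a single interleaved UVUV... plane. The following is a minimal sketch of the chroma interleave step only. The helper name interleave_chroma_to_nv12 and its parameters are assumptions made for illustration; this is not Mesa's u_copy_yv12_img_to_nv12_surf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Mesa): interleave separate U and V
 * planes into one NV12 chroma plane laid out as U0 V0 U1 V1 ...
 * chroma_w and chroma_h are the size of ONE chroma plane in samples;
 * pitches are in bytes. */
static void
interleave_chroma_to_nv12(const uint8_t *u, size_t u_pitch,
                          const uint8_t *v, size_t v_pitch,
                          uint8_t *dst, size_t dst_pitch,
                          unsigned chroma_w, unsigned chroma_h)
{
   /* Each destination row holds two bytes per chroma sample. */
   assert(dst_pitch >= 2 * (size_t)chroma_w);

   for (unsigned y = 0; y < chroma_h; ++y) {
      for (unsigned x = 0; x < chroma_w; ++x) {
         dst[y * dst_pitch + 2 * x]     = u[y * u_pitch + x];
         dst[y * dst_pitch + 2 * x + 1] = v[y * v_pitch + x];
      }
   }
}
```

With this sketch, a YV12 source would pass its third plane as u and its second plane as v, while an IYUV source would pass its second plane as u and its third as v. The Y plane is copied unchanged in both cases.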

v2 (chk): fix some compiler warnings and commit message

Signed-off-by: Boyuan Zhang 
Signed-off-by: Christian König 
---
  src/gallium/state_trackers/va/image.c | 34 +++---
  1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/src/gallium/state_trackers/va/image.c b/src/gallium/state_trackers/va/image.c
index 1b956e3..0364556 100644
--- a/src/gallium/state_trackers/va/image.c
+++ b/src/gallium/state_trackers/va/image.c
@@ -471,7 +471,9 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, VAImageID image,

return VA_STATUS_ERROR_OPERATION_FAILED;
 }

-   if (format != surf->buffer->buffer_format) {
+   if ((format != surf->buffer->buffer_format) &&
+ ((format != PIPE_FORMAT_YV12) || (surf->buffer->buffer_format != PIPE_FORMAT_NV12)) &&
+ ((format != PIPE_FORMAT_IYUV) || (surf->buffer->buffer_format != PIPE_FORMAT_NV12))) {
struct pipe_video_buffer *tmp_buf;
struct pipe_video_buffer templat = surf->templat;

@@ -513,12 +515,30 @@ vlVaPutImage(VADriverContextP ctx, VASurfaceID surface, VAImageID image,

unsigned width, height;
if (!views[i]) continue;
vlVaVideoSurfaceSize(surf, i, &width, &height);
-  for (j = 0; j < views[i]->texture->array_size; ++j) {
- struct pipe_box dst_box = {0, 0, j, width, height, 1};
- drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
-PIPE_TRANSFER_WRITE, &dst_box,
-data[i] + pitches[i] * j,
-pitches[i] * views[i]->texture->array_size, 0);
+  if (((format == PIPE_FORMAT_YV12) || (format == PIPE_FORMAT_IYUV)) &&
+(surf->buffer->buffer_format == PIPE_FORMAT_NV12)) {
+ struct pipe_transfer *transfer = NULL;
+ uint8_t *map = NULL;
+ struct pipe_box dst_box_1 = {0, 0, 0, width, height, 1};
+ map = drv->pipe->transfer_map(drv->pipe,
+   views[i]->texture,
+   0,
+ PIPE_TRANSFER_DISCARD_RANGE,
+   &dst_box_1, &transfer);
+ if (map == NULL)
+return VA_STATUS_ERROR_OPERATION_FAILED;
+
+ u_copy_yv12_img_to_nv12_surf ((ubyte * const*)data, map, width, height,
+   pitches[i], transfer->stride, i);
+ pipe_transfer_unmap(drv->pipe, transfer);
+  } else {
+ for (j = 0; j < views[i]->texture->array_size; ++j) {
+struct pipe_box dst_box = {0, 0, j, width, height, 1};
+ drv->pipe->transfer_inline_write(drv->pipe, views[i]->texture, 0,
+ PIPE_TRANSFER_WRITE, &dst_box,
+ data[i] + pitches[i] * j,
+ pitches[i] * views[i]->texture->array_size, 0);
+ }
}
 }
 pipe_mutex_unlock(drv->mutex);
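
The three-clause condition in the first hunk of this patch can be hard to read. The sketch below restates it as "formats differ and no in-place conversion to NV12 is possible", and the two forms agree by De Morgan's laws. The enum and function names are illustrative stand-ins for the PIPE_FORMAT_* values, not Mesa code.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the PIPE_FORMAT_* values used in the patch. */
enum fmt { FMT_NV12, FMT_YV12, FMT_IYUV, FMT_OTHER, FMT_COUNT };

/* The condition as written in the patch: re-create the surface? */
static bool
recreate_as_written(enum fmt format, enum fmt surf)
{
   return (format != surf) &&
          ((format != FMT_YV12) || (surf != FMT_NV12)) &&
          ((format != FMT_IYUV) || (surf != FMT_NV12));
}

/* True when the image can be converted while copying into an NV12 surface. */
static bool
can_convert_to_nv12(enum fmt format, enum fmt surf)
{
   return surf == FMT_NV12 && (format == FMT_YV12 || format == FMT_IYUV);
}

/* Equivalent, more readable form of the same decision. */
static bool
recreate_readable(enum fmt format, enum fmt surf)
{
   return format != surf && !can_convert_to_nv12(format, surf);
}
```

Under these assumptions, a YV12 or IYUV image put into an NV12 surface no longer triggers re-creation, while any other mismatch still does.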







Re: [Mesa-dev] [PATCH 00/15] Gallium ddebug module: Pipelined GPU hang detection

2016-07-25 Thread Nicolai Hähnle

Apart from patch 5, the series is

Reviewed-by: Nicolai Hähnle 

On 23.07.2016 02:14, Marek Olšák wrote:

Hi,

This is for GPU hangs that are hard to reproduce and require interactive
playing for minutes or even hours.

The performance should be at least 50% of the performance without ddebug.
The added CPU overhead is mainly due to recording all states after every
draw call. The added GPU overhead is PS/CS partial flushes and clear_buffer
for writing a user fence. There are no cache flushes between draw calls.

The command is:
  GALLIUM_DDEBUG="pipelined 2000" [executable]

The generated hang report contains everything except the parsed IB and
the buffer list.

My strategy for rare random GPU hangs is to get several hang reports and
see if they have anything in common.

Please review.

Marek


Re: [Mesa-dev] [PATCH v2] main: memcpy larger chunks in _mesa_propagate_uniforms_to_driver_storage

2016-07-25 Thread Nicolai Hähnle

Pushed, thanks!

On 22.07.2016 13:10, Nils Wallménius wrote:

When possible, do the memcpy on larger blocks. This reduces cycles
spent in _mesa_propagate_uniforms_to_driver_storage from
1.51 % to 0.62% according to perf during the Unigine Heaven benchmark.
It did not affect the framerate of the benchmark. The system used for
testing was an i5 6600K with a Radeon R9 380.
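
The idea described above can be sketched in isolation as follows. This is a simplified stand-in written for illustration, not the Mesa function: copy_vectors and its parameters are assumed names, and the source is assumed to be tightly packed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the copy loop: copy `count` array elements of
 * `vectors` vectors each. The source is tightly packed (src_stride bytes
 * per vector); the destination uses dst_stride bytes per vector plus
 * extra_stride bytes of padding after each element. */
static void
copy_vectors(uint8_t *dst, const uint8_t *src,
             unsigned count, unsigned vectors,
             size_t src_stride, size_t dst_stride, size_t extra_stride)
{
   assert(src_stride <= dst_stride);

   if (src_stride == dst_stride) {
      if (extra_stride) {
         /* Vectors are contiguous within an element: one copy per element. */
         for (unsigned j = 0; j < count; j++) {
            memcpy(dst, src, src_stride * vectors);
            src += src_stride * vectors;
            dst += dst_stride * vectors + extra_stride;
         }
      } else {
         /* Everything is contiguous: a single copy covers all elements. */
         memcpy(dst, src, src_stride * vectors * count);
      }
      return;
   }

   /* Strides differ: fall back to one copy per vector. */
   for (unsigned j = 0; j < count; j++) {
      for (unsigned v = 0; v < vectors; v++) {
         memcpy(dst, src, src_stride);
         src += src_stride;
         dst += dst_stride;
      }
      dst += extra_stride;
   }
}
```

The fast paths only change how many memcpy calls are issued; the bytes written are the same as in the per-vector loop.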

Piglit hangs randomly on this system both with and without the patch
so I could not make a comparison.

v2: fixed whitespace

Signed-off-by: Nils Wallménius 
Reviewed-by: Nicolai Hähnle 
---
 src/mesa/main/uniform_query.cpp | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
index ab22a0e..b9b9ff2 100644
--- a/src/mesa/main/uniform_query.cpp
+++ b/src/mesa/main/uniform_query.cpp
@@ -578,14 +578,31 @@ _mesa_propagate_uniforms_to_driver_storage(struct gl_uniform_storage *uni,
 unsigned j;
 unsigned v;

-for (j = 0; j < count; j++) {
-   for (v = 0; v < vectors; v++) {
-  memcpy(dst, src, src_vector_byte_stride);
-  src += src_vector_byte_stride;
-  dst += store->vector_stride;
+if (src_vector_byte_stride == store->vector_stride) {
+   if (extra_stride) {
+  for (j = 0; j < count; j++) {
+ memcpy(dst, src, src_vector_byte_stride * vectors);
+ src += src_vector_byte_stride * vectors;
+ dst += store->vector_stride * vectors;
+
+ dst += extra_stride;
+  }
+   } else {
+  /* Unigine Heaven benchmark gets here */
+  memcpy(dst, src, src_vector_byte_stride * vectors * count);
+  src += src_vector_byte_stride * vectors * count;
+  dst += store->vector_stride * vectors * count;
}
+} else {
+   for (j = 0; j < count; j++) {
+  for (v = 0; v < vectors; v++) {
+ memcpy(dst, src, src_vector_byte_stride);
+ src += src_vector_byte_stride;
+ dst += store->vector_stride;
+  }

-   dst += extra_stride;
+  dst += extra_stride;
+   }
 }
 break;
   }




[Mesa-dev] [Bug 93551] Divinity: Original Sin Enhanced Edition(Native) crash on start

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93551

--- Comment #33 from Mikhail Korolev  ---
Created attachment 125311
  --> https://bugs.freedesktop.org/attachment.cgi?id=125311&action=edit
divos-hack.patch

(In reply to Thomas J. Moore from comment #32)
> Created attachment 125302 [details]
> Simple LD_PRELOAD shim to apply necessary patches for divos
> 
> Game works great for me with the above patches (thanks to those who figured
> this out!).  However, since they are not likely to be incorporated into
> Mesa, and patching my system Mesa just for one poorly written game is a bad
> idea, I think one of two alternate solutions needs to be provided.  My
> preference would be to patch the game binaries.  I don't really want to mess
> with that right now, though (especially since I can't easily locate where it
> does the vendor check).  The other would be to provide the patches in the
> form of an LD_PRELOAD shim.  I have attached the source code for one that
> seems to work for me.

Your code calls dlsym at every call of glGetString/glXGetProcAddressARB instead
of calling it only at startup. A fix is attached.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 93551] Divinity: Original Sin Enhanced Edition(Native) crash on start

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93551

--- Comment #34 from Mikhail Korolev  ---
(In reply to Thomas J. Moore from comment #32)
> Created attachment 125302 [details]
> Simple LD_PRELOAD shim to apply necessary patches for divos
> 
> Game works great for me with the above patches (thanks to those who figured
> this out!).  However, since they are not likely to be incorporated into
> Mesa, and patching my system Mesa just for one poorly written game is a bad
> idea, I think one of two alternate solutions needs to be provided.  My
> preference would be to patch the game binaries.  I don't really want to mess
> with that right now, though (especially since I can't easily locate where it
> does the vendor check).  The other would be to provide the patches in the
> form of an LD_PRELOAD shim.  I have attached the source code for one that
> seems to work for me.

As for vendor check location:

 [ /media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game ] $
GAME_DIR="/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition"
MESA_GL_VERSION_OVERRIDE=4.2  MESA_GLSL_VERSION_OVERRIDE=420 
LD_PRELOAD="${GAME_DIR}/workaround/divos-hack-f.so" 
LD_LIBRARY_PATH="${GAME_DIR}/game" gdb -q ${GAME_DIR}/game/EoCApp
Reading symbols from
/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game/EoCApp...(no
debugging symbols found)...done.
(gdb) b divos-hack-f.c:38
No symbol table is loaded.  Use the "file" command.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (divos-hack-f.c:38) pending.
(gdb) ru
Starting program:
/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game/EoCApp 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffe4827700 (LWP 1452)]

Thread 1 "EoCApp" hit Breakpoint 1, glGetString (name=7936) at
divos-hack-f.c:38
38  return (const GLubyte *)vendor;
(gdb) bt
#0  glGetString (name=7936) at divos-hack-f.c:38
#1  0x74624dce in api::OpenGLRenderer::OpenGLRenderer(api::IAPI*,
void*) ()
   from
/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game/libOGLBinding.so
#2  0x74623f79 in api::OpenGLAPI::CreateRenderer() () from
/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game/libOGLBinding.so
#3  0x74623b23 in api::OpenGLAPI::Init() () from
/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game/libOGLBinding.so
#4  0x744309aa in BaseApp::InitAPI() () from
/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game/libGameEngine.so
#5  0x7442f578 in BaseApp::Start(ls::InitStruct*) () from
/media/Storage/Games/GOG/Linux/DivinityOriginalSinEnhancedEdition/game/libGameEngine.so
#6  0x006d5160 in main ()
(gdb)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] Rename the DEBUG macro to MESA_DEBUG

2016-07-25 Thread Brian Paul

On 07/22/2016 11:40 AM, Jose Fonseca wrote:

(2nd try. 1st email is being held due to size.)

On 21/07/16 18:48, Vedran Miletić wrote:

LLVM and Mesa both define the DEBUG macro in incompatible ways. As a
general practice, we should avoid using such generic names when it is
possible to do so.

This patch renames all occurrences of the DEBUG macro to MESA_DEBUG,
and removes workarounds previously used to enable building Mesa with
LLVM (pop_macro() and push_macro() function calls).

Please let me know if I missed any.

Signed-off-by: Vedran Miletić 
---
  configure.ac   |  2 +-
  src/compiler/glsl/ir_validate.cpp  |  4 +-
  src/compiler/nir/nir.h |  6 +-
  src/compiler/nir/nir_metadata.c|  4 +-
  src/compiler/nir/nir_validate.c|  5 +-
  src/egl/drivers/haiku/egl_haiku.cpp|  6 +-
  src/egl/main/eglconfig.c   |  6 +-
  src/gallium/auxiliary/draw/draw_cliptest_tmp.h |  4 +-
  src/gallium/auxiliary/gallivm/lp_bld_debug.h   | 12 ++--
  src/gallium/auxiliary/gallivm/lp_bld_init.c| 16 +++---
  src/gallium/auxiliary/gallivm/lp_bld_misc.cpp  | 23 ++--
  src/gallium/auxiliary/gallivm/lp_bld_struct.c  | 16 +++---
  src/gallium/auxiliary/os/os_memory.h   |  6 +-
  src/gallium/auxiliary/os/os_misc.c |  4 +-
  src/gallium/auxiliary/os/os_misc.h |  6 +-
  .../auxiliary/pipebuffer/pb_buffer_fenced.c| 10 ++--
  src/gallium/auxiliary/pipebuffer/pb_bufmgr_debug.c |  6 +-
  src/gallium/auxiliary/tgsi/tgsi_exec.c | 16 +++---
  src/gallium/auxiliary/tgsi/tgsi_ureg.c |  8 +--
  src/gallium/auxiliary/util/u_cache.c   | 16 +++---
  src/gallium/auxiliary/util/u_cpu_detect.c  |  8 +--
  src/gallium/auxiliary/util/u_debug.c   | 18 +++---
  src/gallium/auxiliary/util/u_debug.h   | 66
+++---
  src/gallium/auxiliary/util/u_debug_flush.c |  4 +-
  src/gallium/auxiliary/util/u_debug_flush.h |  6 +-
  src/gallium/auxiliary/util/u_debug_image.c |  4 +-
  src/gallium/auxiliary/util/u_debug_image.h |  8 +--
  src/gallium/drivers/freedreno/ir3/ir3.c| 16 +++---
  src/gallium/drivers/freedreno/ir3/ir3.h| 18 +++---
  src/gallium/drivers/freedreno/ir3/ir3_print.c  |  4 +-
  src/gallium/drivers/freedreno/ir3/ir3_ra.c |  4 +-
  src/gallium/drivers/i915/i915_debug.c  |  6 +-
  src/gallium/drivers/i915/i915_debug.h  |  6 +-
  src/gallium/drivers/ilo/core/ilo_debug.h   | 17 +++---
  src/gallium/drivers/llvmpipe/lp_debug.h|  6 +-
  src/gallium/drivers/llvmpipe/lp_perf.h |  6 +-
  src/gallium/drivers/llvmpipe/lp_rast.c |  4 +-
  src/gallium/drivers/llvmpipe/lp_rast.h |  4 +-
  src/gallium/drivers/llvmpipe/lp_rast_priv.h|  6 +-
  src/gallium/drivers/llvmpipe/lp_scene.c|  4 +-
  src/gallium/drivers/llvmpipe/lp_screen.c   |  8 +--
  src/gallium/drivers/llvmpipe/lp_setup_line.c   |  4 +-
  src/gallium/drivers/llvmpipe/lp_setup_point.c  |  4 +-
  src/gallium/drivers/llvmpipe/lp_state_sampler.c|  4 +-
  src/gallium/drivers/llvmpipe/lp_test_main.c|  4 +-
  src/gallium/drivers/llvmpipe/lp_texture.c  | 24 
  src/gallium/drivers/llvmpipe/lp_texture.h  |  4 +-
  .../drivers/nouveau/codegen/nv50_ir_driver.h   |  6 +-
  .../drivers/nouveau/codegen/nv50_ir_inlines.h  |  4 +-
  src/gallium/drivers/nouveau/nouveau_screen.h   |  4 +-
  src/gallium/drivers/nouveau/nouveau_statebuf.h | 10 ++--
  src/gallium/drivers/nouveau/nv50/nv50_program.c|  6 +-
  src/gallium/drivers/nouveau/nvc0/nvc0_program.c| 14 ++---
  src/gallium/drivers/nouveau/nvc0/nve4_compute.c| 12 ++--
  src/gallium/drivers/r300/r300_cb.h |  6 +-
  src/gallium/drivers/r300/r300_context.c|  6 +-
  src/gallium/drivers/r300/r300_cs.h |  6 +-
  src/gallium/drivers/softpipe/sp_tex_sample.c   |  4 +-
  src/gallium/drivers/svga/svga_debug.h  |  8 +--
  src/gallium/drivers/svga/svga_draw.c   |  6 +-
  src/gallium/drivers/svga/svga_format.c |  6 +-
  src/gallium/drivers/svga/svga_pipe_draw.c  |  4 +-
  .../drivers/svga/svga_resource_buffer_upload.c |  4 +-
  src/gallium/drivers/svga/svga_screen.c | 18 +++---
  src/gallium/drivers/svga/svga_screen.h |  6 +-
  src/gallium/drivers/svga/svga_state.c  |  6 +-
  src/gallium/drivers/svga/svga_state_constants.c|  4 +-
  src/gallium/drivers/svga/svga_state_fs.c   | 10 ++--
  .../drivers/swr/rasterizer/jitter/JitManager.cpp   |  5 --
  .../drivers/swr/rasterizer/jitter/JitManager.h |  6 --
  src/gallium/drivers/swr/swr_shader.cpp |  4 --
  src/gallium/drivers/s

Re: [Mesa-dev] [PATCH] Rename the DEBUG macro to MESA_DEBUG

2016-07-25 Thread Christian König

On 22.07.2016 at 17:21, Emil Velikov wrote:

On 22 July 2016 at 09:42, Christian König  wrote:

On 22.07.2016 at 03:37, Rob Clark wrote:

On Thu, Jul 21, 2016 at 9:35 PM, Rob Clark  wrote:

On Thu, Jul 21, 2016 at 1:48 PM, Vedran Miletić 
wrote:

LLVM and Mesa both define the DEBUG macro in incompatible ways. As a
general practice, we should avoid using such generic names when it is
possible to do so.

This patch renames all occurrences of the DEBUG macro to MESA_DEBUG,
and removes workarounds previously used to enable building Mesa with
LLVM (pop_macro() and push_macro() function calls).

Please let me know if I missed any.

I guess this will conflict with at least some in-flight patches (at least my
pipe_mutex_assert_locked() patch, but DEBUG is common enough that it might
affect others). I'm not sure if there is a better way to deal with that
without things falling through the cracks. Maybe introduce MESA_DEBUG, which
is the same as DEBUG, first, and then remove DEBUG in a later patch. Or at
least include a sed/etc. rule in the commit message to redo the mass change
on a later baseline?

I don't mind rebasing my patch, just more worried about things falling
through the cracks with other in-progress stuff, since it seems like
the end result would be a silent fail to enable intended debug code..

btw, possibly tilting at windmills here, but afaik we don't export
DEBUG outside the mesa codebase.. so actually it should be llvm that
s/DEBUG/LLVM_DEBUG/
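
To make the macro collision and the two-step rename idea concrete, here is a small sketch. The "foreign header" is simulated inline, and the transitional alias is only one possible reading of the suggestion above, not what the patch does. The push_macro/pop_macro pragmas are supported by GCC, Clang and MSVC.

```c
/* The project's generic debug flag, as it was before the rename. */
#define DEBUG 1

/* The kind of workaround the rename makes unnecessary: park our DEBUG
 * while a foreign header defines its own, then restore ours. */
#pragma push_macro("DEBUG")
#undef DEBUG
#define DEBUG(stmt) do { stmt; } while (0) /* stand-in for a third-party header */
static int
foreign_code(void)
{
   int n = 0;
   DEBUG(n = 5); /* uses the foreign, function-like definition */
   return n;
}
#undef DEBUG
#pragma pop_macro("DEBUG")

/* Transitional alias (an assumption for illustration): introduce the
 * namespaced name first, so code that still sets DEBUG keeps working
 * until the old name is removed in a later patch. */
#if defined(DEBUG) && !defined(MESA_DEBUG)
#define MESA_DEBUG 1
#endif

static int
debug_enabled(void)
{
#ifdef MESA_DEBUG
   return 1;
#else
   return 0;
#endif
}
```

Once all code tests MESA_DEBUG, the push/pop dance around foreign headers is no longer needed, because the two names can no longer clash.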


I already had the same issue with other libraries/headers as well which
define DEBUG as something.


Out of curiosity: can you give some examples ?


Bellagio, for example. It took me a moment to realize where the build
failures in their headers were coming from.


Christian.




I clearly agree that those libraries shouldn't do that with such a common
name, but renaming the Mesa DEBUG define to something more library specific
would still be a good idea to avoid such problems in the future.

So general approach is Acked-by: Christian König 


Note that doing this will likely break things for the VMWare people
since (IIRC) on Windows/MSVC DEBUG is commonly used/set by the
compiler.

Jose can you confirm/dismiss if this will cause issues ?

-Emil





Re: [Mesa-dev] [PATCH 3/9] st/mesa: completely rewrite state atoms

2016-07-25 Thread Brian Paul

On 07/18/2016 07:11 AM, Marek Olšák wrote:

From: Marek Olšák 

The goal is to do this in st_validate_state:
while (dirty)
   atoms[u_bit_scan(&dirty)]->update(st);

That implies that atoms can't specify which flags they consume.
There is exactly one ST_NEW_* flag for each atom. (58 flags in total)

There are macros that combine multiple flags into one for easier use.

All _NEW_* flags are translated into ST_NEW_* flags in st_invalidate_state.
st/mesa doesn't keep the _NEW_* flags after that.

torcs is 2% faster between the previous patch and the end of this series.
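
A self-contained sketch of the scheme described above, for illustration only: one dirty bit per atom, a table generated from a single list, and a loop that visits set bits from lowest to highest. ATOM_LIST, bit_scan and the update functions are assumed names; in Mesa the list lives in st_atom_list.h and the bit helper is u_bit_scan.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for st_atom_list.h: one (flag, function) pair per atom. */
#define ATOM_LIST(X)                 \
   X(ATOM_BLEND,    update_blend)    \
   X(ATOM_SCISSOR,  update_scissor)  \
   X(ATOM_VIEWPORT, update_viewport)

/* Bit index of each atom, in list order. */
enum atom_index {
#define X(flag, fn) flag,
   ATOM_LIST(X)
#undef X
   NUM_ATOMS
};

struct context {
   int log[NUM_ATOMS]; /* order in which updates ran (sized for one validate call) */
   int n;
};

static void update_blend(struct context *c)    { c->log[c->n++] = ATOM_BLEND; }
static void update_scissor(struct context *c)  { c->log[c->n++] = ATOM_SCISSOR; }
static void update_viewport(struct context *c) { c->log[c->n++] = ATOM_VIEWPORT; }

typedef void (*update_fn)(struct context *);

/* Table generated from the same list, so indices and flags cannot drift apart. */
static const update_fn atoms[NUM_ATOMS] = {
#define X(flag, fn) fn,
   ATOM_LIST(X)
#undef X
};

/* Return the index of the lowest set bit and clear it. mask must be nonzero. */
static unsigned
bit_scan(uint64_t *mask)
{
   unsigned i = 0;
   assert(*mask != 0);
   while (!((*mask >> i) & 1))
      ++i;
   *mask &= *mask - 1; /* clears the lowest set bit */
   return i;
}

static void
validate_state(struct context *c, uint64_t dirty)
{
   assert((dirty >> NUM_ATOMS) == 0); /* only known atoms may be dirty */
   while (dirty)
      atoms[bit_scan(&dirty)](c);
}
```

Because the loop visits bits in ascending order, any ordering dependency between atoms has to be expressed by their position in the list.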
---
  src/mesa/state_tracker/st_atom.c   | 153 +-
  src/mesa/state_tracker/st_atom.h   | 210 +
  src/mesa/state_tracker/st_atom_array.c |   4 -
  src/mesa/state_tracker/st_atom_atomicbuf.c |  24 ---
  src/mesa/state_tracker/st_atom_blend.c |   4 -
  src/mesa/state_tracker/st_atom_clip.c  |   4 -
  src/mesa/state_tracker/st_atom_constbuf.c  |  48 --
  src/mesa/state_tracker/st_atom_depth.c |   4 -
  src/mesa/state_tracker/st_atom_framebuffer.c   |   4 -
  src/mesa/state_tracker/st_atom_image.c |  24 ---
  src/mesa/state_tracker/st_atom_list.h  |  75 +
  src/mesa/state_tracker/st_atom_msaa.c  |   8 -
  src/mesa/state_tracker/st_atom_pixeltransfer.c |   4 -
  src/mesa/state_tracker/st_atom_rasterizer.c|  16 --
  src/mesa/state_tracker/st_atom_sampler.c   |   4 -
  src/mesa/state_tracker/st_atom_scissor.c   |   8 -
  src/mesa/state_tracker/st_atom_shader.c|  24 ---
  src/mesa/state_tracker/st_atom_stipple.c   |   5 -
  src/mesa/state_tracker/st_atom_storagebuf.c|  24 ---
  src/mesa/state_tracker/st_atom_tess.c  |   4 -
  src/mesa/state_tracker/st_atom_texture.c   |  24 ---
  src/mesa/state_tracker/st_atom_viewport.c  |   4 -
  src/mesa/state_tracker/st_cb_bitmap.c  |  10 +-
  src/mesa/state_tracker/st_cb_bufferobjects.c   |  10 +-
  src/mesa/state_tracker/st_cb_compute.c |   2 +-
  src/mesa/state_tracker/st_cb_feedback.c|   2 +-
  src/mesa/state_tracker/st_cb_program.c |  38 ++---
  src/mesa/state_tracker/st_cb_texture.c |   2 +-
  src/mesa/state_tracker/st_context.c| 100 ++--
  src/mesa/state_tracker/st_context.h|  42 +
  src/mesa/state_tracker/st_draw.c   |   4 +-
  src/mesa/state_tracker/st_manager.c|   4 +-
  32 files changed, 377 insertions(+), 516 deletions(-)
  create mode 100644 src/mesa/state_tracker/st_atom_list.h

diff --git a/src/mesa/state_tracker/st_atom.c b/src/mesa/state_tracker/st_atom.c
index 9d5cc0f..5843d2a 100644
--- a/src/mesa/state_tracker/st_atom.c
+++ b/src/mesa/state_tracker/st_atom.c
@@ -37,87 +37,18 @@
  #include "st_manager.h"


-/**
- * This is used to initialize st->render_atoms[].
- */
-static const struct st_tracked_state *render_atoms[] =
-{
-   &st_update_depth_stencil_alpha,
-   &st_update_clip,
-
-   &st_update_fp,
-   &st_update_gp,
-   &st_update_tep,
-   &st_update_tcp,
-   &st_update_vp,
-
-   &st_update_rasterizer,
-   &st_update_polygon_stipple,
-   &st_update_viewport,
-   &st_update_scissor,
-   &st_update_window_rectangles,
-   &st_update_blend,
-   &st_update_vertex_texture,
-   &st_update_fragment_texture,
-   &st_update_geometry_texture,
-   &st_update_tessctrl_texture,
-   &st_update_tesseval_texture,
-   &st_update_sampler, /* depends on update_*_texture for swizzle */
-   &st_bind_vs_images,
-   &st_bind_tcs_images,
-   &st_bind_tes_images,
-   &st_bind_gs_images,
-   &st_bind_fs_images,
-   &st_update_framebuffer, /* depends on update_*_texture and bind_*_images */
-   &st_update_msaa,
-   &st_update_sample_shading,
-   &st_update_vs_constants,
-   &st_update_tcs_constants,
-   &st_update_tes_constants,
-   &st_update_gs_constants,
-   &st_update_fs_constants,
-   &st_bind_vs_ubos,
-   &st_bind_tcs_ubos,
-   &st_bind_tes_ubos,
-   &st_bind_fs_ubos,
-   &st_bind_gs_ubos,
-   &st_bind_vs_atomics,
-   &st_bind_tcs_atomics,
-   &st_bind_tes_atomics,
-   &st_bind_fs_atomics,
-   &st_bind_gs_atomics,
-   &st_bind_vs_ssbos,
-   &st_bind_tcs_ssbos,
-   &st_bind_tes_ssbos,
-   &st_bind_fs_ssbos,
-   &st_bind_gs_ssbos,
-   &st_update_pixel_transfer,
-   &st_update_tess,
-
-   /* this must be done after the vertex program update */
-   &st_update_array
-};
-
-
-/**
- * This is used to initialize st->compute_atoms[].
- */
-static const struct st_tracked_state *compute_atoms[] =
+/* The list state update functions. */
+static const struct st_tracked_state *atoms[] =
  {
-   &st_update_cp,
-   &st_update_compute_texture,
-   &st_update_sampler, /* depends on update_compute_texture for swizzle */
-   &st_update_cs_constants,
-   &st_bind_cs_ubos,
-   &st_bind_cs_atomics,
-   &st_bind_cs_ssbos,
-   &st_bind_cs_images,
+#define ST_STATE(FLAG, st_update) &st_update,
+#include "st_atom_list.h"
+#undef ST_STATE
  };


  void st_

Re: [Mesa-dev] [PATCH 3/9] st/mesa: completely rewrite state atoms

2016-07-25 Thread Rob Clark
On Mon, Jul 25, 2016 at 11:16 AM, Brian Paul  wrote:
> On 07/18/2016 07:11 AM, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> The goal is to do this in st_validate_state:
>> while (dirty)
>>atoms[u_bit_scan(&dirty)]->update(st);
>>
>> That implies that atoms can't specify which flags they consume.
>> There is exactly one ST_NEW_* flag for each atom. (58 flags in total)
>>
>> There are macros that combine multiple flags into one for easier use.
>>
>> All _NEW_* flags are translated into ST_NEW_* flags in
>> st_invalidate_state.
>> st/mesa doesn't keep the _NEW_* flags after that.
>>
>> torcs is 2% faster between the previous patch and the end of this series.
>> ---
>>   src/mesa/state_tracker/st_atom.c   | 153 +-
>>   src/mesa/state_tracker/st_atom.h   | 210
>> +
>>   src/mesa/state_tracker/st_atom_array.c |   4 -
>>   src/mesa/state_tracker/st_atom_atomicbuf.c |  24 ---
>>   src/mesa/state_tracker/st_atom_blend.c |   4 -
>>   src/mesa/state_tracker/st_atom_clip.c  |   4 -
>>   src/mesa/state_tracker/st_atom_constbuf.c  |  48 --
>>   src/mesa/state_tracker/st_atom_depth.c |   4 -
>>   src/mesa/state_tracker/st_atom_framebuffer.c   |   4 -
>>   src/mesa/state_tracker/st_atom_image.c |  24 ---
>>   src/mesa/state_tracker/st_atom_list.h  |  75 +
>>   src/mesa/state_tracker/st_atom_msaa.c  |   8 -
>>   src/mesa/state_tracker/st_atom_pixeltransfer.c |   4 -
>>   src/mesa/state_tracker/st_atom_rasterizer.c|  16 --
>>   src/mesa/state_tracker/st_atom_sampler.c   |   4 -
>>   src/mesa/state_tracker/st_atom_scissor.c   |   8 -
>>   src/mesa/state_tracker/st_atom_shader.c|  24 ---
>>   src/mesa/state_tracker/st_atom_stipple.c   |   5 -
>>   src/mesa/state_tracker/st_atom_storagebuf.c|  24 ---
>>   src/mesa/state_tracker/st_atom_tess.c  |   4 -
>>   src/mesa/state_tracker/st_atom_texture.c   |  24 ---
>>   src/mesa/state_tracker/st_atom_viewport.c  |   4 -
>>   src/mesa/state_tracker/st_cb_bitmap.c  |  10 +-
>>   src/mesa/state_tracker/st_cb_bufferobjects.c   |  10 +-
>>   src/mesa/state_tracker/st_cb_compute.c |   2 +-
>>   src/mesa/state_tracker/st_cb_feedback.c|   2 +-
>>   src/mesa/state_tracker/st_cb_program.c |  38 ++---
>>   src/mesa/state_tracker/st_cb_texture.c |   2 +-
>>   src/mesa/state_tracker/st_context.c| 100 ++--
>>   src/mesa/state_tracker/st_context.h|  42 +
>>   src/mesa/state_tracker/st_draw.c   |   4 +-
>>   src/mesa/state_tracker/st_manager.c|   4 +-
>>   32 files changed, 377 insertions(+), 516 deletions(-)
>>   create mode 100644 src/mesa/state_tracker/st_atom_list.h
>>
>> diff --git a/src/mesa/state_tracker/st_atom.c
>> b/src/mesa/state_tracker/st_atom.c
>> index 9d5cc0f..5843d2a 100644
>> --- a/src/mesa/state_tracker/st_atom.c
>> +++ b/src/mesa/state_tracker/st_atom.c
>> @@ -37,87 +37,18 @@
>>   #include "st_manager.h"
>>
>>
>> -/**
>> - * This is used to initialize st->render_atoms[].
>> - */
>> -static const struct st_tracked_state *render_atoms[] =
>> -{
>> -   &st_update_depth_stencil_alpha,
>> -   &st_update_clip,
>> -
>> -   &st_update_fp,
>> -   &st_update_gp,
>> -   &st_update_tep,
>> -   &st_update_tcp,
>> -   &st_update_vp,
>> -
>> -   &st_update_rasterizer,
>> -   &st_update_polygon_stipple,
>> -   &st_update_viewport,
>> -   &st_update_scissor,
>> -   &st_update_window_rectangles,
>> -   &st_update_blend,
>> -   &st_update_vertex_texture,
>> -   &st_update_fragment_texture,
>> -   &st_update_geometry_texture,
>> -   &st_update_tessctrl_texture,
>> -   &st_update_tesseval_texture,
>> -   &st_update_sampler, /* depends on update_*_texture for swizzle */
>> -   &st_bind_vs_images,
>> -   &st_bind_tcs_images,
>> -   &st_bind_tes_images,
>> -   &st_bind_gs_images,
>> -   &st_bind_fs_images,
>> -   &st_update_framebuffer, /* depends on update_*_texture and
>> bind_*_images */
>> -   &st_update_msaa,
>> -   &st_update_sample_shading,
>> -   &st_update_vs_constants,
>> -   &st_update_tcs_constants,
>> -   &st_update_tes_constants,
>> -   &st_update_gs_constants,
>> -   &st_update_fs_constants,
>> -   &st_bind_vs_ubos,
>> -   &st_bind_tcs_ubos,
>> -   &st_bind_tes_ubos,
>> -   &st_bind_fs_ubos,
>> -   &st_bind_gs_ubos,
>> -   &st_bind_vs_atomics,
>> -   &st_bind_tcs_atomics,
>> -   &st_bind_tes_atomics,
>> -   &st_bind_fs_atomics,
>> -   &st_bind_gs_atomics,
>> -   &st_bind_vs_ssbos,
>> -   &st_bind_tcs_ssbos,
>> -   &st_bind_tes_ssbos,
>> -   &st_bind_fs_ssbos,
>> -   &st_bind_gs_ssbos,
>> -   &st_update_pixel_transfer,
>> -   &st_update_tess,
>> -
>> -   /* this must be done after the vertex program update */
>> -   &st_update_array
>> -};
>> -
>> -
>> -/**
>> - * This is used to initialize st->compute_atoms[].
>> - */
>> -static const struct st_tracked_state *compute_atoms[] =
>> +/* The list 

[Mesa-dev] [Bug 93551] Divinity: Original Sin Enhanced Edition(Native) crash on start

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93551

--- Comment #35 from Thomas J. Moore  ---
(In reply to Mikhail Korolev from comment #33)
 > Your code calls dlsym at every call of glGetString/glXGetProcAddressARB
> instead of call it only at startup.

If so, your version of gcc is broken.  Just in case my gcc is broken as well, I
checked with gdb and the lines of code executing dlsym are executed exactly
once, while the wrappers themselves are called more than once.  The only error
I can see on reviewing my code is that I misspelled Enhanced, which is not
worth correcting.  That's not to say my code is perfect, though.

> Fix in attachments.

If you say so.  Thanks for trying, at least.  Seems more complex to me with
little benefit (the benefit being no longer executing the NULL check every
call, which probably takes a few nanoseconds).  In fact, using _init like that
makes me uncomfortable, since the time of symbol resolution is less obvious. 
If it works, it works, though.
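
For reference, the lazy pattern under discussion (resolve the real symbol on first use, then reuse the cached pointer) looks roughly like the sketch below. It wraps strlen rather than glGetString so that it can be built without GL, and wrapped_strlen and lookup_count are names invented for this illustration; a real LD_PRELOAD shim would export a function with the same name as the symbol it intercepts and be built as a shared object.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc exposes RTLD_NEXT only with this defined */
#endif
#include <dlfcn.h>
#include <stddef.h>

#ifndef RTLD_NEXT
/* Fallback in case <dlfcn.h> was already included without _GNU_SOURCE;
 * this is the value glibc uses. */
#define RTLD_NEXT ((void *)-1L)
#endif

typedef size_t (*strlen_fn)(const char *);

static strlen_fn real_strlen; /* cached pointer to the next definition */
static int lookup_count;      /* how many times dlsym() actually ran */

static strlen_fn
resolve_strlen(void)
{
   if (!real_strlen) {
      /* Runs on the first call only; later calls pay for one NULL check. */
      real_strlen = (strlen_fn)dlsym(RTLD_NEXT, "strlen");
      lookup_count++;
   }
   return real_strlen;
}

/* Wrapper standing in for an intercepted entry point. */
size_t
wrapped_strlen(const char *s)
{
   strlen_fn fn = resolve_strlen();
   return fn ? fn(s) : 0;
}
```

On older glibc versions this needs to be linked with -ldl. Resolving in a constructor instead would remove the per-call NULL check, at the cost of making the time of symbol resolution less obvious.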

(In reply to Mikhail Korolev from comment #33)
> As for vendor check location:

Thanks.  I guess what I really meant to say is that I no longer have the
patience and dedication needed to properly reverse engineer and provide patches
for the code.  Last time I did that was over 20 years ago
(http://aminet.net/package/disk/misc/cdfix).

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [PATCH] st_glsl_to_tgsi: only skip over slots of an input array that are present

2016-07-25 Thread Nicolai Hähnle
From: Nicolai Hähnle 

When an application declares varying arrays but does not actually do any
indirect indexing, some array indices may end up unused in the consuming
shader, so the number of input slots that correspond to the array ends
up less than the array_size.

Cc: mesa-sta...@lists.freedesktop.org
---
See also the shader_runner Piglit test that I sent out a moment ago.

 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp 
b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
index 7564119..38e2c4a 100644
--- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
+++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
@@ -6058,7 +6058,11 @@ st_translate_program(
   inputSemanticName[i], inputSemanticIndex[i],
   interpMode[i], 0, interpLocation[i],
   array_id, array_size);
-i += array_size - 1;
+
+GLuint base_attr = inputSlotToAttr[i];
+while (i + 1 < numInputs &&
+   inputSlotToAttr[i + 1] < base_attr + array_size)
+   ++i;
  }
  else {
 t->inputs[i] = ureg_DECL_fs_input_cyl_centroid(ureg,
-- 
2.7.4
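
The loop added by the patch can be read in isolation as follows. This is a minimal sketch under stated assumptions: `last_slot_of_array` and its parameter names are hypothetical, and only the loop body mirrors the patch. Given the slot-to-attribute map, it returns the last input slot that still belongs to the array starting at slot `i`.

```cpp
// Hypothetical helper mirroring the loop in the patch: advance i while the
// next slot's attribute still falls inside [base_attr, base_attr + array_size).
static unsigned
last_slot_of_array(const unsigned *slot_to_attr, unsigned num_inputs,
                   unsigned i, unsigned array_size)
{
   const unsigned base_attr = slot_to_attr[i];

   while (i + 1 < num_inputs &&
          slot_to_attr[i + 1] < base_attr + array_size)
      ++i;

   return i;
}
```

For example, with slots mapped to attributes {4, 5, 7, 9} and a 4-element array based at attribute 4, attribute 6 is unused, so the array only occupies slots 0 to 2. The helper returns 2, whereas the previous `i += array_size - 1` would have advanced to slot 3 and skipped the unrelated input at attribute 9.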



Re: [Mesa-dev] [PATCH v2 1/2] vl: add a lanczos interpolation filter v2

2016-07-25 Thread Nayan Deshmukh
On Mon, Jul 25, 2016 at 4:30 PM, Andy Furniss  wrote:

> Nayan Deshmukh wrote:
>
>> Thanks for testing :)
>>
>> On Monday, July 25, 2016, Andy Furniss  wrote:
>>
>>> Nayan Deshmukh wrote:
>>>
>>>> Hi Christian,
>>>>
>>>> I have sent the new patches, they should fix all the artifacts. :)
>>>>
>>>
>>> I have briefly tried these over time and v3 1/2 + v2 2/2 still show
>>> artifacts for me.
>>>
>>
>>
>> What are these artifacts? can you please tell me about these artifacts and
>> if possible also send me the videos where this are happening.
>>
>
> Most videos will show it, though it varies with level and scaling amount.
>
> The Pendulum vid I uploaded in the bicubic thread will show it.
>
>
> https://drive.google.com/file/d/0BxP5-S1t9VEEaHZEM203RFpyNEE/view?usp=sharing
>
> The artifacts vary between similar to what Christian posted, sometimes
> half the image missing. fullscreen looks different to unscaled with
> pendulum.
>
> Here's a screen recording going through the levels window then fullscreen
> starting at bicubic to give a "good" as reference.
>
>
> https://drive.google.com/file/d/0BxP5-S1t9VEEYXBFZ3dNeVJoRnc/view?usp=sharing
>

Thanks for this recording. The video plays fine on my system, so I
guess it is due to the hardware that I am using. I will see what I can do.

Regards,
Nayan


[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90264

--- Comment #66 from Diego Viola  ---
I have this issue as well, I'm on Arch Linux (x86-64).

mesa 12.0.1-2
xorg-server 1.18.4-1
linux 4.6.4-1

Using modesetting here, ThinkPad T450 (broadwell i5-5300U).



[Mesa-dev] [Bug 90264] [Regression, bisected] Tooltip corruption in Chrome

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=90264

--- Comment #67 from Diego Viola  ---
(In reply to Diego Viola from comment #66)
> I have this issue as well, I'm on Arch Linux (x86-64).
> 
> mesa 12.0.1-2
> xorg-server 1.18.4-1
> linux 4.6.4-1
> 
> Using modesetting here, ThinkPad T450 (broadwell i5-5300U).

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 5500 (rev 09)



Re: [Mesa-dev] [PATCH v4 11/11] vc4: use common screen ref counting

2016-07-25 Thread Eric Anholt
Rob Herring  writes:

> Use the common pipe_screen ref counting and fd hashing functions for
> vc4. This is necessary to only create a single pipe_screen for a
> process and avoid multiple imports of same prime fd among other things
> (probably).

Reviewed-by: Eric Anholt 




Re: [Mesa-dev] mesa clover from git fails to compile

2016-07-25 Thread Jan Vesely
On Fri, 2016-07-22 at 23:09 +0200, Pali Rohár wrote:
> Hello,
> 
> after fixing problem with mako version mesa from git still fails to
> compile. Now problematic part is clover state tracker. Error message
> is:

You are using an old compiler. Check:
https://bugs.freedesktop.org/show_bug.cgi?id=97019

Jan


> 
> libtool: compile:  g++-4.8 -DPACKAGE_NAME=\"Mesa\"
> -DPACKAGE_TARNAME=\"mesa\" -
> DPACKAGE_VERSION=\"12.1.0-devel\" "-DPACKAGE_STRING=\"Mesa 12.1.0-
> devel\"" "-
> DPACKAGE_BUGREPORT=\"https://bugs.freedesktop.org/enter_bug.cgi?produ
> ct=Mesa\"" -DPACKAGE_URL=\"\" -
> DPACKAGE=\"mesa\" -DVERSION=\"12.1.0-devel\" -DSTDC_HEADERS=1
> -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
> -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1
> -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -
> DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1
> -DLT_OBJDIR=\".libs/\" -DYYTEXT_POINTER=1 -
> DHAVE___BUILTIN_BSWAP32=1 -DHAVE___BUILTIN_BSWAP64=1
> -DHAVE___BUILTIN_CLZ=1 -DHAVE___BUILTIN_CLZLL=1 -
> DHAVE___BUILTIN_CTZ=1 -DHAVE___BUILTIN_EXPECT=1
> -DHAVE___BUILTIN_FFS=1 -DHAVE___BUILTIN_FFSLL=1 -
> DHAVE___BUILTIN_POPCOUNT=1 -DHAVE___BUILTIN_POPCOUNTLL=1
> -DHAVE___BUILTIN_UNREACHABLE=1 -
> DHAVE_FUNC_ATTRIBUTE_CONST=1 -DHAVE_FUNC_ATTRIBUTE_FLATTEN=1
> -DHAVE_FUNC_ATTRIBUTE_FORMAT=1 -
> DHAVE_FUNC_ATTRIBUTE_MALLOC=1 -DHAVE_FUNC_ATTRIBUTE_PACKED=1
> -DHAVE_FUNC_ATTRIBUTE_PURE=1 -
> DHAVE_FUNC_ATTRIBUTE_UNUSED=1 -DHAVE_FUNC_ATTRIBUTE_VISIBILITY=1 -
> DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT=1
> -DHAVE_FUNC_ATTRIBUTE_WEAK=1 -DHAVE_DLADDR=1 -
> DHAVE_PTHREAD=1 -DHAVE_LIBEXPAT=1 -I.
> -I../../../../../../src/gallium/state_trackers/clover -
> I../../../../../../include -I../../../../../../src
> -I../../../../../../src/gallium/include -
> I../../../../../../src/gallium/drivers
> -I../../../../../../src/gallium/auxiliary -
> I../../../../../../src/gallium/winsys -I../../../../src -
> I../../../../../../src/gallium/state_trackers/clover
> -DHAVE_CLOVER_ICD -D_FORTIFY_SOURCE=2 -std=c++11 
> -fvisibility=hidden -I/usr/lib/llvm-3.7/include
> -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -
> D__STDC_LIMIT_MACROS -std=c++11 -D__STDC_LIMIT_MACROS
> -D__STDC_CONSTANT_MACROS -D_GNU_SOURCE -
> DUSE_SSE41 -DNDEBUG -DTEXTURE_FLOAT_ENABLED -DUSE_X86_64_ASM
> -DHAVE_XLOCALE_H -DHAVE_SYS_SYSCTL_H -
> DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_DLOPEN -DHAVE_POSIX_MEMALIGN
> -DHAVE_LIBDRM -DGLX_USE_DRM -
> DHAVE_LIBUDEV -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING
> -DGLX_USE_TLS -DHAVE_ALIAS -
> DHAVE_MINCORE -DHAVE_ST_VDPAU -DHAVE_LLVM=0x0307
> -DMESA_LLVM_VERSION_PATCH=0 -
> DLIBCLC_INCLUDEDIR=\"/usr/include/\"
> -DLIBCLC_LIBEXECDIR=\"/usr/lib/clc/\" -
> DCLANG_RESOURCE_DIR=\"/usr/lib/llvm-3.7/lib/clang/3.7.0\" -g -O2
> -fstack-protector --param=ssp-buffer-
> size=4 -Wformat -Wformat-security -Werror=format-security -Wall -Wall
> -fno-strict-aliasing -fno-math-
> errno -fno-trapping-math -MT llvm/libclllvm_la-invocation.lo -MD -MP
> -MF llvm/.deps/libclllvm_la-
> invocation.Tpo -c
> ../../../../../../src/gallium/state_trackers/clover/llvm/invocation.c
> pp  -fPIC -DPIC 
> -o llvm/.libs/libclllvm_la-invocation.o
> ../../../../../../src/gallium/state_trackers/clover/llvm/codegen/nati
> ve.cpp: In function 
> 'std::vector {anonymous}::emit_code(llvm::Module&, const
> clover::llvm::target&, 
> llvm::TargetMachine::CodeGenFileType, std::string&)':
> ../../../../../../src/gallium/state_trackers/clover/llvm/codegen/nati
> ve.cpp:129:52: error: invalid 
> initialization of non-const reference of type
> 'clover::llvm::compat::raw_ostream_to_emit_file {aka 
> llvm::raw_svector_ostream&}' from an rvalue of type '<brace-enclosed initializer list>'
>   compat::raw_ostream_to_emit_file fos { os };
> ^
> make[5]: *** [llvm/codegen/libclllvm_la-native.lo] Error 1
> 
-- 
Jan Vesely 



[Mesa-dev] [Bug 93551] Divinity: Original Sin Enhanced Edition(Native) crash on start

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=93551

--- Comment #36 from Mikhail Korolev  ---
(In reply to Thomas J. Moore from comment #35)
> (In reply to Mikhail Korolev from comment #33)
>  > Your code calls dlsym at every call of glGetString/glXGetProcAddressARB
> > instead of call it only at startup.
> 
> If so, your version of gcc is broken.  Just in case my gcc is broken as
> well, I checked with gdb and the lines of code executing dlsym are executed
> exactly once, while the wrappers themselves are called more than once.  The
> only error I can see on reviewing my code is that I misspelled Enhanced,
> which is not worth correcting.  That's not to say my code is perfect, though.

My fault. I missed the `static` part. Sorry for the distraction.



Re: [Mesa-dev] [PATCH 3/9] st/mesa: completely rewrite state atoms

2016-07-25 Thread Marek Olšák
On Mon, Jul 25, 2016 at 5:42 PM, Rob Clark  wrote:
> On Mon, Jul 25, 2016 at 11:16 AM, Brian Paul  wrote:
>> On 07/18/2016 07:11 AM, Marek Olšák wrote:
>>>
>>> From: Marek Olšák 
>>>
>>> The goal is to do this in st_validate_state:
>>> while (dirty)
>>>atoms[u_bit_scan(&dirty)]->update(st);
>>>
>>> That implies that atoms can't specify which flags they consume.
>>> There is exactly one ST_NEW_* flag for each atom. (58 flags in total)
>>>
>>> There are macros that combine multiple flags into one for easier use.
>>>
>>> All _NEW_* flags are translated into ST_NEW_* flags in
>>> st_invalidate_state.
>>> st/mesa doesn't keep the _NEW_* flags after that.
>>>
>>> torcs is 2% faster between the previous patch and the end of this series.
>>> ---
>>>   src/mesa/state_tracker/st_atom.c   | 153 +-
>>>   src/mesa/state_tracker/st_atom.h   | 210
>>> +
>>>   src/mesa/state_tracker/st_atom_array.c |   4 -
>>>   src/mesa/state_tracker/st_atom_atomicbuf.c |  24 ---
>>>   src/mesa/state_tracker/st_atom_blend.c |   4 -
>>>   src/mesa/state_tracker/st_atom_clip.c  |   4 -
>>>   src/mesa/state_tracker/st_atom_constbuf.c  |  48 --
>>>   src/mesa/state_tracker/st_atom_depth.c |   4 -
>>>   src/mesa/state_tracker/st_atom_framebuffer.c   |   4 -
>>>   src/mesa/state_tracker/st_atom_image.c |  24 ---
>>>   src/mesa/state_tracker/st_atom_list.h  |  75 +
>>>   src/mesa/state_tracker/st_atom_msaa.c  |   8 -
>>>   src/mesa/state_tracker/st_atom_pixeltransfer.c |   4 -
>>>   src/mesa/state_tracker/st_atom_rasterizer.c|  16 --
>>>   src/mesa/state_tracker/st_atom_sampler.c   |   4 -
>>>   src/mesa/state_tracker/st_atom_scissor.c   |   8 -
>>>   src/mesa/state_tracker/st_atom_shader.c|  24 ---
>>>   src/mesa/state_tracker/st_atom_stipple.c   |   5 -
>>>   src/mesa/state_tracker/st_atom_storagebuf.c|  24 ---
>>>   src/mesa/state_tracker/st_atom_tess.c  |   4 -
>>>   src/mesa/state_tracker/st_atom_texture.c   |  24 ---
>>>   src/mesa/state_tracker/st_atom_viewport.c  |   4 -
>>>   src/mesa/state_tracker/st_cb_bitmap.c  |  10 +-
>>>   src/mesa/state_tracker/st_cb_bufferobjects.c   |  10 +-
>>>   src/mesa/state_tracker/st_cb_compute.c |   2 +-
>>>   src/mesa/state_tracker/st_cb_feedback.c|   2 +-
>>>   src/mesa/state_tracker/st_cb_program.c |  38 ++---
>>>   src/mesa/state_tracker/st_cb_texture.c |   2 +-
>>>   src/mesa/state_tracker/st_context.c| 100 ++--
>>>   src/mesa/state_tracker/st_context.h|  42 +
>>>   src/mesa/state_tracker/st_draw.c   |   4 +-
>>>   src/mesa/state_tracker/st_manager.c|   4 +-
>>>   32 files changed, 377 insertions(+), 516 deletions(-)
>>>   create mode 100644 src/mesa/state_tracker/st_atom_list.h
>>>
>>> diff --git a/src/mesa/state_tracker/st_atom.c
>>> b/src/mesa/state_tracker/st_atom.c
>>> index 9d5cc0f..5843d2a 100644
>>> --- a/src/mesa/state_tracker/st_atom.c
>>> +++ b/src/mesa/state_tracker/st_atom.c
>>> @@ -37,87 +37,18 @@
>>>   #include "st_manager.h"
>>>
>>>
>>> -/**
>>> - * This is used to initialize st->render_atoms[].
>>> - */
>>> -static const struct st_tracked_state *render_atoms[] =
>>> -{
>>> -   &st_update_depth_stencil_alpha,
>>> -   &st_update_clip,
>>> -
>>> -   &st_update_fp,
>>> -   &st_update_gp,
>>> -   &st_update_tep,
>>> -   &st_update_tcp,
>>> -   &st_update_vp,
>>> -
>>> -   &st_update_rasterizer,
>>> -   &st_update_polygon_stipple,
>>> -   &st_update_viewport,
>>> -   &st_update_scissor,
>>> -   &st_update_window_rectangles,
>>> -   &st_update_blend,
>>> -   &st_update_vertex_texture,
>>> -   &st_update_fragment_texture,
>>> -   &st_update_geometry_texture,
>>> -   &st_update_tessctrl_texture,
>>> -   &st_update_tesseval_texture,
>>> -   &st_update_sampler, /* depends on update_*_texture for swizzle */
>>> -   &st_bind_vs_images,
>>> -   &st_bind_tcs_images,
>>> -   &st_bind_tes_images,
>>> -   &st_bind_gs_images,
>>> -   &st_bind_fs_images,
>>> -   &st_update_framebuffer, /* depends on update_*_texture and
>>> bind_*_images */
>>> -   &st_update_msaa,
>>> -   &st_update_sample_shading,
>>> -   &st_update_vs_constants,
>>> -   &st_update_tcs_constants,
>>> -   &st_update_tes_constants,
>>> -   &st_update_gs_constants,
>>> -   &st_update_fs_constants,
>>> -   &st_bind_vs_ubos,
>>> -   &st_bind_tcs_ubos,
>>> -   &st_bind_tes_ubos,
>>> -   &st_bind_fs_ubos,
>>> -   &st_bind_gs_ubos,
>>> -   &st_bind_vs_atomics,
>>> -   &st_bind_tcs_atomics,
>>> -   &st_bind_tes_atomics,
>>> -   &st_bind_fs_atomics,
>>> -   &st_bind_gs_atomics,
>>> -   &st_bind_vs_ssbos,
>>> -   &st_bind_tcs_ssbos,
>>> -   &st_bind_tes_ssbos,
>>> -   &st_bind_fs_ssbos,
>>> -   &st_bind_gs_ssbos,
>>> -   &st_update_pixel_transfer,
>>> -   &st_update_tess,
>>> -
>>> -   /* this must be done after the vertex program update */
>>> -   &st_
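
The dispatch loop described in the commit message (`while (dirty) atoms[u_bit_scan(&dirty)]->update(st);`) can be sketched on its own as below. This is a simplified illustration, not the Mesa implementation: `st_sketch`, the three atoms and `bit_scan` are hypothetical stand-ins, and `bit_scan` is a portable loop rather than Mesa's `u_bit_scan`. The idea is that bit N of the dirty mask corresponds to atom N, so each dirty flag maps to exactly one update callback.

```cpp
#include <cstdint>

// Hypothetical context that records the order in which atoms were updated.
struct st_sketch {
   int order[64];
   int count;
};

typedef void (*update_fn)(st_sketch *);
struct atom { update_fn update; };

static void update_blend(st_sketch *st)    { st->order[st->count++] = 0; }
static void update_scissor(st_sketch *st)  { st->order[st->count++] = 1; }
static void update_viewport(st_sketch *st) { st->order[st->count++] = 2; }

// One atom per flag: bit i of the dirty mask selects atoms[i].
static const atom atoms[] = {
   { update_blend }, { update_scissor }, { update_viewport },
};

// Return the index of the lowest set bit and clear it. The mask must be
// non-zero when this is called.
static int bit_scan(uint64_t *mask)
{
   int i = 0;
   while (!((*mask >> i) & 1))
      ++i;
   *mask &= ~(uint64_t(1) << i);
   return i;
}

// Run the update callback of every dirty atom, lowest bit first. The caller
// must only set bits that have a matching entry in atoms[].
static void validate_state(st_sketch *st, uint64_t dirty)
{
   while (dirty)
      atoms[bit_scan(&dirty)].update(st);
}
```

With a dirty mask of 0x5, only the blend and viewport atoms run, in that order; clean atoms are never visited, which is where the loop saves work compared with walking the whole atom list.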

Re: [Mesa-dev] [PATCH 1/3] i965/blorp/gen8: Stop multiplying depth by 6 for cubes

2016-07-25 Thread Anuj Phogat
On Fri, Jul 22, 2016 at 10:39 PM, Jason Ekstrand  wrote:
> intel_mipmap_tree::logical_depth0 is now in 2-D slices so there is no need
> for us to multiply by 6 when we go to fill out a blorp surface state.
>
> Signed-off-by: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/gen8_blorp.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/gen8_blorp.c 
> b/src/mesa/drivers/dri/i965/gen8_blorp.c
> index 870b67f..ab9b747 100644
> --- a/src/mesa/drivers/dri/i965/gen8_blorp.c
> +++ b/src/mesa/drivers/dri/i965/gen8_blorp.c
> @@ -526,9 +526,6 @@ gen8_blorp_emit_surface_states(struct brw_context *brw,
>mt->msaa_layout == INTEL_MSAA_LAYOUT_CMS) ?
>   MAX2(mt->num_samples, 1) : 1;
>
> -  const bool is_cube = mt->target == GL_TEXTURE_CUBE_MAP_ARRAY ||
> -   mt->target == GL_TEXTURE_CUBE_MAP;
> -  const unsigned depth = (is_cube ? 6 : 1) * mt->logical_depth0;
>const unsigned layer = mt->target != GL_TEXTURE_3D ?
>  surface->layer / layer_divider : 0;
>
> @@ -537,7 +534,7 @@ gen8_blorp_emit_surface_states(struct brw_context *brw,
>   .base_level = surface->level,
>   .levels = mt->last_level - surface->level + 1,
>   .base_array_layer = layer,
> - .array_len = depth - layer,
> + .array_len = mt->logical_depth0 - layer,
>   .channel_select = {
>  swizzle_to_scs(GET_SWZ(surface->swizzle, 0)),
>  swizzle_to_scs(GET_SWZ(surface->swizzle, 1)),
> --
> 2.5.0.400.gff86faf
>

LGTM. Series is:
Reviewed-by: Anuj Phogat 


Re: [Mesa-dev] [Mesa-stable] [PATCH 1/9] anv/clear: Handle ClearImage on 3-D images

2016-07-25 Thread Anuj Phogat
On Thu, Jul 21, 2016 at 9:21 PM, Jason Ekstrand  wrote:
> Signed-off-by: Jason Ekstrand 
> Cc: "12.0" 
> Cc: Nanley Chery 
> ---
>  src/intel/vulkan/anv_meta_clear.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_meta_clear.c 
> b/src/intel/vulkan/anv_meta_clear.c
> index 18dfae8..fe750c8 100644
> --- a/src/intel/vulkan/anv_meta_clear.c
> +++ b/src/intel/vulkan/anv_meta_clear.c
> @@ -761,9 +761,11 @@ anv_cmd_clear_image(struct anv_cmd_buffer *cmd_buffer,
>
> for (uint32_t r = 0; r < range_count; r++) {
>const VkImageSubresourceRange *range = &ranges[r];
> -
>for (uint32_t l = 0; l < anv_get_levelCount(image, range); ++l) {
> - for (uint32_t s = 0; s < anv_get_layerCount(image, range); ++s) {
> + const uint32_t layer_count = image->type == VK_IMAGE_TYPE_3D ?
> +  anv_minify(image->extent.depth, l) :
> +  anv_get_layerCount(image, range);
> + for (uint32_t s = 0; s < layer_count; ++s) {
>  struct anv_image_view iview;
>  anv_image_view_init(&iview, cmd_buffer->device,
> &(VkImageViewCreateInfo) {
> --
> 2.5.0.400.gff86faf
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-stable

This is a resend of a series that has already landed, so no review is
required.


[Mesa-dev] [Bug 96950] Another regression from bc4e0c486: vbo: Use a bitmask to track the active arrays in vbo_exec*.

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96950

--- Comment #7 from Rob Clark  ---
(In reply to Mathias Fröhlich from comment #6)
> Created attachment 125255 [details] [review]
> Fix for an other assert in immediate mode rendering with the linux turbine
> 
> Rob,
> 
> The patch fixes the problem you observed with 0ad in a different way. Can
> you check if this approach also fixes your problem?

it does not seem to regress 0ad..

fwiw, if you are curious, this should reproduce my original problem (if you
revert my earlier fix and don't apply your patch)

https://people.freedesktop.org/~robclark/0ad-cycladic-archipelago.trace.xz



Re: [Mesa-dev] [PATCH v4 09/11] vmwgfx: use common screen ref counting

2016-07-25 Thread Brian Paul

On 07/22/2016 10:22 AM, Rob Herring wrote:

Use the common pipe_screen ref counting and fd hashing functions. The
mutex can be dropped as the pipe loader protects the create_screen()
calls.

Signed-off-by: Rob Herring 
---
  src/gallium/auxiliary/target-helpers/drm_helper.h |  2 +-
  src/gallium/drivers/svga/svga_public.h|  2 +-
  src/gallium/drivers/svga/svga_screen.c|  5 ++-
  src/gallium/targets/pipe-loader/pipe_vmwgfx.c |  2 +-
  src/gallium/winsys/svga/drm/vmw_screen.c  | 54 +--
  src/gallium/winsys/svga/drm/vmw_screen.h  |  6 ---
  6 files changed, 17 insertions(+), 54 deletions(-)

diff --git a/src/gallium/auxiliary/target-helpers/drm_helper.h 
b/src/gallium/auxiliary/target-helpers/drm_helper.h
index 90820d3..a042162 100644
--- a/src/gallium/auxiliary/target-helpers/drm_helper.h
+++ b/src/gallium/auxiliary/target-helpers/drm_helper.h
@@ -181,7 +181,7 @@ pipe_vmwgfx_create_screen(int fd)
 if (!sws)
return NULL;

-   screen = svga_screen_create(sws);
+   screen = svga_screen_create(sws, fd);
 return screen ? debug_screen_wrap(screen) : NULL;
  }

diff --git a/src/gallium/drivers/svga/svga_public.h 
b/src/gallium/drivers/svga/svga_public.h
index ded2e24..5a95660 100644
--- a/src/gallium/drivers/svga/svga_public.h
+++ b/src/gallium/drivers/svga/svga_public.h
@@ -37,6 +37,6 @@ struct pipe_screen;
  struct svga_winsys_screen;

  struct pipe_screen *
-svga_screen_create(struct svga_winsys_screen *sws);
+svga_screen_create(struct svga_winsys_screen *sws, int fd);

  #endif /* SVGA_PUBLIC_H_ */
diff --git a/src/gallium/drivers/svga/svga_screen.c 
b/src/gallium/drivers/svga/svga_screen.c
index 5b4ac74..b353b92 100644
--- a/src/gallium/drivers/svga/svga_screen.c
+++ b/src/gallium/drivers/svga/svga_screen.c
@@ -26,6 +26,7 @@
  #include "util/u_format.h"
  #include "util/u_memory.h"
  #include "util/u_inlines.h"
+#include "util/u_screen.h"
  #include "util/u_string.h"
  #include "util/u_math.h"

@@ -906,7 +907,7 @@ svga_destroy_screen( struct pipe_screen *screen )
   * Create a new svga_screen object
   */
  struct pipe_screen *
-svga_screen_create(struct svga_winsys_screen *sws)
+svga_screen_create(struct svga_winsys_screen *sws, int fd)
  {
 struct svga_screen *svgascreen;
 struct pipe_screen *screen;
@@ -1081,6 +1082,8 @@ svga_screen_create(struct svga_winsys_screen *sws)

 svga_screen_cache_init(svgascreen);

+   pipe_screen_reference_init(screen, dup(fd));


dup() is not a Windows function.  I'm not sure where the prototype for 
dup() is getting pulled in on Linux either.  Maybe this file needs:


#ifdef PIPE_OS_WINDOWS
static int dup(int fd)
{
   return fd;
}
#else
#include <unistd.h>
#endif

Surprisingly, #include  in u_screen.c seems to compile with 
MSVC.


-Brian




+
 return screen;
  error2:
 FREE(svgascreen);
diff --git a/src/gallium/targets/pipe-loader/pipe_vmwgfx.c 
b/src/gallium/targets/pipe-loader/pipe_vmwgfx.c
index 71015df..d246022 100644
--- a/src/gallium/targets/pipe-loader/pipe_vmwgfx.c
+++ b/src/gallium/targets/pipe-loader/pipe_vmwgfx.c
@@ -14,7 +14,7 @@ create_screen(int fd)
 if (!sws)
return NULL;

-   screen = svga_screen_create(sws);
+   screen = svga_screen_create(sws, fd);
 if (!screen)
return NULL;

diff --git a/src/gallium/winsys/svga/drm/vmw_screen.c 
b/src/gallium/winsys/svga/drm/vmw_screen.c
index 7fcb6d2..e0fa763 100644
--- a/src/gallium/winsys/svga/drm/vmw_screen.c
+++ b/src/gallium/winsys/svga/drm/vmw_screen.c
@@ -29,25 +29,11 @@
  #include "vmw_context.h"

  #include "util/u_memory.h"
+#include "util/u_screen.h"
  #include "pipe/p_compiler.h"
-#include "util/u_hash_table.h"
  #include 
-#include 
  #include 

-static struct util_hash_table *dev_hash = NULL;
-
-static int vmw_dev_compare(void *key1, void *key2)
-{
-   return (major(*(dev_t *)key1) == major(*(dev_t *)key2) &&
-   minor(*(dev_t *)key1) == minor(*(dev_t *)key2)) ? 0 : 1;
-}
-
-static unsigned vmw_dev_hash(void *key)
-{
-   return (major(*(dev_t *) key) << 16) | minor(*(dev_t *) key);
-}
-
  /* Called from vmw_drm_create_screen(), creates and initializes the
   * vmw_winsys_screen structure, which is the main entity in this
   * module.
@@ -60,29 +46,15 @@ struct vmw_winsys_screen *
  vmw_winsys_create( int fd )
  {
 struct vmw_winsys_screen *vws;
-   struct stat stat_buf;
-
-   if (dev_hash == NULL) {
-  dev_hash = util_hash_table_create(vmw_dev_hash, vmw_dev_compare);
-  if (dev_hash == NULL)
- return NULL;
-   }
+   struct pipe_screen *pscreen = pipe_screen_reference(fd);

-   if (fstat(fd, &stat_buf))
-  return NULL;
-
-   vws = util_hash_table_get(dev_hash, &stat_buf.st_rdev);
-   if (vws) {
-  vws->open_count++;
-  return vws;
-   }
+   if (pscreen)
+  return vmw_winsys_screen(svga_winsys_screen(pscreen));

 vws = CALLOC_STRUCT(vmw_winsys_screen);
 if (!vws)
goto out_no_vws;

-   vws->device = stat_buf.st_rdev;
-   vws->open_count = 

[Mesa-dev] [Bug 96950] Another regression from bc4e0c486: vbo: Use a bitmask to track the active arrays in vbo_exec*.

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=96950

--- Comment #8 from Brian Paul  ---
(In reply to Mathias Fröhlich from comment #6)
> Created attachment 125255 [details] [review]
> Fix for an other assert in immediate mode rendering with the linux turbine
> 
> Brian,
> 
> I tried to reproduce the described problem with the linux demo and stepped
> onto another assert in vbo_exec_vtx_wrap(). This assert may be related to
> the one you mentioned. With this patch applied the linux turbine demo runs
> fine on i965.
> Can you check if the windows demo gets fixed by this patch too?

Yeah, your patch seems to fix things.  Thanks!

Tested-by: Brian Paul 

As for the patch itself, I think we want to s/boolean/bool/ and maybe #include
<stdbool.h> just to be safe.

Otherwise, Reviewed-by: Brian Paul 



[Mesa-dev] [RFC] i965: Delete brw_do_channel_expressions().

2016-07-25 Thread Matt Turner
On Haswell:

total instructions in shared programs: 6193528 -> 6197476 (0.06%)
instructions in affected programs: 756099 -> 760047 (0.52%)
helped: 1057
HURT: 2983

total cycles in shared programs: 71120700 -> 71169938 (0.07%)
cycles in affected programs: 53444478 -> 53493716 (0.09%)
helped: 12737
HURT: 20309

total loops in shared programs: 2493 -> 2493 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 2017 -> 1959 (-2.88%)
spills in affected programs: 1280 -> 1222 (-4.53%)
helped: 7
HURT: 2

total fills in shared programs: 1652 -> 1590 (-3.75%)
fills in affected programs: 1061 -> 999 (-5.84%)
helped: 7
HURT: 2

LOST:   46
GAINED: 26

On Skylake:

total instructions in shared programs: 11688933 -> 11697000 (0.07%)
instructions in affected programs: 1516104 -> 1524171 (0.53%)
helped: 1800
HURT: 5498

total cycles in shared programs: 133418304 -> 133453022 (0.03%)
cycles in affected programs: 105480208 -> 105514926 (0.03%)
helped: 20491
HURT: 36534

total loops in shared programs: 3218 -> 3218 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 3754 -> 3847 (2.48%)
spills in affected programs: 705 -> 798 (13.19%)
helped: 2
HURT: 61

total fills in shared programs: 5353 -> 5444 (1.70%)
fills in affected programs: 659 -> 750 (13.81%)
helped: 2
HURT: 61

LOST:   27
GAINED: 23

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94477
---
I'm not advocating for applying this. I think the results show
pretty clearly that we're not ready to get rid of channel expressions.

But how do we ready ourselves to finally delete it? Hopefully Tim's
work will help?
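
As a reading aid for the statistics above: the percentages appear to be plain relative changes, (after - before) / before. That is an assumption about how the report tool computes them, but it reproduces the figures quoted here. A small sketch, with `percent_change` as a hypothetical helper name:

```cpp
// Relative change in percent; positive means the count grew.
// Assumes before != 0.
static double percent_change(long long before, long long after)
{
   return 100.0 * static_cast<double>(after - before) /
          static_cast<double>(before);
}
```

For the Haswell instruction totals, `percent_change(6193528, 6197476)` is about 0.064, reported as 0.06%, and for the Haswell spills, `percent_change(2017, 1959)` is about -2.88. The absolute instruction delta is also the same (3948) in the "shared" and "affected" lines, as expected when only the affected programs change.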


 src/mesa/drivers/dri/i965/Makefile.sources |   1 -
 .../dri/i965/brw_fs_channel_expressions.cpp| 447 -
 src/mesa/drivers/dri/i965/brw_link.cpp |   3 -
 3 files changed, 451 deletions(-)
 delete mode 100644 src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index df6b5dd..b5fe171 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -124,7 +124,6 @@ i965_FILES = \
brw_ff_gs.c \
brw_ff_gs_emit.c \
brw_ff_gs.h \
-   brw_fs_channel_expressions.cpp \
brw_fs_vector_splitting.cpp \
brw_formatquery.c \
brw_gs.c \
diff --git a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
deleted file mode 100644
index 5eac8d4..0000000
--- a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
+++ /dev/null
@@ -1,447 +0,0 @@
-/*
- * Copyright © 2010 Intel Corporation
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice (including the next
- * paragraph) shall be included in all copies or substantial portions of the
- * Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
- * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
- * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
- * DEALINGS IN THE SOFTWARE.
- */
-
-/**
- * \file brw_wm_channel_expressions.cpp
- *
- * Breaks vector operations down into operations on each component.
- *
- * The 965 fragment shader receives 8 or 16 pixels at a time, so each
- * channel of a vector is laid out as 1 or 2 8-float registers.  Each
- * ALU operation operates on one of those channel registers.  As a
- * result, there is no value to the 965 fragment shader in tracking
- * "vector" expressions in the sense of GLSL fragment shaders, when
- * doing a channel at a time may help in constant folding, algebraic
- * simplification, and reducing the liveness of channel registers.
- *
- * The exception to the desire to break everything down to floats is
- * texturing.  The texture sampler returns a writemasked masked
- * 4/8-register sequence containing the texture values.  We don't want
- * to dispatch to the sampler separately for each channel we need, so
- * we do retain the vector types in that case.
- */
-
-#include "compiler/glsl/ir.h"
-#include "compiler/glsl/ir_expression_flattening.h"
-#include "compiler/glsl_types.h"
-
-class ir_channel_expressions_visitor :

[Mesa-dev] [Bug 97019] [clover] build failure in llvm/codegen/native.cpp:129:52

2016-07-25 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=97019

--- Comment #8 from Francisco Jerez  ---
(In reply to Dieter Nützel from comment #6)
> (In reply to Francisco Jerez from comment #5)
> > Seems like a GCC bug...  You may be able to work around the issue by using
> > the old-fashioned constructor call syntax with parentheses instead of braces
> > to initialize the "fos" variable.
> 
> This one worked (see attachment).
>  
> OpenGL renderer string: Gallium 0.4 on AMD TURKS (DRM 2.43.0 /
> 4.6.4-6.g684e9e1-default, LLVM 4.0.0)
> OpenGL core profile version string: 3.3 (Core Profile) Mesa 12.1.0-devel
> (git-e7b2ce5)
> 
> opencl-example / run_tests.sh
> 
> Passed
> 71 passes, 0 fails

Seems reasonable, would you mind sending the patch for review to the mailing
list? (feel free to add me to the CC list)
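
The workaround suggested in comment #5 can be illustrated without LLVM. In the build log, `compat::raw_ostream_to_emit_file` is an alias for a reference type (`llvm::raw_svector_ostream&`), and the failing line initializes it with braces. The sketch below uses `std::ostream&` as a stand-in. `ostream_ref` and `emit_through_alias` are hypothetical names, and this is not the patch from the attachment.

```cpp
#include <sstream>
#include <string>

// Stand-in for the alias from the error message: a reference type.
typedef std::ostream &ostream_ref;

static std::string emit_through_alias()
{
   std::ostringstream os;

   // Parenthesized initialization binds the reference directly to os.
   ostream_ref fos(os);

   // The brace form below is what the reported build rejected with g++-4.8:
   //    ostream_ref fos { os };

   fos << "code";
   return os.str();
}
```

Both spellings are meant to bind `fos` to `os`. The parenthesized one simply avoids the list-initialization path that the older compiler handles incorrectly, so the change should be behavior-neutral on newer compilers.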



[Mesa-dev] [PATCH] nvc0: use nve4_p2mf_push_linear() to reduce code duplication

2016-07-25 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 45 ++---
 1 file changed, 9 insertions(+), 36 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index 8abf1b5..25a5a8e 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
@@ -551,7 +551,6 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
 static bool
 nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
 {
-   struct nouveau_bo *txc = nvc0->screen->txc;
struct nouveau_pushbuf *push = nvc0->base.pushbuf;
unsigned i;
bool need_flush = false;
@@ -571,17 +570,9 @@ nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
   if (tic->id < 0) {
  tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
 
- PUSH_SPACE(push, 16);
- BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
- PUSH_DATAh(push, txc->offset + (tic->id * 32));
- PUSH_DATA (push, txc->offset + (tic->id * 32));
- BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
- PUSH_DATA (push, 32);
- PUSH_DATA (push, 1);
- BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
- PUSH_DATA (push, 0x1001);
- PUSH_DATAp(push, &tic->tic[0], 8);
-
+ nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
+   NV_VRAM_DOMAIN(&nvc0->screen->base), 32,
+   tic->tic);
  need_flush = true;
   } else
   if (res->status & NOUVEAU_BUFFER_STATUS_GPU_WRITING) {
@@ -685,8 +676,6 @@ nvc0_validate_tsc(struct nvc0_context *nvc0, int s)
 bool
 nve4_validate_tsc(struct nvc0_context *nvc0, int s)
 {
-   struct nouveau_bo *txc = nvc0->screen->txc;
-   struct nouveau_pushbuf *push = nvc0->base.pushbuf;
unsigned i;
bool need_flush = false;
 
@@ -700,17 +689,10 @@ nve4_validate_tsc(struct nvc0_context *nvc0, int s)
   if (tsc->id < 0) {
  tsc->id = nvc0_screen_tsc_alloc(nvc0->screen, tsc);
 
- PUSH_SPACE(push, 16);
- BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
- PUSH_DATAh(push, txc->offset + 65536 + (tsc->id * 32));
- PUSH_DATA (push, txc->offset + 65536 + (tsc->id * 32));
- BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
- PUSH_DATA (push, 32);
- PUSH_DATA (push, 1);
- BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
- PUSH_DATA (push, 0x1001);
- PUSH_DATAp(push, &tsc->tsc[0], 8);
-
+ nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc,
+   65536 + tsc->id * 32,
+   NV_VRAM_DOMAIN(&nvc0->screen->base),
+   32, tsc->tsc);
  need_flush = true;
   }
   nvc0->screen->tsc.lock[tsc->id / 32] |= 1 << (tsc->id % 32);
@@ -1142,7 +1124,6 @@ gm107_validate_surfaces(struct nvc0_context *nvc0,
struct nv04_resource *res = nv04_resource(view->resource);
struct nouveau_pushbuf *push = nvc0->base.pushbuf;
struct nvc0_screen *screen = nvc0->screen;
-   struct nouveau_bo *txc = nvc0->screen->txc;
struct nv50_tic_entry *tic;
 
tic = nv50_tic_entry(nvc0->images_tic[stage][slot]);
@@ -1154,16 +1135,8 @@ gm107_validate_surfaces(struct nvc0_context *nvc0,
   tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
 
   /* upload the texture view */
-  PUSH_SPACE(push, 16);
-  BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
-  PUSH_DATAh(push, txc->offset + (tic->id * 32));
-  PUSH_DATA (push, txc->offset + (tic->id * 32));
-  BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
-  PUSH_DATA (push, 32);
-  PUSH_DATA (push, 1);
-  BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
-  PUSH_DATA (push, 0x1001);
-  PUSH_DATAp(push, &tic->tic[0], 8);
+  nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
+NV_VRAM_DOMAIN(&nvc0->screen->base), 32, tic->tic);
 
   BEGIN_NVC0(push, NVC0_3D(TIC_FLUSH), 1);
   PUSH_DATA (push, 0);
-- 
2.8.0
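
The pattern in this patch, replacing several open-coded copies of the same upload sequence with one helper call, can be sketched in isolation. The following is a minimal, self-contained C mock and not the nouveau API: mock_push, mock_emit, mock_push_linear, upload_tic, upload_tsc and the MOCK_* tags are hypothetical names, and the word layout is only loosely modeled on the removed lines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a command stream; not the nouveau pushbuf. */
struct mock_push {
   uint32_t words[64];
   unsigned count;
};

static void mock_emit(struct mock_push *p, uint32_t w)
{
   assert(p->count < 64);
   p->words[p->count++] = w;
}

/* Made-up method tags, used only so two streams can be compared. */
enum { MOCK_DST_ADDRESS = 1, MOCK_LINE_LENGTH = 2, MOCK_EXEC = 3 };

/* One helper replaces every open-coded copy of the upload sequence. */
static void mock_push_linear(struct mock_push *p, uint64_t base,
                             unsigned offset, unsigned size,
                             const uint32_t *data)
{
   uint64_t addr = base + offset;
   unsigned i;

   mock_emit(p, MOCK_DST_ADDRESS);
   mock_emit(p, (uint32_t)(addr >> 32));   /* high half of the address */
   mock_emit(p, (uint32_t)addr);           /* low half of the address */
   mock_emit(p, MOCK_LINE_LENGTH);
   mock_emit(p, size);                     /* bytes per line */
   mock_emit(p, 1);                        /* line count */
   mock_emit(p, MOCK_EXEC);
   for (i = 0; i < size / 4; i++)          /* payload, one word at a time */
      mock_emit(p, data[i]);
}

/* 32-byte entries; the second table starts 65536 bytes into the buffer,
 * mirroring the offsets that appear in the patch. */
static void upload_tic(struct mock_push *p, uint64_t base, int id,
                       const uint32_t tic[8])
{
   mock_push_linear(p, base, id * 32, 32, tic);
}

static void upload_tsc(struct mock_push *p, uint64_t base, int id,
                       const uint32_t tsc[8])
{
   mock_push_linear(p, base, 65536 + id * 32, 32, tsc);
}
```

With this shape, each call site shrinks to a single line, and the address arithmetic lives in one place instead of being repeated per caller.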



Re: [Mesa-dev] [PATCH] nvc0: use nve4_p2mf_push_linear() to reduce code duplication

2016-07-25 Thread Ilia Mirkin
Sounds reasonable. Pretty sure those NV_VRAM_DOMAIN thingies should
just be txc->domain. With that fixed,

Reviewed-by: Ilia Mirkin 

On Mon, Jul 25, 2016 at 6:17 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 45 
> ++---
>  1 file changed, 9 insertions(+), 36 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
> index 8abf1b5..25a5a8e 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
> @@ -551,7 +551,6 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
>  static bool
>  nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
>  {
> -   struct nouveau_bo *txc = nvc0->screen->txc;
> struct nouveau_pushbuf *push = nvc0->base.pushbuf;
> unsigned i;
> bool need_flush = false;
> @@ -571,17 +570,9 @@ nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
>if (tic->id < 0) {
>   tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
>
> - PUSH_SPACE(push, 16);
> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
> - PUSH_DATAh(push, txc->offset + (tic->id * 32));
> - PUSH_DATA (push, txc->offset + (tic->id * 32));
> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
> - PUSH_DATA (push, 32);
> - PUSH_DATA (push, 1);
> - BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
> - PUSH_DATA (push, 0x1001);
> - PUSH_DATAp(push, &tic->tic[0], 8);
> -
> + nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
> +   NV_VRAM_DOMAIN(&nvc0->screen->base), 32,
> +   tic->tic);
>   need_flush = true;
>} else
>if (res->status & NOUVEAU_BUFFER_STATUS_GPU_WRITING) {
> @@ -685,8 +676,6 @@ nvc0_validate_tsc(struct nvc0_context *nvc0, int s)
>  bool
>  nve4_validate_tsc(struct nvc0_context *nvc0, int s)
>  {
> -   struct nouveau_bo *txc = nvc0->screen->txc;
> -   struct nouveau_pushbuf *push = nvc0->base.pushbuf;
> unsigned i;
> bool need_flush = false;
>
> @@ -700,17 +689,10 @@ nve4_validate_tsc(struct nvc0_context *nvc0, int s)
>if (tsc->id < 0) {
>   tsc->id = nvc0_screen_tsc_alloc(nvc0->screen, tsc);
>
> - PUSH_SPACE(push, 16);
> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
> - PUSH_DATAh(push, txc->offset + 65536 + (tsc->id * 32));
> - PUSH_DATA (push, txc->offset + 65536 + (tsc->id * 32));
> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
> - PUSH_DATA (push, 32);
> - PUSH_DATA (push, 1);
> - BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
> - PUSH_DATA (push, 0x1001);
> - PUSH_DATAp(push, &tsc->tsc[0], 8);
> -
> + nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc,
> +   65536 + tsc->id * 32,
> +   NV_VRAM_DOMAIN(&nvc0->screen->base),
> +   32, tsc->tsc);
>   need_flush = true;
>}
>nvc0->screen->tsc.lock[tsc->id / 32] |= 1 << (tsc->id % 32);
> @@ -1142,7 +1124,6 @@ gm107_validate_surfaces(struct nvc0_context *nvc0,
> struct nv04_resource *res = nv04_resource(view->resource);
> struct nouveau_pushbuf *push = nvc0->base.pushbuf;
> struct nvc0_screen *screen = nvc0->screen;
> -   struct nouveau_bo *txc = nvc0->screen->txc;
> struct nv50_tic_entry *tic;
>
> tic = nv50_tic_entry(nvc0->images_tic[stage][slot]);
> @@ -1154,16 +1135,8 @@ gm107_validate_surfaces(struct nvc0_context *nvc0,
>tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
>
>/* upload the texture view */
> -  PUSH_SPACE(push, 16);
> -  BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
> -  PUSH_DATAh(push, txc->offset + (tic->id * 32));
> -  PUSH_DATA (push, txc->offset + (tic->id * 32));
> -  BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
> -  PUSH_DATA (push, 32);
> -  PUSH_DATA (push, 1);
> -  BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
> -  PUSH_DATA (push, 0x1001);
> -  PUSH_DATAp(push, &tic->tic[0], 8);
> +  nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
> +NV_VRAM_DOMAIN(&nvc0->screen->base), 32, 
> tic->tic);
>
>BEGIN_NVC0(push, NVC0_3D(TIC_FLUSH), 1);
>PUSH_DATA (push, 0);
> --
> 2.8.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nvc0: use nve4_p2mf_push_linear() to reduce code duplication

2016-07-25 Thread Samuel Pitoiset



On 07/26/2016 12:20 AM, Ilia Mirkin wrote:
> Sounds reasonable. Pretty sure those NV_VRAM_DOMAIN thingies should
> just be txc->domain. With that fixed,

IIRC, this NV_VRAM_DOMAIN thing is for gk20a, maybe this is going to fix
some tests as a side effect?

> Reviewed-by: Ilia Mirkin
>
> On Mon, Jul 25, 2016 at 6:17 PM, Samuel Pitoiset
>  wrote:
>> Signed-off-by: Samuel Pitoiset
>> ---
>>  src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 45 ++---
>>  1 file changed, 9 insertions(+), 36 deletions(-)
>>
>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
>> index 8abf1b5..25a5a8e 100644
>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
>> @@ -551,7 +551,6 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
>>  static bool
>>  nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
>>  {
>> -   struct nouveau_bo *txc = nvc0->screen->txc;
>> struct nouveau_pushbuf *push = nvc0->base.pushbuf;
>> unsigned i;
>> bool need_flush = false;
>> @@ -571,17 +570,9 @@ nve4_validate_tic(struct nvc0_context *nvc0, unsigned s)
>>    if (tic->id < 0) {
>>   tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
>>
>> - PUSH_SPACE(push, 16);
>> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
>> - PUSH_DATAh(push, txc->offset + (tic->id * 32));
>> - PUSH_DATA (push, txc->offset + (tic->id * 32));
>> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
>> - PUSH_DATA (push, 32);
>> - PUSH_DATA (push, 1);
>> - BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
>> - PUSH_DATA (push, 0x1001);
>> - PUSH_DATAp(push, &tic->tic[0], 8);
>> -
>> + nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
>> +   NV_VRAM_DOMAIN(&nvc0->screen->base), 32,
>> +   tic->tic);
>>   need_flush = true;
>>    } else
>>    if (res->status & NOUVEAU_BUFFER_STATUS_GPU_WRITING) {
>> @@ -685,8 +676,6 @@ nvc0_validate_tsc(struct nvc0_context *nvc0, int s)
>>  bool
>>  nve4_validate_tsc(struct nvc0_context *nvc0, int s)
>>  {
>> -   struct nouveau_bo *txc = nvc0->screen->txc;
>> -   struct nouveau_pushbuf *push = nvc0->base.pushbuf;
>> unsigned i;
>> bool need_flush = false;
>>
>> @@ -700,17 +689,10 @@ nve4_validate_tsc(struct nvc0_context *nvc0, int s)
>>    if (tsc->id < 0) {
>>   tsc->id = nvc0_screen_tsc_alloc(nvc0->screen, tsc);
>>
>> - PUSH_SPACE(push, 16);
>> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
>> - PUSH_DATAh(push, txc->offset + 65536 + (tsc->id * 32));
>> - PUSH_DATA (push, txc->offset + 65536 + (tsc->id * 32));
>> - BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
>> - PUSH_DATA (push, 32);
>> - PUSH_DATA (push, 1);
>> - BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
>> - PUSH_DATA (push, 0x1001);
>> - PUSH_DATAp(push, &tsc->tsc[0], 8);
>> -
>> + nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc,
>> +   65536 + tsc->id * 32,
>> +   NV_VRAM_DOMAIN(&nvc0->screen->base),
>> +   32, tsc->tsc);
>>   need_flush = true;
>>    }
>>    nvc0->screen->tsc.lock[tsc->id / 32] |= 1 << (tsc->id % 32);
>> @@ -1142,7 +1124,6 @@ gm107_validate_surfaces(struct nvc0_context *nvc0,
>> struct nv04_resource *res = nv04_resource(view->resource);
>> struct nouveau_pushbuf *push = nvc0->base.pushbuf;
>> struct nvc0_screen *screen = nvc0->screen;
>> -   struct nouveau_bo *txc = nvc0->screen->txc;
>> struct nv50_tic_entry *tic;
>>
>> tic = nv50_tic_entry(nvc0->images_tic[stage][slot]);
>> @@ -1154,16 +1135,8 @@ gm107_validate_surfaces(struct nvc0_context *nvc0,
>>    tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
>>
>>    /* upload the texture view */
>> -  PUSH_SPACE(push, 16);
>> -  BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_DST_ADDRESS_HIGH), 2);
>> -  PUSH_DATAh(push, txc->offset + (tic->id * 32));
>> -  PUSH_DATA (push, txc->offset + (tic->id * 32));
>> -  BEGIN_NVC0(push, NVE4_P2MF(UPLOAD_LINE_LENGTH_IN), 2);
>> -  PUSH_DATA (push, 32);
>> -  PUSH_DATA (push, 1);
>> -  BEGIN_1IC0(push, NVE4_P2MF(UPLOAD_EXEC), 9);
>> -  PUSH_DATA (push, 0x1001);
>> -  PUSH_DATAp(push, &tic->tic[0], 8);
>> +  nve4_p2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
>> +NV_VRAM_DOMAIN(&nvc0->screen->base), 32, tic->tic);
>>
>>    BEGIN_NVC0(push, NVC0_3D(TIC_FLUSH), 1);
>>    PUSH_DATA (push, 0);
>> --
>> 2.8.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev



Re: [Mesa-dev] [PATCH] nvc0: use nve4_p2mf_push_linear() to reduce code duplication

2016-07-25 Thread Ilia Mirkin
On Mon, Jul 25, 2016 at 6:21 PM, Samuel Pitoiset
 wrote:
>
>
> On 07/26/2016 12:20 AM, Ilia Mirkin wrote:
>>
>> Sounds reasonable. Pretty sure those NV_VRAM_DOMAIN thingies should
>> just be txc->domain. With that fixed,
>
>
> IIRC, this NV_VRAM_DOMAIN thing is for gk20a, maybe this is going to fix
> some tests as a side effect?

Yes, and it's used at object creation time. Fairly sure it's unnecessary here.


Re: [Mesa-dev] [PATCH] nvc0: use nve4_p2mf_push_linear() to reduce code duplication

2016-07-25 Thread Samuel Pitoiset



On 07/26/2016 12:24 AM, Ilia Mirkin wrote:
> On Mon, Jul 25, 2016 at 6:21 PM, Samuel Pitoiset
>  wrote:
>>
>>
>> On 07/26/2016 12:20 AM, Ilia Mirkin wrote:
>>>
>>> Sounds reasonable. Pretty sure those NV_VRAM_DOMAIN thingies should
>>> just be txc->domain. With that fixed,
>>
>>
>> IIRC, this NV_VRAM_DOMAIN thing is for gk20a, maybe this is going to fix
>> some tests as a side effect?
>
> Yes, and it's used at object creation time. Fairly sure it's unnecessary here.

Right.






[Mesa-dev] [PATCH] nvc0: use nvc0_m2mf_push_linear() to reduce code duplication

2016-07-25 Thread Samuel Pitoiset
Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 16 +++-
 1 file changed, 3 insertions(+), 13 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
index 25a5a8e..40a9c93 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
@@ -474,7 +474,6 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
 {
uint32_t commands[32];
struct nouveau_pushbuf *push = nvc0->base.pushbuf;
-   struct nouveau_bo *txc = nvc0->screen->txc;
unsigned i;
unsigned n = 0;
bool need_flush = false;
@@ -495,18 +494,9 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
   if (tic->id < 0) {
  tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
 
- PUSH_SPACE(push, 17);
- BEGIN_NVC0(push, NVC0_M2MF(OFFSET_OUT_HIGH), 2);
- PUSH_DATAh(push, txc->offset + (tic->id * 32));
- PUSH_DATA (push, txc->offset + (tic->id * 32));
- BEGIN_NVC0(push, NVC0_M2MF(LINE_LENGTH_IN), 2);
- PUSH_DATA (push, 32);
- PUSH_DATA (push, 1);
- BEGIN_NVC0(push, NVC0_M2MF(EXEC), 1);
- PUSH_DATA (push, 0x100111);
- BEGIN_NIC0(push, NVC0_M2MF(DATA), 8);
- PUSH_DATAp(push, &tic->tic[0], 8);
-
+ nvc0_m2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
+   NV_VRAM_DOMAIN(&nvc0->screen->base), 32,
+   tic->tic);
  need_flush = true;
   } else
   if (res->status & NOUVEAU_BUFFER_STATUS_GPU_WRITING) {
-- 
2.9.0



Re: [Mesa-dev] [PATCH] nvc0: use nvc0_m2mf_push_linear() to reduce code duplication

2016-07-25 Thread Ilia Mirkin
Reviewed-by: Ilia Mirkin 

On Mon, Jul 25, 2016 at 6:48 PM, Samuel Pitoiset
 wrote:
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_tex.c | 16 +++-
>  1 file changed, 3 insertions(+), 13 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
> index 25a5a8e..40a9c93 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_tex.c
> @@ -474,7 +474,6 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
>  {
> uint32_t commands[32];
> struct nouveau_pushbuf *push = nvc0->base.pushbuf;
> -   struct nouveau_bo *txc = nvc0->screen->txc;
> unsigned i;
> unsigned n = 0;
> bool need_flush = false;
> @@ -495,18 +494,9 @@ nvc0_validate_tic(struct nvc0_context *nvc0, int s)
>if (tic->id < 0) {
>   tic->id = nvc0_screen_tic_alloc(nvc0->screen, tic);
>
> - PUSH_SPACE(push, 17);
> - BEGIN_NVC0(push, NVC0_M2MF(OFFSET_OUT_HIGH), 2);
> - PUSH_DATAh(push, txc->offset + (tic->id * 32));
> - PUSH_DATA (push, txc->offset + (tic->id * 32));
> - BEGIN_NVC0(push, NVC0_M2MF(LINE_LENGTH_IN), 2);
> - PUSH_DATA (push, 32);
> - PUSH_DATA (push, 1);
> - BEGIN_NVC0(push, NVC0_M2MF(EXEC), 1);
> - PUSH_DATA (push, 0x100111);
> - BEGIN_NIC0(push, NVC0_M2MF(DATA), 8);
> - PUSH_DATAp(push, &tic->tic[0], 8);
> -
> + nvc0_m2mf_push_linear(&nvc0->base, nvc0->screen->txc, tic->id * 32,
> +   NV_VRAM_DOMAIN(&nvc0->screen->base), 32,
> +   tic->tic);
>   need_flush = true;
>} else
>if (res->status & NOUVEAU_BUFFER_STATUS_GPU_WRITING) {
> --
> 2.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 01/21] i965/fs: Get rid of fs_visitor::do_dual_src.

2016-07-25 Thread Anuj Phogat
On Fri, Jul 22, 2016 at 8:58 PM, Francisco Jerez  wrote:
> This boolean flag was being used for two different things:
>
>  - To set the brw_wm_prog_data::dual_src_blend flag.  Instead we can
>just set it based on whether the dual_src_output register is valid,
>which will be the case if the shader writes the secondary blending
>color.
>
>  - To decide whether to call emit_single_fb_write() once, or in a loop
>that would iterate only once, which seems pretty useless.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.h   |  1 -
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 --
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 37 
> +++-
>  3 files changed, 14 insertions(+), 26 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
> b/src/mesa/drivers/dri/i965/brw_fs.h
> index fc1e1c4..46b15b4 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -318,7 +318,6 @@ public:
> fs_reg sample_mask;
> fs_reg outputs[VARYING_SLOT_MAX];
> fs_reg dual_src_output;
> -   bool do_dual_src;
> int first_non_payload_grf;
> /** Either BRW_MAX_GRF or GEN7_MRF_HACK_START */
> unsigned max_grf;
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 50d73eb..2872b2d 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -103,12 +103,10 @@ fs_visitor::nir_setup_outputs()
>   if (key->force_dual_color_blend &&
>   var->data.location == FRAG_RESULT_DATA1) {
>  this->dual_src_output = reg;
> -this->do_dual_src = true;
>   } else if (var->data.index > 0) {
>  assert(var->data.location == FRAG_RESULT_DATA0);
>  assert(var->data.index == 1);
>  this->dual_src_output = reg;
> -this->do_dual_src = true;
>   } else if (var->data.location == FRAG_RESULT_COLOR) {
>  /* Writing gl_FragColor outputs to all color regions. */
>  for (unsigned int i = 0; i < MAX2(key->nr_color_regions, 1); 
> i++) {
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 6d84374..808d8af 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -437,33 +437,25 @@ fs_visitor::emit_fb_writes()
> "in SIMD16+ mode.\n");
> }
>
> -   if (do_dual_src) {
> -  const fs_builder abld = bld.annotate("FB dual-source write");
> +   for (int target = 0; target < key->nr_color_regions; target++) {
> +  /* Skip over outputs that weren't written. */
> +  if (this->outputs[target].file == BAD_FILE)
> + continue;
>
> -  inst = emit_single_fb_write(abld, this->outputs[0],
> -  this->dual_src_output, reg_undef, 4);
> -  inst->target = 0;
> -
> -  prog_data->dual_src_blend = true;
> -   } else {
> -  for (int target = 0; target < key->nr_color_regions; target++) {
> - /* Skip over outputs that weren't written. */
> - if (this->outputs[target].file == BAD_FILE)
> -continue;
> +  const fs_builder abld = bld.annotate(
> + ralloc_asprintf(this->mem_ctx, "FB write target %d", target));
>
> - const fs_builder abld = bld.annotate(
> -ralloc_asprintf(this->mem_ctx, "FB write target %d", target));
> +  fs_reg src0_alpha;
> +  if (devinfo->gen >= 6 && key->replicate_alpha && target != 0)
> + src0_alpha = offset(outputs[0], bld, 3);
>
> - fs_reg src0_alpha;
> - if (devinfo->gen >= 6 && key->replicate_alpha && target != 0)
> -src0_alpha = offset(outputs[0], bld, 3);
> -
> - inst = emit_single_fb_write(abld, this->outputs[target], reg_undef,
> - src0_alpha, 4);
> - inst->target = target;
> -  }
> +  inst = emit_single_fb_write(abld, this->outputs[target],
> +  this->dual_src_output, src0_alpha, 4);
> +  inst->target = target;
> }
>
> +   prog_data->dual_src_blend = (this->dual_src_output.file != BAD_FILE);
> +
It'll be nice to add this assert here:
assert(!prog_data->dual_src_blend ||  key->nr_color_regions == 1);

> if (inst == NULL) {
>/* Even if there's no color buffers enabled, we still need to send
> * alpha out the pipeline to our null renderbuffer to support
> @@ -914,7 +906,6 @@ fs_visitor::init()
> this->promoted_constants = 0,
>
> this->spilled_any_registers = false;
> -   this->do_dual_src = false;
>  }
>
>  fs_visitor::~fs_visitor()
> --
> 2.9.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Patch is:
Reviewed-by: Anuj Phogat 
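
The idea under review, deriving the dual-source flag from whether the secondary output register is valid instead of tracking a separate boolean, together with the suggested assert, can be illustrated with a small stand-alone C sketch. Everything here is a hypothetical mock (mock_visitor, mock_reg, MOCK_BAD_FILE, mock_emit_fb_writes); it only mirrors the control flow of the patch and is not the i965 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register model: a register is "unwritten" when its file
 * is MOCK_BAD_FILE, loosely mirroring the BAD_FILE check in the patch. */
enum mock_file { MOCK_BAD_FILE = 0, MOCK_VGRF };

struct mock_reg {
   enum mock_file file;
};

struct mock_visitor {
   struct mock_reg outputs[8];
   struct mock_reg dual_src_output;
   int nr_color_regions;
};

/* Returns how many framebuffer writes would be emitted and reports
 * whether dual-source blending is in use. No separate flag is stored:
 * the answer is derived from the state of dual_src_output. */
static int mock_emit_fb_writes(const struct mock_visitor *v,
                               bool *dual_src_blend)
{
   int writes = 0;

   for (int target = 0; target < v->nr_color_regions; target++) {
      /* Skip over outputs that were never written. */
      if (v->outputs[target].file == MOCK_BAD_FILE)
         continue;
      writes++;
   }

   *dual_src_blend = v->dual_src_output.file != MOCK_BAD_FILE;

   /* The invariant suggested in the review: dual-source blending
    * implies exactly one color region. */
   assert(!*dual_src_blend || v->nr_color_regions == 1);

   return writes;
}
```

Because the flag is computed rather than stored, it cannot drift out of sync with the register it describes, and the assert documents the single-target assumption that the old special case encoded implicitly.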

[Mesa-dev] [PATCH 2/2] radeonsi/compute: Use the HSA abi for non-TGSI compute shaders

2016-07-25 Thread Tom Stellard
This patch switches non-TGSI compute shaders over to using the HSA
ABI described here:

https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md

The HSA ABI provides a much cleaner interface for compute shaders and allows
us to share more code in the compiler with the HSA stack.

The main changes in this patch are:
  - We now pass the scratch buffer resource into the shader via user sgprs
rather than using relocations.
  - Grid/Block sizes are now passed to the shader via the dispatch packet
rather than at the beginning of the kernel arguments.

Typically for HSA, the CP firmware will create the dispatch packet and set
up the user sgprs automatically.  However, in Mesa we let the driver do
this work.  The main reason for this is that I haven't researched how to
get the CP to do all these things, and I'm not sure if it is supported
for all GPUs.
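
As a rough illustration of the second change, the sketch below fills a simplified dispatch record from block and group counts. It assumes the HSA convention that grid sizes are counted in work-items rather than work-groups; mock_dispatch_packet and mock_fill_dispatch are hypothetical names that model only the size fields, not the real packet layout or the code in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified packet: only the size fields are modeled. */
struct mock_dispatch_packet {
   uint16_t workgroup_size[3];   /* work-items per work-group */
   uint32_t grid_size[3];        /* total work-items per dimension */
};

/* Fill the packet from a block size and a number of work-groups.
 * Assumption: the grid is expressed in work-items, so each dimension
 * is num_groups * block. */
static void mock_fill_dispatch(struct mock_dispatch_packet *pkt,
                               const uint32_t block[3],
                               const uint32_t num_groups[3])
{
   for (int i = 0; i < 3; i++) {
      pkt->workgroup_size[i] = (uint16_t)block[i];
      pkt->grid_size[i] = num_groups[i] * block[i];
   }
}
```

A kernel that reads its sizes from such a record no longer needs them prepended to its argument buffer, which is the practical effect described above.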
---
 src/gallium/drivers/radeon/r600_pipe_common.c|   6 +-
 src/gallium/drivers/radeonsi/amd_kernel_code_t.h | 534 +++
 src/gallium/drivers/radeonsi/si_compute.c| 234 +-
 3 files changed, 756 insertions(+), 18 deletions(-)
 create mode 100644 src/gallium/drivers/radeonsi/amd_kernel_code_t.h

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
index cd4908f..9ecf666 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -784,7 +784,11 @@ static int r600_get_compute_param(struct pipe_screen *screen,
if (rscreen->family <= CHIP_ARUBA) {
triple = "r600--";
} else {
-   triple = "amdgcn--";
+   if (HAVE_LLVM < 0x0400) {
+   triple = "amdgcn--";
+   } else {
+   triple = "amdgcn--mesa3d";
+   }
}
switch(rscreen->family) {
/* Clang < 3.6 is missing Hainan in its list of
diff --git a/src/gallium/drivers/radeonsi/amd_kernel_code_t.h b/src/gallium/drivers/radeonsi/amd_kernel_code_t.h
new file mode 100644
index 000..d0d7809
--- /dev/null
+++ b/src/gallium/drivers/radeonsi/amd_kernel_code_t.h
@@ -0,0 +1,534 @@
+/*
+ * Copyright 2015,2016 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * on the rights to use, copy, modify, merge, publish, distribute, sub
+ * license, and/or sell copies of the Software, and to permit persons to whom
+ * the Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHOR(S) AND/OR THEIR SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef AMDKERNELCODET_H
+#define AMDKERNELCODET_H
+
+//---//
+// AMD Kernel Code, and its dependencies //
+//---//
+
+// Sets val bits for specified mask in specified dst packed instance.
+#define AMD_HSA_BITS_SET(dst, mask, val)                                       \
+  dst &= (~(1 << mask ## _SHIFT) & ~mask);                                     \
+  dst |= (((val) << mask ## _SHIFT) & mask)
+
+// Gets bits for specified mask from specified src packed instance.
+#define AMD_HSA_BITS_GET(src, mask)                                            \
+  ((src & mask) >> mask ## _SHIFT)                                             \
+
+/* Every amd_*_code_t has the following properties, which are composed of
+ * a number of bit fields. Every bit field has a mask (AMD_CODE_PROPERTY_*),
+ * bit width (AMD_CODE_PROPERTY_*_WIDTH), and bit shift amount
+ * (AMD_CODE_PROPERTY_*_SHIFT) for convenient access. Unused bits must be 0.
+ *
+ * (Note that bit fields cannot be used as their layout is
+ * implementation defined in the C standard and so cannot be used to
+ * specify an ABI)
+ */
+enum amd_code_property_mask_t {
+
+  /* Enable the setup of the SGPR user data registers
+   * (AMD_CODE_PROPERTY_ENABLE_SGPR_*), see documentation of amd_kernel_code_t
+   * for initial register state.
+   *
+   * The tot

[Mesa-dev] [PATCH shader-db 2/2] run: Mark shaders with only one stage as separable.

2016-07-25 Thread Kenneth Graunke
There are a couple cases where a single shader might happen:

- compute shaders
  (only one stage, no inputs and outputs; separable shouldn't matter)
- vertex shaders with transform feedback
  (we want to retain outputs, but transform feedback varyings are
   specified via the API, not the shader - setting SSO fixes this)
- old shader_test files captured before we started adding "SSO ENABLED".

In any case, it seems harmless or beneficial to enable SSO for all
.shader_test files containing a single shader.

Based on a patch by Marek.
---
 run.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/run.c b/run.c
index 024c8e9..f6ce1bf 100644
--- a/run.c
+++ b/run.c
@@ -633,6 +633,12 @@ main(int argc, char **argv)
 }
 ctx_is_core = type == TYPE_CORE;
 
+/* If there's only one shader, mark it separable so inputs
+ * and outputs aren't eliminated.
+ */
+if (num_shaders == 1)
+use_separate_shader_objects = true;
+
 if (type == TYPE_CORE || type == TYPE_COMPAT) {
 GLuint prog = glCreateProgram();
 
-- 
2.9.0



[Mesa-dev] [PATCH shader-db 1/2] run: Add separate shader objects support.

2016-07-25 Thread Kenneth Graunke
With this patch, if a .shader_test file contains

[require]
...
SSO ENABLED

then we'll set GL_PROGRAM_SEPARABLE to compile the shaders into separate
shader objects.  This prevents the linker from removing unused inputs
and outputs.  Drivers may also choose to lay out interfaces of SSO
programs differently, resulting in different code.

v2:
- Actually initialize use_separate_shader_objects
- Fix memcmp length parameter (thanks to Matt)

v3:
- Search for "SSO ENABLED" instead of "GL_ARB_separate_shader_objects",
  to match what Timothy did in shader_runner.
- Use GL_PROGRAM_SEPARABLE (suggested by Tapani).  This allows
  multi-stage SSO programs to optimize internal interfaces, while
  still making the end-stages separable.
---
 run.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/run.c b/run.c
index 0e6e248..024c8e9 100644
--- a/run.c
+++ b/run.c
@@ -73,12 +73,14 @@ static struct shader *
 get_shaders(const struct context_info *core, const struct context_info *compat,
 const char *text, size_t text_size,
 enum shader_type *type, unsigned *num_shaders,
+bool *use_separate_shader_objects,
 const char *shader_name)
 {
 static const char *req = "[require]";
 static const char *glsl_req = "\nGLSL >= ";
 static const char *fp_req = "\nGL_ARB_fragment_program";
 static const char *vp_req = "\nGL_ARB_vertex_program";
+static const char *sso_req = "SSO ENABLED";
 static const char *gs = "geometry shader]\n";
 static const char *fs = "fragment ";
 static const char *vs = "vertex ";
@@ -90,6 +92,8 @@ get_shaders(const struct context_info *core, const struct context_info *compat,
 static const char *test = "test]\n";
 const char *end_text = text + text_size;
 
+*use_separate_shader_objects = false;
+
 /* Find the [require] block and parse it first. */
 text = memmem(text, end_text - text, req, strlen(req)) + strlen(req);
 
@@ -137,6 +141,9 @@ get_shaders(const struct context_info *core, const struct context_info *compat,
 shader_name, (int)(newline - extension_text), extension_text);
 return NULL;
 }
+if (memcmp(extension_text, sso_req, strlen(sso_req)) == 0) {
+*use_separate_shader_objects = true;
+}
 }
 
 /* Find the shaders. */
@@ -606,9 +613,11 @@ main(int argc, char **argv)
 
 enum shader_type type;
 unsigned num_shaders;
+bool use_separate_shader_objects;
 struct shader *shader = get_shaders(&core, &compat,
 text, shader_test[i].filesize,
 &type, &num_shaders,
+&use_separate_shader_objects,
 current_shader_name);
 if (unlikely(shader == NULL)) {
 continue;
@@ -627,6 +636,9 @@ main(int argc, char **argv)
 if (type == TYPE_CORE || type == TYPE_COMPAT) {
 GLuint prog = glCreateProgram();
 
+if (use_separate_shader_objects)
+glProgramParameteri(prog, GL_PROGRAM_SEPARABLE, GL_TRUE);
+
 for (unsigned i = 0; i < num_shaders; i++) {
 GLuint s = glCreateShader(shader[i].type);
 glShaderSource(s, 1, &shader[i].text, &shader[i].length);
-- 
2.9.0
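
The detection step described in the commit message, looking for an "SSO ENABLED" line inside the [require] block, can be sketched on its own without any GL calls. This is a minimal stand-alone approximation, not the code in run.c: mock_wants_sso is a hypothetical helper, it uses strstr on a NUL-terminated string rather than memmem on a sized buffer, and it treats the next line starting with '[' as the end of the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the [require] block of a shader_test-style text
 * contains a line that begins with "SSO ENABLED". */
static bool mock_wants_sso(const char *text)
{
   static const char req[] = "[require]";
   static const char sso[] = "SSO ENABLED";
   const char *line = strstr(text, req);

   if (line == NULL)
      return false;

   /* Walk the lines that follow the [require] header. */
   line = strchr(line, '\n');
   while (line != NULL) {
      line++;                          /* step past the newline */
      if (*line == '[')                /* next section starts: stop */
         return false;
      if (strncmp(line, sso, strlen(sso)) == 0)
         return true;
      line = strchr(line, '\n');
   }
   return false;
}
```

A caller would use the result to decide whether to mark the program separable before linking, which is what the glProgramParameteri() call in the patch does.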



Re: [Mesa-dev] [PATCH v2] vc4: add hash table look-up for exported dmabufs

2016-07-25 Thread Eric Anholt
Rob Herring  writes:

> It is necessary to reuse existing BOs when dmabufs are imported. There
> are 2 cases that need to be handled. dmabufs can be created/exported and
> imported by the same process and can be imported multiple times.
> Copying other drivers, add a hash table to track exported BOs so the
> BOs get reused.
>
> Cc: Eric Anholt 
> Signed-off-by: Rob Herring 

Looks good to me, other than a bit of funny whitespace that I'll fix up.
I built a piglit test for this today (want to go take a look at those?),
and once I get a piglit run through, I'll push the change.
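
For readers unfamiliar with the problem being solved, the reuse logic can be sketched as a handle-keyed lookup: importing a handle that is already known returns the existing buffer object with its reference count bumped, and only an unknown handle creates a new one. The code below is a stand-alone illustration under stated assumptions, not the vc4 implementation: mock_bo, mock_screen, mock_slot and mock_import are hypothetical, the table is a fixed-size open-addressing array that is assumed never to fill up, and release/removal is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical buffer object: just a handle and a reference count. */
struct mock_bo {
   uint32_t handle;
   int refcount;
};

#define MOCK_TABLE_SIZE 64   /* power of two; assumed never to fill up */

struct mock_screen {
   struct mock_bo *by_handle[MOCK_TABLE_SIZE];
};

/* Find the slot holding this handle, or the empty slot where it would
 * be stored (linear probing). */
static struct mock_bo **mock_slot(struct mock_screen *s, uint32_t handle)
{
   unsigned i = (handle * 2654435761u) & (MOCK_TABLE_SIZE - 1);

   while (s->by_handle[i] != NULL && s->by_handle[i]->handle != handle)
      i = (i + 1) & (MOCK_TABLE_SIZE - 1);
   return &s->by_handle[i];
}

/* Import a handle: reuse the existing object if it is already tracked,
 * otherwise create one and remember it. */
static struct mock_bo *mock_import(struct mock_screen *s, uint32_t handle)
{
   struct mock_bo **slot = mock_slot(s, handle);
   struct mock_bo *bo;

   if (*slot != NULL) {
      (*slot)->refcount++;
      return *slot;
   }

   bo = calloc(1, sizeof(*bo));
   if (bo == NULL)
      return NULL;
   bo->handle = handle;
   bo->refcount = 1;
   *slot = bo;
   return bo;
}
```

Importing the same handle twice therefore yields the same pointer with a reference count of two, instead of two objects that would each try to free the same underlying buffer.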




Re: [Mesa-dev] [PATCH v2] vc4: add hash table look-up for exported dmabufs

2016-07-25 Thread Rob Clark
On Mon, Jul 25, 2016 at 8:47 PM, Eric Anholt  wrote:
> Rob Herring  writes:
>
>> It is necessary to reuse existing BOs when dmabufs are imported. There
>> are 2 cases that need to be handled. dmabufs can be created/exported and
>> imported by the same process and can be imported multiple times.
>> Copying other drivers, add a hash table to track exported BOs so the
>> BOs get reused.
>>
>> Cc: Eric Anholt 
>> Signed-off-by: Rob Herring 
>
> Looks good to me, other than a bit of funny whitespace that I'll fix up.
> I built a piglit test for this today (want to go take a look at those?),
> and once I get a piglit run through, I'll push the change.

I don't suppose you have a branch somewhere?  I should probably give
that a try..  I've had
https://trello.com/c/x34U0kTQ/114-teach-piglit-about-libdrm-freedreno-for-dmabuf-tests
on my todo list for a while (but a generic gbm based solution seems
even better than teaching piglit about libdrm_$drivername ;-))

BR,
-R




Re: [Mesa-dev] [PATCH v2 2/2] st/mesa: provide GL_OES_copy_image support by caching the original ETC data

2016-07-25 Thread Ilia Mirkin
ping

On Sat, Jul 16, 2016 at 12:21 PM, Ilia Mirkin  wrote:
> The additional provision of GL_OES_copy_image is that it work for ETC.
> However many desktop GPUs don't have native ETC support, so st/mesa does
> the decoding by hand. Instead of discarding the compressed data, keep it
> around in CPU memory. Use it when performing image copies.
>
> Signed-off-by: Ilia Mirkin 
> ---
>  docs/GL3.txt |  2 +-
>  docs/relnotes/12.1.0.html|  1 +
>  src/mesa/state_tracker/st_cb_copyimage.c | 96 
> +++-
>  src/mesa/state_tracker/st_cb_texture.c   | 77 +
>  src/mesa/state_tracker/st_extensions.c   | 12 +---
>  src/mesa/state_tracker/st_texture.h  |  7 ++-
>  6 files changed, 156 insertions(+), 39 deletions(-)
>
> diff --git a/docs/GL3.txt b/docs/GL3.txt
> index ce34869..6da5225 100644
> --- a/docs/GL3.txt
> +++ b/docs/GL3.txt
> @@ -255,7 +255,7 @@ GLES3.2, GLSL ES 3.2
>GL_KHR_debug  DONE (all drivers)
>GL_KHR_robustness DONE (i965)
>GL_KHR_texture_compression_astc_ldr   DONE (i965/gen9+)
> -  GL_OES_copy_image DONE (i965)
> +  GL_OES_copy_image DONE (all drivers)
>GL_OES_draw_buffers_indexed   DONE (all drivers 
> that support GL_ARB_draw_buffers_blend)
>GL_OES_draw_elements_base_vertex  DONE (all drivers)
>GL_OES_geometry_shaderstarted (idr)
> diff --git a/docs/relnotes/12.1.0.html b/docs/relnotes/12.1.0.html
> index 096f551..abdd83af 100644
> --- a/docs/relnotes/12.1.0.html
> +++ b/docs/relnotes/12.1.0.html
> @@ -47,6 +47,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  GL_ARB_shader_group_vote on nvc0
>  GL_ARB_ES3_1_compatibility on i965
>  GL_EXT_window_rectangles on nv50, nvc0
> +GL_OES_copy_image on nv50, nvc0, r600, radeonsi, softpipe, llvmpipe
>  
>
>  Bug fixes
> diff --git a/src/mesa/state_tracker/st_cb_copyimage.c 
> b/src/mesa/state_tracker/st_cb_copyimage.c
> index f670bd9..d160c8c 100644
> --- a/src/mesa/state_tracker/st_cb_copyimage.c
> +++ b/src/mesa/state_tracker/st_cb_copyimage.c
> @@ -532,6 +532,90 @@ copy_image(struct pipe_context *pipe,
>   src_box);
>  }
>
> +/* Note, the only allowable compressed format for this function is ETC */
> +static void
> +fallback_copy_image(struct st_context *st,
> +struct gl_texture_image *dst_image,
> +struct pipe_resource *dst_res,
> +int dst_x, int dst_y, int dst_z,
> +struct gl_texture_image *src_image,
> +struct pipe_resource *src_res,
> +int src_x, int src_y, int src_z,
> +int src_w, int src_h)
> +{
> +   uint8_t *dst, *src;
> +   int dst_stride, src_stride;
> +   struct pipe_transfer *dst_transfer, *src_transfer;
> +   unsigned line_bytes;
> +
> +   bool dst_is_compressed = dst_image && 
> _mesa_is_format_compressed(dst_image->TexFormat);
> +   bool src_is_compressed = src_image && 
> _mesa_is_format_compressed(src_image->TexFormat);
> +
> +   unsigned dst_w = src_w;
> +   unsigned dst_h = src_h;
> +   unsigned lines = src_h;
> +
> +   if (src_is_compressed && !dst_is_compressed) {
> +  dst_w = DIV_ROUND_UP(dst_w, 4);
> +  dst_h = DIV_ROUND_UP(dst_h, 4);
> +   } else if (!src_is_compressed && dst_is_compressed) {
> +  dst_w *= 4;
> +  dst_h *= 4;
> +   }
> +   if (src_is_compressed) {
> +  lines = DIV_ROUND_UP(lines, 4);
> +   }
> +
> +   if (src_image)
> +  line_bytes = _mesa_format_row_stride(src_image->TexFormat, src_w);
> +   else
> +  line_bytes = _mesa_format_row_stride(dst_image->TexFormat, dst_w);
> +
> +   if (dst_image) {
> +  st->ctx->Driver.MapTextureImage(
> +st->ctx, dst_image, dst_z,
> +dst_x, dst_y, dst_w, dst_h,
> +GL_MAP_WRITE_BIT, &dst, &dst_stride);
> +   } else {
> +  dst = pipe_transfer_map(st->pipe, dst_res, 0, dst_z,
> +  PIPE_TRANSFER_WRITE,
> +  dst_x, dst_y, dst_w, dst_h,
> +  &dst_transfer);
> +  dst_stride = dst_transfer->stride;
> +   }
> +
> +   if (src_image) {
> +  st->ctx->Driver.MapTextureImage(
> +st->ctx, src_image, src_z,
> +src_x, src_y, src_w, src_h,
> +GL_MAP_READ_BIT, &src, &src_stride);
> +   } else {
> +  src = pipe_transfer_map(st->pipe, src_res, 0, src_z,
> +  PIPE_TRANSFER_READ,
> +  src_x, src_y, src_w, src_h,
> +  &src_transfer);
> +  src_stride = src_transfer->stride;
> +   }
> +
> +   for (int y = 0; y < lines; y++) {
> +  memcpy(dst, src, line_bytes);
> +  dst += dst_str

Re: [Mesa-dev] [PATCH v2] vc4: add hash table look-up for exported dmabufs

2016-07-25 Thread Eric Anholt
Rob Clark  writes:

> On Mon, Jul 25, 2016 at 8:47 PM, Eric Anholt  wrote:
>> Rob Herring  writes:
>>
>>> It is necessary to reuse existing BOs when dmabufs are imported. There
>>> are 2 cases that need to be handled. dmabufs can be created/exported and
>>> imported by the same process and can be imported multiple times.
>>> Copying other drivers, add a hash table to track exported BOs so the
>>> BOs get reused.
>>>
>>> Cc: Eric Anholt 
>>> Signed-off-by: Rob Herring 
>>
>> Looks good to me, other than a bit of funny whitespace that I'll fix up.
>> I built a piglit test for this today (want to go take a look at those?),
>> and once I get a piglit run through, I'll push the change.
>
> I don't suppose you have a branch somewhere?  I should probably give
> that a try..  I've had
> https://trello.com/c/x34U0kTQ/114-teach-piglit-about-libdrm-freedreno-for-dmabuf-tests
> on my todo list for a while (but a generic gbm based solution seems
> even better than teaching piglit about libdrm_$drivername ;-))

https://cgit.freedesktop.org/~anholt/piglit/log/?h=dmabuf-refcount




[Mesa-dev] [PATCH 1/2] radeonsi/compute: Add some more debug printfs

2016-07-25 Thread Tom Stellard
---
 src/gallium/drivers/radeonsi/si_compute.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/si_compute.c 
b/src/gallium/drivers/radeonsi/si_compute.c
index 5a40286..949ab1a 100644
--- a/src/gallium/drivers/radeonsi/si_compute.c
+++ b/src/gallium/drivers/radeonsi/si_compute.c
@@ -299,6 +299,9 @@ static bool si_switch_compute_shader(struct si_context 
*sctx,
radeon_emit(cs, config->rsrc1);
radeon_emit(cs, config->rsrc2);
 
+   COMPUTE_DBG(sctx->screen, "COMPUTE_PGM_RSRC1: 0x%08x "
+   "COMPUTE_PGM_RSRC2: 0x%08x\n", config->rsrc1, config->rsrc2);
+
radeon_set_sh_reg(cs, R_00B860_COMPUTE_TMPRING_SIZE,
  S_00B860_WAVES(sctx->scratch_waves)
 | S_00B860_WAVESIZE(config->scratch_bytes_per_wave >> 10));
-- 
2.7.4



Re: [Mesa-dev] [PATCH 02/95] i965/vec4/nir: simplify glsl_type_for_nir_alu_type()

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga  writes:

> From: Connor Abbott 
>
> Less duplication, one less case to handle for doubles, and support
> for sized NIR types.
>
> v2: Fix call to get_instance by swapping rows and columns params (Iago)
>
> Signed-off-by: Iago Toral Quiroga 

Reviewed-by: Francisco Jerez 

> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 16 ++--
>  1 file changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index f3b4528..6662a1e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1696,20 +1696,8 @@ const glsl_type *
>  glsl_type_for_nir_alu_type(nir_alu_type alu_type,
> unsigned components)
>  {
> -   switch (alu_type) {
> -   case nir_type_float:
> -  return glsl_type::vec(components);
> -   case nir_type_int:
> -  return glsl_type::ivec(components);
> -   case nir_type_uint:
> -  return glsl_type::uvec(components);
> -   case nir_type_bool:
> -  return glsl_type::bvec(components);
> -   default:
> -  return glsl_type::error_type;
> -   }
> -
> -   return glsl_type::error_type;
> +   return glsl_type::get_instance(brw_glsl_base_type_for_nir_type(alu_type),
> +  components, 1);
>  }
>  
>  void
> -- 
> 2.7.4
>


Re: [Mesa-dev] [PATCH 09/95] i965/vec4: add support for printing DF immediates

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga  writes:

> From: Connor Abbott 
>
Reviewed-by: Francisco Jerez 

> ---
>  src/mesa/drivers/dri/i965/brw_vec4.cpp | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> index 162b481..bf6701e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> @@ -1485,6 +1485,9 @@ vec4_visitor::dump_instruction(backend_instruction 
> *be_inst, FILE *file)
>   case BRW_REGISTER_TYPE_F:
>  fprintf(file, "%fF", inst->src[i].f);
>  break;
> + case BRW_REGISTER_TYPE_DF:
> +fprintf(file, "%fDF", inst->src[i].df);
> +break;
>   case BRW_REGISTER_TYPE_D:
>  fprintf(file, "%dD", inst->src[i].d);
>  break;
> -- 
> 2.7.4
>


Re: [Mesa-dev] [PATCH 04/95] i965/vec4/nir: Add bit-size information to types

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga  writes:

Reviewed-by: Francisco Jerez 

> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 1f8fa80..c5b9715 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -327,7 +327,7 @@ src_reg
>  vec4_visitor::get_nir_src(const nir_src &src, unsigned num_components)
>  {
> /* if type is not specified, default to signed int */
> -   return get_nir_src(src, nir_type_int, num_components);
> +   return get_nir_src(src, nir_type_int32, num_components);
>  }
>  
>  src_reg
> @@ -733,7 +733,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
> case nir_intrinsic_atomic_counter_dec: {
>unsigned surf_index = prog_data->base.binding_table.abo_start +
>   (unsigned) instr->const_index[0];
> -  src_reg offset = get_nir_src(instr->src[0], nir_type_int,
> +  src_reg offset = get_nir_src(instr->src[0], nir_type_int32,
> instr->num_components);
>const src_reg surface = brw_imm_ud(surf_index);
>const vec4_builder bld =
> @@ -787,7 +787,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
>* from any live channel.
>*/
>   surf_index = src_reg(this, glsl_type::uint_type);
> - emit(ADD(dst_reg(surf_index), get_nir_src(instr->src[0], 
> nir_type_int,
> + emit(ADD(dst_reg(surf_index), get_nir_src(instr->src[0], 
> nir_type_int32,
> instr->num_components),
>brw_imm_ud(prog_data->base.binding_table.ubo_start)));
>   surf_index = emit_uniformize(surf_index);
> @@ -805,7 +805,7 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
> *instr)
>if (const_offset) {
>   offset = brw_imm_ud(const_offset->u32[0] & ~15);
>} else {
> - offset = get_nir_src(instr->src[1], nir_type_int, 1);
> + offset = get_nir_src(instr->src[1], nir_type_uint32, 1);
>}
>  
>src_reg packed_consts = src_reg(this, glsl_type::vec4_type);
> -- 
> 2.7.4
>


Re: [Mesa-dev] [PATCH 05/95] i965/vec4/nir: support doubles in ALU operations

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga  writes:

> Basically, this involves considering the bit-size information to set
> the appropriate type on both operands and destination.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index c5b9715..5a7ee0b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1001,14 +1001,18 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)
>  {
> vec4_instruction *inst;
>  
> -   dst_reg dst = get_nir_dest(instr->dest.dest,
> -  nir_op_infos[instr->op].output_type);
> +   nir_alu_type dst_type = nir_op_infos[instr->op].output_type;
> +   unsigned dst_bit_size = nir_dest_bit_size(instr->dest.dest);
> +   dst_type = (nir_alu_type) (dst_type | dst_bit_size);

Seems rather confusing to declare two temporaries for this and assign
one of them twice, when you could have written the nir_alu_type as a
straightforward closed-form expression in the function call below, but
meh...
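
As a rough illustration of the closed-form version suggested here, a sized type can be built in one expression by ORing the base type with the bit size. The enum values and the helper names below are stand-ins made up for this sketch (chosen so that base-type bits never overlap the sizes 8, 16, 32 and 64); they are not copied from the NIR headers.

```c
/* Stand-in for nir_alu_type: hypothetical values whose bits do not
 * overlap the bit-size values 8, 16, 32 and 64. */
enum alu_type {
   ALU_TYPE_INT   = 2,
   ALU_TYPE_UINT  = 4,
   ALU_TYPE_BOOL  = 6,
   ALU_TYPE_FLOAT = 128,
};

#define ALU_TYPE_SIZE_MASK (8u | 16u | 32u | 64u)

/* Combine an unsized base type and a bit size in a single expression,
 * so no temporary needs to be declared and then reassigned. */
static inline unsigned
sized_alu_type(enum alu_type base, unsigned bit_size)
{
   return (unsigned) base | bit_size;
}

/* Split a sized type back into its two parts. */
static inline unsigned
alu_type_bit_size(unsigned type)
{
   return type & ALU_TYPE_SIZE_MASK;
}

static inline unsigned
alu_type_base(unsigned type)
{
   return type & ~ALU_TYPE_SIZE_MASK;
}
```

With a helper like this, the call in the patch could pass the combined value directly as the second argument instead of going through dst_type and dst_bit_size.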

Reviewed-by: Francisco Jerez 

> +   dst_reg dst = get_nir_dest(instr->dest.dest, dst_type);
> dst.writemask = instr->dest.write_mask;
>  
> src_reg op[4];
> for (unsigned i = 0; i < nir_op_infos[instr->op].num_inputs; i++) {
> -  op[i] = get_nir_src(instr->src[i].src,
> -  nir_op_infos[instr->op].input_types[i], 4);
> +  nir_alu_type src_type = nir_op_infos[instr->op].input_types[i];
> +  unsigned bit_size = nir_src_bit_size(instr->src[i].src);
> +  src_type = (nir_alu_type) (src_type | bit_size);
> +  op[i] = get_nir_src(instr->src[i].src, src_type, 4);
>op[i].swizzle = brw_swizzle_for_nir_swizzle(instr->src[i].swizzle);
>op[i].abs = instr->src[i].abs;
>op[i].negate = instr->src[i].negate;
> -- 
> 2.7.4
>


[Mesa-dev] [PATCH] clover: make older c++ (4.8.x) happy

2016-07-25 Thread Dieter Nützel
Signed-off-by: Dieter Nützel 
---
 src/gallium/state_trackers/clover/llvm/codegen/native.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp 
b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
index b96236b..f5e887e 100644
--- a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
+++ b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp
@@ -126,7 +126,7 @@ namespace {
   {
  compat::pass_manager pm;
  ::llvm::raw_svector_ostream os { data };
- compat::raw_ostream_to_emit_file fos { os };
+ compat::raw_ostream_to_emit_file fos ( os );
 
  mod.setDataLayout(compat::get_data_layout(*tm));
  tm->Options.MCOptions.AsmVerbose =
-- 
2.1.4



Re: [Mesa-dev] [PATCH 03/95] i965/vec4/nir: allocate two registers for dvec3/dvec4

2016-07-25 Thread Francisco Jerez
Iago Toral Quiroga  writes:

> From: Connor Abbott 
>
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index 6662a1e..1f8fa80 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -141,6 +141,9 @@ vec4_visitor::nir_emit_impl(nir_function_impl *impl)
>unsigned array_elems =
>   reg->num_array_elems == 0 ? 1 : reg->num_array_elems;
>  
> +  if (reg->bit_size == 64)
> + array_elems *= 2;
> +
>nir_locals[reg->index] = dst_reg(VGRF, alloc.allocate(array_elems));

Shouldn't this just be 'array_elems * DIV_ROUND_UP(bit_size, 32)'?
Seems like a saner long-term plan than special-casing every possible bit
size in every place that cares about the bit size of a variable.

> }
>  
> @@ -270,7 +273,7 @@ dst_reg
>  vec4_visitor::get_nir_dest(const nir_dest &dest)
>  {
> if (dest.is_ssa) {
> -  dst_reg dst = dst_reg(VGRF, alloc.allocate(1));
> +  dst_reg dst = dst_reg(VGRF, alloc.allocate(dest.ssa.bit_size / 32));

Using DIV_ROUND_UP instead of plain integer division would have the
advantage that it won't leave things so badly broken for sub-32-bit
types.
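
The difference between the two forms of division is easy to see in isolation. The sketch below uses the usual round-up division idiom; regs_for_bit_size is a hypothetical helper named for this example only, not a function from the driver.

```c
/* Round-up integer division, the common idiom behind DIV_ROUND_UP. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: registers needed for array_elems elements whose
 * channels are bit_size bits wide, counted in 32-bit-channel units. */
static inline unsigned
regs_for_bit_size(unsigned array_elems, unsigned bit_size)
{
   return array_elems * DIV_ROUND_UP(bit_size, 32);
}
```

For 64-bit values this gives two registers per element, for 32-bit values one, and for a 16-bit value still one, whereas plain `bit_size / 32` would yield zero in the 16-bit case.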

>nir_ssa_values[dest.ssa.index] = dst;
>return dst;
> } else {
> -- 
> 2.7.4
>


Re: [Mesa-dev] [PATCH 3/9] st/mesa: completely rewrite state atoms

2016-07-25 Thread Rob Clark
On Mon, Jul 25, 2016 at 1:19 PM, Marek Olšák  wrote:
> On Mon, Jul 25, 2016 at 5:42 PM, Rob Clark  wrote:
>> On Mon, Jul 25, 2016 at 11:16 AM, Brian Paul  wrote:
>>> On 07/18/2016 07:11 AM, Marek Olšák wrote:

 @@ -183,49 +107,42 @@ static void check_attrib_edgeflag(struct st_context
 *st)

   void st_validate_state( struct st_context *st, enum st_pipeline pipeline
 )
   {
 -   const struct st_tracked_state **atoms;
 -   struct st_state_flags *state;
 -   GLuint num_atoms;
 -   GLuint i;
 +   uint64_t dirty, pipeline_mask;
 +   uint32_t dirty_lo, dirty_hi;
 +
 +   /* Get Mesa driver state. */
 +   st->dirty |= st->ctx->NewDriverState & ST_ALL_STATES_MASK;
 +   st->ctx->NewDriverState = 0;

  /* Get pipeline state. */
  switch (pipeline) {
 -case ST_PIPELINE_RENDER:
 -  atoms = render_atoms;
 -  num_atoms = ARRAY_SIZE(render_atoms);
 -  state = &st->dirty;
 +   case ST_PIPELINE_RENDER:
 +  check_attrib_edgeflag(st);
 +  check_program_state(st);
 +  st_manager_validate_framebuffers(st);
 +
 +  pipeline_mask = ST_PIPELINE_RENDER_STATE_MASK;
 break;
  case ST_PIPELINE_COMPUTE:
 -  atoms = compute_atoms;
 -  num_atoms = ARRAY_SIZE(compute_atoms);
 -  state = &st->dirty_cp;
 +  pipeline_mask = ST_PIPELINE_COMPUTE_STATE_MASK;
 break;
  default:
 unreachable("Invalid pipeline specified");
  }

 -   /* Get Mesa driver state. */
 -   st->dirty.st |= st->ctx->NewDriverState;
 -   st->dirty_cp.st |= st->ctx->NewDriverState;
 -   st->ctx->NewDriverState = 0;
 -
 -   if (pipeline == ST_PIPELINE_RENDER) {
 -  check_attrib_edgeflag(st);
 -
 -  check_program_state(st);
 -
 -  st_manager_validate_framebuffers(st);
 -   }
 -
 -   if (state->st == 0 && state->mesa == 0)
 +   dirty = st->dirty & pipeline_mask;
 +   if (!dirty)
 return;

 -   /*printf("%s %x/%x\n", __func__, state->mesa, state->st);*/
 +   dirty_lo = dirty;
 +   dirty_hi = dirty >> 32;

 -   for (i = 0; i < num_atoms; i++) {
 -  if (check_state(state, &atoms[i]->dirty))
 - atoms[i]->update( st );
 -   }
 +   /* Update states. */
 +   while (dirty_lo)
 +  atoms[u_bit_scan(&dirty_lo)]->update(st);
 +   while (dirty_hi)
 +  atoms[32 + u_bit_scan(&dirty_hi)]->update(st);

>>>
>>> Could we just use the u_bit_scan64() function and avoid the hi/lo split?
>>
>> fwiw, we actually did discuss that on irc, but I guess no one
>> summarized on email thread..
>>
>> Marek's concern was that it would generate worse code on 32b since it
>> would pull the right-shift into the loop.
>>
>> I'm not entirely sure if gcc would be clever enough in this case or
>> not.  I guess someone needs to compare generated asm in both cases.
>> And either use u_bit_scan64() if the compiler is clever enough, or add
>> a comment explaining the reason.
>
> Yeah, I added this comment before the loops:
> "Don't use u_bit_scan64, it may be slower on 32-bit."
>
> On 32-bit, ffsll is an if-then-else expression with some arithmetic
> and shifting one bit to the left is another if-then-else expression.

fwiw, I did spend a bit of time this evening playing around with this,
and the dirty_hi/dirty_lo approach w/ 32b/i686 build works out to be
something like 12 instructions shorter for the loop body, ie. gcc
isn't clever enough (total instruction count increases by doubling the
loops but I think that doesn't matter)..  given that this is a hot
spot in profiles that I've looked at, it might even be worth having
some #ifdef 64b / #else.. but ofc that could be left as a future
exercise if someone cares..  either way, r-b
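
The hi/lo approach being compared can be sketched in isolation as follows. This is an illustration only: bit_scan32 and collect_dirty are names made up here, the scan relies on the GCC/Clang builtin __builtin_ctz rather than Mesa's own u_bit_scan, and the indices are written to an array instead of calling per-atom update functions.

```c
#include <stdint.h>

/* Return the index of the lowest set bit and clear it.
 * Assumes *mask != 0; __builtin_ctz is a GCC/Clang builtin. */
static inline int
bit_scan32(uint32_t *mask)
{
   int i = __builtin_ctz(*mask);

   *mask &= *mask - 1;          /* clear the lowest set bit */
   return i;
}

/* Visit every set bit of a 64-bit dirty mask using two 32-bit loops,
 * so the 64-bit shift happens once up front rather than per iteration.
 * Writes the bit indices to out[] (room for 64 entries) in ascending
 * order and returns how many were written. */
static unsigned
collect_dirty(uint64_t dirty, int *out)
{
   uint32_t dirty_lo = (uint32_t) dirty;
   uint32_t dirty_hi = (uint32_t) (dirty >> 32);
   unsigned n = 0;

   while (dirty_lo)
      out[n++] = bit_scan32(&dirty_lo);
   while (dirty_hi)
      out[n++] = 32 + bit_scan32(&dirty_hi);

   return n;
}
```

A single loop over the 64-bit mask would visit the same indices in the same order; the split only changes what the compiler has to do inside the loop body on 32-bit targets.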

BR,
-R


Re: [Mesa-dev] [PATCH shader-db 2/2] run: Mark shaders with only one stage as separable.

2016-07-25 Thread Timothy Arceri
On Mon, 2016-07-25 at 16:54 -0700, Kenneth Graunke wrote:
> There are a couple cases where a single shader might happen:
> 
> - compute shaders
>   (only one stage, no inputs and outputs; separable shouldn't matter)
> - vertex shaders with transform feedback
>   (we want to retain outputs, but transform feedback varyings are
>    specified via the API, not the shader - setting SSO fixes this)

ARB_enhanced_layouts does allow these to be specified in the shader,
although that might be difficult to recognise.

Also, this will retain all varyings, not just xfb varyings. Maybe we
should capture xfb varyings when dumping shaders, as this also doesn't
fix xfb for, say, vs->gs or even vs->fs. Anyway, just a thought; I guess
this patch probably makes things better rather than worse.

> - old shader_test files captured before we started adding "SSO
> ENABLED".
> 
> In any case, it seems harmless or beneficial to enable SSO for all
> .shader_test files containing a single shader.
> 
> Based on a patch by Marek.
> ---
>  run.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/run.c b/run.c
> index 024c8e9..f6ce1bf 100644
> --- a/run.c
> +++ b/run.c
> @@ -633,6 +633,12 @@ main(int argc, char **argv)
>  }
>  ctx_is_core = type == TYPE_CORE;
>  
> +/* If there's only one shader, mark it separable so
> inputs
> + * and outputs aren't eliminated.
> + */
> +if (num_shaders == 1)
> +use_separate_shader_objects = true;
> +
>  if (type == TYPE_CORE || type == TYPE_COMPAT) {
>  GLuint prog = glCreateProgram();
>  


Re: [Mesa-dev] [PATCH shader-db 1/2] run: Add separate shader objects support.

2016-07-25 Thread Timothy Arceri
On Mon, 2016-07-25 at 16:54 -0700, Kenneth Graunke wrote:
> With this patch, if a .shader_test file contains
> 
> [require]
> ...
> SSO ENABLED
> 
> then we'll set GL_PROGRAM_SEPARABLE to compile the shaders into
> separate
> shader objects.  This prevents the linker from removing unused inputs
> and outputs.  Drivers may also choose to lay out interfaces of SSO
> programs differently, resulting in different code.
> 
> v2:
> - Actually initialize use_separate_shader_objects
> - Fix memcmp length parameter (thanks to Matt)
> 
> v3:
> - Search for "SSO ENABLED" instead of
> "GL_ARB_separate_shader_objects",
>   to match what Timothy did in shader_runner.
> - Use GL_PROGRAM_SEPARABLE (suggested by Tapani).  This allows
>   multi-stage SSO programs to optimize internal interfaces, while
>   still making the end-stages separable.

When using SSO ENABLED in shader_runner, each stage is linked as a
separate program; here you are creating multi-stage SSO programs. Unless
you are combining the program with another SSO program, enabling SSO
doesn't seem very useful. Ideally we would capture and combine specific
stages, but that would get complicated, which is why I just linked
everything as separate programs in shader_runner.

> ---
>  run.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/run.c b/run.c
> index 0e6e248..024c8e9 100644
> --- a/run.c
> +++ b/run.c
> @@ -73,12 +73,14 @@ static struct shader *
>  get_shaders(const struct context_info *core, const struct
> context_info *compat,
>  const char *text, size_t text_size,
>  enum shader_type *type, unsigned *num_shaders,
> +bool *use_separate_shader_objects,
>  const char *shader_name)
>  {
>  static const char *req = "[require]";
>  static const char *glsl_req = "\nGLSL >= ";
>  static const char *fp_req = "\nGL_ARB_fragment_program";
>  static const char *vp_req = "\nGL_ARB_vertex_program";
> +static const char *sso_req = "SSO ENABLED";
>  static const char *gs = "geometry shader]\n";
>  static const char *fs = "fragment ";
>  static const char *vs = "vertex ";
> @@ -90,6 +92,8 @@ get_shaders(const struct context_info *core, const
> struct context_info *compat,
>  static const char *test = "test]\n";
>  const char *end_text = text + text_size;
>  
> +*use_separate_shader_objects = false;
> +
>  /* Find the [require] block and parse it first. */
>  text = memmem(text, end_text - text, req, strlen(req)) +
> strlen(req);
>  
> @@ -137,6 +141,9 @@ get_shaders(const struct context_info *core,
> const struct context_info *compat,
>  shader_name, (int)(newline - extension_text),
> extension_text);
>  return NULL;
>  }
> +if (memcmp(extension_text, sso_req, strlen(sso_req)) == 0) {
> +*use_separate_shader_objects = true;
> +}
>  }
>  
>  /* Find the shaders. */
> @@ -606,9 +613,11 @@ main(int argc, char **argv)
>  
>  enum shader_type type;
>  unsigned num_shaders;
> +bool use_separate_shader_objects;
>  struct shader *shader = get_shaders(&core, &compat,
>  text,
> shader_test[i].filesize,
>  &type, &num_shaders,
> +&use_separate_shader
> _objects,
>  current_shader_name)
> ;
>  if (unlikely(shader == NULL)) {
>  continue;
> @@ -627,6 +636,9 @@ main(int argc, char **argv)
>  if (type == TYPE_CORE || type == TYPE_COMPAT) {
>  GLuint prog = glCreateProgram();
>  
> +if (use_separate_shader_objects)
> +glProgramParameteri(prog, GL_PROGRAM_SEPARABLE,
> GL_TRUE);
> +
>  for (unsigned i = 0; i < num_shaders; i++) {
>  GLuint s = glCreateShader(shader[i].type);
>  glShaderSource(s, 1, &shader[i].text,
> &shader[i].length);