Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering

2019-03-12 Thread Bas Nieuwenhuizen
On Tue, Mar 12, 2019 at 9:59 AM Marc-André Lureau wrote: > > Hi > > On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich > wrote: > > > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: > > > Hi, > > > > > > On 1.3.2019 11.12, Michel Dänzer wrote: > > > > On 2019-02-28 8:41 p.m., Marek Olšá

Re: [Mesa-dev] [PATCH v2 01/12] ac: do not force enable IDXEN for 16-bit SSBO loads

2019-03-13 Thread Bas Nieuwenhuizen
NAK. The entire thing about an index being used and possibly still constant 0 (and hence the index being constant 0 is not a sign to use the raw intrinsics) is why we now have both structurized and raw intrinsics. Don't just introduce that mistake again. On Wed, Mar 13, 2019 at 11:47 AM Samue

Re: [Mesa-dev] [PATCH] radv: always initialize HTILE when the src layout is UNDEFINED

2019-03-14 Thread Bas Nieuwenhuizen
r-b On Thu, Mar 14, 2019 at 2:24 PM Samuel Pitoiset wrote: > > HTILE should always be initialized when transitioning from > VK_IMAGE_LAYOUT_UNDEFINED to other image layouts. Otherwise, > if an app does a transition from UNDEFINED to GENERAL, the > driver doesn't initialize HTILE and it tries to d

Re: [Mesa-dev] Building error in android-x86 due to recent mesa commit

2019-03-16 Thread Bas Nieuwenhuizen
Should be fixed when https://gitlab.freedesktop.org/mesa/mesa/merge_requests/456 is merged. On Sat, Mar 16, 2019 at 10:16 PM Mauro Rossi wrote: > > Hi Marek, > > I'm getting the following building error after commit [1] > but I don't understand why. > > Mauro > > external/mesa/src/gallium/drive

Re: [Mesa-dev] [PATCH] ac/nir_to_llvm: add assert to emit_bcsel()

2019-03-17 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen On Sun, Mar 17, 2019 at 11:04 AM Timothy Arceri wrote: > > nir to llvm assumes we have already split vectors to scalars via > nir_lower_alu_to_scalar(). > --- > src/amd/common/ac_nir_to_llvm.c | 2 ++ > 1 file changed, 2 insertions(+) >

Re: [Mesa-dev] [PATCH] radv: fix the NUM_RECORDS field for vertex bindings on GFX6/GFX7

2019-03-18 Thread Bas Nieuwenhuizen
I think this needs to be modified to get the Vulkan out-of-bounds behavior. In particular whether a VS input is out of bounds or not can differ between attributes from the same buffer with the same index, and that is something not handled here. I guess we could fix by only using the offset in the

Re: [Mesa-dev] [PATCH 2/3] radv: remove unnecessary FLUSH_AND_INV_CB when initializing DCC

2019-03-19 Thread Bas Nieuwenhuizen
That it does not use it is exactly why we need to make sure the CB data is not in the CB cache by flushing it? On Tue, Mar 19, 2019 at 12:15 PM Samuel Pitoiset wrote: > > The clear operation (ie. compute) doesn't use the CB caches. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_

Re: [Mesa-dev] [PATCH v3 06/11] ac/nir: use ac_build_buffer_load() for SSBO load operations

2019-03-19 Thread Bas Nieuwenhuizen
On Wed, Mar 13, 2019 at 5:38 PM Samuel Pitoiset wrote: > > Signed-off-by: Samuel Pitoiset > --- > src/amd/common/ac_nir_to_llvm.c | 35 ++--- > 1 file changed, 6 insertions(+), 29 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to

Re: [Mesa-dev] [PATCH v3 11/11] ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()

2019-03-19 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen FYI since the new intrinsics don't merge voffset and soffset anymore, you can remove the tbuffer variants for LLVM8+. On Wed, Mar 13, 2019 at 5:38 PM Samuel Pitoiset wrote: > > New buffer intrinsics have a separate soffset parameter.

Re: [Mesa-dev] [PATCH] ac: use llvm.amdgcn.fract intrinsic for nir_op_ffract

2019-03-19 Thread Bas Nieuwenhuizen
r-b On Tue, Mar 19, 2019 at 11:37 PM Samuel Pitoiset wrote: > > Noticed with a Doom shader. > > 29077 shaders in 15096 tests > Totals: > SGPRS: 1282125 -> 1282133 (0.00 %) > VGPRS: 908716 -> 908616 (-0.01 %) > Spilled SGPRs: 24811 -> 24779 (-0.13 %) > Code Size: 49048176 -> 48936488 (-0.23 %) byt

Re: [Mesa-dev] [PATCH 5/8] ac/nir: implement 8-bit ssbo stores

2019-03-19 Thread Bas Nieuwenhuizen
On Tue, Mar 19, 2019 at 9:28 AM Samuel Pitoiset wrote: > > From: Rhys Perry > > Signed-off-by: Rhys Perry > --- > src/amd/common/ac_nir_to_llvm.c | 9 +++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c >

Re: [Mesa-dev] [PATCH 0/8] radv: VK_KHR_8bit_storage

2019-03-20 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen for the series. On Tue, Mar 19, 2019 at 9:28 AM Samuel Pitoiset wrote: > > Hi, > > This series implements VK_KHR_8bit_storage for RADV. Original work > is from Rhys Perry, I did rebase, update some patches and test. > > Please review, > t

[Mesa-dev] [PATCH 6/8] radeonsi: Add sampling of DCC compressed textures.

2015-09-04 Thread Bas Nieuwenhuizen
The values for resource word 6 have been taken from Catalyst traces. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_descriptors.c | 5 + src/gallium/drivers/radeonsi/si_pipe.h| 1 + src/gallium/drivers/radeonsi/si_state.c | 13 +++-- 3 files

[Mesa-dev] [PATCH 0/8] Add DCC support.

2015-09-04 Thread Bas Nieuwenhuizen
. As for testing, I have run this systemwide for a few days now, and a piglit run, minus a few tests that locked up on the baseline or seemingly gave random results, did not result in regressions. Bas Nieuwenhuizen (8): radeonsi: Allocate buffers for DCC. radeonsi: Add DCC compression tracking machi

[Mesa-dev] [PATCH 7/8] radeonsi: Do not decompress DCC textures for sampling.

2015-09-04 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_blit.c| 18 -- src/gallium/drivers/radeonsi/si_descriptors.c | 2 +- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers

[Mesa-dev] [PATCH 1/8] radeonsi: Allocate buffers for DCC.

2015-09-04 Thread Bas Nieuwenhuizen
As the alignment requirements can be 32 KiB or more, also adding an aligned buffer creation function. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeon/r600_buffer_common.c | 20 +++ src/gallium/drivers/radeon/r600_pipe_common.h | 6 src/gallium/drivers/radeon
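Editorial sketch for the alignment point in the snippet above: the patch body is truncated here, so the following is not the function the patch adds. It only illustrates, under the assumption that the alignment is a nonzero power of two (such as 32 KiB), how an offset or size can be rounded up to that alignment. The name align_up_pot is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: round `offset` up to
 * the next multiple of `alignment`. The bit trick below only works when
 * `alignment` is a nonzero power of two, so that is asserted. */
static inline uint64_t align_up_pot(uint64_t offset, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* Add alignment - 1, then clear the low bits. */
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

With a 32 KiB alignment, an offset that is already a multiple of 32768 is returned unchanged, and any other offset moves up to the next multiple.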

[Mesa-dev] [PATCH 5/8] radeonsi: Invalidate the L2 cache on framebuffer change.

2015-09-04 Thread Bas Nieuwenhuizen
This is needed by DCC when using compressed textures. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_state.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 5c9c866..3e11922

[Mesa-dev] [PATCH 8/8] radeonsi: Add DCC for multisampled textures.

2015-09-04 Thread Bas Nieuwenhuizen
The DCC fast clear for multisampled textures is still disabled as that does not work correctly yet. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeon/r600_texture.c | 3 ++- src/gallium/drivers/radeonsi/si_blit.c | 2 +- src/gallium/winsys/amdgpu/drm/amdgpu_surface.c

[Mesa-dev] [PATCH 4/8] radeonsi: Add DCC fast clear.

2015-09-04 Thread Bas Nieuwenhuizen
We cannot use the clear words from the cmask fast clear, so the fast clears are somewhat limited. The clear patterns have been taken from Catalyst traces. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeon/r600_texture.c | 48 --- 1 file changed, 38

[Mesa-dev] [PATCH 3/8] radeonsi: Enable DCC.

2015-09-04 Thread Bas Nieuwenhuizen
The flags to be enabled in the control registers have been taken from Catalyst traces. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeon/r600_pipe_common.h | 1 + src/gallium/drivers/radeon/r600_texture.c | 2 ++ src/gallium/drivers/radeon/r600d_common.h | 1 + src

[Mesa-dev] [PATCH 2/8] radeonsi: Add DCC compression tracking machinery.

2015-09-04 Thread Bas Nieuwenhuizen
Textures can be sampled without decompression and the fast-clear bits can be erased without decompressing, so add a new set of flags. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeon/r600_pipe_common.h | 2 ++ src/gallium/drivers/radeonsi/cik_sdma.c | 6 -- src

Re: [Mesa-dev] [PATCH 0/8] Add DCC support.

2015-09-04 Thread Bas Nieuwenhuizen
On Friday, September 04, 2015 05:00:47 PM Alex Deucher wrote: > On Fri, Sep 4, 2015 at 3:47 PM, Bas Nieuwenhuizen > > wrote: > > This patch series enables delta color compression (DCC) for Volcanic > > Islands GPUs. This should reduce memory bandwidth to increase

Re: [Mesa-dev] [PATCH 3/8] radeonsi: Enable DCC.

2015-09-24 Thread Bas Nieuwenhuizen
renames that impact this series. On Thursday, September 24, 2015 01:36:31 AM Marek Olšák wrote: > On Fri, Sep 4, 2015 at 9:47 PM, Bas Nieuwenhuizen > > wrote: > > The flags to be enabled in the control registers have been taken from > > Catalyst traces. > > > >

Re: [Mesa-dev] [PATCH 5/8] radeonsi: Invalidate the L2 cache on framebuffer change.

2015-09-24 Thread Bas Nieuwenhuizen
On Thursday, September 24, 2015 02:22:23 AM Marek Olšák wrote: > On Fri, Sep 4, 2015 at 9:47 PM, Bas Nieuwenhuizen > > wrote: > > This is needed by DCC when using compressed textures. > > > > Signed-off-by: Bas Nieuwenhuizen > > --- > > > >

Re: [Mesa-dev] [PATCH 3/8] radeonsi: Enable DCC.

2015-09-24 Thread Bas Nieuwenhuizen
On Thursday, September 24, 2015 07:24:50 PM Marek Olšák wrote: > On Thu, Sep 24, 2015 at 2:15 PM, Bas Nieuwenhuizen > > wrote: > > Hi Marek, > > > > Thanks for the review and my apologies, it seems I underestimated the > > potential for regressions in this se

Re: [Mesa-dev] [PATCH 3/8] radeonsi: Enable DCC.

2015-10-10 Thread Bas Nieuwenhuizen
ppended to the resource of a MSAA buffer. This has the secondary benefit of not needing to reference as many resources for command submission. Yours sincerely, Bas Nieuwenhuizen ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedeskt

[Mesa-dev] [PATCH 3/4] gallium: add global buffer memory barrier bit

2016-04-01 Thread Bas Nieuwenhuizen
Currently radeonsi synchronizes after every dispatch and Clover does nothing to synchronize. This is overzealous, especially with GL compute, so add a barrier for global buffers. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/include/pipe/p_defines.h | 1 + src/gallium

[Mesa-dev] [PATCH 2/4] gallium: add threads per block TGSI property

2016-04-01 Thread Bas Nieuwenhuizen
The value 0 for unknown has been chosen so that drivers using tgsi_scan_shader do not need to detect missing properties if they zero-initialize the struct. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/auxiliary/tgsi/tgsi_strings.c | 3 +++ src/gallium/docs/source/tgsi.rst
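Editorial sketch of the design choice described in the snippet above. The struct and function names are hypothetical stand-ins, not the real tgsi_scan_shader data structures. The idea shown is only this: if 0 means "unknown", a zero-initialized struct already carries the right default, so a consumer needs no separate "property was present" flag.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the scan result; field names are assumptions. */
struct block_size_sketch {
    unsigned width;   /* 0 means "not declared by the shader" */
    unsigned height;
    unsigned depth;
};

/* Zero-initialize: every dimension starts out as "unknown". */
static void block_size_init(struct block_size_sketch *bs)
{
    memset(bs, 0, sizeof(*bs));
}

/* Record a declared block size; a declared dimension is never 0. */
static void block_size_set(struct block_size_sketch *bs,
                           unsigned w, unsigned h, unsigned d)
{
    assert(w > 0 && h > 0 && d > 0);
    bs->width = w;
    bs->height = h;
    bs->depth = d;
}

/* A consumer can test for the property without any extra flag. */
static int block_size_is_fixed(const struct block_size_sketch *bs)
{
    return bs->width != 0;
}
```

A shader that never declares the property leaves the struct untouched after block_size_init, and block_size_is_fixed reports 0 for it.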

[Mesa-dev] [PATCH 1/4] gallium: add compute shader IR type

2016-04-01 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/trace/tr_dump_state.c | 4 +++- src/gallium/include/pipe/p_state.h| 1 + src/gallium/state_trackers/clover/core/kernel.cpp | 1 + src/gallium/tests/trivial/compute.c | 1 + src/mesa/state_tracker

[Mesa-dev] [PATCH 4/4] gallium: distinguish between shader IR in get_compute_param

2016-04-01 Thread Bas Nieuwenhuizen
cally depend on the compiler. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/docs/source/screen.rst| 18 ++--- src/gallium/drivers/ilo/ilo_screen.c | 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.c| 1 + src/gallium/drivers/nouveau/nvc0/nvc0_scr

Re: [Mesa-dev] [PATCH 2/4] gallium: add threads per block TGSI property

2016-04-01 Thread Bas Nieuwenhuizen
I will change that to TGSI_PROPERTY_CS_FIXED_BLOCK_WIDTH etc. since most other properties seem to use S instead of P, unless you have any objections. - Bas On Sat, Apr 2, 2016 at 12:37 AM, Ilia Mirkin wrote: > On Fri, Apr 1, 2016 at 6:32 PM, Bas Nieuwenhuizen > wrote: >> The

[Mesa-dev] [PATCH 00/20] GL compute shaders for radeonsi

2016-04-02 Thread Bas Nieuwenhuizen
b.com/BNieuwenhuizen/llvm - https://github.com/BNieuwenhuizen/mesa Bas Nieuwenhuizen (20): radeonsi: set shader calling conventions radeonsi: lower compute shader arguments radeonsi: add shared memory radeonsi: implement shared memory load/store radeonsi: implement shared atomics rad

[Mesa-dev] [PATCH 04/20] radeonsi: implement shared memory load/store

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 75 +++- 1 file changed, 73 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 2ce37ca..97d4404 100644

[Mesa-dev] [PATCH 07/20] radeonsi: update shader count for compute shaders

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_state.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index 95a69e8..6d9f02e 100644 --- a/src/gallium/drivers/radeonsi

[Mesa-dev] [PATCH 16/20] radeonsi: split setting graphics and compute descriptors

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 3 ++ src/gallium/drivers/radeonsi/si_descriptors.c | 60 ++- src/gallium/drivers/radeonsi/si_state.h | 7 +++- src/gallium/drivers/radeonsi/si_state_draw.c | 2 +- 4 files

[Mesa-dev] [PATCH 01/20] radeonsi: set shader calling conventions

2016-04-02 Thread Bas Nieuwenhuizen
Note that old mesa + new LLVM or new mesa + old LLVM breaks with this change and the corresponding LLVM change (D18559). For LLVM version <= 3.8 we use the old method, but we can't detect people using a post 3.8 svn version that is still too old. Signed-off-by: Bas Nieuwenhuizen

[Mesa-dev] [PATCH 17/20] radeonsi: do not do two full flushes on every compute dispatch

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 17 ++--- src/gallium/drivers/radeonsi/si_state.c | 6 -- 2 files changed, 6 insertions(+), 17 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi

[Mesa-dev] [PATCH 02/20] radeonsi: lower compute shader arguments

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 41 src/gallium/drivers/radeonsi/si_shader.h | 7 ++ 2 files changed, 48 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi

[Mesa-dev] [PATCH 14/20] radeonsi: implement TGSI compute dispatch

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 104 ++ 1 file changed, 77 insertions(+), 27 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c index 74db8d4..64ad2f3

[Mesa-dev] [PATCH 03/20] radeonsi: add shared memory

2016-04-02 Thread Bas Nieuwenhuizen
Declares the shared memory as a global variable so that LLVM is aware of it and it does not conflict with passes like AMDGPUPromoteAlloca. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeon/radeon_llvm.h | 3 ++ .../drivers/radeon/radeon_setup_tgsi_llvm.c| 4

[Mesa-dev] [PATCH 06/20] radeonsi: set maximum work group size based on block size

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 12 1 file changed, 12 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 7c7e9e5..28c7923 100644 --- a/src/gallium/drivers/radeonsi

[Mesa-dev] [PATCH 05/20] radeonsi: implement shared atomics

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 89 +++- 1 file changed, 88 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 97d4404..7c7e9e5 100644

[Mesa-dev] [PATCH 19/20] mesa/st: enable compute shaders if images are also supported

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/mesa/state_tracker/st_extensions.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c index 6c0df8d..7bbe87d 100644 --- a/src/mesa/state_tracker

[Mesa-dev] [PATCH 13/20] radeonsi: only emit compute shader state when switching shaders

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 142 +- src/gallium/drivers/radeonsi/si_pipe.h| 2 + 2 files changed, 85 insertions(+), 59 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium

[Mesa-dev] [PATCH 08/20] radeonsi: implement TGSI compute shader creation

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 72 +++ 1 file changed, 54 insertions(+), 18 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c index 1ec695e..f2b13f0

[Mesa-dev] [PATCH 18/20] radeonsi: clean up compute flush

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_pipe.h | 3 --- src/gallium/drivers/radeonsi/si_state_draw.c | 27 ++- 2 files changed, 10 insertions(+), 20 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_pipe.h b/src/gallium

[Mesa-dev] [PATCH 15/20] radeonsi: split texture decompression for compute shaders

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_blit.c | 13 +++-- src/gallium/drivers/radeonsi/si_compute.c| 2 ++ src/gallium/drivers/radeonsi/si_pipe.h | 3 ++- src/gallium/drivers/radeonsi/si_state_draw.c | 2 +- 4 files changed, 16 insertions

[Mesa-dev] [PATCH 12/20] radeonsi: rework compute scratch buffer

2016-04-02 Thread Bas Nieuwenhuizen
Instead of having a scratch buffer per program, have one per context. Also removed the per kernel wave count calculations, but that only helped if the total number of waves in the dispatch was smaller than sctx->scratch_waves. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeo

[Mesa-dev] [PATCH 10/20] radeonsi: don't pass scratch buffer to user SGPRs

2016-04-02 Thread Bas Nieuwenhuizen
As far as I can see we use relocations for clover too. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 8 1 file changed, 8 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c index 3702e80

[Mesa-dev] [PATCH 09/20] radeonsi: split input upload off from si_launch_grid

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 94 +-- 1 file changed, 53 insertions(+), 41 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c index f2b13f0..3702e80

[Mesa-dev] [PATCH 20/20] radeonsi: enable TGSI support cap for compute shaders

2016-04-02 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- docs/GL3.txt | 4 ++-- docs/relnotes/11.3.0.html | 1 + src/gallium/drivers/radeon/r600_pipe_common.c | 21 - src/gallium/drivers/radeonsi/si_pipe.c| 3 ++- 4 files changed

[Mesa-dev] [PATCH 11/20] radeonsi: do per cs setup for compute shaders once per cs

2016-04-02 Thread Bas Nieuwenhuizen
Also removes PKT3_CONTEXT_CONTROL as that is already being done by si_begin_new_cs, when emitting init_config. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c| 69 +++- src/gallium/drivers/radeonsi/si_hw_context.c | 2 + src/gallium

[Mesa-dev] [PATCH 2/4] radeonsi: use bounded indexing for samplers

2016-04-04 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index dd04748..392f439 100644 --- a/src/gallium/drivers

[Mesa-dev] [PATCH 1/4] radeonsi: use bounded indexing for constant buffers

2016-04-04 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index 28c7923..dd04748 100644 --- a/src/gallium/drivers

[Mesa-dev] [PATCH 4/4] radeonsi: mark ARB_robust_buffer_access_behavior as supported

2016-04-04 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- docs/GL3.txt | 2 +- docs/relnotes/11.3.0.html | 1 + src/gallium/drivers/radeonsi/si_pipe.c | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 6ea8d5c..d7e0a4b

[Mesa-dev] [PATCH 3/4] expose ARB_robust_buffer_access_behavior

2016-04-04 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/docs/source/screen.rst | 4 +++- src/gallium/drivers/freedreno/freedreno_screen.c | 1 + src/gallium/drivers/i915/i915_screen.c | 1 + src/gallium/drivers/ilo/ilo_screen.c | 1 + src/gallium/drivers/llvmpipe

[Mesa-dev] [PATCH 0/4] ARB_robust_buffer_access_behavior for radeonsi

2016-04-04 Thread Bas Nieuwenhuizen
This series implements ARB_robust_buffer_access_behavior for the radeonsi driver. There are some tests at: https://github.com/BNieuwenhuizen/piglit These have not been sent yet as they depend on robust access context support in waffle. Bas Nieuwenhuizen (4): radeonsi: use bounded indexing

Re: [Mesa-dev] [PATCH 13/20] radeonsi: only emit compute shader state when switching shaders

2016-04-04 Thread Bas Nieuwenhuizen
On Mon, Apr 4, 2016 at 7:29 PM, Marek Olšák wrote: > On Sat, Apr 2, 2016 at 3:10 PM, Bas Nieuwenhuizen > wrote: >> Signed-off-by: Bas Nieuwenhuizen >> --- >> src/gallium/drivers/radeonsi/si_compute.c | 142 >> +- >> src/gallium

Re: [Mesa-dev] [PATCH 13/20] radeonsi: only emit compute shader state when switching shaders

2016-04-04 Thread Bas Nieuwenhuizen
On Mon, Apr 4, 2016 at 7:53 PM, Bas Nieuwenhuizen wrote: > On Mon, Apr 4, 2016 at 7:29 PM, Marek Olšák wrote: >> On Sat, Apr 2, 2016 at 3:10 PM, Bas Nieuwenhuizen >> wrote: >>> Signed-off-by: Bas Nieuwenhuizen >>> --- >>> src/g

Re: [Mesa-dev] [PATCH 17/20] radeonsi: do not do two full flushes on every compute dispatch

2016-04-04 Thread Bas Nieuwenhuizen
On Tue, Apr 5, 2016 at 1:18 AM, Marek Olšák wrote: > On Sat, Apr 2, 2016 at 3:11 PM, Bas Nieuwenhuizen > wrote: >> Signed-off-by: Bas Nieuwenhuizen >> --- >> src/gallium/drivers/radeonsi/si_compute.c | 17 ++--- >> src/gallium/drivers/radeonsi/si_sta

Re: [Mesa-dev] [PATCH 05/20] radeonsi: implement shared atomics

2016-04-05 Thread Bas Nieuwenhuizen
On Wed, Apr 6, 2016 at 1:42 AM, Nicolai Hähnle wrote: > On 02.04.2016 08:10, Bas Nieuwenhuizen wrote: >> >> Signed-off-by: Bas Nieuwenhuizen >> --- >> src/gallium/drivers/radeonsi/si_shader.c | 89 >> +++- >> 1 fil

Re: [Mesa-dev] [PATCH 0/7] gallium, radeonsi: raise number of samplers to 32

2016-04-06 Thread Bas Nieuwenhuizen
Hi Nicolai, Patches 1-2 and 5-6 are Reviewed-by: Bas Nieuwenhuizen However, for increasing the limits there are several cases which still use signed shifts (i.e. 1 << ...), which is undefined behavior when shifting into bit 31. mesa/st contains several of those, not sure which need to be updat
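Editorial sketch of the signed-shift concern raised in the snippet above. With a 32-bit int, 1 << 31 shifts into the sign bit and the result is not representable, which the C standard leaves undefined; using an unsigned operand avoids that. The helper names below are hypothetical and are not Mesa functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one-bit mask for a slot in the range 0..31.
 * The operand is unsigned, so shifting into bit 31 is well defined. */
static inline uint32_t slot_bit(unsigned slot)
{
    assert(slot < 32);
    return UINT32_C(1) << slot;
}

/* Hypothetical helper: mask covering slots [0, count), count in 0..32.
 * Shifting a 32-bit value by 32 is also undefined, so that case is
 * handled separately. */
static inline uint32_t slot_range_mask(unsigned count)
{
    assert(count <= 32);
    return count == 32 ? UINT32_MAX : (UINT32_C(1) << count) - 1u;
}
```

Raising a limit from 16 to 32 entries is exactly the situation where both helpers matter: bit 31 becomes reachable, and a "mask of all entries" needs the count == 32 case.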

Re: [Mesa-dev] [PATCH] radeonsi: don't use the real barrier instruction in tess ctrl shaders

2016-04-06 Thread Bas Nieuwenhuizen
This patch is Reviewed-by: Bas Nieuwenhuizen On Thu, Apr 7, 2016 at 2:07 AM, Marek Olšák wrote: > From: Marek Olšák > > --- > src/gallium/drivers/radeonsi/si_shader.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/src/gallium/drivers/radeonsi/si_shad

Re: [Mesa-dev] [PATCH 0/7] gallium, radeonsi: raise number of samplers to 32

2016-04-07 Thread Bas Nieuwenhuizen
On Thu, Apr 7, 2016 at 12:11 PM, Marek Olšák wrote: > On Thu, Apr 7, 2016 at 2:18 AM, Bas Nieuwenhuizen > wrote: >> Hi Nicolai, >> >> Patches 1-2 and 5-6 are >> >> Reviewed-by: Bas Nieuwenhuizen >> >> However, for increasing the limits there ar

[Mesa-dev] [PATCH] radeonsi: Synchronize a streamout write after read hazard.

2016-04-11 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_descriptors.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index cf898fd..a2c096f 100644 --- a/src/gallium/drivers

Re: [Mesa-dev] [PATCH] radeonsi: Synchronize a streamout write after read hazard.

2016-04-11 Thread Bas Nieuwenhuizen
On Mon, Apr 11, 2016 at 7:12 PM, Nicolai Hähnle wrote: > Sounds right to me. Do you have a test case that fails without it? > The synchronization test that I sent to the piglit list has it as a subtest. - Bas ___ mesa-dev mailing list mesa-dev@lists.fr

[Mesa-dev] [PATCH v2 2/3] gallium: Add capability for ARB_robust_buffer_access_behavior

2016-04-12 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/docs/source/screen.rst | 5 + src/gallium/drivers/freedreno/freedreno_screen.c | 1 + src/gallium/drivers/i915/i915_screen.c | 1 + src/gallium/drivers/ilo/ilo_screen.c | 1 + src/gallium/drivers/llvmpipe

[Mesa-dev] [PATCH v2 3/3] radeonsi: Mark ARB_robust_buffer_access_behavior as supported.

2016-04-12 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- docs/GL3.txt | 2 +- docs/relnotes/11.3.0.html | 1 + src/gallium/drivers/radeonsi/si_pipe.c | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 066889a..423cafa

[Mesa-dev] [PATCH v2 1/3] radeonsi: Expose the ARB_robust_buffer_access_behavior extension.

2016-04-12 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/mesa/main/extensions_table.h | 1 + src/mesa/main/mtypes.h | 1 + src/mesa/main/version.c | 2 +- 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h index

Re: [Mesa-dev] [PATCH v2 2/3] gallium: Add capability for ARB_robust_buffer_access_behavior

2016-04-12 Thread Bas Nieuwenhuizen
On Tue, Apr 12, 2016 at 3:56 PM, Roland Scheidegger wrote: > Am 12.04.2016 um 15:12 schrieb Bas Nieuwenhuizen: >> Signed-off-by: Bas Nieuwenhuizen >> --- >> src/gallium/docs/source/screen.rst | 5 + >> src/gallium/drivers/freedreno/freedreno_scr

Re: [Mesa-dev] [PATCH v2 2/3] gallium: Add capability for ARB_robust_buffer_access_behavior

2016-04-12 Thread Bas Nieuwenhuizen
On Tue, Apr 12, 2016 at 6:09 PM, Roland Scheidegger wrote: > Am 12.04.2016 um 16:23 schrieb Bas Nieuwenhuizen: >> On Tue, Apr 12, 2016 at 3:56 PM, Roland Scheidegger >> wrote: >>> Am 12.04.2016 um 15:12 schrieb Bas Nieuwenhuizen: >>>> Signed-off-by: Bas Nieuw

Re: [Mesa-dev] [PATCH] radeonsi: fix bounds check in si_create_vertex_elements

2016-04-12 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen On Tue, Apr 12, 2016 at 7:25 PM, Nicolai Hähnle wrote: > From: Nicolai Hähnle > > This was triggered by > dEQP-GLES3.functional.vertex_array_objects.all_attributes > > Cc: "11.1 11.2" > --- > src/gallium/drivers/radeonsi/si_s

Re: [Mesa-dev] [PATCH 1/2] radeonsi: fix NUM_SGPRS calculation once more

2016-04-13 Thread Bas Nieuwenhuizen
er_select_ps_parts(struct >> si_screen *sscreen, >> return true; >> } >> >> +static void si_fix_num_sgprs(struct si_shader *shader) >> +{ >> + unsigned min_sgprs = shader->info.num_input_sgprs + 2; /* VCC */ >> + >> + if (shader->

Re: [Mesa-dev] [PATCH] radeonsi: gate PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT by LLVM version

2016-04-13 Thread Bas Nieuwenhuizen
Reviewed-by: Bas Nieuwenhuizen On Wed, Apr 13, 2016 at 4:16 PM, Nicolai Hähnle wrote: > From: Nicolai Hähnle > > Otherwise we incorrectly claim ARB_ssbo support even with older LLVM versions. > --- > src/gallium/drivers/radeonsi/si_pipe.c | 3 ++- > 1 file changed, 2 inserti

Re: [Mesa-dev] [PATCH] radeonsi: don't overwrite the scratch offset in shader prologs

2016-04-13 Thread Bas Nieuwenhuizen
If I understand correctly, you want to pass the scratch register from the prolog to the main shader? If so, I don't think this will work correctly, as this will shift all prolog outputs after the copied input sgprs one sgpr up and puts the scratch offset in the middle. In the main shader we don't

Re: [Mesa-dev] [PATCH] radeonsi: don't overwrite the scratch offset in shader prologs

2016-04-13 Thread Bas Nieuwenhuizen
On Wed, Apr 13, 2016 at 7:13 PM, Marek Olšák wrote: > On Wed, Apr 13, 2016 at 6:18 PM, Bas Nieuwenhuizen > wrote: >> If I understand correctly, you want to pass the scratch register from >> the prolog to the main shader? >> >> If so, I don't think this will wor

[Mesa-dev] [PATCH v2 02/20] radeonsi: add shared memory

2016-04-13 Thread Bas Nieuwenhuizen
Declares the shared memory as a global variable so that LLVM is aware of it and it does not conflict with passes like AMDGPUPromoteAlloca. v2: - Use ctx->i8. - Dropped null-check for declare_memory_region. - Changed memory region array to single region. Signed-off-by: Bas Nieuwenhui

[Mesa-dev] [PATCH v2 01/20] radeonsi: lower compute shader arguments

2016-04-13 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_shader.c | 41 src/gallium/drivers/radeonsi/si_shader.h | 7 ++ 2 files changed, 48 insertions(+) diff --git a/src/gallium/drivers

[Mesa-dev] [PATCH v2 10/20] radeonsi: do per cs setup for compute shaders once per cs

2016-04-13 Thread Bas Nieuwenhuizen
Also removes PKT3_CONTEXT_CONTROL as that is already being done by si_begin_new_cs, when emitting init_config. v2: - Use radeon_set_sh_reg_seq. - Also set COMPUTE_STATIC_THREAD_MGMT_SE2 / SE3 for CIK+ Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Nicolai Hähnle --- src/gallium/drivers

[Mesa-dev] [PATCH v2 06/20] radeonsi: update shader count for compute shaders

2016-04-13 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_state.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_state.h b/src/gallium/drivers/radeonsi/si_state.h index

[Mesa-dev] [PATCH v2 15/20] radeonsi: split texture decompression for compute shaders

2016-04-13 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_blit.c | 13 +++-- src/gallium/drivers/radeonsi/si_compute.c| 2 ++ src/gallium/drivers/radeonsi/si_pipe.h | 3 ++- src/gallium/drivers/radeonsi

[Mesa-dev] [PATCH v2 08/20] radeonsi: split input upload off from si_launch_grid

2016-04-13 Thread Bas Nieuwenhuizen
input_size is 0, as it contains grid parameters. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 93 +-- 1 file changed, 52 insertions(+), 41 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi

[Mesa-dev] [PATCH v2 16/20] radeonsi: split setting graphics and compute descriptors

2016-04-13 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_compute.c | 3 ++ src/gallium/drivers/radeonsi/si_descriptors.c | 61 ++- src/gallium/drivers/radeonsi/si_state.h | 7 ++- src/gallium

[Mesa-dev] [PATCH v2 19/20] mesa/st: enable compute shaders if images are also supported

2016-04-13 Thread Bas Nieuwenhuizen
v2: Also depend on atomic counters. Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Nicolai Hähnle --- src/mesa/state_tracker/st_extensions.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker

[Mesa-dev] [PATCH v2 18/20] radeonsi: clean up compute flush

2016-04-13 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_pipe.h | 3 --- src/gallium/drivers/radeonsi/si_state_draw.c | 27 ++- 2 files changed, 10 insertions(+), 20 deletions(-) diff --git a/src

[Mesa-dev] [PATCH v2 04/20] radeonsi: implement shared atomics

2016-04-13 Thread Bas Nieuwenhuizen
v2: - Use single region - Use get_memory_ptr Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 77 +++- 1 file changed, 76 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers

[Mesa-dev] [PATCH v2 03/20] radeonsi: implement shared memory load/store

2016-04-13 Thread Bas Nieuwenhuizen
v2: - Use single region - Combine address calculation Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_shader.c | 84 +++- 1 file changed, 82 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium

[Mesa-dev] [PATCH v2 07/20] radeonsi: implement TGSI compute shader creation

2016-04-13 Thread Bas Nieuwenhuizen
v2: Moved scratch_enabled initialization after compile. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 74 +++ 1 file changed, 56 insertions(+), 18 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium

[Mesa-dev] [PATCH v2 17/20] radeonsi: do not do two full flushes on every compute dispatch

2016-04-13 Thread Bas Nieuwenhuizen
ff-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 17 ++--- src/gallium/drivers/radeonsi/si_cp_dma.c | 6 -- src/gallium/drivers/radeonsi/si_descriptors.c | 3 ++- src/gallium/drivers/radeonsi/si_state.c | 6 -- 4 files changed, 12 inser

[Mesa-dev] [PATCH v2 20/20] radeonsi: enable TGSI support cap for compute shaders

2016-04-13 Thread Bas Nieuwenhuizen
v2: Use chip_class instead of family. Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Nicolai Hähnle --- docs/GL3.txt | 4 ++-- docs/relnotes/11.3.0.html | 1 + src/gallium/drivers/radeon/r600_pipe_common.c | 21 - src

[Mesa-dev] [PATCH v2 12/20] radeonsi: only emit compute shader state when switching shaders

2016-04-13 Thread Bas Nieuwenhuizen
v2: - Do check if anything changed earlier - Use emitted_program instead of emitted_bo to prevent shaders with shader->bo = NULL confusing the check - Use radeon_set_sh_reg* Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c |

[Mesa-dev] [PATCH v2 13/20] radeonsi: implement TGSI compute dispatch

2016-04-13 Thread Bas Nieuwenhuizen
v2: - Use radeon_set_sh_reg_seq. - Set predicate bit for conditional rendering. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 104 ++ 1 file changed, 77 insertions(+), 27 deletions(-) diff --git a/src/gallium/drivers/radeonsi

[Mesa-dev] [PATCH v2 11/20] radeonsi: rework compute scratch buffer

2016-04-13 Thread Bas Nieuwenhuizen
Instead of having a scratch buffer per program, have one per context. This also removes the per-kernel wave count calculations, which only helped if the total number of waves in the dispatch was smaller than sctx->scratch_waves. v2: Fix style issue. Signed-off-by: Bas Nieuwenhuizen Reviewed
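
As a rough sketch of the "one scratch buffer per context" idea, assuming the shared buffer is simply grown whenever a program needs more space than is currently allocated. The names are hypothetical, plain heap memory stands in for a GPU buffer, and the actual patch may size or allocate the buffer differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical context holding a single scratch buffer shared by all
 * compute programs; not the radeonsi struct. */
struct sketch_scratch_ctx {
    void *scratch;
    size_t scratch_size;
};

/* Make sure the shared scratch buffer holds at least `needed` bytes.
 * Returns 0 on success and -1 if the allocation fails. */
static int sketch_ensure_scratch(struct sketch_scratch_ctx *ctx, size_t needed)
{
    assert(ctx != NULL);

    /* The existing buffer is big enough: reuse it as-is. */
    if (needed <= ctx->scratch_size)
        return 0;

    /* Otherwise grow it; the old buffer is kept if realloc fails. */
    void *grown = realloc(ctx->scratch, needed);
    if (grown == NULL)
        return -1;

    ctx->scratch = grown;
    ctx->scratch_size = needed;
    return 0;
}
```

The buffer only ever grows in this sketch, so switching between programs with different scratch needs does not cause repeated reallocations.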

[Mesa-dev] [PATCH v2 05/20] radeonsi: set maximum work group size based on block size

2016-04-13 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_shader.c | 12 1 file changed, 12 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index

[Mesa-dev] [PATCH v2 14/20] radeonsi: update predicate condition for compute dispatches

2016-04-13 Thread Bas Nieuwenhuizen
Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeonsi/si_compute.c | 6 ++ src/gallium/drivers/radeonsi/si_pipe.h| 9 + 2 files changed, 15 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c index 6a4db3a

[Mesa-dev] [PATCH v2 09/20] radeonsi: don't pass scratch buffer to user SGPRs

2016-04-13 Thread Bas Nieuwenhuizen
As far as I can see we use relocations for clover too. Signed-off-by: Bas Nieuwenhuizen Reviewed-by: Marek Olšák Reviewed-by: Nicolai Hähnle --- src/gallium/drivers/radeonsi/si_compute.c | 8 1 file changed, 8 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_compute.c b

[Mesa-dev] [PATCH 01/13] gallium/radeon: move ring_type into winsyses

2016-04-13 Thread Bas Nieuwenhuizen
From: Marek Olšák Not used by drivers. --- src/gallium/drivers/radeon/radeon_winsys.h| 1 - src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 8 src/gallium/winsys/amdgpu/drm/amdgpu_cs.h | 1 + src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 10 +- src/gallium/winsys/ra

[Mesa-dev] [PATCH 00/13] Use the constant engine in radeonsi

2016-04-13 Thread Bas Nieuwenhuizen
ptors. Bas Nieuwenhuizen (10): winsys/amdgpu: Enlarge const IB size. radeonsi: Create CE IB. radeonsi: Add dirty_mask to descriptor list. radeonsi: Add CE packet definitions. radeonsi: Add CE synchronization. radeonsi: Allocate chunks of CE ram. radeonsi: Add CE uploader. radeonsi

[Mesa-dev] [PATCH 04/13] winsys/amdgpu: Enlarge const IB size.

2016-04-13 Thread Bas Nieuwenhuizen
Necessary to prevent performance regressions due to extra flushing. Probably should enlarge it even further when also updating uniforms through the CE, but this seems large enough for now. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/winsys/amdgpu/drm/amdgpu_cs.c | 19

[Mesa-dev] [PATCH 05/13] radeonsi: Create CE IB.

2016-04-13 Thread Bas Nieuwenhuizen
Based on work by Marek Olšák. Signed-off-by: Bas Nieuwenhuizen --- src/gallium/drivers/radeon/r600_pipe_common.c | 1 + src/gallium/drivers/radeon/r600_pipe_common.h | 1 + src/gallium/drivers/radeonsi/si_hw_context.c | 4 +++- src/gallium/drivers/radeonsi/si_pipe.c| 7 +++ src
