Not sure whether I'd reviewed the first series, but patches 2-4 are
Reviewed-by: Bas Nieuwenhuizen
On Sat, Jul 1, 2017 at 4:56 AM, Connor Abbott wrote:
> From: Connor Abbott
>
> We implement the split opcodes, and tell NIR to lower the original ones.
> The lowering to LLVM
Hi Nicolai,
Can we use LLVMValueRef instead of int for the shader_abi? That way we
don't force the values to be function parameters. I don't think the
shared code should have to know about that, and it is more flexible
when we want to pass those slightly differently between the two
drivers, as we
According to Nicolai the SX can already start work when all
the position exports are done, so do those first.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/common/ac_nir_to_llvm.c | 109
1 file changed, 54 insertions(+), 55 deletions(-)
diff --git a/src
Series is
Reviewed-by: Bas Nieuwenhuizen
On Thu, Jul 6, 2017 at 4:09 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This just modifies the API to make it easier to add other flags
> to target machine creation.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/com
On Thu, Jul 6, 2017 at 9:48 PM, Connor Abbott wrote:
> From: Connor Abbott
>
> The old way was very TGSI-based, and couldn't handle indirect
> dereferences at all. Instead, pass through the type information NIR has
I think the old code should handle indirect derefs just fine? See the
indir_index
Reviewed-by: Bas Nieuwenhuizen
On Thu, Jul 6, 2017 at 9:48 PM, Connor Abbott wrote:
> From: Connor Abbott
>
> While normally we give variables whose name field is NULL a temporary
> name when called from nir_print_shader(), when we were calling from
> nir_print_instr() we
Patches 3-4 look technically correct to me, so for just using it for shared vars
Reviewed-by: Bas Nieuwenhuizen
On Thu, Jul 6, 2017 at 9:48 PM, Connor Abbott wrote:
> From: Connor Abbott
>
> Similar to before, do the direct NIR->LLVM translation instead of
> lowering to an arr
Reviewed-by: Bas Nieuwenhuizen
On Thu, Jul 6, 2017 at 9:50 PM, Connor Abbott wrote:
> From: Connor Abbott
>
> This makes the radv shader pipeline much closer to brw_preprocess_nir().
> The main changes are:
>
> - Now we call nir_split_var_copies(), which is necessary for
>
Reviewed-by: Bas Nieuwenhuizen
On Fri, Jul 7, 2017 at 12:10 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> Since radv uses compute rings and we can't know when we are setting
> up the shaders what ring they are to be used on, we should just use
> the default xnack
On Thu, Jul 6, 2017 at 9:50 PM, Connor Abbott wrote:
> From: Connor Abbott
>
> Radeonsi doesn't either. As of the last commit, these should be handled
> properly as long as LLVM has scratch support. We also should use
> nir_lower_io_to_temporaries() for inputs instead of generating an
> if-ladder
Thanks! Pushed and cc'd it to stable.
Not pushing the first patch as I assume that is superseded by Connor's patches.
On Fri, Jun 30, 2017 at 12:15 PM, Alex Smith
wrote:
> The NIR parameters are ordered "compare, data", matching GLSL, but both
> the image and buffer LLVM intrinsics take them the
Figured out the clear value when we have a combined depth stencil
surface.
Signed-off-by: Bas Nieuwenhuizen
---
src/amd/vulkan/radv_meta_clear.c | 16 +++-
1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/src/amd/vulkan/radv_meta_clear.c b/src/amd/vulkan
*max_alignment = addrGetMaxAlignmentsOutput.baseAlign;
> + fprintf(stderr, "max alignment is %d\n",
> *max_alignment);
With this printf removed, this patch is
Reviewed-by: Bas Nieuwenhuizen
> + }
> + }
>
nt.width,
> - iview->extent.height,
> - iview->extent.depth,
> - iview->descriptor,
> - iview->fmask_descriptor);
> - si_set_mutable_
ype when writing descriptor sets.
>
> v2: Generate storage descriptors for images with TRANSFER_DST, since
> those may be used as storage images internally.
>
> Signed-off-by: Alex Smith
> Reviewed-by: Bas Nieuwenhuizen
> ---
> src/amd/vulkan/radv_descriptor_set.c |
Thanks, pushed.
On Wed, Jul 12, 2017 at 12:14 PM, Alex Smith
wrote:
> This free was left in after dynamic descriptors were changed to not be
> allocated separately from the descriptor set, and can cause a crash.
>
> Fixes: 39644fa40a3 ("radv: Don't allocate dynamic descriptors separately")
> Sign
If the app does not plan to put a buffer or image in it
(why? But it is allowed and CTS does it), they do not need to
allocate it with the dedicated allocation struct.
Fixes: a639d40f133 "radv: add support for local bos. (v3)"
---
src/amd/vulkan/radv_device.c | 4 +++-
1 file changed, 3 insertions(
Fixes: 93b4cb61eb2 "spirv: Allow OpPtrAccessChain for block indices"
---
src/amd/common/ac_nir_to_llvm.c | 14 ++
1 file changed, 14 insertions(+)
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index c33408491a9..318c248a985 100644
--- a/src/amd/common/
To support the reindex intrinsic, we need the result to be
something on which we can adjust the index/address.
Since it is all within a basic block, the compiler should be
able to merge any extra loads.
---
src/amd/common/ac_nir_to_llvm.c | 14 +++---
1 file changed, 11 insertions(+), 3 d
Reviewed-by: Bas Nieuwenhuizen
for the series.
On Tue, Dec 12, 2017 at 6:10 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/common/ac_llvm_build.c| 9 +
> src/amd/common/ac_llvm_build.h| 2 ++
>
We never supported it. Missed during copy and pasting.
Fixes: 17201a2eb0b "radv: port to using updated anv entrypoint/extension
generator."
---
src/amd/vulkan/radv_extensions.py | 1 -
1 file changed, 1 deletion(-)
diff --git a/src/amd/vulkan/radv_extensions.py
b/src/amd/vulkan/radv_extensions
return V_028710_SPI_SHADER_UINT16_ABGR;
> + } else {
> + return V_028710_SPI_SHADER_ZERO;
> + }
> +}
I'm not a fan of having this function in two places. Can we export the
format from the compiler to radv, or the other way around?
Otherwise,
Reviewed-by: Bas Nieuwen
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 14, 2017 at 5:32 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/common/ac_nir_to_llvm.c | 3 ++-
> src/amd/common/ac_shader_info.c | 3 +++
> src/amd/common/ac_shader_info.h | 1 +
> src/amd/vulkan/ra
case nir_intrinsic_load_num_work_groups:
> info->cs.uses_grid_size = true;
> break;
> + case nir_intrinsic_load_work_group_id: {
> + unsigned mask = nir_ssa_def_components_read(&instr->dest.ssa);
Nice find that there is a utility function for t
Reviewed-by: Bas Nieuwenhuizen
for the series.
On Thu, Dec 14, 2017 at 4:48 PM, Samuel Pitoiset
wrote:
> We should also not load the input SGPRs and VGPRS, but
> let's start with this for now.
>
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_shader.c | 1
Reviewed-by: Bas Nieuwenhuizen
Would it make sense to move the compute_resource_limits calculation to
pipeline creation time?
On Thu, Dec 14, 2017 at 3:51 PM, Samuel Pitoiset
wrote:
> Ported from RadeonSI.
>
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_cm
Reviewed-by: Bas Nieuwenhuizen
for the series.
On Thu, Dec 14, 2017 at 1:51 PM, Samuel Pitoiset
wrote:
> ac_shader_util.c will contain shader helpers for RadeonSI
> and RADV.
>
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/Makefile.sources| 5 -
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 14, 2017 at 12:51 PM, Samuel Pitoiset
wrote:
> The number of grid components is always 3 when gl_NumWorkGroups
> is declared, because it relies on the number of components of
> nir_intrinsic_load_num_work_groups.
>
> Signed-off-by:
Reviewed-by: Bas Nieuwenhuizen
On Fri, Dec 15, 2017 at 6:54 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_private.h | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_private.h b/src/amd/vulkan/radv
---
src/intel/vulkan/anv_device.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c
index 59767c2..4638f311dd1 100644
--- a/src/intel/vulkan/anv_device.c
+++ b/src/intel/vulkan/anv_device.c
@@ -741,8 +741,6 @@ void anv_GetPhysica
For the radv dependencies on syncobj signal/reset.
---
configure.ac | 2 +-
meson.build | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/configure.ac b/configure.ac
index a4564d23f4c..138459c6f79 100644
--- a/configure.ac
+++ b/configure.ac
@@ -74,7 +74,7 @@ AC_SUBST([OPENCL
---
src/amd/vulkan/radv_device.c | 113 --
src/amd/vulkan/radv_private.h | 6 ++-
src/amd/vulkan/radv_wsi.c | 5 ++
3 files changed, 109 insertions(+), 15 deletions(-)
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 7
---
src/amd/vulkan/radv_radeon_winsys.h | 4 +++
src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 40 +++
2 files changed, 44 insertions(+)
diff --git a/src/amd/vulkan/radv_radeon_winsys.h
b/src/amd/vulkan/radv_radeon_winsys.h
index 2b815d9c5a9..e851c3edf86 1006
---
src/amd/vulkan/radv_device.c | 47 +++
src/amd/vulkan/radv_extensions.py | 1 +
2 files changed, 48 insertions(+)
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index fc9fb59f991..94562fda875 100644
--- a/src/amd/vulkan/radv_
First amdgpu bump after inclusion was 20 (which was done for local BOs).
---
src/amd/common/ac_gpu_info.c | 1 +
src/amd/common/ac_gpu_info.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/src/amd/common/ac_gpu_info.c b/src/amd/common/ac_gpu_info.c
index 0576dd369cf..c042bb229ce 100644
---
---
src/amd/vulkan/radv_device.c | 20
src/amd/vulkan/radv_extensions.py | 2 ++
2 files changed, 22 insertions(+)
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 94562fda875..a4ec912ff2c 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src
Reviewed-by: Bas Nieuwenhuizen
On Fri, Dec 15, 2017 at 4:01 PM, Samuel Pitoiset
wrote:
> This reverts commit 2294d35b243dee15af15895e876a63b7d22e48cc.
>
> We can't do this without adjusting the input SGPRs/VGPRs logic.
> For now, just revert it. I will send a proper solution la
Reviewed-by: Bas Nieuwenhuizen
On Fri, Dec 15, 2017 at 3:37 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/common/ac_shader_util.c | 27 +
> src/amd/common/ac_shader_util.h | 6 +
>
Reviewed-by: Bas Nieuwenhuizen
On Fri, Dec 15, 2017 at 3:37 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/common/ac_shader_util.c | 35
> +
> src/amd/common/ac_shader_util.h | 3 +++
>
We did not set the layer correctly for the dst, as we would keep
using the base layer. Same for the source image.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102710
CC:
---
src/amd/vulkan/radv_meta_blit.c | 49 -
1 file changed, 24 insertions(+)
Reviewed-by: Bas Nieuwenhuizen
On Mon, Dec 18, 2017 at 6:08 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> anv merges the tess info correctly, but radv wasn't doing this.
>
> This fixes hangs in
> dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw
>
Reviewed-by: Bas Nieuwenhuizen
for the series.
On Mon, Dec 18, 2017 at 7:55 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This wasn't calculating the correct value, this along with
> a nir patch fixes a regression in:
> dEQP-VK.tessellation.shader_input_output.barrier
>
Passes dEQP-VK.*.sync_fd.*
---
src/amd/vulkan/radv_device.c | 19 +++
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index 0c31bfb9b44..51488285b09 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vu
---
src/amd/vulkan/radv_radeon_winsys.h | 5 +
src/amd/vulkan/winsys/amdgpu/radv_amdgpu_cs.c | 21 +
2 files changed, 26 insertions(+)
diff --git a/src/amd/vulkan/radv_radeon_winsys.h
b/src/amd/vulkan/radv_radeon_winsys.h
index e851c3edf86..45f58f063a4 100644
-
---
src/amd/vulkan/radv_device.c | 115 ---
1 file changed, 87 insertions(+), 28 deletions(-)
diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c
index a4ec912ff2c..0c31bfb9b44 100644
--- a/src/amd/vulkan/radv_device.c
+++ b/src/amd/vul
It uses slightly more memory (though still bounded by the number
of mapped ranges), but gives less quadratic behavior.
Cuts 4 minutes from the runtime of the CTS *.sparse.* tests.
---
src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c | 45 ++-
1 file changed, 24 insertions(+),
Reviewed-by: Bas Nieuwenhuizen
for the series.
On Mon, Dec 18, 2017 at 7:38 PM, Samuel Pitoiset
wrote:
> They are dummy objects but the spec requires layout to not be
> NULL, this just makes sure we are creating valid pipeline layout
> objects. This will allow us to remove some usele
Yep, funny that it did not hang during my testing.
Thanks!
On Tue, Dec 19, 2017 at 7:50 PM, Eric Engestrom
wrote:
> On Tuesday, 2017-12-19 09:02:57 +0100, Bas Nieuwenhuizen wrote:
>> It uses slightly more memory (though still bounded by the number
>> of mapped ranges), but gives
Reviewed-by: Bas Nieuwenhuizen
On Wed, Dec 20, 2017 at 8:57 PM, Samuel Pitoiset
wrote:
> Found by inspection.
>
> Cc: 17.3
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_pipeline.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>
Reviewed-by: Bas Nieuwenhuizen
On Wed, Dec 20, 2017 at 9:07 PM, Samuel Pitoiset
wrote:
> This reverts commit ff0f17da1446e7aa965e06c04a6ad5a55d95463d.
>
> See the TODO.
>
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_image.c | 17 +++--
Nice catch!
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 21, 2017 at 5:05 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This fixes vmfaults seen on vega with:
> dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_single_sample_.128_128_1.samples_1
>
> These
Assuming you tested with vega,
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 21, 2017 at 5:45 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_image.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/amd/vul
Reviewed-by: Bas Nieuwenhuizen
for the series.
On Thu, Dec 21, 2017 at 5:53 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/common/ac_nir_to_llvm.c | 64 +++-
> src/amd/common/ac_shader_util.c
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 21, 2017 at 2:50 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> On GFX9 we must access 3D textures with 3D samplers AFAICS.
>
> This fixes:
> dEQP-VK.api.image_clearing.core.clear_color_image.3d.single_layer
>
> on GFX9 f
On Thu, Dec 21, 2017 at 2:50 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> looking at traces I noticed we'd set slice_max too large sometimes.
Too small? Otherwise patches 1, 3 and 4 are also
Reviewed-by: Bas Nieuwenhuizen
>
> This should fix it.
>
> Signed-off-by: Dav
e_level_layer(dest_image,
> + dest_image_layout,
>
> &pRegions[r].dstSubresource);
>
> /* for DCC */
> @@ -429,7 +440,9 @@ void radv_CmdCopyImage(
> RADV_FROM_H
DCC was disabled when the image format is !!supported, which is one ! too many.
Ironically the commit that introduced it was supposed to lead to more DCC use
...
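
To make the extra negation concrete, here is a minimal C sketch. It is not the radv code: `format_supported`, `dcc_enabled_buggy` and `dcc_enabled_fixed` are hypothetical names, and the "only format 1 is supported" rule is an assumption made purely for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the real format check: pretend that only
 * format id 1 can be DCC compressed. */
static bool format_supported(int format)
{
	return format == 1;
}

/* Buggy shape: "!!x" is the same as "x", so DCC ends up disabled
 * exactly when the format IS supported. */
static bool dcc_enabled_buggy(int format)
{
	bool enabled = true;
	if (!!format_supported(format))
		enabled = false;
	return enabled;
}

/* Intended shape: disable DCC only when the format is NOT supported. */
static bool dcc_enabled_fixed(int format)
{
	bool enabled = true;
	if (!format_supported(format))
		enabled = false;
	return enabled;
}
```

With the buggy variant the supported format loses DCC and the unsupported one keeps it, which is the inverse of what was intended.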
Fixes: 969537d9358 "radv: Add support for more DCC compression with
VK_KHR_image_format_list."
---
src/amd/vulkan/radv_image.c | 2 +
6:29 PM, Marek Olšák wrote:
> Does this mean that radeonsi shouldn't use amdgpu_cs_syncobj_wait on older
> DRM?
>
> Does it make sense to have separate has_syncobj and has_syncobj_wait flags?
>
> Marek
>
> On Sun, Dec 17, 2017 at 1:11 AM, Bas Nieuwenhuizen
> wro
The position starts at (dst.x, dst.y), so if we want the source to
start at (src.x, src.y), we have to offset by (src.x-dst.x,src.y-dst.y).
Haven't tested that this fixed anything yet, but found by inspection.
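
The arithmetic can be sketched in C as follows. This is only an illustration under the assumption stated in the message (positions start at the destination offset); `struct offset2d` and `resolve_src_coord` are hypothetical names, not the helpers used in radv.

```c
/* Hypothetical 2D integer offset, for illustration only. */
struct offset2d {
	int x;
	int y;
};

/* Map a position in destination space to the matching source texel.
 * Assumes positions start at (dst.x, dst.y), so the source coordinate
 * is the position shifted by (src.x - dst.x, src.y - dst.y). */
static struct offset2d resolve_src_coord(struct offset2d pos,
                                         struct offset2d src,
                                         struct offset2d dst)
{
	struct offset2d delta = { src.x - dst.x, src.y - dst.y };
	struct offset2d coord = { pos.x + delta.x, pos.y + delta.y };
	return coord;
}
```

For example, with src = (3, 4) and dst = (10, 20), the first destination pixel (10, 20) reads from source texel (3, 4).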
Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders"
---
src/a
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
src/amd/vulkan/radv_meta_resolve_cs.c | 8
src/amd/vulkan/radv_meta_resolve_fs.c | 10 ++
2 files changed, 18 insertions(+)
diff --git a/src/amd/vulkan/radv_meta_resolve_cs.c
b/src/amd/vulkan/radv_meta
HW resolve does not support it either.
---
src/amd/vulkan/radv_meta_resolve.c | 9 -
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/src/amd/vulkan/radv_meta_resolve.c
b/src/amd/vulkan/radv_meta_resolve.c
index e73a950ab7c..26489b7834f 100644
--- a/src/amd/vulkan/radv_meta_r
If the destination has DCC, we will use the FS resolve.
---
src/amd/vulkan/radv_meta_resolve_cs.c | 5 +
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/src/amd/vulkan/radv_meta_resolve_cs.c
b/src/amd/vulkan/radv_meta_resolve_cs.c
index 5b6cea6c103..7c569aa9202 100644
--- a/src/a
The samples_identical instruction returns 0 if they are different, so
we have to do the extra work if the result is 0, not if it is != 0.
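
A CPU-side C model of the corrected condition might look like the sketch below. It is an assumption-laden illustration, not the NIR builder code from radv_meta.c: `samples_identical` here is a plain function standing in for the instruction, and averaging is assumed to be the "extra work".

```c
/* Hypothetical model of the query: nonzero when all samples of a pixel
 * hold the same value, 0 when at least one differs. */
static int samples_identical(const float *samples, int count)
{
	for (int i = 1; i < count; i++) {
		if (samples[i] != samples[0])
			return 0;
	}
	return 1;
}

/* Resolve one pixel: take sample 0 as-is when all samples match, and
 * only average all samples when the query returns 0. */
static float resolve_pixel(const float *samples, int count)
{
	float result = samples[0];

	if (samples_identical(samples, count) == 0) {
		float sum = 0.0f;
		for (int i = 0; i < count; i++)
			sum += samples[i];
		result = sum / (float)count;
	}
	return result;
}
```

Testing for `!= 0` instead would skip the averaging precisely for the pixels that need it.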
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
src/amd/vulkan/radv_meta.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
d
Reviewed-by: Bas Nieuwenhuizen
for this series.
On Tue, Dec 26, 2017 at 11:19 PM, Dave Airlie wrote:
> From: Dave Airlie
>
> These are just taken from amdvlk, we probably knew these already,
> but may as well port them now.
>
> Signed-off-by: Dave Airlie
> ---
>
Reviewed-by: Bas Nieuwenhuizen
On Wed, Dec 27, 2017 at 2:24 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> This needs to correspond to the bit depth of the Z plane.
>
> noticed in passing reading amdvlk.
>
> Signed-off-by: Dave Airlie
> ---
> src/amd/vulkan/radv
Reviewed-by: Bas Nieuwenhuizen
On Wed, Dec 27, 2017 at 8:04 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> For copies the texture unit needs to know the depth format so
> it can read the htile data properly.
>
>
On Wed, Dec 27, 2017 at 5:25 AM, Dieter Nützel wrote:
> Am 27.12.2017 01:20, schrieb Bas Nieuwenhuizen:
>>
>> The position starts at (dst.x, dst.y), so if we want the source to
>> start at (src.x, src.y), we have to offset by (src.x-dst.x,src.y-dst.y).
>>
>> Haven&
Framebuffer is from 0,0, not (dst.x, dst.y).
Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders"
---
src/amd/vulkan/radv_meta_resolve_fs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_meta_resolve_fs.c
b/src/amd/vulkan/rad
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 28, 2017 at 12:47 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> We destroy the pools but don't free the container.
>
> This fixes:
> dEQP-VK.wsi.xlib.swapchain.simulate_oom*
>
> Fixes: d50937f137 (vulkan/wsi: Implement
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 28, 2017 at 7:29 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> The event emission wasn't sending the correct packet for gfx8 compute
> queues, which explains why it works on vega fine.
>
> This fixes the mpv vulkan hang.
>
Please add a fixes tag.
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 28, 2017 at 7:33 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> It's legal to do a pipeline stat query on a compute queue,
> but we'd emit the wrong packet here. This should fix it to emit
> the correct
Reviewed-by: Bas Nieuwenhuizen
On Thu, Dec 28, 2017 at 7:14 AM, Dave Airlie wrote:
> From: Dave Airlie
>
> These seem mildly unstable on vega, crashing CTS in various fun ways,
> and look like they are leaking memory.
>
> Disable for now, but leave the option to enable them.
>
On Thu, Dec 28, 2017 at 3:54 PM, Marek Olšák wrote:
> On Thu, Dec 28, 2017 at 12:29 PM, Konstantin Kharlamov
> wrote:
>> I'm wondering, how is r600g different in that regard? I tried wiring up the
>> code into evergreen_do_fast_color_clear(), both in this state and by using
>> 256*256 — however
For fast clear eliminate and decompressions, we always use the most compressed
format.
For clears, the code already creates a renderpass on demand with the exact same
layout as specified.
Otherwise we start distinguishing between GENERAL and TRANSFER_DST_OPTIMAL.
---
src/amd/vulkan/radv_meta_bli
If both source and destination are DCC compressed, and their formats
are not compatible, we need to decompress one of them to make
sure we can do reinterpretation (which needs src format == dst format).
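
The condition described above could be sketched in C like this. Everything here is hypothetical: `struct image_info`, `formats_compatible` (which simply treats only identical formats as compatible) and `need_decompress_for_copy` are illustrative names, not the code in radv_meta_copy.c.

```c
#include <stdbool.h>

/* Hypothetical, minimal view of an image for this sketch. */
struct image_info {
	bool dcc_compressed;
	int format;
};

/* Assumption for the sketch: only identical formats are compatible. */
static bool formats_compatible(int a, int b)
{
	return a == b;
}

/* One of the two images has to be decompressed before the copy only
 * when both are DCC compressed and their formats are incompatible. */
static bool need_decompress_for_copy(const struct image_info *src,
                                     const struct image_info *dst)
{
	return src->dcc_compressed && dst->dcc_compressed &&
	       !formats_compatible(src->format, dst->format);
}
```

If only one side is compressed, or the formats already match, the copy can reinterpret directly and no decompression is needed in this model.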
---
src/amd/vulkan/radv_meta_copy.c | 27 +--
1 file changed, 25 inser
Simplifies failure paths. The caller already calls
radv_device_finish_meta_fast_clear_flush_state on failure.
---
src/amd/vulkan/radv_meta_fast_clear.c | 9 +++--
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/src/amd/vulkan/radv_meta_fast_clear.c
b/src/amd/vulkan/radv_meta_fas
Apps can use this for render feedback loops, where things are
defined if they render each pixel only once. However, DCC fails
here, as the level of coherence is a block not a pixel, so disable it.
This is also going to help implementing other stuff.
Even if we optimize this later to only happen i
We don't get a layout when binding to a descriptor set, but can
assume that the LAYOUT is GENERAL.
For DCC, stores with the DCC bits set will result in a hang, so
better be safe than sorry.
---
src/amd/vulkan/radv_image.c | 11 ++-
1 file changed, 6 insertions(+), 5 deletions(-)
diff --gi
---
src/amd/vulkan/radv_meta_fast_clear.c | 94 ++-
src/amd/vulkan/radv_private.h | 1 +
2 files changed, 83 insertions(+), 12 deletions(-)
diff --git a/src/amd/vulkan/radv_meta_fast_clear.c
b/src/amd/vulkan/radv_meta_fast_clear.c
index 1acf510359d..44c2f
We do an in place copy where we read compressed and write decompressed.
By doing this in sizes that cover entire DCC blocks and waiting for all
reads in the block before starting to write we avoid corruption.
In the end we clear the DCC metadata to 0x.
---
src/amd/vulkan/radv_meta.h
It should already be valid there + the RB will update it during
rendering.
---
src/amd/vulkan/radv_meta_resolve_fs.c | 5 -
1 file changed, 5 deletions(-)
diff --git a/src/amd/vulkan/radv_meta_resolve_fs.c
b/src/amd/vulkan/radv_meta_resolve_fs.c
index 798129ec854..99314d94e53 100644
--- a/sr
Before this DCC was in practice disabled for most games. This
enables practical DCC use. Expect a 5-10% perf increase on a
bunch of games on vega @ 4k.
---
src/amd/vulkan/radv_image.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/amd/vulkan/radv_image.c b/src/amd/vulkan
Those are implemented as texture sampling, so we need to make the
texture TC-compatible too.
Fixes: 34d23e82ca9 "radv: set some dcc parameters depending on if texture will
be sampled"
---
src/amd/vulkan/radv_device.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/amd/v
Letting it be disabled by default.
---
src/amd/vulkan/radv_debug.h | 1 +
src/amd/vulkan/radv_device.c | 8
2 files changed, 9 insertions(+)
diff --git a/src/amd/vulkan/radv_debug.h b/src/amd/vulkan/radv_debug.h
index af07564833e..5b37bfe0847 100644
--- a/src/amd/vulkan/radv_debug.h
+++
Overall it does not really help or hurt. The deferred demo gets 1%
improvement and some games a 3% decrease, so I don't think this
should be enabled by default.
But with the code upstream it is easier to experiment with it.
---
src/amd/vulkan/radv_cmd_buffer.c | 16 ++
src/amd/vulkan/radv_pipeli
I don't like having to flush, so this introduces the other workaround.
Since my experience is that context register writes are pretty cheap,
this should not have too much overhead.
I haven't seen any significant perf changes in benchmarks or games
though.
---
src/amd/vulkan/radv_cmd_buffer.c | 22
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
---
src/amd/common/ac_nir_to_llvm.c | 13 +++--
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index d9f2cb408c3..864f58b56d0 100644
---
When rasterization is disabled we can have that few.
Fixes: 76603aa90b8 "radv: Drop the default viewport when 0 viewports are given."
---
src/amd/vulkan/si_cmd_buffer.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buf
Seems like users are actually hitting 0x actually making
things broken for them, and the Mad Max regression is fixed, so
let's put this in once more.
Fixes: af2844116fd "radv: Revert HTILE reset word to 0x."
---
src/amd/vulkan/radv_cmd_buffer.c | 2 +-
1 file changed, 1 insertion(+
Seems like users are actually hitting 0x actually making
things broken for them, and the Mad Max regression is fixed, so
let's put this in once more.
v2: Use 0xf for depth-only htile. (Dave)
Fixes: af2844116fd "radv: Revert HTILE reset word to 0x."
---
src/amd/vulkan/radv_cmd_buff
Reviewed-by: Bas Nieuwenhuizen
We should probably do something similar for has_sync_file since
sync_files are significantly older.
On Wed, Jan 3, 2018 at 10:51 PM, Marek Olšák wrote:
> From: Marek Olšák
>
> ---
> src/amd/common/ac_gpu_info.c | 2 +-
> src/amd/commo
sync_files have been in Linux since 4.7, while the amdgpu fence_to_handle
ioctl is only in 4.15.
In particular we don't need it for sync_file in radv, because
everything happens via syncobjs, which got support earlier than
fence_to_handle.
---
src/amd/common/ac_gpu_info.c| 4 ++--
src/amd/c
Was surprised that is even supported by Vega.
---
src/amd/vulkan/radv_device.c| 4 +++-
src/amd/vulkan/radv_formats.c | 36
src/amd/vulkan/vk_format_layout.csv | 20 ++--
3 files changed, 49 insertions(+), 11 deletions(-)
diff --
These are just shader reads, so we need to invalidate L1.
Fixes: 6dbb0eaccc "radv: handle subpass cache flushes"
---
src/amd/vulkan/radv_cmd_buffer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 6dc8
Copied from radeonsi.
Putting in the correct metadata flush commands for eventually not
flushing L2 on CB/DB switch.
Does not remove the need for V_028A90_CACHE_FLUSH_AND_INV_TS_EVENT
at the moment.
---
src/amd/vulkan/si_cmd_buffer.c | 29 -
1 file changed, 16 inserti
Per spec:
"Additionally, exporting a fence payload to a handle with copy transference has
the same side effects
on the source fence’s payload as executing a fence reset operation. If the
fence was using a
temporarily imported payload, the fence’s prior permanent payload will be
restored."
And
an driver don’t support that.
>
> Thanks.
> Best Regards,
> David
>
>> On Jan 4, 2018, at 8:38 AM, Bas Nieuwenhuizen
>> wrote:
>>
>> Was surprised that is even supported by Vega.
>> ---
>> src/amd/vulkan/radv_dev
Looking at AMDGPUAsmPrinter::EmitProgramInfoSI in LLVM that is only
set for compute shaders. So fix radv to default to the proposed value
and fix LLVM to pass it through for all shaders?
On Thu, Jan 4, 2018 at 11:54 AM, Samuel Pitoiset
wrote:
>
>
> On 12/28/2017 11:08 PM, Matt Arsenault wrote:
>>
Reviewed-by: Bas Nieuwenhuizen
On Thu, Jan 4, 2018 at 4:24 PM, Samuel Pitoiset
wrote:
> Signed-off-by: Samuel Pitoiset
> ---
> src/amd/vulkan/radv_cmd_buffer.c | 7 ++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/src/amd/vulkan/radv_cmd_buffer.c
---
src/amd/vulkan/Makefile.am | 6 +-
src/amd/vulkan/radv_entrypoints_gen.py | 4 +++-
src/amd/vulkan/radv_extensions.py | 1 +
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/src/amd/vulkan/Makefile.am b/src/amd/vulkan/Makefile.am
index 6b352aebf9..e1a04e8c7f