[Mesa-dev] [PATCH] radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI

Marek Olšák Sun, 22 Oct 2017 14:20:10 -0700

From: Marek Olšák <marek.ol...@amd.com>

See my LLVM patch which fixes the root cause.


Users have to apply this patch and then they have 2 choices:
- Downgrade to LLVM 5.0
- Update to LLVM git after my LLVM patch is pushed.

It won't be possible to use the current or any earlier development version
of LLVM 6.0.
---
 src/gallium/drivers/radeonsi/si_shader.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)
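
For readers who want the gating logic in isolation, here is a minimal
standalone sketch of the condition the patch adds. It is not driver code:
the enum names, the LLVM_VERSION_* macros and both helper functions are
hypothetical stand-ins, and the only things taken from the patch are the
three-part condition (SI, HAVE_LLVM < 0x0600, no indirect addressing) and
the shift by 2 that turns a byte offset into a dword index. The alignment
assert is an added assumption. Note that a version check like this cannot
tell a fixed LLVM 6.0 git build from an unfixed one, which is why unfixed
6.0 development builds stay unusable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types; not Mesa's definitions. */
enum chip_class_sketch { CHIP_SI, CHIP_CIK, CHIP_VI };

enum const_load_path {
    LOAD_VIA_POINTER,    /* plain pointer load, selects s_load_dword */
    LOAD_VIA_DESCRIPTOR  /* full descriptor, selects s_buffer_load_dword */
};

/* Same major/minor encoding as the HAVE_LLVM check in the patch. */
#define LLVM_VERSION_5_0 0x0500
#define LLVM_VERSION_6_0 0x0600

/* Mirrors the condition added by the patch: only SI with LLVM older than
 * 6.0 and a literal (non-indirect) offset takes the pointer path. */
static enum const_load_path
choose_const_load_path(enum chip_class_sketch chip, unsigned llvm_version,
                       bool indirect)
{
    if (chip == CHIP_SI && llvm_version < LLVM_VERSION_6_0 && !indirect)
        return LOAD_VIA_POINTER;
    return LOAD_VIA_DESCRIPTOR;
}

/* The pointer path indexes dwords, so the byte offset is shifted right
 * by 2. Dword alignment is an assumption of this sketch. */
static uint32_t byte_offset_to_dword_index(uint32_t byte_offset)
{
    assert(byte_offset % 4 == 0);
    return byte_offset >> 2;
}
```

With these helpers, SI with LLVM 5.0 and a literal offset yields
LOAD_VIA_POINTER, and every other combination falls through to the
descriptor path.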

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 1320c6f..a248cea 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2008,28 +2008,35 @@ static LLVMValueRef fetch_constant(
        if (sel->info.const_buffers_declared == 1 &&
            sel->info.shader_buffers_declared == 0) {
                LLVMValueRef ptr =
                        LLVMGetParam(ctx->main_fn, ctx->param_const_and_shader_buffers);
 
                /* This enables use of s_load_dword and flat_load_dword for const buffer 0
                 * loads, and up to x4 load opcode merging. However, it leads to horrible
                 * code reducing SIMD wave occupancy from 8 to 2 in many cases.
                 *
                 * Using s_buffer_load_dword (x1) seems to be the best option right now.
+                *
+                * LLVM 5.0 on SI doesn't insert a required s_nop between SALU setting
+                * a descriptor and s_buffer_load_dword using it, so we can't expand
+                * the pointer into a full descriptor like below. We have to use
+                * s_load_dword instead. The only case when LLVM 5.0 would select
+                * s_buffer_load_dword (that we have to prevent) is when we use
+                * a literal offset where we don't need bounds checking.
                 */
-#if 0 /* keep this codepath disabled */
-               if (!reg->Register.Indirect) {
+               if (ctx->screen->b.chip_class == SI &&
+                    HAVE_LLVM < 0x0600 &&
+                    !reg->Register.Indirect) {
                        addr = LLVMBuildLShr(ctx->ac.builder, addr, LLVMConstInt(ctx->i32, 2, 0), "");
                        LLVMValueRef result = ac_build_load_invariant(&ctx->ac, ptr, addr);
                        return bitcast(bld_base, type, result);
                }
-#endif
 
                /* Do the bounds checking with a descriptor, because
                 * doing computation and manual bounds checking of 64-bit
                 * addresses generates horrible VALU code with very high
                 * VGPR usage and very low SIMD occupancy.
                 */
                ptr = LLVMBuildPtrToInt(ctx->ac.builder, ptr, ctx->i64, "");
                ptr = LLVMBuildBitCast(ctx->ac.builder, ptr, ctx->v2i32, "");
 
                LLVMValueRef desc_elems[] = {
-- 
2.7.4

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
