From: Marek Olšák <marek.ol...@amd.com>

See my LLVM patch which fixes the root cause.

Users have to apply this patch and then they have 2 choices:
- Downgrade to LLVM 5.0
- Update to LLVM git after my LLVM patch is pushed.

It won't be possible to use the current or earlier development versions
of LLVM 6.0.
---
 src/gallium/drivers/radeonsi/si_shader.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c
index 1320c6f..a248cea 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -2008,28 +2008,35 @@ static LLVMValueRef fetch_constant(
 	if (sel->info.const_buffers_declared == 1 &&
 	    sel->info.shader_buffers_declared == 0) {
 		LLVMValueRef ptr =
 			LLVMGetParam(ctx->main_fn, ctx->param_const_and_shader_buffers);
 
 		/* This enables use of s_load_dword and flat_load_dword for const buffer 0
 		 * loads, and up to x4 load opcode merging. However, it leads to horrible
 		 * code reducing SIMD wave occupancy from 8 to 2 in many cases.
 		 *
 		 * Using s_buffer_load_dword (x1) seems to be the best option right now.
+		 *
+		 * LLVM 5.0 on SI doesn't insert a required s_nop between SALU setting
+		 * a descriptor and s_buffer_load_dword using it, so we can't expand
+		 * the pointer into a full descriptor like below. We have to use
+		 * s_load_dword instead. The only case when LLVM 5.0 would select
+		 * s_buffer_load_dword (that we have to prevent) is when we use
+		 * a literal offset where we don't need bounds checking.
 		 */
-#if 0 /* keep this codepath disabled */
-		if (!reg->Register.Indirect) {
+		if (ctx->screen->b.chip_class == SI &&
+		    HAVE_LLVM < 0x0600 &&
+		    !reg->Register.Indirect) {
 			addr = LLVMBuildLShr(ctx->ac.builder, addr, LLVMConstInt(ctx->i32, 2, 0), "");
 			LLVMValueRef result = ac_build_load_invariant(&ctx->ac, ptr, addr);
 			return bitcast(bld_base, type, result);
 		}
-#endif
 
 		/* Do the bounds checking with a descriptor, because
 		 * doing computation and manual bounds checking of 64-bit
 		 * addresses generates horrible VALU code with very high
 		 * VGPR usage and very low SIMD occupancy.
 		 */
 		ptr = LLVMBuildPtrToInt(ctx->ac.builder, ptr, ctx->i64, "");
 		ptr = LLVMBuildBitCast(ctx->ac.builder, ptr, ctx->v2i32, "");
 
 		LLVMValueRef desc_elems[] = {
-- 
2.7.4

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
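
For readers following the condition in the hunk above, here is a minimal standalone sketch of the path selection, not the driver code itself. It assumes that HAVE_LLVM encodes the version as (major << 8) | minor, so that 5.0 is 0x0500 and 6.0 is 0x0600, and that the LShr by 2 turns a byte offset into a dword index. The enum and all function names are hypothetical stand-ins and do not exist in Mesa.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's chip_class enum. */
enum chip_class_sketch { SKETCH_SI, SKETCH_CIK, SKETCH_VI };

/* Assumed HAVE_LLVM encoding: (major << 8) | minor, e.g. 5.0 -> 0x0500. */
static unsigned llvm_version_code(unsigned major, unsigned minor)
{
	return (major << 8) | minor;
}

/* Mirrors the condition added by the patch: take the plain pointer load
 * (s_load_dword) only on SI, only with LLVM older than 6.0, and only
 * when the constant register is not indirectly addressed. */
static bool use_plain_pointer_load(enum chip_class_sketch chip,
				   unsigned have_llvm, bool indirect)
{
	return chip == SKETCH_SI && have_llvm < 0x0600 && !indirect;
}

/* Same arithmetic as LLVMBuildLShr(addr, 2): a logical shift right by 2,
 * assumed here to convert a byte offset into a dword index. */
static uint32_t byte_offset_to_dword_index(uint32_t byte_offset)
{
	return byte_offset >> 2;
}
```

Every other combination (newer chips, LLVM 6.0 or later, or indirect addressing) falls through to the descriptor path with bounds checking, which is the behavior the patch keeps for all non-affected configurations.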