Totals from affected shaders:
SGPRS: 142032 -> 137651 (-3.08 %)
VGPRS: 93280 -> 92992 (-0.31 %)
Spilled SGPRs: 104 -> 129 (24.04 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3309852 -> 3301444 (-0.25 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 22178 -> 22206 (0.13 %)
Wait states: 0 -> 0 (0.00 %)

This slightly increases SGPR spilling with Talos, but I have some other
ideas that might reduce it.

Signed-off-by: Samuel Pitoiset <samuel.pitoi...@gmail.com>
---
This needs to be applied after "radeonsi: load the right number of
components for VS inputs and TBOs"

 src/amd/common/ac_nir_to_llvm.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 54c5f84886..e21023de0c 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -5348,6 +5348,9 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
 	int index = variable->data.location - VERT_ATTRIB_GENERIC0;
 	int idx = variable->data.location;
 	unsigned attrib_count = glsl_count_attribute_slots(variable->type, true);
+	uint8_t input_usage_mask =
+		ctx->shader_info->info.vs.input_usage_mask[variable->data.location];
+	unsigned num_channels = util_last_bit(input_usage_mask);
 
 	variable->data.driver_location = idx * 4;
 
@@ -5372,7 +5375,9 @@ handle_vs_input_decl(struct nir_to_llvm_context *ctx,
 		input = ac_build_buffer_load_format(&ctx->ac, t_list,
 						    buffer_index,
 						    ctx->ac.i32_0,
-						    4, true);
+						    num_channels, true);
+
+		input = ac_build_expand_to_vec4(&ctx->ac, input, num_channels);
 
 		for (unsigned chan = 0; chan < 4; chan++) {
 			LLVMValueRef llvm_chan = LLVMConstInt(ctx->ac.i32, chan, false);
-- 
2.16.1
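
For readers who want to see how the channel count falls out of the usage mask, here is a minimal standalone sketch. It is not the Mesa code: last_bit() and expand_to_vec4() are hypothetical stand-ins for util_last_bit() and ac_build_expand_to_vec4(), they work on plain floats instead of LLVM values, and the 0.0f padding is only a placeholder that says nothing about what the real helper writes into the unused channels.

```c
#include <stdint.h>

/* Hypothetical stand-in for util_last_bit(): returns the 1-based index
 * of the highest set bit, or 0 when the mask is empty. */
static unsigned last_bit(uint8_t mask)
{
    unsigned n = 0;

    while (mask) {
        n++;
        mask >>= 1;
    }
    return n;
}

/* Hypothetical stand-in for the expand step: copy the channels that
 * were actually loaded and pad the result to four components.
 * The 0.0f padding is an assumption made for this sketch only. */
static void expand_to_vec4(const float *loaded, unsigned num_channels,
                           float out[4])
{
    for (unsigned chan = 0; chan < 4; chan++)
        out[chan] = chan < num_channels ? loaded[chan] : 0.0f;
}
```

With this, a mask of 0x3 (x and y used) gives 2 channels, while 0x5 (x and z used) still gives 3, because the count is driven by the highest used component and not by the number of set bits. The code after the load keeps iterating over four channels, which is why the value is widened back to a vec4.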