On Tue, Oct 17, 2017 at 7:41 AM, Timothy Arceri <tarc...@itsqueeze.com> wrote:
> It looks like the original indirect mask was probably copied from
> ANV.
>
> Here we drop lowering locals altogether and allow indirects
> on inputs where supported.
>
> Sascha Willems demo results:
>
> tessellation ~4000 -> ~4200 fps
> ---
>
> Radeonsi also does a couple of other things.
>
> 1. It sets the following llvm flag:
>
>    sscreen->llvm_has_working_vgpr_indexing ? "" : ",-promote-alloca"
>
> 2. Lowers indirect outputs on certain hardware:
>
>    return sscreen->llvm_has_working_vgpr_indexing ||
>           /* TCS stores outputs directly to memory. */
>           shader == PIPE_SHADER_TESS_CTRL;
>
>
> I'm not sure if we should be doing these things also. Comments?

Yeah, we should be doing those also (the latter is when *NOT* to lower
indirect outputs though?). Everything besides TCS uses our custom
vector buffering code, so is susceptible to LLVM indirect addressing
bugs. We might be reducing this for LS/ES by directly writing to
LDS/memory instead of to internal vars first, but that is not the way
it is done currently.

In the meantime, this patch is

Reviewed-by: Bas Nieuwenhuizen <b...@basnieuwenhuizen.nl>

>
>  src/amd/vulkan/radv_shader.c | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c
> index 055787a705..819d33b1ad 100644
> --- a/src/amd/vulkan/radv_shader.c
> +++ b/src/amd/vulkan/radv_shader.c
> @@ -238,23 +238,39 @@ radv_shader_compile_to_nir(struct radv_device *device,
>                 NIR_PASS_V(nir, nir_lower_constant_initializers, ~0);
>                 NIR_PASS_V(nir, nir_lower_system_values);
>                 NIR_PASS_V(nir, nir_lower_clip_cull_distance_arrays);
>         }
>
>         /* Vulkan uses the separate-shader linking model */
>         nir->info.separate_shader = true;
>
>         nir_shader_gather_info(nir, entry_point->impl);
>
> +       /* While it would be nice not to have this flag, we are constrained
> +        * by the reality that LLVM 5.0 doesn't have working VGPR indexing
> +        * on GFX9.
> +        */
> +       bool llvm_has_working_vgpr_indexing =
> +               device->physical_device->rad_info.chip_class <= VI;
> +
> +       /* TODO: Indirect indexing of GS inputs is unimplemented.
> +        *
> +        * TCS and TES load inputs directly from LDS or offchip memory, so
> +        * indirect indexing is trivial.
> +        */
>         nir_variable_mode indirect_mask = 0;
> -       indirect_mask |= nir_var_shader_in;
> -       indirect_mask |= nir_var_local;
> +       if (nir->stage == MESA_SHADER_GEOMETRY ||
> +           (nir->stage != MESA_SHADER_TESS_CTRL &&
> +            nir->stage != MESA_SHADER_TESS_EVAL &&
> +            !llvm_has_working_vgpr_indexing)) {
> +               indirect_mask |= nir_var_shader_in;
> +       }
>
>         nir_lower_indirect_derefs(nir, indirect_mask);
>
>         static const nir_lower_tex_options tex_options = {
>                 .lower_txp = ~0,
>         };
>
>         nir_lower_tex(nir, &tex_options);
>
>         nir_lower_vars_to_ssa(nir);
> --
> 2.13.6
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
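
For illustration only, here is a minimal sketch of how the two conditions discussed in this thread might be combined into one mask. The enums and the function name are hypothetical stand-ins, not the real Mesa types. The input branch restates the condition from the patch. The output branch is an assumption: it inverts the radeonsi check quoted above (which says when *not* to lower) and is not part of this patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for gl_shader_stage and nir_variable_mode. */
enum stage {
    STAGE_VERTEX,
    STAGE_TESS_CTRL,
    STAGE_TESS_EVAL,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT
};

enum {
    VAR_SHADER_IN  = 1 << 0,
    VAR_SHADER_OUT = 1 << 1
};

/* Returns the set of variable modes whose indirect derefs would be lowered. */
static unsigned
indirect_lowering_mask(enum stage stage, bool working_vgpr_indexing)
{
    unsigned mask = 0;

    /* Inputs: same condition as the patch. GS inputs are always lowered;
     * TCS/TES inputs never are; other stages depend on VGPR indexing.
     */
    if (stage == STAGE_GEOMETRY ||
        (stage != STAGE_TESS_CTRL &&
         stage != STAGE_TESS_EVAL &&
         !working_vgpr_indexing))
        mask |= VAR_SHADER_IN;

    /* Outputs (assumed follow-up, not in the patch): keep indirects only
     * when VGPR indexing works or the stage is TCS, which stores outputs
     * directly to memory.
     */
    if (!(working_vgpr_indexing || stage == STAGE_TESS_CTRL))
        mask |= VAR_SHADER_OUT;

    return mask;
}
```

With these assumptions, a vertex shader on hardware without working VGPR indexing would get both inputs and outputs lowered, while a TCS would get neither.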