On Thu, Aug 23, 2018 at 5:42 PM Caio Marcelo de Oliveira Filho <caio.olive...@intel.com> wrote:
> On Sat, Jul 28, 2018 at 10:44:42PM -0700, Jason Ekstrand wrote:
> > We have to be a bit careful with this one because we want it to run in
> > the optimization loop but only in the first brw_nir_optimize call.
> > Later calls assume that we've lowered away copy_deref instructions and
> > we don't want to introduce any more.
> >
> > Shader-db results on Kaby Lake:
> >
> >    total instructions in shared programs: 15176942 -> 15176942 (0.00%)
> >    instructions in affected programs: 0 -> 0
> >    helped: 0
> >    HURT: 0
> >
> > In spite of the lack of any shader-db improvement, this patch completely
> > eliminates spilling in the Batman: Arkham City tessellation shaders.
> > This is because we are now able to detect that the temporary array
> > created by DXVK for storing TCS inputs is a copy of the input arrays and
> > use indirect URB reads instead of making a copy of 4.5 KiB of input data
> > and then indirecting on it with if-ladders.
> > ---
> >  src/intel/compiler/brw_nir.c | 16 +++++++++-------
> >  src/intel/compiler/brw_nir.h |  3 ++-
> >  2 files changed, 11 insertions(+), 8 deletions(-)
>
> Given that the comment mentioned below is added, this patch is
>
> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.olive...@intel.com>
>
> There's also an optional suggestion.
>
> > diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
> > index 5e9da9e1ef2..2417c0cd618 100644
> > --- a/src/intel/compiler/brw_nir.c
> > +++ b/src/intel/compiler/brw_nir.c
> > @@ -533,7 +533,7 @@ brw_nir_no_indirect_mask(const struct brw_compiler *compiler,
> >
> >  nir_shader *
> >  brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
> > -                 bool is_scalar)
> > +                 bool is_scalar, bool allow_copies)
> >  {
> >     nir_variable_mode indirect_mask =
> >        brw_nir_no_indirect_mask(compiler, nir->info.stage);
> > @@ -544,6 +544,8 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
> >        OPT(nir_split_array_vars, nir_var_local);
> >        OPT(nir_shrink_vec_array_vars, nir_var_local);
> >        OPT(nir_lower_vars_to_ssa);
> > +      if (allow_copies)
> > +         OPT(nir_opt_find_array_copies);
> >        OPT(nir_opt_copy_prop_vars);
> >
> >        if (is_scalar) {
>
> Adding a comment here with a subset of your first paragraph in the commit
> message will reduce the chance that future changes to the optimization
> loop mess things up.

Done.

> > @@ -664,7 +666,7 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
> >                                nir_lower_isign64 |
> >                                nir_lower_divmod64);
> >
> > -   nir = brw_nir_optimize(nir, compiler, is_scalar);
> > +   nir = brw_nir_optimize(nir, compiler, is_scalar, true);
> >
> >     /* This needs to be run after the first optimization pass but before we
> >      * lower indirect derefs away
> > @@ -701,7 +703,7 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
> >        nir_lower_indirect_derefs(nir, indirect_mask);
> >
> >     /* Get rid of split copies */
> > -   nir = brw_nir_optimize(nir, compiler, is_scalar);
> > +   nir = brw_nir_optimize(nir, compiler, is_scalar, false);
>
> Optional: the call sites usually have an "is_scalar" variable around,
> making that boolean parameter "self-documenting". This is not the
> case for allow_copies. One idea is to make this last argument a flag,
> and add an enum for it, ALLOW_CREATE_DEREF_COPIES or something like that.

I thought seriously about doing that the first time around. Maybe I'll
cook up a follow-on patch.

--Jason
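
To make the "Done." above concrete: the comment Caio asked for would sit
next to the new pass in the optimization loop. A minimal sketch,
paraphrasing the first paragraph of the commit message (the exact wording
that landed may differ):

   /* nir_opt_find_array_copies introduces copy_deref instructions, so it
    * is only safe to run in the first brw_nir_optimize() call.  Later
    * calls assume copy_deref has already been lowered away and must not
    * see any new copies.
    */
   if (allow_copies)
      OPT(nir_opt_find_array_copies);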
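
And a rough sketch of the optional flags idea from the review, for anyone
who picks up that follow-on patch; the enum and flag names here are
hypothetical, invented purely for illustration:

   /* Hypothetical names, not from any landed patch: */
   enum brw_nir_optimize_flags {
      BRW_NIR_OPT_NONE                      = 0,
      BRW_NIR_OPT_ALLOW_CREATE_DEREF_COPIES = 1 << 0,
   };

   nir_shader *
   brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
                    bool is_scalar, enum brw_nir_optimize_flags flags)
   {
      ...
         /* Only the first call passes the flag, so only it may create
          * new copy_deref instructions.
          */
         if (flags & BRW_NIR_OPT_ALLOW_CREATE_DEREF_COPIES)
            OPT(nir_opt_find_array_copies);
      ...
   }

with call sites that then document themselves:

   nir = brw_nir_optimize(nir, compiler, is_scalar,
                          BRW_NIR_OPT_ALLOW_CREATE_DEREF_COPIES);
   ...
   nir = brw_nir_optimize(nir, compiler, is_scalar, BRW_NIR_OPT_NONE);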