Shader-db results on Sky Lake (excluding Deus Ex: Mankind Divided):

   total instructions in shared programs: 11724449 -> 11706852 (-0.15%)
   instructions in affected programs: 1218950 -> 1201353 (-1.44%)
   helped: 2562
   HURT: 1208

   total cycles in shared programs: 109388578 -> 108934482 (-0.42%)
   cycles in affected programs: 50640234 -> 50186138 (-0.90%)
   helped: 16097
   HURT: 13858

   total loops in shared programs: 1828 -> 1824 (-0.22%)
   loops in affected programs: 8 -> 4 (-50.00%)
   helped: 4
   HURT: 0

   total spills in shared programs: 1930 -> 1926 (-0.21%)
   spills in affected programs: 1054 -> 1050 (-0.38%)
   helped: 4
   HURT: 5

   total fills in shared programs: 3651 -> 3635 (-0.44%)
   fills in affected programs: 2594 -> 2578 (-0.62%)
   helped: 4
   HURT: 5

   LOST:   15
   GAINED: 1

Some analysis was done of the hurt programs.  The vast majority of them
were only hurt by 2-3 instructions.  Based on a very sparse random
sampling, most of those appear to be hurt either because of slightly
different MOVs or because they have a single block with a discard and GCM
moved the discard higher in the shader, which causes us to emit a HALT
that we didn't emit before.  The discard case should actually be an
improvement most of the time, in spite of being more instructions.  There
were also a few larger shaders that were hurt by around 10 instructions.
On the helped end of things, there were 74 shaders helped by over 10%, and
most of those were on the order of 500 instructions.

Spilling seems to be a wash.  Some stuff is helped by around 10% and
others are hurt by about the same amount.  Each hurt application is also
helped by about the same amount.  The only app that's pure loss is Orbital
Explorer.

With "Deus Ex: Mankind Divided", the results aren't so good.  Running GCM
late helps it substantially, while running GCM early (like we do here)
hurts it pretty badly.  Given that running early is better for basically
everything else (I ran shader-db both ways), I'm recommending we go early
for now and try to figure out how to fix Deus Ex.
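The "go early" choice boils down to letting the caller decide, per
invocation of the optimizer, whether code motion runs.  A minimal toy
sketch of that shape follows.  Everything named toy_* is hypothetical and
only counts pass invocations; it is not the driver code, and unlike the
real optimizer it does not iterate until no pass makes progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shader: it only records how often each
 * toy pass ran, so the effect of the flag is observable.
 */
struct toy_shader {
   int gcm_runs;
   int other_runs;
};

/* Toy stand-in for a code-motion pass (not the real nir_opt_gcm). */
static bool
toy_opt_gcm(struct toy_shader *s)
{
   s->gcm_runs++;
   return false; /* report "no progress" */
}

/* Toy stand-in for all the other passes in the optimization loop. */
static bool
toy_opt_other(struct toy_shader *s)
{
   s->other_runs++;
   return false;
}

/* The caller chooses whether code motion runs in this invocation. */
static void
toy_optimize(struct toy_shader *s, bool run_gcm)
{
   toy_opt_other(s);
   if (run_gcm)
      toy_opt_gcm(s);
}

/* Assumed ordering: one early call with code motion enabled, then later
 * calls (after the point where indirects would be lowered) without it.
 */
static void
toy_compile(struct toy_shader *s)
{
   assert(s != NULL);
   toy_optimize(s, true);   /* early: before indirect lowering */
   /* ... indirect lowering would happen here ... */
   toy_optimize(s, false);  /* after indirect lowering */
   toy_optimize(s, false);  /* postprocess */
}
```

Calling toy_compile() on a zero-initialized toy_shader runs the
code-motion stand-in once and the other stand-in three times, which is the
property the flag is meant to guarantee.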
---
 src/mesa/drivers/dri/i965/brw_nir.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index 999e1d2..a7038ef 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -455,7 +455,7 @@ brw_nir_lower_cs_shared(nir_shader *nir)
 
 static nir_shader *
 nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
-             bool is_scalar)
+             bool run_gcm, bool is_scalar)
 {
    nir_variable_mode indirect_mask = 0;
    if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectInput)
@@ -501,6 +501,17 @@ nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
          OPT(nir_opt_loop_unroll, indirect_mask);
       }
       OPT(nir_opt_remove_phis);
+
+      /* We only want to run global code motion in the early stages of
+       * compilation.  In particular, we want to run it before we lower
+       * indirects away.  If we run GCM after indirect lowering, all of the
+       * loads stop being dependent on the loop and GCM pulls them out.  This
+       * can lead to massive register pressure problems for shaders with loops
+       * we can't unroll.
+       */
+      if (run_gcm)
+         OPT(nir_opt_gcm, true);
+
       OPT(nir_opt_undef);
       OPT_V(nir_lower_doubles, nir_lower_drcp |
                                nir_lower_dsqrt |
@@ -557,7 +568,7 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
 
    OPT(nir_split_var_copies);
 
-   nir = nir_optimize(nir, compiler, is_scalar);
+   nir = nir_optimize(nir, compiler, true, is_scalar);
 
    if (is_scalar) {
       OPT_V(nir_lower_load_const_to_scalar);
@@ -579,7 +590,7 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
    nir_lower_indirect_derefs(nir, indirect_mask);
 
    /* Get rid of split copies */
-   nir = nir_optimize(nir, compiler, is_scalar);
+   nir = nir_optimize(nir, compiler, false, is_scalar);
 
    OPT(nir_remove_dead_variables, nir_var_local);
 
@@ -604,13 +615,12 @@ brw_postprocess_nir(nir_shader *nir, const struct brw_compiler *compiler,
    bool progress; /* Written by OPT and OPT_V */
    (void)progress;
 
-
    do {
       progress = false;
       OPT(nir_opt_algebraic_before_ffma);
    } while (progress);
 
-   nir = nir_optimize(nir, compiler, is_scalar);
+   nir = nir_optimize(nir, compiler, false, is_scalar);
 
    if (devinfo->gen >= 6) {
       /* Try and fuse multiply-adds */
@@ -703,7 +713,7 @@ brw_nir_apply_sampler_key(nir_shader *nir,
 
    if (nir_lower_tex(nir, &tex_options)) {
       nir_validate_shader(nir);
-      nir = nir_optimize(nir, compiler, is_scalar);
+      nir = nir_optimize(nir, compiler, false, is_scalar);
    }
 
    return nir;
-- 
2.5.0.400.gff86faf

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev