so I guess this is likely to hurt pipe drivers that don't (yet?) have a real compiler backend. (Ie. etnaviv and freedreno/a2xx.) So maybe it should be optional.
Also I wonder about the pre-llvm radeon gen's, since sb uses the actual instruction encoding for IR between tgsi->sb and backend opt passes.. iirc they have had problems when the tgsi code uses too many registers. BR, -R On Mon, Apr 24, 2017 at 5:01 PM, Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote: > The main goal of this pass to merge temporary registers in order > to reduce the total number of registers and also to produce > optimal TGSI code. > > In fact, compilers seem to be confused when temporary variables > are already merged, maybe because it's done too early in the > process. > > Removing the pass, reduce both the register pressure and the code > size (TGSI is no longer optimized, but who cares?). > shader-db results with RadeonSI and Nouveau are interesting. > > Nouveau: > > total instructions in shared programs : 3931608 -> 3929463 (-0.05%) > total gprs used in shared programs : 481255 -> 479014 (-0.47%) > total local used in shared programs : 27481 -> 27381 (-0.36%) > total bytes used in shared programs : 36031256 -> 36011120 (-0.06%) > > local gpr inst bytes > helped 14 1471 1309 1309 > hurt 1 88 384 384 > > RadeonSI: > > PERCENTAGE DELTAS Shaders SGPRs VGPRs SpillSGPR SpillVGPR > PrivVGPR Scratch CodeSize MaxWaves Waits > ---------------------------------------------------------------------------------------------------------------------- > All affected 4906 -0.31 % -0.40 % -2.93 % -20.00 % . > -20.00 % -0.18 % 0.19 % . > ---------------------------------------------------------------------------------------------------------------------- > Total 47109 -0.04 % -0.05 % -1.97 % -7.14 % . > -0.30 % -0.03 % 0.02 % . > > Found by luck while fixing an issue in the TGSI dead code elimination > pass which affects tex instructions with bindless samplers. > > Signed-off-by: Samuel Pitoiset <samuel.pitoi...@gmail.com> > --- > src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 62 > ------------------------------ > 1 file changed, 62 deletions(-) > > diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > index de7fe7837a..d033bdcc5a 100644 > --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp > @@ -565,7 +565,6 @@ public: > int eliminate_dead_code(void); > > void merge_two_dsts(void); > - void merge_registers(void); > void renumber_registers(void); > > void emit_block_mov(ir_assignment *ir, const struct glsl_type *type, > @@ -5262,66 +5261,6 @@ glsl_to_tgsi_visitor::merge_two_dsts(void) > } > } > > -/* Merges temporary registers together where possible to reduce the number of > - * registers needed to run a program. > - * > - * Produces optimal code only after copy propagation and dead code > elimination > - * have been run. */ > -void > -glsl_to_tgsi_visitor::merge_registers(void) > -{ > - int *last_reads = rzalloc_array(mem_ctx, int, this->next_temp); > - int *first_writes = rzalloc_array(mem_ctx, int, this->next_temp); > - struct rename_reg_pair *renames = rzalloc_array(mem_ctx, struct > rename_reg_pair, this->next_temp); > - int i, j; > - int num_renames = 0; > - > - /* Read the indices of the last read and first write to each temp register > - * into an array so that we don't have to traverse the instruction list as > - * much. */ > - for (i = 0; i < this->next_temp; i++) { > - last_reads[i] = -1; > - first_writes[i] = -1; > - } > - get_last_temp_read_first_temp_write(last_reads, first_writes); > - > - /* Start looking for registers with non-overlapping usages that can be > - * merged together. */ > - for (i = 0; i < this->next_temp; i++) { > - /* Don't touch unused registers. */ > - if (last_reads[i] < 0 || first_writes[i] < 0) continue; > - > - for (j = 0; j < this->next_temp; j++) { > - /* Don't touch unused registers. */ > - if (last_reads[j] < 0 || first_writes[j] < 0) continue; > - > - /* We can merge the two registers if the first write to j is after > or > - * in the same instruction as the last read from i. Note that the > - * register at index i will always be used earlier or at the same > time > - * as the register at index j. */ > - if (first_writes[i] <= first_writes[j] && > - last_reads[i] <= first_writes[j]) { > - renames[num_renames].old_reg = j; > - renames[num_renames].new_reg = i; > - num_renames++; > - > - /* Update the first_writes and last_reads arrays with the new > - * values for the merged register index, and mark the newly > unused > - * register index as such. */ > - assert(last_reads[j] >= last_reads[i]); > - last_reads[i] = last_reads[j]; > - first_writes[j] = -1; > - last_reads[j] = -1; > - } > - } > - } > - > - rename_temp_registers(num_renames, renames); > - ralloc_free(renames); > - ralloc_free(last_reads); > - ralloc_free(first_writes); > -} > - > /* Reassign indices to temporary registers by reusing unused indices created > * by optimization passes. */ > void > @@ -6712,7 +6651,6 @@ get_mesa_program_tgsi(struct gl_context *ctx, > while (v->eliminate_dead_code()); > > v->merge_two_dsts(); > - v->merge_registers(); > v->renumber_registers(); > > /* Write the END instruction. */ > -- > 2.12.2 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev