We use this logic to detect live ranges and then do plain renaming across the whole codebase. As such, to prevent WaW hazards, we have to treat a write as if it were also a read.
For example, the following sequence was observed before this patch: 13: UIF TEMP[6].xxxx :0 14: ADD TEMP[6].x, CONST[6].xxxx, -IN[3].yyyy 15: RCP TEMP[7].x, TEMP[3].xxxx 16: MUL TEMP[3].x, TEMP[6].xxxx, TEMP[7].xxxx 17: ADD TEMP[6].x, CONST[7].xxxx, -IN[3].yyyy 18: RCP TEMP[7].x, TEMP[3].xxxx 19: MUL TEMP[4].x, TEMP[6].xxxx, TEMP[7].xxxx While after this patch it becomes: 13: UIF TEMP[7].xxxx :0 14: ADD TEMP[7].x, CONST[6].xxxx, -IN[3].yyyy 15: RCP TEMP[8].x, TEMP[3].xxxx 16: MUL TEMP[4].x, TEMP[7].xxxx, TEMP[8].xxxx 17: ADD TEMP[7].x, CONST[7].xxxx, -IN[3].yyyy 18: RCP TEMP[8].x, TEMP[3].xxxx 19: MUL TEMP[5].x, TEMP[7].xxxx, TEMP[8].xxxx Most importantly note that in the first example, the second RCP is done on the result of the MUL while in the second, the second RCP should have the same value as the first. Looking at the GLSL source, it is apparent that both of the RCP's should have had the same source. Looking at what's going on, the GLSL looks something like float tmin_8; float tmin_10; tmin_10 = tmin_8; ... lots of code ... tmin_8 = tmpvar_17; ... more code that never looks at tmin_8 ... And so we end up with a last_read somewhere at the beginning, and a first_write somewhere at the bottom. For some reason DCE doesn't remove it, but even if that were fixed, DCE doesn't handle 100% of cases, esp including loops. With the last_read somewhere high up, we overwrite the previously correct (and large) last_read with a low one, and then proceed to decide to merge all kinds of junk onto this temp. As a result, we should treat a write as a last_read for the purpose of determining the live range. Signed-off-by: Ilia Mirkin <imir...@alum.mit.edu> Cc: mesa-sta...@lists.freedesktop.org --- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp index f5b8c33..5e8374a 100644 --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp @@ -3825,9 +3825,11 @@ glsl_to_tgsi_visitor::get_last_temp_read_first_temp_write(int *last_reads, int * last_reads[inst->src[j].index] = (depth == 0) ? i : -2; } for (j = 0; j < num_inst_dst_regs(inst); j++) { - if (inst->dst[j].file == PROGRAM_TEMPORARY) + if (inst->dst[j].file == PROGRAM_TEMPORARY) { if (first_writes[inst->dst[j].index] == -1) first_writes[inst->dst[j].index] = (depth == 0) ? i : loop_start; + last_reads[inst->dst[j].index] = (depth == 0) ? i : -2; + } } for (j = 0; j < inst->tex_offset_num_offset; j++) { if (inst->tex_offsets[j].file == PROGRAM_TEMPORARY) @@ -4341,6 +4343,7 @@ glsl_to_tgsi_visitor::merge_registers(void) /* Update the first_writes and last_reads arrays with the new * values for the merged register index, and mark the newly unused * register index as such. */ + assert(last_reads[j] >= last_reads[i]); last_reads[i] = last_reads[j]; first_writes[j] = -1; last_reads[j] = -1; -- 2.4.10 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev