We know that there cannot be any destination dependency race if we reach the beginning or end of the program without having found any other instruction the send could possibly race with. This avoids emitting a pile of useless moves at the beginning or end of the program in the most common case in which the program has a single basic block only.
On the original i965 I get the following shader-db results: total instructions in shared programs: 3354165 -> 3215637 (-4.13%) instructions in affected programs: 3183065 -> 3044537 (-4.35%) helped: 13498 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 660a8db..bfde69c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3210,7 +3210,7 @@ fs_visitor::insert_gen4_pre_send_dependency_workarounds(bblock_t *block, /* If we hit control flow, assume that there *are* outstanding * dependencies, and force their cleanup before our instruction. */ - if (block->start() == scan_inst) { + if (block->start() == scan_inst && block->num != 0) { for (int i = 0; i < write_len; i++) { if (needs_dep[i]) DEP_RESOLVE_MOV(fs_builder(this, block, inst), @@ -3274,7 +3274,7 @@ fs_visitor::insert_gen4_post_send_dependency_workarounds(bblock_t *block, fs_ins */ foreach_inst_in_block_starting_from(fs_inst, scan_inst, inst) { /* If we hit control flow, force resolve all remaining dependencies. */ - if (block->end() == scan_inst) { + if (block->end() == scan_inst && block->num != cfg->num_blocks - 1) { for (int i = 0; i < write_len; i++) { if (needs_dep[i]) DEP_RESOLVE_MOV(fs_builder(this, block, scan_inst), -- 2.7.3 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev