Shader-db results on Sky Lake: total instructions in shared programs: 12955125 -> 12953698 (-0.01%) instructions in affected programs: 55956 -> 54529 (-2.55%) helped: 6 HURT: 38
All of the hurt programs were hurt by exactly one instruction because this patch affects copy propagation. Most of the helped instructions came from a single orbital explorer shader that was helped by 14.26% Cc: mesa-sta...@lists.freedesktop.org --- src/intel/compiler/brw_cfg.cpp | 37 +++++++++++++++++++++++++++++++++++-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp index fad12ee..d8bf725 100644 --- a/src/intel/compiler/brw_cfg.cpp +++ b/src/intel/compiler/brw_cfg.cpp @@ -289,9 +289,42 @@ cfg_t::cfg_t(exec_list *instructions) assert(cur_while != NULL); cur->add_successor(mem_ctx, cur_while); + /* We also add the next block as a successor of the break. If the + * break is predicated, we need to do this because the break may not + * be taken. If the break is not predicated, we add it anyway so + * that our live intervals computations will operate as if the break + * may or may not be taken. Consider the following example: + * + * vec4 color2; + * while (1) { + * vec4 color = texture(); + * if (...) { + * color2 = color * 2; + * break; + * } + * } + * gl_FragColor = color2; + * + * In this case, the definition of color2 dominates the use because + * the loop only has the one exit. This means that the live range + * interval for color2 goes from the statement in the if to it's use + * below the loop. Now suppose that the texture operation has a + * header register that gets assigned one of the registers used for + * color2. If the loop condition is non-uniform and some of the + * threads will take the break and others will continue. In this + * case, the next pass through the loop, the WE_all setup of the + * header register will stomp the disabled channels of color2 and + * corrupt the value. + * + * This same problem can occur if you have a mix of 64, 32, and + * 16-bit registers because the channels do not line up or if you + * have a SIMD16 program and the first half of one value overlaps the + * second half of the other. To solve it, we simply treat the break + * as if it may also continue on because some of the threads may + * continue on. + */ next = new_block(); - if (inst->predicate) - cur->add_successor(mem_ctx, next); + cur->add_successor(mem_ctx, next); set_next_block(&cur, next, ip); break; -- 2.5.0.400.gff86faf _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev