From: Ian Romanick <ian.d.roman...@intel.com>

Convert cases like (x * bool(b)) to 'mix(0, x, b)'.
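As a rough illustration of why the rewrite is safe, here is a minimal scalar sketch in plain C++. It is not Mesa code: b2f(), mul_form(), csel_form() and check_finite_samples() are hypothetical stand-ins for the IR operations, and it assumes ordinary IEEE float arithmetic on the host.

```cpp
#include <cassert>

// Hypothetical stand-in for ir_unop_b2f: true -> 1.0f, false -> 0.0f.
static float b2f(bool b) { return b ? 1.0f : 0.0f; }

// Original form: x * b2f(condition).
static float mul_form(float x, bool condition) { return x * b2f(condition); }

// Rewritten form: csel(condition, x, 0), i.e. mix(0, x, condition).
static float csel_form(float x, bool condition) { return condition ? x : 0.0f; }

// Asserts that both forms compare equal on a small set of finite inputs.
// Note that -0.0f == 0.0f, so a sign difference on zero does not trip this.
static void check_finite_samples()
{
   const float samples[] = { -3.5f, -1.0f, 0.0f, 0.25f, 2.0f, 1024.0f };
   const bool conditions[] = { false, true };

   for (float x : samples) {
      for (bool c : conditions)
         assert(mul_form(x, c) == csel_form(x, c));
   }
}
```

Calling check_finite_samples() should pass silently. For non-finite x the two forms can differ (Inf * 0.0 is NaN under IEEE rules, while the select yields 0), so the sketch only covers finite values.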
Note: This may hurt the code generated for GPUs that represent Boolean
values using floating point.  shader-db doesn't play well with i915, so
I haven't been able to check it.

On at least BDW shaders/warsow/85.shader_test is hurt by about 10%, so
that may be worth investigating.

v2: Use swizzle_if_required on operands.  Fixes
arb_texture_buffer_object-formats in debug builds.  Without this,
ir_validate fails on that test.

Shader-db results:

GM45 (0x2A42):
total instructions in shared programs: 3548093 -> 3545804 (-0.06%)
instructions in affected programs:     213889 -> 211600 (-1.07%)
helped:                                543
HURT:                                  2

Iron Lake (0x0046):
total instructions in shared programs: 4978454 -> 4975982 (-0.05%)
instructions in affected programs:     226063 -> 223591 (-1.09%)
helped:                                597
HURT:                                  5

Sandy Bridge (0x0116):
total instructions in shared programs: 6806814 -> 6803487 (-0.05%)
instructions in affected programs:     447885 -> 444558 (-0.74%)
helped:                                1612
HURT:                                  36

Sandy Bridge (0x0116) NIR:
total instructions in shared programs: 6813527 -> 6811992 (-0.02%)
instructions in affected programs:     329020 -> 327485 (-0.47%)
helped:                                1002
HURT:                                  196

Ivy Bridge (0x0166):
total instructions in shared programs: 6283080 -> 6279862 (-0.05%)
instructions in affected programs:     421859 -> 418641 (-0.76%)
helped:                                1592
HURT:                                  36

Ivy Bridge (0x0166) NIR:
total instructions in shared programs: 6319944 -> 6317148 (-0.04%)
instructions in affected programs:     303221 -> 300425 (-0.92%)
helped:                                998
HURT:                                  176
GAINED:                                4

Haswell (0x0426):
total instructions in shared programs: 5766971 -> 5764623 (-0.04%)
instructions in affected programs:     382796 -> 380448 (-0.61%)
helped:                                1559
HURT:                                  63

Haswell (0x0426) NIR:
total instructions in shared programs: 5793258 -> 5792647 (-0.01%)
instructions in affected programs:     276929 -> 276318 (-0.22%)
helped:                                837
HURT:                                  343
GAINED:                                4

Broadwell (0x162E):
total instructions in shared programs: 6813995 -> 6811377 (-0.04%)
instructions in affected programs:     469734 -> 467116 (-0.56%)
helped:                                1772
HURT:                                  78
LOST:                                  1

Broadwell (0x162E) NIR:
total instructions in shared programs: 7009761 -> 7009142 (-0.01%)
instructions in affected programs:     298433 -> 297814 (-0.21%)
helped:                                866
HURT:                                  373

Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
---
 src/glsl/opt_algebraic.cpp | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 69c03ea..837b080 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -560,6 +560,28 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
             }
          }
       }
+
+      /* If one of the multiplicands is an ir_unop_b2f, we can convert the
+       * multiply to a simple csel.
+       *
+       * x * b2f(condition) => mix( 0, x, condition)
+       */
+      for (unsigned i = 0; i < 2; i++) {
+         if (op_expr[i] == NULL)
+            continue;
+
+         if (op_expr[i]->operation != ir_unop_b2f)
+            continue;
+
+         /* swizzle_if_required is necessary on both operands.  The b2f could
+          * be a scalar (common) with the other a vector, or the b2f could be
+          * a vector with the other a scalar (as in piglit's
+          * arb_texture_buffer_object-formats test).
+          */
+         return csel(swizzle_if_required(ir, op_expr[i]->operands[0]),
+                     swizzle_if_required(ir, ir->operands[i ^ 1]),
+                     ir_constant::zero(ir, ir->type));
+      }
       break;
 
    case ir_binop_div:
-- 
2.1.0

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev