On 01/28/2015 11:51 AM, Matt Turner wrote:
> On Wed, Jan 28, 2015 at 11:20 AM, Ian Romanick <i...@freedesktop.org> wrote:
>> On 01/28/2015 10:31 AM, Matt Turner wrote:
>>> Note: this will round differently for x.5 where x is even.
>>>
>>> total instructions in shared programs: 5953897 -> 5948654 (-0.09%)
>>> instructions in affected programs:     88619 -> 83376 (-5.92%)
>>> helped:                                696
>>> ---
>>> If we implemented round() differently from roundEven(), we should
>>> use it instead.
>>>
>>> (mul (floor (add (abs x) 0.5) (sign x))) is 6 i965 instructions.
>>> (roundEven x) is 1 instruction.
>>>
>>> Most shaders with this pattern wrap it in int(...), which increases
>>> the counts by one, to 7 and 2 respectively.
>>>
>>> Alternatively, we could optimize this as
>>>
>>> (trunc (add f (mul 0.5 (sign f)))), which would be 6 instructions,
>>> and the int() conversion would be free. We could also apply f's sign
>>> to 0.5 in two instructions, cutting the total to 4.
>>>
>>> What do you think? Should we do precisely as they say? All but two
>>> of the affected shaders seem to be translated from DX.
>>>
>>>  src/glsl/opt_algebraic.cpp | 32 ++++++++++++++++++++++++++++++++
>>>  1 file changed, 32 insertions(+)
>>>
>>> diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
>>> index c6f4a9c..eaa5f47 100644
>>> --- a/src/glsl/opt_algebraic.cpp
>>> +++ b/src/glsl/opt_algebraic.cpp
>>> @@ -514,6 +514,38 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
>>>        if (op_const[1] && !op_const[0])
>>>           reassociate_constant(ir, 1, op_const[1], op_expr[0]);
>>>
>>> +      /* Optimizes
>>> +       *
>>> +       * (mul (floor (add (abs x) 0.5) (sign x)))
>>
>> If I'm not mistaken, this isn't round-to-even. Doesn't this round 4.5
>> to 5? roundEven(4.5) should be 4. This looks like "half-up" rounding.
>> Which is very different. See
>> http://userguide.icu-project.org/formatparse/numbers/rounding-modes
>
> Isn't this what I said?
>
> I've suggested a way to cut a six instruction sequence to one, with
> the caveat that it doesn't do the right thing for x.5 where x is even.

*blush* I missed the tiny commit message for the much larger addendum
message.

> What I'm asking is whether we suspect that they specifically want
> half-up behavior (speculation, so not likely insightful), or if
> there's a way we can emulate round-half-up behavior using round-even
> in fewer than four instructions.

They may or may not want half-up or round-even or something else. If we
change it, someone will see different pixels, and they will probably
report a bug. It seems better to play it safe.

I think

    round(x + (intToFloatBits(floatToIntBits(.5) | (floatToIntBits(x) & 0x80000000))))

should produce the same result. That should be 4 instructions, I think.
Were you thinking of using CSEL for your 4 instruction version?

Since this is in common code, we have to be careful about how this will
affect drivers that don't support bit-wise operations.
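
A minimal sketch of the sign-transfer idea discussed above, written in plain C++ rather than GLSL IR so it can be compiled and checked on a CPU. All function names here (round_half_away_reference, round_half_away_bits, round_even, float_to_bits, bits_to_float) are made up for illustration and are not Mesa or GLSL functions. The sketch truncates the biased sum, following the quoted (trunc (add f (mul 0.5 (sign f)))) form, because rounding the sum a second time would push 4.2 up to 5. It also assumes the default floating-point rounding mode, under which std::nearbyint rounds halfway cases to even.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a float's bits as an integer (hypothetical helper).
uint32_t float_to_bits(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Reinterpret an integer's bits as a float (hypothetical helper).
float bits_to_float(uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// The pattern the patch matches: floor(abs(x) + 0.5) * sign(x),
// where sign(x) is -1, 0, or +1 as in GLSL.
float round_half_away_reference(float x) {
    float s = static_cast<float>((x > 0.0f) - (x < 0.0f));
    return std::floor(std::fabs(x) + 0.5f) * s;
}

// Sign-transfer variant: copy x's sign bit onto 0.5 (one AND, one OR),
// add the result to x, then truncate toward zero.
float round_half_away_bits(float x) {
    uint32_t signed_half =
        float_to_bits(0.5f) | (float_to_bits(x) & 0x80000000u);
    return std::trunc(x + bits_to_float(signed_half));
}

// Stand-in for roundEven(): assumes the default FE_TONEAREST mode.
float round_even(float x) {
    return std::nearbyint(x);
}
```

For x = 4.5 the first two functions return 5 while round_even returns 4, which is the x.5-with-even-x difference the commit message warns about. For x = 3.5 all three agree on 4.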