The formula we have used in the past is a trivial reduction of the
definition: it simply multiplies both the numerator and the denominator by
2.  Multiplying both by e^x as well reduces it further, to
(e^2x - 1) / (e^2x + 1).  This lets us drop one side of the clamp and two
of the four exponential evaluations, which should make it faster.  The new
formula still passes the dEQP precision tests for tanh, so it should be
fine.
---
 src/compiler/glsl/builtin_functions.cpp | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)
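
For anyone who wants to check the algebra outside the compiler, here is a
minimal standalone C++ sketch. It is not part of the patch. It assumes
32-bit float arithmetic, and the helper names tanh_old and tanh_new are
hypothetical stand-ins that mirror the expressions before and after this
change; the real code builds GLSL IR rather than calling std::exp.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical stand-in for the old lowering: clamp x to [-10, +10], then
// evaluate (e^t - e^(-t)) / (e^t + e^(-t)), which needs four exp() calls.
inline float tanh_old(float x)
{
   float t = std::min(std::max(x, -10.0f), 10.0f);
   return (std::exp(t) - std::exp(-t)) / (std::exp(t) + std::exp(-t));
}

// Hypothetical stand-in for the new lowering: clamp x only from above, then
// evaluate (e^(2t) - 1) / (e^(2t) + 1).  The exponential is computed once
// here for brevity; the patch itself emits the exp expression twice.
inline float tanh_new(float x)
{
   float t = std::min(x, 10.0f);
   float e2t = std::exp(2.0f * t);
   return (e2t - 1.0f) / (e2t + 1.0f);
}
```

Comparing tanh_new(x) against std::tanh(x) for a few inputs such as -1.0f,
0.5f and 100.0f makes a quick sanity check. It is no substitute for the
dEQP precision tests mentioned above.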

diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
index 3dead1a..94e8279 100644
--- a/src/compiler/glsl/builtin_functions.cpp
+++ b/src/compiler/glsl/builtin_functions.cpp
@@ -3563,17 +3563,19 @@ builtin_builder::_tanh(const glsl_type *type)
    ir_variable *x = in_var(type, "x");
    MAKE_SIG(type, v130, 1, x);
 
-   /* Clamp x to [-10, +10] to avoid precision problems.
-    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
-    * zero in the computation e^x + e^(-x). The same happens in the other
-    * direction when x < -10.
+   /* tanh(x) := (0.5 * (e^x - e^(-x))) / (0.5 * (e^x + e^(-x)))
+    *
+    * With a little algebra this reduces to (e^2x - 1) / (e^2x + 1)
+    *
+    * Clamp x to (-inf, +10] to avoid precision problems.  When x > 10, e^x is
+    * so much larger than 1.0 that 1.0 gets flushed to zero in the computation
+    * e^x +- 1 so it can be ignored.
     */
    ir_variable *t = body.make_temp(type, "tmp");
-   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
+   body.emit(assign(t, min2(x, imm(10.0f))));
 
-   /* (e^x - e^(-x)) / (e^x + e^(-x)) */
-   body.emit(ret(div(sub(exp(t), exp(neg(t))),
-                     add(exp(t), exp(neg(t))))));
+   body.emit(ret(div(sub(exp(mul(t, imm(2.0f))), imm(1.0f)),
+                     add(exp(mul(t, imm(2.0f))), imm(1.0f)))));
 
    return sig;
 }
-- 
2.5.0.400.gff86faf

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev