Unsurprisingly, the formula looks great to me :-). I was actually wondering about accuracy, though. I believe the biggest issue (with both the original formula and this one) is probably values around zero, because the result there gets calculated as (~1 - 1) / 2. So the closest values to zero you can get (other than zero itself) are ~2^-25, whereas an exact calculation could go down to 2^-127. So maybe the simplified formula might actually be even a bit better there? GLSL seems to be quite lenient about the required exp precision.
In any case,
Reviewed-by: Roland Scheidegger <srol...@vmware.com>

On 09.12.2016 at 18:41, Jason Ekstrand wrote:
> The formula we have used in the past is a trivial reduction from the
> definition by simply multiplying both the numerator and denominator of the
> formula by 2.  However, multiplying by e^x, you can further reduce it.
> This allows us to get rid of one side of the clamp and two of exponential
> functions which should make it faster.  The new formula still passes the
> dEQP precision tests for tanh so it should be fine.
> ---
>  src/compiler/glsl/builtin_functions.cpp | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/src/compiler/glsl/builtin_functions.cpp b/src/compiler/glsl/builtin_functions.cpp
> index 3dead1a..94e8279 100644
> --- a/src/compiler/glsl/builtin_functions.cpp
> +++ b/src/compiler/glsl/builtin_functions.cpp
> @@ -3563,17 +3563,19 @@ builtin_builder::_tanh(const glsl_type *type)
>     ir_variable *x = in_var(type, "x");
>     MAKE_SIG(type, v130, 1, x);
>
> -   /* Clamp x to [-10, +10] to avoid precision problems.
> -    * When x > 10, e^(-x) is so small relative to e^x that it gets flushed to
> -    * zero in the computation e^x + e^(-x). The same happens in the other
> -    * direction when x < -10.
> +   /* tanh(x) := (0.5 * (e^x - e^(-x))) / (0.5 * (e^x + e^(-x)))
> +    *
> +    * With a little algebra this reduces to (e^2x - 1) / (e^2x + 1)
> +    *
> +    * Clamp x to (-inf, +10] to avoid precision problems.  When x > 10, e^x is
> +    * so much larger than 1.0 that 1.0 gets flushed to zero in the computation
> +    * e^x +- 1 so it can be ignored.
>      */
>     ir_variable *t = body.make_temp(type, "tmp");
> -   body.emit(assign(t, min2(max2(x, imm(-10.0f)), imm(10.0f))));
> +   body.emit(assign(t, min2(x, imm(10.0f))));
>
> -   /* (e^x - e^(-x)) / (e^x + e^(-x)) */
> -   body.emit(ret(div(sub(exp(t), exp(neg(t))),
> -                     add(exp(t), exp(neg(t))))));
> +   body.emit(ret(div(sub(exp(mul(t, imm(2.0f))), imm(1.0f)),
> +                     add(exp(mul(t, imm(2.0f))), imm(1.0f)))));
>
>     return sig;
>  }
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
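
For reference, the "little algebra" mentioned in the patch comment can be spelled out as follows. This is only a sketch of the reduction, assuming the usual definition of tanh, with the numerator and denominator both multiplied by e^x:

```latex
\[
\tanh(x)
  = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}
  = \frac{e^{x}\,\bigl(e^{x} - e^{-x}\bigr)}{e^{x}\,\bigl(e^{x} + e^{-x}\bigr)}
  = \frac{e^{2x} - 1}{e^{2x} + 1}
\]
\[
x \to -\infty:\quad e^{2x} \to 0
  \;\Rightarrow\; \frac{e^{2x} - 1}{e^{2x} + 1} \to \frac{-1}{1} = -1
\]
\[
x \ge 10:\quad e^{2x} \ge e^{20} \approx 4.85 \times 10^{8} > 2^{24}
  \;\Rightarrow\; e^{2x} \pm 1 \text{ rounds to } e^{2x} \text{ in single precision, so the quotient is } 1
\]
```

The second line is why the lower clamp can apparently be dropped: for very negative x the exponential simply underflows towards zero and the quotient settles at -1 on its own. The upper clamp presumably also keeps e^2x from overflowing to infinity for large x, which would otherwise give inf / inf.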