On 11.12.2015 18:09, Ganesh Ajjanagadde wrote:
> On Fri, Dec 11, 2015 at 11:36 AM, Andreas Cadhalpun
> <andreas.cadhal...@googlemail.com> wrote:
>> On 11.12.2015 17:21, Ganesh Ajjanagadde wrote:
>>> On Fri, Dec 11, 2015 at 11:16 AM, Andreas Cadhalpun
>>> <andreas.cadhal...@googlemail.com> wrote:
>>>> On 19.11.2015 14:17, Michael Niedermayer wrote:
>>>>> From: Michael Niedermayer <mich...@niedermayer.cc>
>>>>>
>>>>> Signed-off-by: Michael Niedermayer <mich...@niedermayer.cc>
>>>>> ---
>>>>>  libavcodec/aacsbr.c | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/libavcodec/aacsbr.c b/libavcodec/aacsbr.c
>>>>> index d1e3a91..e014646 100644
>>>>> --- a/libavcodec/aacsbr.c
>>>>> +++ b/libavcodec/aacsbr.c
>>>>> @@ -73,6 +73,7 @@ static void sbr_dequant(SpectralBandReplication *sbr, int id_aac)
>>>>>  {
>>>>>      int k, e;
>>>>>      int ch;
>>>>> +    //TODO: Replace exp2f(0.5*x) by a LUT, the inputs are all integer and have a small range
>>>>>
>>>>>      if (id_aac == TYPE_CPE && sbr->bs_coupling) {
>>>>>          float alpha = sbr->data[0].bs_amp_res ? 1.0f : 0.5f;
>>>>>
>>>>
>>>> This shouldn't hurt, with or without the clarification requested by Ganesh.
>>>
>>> I am doing related work cleaning up and optimizing usages of slow libm
>>> functions such as pow and exp2. Do you know the exact possible range
>>> of the inputs x, and if so, can it be added to the comment? That will
>>> be very helpful for me to come up with a patch. Thanks.
>>
>> The exp2f expressions are:
>> exp2f(sbr->data[0].env_facs_q[e][k] * alpha + 7.0f);
>> exp2f((pan_offset - sbr->data[1].env_facs_q[e][k]) * alpha);
>> exp2f(NOISE_FLOOR_OFFSET - sbr->data[0].noise_facs_q[e][k] + 1);
>> exp2f(12 - sbr->data[1].noise_facs_q[e][k]);
>> exp2f(alpha * sbr->data[ch].env_facs_q[e][k] + 6.0f);
>> exp2f(NOISE_FLOOR_OFFSET - sbr->data[ch].noise_facs_q[e][k]);
>>
>> Here alpha is 1 or 0.5, pan_offset 12 or 24 and NOISE_FLOOR_OFFSET is 6.
>> After patch 3 of this series, env_facs_q is in the range from 0 to 127 and
>> noise_facs_q is already limited to the range from 0 to 30.
>>
>> So x should always be in the range -300..300, or so.
>
> Very good, thanks a lot.
>
> Based on the above range, my idea is to not even use a LUT, but use
> something like exp2fi followed by multiplication by M_SQRT2 depending
> on even or odd. This will not bloat the binary, but is still very fast
> and avoids huge variability in performance. That should provide a good
> baseline (see jpeg2000 for this idea), further tweaks can be done (e.g
> using an exp2i, i.e double precision to avoid possible branching for
> the overflow/underflow cases).
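
For reference, the even/odd idea quoted above could look roughly like the
sketch below. This is only an illustration under assumptions: the helper
name exp2_half_int is hypothetical and not an existing FFmpeg function,
and it uses the standard ldexpf from libm as a stand-in for exp2fi rather
than the jpeg2000 implementation. It computes 2^(x/2) for an integer x,
which corresponds to the alpha = 0.5 case; with alpha = 1 the exponent is
already an integer and the power-of-two step alone would be enough.

```c
#include <math.h>

#ifndef M_SQRT2
#define M_SQRT2 1.41421356237309504880 /* sqrt(2), in case libm does not define it */
#endif

/*
 * Hypothetical helper (not FFmpeg API): returns 2^(x / 2) for integer x,
 * i.e. the value exp2f(0.5f * x) would produce, without calling exp2f.
 */
static float exp2_half_int(int x)
{
    int odd  = x % 2 != 0;           /* 1 for odd x, also for negative odd x */
    int half = (x - odd) / 2;        /* floor(x / 2); exact because x - odd is even */
    float r  = ldexpf(1.0f, half);   /* exact power of two: 2^half */

    /* Odd x needs one extra factor of 2^(1/2). */
    return odd ? r * (float)M_SQRT2 : r;
}
```

With such a helper, exp2_half_int(14) would stand in for exp2f(0.5f * 14).
Exponents outside the float range still overflow to infinity or underflow
to zero, as they do with exp2f, so the input range discussed above matters
either way.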
That sounds good. :)

> Maybe exp2i and/or exp2fi could be moved to avutil/internal or more
> appropriately avcodec/internal as they have utility in this, jpeg2000,
> and at least one other place in avcodec (which I can't recall).

Moving to avcodec/internal should be fine.

> Will be addressed in a week or so, unless someone does it before then.
> This is very quick to do, and so this patch may not be needed.

Indeed, no need to add the comment, if it's removed a week later.

Best regards,
Andreas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel