On Sat, Dec 12, 2015 at 01:08:01PM -0500, Ganesh Ajjanagadde wrote:
> On Sat, Dec 12, 2015 at 12:58 PM, Michael Niedermayer <michae...@gmx.at> wrote:
> > On Fri, Dec 11, 2015 at 12:09:57PM -0500, Ganesh Ajjanagadde wrote:
> >> On Fri, Dec 11, 2015 at 11:36 AM, Andreas Cadhalpun
> >> <andreas.cadhal...@googlemail.com> wrote:
> >> > On 11.12.2015 17:21, Ganesh Ajjanagadde wrote:
> >> >> On Fri, Dec 11, 2015 at 11:16 AM, Andreas Cadhalpun
> >> >> <andreas.cadhal...@googlemail.com> wrote:
> >> >>> On 19.11.2015 14:17, Michael Niedermayer wrote:
> >> >>>> From: Michael Niedermayer <mich...@niedermayer.cc>
> >> >>>>
> >> >>>> Signed-off-by: Michael Niedermayer <mich...@niedermayer.cc>
> >> >>>> ---
> >> >>>>  libavcodec/aacsbr.c | 1 +
> >> >>>>  1 file changed, 1 insertion(+)
> >> >>>>
> >> >>>> diff --git a/libavcodec/aacsbr.c b/libavcodec/aacsbr.c
> >> >>>> index d1e3a91..e014646 100644
> >> >>>> --- a/libavcodec/aacsbr.c
> >> >>>> +++ b/libavcodec/aacsbr.c
> >> >>>> @@ -73,6 +73,7 @@ static void sbr_dequant(SpectralBandReplication *sbr, int id_aac)
> >> >>>>  {
> >> >>>>      int k, e;
> >> >>>>      int ch;
> >> >>>> +    //TODO: Replace exp2f(0.5*x) by a LUT, the inputs are all integer and have a small range
> >> >>>>
> >> >>>>      if (id_aac == TYPE_CPE && sbr->bs_coupling) {
> >> >>>>          float alpha = sbr->data[0].bs_amp_res ? 1.0f : 0.5f;
> >> >>>>
> >> >>>
> >> >>> This shouldn't hurt, with or without the clarification requested by
> >> >>> Ganesh.
> >> >>
> >> >> I am doing related work cleaning up and optimizing usages of slow libm
> >> >> functions such as pow and exp2. Do you know the exact possible range
> >> >> of the inputs x, and if so, can it be added to the comment? That will
> >> >> be very helpful for me to come up with a patch. Thanks.
> >> >
> >> > The exp2f expressions are:
> >> > exp2f(sbr->data[0].env_facs_q[e][k] * alpha + 7.0f);
> >> > exp2f((pan_offset - sbr->data[1].env_facs_q[e][k]) * alpha);
> >> > exp2f(NOISE_FLOOR_OFFSET - sbr->data[0].noise_facs_q[e][k] + 1);
> >> > exp2f(12 - sbr->data[1].noise_facs_q[e][k]);
> >> > exp2f(alpha * sbr->data[ch].env_facs_q[e][k] + 6.0f);
> >> > exp2f(NOISE_FLOOR_OFFSET - sbr->data[ch].noise_facs_q[e][k]);
> >> >
> >> > Here alpha is 1 or 0.5, pan_offset 12 or 24 and NOISE_FLOOR_OFFSET is 6.
> >> > After patch 3 of this series, env_facs_q is in the range from 0 to 127 and
> >> > noise_facs_q is already limited to the range from 0 to 30.
> >> >
> >> > So x should always be in the range -300..300, or so.
> >>
> >> Very good, thanks a lot.
> >>
> >> Based on the above range, my idea is to not even use a LUT, but use
> >> something like exp2fi followed by multiplication by M_SQRT2 depending
> >> on even or odd.
> >
> > conditional operations can due to branch misprediction be potentially
> > rather slow
>
> I think it will still be far faster than exp2f, and in the absence of
> hard numbers, I view this a far better approach than a large (~300
> element) lut. Of course, the proof and extent of this will need to
> wait for actual benches.
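
For illustration only, the even/odd idea quoted above could be sketched
roughly as follows. This is not code from the thread or from FFmpeg: the
name exp2_half() is hypothetical, ldexpf() from <math.h> stands in for
whatever integer exp2 helper would actually be used, and x is assumed to
be the integer argument of exp2f(0.5*x). The sqrt(2) factor is picked
from a two-entry table rather than with a conditional, which is one way
to sidestep the branch misprediction concern.

```c
#include <math.h>

/* Hypothetical helper: computes 2^(x/2) for integer x,
 * i.e. the same value as exp2f(0.5f * x). */
static float exp2_half(int x)
{
    /* Indexed by parity: 1 for even x, sqrt(2) for odd x. */
    static const float scale[2] = { 1.0f, 1.41421356237f };

    /* Parity that is 0 or 1 for negative x as well
     * (x % 2 alone can be -1 in C99 and later). */
    int odd  = ((x % 2) + 2) % 2;
    /* x - odd is even, so this division is exact: half = floor(x / 2). */
    int half = (x - odd) / 2;

    /* scale[odd] * 2^half, done as an exponent adjustment. */
    return ldexpf(scale[odd], half);
}
```

With this sketch exp2_half(6) gives 8.0f and exp2_half(-2) gives 0.5f;
whether it actually beats a table would still need the benchmarks
mentioned above.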
alternatively one could do a
if (x+A < (unsigned)B)
    LUT[x+A]
else
    exp2whatever(x)

the range in practice should be much smaller than +-300

also the LUT can possibly be shared between codecs
or that code could be in a exp_sqrt2i() or something
just some random ideas...

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Complexity theory is the science of finding the exact solution to an
approximation. Benchmarking OTOH is finding an approximation of the exact
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel