[FFmpeg-devel] [PATCHv2] lavc/aacenc_utils: replace powf(x, y) by expf(logf(x) * y)

Ganesh Ajjanagadde Sat, 12 Mar 2016 08:41:04 -0800

This is ~2x faster for non-integer y on Haswell+GCC, and should generally
be faster elsewhere too, since powf essentially does the same thing under
the hood. Added an inline function, ff_fast_pow, to lavu/internal.h for
this purpose.


Note that there are some accuracy differences, which should generally be
negligible. In particular, FATE still passes on this platform.

Results in a ~7% speedup in AAC encoding with -march=native on Haswell+GCC.
before:
ffmpeg -i sin.flac -acodec aac -y sin_new.aac  6.05s user 0.06s system 104% cpu 5.821 total

after:
ffmpeg -i sin.flac -acodec aac -y sin_new.aac  5.67s user 0.03s system 105% cpu 5.416 total

This is also faster than an alternative approach that pulls in the powf
implementation, strips the crufty NaN checks and other special cases, and
exploits knowledge about the input intervals.
This of course does not rule out smarter approaches; it just suggests that
they would need significant work, which is of lower utility than
searching for hotspots elsewhere.

Reviewed-by: Reimar Döffinger <[email protected]>
Reviewed-by: Ronald S. Bultje <[email protected]>
Signed-off-by: Ganesh Ajjanagadde <[email protected]>
---
 libavcodec/aacenc_utils.h |  6 +++++-
 libavutil/internal.h      | 16 ++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/libavcodec/aacenc_utils.h b/libavcodec/aacenc_utils.h
index 41a6296..28ea5cd 100644
--- a/libavcodec/aacenc_utils.h
+++ b/libavcodec/aacenc_utils.h
@@ -28,6 +28,7 @@
 #ifndef AVCODEC_AACENC_UTILS_H
 #define AVCODEC_AACENC_UTILS_H
 
+#include "libavutil/internal.h"
 #include "aac.h"
 #include "aacenctab.h"
 #include "aactab.h"
@@ -122,7 +123,10 @@ static inline float find_form_factor(int group_len, int swb_size, float thresh,
             if (s >= ethresh) {
                 nzl += 1.0f;
             } else {
-                nzl += powf(s / ethresh, nzslope);
+                if (nzslope == 2.f)
+                    nzl += (s / ethresh) * (s / ethresh);
+                else
+                    nzl += ff_fast_pow(s / ethresh, nzslope);
             }
         }
         if (e2 > thresh) {
diff --git a/libavutil/internal.h b/libavutil/internal.h
index da76ca2..aa43754 100644
--- a/libavutil/internal.h
+++ b/libavutil/internal.h
@@ -315,6 +315,22 @@ static av_always_inline float ff_exp10f(float x)
 }
 
 /**
+ * Compute x^y for floating point x, y. Note: this function is faster than the
+ * libm variant mainly for 2 reasons:
+ * 1. It does not handle any edge cases. In particular, this is only guaranteed
+ * to work correctly for x > 0.
+ * 2. It is not as accurate as a standard nearly "correctly rounded" libm variant.
+ * @param x base
+ * @param y exponent
+ * @return x^y
+ */
+static av_always_inline float ff_fast_pow(float x, float y)
+{
+    return expf(logf(x) * y);
+}
+
+
+/**
  * A wrapper for open() setting O_CLOEXEC.
  */
 av_warn_unused_result
-- 
2.7.2

_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
