On Sun, Mar 06, 2016 at 03:49:00PM -0300, James Almer wrote:
> On 3/6/2016 3:35 PM, Reimar Döffinger wrote:
> > Approximately 10% faster transcode from mp3 to aac
> > with default settings.
> >
> > Signed-off-by: Reimar Döffinger <reimar.doeffin...@gmx.de>
> > ---
> >  libavcodec/aacenc_utils.h | 47 ++++++++++++++++++++++++++++++++++++++---------
> >  1 file changed, 38 insertions(+), 9 deletions(-)
> >
> > diff --git a/libavcodec/aacenc_utils.h b/libavcodec/aacenc_utils.h
> > index b9bd6bf..1639021 100644
> > --- a/libavcodec/aacenc_utils.h
> > +++ b/libavcodec/aacenc_utils.h
> > @@ -36,15 +36,29 @@
> >  #define ROUND_TO_ZERO 0.1054f
> >  #define C_QUANT 0.4054f
> >
> > +#define ABSPOW(inv, outv) \
> > +do { \
> > +    float a = (inv); \
> > +    a = fabsf(a); \
> > +    (outv) = sqrtf(a * sqrtf(a)); \
> > +} while(0)
> > +
> >  static inline void abs_pow34_v(float *out, const float *in, const int size)
> >  {
> >      int i;
> > -    for (i = 0; i < size; i++) {
> > -        float a = fabsf(in[i]);
> > -        out[i] = sqrtf(a * sqrtf(a));
> > +    for (i = 0; i < size - 3; i += 4) {
> > +        ABSPOW(in[i], out[i]);
> > +        ABSPOW(in[i+1], out[i+1]);
> > +        ABSPOW(in[i+2], out[i+2]);
> > +        ABSPOW(in[i+3], out[i+3]);
> > +    }
>
> Are you sure this wasn't vectorized already? I remember i checked and it mostly
> was, at least on gcc 5.3 mingw-w64 with default settings.

Then it would hardly get 10% faster, would it (though I admit I didn't
test the two parts separately)?
But I am fairly sure that before the patch it only used sqrtss
instructions and not sqrtps.