On Fri, May 03, 2019 at 09:08:57PM +0200, Lynne wrote:
> This commit adds a new API to libavutil to allow for arbitrary transformations
> on various types of data.
> This is a partly new implementation, with the power of two transforms taken
> from libavcodec/fft_template, the 5 and 15-point FFT taken from mdct15, while
> the 3-point FFT was written from scratch.
> The (i)mdct folding code is taken from mdct15 as well, as the mdct_template
> code was somewhat old, messy and not easy to separate.
>
> A notable feature of this implementation is that it allows for 3xM and 5xM
> based transforms, where M is a power of two, e.g. 384, 640, 768, 1280, etc.
> AC-4 uses 3xM transforms while Siren uses 5xM transforms, so the code will
> allow for decoding of such streams.
> A non-exhaustive list of supported sizes:
> 4, 8, 12, 16, 20, 24, 32, 40, 48, 60, 64, 80, 96, 120, 128, 160, 192, 240,
> 256, 320, 384, 480, 512, 640, 768, 960, 1024, 1280, 1536, 1920, 2048, 2560...
>
> The API was designed such that it allows for not only 1D transforms but also
> 2D transforms of certain block sizes. This was partly by accident as the
> stride argument is required for Opus MDCTs, but can be used in the context
> of a 2D transform as well.
> Also, various data types would be implemented eventually as well, such as
> "double" and "int32_t".
>
> The avfft_transform() function is awkward but avoids some other more
> awkward ideas to isolate the private parts of the structure and not
> make them part of the API, as well as reducing call overhead from
> an additional function call.
>
> Some performance comparisons with libfftw3f (SIMD disabled for both):
> 120:
>   22410 decicycles in fftwf_execute, 1024 runs, 0 skips
>   28878 decicycles in compound_fft_15x8, 1024 runs, 0 skips
>
> 128:
>   22003 decicycles in fftwf_execute, 1024 runs, 0 skips
>   23132 decicycles in monolithic_fft_ptwo, 1024 runs, 0 skips
>
> 384:
>   75939 decicycles in fftwf_execute, 1024 runs, 0 skips
>   73973 decicycles in compound_fft_3x128, 1024 runs, 0 skips
>
> 640:
>  104354 decicycles in fftwf_execute, 1024 runs, 0 skips
>  149518 decicycles in compound_fft_5x128, 1024 runs, 0 skips
>
> 768:
>  109323 decicycles in fftwf_execute, 1024 runs, 0 skips
>  164096 decicycles in compound_fft_3x256, 1024 runs, 0 skips
>
> 960:
>  182275 decicycles in fftwf_execute, 1024 runs, 0 skips
>  260288 decicycles in compound_fft_15x64, 1024 runs, 0 skips
>
> 1024:
>  163464 decicycles in fftwf_execute, 1024 runs, 0 skips
>  199686 decicycles in monolithic_fft_ptwo, 1024 runs, 0 skips
>
> With SIMD we should be faster than fftw for 15xM transforms as our fft15 SIMD
> is around 2x faster than theirs, even if our ptwo SIMD is slightly slower.
>
> The goal is to remove the libavcodec/mdct15 code and deprecate the
> libavcodec/avfft interface once aarch64 and x86 SIMD code has been ported.
> It is unlikely that libavcodec/fft will be removed soon as there's much SIMD
> written for exotic or old platforms there, but nevertheless new code
> should use this new API throughout the project.
>
> The implementation passes fate when used in Opus and AAC, and the output
> is identical in ATRAC9 as well.
>
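For readers following the size list, the compound lengths in the quoted message (3xM, 5xM, and the 15xM cases named in the benchmark) can be illustrated with a small length-splitting sketch. This is not code from the patch: split_len() is a hypothetical helper, and the lower bound of 4 on the power-of-two part is only an assumption inferred from the sizes listed above.

```c
#include <stddef.h>

/* Hypothetical helper, NOT part of the proposed libavutil API.
 * Splits len into n * m, where n is one of 1, 3, 5, 15 and m is a
 * power of two. Returns n and stores m in *m_out on success;
 * returns 0 if len cannot be written that way.
 * Assumption: m >= 4, inferred only from the quoted size list. */
int split_len(unsigned len, unsigned *m_out)
{
    unsigned m = 1;

    if (len == 0 || m_out == NULL)
        return 0;

    /* Strip factors of two; what remains is the odd part of len. */
    while ((len & 1u) == 0) {
        len >>= 1;
        m   <<= 1;
    }

    /* Only the odd radices mentioned in the message are accepted. */
    if (len != 1 && len != 3 && len != 5 && len != 15)
        return 0;

    if (m < 4)
        return 0;

    *m_out = m;
    return (int)len;
}
```

Under these assumptions, 960 splits as 15 x 64 and 384 as 3 x 128, which matches the compound_fft_15x64 and compound_fft_3x128 names in the timings, while a length such as 100 (odd part 25) is rejected.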
>  Makefile |    2
>  fft.c    |  791 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  fft.h    |   84 ++++++
>  3 files changed, 877 insertions(+)

breaks build on mips

CC      libavutil/fft.o
src/libavutil/fft.c:47: error: redefinition of typedef ‘AVFFTContext’
src/libavutil/fft.h:25: note: previous declaration of ‘AVFFTContext’ was here
make: *** [libavutil/fft.o] Error 1

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire
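The redefinition error reported above is the usual symptom of a typedef name being declared twice, once in the header and once in the source file. C11 permits repeating an identical typedef, but C99 and older compilers may reject it. A minimal single-file sketch of the common way around it follows, assuming (the patch lines are not shown here) that fft.c line 47 repeats the typedef; the struct fields and sketch_ctx_alloc() are hypothetical.

```c
#include <stdlib.h>

/* What the public header would carry: an opaque forward declaration.
 * This is the only place the typedef name is introduced. */
typedef struct AVFFTContext AVFFTContext;

/* What the .c file would carry: define the struct by its tag only.
 * Writing
 *     typedef struct AVFFTContext { ... } AVFFTContext;
 * here would declare the typedef a second time, which pre-C11
 * compilers are allowed to reject. */
struct AVFFTContext {
    int len;      /* hypothetical field, for illustration only */
    int inverse;  /* hypothetical field, for illustration only */
};

/* Hypothetical allocator showing that the typedef name still works
 * once the struct has been completed by tag. */
AVFFTContext *sketch_ctx_alloc(int len, int inverse)
{
    AVFFTContext *ctx = malloc(sizeof(*ctx));

    if (!ctx)
        return NULL;
    ctx->len     = len;
    ctx->inverse = inverse;
    return ctx;
}
```

Keeping the typedef in the header and only the tagged struct definition in the source file keeps the private layout out of the API while compiling in both C99 and C11 modes.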
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".