Hi,

On Fri, Mar 25, 2016 at 3:55 PM, Ganesh Ajjanagadde <gajja...@gmail.com> wrote:
> On Fri, Mar 25, 2016 at 12:11 PM, Paul B Mahol <one...@gmail.com> wrote:
> > On 3/25/16, Ganesh Ajjanagadde <gajja...@gmail.com> wrote:
> >> On Fri, Mar 25, 2016 at 9:36 AM, Nicolas George <geo...@nsup.org> wrote:
> >>> On sextidi 6 Germinal, year CCXXIV, Ganesh Ajjanagadde wrote:
> >>>> Depends on if it is small or not. Yes, in many codecs, FFT's are short
> >>>> length ones, e.g 512. However, on long lengths, e.g 8192+, as seen
> >>>> from the benches, there are sometimes 2x variations at the moment.
> >>>
> >>> And how much of the actual total decoding is spent in the FFT? Even a *50
> >>> speedup would be useless if it is for a function that never amounts to more
> >>> than 0,01% of the actual time. The FFT is probably not that negligible, but
> >>> this is not a *50 speedup either, and I have no idea how frequent are long
> >>> lengths.
> >>
> >> Paul had some interest in 2^17 fft's at a point.
> >
> > And it was done in avfft. So feel free to improve our avfft instead.
>
> Just to be clear: I won't be working on improving avfft, but of course
> I won't oppose patches generally.
>
> Basically, it boils down to the current asm code being a mess

"I don't understand the code" is not the same as "the code is a mess". The
code is not a mess, it's highly optimized and you could ask the person that
wrote it (see copyright line) for details instead of loudly complaining that
you can't understand it.

> I can't even identify really what algo is being used.

Check the C code.

> If anyone cares here, I do not know why we can't use inline asm or
> intrinsics. The chief benefit of intrinsics is that the hard part (for
> humans) of register allocation/checks is taken care of, but optimized
> instructions get used in a readable fashion. Anyway, I know there are
> a ton of arguments against it in FFmpeg, almost none of which I buy in
> 2016 with modern toolchains.
>

Inline asm doesn't solve any problem you just mentioned; in fact it makes
things worse because it doesn't work on half of our supported compilers
(e.g. MSVC). Intrinsics generally perform worse (or at the very least
"inconsistently") than the same sequence of operations written out in
hand-written asm.

If you're interested in getting actual help in learning "hand-written"
assembly, let us know and we'll help you move on.

Ronald
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
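
For reference, Nicolas's question earlier in the thread (how much of the
total decoding time is spent in the FFT) is Amdahl's law. Below is a minimal
sketch in C. The names overall_speedup and print_speedup and any fractions
passed to them are made up for illustration; they are not FFmpeg functions
or measurements of FFmpeg.

```c
#include <stdio.h>

/* Amdahl's law: overall speedup when a fraction `fraction` (0.0 to 1.0) of
 * the run time is made `factor` times faster and the rest is unchanged.
 * Hypothetical helper for illustration, not part of FFmpeg. */
double overall_speedup(double fraction, double factor)
{
    return 1.0 / ((1.0 - fraction) + fraction / factor);
}

/* Print the overall gain for an assumed FFT share of the decoding time. */
void print_speedup(double fraction, double factor)
{
    printf("FFT share %.2f%%, FFT %.0fx faster: decode %.4fx faster\n",
           fraction * 100.0, factor, overall_speedup(fraction, factor));
}
```

Calling print_speedup(0.0001, 50.0) covers the 0,01% case from the thread:
the overall gain is about 1.0001x. With an assumed 30% FFT share and a 2x
faster FFT, print_speedup(0.3, 2.0) gives roughly 1.18x. Which case applies
to a given codec would have to come from profiling.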