On Sun, Oct 18, 2015 at 11:17 AM, wm4 <nfx...@googlemail.com> wrote: > On Sun, 18 Oct 2015 11:06:57 -0400 > Ganesh Ajjanagadde <gajja...@mit.edu> wrote: > >> On Sun, Oct 18, 2015 at 11:01 AM, wm4 <nfx...@googlemail.com> wrote: >> > On Sun, 18 Oct 2015 10:47:52 -0400 >> > Ganesh Ajjanagadde <gajjanaga...@gmail.com> wrote: >> > >> >> Commit e11e32686fdb21aded1ccf70202f1fffe87bb6a2 explains why replacing >> >> qsort with AV_QSORT yields performance improvements. >> >> >> >> This replaces all existing uses of libc's qsort with AV_QSORT. >> >> >> >> Benchmarks deemed unnecessary due to existing claims about AV_QSORT. >> >> Tested with FATE. >> >> >> >> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com> >> >> --- >> >> cmdutils.c | 3 ++- >> >> cmdutils_opencl.c | 3 ++- >> >> ffmpeg.c | 3 ++- >> >> libavcodec/aacsbr_template.c | 14 ++++++++------ >> >> libavcodec/huffman.c | 3 ++- >> >> libavcodec/motion_est_template.c | 3 ++- >> >> libavcodec/utvideodec.c | 4 ++-- >> >> libavcodec/utvideoenc.c | 5 +++-- >> >> libavfilter/f_sendcmd.c | 3 ++- >> >> libavfilter/vf_deshake.c | 3 ++- >> >> libavfilter/vf_palettegen.c | 2 +- >> >> libavfilter/vf_paletteuse.c | 2 +- >> >> libavfilter/vf_removegrain.c | 7 ++++--- >> >> libavformat/subtitles.c | 10 +++++++--- >> >> libswresample/swresample-test.c | 3 ++- >> >> tests/checkasm/checkasm.c | 4 ++-- >> >> 16 files changed, 44 insertions(+), 28 deletions(-) >> > >> > By how much does this increase binary code size? >> > >> > Is it really faster? (libc qsort() could use a better algorithm, >> > even if it has to go through indirections.) >> >> Michael has already shown that AV_QSORT is faster. libc's qsort can >> easily be looked up: Michael already uses most of the tricks (usage of >> stack to avoid recursion, etc). The main benefits come from the >> inlining. > > Where?
The inlining of the comparison callback. > > It might depend on the actual data to be sorted. Is the data completely > random? Is it already sorted in most cases? etc. True, but Michael's implementation and libc's do not do any tricks for "almost completely sorted". The time is dominated by the calls to the comparator. This is well known, and is why C++'s std::sort is roughly 5x faster on basic types. > > I'm not fond of such frivolous mass-changes. > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel