On Thu, Feb 25, 2016 at 7:25 AM, Nicolas George <geo...@nsup.org> wrote:
> On sextidi 6 Ventôse, year CCXXIV, Ganesh Ajjanagadde wrote:
>> Actual performance benefit is impossible to accurately quantify due to the
>> context-dependence of the branch predictor. Nonetheless, as a ballpark
>> estimate, it yields ~ 5% improvements in testing via FATE on x86-64,
>> Haswell+GCC.
>
> Five percent is huge, if it is five percent of something relevant. Can you
> give a few more details?

OK, here are the benchmarks I obtained from make fate lavf-ffm, with
START/STOP around the body of av_d2q. Platform: Haswell+GCC under
-march=native. Unfortunately, a naive stream_loop does not increase the
iteration count. As for av_d2q's relevance: IIRC, the maximum call count on
any FATE test is 16 or 32.

old:
  74040 decicycles in av_d2q, 1 runs, 0 skips
  39480 decicycles in av_d2q, 2 runs, 0 skips
  21385 decicycles in av_d2q, 4 runs, 0 skips
  [...]
  13235 decicycles in av_d2q, 8 runs, 0 skips
  [...]
  7571 decicycles in av_d2q, 16 runs, 0 skips

new:
  68660 decicycles in av_d2q, 1 runs, 0 skips
  36410 decicycles in av_d2q, 2 runs, 0 skips
  19815 decicycles in av_d2q, 4 runs, 0 skips
  [...]
  12355 decicycles in av_d2q, 8 runs, 0 skips
  [...]
  7080 decicycles in av_d2q, 16 runs, 0 skips

>
> Regards,
>
> --
> Nicolas George
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
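
For a rough sense of what the old/new figures above amount to, here is a minimal sketch that computes the relative saving at each reported run count. It only does arithmetic on the numbers quoted in this mail; the names old, new, relative_saving and savings are made up for illustration, and nothing here is part of FFmpeg or of the actual measurement setup.

```python
# Decicycle figures copied from the old/new listings in this mail,
# keyed by the "runs" count printed on each line.
old = {1: 74040, 2: 39480, 4: 21385, 8: 13235, 16: 7571}
new = {1: 68660, 2: 36410, 4: 19815, 8: 12355, 16: 7080}


def relative_saving(before, after):
    """Fraction of decicycles saved going from `before` to `after`."""
    return (before - after) / before


# One relative saving per reported run count.
savings = {runs: relative_saving(old[runs], new[runs]) for runs in old}

for runs, saving in savings.items():
    print(f"{runs:2d} runs: {100 * saving:.1f}% fewer decicycles")
```

On these figures the saving comes out between roughly 6.5% and 7.8% per line, which is the same order of magnitude as the ~5% ballpark. With at most 16 runs per test this is still a small sample, so it should be read as indicative only.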