On 2017-05-29 23:26, Michael Niedermayer wrote:
> On Mon, May 29, 2017 at 09:40:49PM +0200, James Darnley wrote:
>> On 2017-05-29 16:51, James Darnley wrote:
>>> ---
>>> Changes:
>>>  - Changed type of d40000 constant to dwords because it gets used as dwords.
>>>  - Changed or removed HAVE_MMX_INLINE preprocessor guards.
>>>  - Added note about conversion from inline.
>>>  - New file no longer has "2" suffix.
>>>  - Whitespace (indentation and alignment).
>>>
>>>  libavcodec/tests/x86/dct.c     |   2 +-
>>>  libavcodec/x86/Makefile        |   4 +-
>>>  libavcodec/x86/idctdsp_init.c  |   4 -
>>>  libavcodec/x86/simple_idct.asm | 889 +++++++++++++++++++++++++++++++++++++++
>>>  libavcodec/x86/simple_idct.c   | 929 -----------------------------------------
>>>  5 files changed, 892 insertions(+), 936 deletions(-)
>>>  create mode 100644 libavcodec/x86/simple_idct.asm
>>>  delete mode 100644 libavcodec/x86/simple_idct.c
>>
>> Ronald queried on IRC about the performance.  The libavcodec/tests/dct
>> utility reports these numbers
>>
>> Yorkfield:
>>  - inline:   IDCT SIMPLE-MMX: 15715.9 kdct/s
>>  - external: IDCT SIMPLE-MMX: 15699.9 kdct/s
>>
>> Skylake-U:
>>  - inline:   IDCT SIMPLE-MMX: 11193.3 kdct/s
>>  - external: IDCT SIMPLE-MMX: 11189.7 kdct/s
>
> Its better to benchmark by decoding some videos as the sparsness of
> the coeffs affects speed

Ah, quite true.  Decoding a large HD sample over many runs stays at around
220 fps and a run time of 187 s, both before and after the change.

I will push shortly.