Re: [FFmpeg-devel] [PATCH 8/9] x86: simple_idct: 12bits versions

Michael Niedermayer Tue, 13 Oct 2015 07:06:10 -0700

On Mon, Oct 12, 2015 at 07:37:49PM +0200, Christophe Gisquet wrote:
> On 12 frames of a 444p 12 bits DNxHR sequence, _put function:
> C:         78902 decicycles in idct,  262071 runs,     73 skips
> avx:       32478 decicycles in idct,  262045 runs,     99 skips
> 
> Difference between the 2:
> stddev:    0.39 PSNR:104.47 MAXDIFF:    2
> 
> This is unavoidable and due to the scale factors used in the x86
> version, which cannot match the C ones.
> 
> In addition, the trick of adding an initial bias to the input of a
> pass can overflow, as the input coefficients are already 15bits,
> which is the maximum this function can handle.
> 
> Overall, however, the omse on 12 bits samples goes from 0.16916 to
> 0.16883. Reducing rowshift by 1 improves to 0.0908, but causes
> overflows.
> ---
>  libavcodec/x86/idctdsp_init.c    | 22 ++++++++++++++++++++--
>  libavcodec/x86/simple_idct.h     |  6 ++++++
>  libavcodec/x86/simple_idct10.asm | 16 ++++++++++++++++
>  3 files changed, 42 insertions(+), 2 deletions(-)
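
As a rough sanity check on the figures quoted above, the two timing lines give the speedup directly, and the PSNR value is consistent with measuring the error against a 16-bit full-scale peak of 65535. The peak value is an assumption about how the comparison tool scales its output; the message does not state it. The function names below are hypothetical and exist only for this sketch.

```python
import math


def speedup(c_decicycles: float, simd_decicycles: float) -> float:
    """Ratio of the C timing to the SIMD timing (higher is faster)."""
    return c_decicycles / simd_decicycles


def psnr_db(stddev: float, peak: float = 65535.0) -> float:
    """PSNR in dB for a given error standard deviation.

    The default peak of 65535 (16-bit full scale) is an assumption,
    not something stated in the quoted message.
    """
    return 20.0 * math.log10(peak / stddev)


if __name__ == "__main__":
    # Figures taken from the quoted benchmark and comparison lines.
    print(f"speedup: {speedup(78902, 32478):.2f}x")
    print(f"PSNR:    {psnr_db(0.39):.2f} dB")
```

With the quoted numbers this gives a speedup of roughly 2.4x and a PSNR close to the reported 104.47; the small gap is what one would expect from the stddev being printed to only two decimals.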


applied

thanks
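
The overflow concern raised in the quoted commit message can be illustrated with a small model. This is only a sketch under stated assumptions: coefficients are assumed to sit in signed 16-bit lanes with wrapping (non-saturating) adds, and the bias of 1024 is a hypothetical value chosen for illustration. Neither detail is given in the message.

```python
def wrap_int16(value: int) -> int:
    """Reduce an integer to the signed 16-bit range, wrapping on overflow."""
    return ((value + 0x8000) & 0xFFFF) - 0x8000


def add_bias_int16(coeff: int, bias: int) -> int:
    """Model a wrapping 16-bit add of a rounding bias to a coefficient."""
    return wrap_int16(coeff + bias)


if __name__ == "__main__":
    bias = 1024  # hypothetical rounding bias, for illustration only

    # A small coefficient leaves plenty of headroom for the bias.
    print(add_bias_int16(100, bias))

    # A coefficient that already needs 15 magnitude bits does not:
    # 32000 + 1024 exceeds 32767 and wraps to a negative value.
    print(add_bias_int16(32000, bias))
```

The second call shows why folding the bias into the input is unsafe once the coefficients use the full range the function supports: the sum no longer fits in the lane.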

[..]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel