On Mon, Oct 12, 2015 at 07:37:49PM +0200, Christophe Gisquet wrote:
> On 12 frames of a 444p 12 bits DNxHR sequence, _put function:
> C: 78902 decicycles in idct, 262071 runs, 73 skips
> avx: 32478 decicycles in idct, 262045 runs, 99 skips
>
> Difference between the 2:
> stddev: 0.39 PSNR:104.47 MAXDIFF: 2
>
> This is unavoidable and due to the scale factors used in the x86
> version, which cannot match the C ones.
>
> In addition, the trick of adding an initial bias to the input of a
> pass can overflow, as the input coefficients are already 15bits,
> which is the maximum this function can handle.
>
> Overall, however, the omse on 12 bits samples goes from 0.16916 to
> 0.16883. Reducing rowshift by 1 improves to 0.0908, but causes
> overflows.
> ---
> libavcodec/x86/idctdsp_init.c | 22 ++++++++++++++++++++--
> libavcodec/x86/simple_idct.h | 6 ++++++
> libavcodec/x86/simple_idct10.asm | 16 ++++++++++++++++
> 3 files changed, 42 insertions(+), 2 deletions(-)

applied

thanks

[..]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel