On Tue, Mar 01, 2016 at 11:11:36AM +0100, Clément Bœsch wrote:
> On Mon, Feb 29, 2016 at 10:55:49AM +0100, Clément Bœsch wrote:
> > From: Clément Bœsch <clem...@stupeflix.com>
> >
> > ---
> > Changes since latest version:
> > - remove unused 32-bit path
> > - make 16-bit path more accurate by mirroring the MMX code (still not
> >   bitexact)
> > - the code was originally trying to process 2 lines at a time to save chroma
> >   pre mult computations and avoid re-reading the whole line; for some reason,
> >   this actually made the code around twice slower, for twice the complexity.
> >   dropping that complexity was a win-win.
> > ---
> >  libswscale/aarch64/Makefile           |   3 +
> >  libswscale/aarch64/swscale_unscaled.c | 132 ++++++++++++++++++++++
> >  libswscale/aarch64/yuv2rgb_neon.S     | 207 ++++++++++++++++++++++++++++++++++
> >  libswscale/swscale_internal.h         |   1 +
> >  libswscale/swscale_unscaled.c         |   2 +
> >  5 files changed, 345 insertions(+)
> >  create mode 100644 libswscale/aarch64/Makefile
> >  create mode 100644 libswscale/aarch64/swscale_unscaled.c
> >  create mode 100644 libswscale/aarch64/yuv2rgb_neon.S
> >
>
> Random benchmark on Hikey (Cortex-A53):
>
> ./ffmpeg -nostats -f lavfi -i testsrc2=s=uhd2160:d=1 -vf format=yuv420p,bench=start,format=rgba,bench=stop -f null -
>
> (yuv420p to rgba in 3840x2160)
>
> before:
> [bench @ 0x2edfe1e0] t:0.181514 avg:0.181514 max:0.181514 min:0.181514
> [bench @ 0x2edfe1e0] t:0.178870 avg:0.180192 max:0.181514 min:0.178870
> [bench @ 0x2edfe1e0] t:0.164448 avg:0.174944 max:0.181514 min:0.164448
> [bench @ 0x2edfe1e0] t:0.164801 avg:0.172408 max:0.181514 min:0.164448
> [bench @ 0x2edfe1e0] t:0.164635 avg:0.170853 max:0.181514 min:0.164448
> [bench @ 0x2edfe1e0] t:0.164756 avg:0.169837 max:0.181514 min:0.164448
> [bench @ 0x2edfe1e0] t:0.164784 avg:0.169115 max:0.181514 min:0.164448
> [bench @ 0x2edfe1e0] t:0.164413 avg:0.168527 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164760 avg:0.168109 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164647 avg:0.167762 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164698 avg:0.167484 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164600 avg:0.167243 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164498 avg:0.167032 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164765 avg:0.166870 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164613 avg:0.166720 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164781 avg:0.166598 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164489 avg:0.166474 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164432 avg:0.166361 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164540 avg:0.166265 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.164524 avg:0.166178 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.165147 avg:0.166129 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.165484 avg:0.166099 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.165703 avg:0.166082 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.165643 avg:0.166064 max:0.181514 min:0.164413
> [bench @ 0x2edfe1e0] t:0.165294 avg:0.166033 max:0.181514 min:0.164413
>
> after:
> [bench @ 0x16d871e0] t:0.042296 avg:0.042296 max:0.042296 min:0.042296
> [bench @ 0x16d871e0] t:0.041986 avg:0.042141 max:0.042296 min:0.041986
> [bench @ 0x16d871e0] t:0.027298 avg:0.037193 max:0.042296 min:0.027298
> [bench @ 0x16d871e0] t:0.027388 avg:0.034742 max:0.042296 min:0.027298
> [bench @ 0x16d871e0] t:0.027383 avg:0.033270 max:0.042296 min:0.027298
> [bench @ 0x16d871e0] t:0.027366 avg:0.032286 max:0.042296 min:0.027298
> [bench @ 0x16d871e0] t:0.027225 avg:0.031563 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027685 avg:0.031078 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027246 avg:0.030652 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027363 avg:0.030323 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027449 avg:0.030062 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027582 avg:0.029855 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027374 avg:0.029664 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027429 avg:0.029505 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027275 avg:0.029356 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027573 avg:0.029244 max:0.042296 min:0.027225
> [bench @ 0x16d871e0] t:0.027219 avg:0.029125 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027392 avg:0.029029 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027720 avg:0.028960 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027449 avg:0.028884 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027473 avg:0.028817 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027444 avg:0.028755 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027535 avg:0.028702 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027607 avg:0.028656 max:0.042296 min:0.027219
> [bench @ 0x16d871e0] t:0.027476 avg:0.028609 max:0.042296 min:0.027219
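
For anyone who wants the speedup as a single number, the last `[bench ...]` line of each run quoted above can be compared directly. The following is only a minimal sketch in Python: the helper name `parse_bench` and the regular expression are assumptions made for illustration and are not part of FFmpeg or of the patch; the two input strings are copied from the logs above.

```python
import re

# Hypothetical helper: extracts the t/avg/max/min fields (in seconds)
# from one line printed by the bench filter, as quoted in this thread.
BENCH_RE = re.compile(
    r"t:(?P<t>[\d.]+) avg:(?P<avg>[\d.]+) max:(?P<max>[\d.]+) min:(?P<min>[\d.]+)"
)

def parse_bench(line):
    match = BENCH_RE.search(line)
    if match is None:
        raise ValueError("not a bench line: " + line)
    return {name: float(value) for name, value in match.groupdict().items()}

# Final line of each run, copied verbatim from the quoted logs.
before = parse_bench(
    "[bench @ 0x2edfe1e0] t:0.165294 avg:0.166033 max:0.181514 min:0.164413"
)
after = parse_bench(
    "[bench @ 0x16d871e0] t:0.027476 avg:0.028609 max:0.042296 min:0.027219"
)

# Ratio of the running averages and of the best (minimum) times.
speedup_avg = before["avg"] / after["avg"]
speedup_min = before["min"] / after["min"]

print("speedup (avg): %.2fx" % speedup_avg)
print("speedup (min): %.2fx" % speedup_min)
```

With the figures above this comes out at roughly 5.8x on the running average and roughly 6.0x on the best time. The average still includes the two slower warm-up iterations of each run, so the minimum is probably the fairer comparison.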

LGTM thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship: All citizens are under surveillance, all their steps and
actions recorded, for the politicians to enforce control.
Democracy: All politicians are under surveillance, all their steps and
actions recorded, for the citizens to enforce control.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel