On Fri, Jul 17, 2020 at 11:08:02PM -0500, Sebastian Pop wrote:
> hscale is bound by the number of multiply-adds available on a given core.
> The attached patch doubles the number of multiply-adds by distributing half
> the load to a helper thread.
>
> The performance improves up to 50% on Graviton2 Arm Neoverse-N1 processors.
>
> $ ./ffmpeg_g -nostats -f lavfi -i testsrc2=4k:d=2 -vf bench=start,scale=1024x1024,bench=stop -f null -
> before: [bench @ 0xaaaad62c3d30] t:0.013293 avg:0.013315 max:0.013697 min:0.013293
> after:  [bench @ 0xaaaae9346d30] t:0.009637 avg:0.009691 max:0.010005 min:0.009637
> 38% improvement
>
> scale=1280x720 49% improvement
> before: [bench @ 0xaaaadba88d30] t:0.015973 avg:0.016321 max:0.016917 min:0.015973
> after:  [bench @ 0xaaaabc78dd30] t:0.010823 avg:0.010869 max:0.011552 min:0.010708
>
> scale=852x480 45% improvement
> before: [bench @ 0xaaaaeeed0d30] t:0.013731 avg:0.013727 max:0.013773 min:0.013279
> after:  [bench @ 0xaaaaf5f5dd30] t:0.009279 avg:0.009296 max:0.009328 min:0.009187
>
> scale=640x360 45% improvement
> before: [bench @ 0xaaaacee25d30] t:0.012010 avg:0.012006 max:0.012053 min:0.011653
> after:  [bench @ 0xaaaaea2b5d30] t:0.008077 avg:0.008084 max:0.008409 min:0.008057
>
> scale=284x160 36% improvement
> before: [bench @ 0xaaaadbb9ed30] t:0.008384 avg:0.008367 max:0.008421 min:0.008193
> after:  [bench @ 0xaaaafb1d6d30] t:0.006099 avg:0.006100 max:0.006120 min:0.006026
>  aarch64/swscale.c  |   44 +++++++++++++++++++++++++++++++++++++++++++-
>  swscale_internal.h |   15 +++++++++++++++
>  utils.c            |   14 ++++++++++++++
>  3 files changed, 72 insertions(+), 1 deletion(-)
> 9a65bd72cd0a37e73a554e568b34f9d6bb27cb58  0001-aarch64-improve-hscale-by-50-with-multi-threading.patch
> From 3321950c109b416e63eda59c76e6365abc2072b8 Mon Sep 17 00:00:00 2001
> From: Sebastian Pop <s...@amazon.com>
> Date: Thu, 2 Jul 2020 16:57:58 +0000
> Subject: [PATCH] [aarch64] improve hscale by 50% with multi-threading

Multithreading support should be added in an architecture-independent way.

Thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

During times of universal deceit, telling the truth becomes a
revolutionary act. -- George Orwell
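As a reading aid for the quoted benchmark numbers: the "improvement" percentages appear to be consistent with a throughput gain, computed as before/after - 1 on the min values, rather than with a reduction in time per frame. The sketch below is only an illustration of that arithmetic. The timings are copied from the quoted bench output; the function names are hypothetical, and the choice of the min column is an assumption that happens to reproduce the quoted figures after rounding.

```python
# (before_min, after_min) timings copied from the quoted bench output.
# Using the "min" column is an assumption; it reproduces the quoted percentages.
MIN_TIMES = {
    "1024x1024": (0.013293, 0.009637),
    "1280x720": (0.015973, 0.010708),
    "852x480": (0.013279, 0.009187),
    "640x360": (0.011653, 0.008057),
    "284x160": (0.008193, 0.006026),
}


def throughput_gain_percent(before, after):
    """Extra work done per unit time: before/after - 1, as a percentage."""
    return (before / after - 1.0) * 100.0


def time_reduction_percent(before, after):
    """Time saved per frame: 1 - after/before, as a percentage."""
    return (1.0 - after / before) * 100.0


if __name__ == "__main__":
    for size, (before, after) in MIN_TIMES.items():
        gain = throughput_gain_percent(before, after)
        saved = time_reduction_percent(before, after)
        print(f"scale={size}: throughput +{gain:.1f}%, time per frame -{saved:.1f}%")
```

Read this way, a 49% improvement corresponds to roughly a third less time per frame, which stays below the 2x ceiling that splitting the work evenly across two threads could reach.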
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".