2018-03-18 17:37 GMT+01:00 Paul B Mahol <one...@gmail.com>:

> On 3/18/18, Nicolas George <geo...@nsup.org> wrote:
> > Martin Vignali (2018-03-18):
> >> I run the test again with a bigger width (512 instead of 128)
> >> This is my result :
> >> shuffle_bytes_0321_c: 128.6
> >> shuffle_bytes_0321_ssse3: 41.6
> >> shuffle_bytes_0321_avx2: 23.4
> >
> > IIUC, these benchmarks are expressed in CPU cycles. But what James says
> > is that it can cause the CPU frequency to be throttled: if that happens,
> > less cycles can use more time, and even worse, cause other unrelated to
> > take more time. A benchmark in actual time and typical use case would be
> > needed to decide.
>
> Yes, always also test overall with typical code usecase.
>

I tested it with a "benchmark" command line that exercises two shuffle
functions:

./ffmpeg -benchmark -f lavfi -i rgbtestsrc=size=3840x2160:duration=10 -vf format=argb,format=rgba -f null -

With the patch:
bench: utime=3.611s

With only SSSE3 (AVX2 part disabled), I get a similar result.

Without the patch:
bench: utime=6.972s

Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
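
For reference, the two utime figures reported in this message can be turned into a speedup ratio with a few lines of Python. This is only a sketch: the helper name `speedup` is hypothetical, and utime is the user CPU time printed by -benchmark, not wall-clock time, so it only partly answers the throttling concern raised in the quoted text.

```python
def speedup(baseline_s: float, patched_s: float) -> float:
    """Return how many times faster the patched run is than the baseline."""
    if baseline_s <= 0 or patched_s <= 0:
        raise ValueError("timings must be positive")
    return baseline_s / patched_s


# utime values copied from the two runs in this message (seconds)
without_patch = 6.972
with_patch = 3.611

ratio = speedup(without_patch, with_patch)
saved = 1.0 - with_patch / without_patch  # fraction of user time saved

print(f"speedup: {ratio:.2f}x, user time saved: {saved:.1%}")
# prints "speedup: 1.93x, user time saved: 48.2%"
```

The same helper can be applied to the cycle counts quoted above (for example 128.6 against 23.4) to compare the per-function gain with the whole-pipeline gain.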