On 3/18/18, Carl Eugen Hoyos <ceffm...@gmail.com> wrote:
> 2018-03-18 18:20 GMT+01:00, Paul B Mahol <one...@gmail.com>:
>> On 3/18/18, Carl Eugen Hoyos <ceffm...@gmail.com> wrote:
>>> 2018-03-18 17:46 GMT+01:00, Martin Vignali <martin.vign...@gmail.com>:
>>>> 2018-03-18 17:37 GMT+01:00 Paul B Mahol <one...@gmail.com>:
>>>>
>>>>> On 3/18/18, Nicolas George <geo...@nsup.org> wrote:
>>>>> > Martin Vignali (2018-03-18):
>>>>> >> I run the test again with a bigger width (512 instead of 128)
>>>>> >> This is my result :
>>>>> >> shuffle_bytes_0321_c: 128.6
>>>>> >> shuffle_bytes_0321_ssse3: 41.6
>>>>> >> shuffle_bytes_0321_avx2: 23.4
>>>>> >
>>>>> > IIUC, these benchmarks are expressed in CPU cycles. But what James says
>>>>> > is that it can cause the CPU frequency to be throttled: if that happens,
>>>>> > less cycles can use more time, and even worse, cause other unrelated to
>>>>> > take more time. A benchmark in actual time and typical use case would be
>>>>> > needed to decide.
>>>>>
>>>>> Yes, always also test overall with typical code usecase.
>>>
>>> +1
>>>
>>>> I tested it using a "benchmark" command line, who test two shuffle func
>>>> ./ffmpeg -benchmark -f lavfi -i rgbtestsrc=size=3840x2160:duration=10 -vf format=argb,format=rgba -f null -
>>>>
>>>> With the patch :
>>>> bench: utime=3.611s
>>>> With only SSSE 3 (disable AVX2 part), i have similar result.
>>>
>>> Indicating James' original comment that the avx2 optimization
>>> makes no sense is correct?
>>
>> You are almost always wrong.
>
> I tend to agree but I wonder how you know that I am wrong here:
> What in above mail indicates that avx2 has an advantage over
> ssse3?

It might work with new CPUs much better.