Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

Paul B Mahol Sun, 18 Mar 2018 11:05:22 -0700

On 3/18/18, Carl Eugen Hoyos <[email protected]> wrote:
> 2018-03-18 18:20 GMT+01:00, Paul B Mahol <[email protected]>:
>> On 3/18/18, Carl Eugen Hoyos <[email protected]> wrote:
>>> 2018-03-18 17:46 GMT+01:00, Martin Vignali <[email protected]>:
>>>> 2018-03-18 17:37 GMT+01:00 Paul B Mahol <[email protected]>:
>>>>
>>>>> On 3/18/18, Nicolas George <[email protected]> wrote:
>>>>> > Martin Vignali (2018-03-18):
>>>>> >> I ran the test again with a bigger width (512 instead of 128).
>>>>> >> These are my results:
>>>>> >> shuffle_bytes_0321_c: 128.6
>>>>> >> shuffle_bytes_0321_ssse3: 41.6
>>>>> >> shuffle_bytes_0321_avx2: 23.4
>>>>> >
>>>>> > IIUC, these benchmarks are expressed in CPU cycles. But what James says
>>>>> > is that it can cause the CPU frequency to be throttled: if that happens,
>>>>> > fewer cycles can take more time and, even worse, cause other unrelated
>>>>> > code to take more time. A benchmark in actual time and a typical use
>>>>> > case would be needed to decide.
>>>>>
>>>>> Yes, always also test overall with a typical use case.
>>>
>>> +1
>>>
>>>> I tested it using a "benchmark" command line, which tests two shuffle funcs:
>>>> ./ffmpeg -benchmark -f lavfi -i rgbtestsrc=size=3840x2160:duration=10 -vf format=argb,format=rgba -f null -
>>>>
>>>> With the patch:
>>>> bench: utime=3.611s
>>>> With only SSSE3 (AVX2 part disabled), I get a similar result.
>>>
>>> Does this indicate that James' original comment, that the avx2
>>> optimization makes no sense, is correct?
>>
>> You are almost always wrong.
>
> I tend to agree, but I wonder how you know that I am wrong here:
> What in the above mail indicates that avx2 has an advantage over
> ssse3?


It might work much better with newer CPUs.
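
For reference, a minimal sketch of what timing the shuffle in actual
(wall-clock) time rather than in cycles could look like. The helper names
shuffle_0321_ref and time_shuffle are hypothetical and not part of the patch
or of swscale; the byte order (0, 3, 2, 1 within each 4-byte pixel) is assumed
from the function name, and clock_gettime is POSIX, not ISO C.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical scalar reference: for each 4-byte pixel, write input
 * bytes 0, 3, 2, 1 (assumed meaning of the "0321" suffix). */
static void shuffle_0321_ref(const uint8_t *src, uint8_t *dst, size_t size)
{
    assert(size % 4 == 0); /* whole 4-byte pixels only */
    for (size_t i = 0; i < size; i += 4) {
        dst[i + 0] = src[i + 0];
        dst[i + 1] = src[i + 3];
        dst[i + 2] = src[i + 2];
        dst[i + 3] = src[i + 1];
    }
}

typedef void (*shuffle_fn)(const uint8_t *src, uint8_t *dst, size_t size);

/* Elapsed wall-clock seconds for `iters` calls of `fn`, measured with
 * the monotonic clock instead of a cycle counter. */
static double time_shuffle(shuffle_fn fn, const uint8_t *src, uint8_t *dst,
                           size_t size, int iters)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++)
        fn(src, dst, size);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
}
```

Any variant with the same signature could be passed to time_shuffle in place
of the scalar reference. With a large enough buffer and enough iterations, a
frequency drop should in principle show up in the elapsed seconds even when
the cycle counts look good.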
_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel