On 2018-07-19 17:23, Rostislav Pehlivanov wrote:
> Could you provide standard overall transform results using START/STOP_TIMER
> rather than overall decoding speed?
Ask and ye shall receive.
> haar horizontal compose
> sse2: 3.67x faster (45248±108.1 vs. 12328±21.1 decicycles) compared with
On 19 July 2018 at 16:29, James Darnley wrote:
> On 2018-07-19 17:23, Rostislav Pehlivanov wrote:
> >
> > Could you provide standard overall transform results using START/STOP_TIMER
> > rather than overall decoding speed?
> > Coefficient sizes, and therefore Golomb unpacking speed, change with
On 2018-07-19 17:23, Rostislav Pehlivanov wrote:
>
> Could you provide standard overall transform results using START/STOP_TIMER
> rather than overall decoding speed?
> Coefficient sizes, and therefore Golomb unpacking speed, change with
> respect to the transform so potentially there could be som
On 19 July 2018 at 15:52, James Darnley wrote:
> I tested the speed gains by using ffmpeg to decode a 720p yuv422p10 file
> encoded with the relevant transform. The summary is below.
>
> Haar
> C: 119fps
> SSE2: 204fps
> AVX: 206fps
> AVX2: 221fps
>
> 5_3
> C: 94fps
> SSE2: 118fps
> AV
I tested the speed gains by using ffmpeg to decode a 720p yuv422p10 file encoded
with the relevant transform. The summary is below.
Haar
C: 119fps
SSE2: 204fps
AVX: 206fps
AVX2: 221fps
5_3
C: 94fps
SSE2: 118fps
AVX2: 121fps
9_7
C: 84fps
SSE2: 111fps
AVX2: 115fps
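
For anyone who wants the figures above as ratios rather than raw frame rates, here is a minimal sketch that divides each SIMD result by the C result for the same transform. The fps values are copied from the summary; the names FPS, speedups and SPEEDUPS are made up for illustration and are not part of FFmpeg. Keep in mind these are whole-decode numbers, so they also include Golomb unpacking and everything else in the decoder, not just the transform.

```python
# Frame rates (fps) copied from the summary above; nothing here is re-measured.
FPS = {
    "Haar": {"C": 119, "SSE2": 204, "AVX": 206, "AVX2": 221},
    "5_3": {"C": 94, "SSE2": 118, "AVX2": 121},
    "9_7": {"C": 84, "SSE2": 111, "AVX2": 115},
}


def speedups(fps_by_impl, baseline="C"):
    """Return each implementation's fps divided by the baseline's fps."""
    base = fps_by_impl[baseline]
    return {
        impl: fps / base
        for impl, fps in fps_by_impl.items()
        if impl != baseline
    }


# Hypothetical helper table: transform -> {implementation -> ratio vs. C}.
SPEEDUPS = {transform: speedups(rates) for transform, rates in FPS.items()}

if __name__ == "__main__":
    for transform, ratios in SPEEDUPS.items():
        for impl, ratio in ratios.items():
            print(f"{transform:5s} {impl:5s} {ratio:.2f}x vs. C")
```

The same dictionary makes it easy to compare AVX against SSE2 directly, for example FPS["Haar"]["AVX"] / FPS["Haar"]["SSE2"], which is the ratio relevant to the question below.
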
Is the AVX worth it