On Thu, Jul 18, 2024 at 10:20 AM Anton Khirnov <an...@khirnov.net> wrote:

> Quoting Michael Niedermayer (2024-07-18 00:32:38)
> > the data for each decoder task should be together and not scattered around
> > more than needed, reducing cache efficiency
> >
> > putting all this extra code in the inner per pixel loop is not ok
> > especially not for the sake of avoiding a memcpy of a few hundred bytes
> > multiple levels of loops outside
>
> A nice theory, but in practice this patchset makes single-threaded
> decoding about 4% faster overall, on a 1920x1080 10bit sample. That's
> just the ffv1 parts (up to patch 28), full set also improves frame
> threading performance as follows:
> threads         improvement
> ---------------------------
> 2                  52% (yes really)
>

What?

Sloppy programming skills....


> 4                  16%
> 8                  12%
>
> --
> Anton Khirnov
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
>
