Oct 7, 2023, 17:08 by d...@lynne.ee:

> Removes the clever subgroup parallel prefix computation,
> and instead just computes the prefix inline.
> Cuts down the number of dispatches by a huge amount.
>
> Provides a ~12x speedup (2.5fps to 30fps on a 7900XTX,
> 2.1fps to 24fps on an Ada).
>
> Patch attached.
>

Going to push the patchset a bit later today.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".