Hi,
On Thu, Jul 4, 2024, 13:54 Rémi Denis-Courmont <r...@remlab.net> wrote:

> On Thursday, 4 July 2024, 19:26:19 EEST, Sean McGovern wrote:
> > Is that correlated with the comment above re: len? Or is it more general
> > that I should unroll until I've exhausted the available vector registers?
>
> You should unroll if it improves bandwidth.
>
> --
> レミ・デニ-クールモン
> http://www.remlab.net/
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

After adding a second set of load/left-shift/store instructions, further
unrolling gave diminishing or no returns. I'll send the updated version later.

Do wasted32 (and, I guess, wasted33 by proxy) not have to worry about loop
tails? I noticed that the other vectorized versions don't do anything special
in that regard.

-- Sean McGovern
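To make the loop-tail question concrete, here is a minimal scalar C sketch of a routine unrolled twice, with an explicit tail loop for the leftover elements. The function name `wasted32_sketch`, its signature, and the 4-element "vector" width are assumptions for illustration only; this is not the code from the patch or from libavcodec. Whether the tail loop is needed in practice depends on what the callers guarantee about `len`, which this sketch does not assume.

```c
#include <stdint.h>

/* Hypothetical stand-in for a wasted32-style routine: shift every sample
 * left by `wasted` bits. Name and signature are assumptions. */
static void wasted32_sketch(int32_t *decoded, int wasted, int len)
{
    int i = 0;

    /* Main loop, unrolled twice: two independent groups of 4 elements,
     * mimicking two load/left-shift/store sets per iteration. */
    for (; i + 8 <= len; i += 8) {
        for (int j = 0; j < 4; j++)
            decoded[i + j] = (int32_t)((uint32_t)decoded[i + j] << wasted);
        for (int j = 0; j < 4; j++)
            decoded[i + 4 + j] = (int32_t)((uint32_t)decoded[i + 4 + j] << wasted);
    }

    /* Tail loop: handles the remaining 0..7 elements when len is not a
     * multiple of 8. It can be dropped only if len is known to be one. */
    for (; i < len; i++)
        decoded[i] = (int32_t)((uint32_t)decoded[i] << wasted);
}
```

The shift is done on the unsigned type so that negative samples do not trigger undefined behaviour in C.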