On Wed, Nov 27, 2019 at 2:13 PM Clément Bœsch <u...@pkh.me> wrote:
> Yeah I will by the end of the week. I wrote that a few years ago so I need
> to take some time to get back in the context.

Thanks Clément for your help.

> BTW, that's quite a huge speed improvement you're bringing in, are you
> sure you are always allowed to read up to filter[3]?

I will check. Otherwise we can version the code and keep the existing
code alongside for vector factor 2.

> Last thing: this same optimization was also written for arm following the
> same pattern. You may want to adjust that one as well while waiting for my
> review :)

Thanks for pointing it out. I can submit a separate patch for that.

I have also seen that ff_yuv2planeX_8_neon in libswscale/aarch64/output.S
could be improved in a similar way, and that function appears on the
critical path (for multi-threaded encodes) and on the linux-perf profiles.

Sebastian