On Fri, Jun 25, 2021 at 1:26 PM Ronald S. Bultje <rsbul...@gmail.com> wrote:
> Hi Alan,
>
> On Fri, Jun 25, 2021 at 3:59 AM Alan Kelly <alankelly-at-google....@ffmpeg.org> wrote:
>
>> These functions replace all ff_hscale8to15_*_ssse3 when avx2 is available.
>>
>
> Re-asking a question I asked before in the other thread:
>
> Also, what is the cycle count of ssse3/avx2 implementation for this
> specific function on Haswell? It would be good to note that in the
> respective patch so that we understand why the check was added.
>
> You should be able to find this in the checkasm --bench --test=X numbers
> for this relevant function.
>
> Ronald
>

Hi Ronald,

                                Skylake   Haswell
hscale_8_to_15_width4_ssse3       761.2       760
hscale_8_to_15_width4_avx2        468.7       957
hscale_8_to_15_width8_ssse3      1170.7      1032
hscale_8_to_15_width8_avx2        865.7      1979
hscale_8_to_15_width12_ssse3     2172.2      2472
hscale_8_to_15_width12_avx2      1245.7      2901
hscale_8_to_15_width16_ssse3     2244.2      2400
hscale_8_to_15_width16_avx2      1647.2      3681

As you can see, it is catastrophic on Haswell. In the next iteration of the patch, I will update the description with these numbers.

Thanks
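
For reference, the ratios behind that conclusion can be worked out directly from the benchmark numbers in this message. The following is only a minimal sketch in Python with the timings copied by hand; the names BENCH and speedup are made up for illustration and are not part of FFmpeg or checkasm.

```python
# Timings copied from the checkasm --bench numbers quoted in this message.
# Each entry is (ssse3, avx2); lower is better.
# BENCH and speedup() are illustrative names, not FFmpeg or checkasm APIs.
BENCH = {
    4:  {"skylake": (761.2, 468.7),   "haswell": (760.0, 957.0)},
    8:  {"skylake": (1170.7, 865.7),  "haswell": (1032.0, 1979.0)},
    12: {"skylake": (2172.2, 1245.7), "haswell": (2472.0, 2901.0)},
    16: {"skylake": (2244.2, 1647.2), "haswell": (2400.0, 3681.0)},
}


def speedup(ssse3, avx2):
    """Return how many times faster avx2 is than ssse3 (below 1.0 means slower)."""
    return ssse3 / avx2


if __name__ == "__main__":
    for width, cpus in sorted(BENCH.items()):
        for cpu, (ssse3, avx2) in cpus.items():
            ratio = speedup(ssse3, avx2)
            print(f"width{width:<2} {cpu:<8} avx2 vs ssse3: {ratio:.2f}x")
```

Every Skylake ratio comes out above 1.0 and every Haswell ratio below 1.0, which is consistent with restricting the avx2 versions to CPUs where they are actually faster.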