On 2/21/15, James Almer <jamr...@gmail.com> wrote:
> On 21/02/15 3:47 PM, Paul B Mahol wrote:
>> On 2/21/15, James Almer <jamr...@gmail.com> wrote:
>>> On 21/02/15 8:49 AM, Paul B Mahol wrote:
>>>> Have you measured performance drop before and after?
>>>
>>> filter_order 8 in decorrelate()
>>>
>>> Before
>>> 903 decicycles in scalarproduct, 8388364 runs, 244 skips
>>> After
>>> 858 decicycles in scalarproduct, 8388215 runs, 393 skips
>>>
>>>
>>> filter_order 24 in decode_subframe()
>>>
>>> Before
>>> 993 decicycles in scalarproduct, 16776849 runs, 367 skips
>>> After
>>> 887 decicycles in scalarproduct, 16776783 runs, 433 skips
>>>
>>
>> But what about other filter orders?
>
> filter order 12 in decode_subframe()
>
> Before
> 963 decicycles in scalarproduct, 8388426 runs, 182 skips
> After
> 873 decicycles in scalarproduct, 8388410 runs, 198 skips
>
>
> filter_order 8 in decode_subframe()
>
> Before
> 900 decicycles in scalarproduct, 4194020 runs, 284 skips
> After
> 858 decicycles in scalarproduct, 4194198 runs, 106 skips
>
>
> filter order 4 in decode_subframe()
>
> Before
> 827 decicycles in scalarproduct, 1048561 runs, 15 skips
> After
> 876 decicycles in scalarproduct, 1048556 runs, 20 skips
>
>
> Seems like only filter_order 4 is slower. I could leave the C code for that
> one case if you prefer.
I'm more worried about the overhead that memset() adds.
Feel free to apply the patch.

> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel