On Mon, 2016-07-04 at 23:31 -0500, Dan Parrot wrote:
> On Mon, 2016-07-04 at 09:20 +0000, Carl Eugen Hoyos wrote:
> > Dan Parrot <dan.parrot <at> mail.com> writes:
> >
> > > The dataset used was the entire FATE regression suite.
> >
> > I don't think this is a particularly useful testcase:
> > It takes very long but mostly tests other things.
> >
> > Did you test if using ffmpeg -benchmark -f rawvideo -i /dev/zero...
> > showed different results?
> > I believe this should be both easier and faster to test.
> >
> > > name: rgb24ToY_c_vsx.
> > > no. of calls: 9999. min: 3832 ns. avg: 4709 ns. max: 37550 ns.
> > > total: 47093533 ns.
> > >
> > > name: rgb24ToY_c.
> > > no. of calls: 9999. min: 3809 ns. avg: 4707 ns. max: 29041 ns.
> > > total: 47072923 ns.
> >
> > Without any data, I would have thought that this is the most
> > important function (and "no. of calls" seems to confirm this).
> >
> > Why is this not faster?
I believe I have answered, in earlier posts, all the questions you raised.

Finally, just to satisfy my curiosity, I used SystemTap to probe during a run
of the entire FATE regression suite. Here are the same two functions, this
time with GCC 6.1.1 instead of 5.3.1 (they are representative of all other
functions):

name: rgb24ToY_c_vsx.
no. of calls: 9999. min: 3053 ns. avg: 3298 ns. max: 69359 ns.
total: 32983050 ns.

name: rgb24ToY_c.
no. of calls: 9999. min: 3040 ns. avg: 4056 ns. max: 79159 ns.
total: 40561568 ns.

The SIMD code shows a non-trivial improvement: its average time per call is
about 19% lower (3298 ns vs. 4056 ns).

So: would you accept and apply the patch?

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
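
For anyone who wants to check the size of the improvement, here is a minimal
Python sketch that computes the relative reduction from the GCC 6.1.1 figures
in this message. The dictionary layout and the helper name relative_reduction
are hypothetical, chosen only for illustration; the numbers themselves are
copied from the SystemTap output.

```python
# Timings copied from the SystemTap figures in this message (GCC 6.1.1 run).
# The structure and names below are illustrative assumptions, not part of
# FFmpeg or SystemTap.
timings = {
    "rgb24ToY_c_vsx": {"calls": 9999, "avg_ns": 3298, "total_ns": 32983050},
    "rgb24ToY_c": {"calls": 9999, "avg_ns": 4056, "total_ns": 40561568},
}


def relative_reduction(baseline, candidate):
    """Return the fraction by which candidate is lower than baseline."""
    return (baseline - candidate) / baseline


scalar = timings["rgb24ToY_c"]
vsx = timings["rgb24ToY_c_vsx"]

avg_reduction = relative_reduction(scalar["avg_ns"], vsx["avg_ns"])
total_reduction = relative_reduction(scalar["total_ns"], vsx["total_ns"])
speedup = scalar["total_ns"] / vsx["total_ns"]

print(f"avg reduction:   {avg_reduction:.1%}")    # 18.7%
print(f"total reduction: {total_reduction:.1%}")  # 18.7%
print(f"speedup:         {speedup:.2f}x")         # 1.23x
```

Both functions were called the same number of times (9999), so the average
and the total give the same relative figure.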