On 2016-12-07 11:07, Carl Eugen Hoyos wrote:
> 2016-12-05 19:32 GMT+01:00 James Darnley <jdarn...@obe.tv>:
>
>> - sse2: 2.47x (170 vs. 69 cycles)
>> - avx:  2.47x (170 vs. 69 cycles)
>
> Please elaborate on why this was committed.

Because writing it cost almost zero time. All it needed was the dsp
pointer assignment. Preventing the function from being created (with more
%ifs) would have required another patch set to be sent through review.

Because a few instructions using the 3-operand form should be quicker. The
fact that it doesn't show is no doubt down to out-of-order execution
managing to do the moves earlier than written.

Because it is future-proof. Someone may write a better AVX version, or a
new-instruction version, of the macros used. A CPU may appear which
deprecates all SIMD without the VEX prefix. FFmpeg may allow disabling of
old instruction sets without disabling new ones. Each of these last three
reasons is less likely than the one before it.

And now for some more detailed stats, collected over 50 runs, each with
500k calls to the function in question:

> sse2: min: 687, max: 774, mean: 690.041, stddev: 12.155
> avx:  min: 681, max: 721, mean: 685.469, stddev:  9.083

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel