Hi,
> Problem 1
> you have a mean of around 2100 and Stdev of about between 55 and 80 so if by
> statistically significant you man 2 Stdev, then with the mean you have.
> You would declare every optimization of less than 6% to be statistically
> insignificant.
> So by what you say here, it seems to me you would have to suggest that
> every optimization which provides 6% or less overall speedup to be
> removed.
> That i doubt many will agree with

Logical fallacy.

> Problem 2
> We do not meassure speed this way because its not realiable nor practical
> just look at this, especially the difference and variation
> ./ffmpeg -threads 1 -i ~/videos/matrixbench_mpeg2.mpg -f null -
> 8941 decicycles in non-intra, 2097003 runs, 149 skipste=N/A speed=53.2x
> 8941 decicycles in non-intra, 2097013 runs, 139 skipste=N/A speed=54.1x
> 8942 decicycles in non-intra, 2097038 runs, 114 skipste=N/A speed=54.1x
> 8970 decicycles in non-intra, 2097037 runs, 115 skipste=N/A speed= 54x
>
> ./ffmpeg -threads 1 -flags2 fast -i ~/videos/matrixbench_mpeg2.mpg -f null -
> 8718 decicycles in non-intra, 2097020 runs, 132 skipste=N/A speed=54.6x
> 8701 decicycles in non-intra, 2097044 runs, 108 skipste=N/A speed=54.6x
> 8718 decicycles in non-intra, 2097034 runs, 118 skipste=N/A speed=54.5x
> 8702 decicycles in non-intra, 2097029 runs, 123 skipste=N/A speed=54.5x
>
> This difference is statistically significant, i can say this without the
> need to check
> Tested on a AMD Ryzen 9 3950X but i expect you will see similar on most
> CPUs

Did you remove the branch for FLAGS2_FAST and the large amount of inlined
code when making this measurement?
I also note that you just ignore mb_block_count in FLAGS2_FAST mode so this
is not a fair comparison.

> Problem 3
> You dont search for useless code, this is not a "i tested 100
> optimizations and found these not worth it"
> You search for an argument to remove specific pieces of my code.
> thats seriously not making sense to me. really not

I am not going to even try to respond to that as it clearly extends into
issues larger than FFmpeg.

> Problem 4
> try H264 using good old time
> time ./ffmpeg -thread_type slice -i fate-suite//h264/bbc2.sample.h264 -f null -
> real 0m0,252s
> real 0m0,254s
> real 0m0,254s
> real 0m0,255s
>
> time ./ffmpeg -flags2 fast -thread_type slice -i fate-suite//h264/bbc2.sample.h264 -f null -
> real 0m0,217s
> real 0m0,220s
> real 0m0,218s
> real 0m0,217s
>
> Here even with a crude way of meassuring we can see a clear and strong
> difference

That's H.264, not MPEG-2, not relevant to this discussion. Is that file even
capable of using slice threading to decode?

Going back to the original point about MPEG-2, if a user chooses
FLAGS2_FAST, they expect it to make a major difference. I had tested MPEG-2
expecting it to be much more significant but it is not. Not 200 decicycles
on a single function (making up your own dequant).

Kieran
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
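
For anyone who wants to check the arithmetic behind the figures quoted in this thread, here is a minimal sketch in Python. It is not how either poster measured anything. The helper names (relative_two_sigma, relative_speedup, welch_t) are made up for illustration, and using Welch's t statistic is an assumption on my part, since nobody in the thread named a specific test. The only inputs are the numbers quoted above: the mean of about 2100 with a Stdev of 55 to 80, the decicycle counts for the non-intra function, and the wall-clock times from the H.264 run.

```python
from statistics import mean, stdev

# Numbers copied from the quoted measurements in this thread.
mpeg2_default = [8941, 8941, 8942, 8970]   # decicycles in non-intra, -threads 1
mpeg2_fast = [8718, 8701, 8718, 8702]      # same, with -flags2 fast
h264_default = [0.252, 0.254, 0.254, 0.255]  # seconds, "real" from time
h264_fast = [0.217, 0.220, 0.218, 0.217]     # seconds, with -flags2 fast


def relative_two_sigma(mean_value, sd):
    """Two standard deviations expressed as a fraction of the mean."""
    return 2 * sd / mean_value


def relative_speedup(baseline, candidate):
    """Fractional reduction of the candidate mean relative to the baseline mean."""
    return (mean(baseline) - mean(candidate)) / mean(baseline)


def welch_t(a, b):
    """Welch's t statistic for two samples (assumed test, not from the thread)."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / (var_a + var_b) ** 0.5


# "Problem 1": what 2 Stdev means as a percentage of a mean of about 2100.
threshold_low = relative_two_sigma(2100, 55)
threshold_high = relative_two_sigma(2100, 80)

# "Problem 2": per-function decicycle difference for the MPEG-2 sample.
mpeg2_delta = mean(mpeg2_default) - mean(mpeg2_fast)
mpeg2_gain = relative_speedup(mpeg2_default, mpeg2_fast)
mpeg2_t = welch_t(mpeg2_default, mpeg2_fast)

# "Problem 4": whole-process wall-clock difference for the H.264 sample.
h264_gain = relative_speedup(h264_default, h264_fast)

print(f"2 Stdev threshold: {threshold_low:.1%} to {threshold_high:.1%}")
print(f"MPEG-2 non-intra: {mpeg2_delta:.2f} decicycles, {mpeg2_gain:.2%}, t={mpeg2_t:.1f}")
print(f"H.264 wall clock: {h264_gain:.1%}")
```

Note that the MPEG-2 figure is a per-function counter while the H.264 figure is whole-process wall-clock time, so the two percentages are not directly comparable.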