On Mon, 2016-07-04 at 16:30 +0000, Carl Eugen Hoyos wrote:
> Dan Parrot <dan.parrot <at> mail.com> writes:
>
> > > Did you test if using ffmpeg -benchmark -f rawvideo -i /dev/zero...
> > > showed different results?
> > > I believe this should be both easier and faster to test.
> >
> > Sorry, I don't understand what that command line just above
> > is trying to achieve. Could you elaborate?
>
> Instead of running the whole fate suite that takes long and
> does not test libswscale for most commands, just test an
> ffmpeg command line that only tests libswscale:
> $ ffmpeg -benchmark -f rawvideo -pix_fmt rgb24
> -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -

$ ./ffmpeg -benchmark -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p -f null -vframes 1000 -
frame= 1000 fps= 16 q=-0.0 Lsize=N/A time=00:00:40.00 bitrate=N/A speed=0.632x
video:477kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=62.794s
bench: maxrss=21184kB

> vs
>
> $ ffmpeg -cpuflags 0 -benchmark -f rawvideo -pix_fmt rgb24
> -i /dev/zero -pix_fmt yuv420p -f null -vframes 10000 -

$ ./ffmpeg -cpuflags 0 -benchmark -f rawvideo -pix_fmt rgb24 -s hd1080 -i /dev/zero -pix_fmt yuv420p -f null -vframes 1000 -
frame= 1000 fps= 12 q=-0.0 Lsize=N/A time=00:00:40.00 bitrate=N/A speed=0.479x
video:477kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
bench: utime=82.918s
bench: maxrss=21120kB

> [...]
>
> > > Surprisingly, gcc is producing some badly suboptimal assembly.
>
> Just to make sure I don't misunderstand:
> Does this mean intrinsics are suboptimal to write assembly
> code?

So, the latest version of GCC does produce more efficient assembly.
To recap: GCC 5.3.1 produces assembly that does not take full advantage
of PPC64 POWER8 SIMD instructions. GCC 6.1.1 is much better and produces
shorter sequences that do use SIMD assembly instructions.

> > > Can you confirm with START_TIMER / STOP_TIMER that there is no
> > > gain?
> >
> > SystemTap probes provide identical functionality by measuring
> > deltas between function entry and function return.
>
> Sorry, I don't understand:
> Did you test with both methods to verify that they provide
> the same results?
> Note that if it turns out that START_TIMER / STOP_TIMER
> cannot be used on ppc64 (le) this would be important
> information for us.

These start/stop macros are the last issue I have outstanding. I hope
to be done in a few hours.

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
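
For reference, the relative gain implied by the two benchmark runs quoted in this message can be worked out from their utime figures. The following is only a minimal sketch: the helper name `speedup` and the constant names are hypothetical, and the only inputs are the two utime values reported above (62.794s with default cpuflags, 82.918s with -cpuflags 0). A single run of each is not a rigorous measurement.

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    if baseline_s <= 0 or optimized_s <= 0:
        raise ValueError("times must be positive")
    return baseline_s / optimized_s


# utime values copied from the two ffmpeg -benchmark runs in this message
UTIME_DEFAULT = 62.794     # default cpuflags
UTIME_CPUFLAGS_0 = 82.918  # -cpuflags 0 (CPU-specific optimizations disabled)

ratio = speedup(UTIME_CPUFLAGS_0, UTIME_DEFAULT)
saved = 1 - UTIME_DEFAULT / UTIME_CPUFLAGS_0

print(f"speedup: {ratio:.2f}x, user time saved: {saved:.1%}")
```

The result is in line with the reported frame rates (16 fps against 12 fps), which are rounded to whole numbers in the ffmpeg status line.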