Mark Thompson wrote:
On 17/08/16 20:47, Chao Liu wrote:
Hi there,
I compared the h264_qsv decoder from ffmpeg to the Intel Media SDK sample_decode.
There is a pretty big speed gap. I wonder whether I did something wrong or there
are really some problems with ffmpeg's implementation.
The test video was captured from a 3MP (2048x1536) camera. The commands I
used:
- ffmpeg -c:v h264_qsv -async_depth 10 -i test.h264 -c:v rawvideo -f null /dev/null
- sample_decode h264 -i test.h264
Both use 100% CPU (a full core). ffmpeg got 170 FPS; sample_decode got
370 FPS.
I haven't had time to debug this. Sending this out to see whether you
guys might have something in mind.
I think in both cases the bottleneck must be something other than the
decode, because the hardware goes a lot faster than either of those for me.
Perhaps you are downloading all of the output frames to normal memory in
order to write them to a null device output, and one of the cases is doing that
less efficiently somehow?
Only tested with AMD UVD, but unless you use -pix_fmt nv12 you will also
get cpu load from ffmpeg doing nv12 -> yuv420p conversion.
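
To put the copy-back cost discussed above in perspective, here is a rough back-of-the-envelope estimate. It assumes 8-bit 4:2:0 output (nv12 or yuv420p, so 1.5 bytes per pixel); the function names are just for illustration, and it only counts a single copy of each frame.

```shell
# Rough estimate of the memory traffic caused by downloading decoded frames.
# Assumption: 8-bit 4:2:0 output (nv12 or yuv420p), i.e. 1.5 bytes per pixel.

frame_bytes() {
    # $1 = width, $2 = height
    echo $(( $1 * $2 * 3 / 2 ))
}

copy_rate_mb() {
    # $1 = width, $2 = height, $3 = frames per second
    # Result is in MB/s (10^6 bytes), truncated to an integer.
    echo $(( $(frame_bytes "$1" "$2") * $3 / 1000000 ))
}

echo "bytes per frame: $(frame_bytes 2048 1536)"
echo "MB/s at 170 fps: $(copy_rate_mb 2048 1536 170)"
echo "MB/s at 370 fps: $(copy_rate_mb 2048 1536 370)"
```

These figures cover one copy only; an nv12 -> yuv420p conversion would add a further read and write of every frame on top of that.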
Using vaapi on a low-power Haswell mobile chip (i.e. the same Quick Sync
hardware that libmfx uses) decodes a single 2048x1536 stream at around 800fps
with less than 50% CPU for me.
- Mark
(My command to compare is:
./ffmpeg_g -vaapi_device /dev/dri/renderD128 -hwaccel vaapi -hwaccel_output_format vaapi -i input.mp4 -an -vf 'format=nv12|vaapi,hwupload' -f null -
Oh nice, I always wondered if there was a way to bench without copy back.
The nasty filtering there is contrived to do nothing, even with the inconvenient stream
reinitialisation. I think libmfx might also work somehow with "-c:v h264_qsv
-hwaccel qsv", but I'm not sure and I don't have anything to try it on right now.)