ffmpeg_qsv.c              | 636 +++++++++++++++++++++++++++++++++++++++++++++-
 libavcodec/qsv.h          |   3 +
 libavcodec/qsv_internal.h |   2 +
 libavcodec/qsvdec.c       |   5 +-
 libavcodec/qsvenc.c       |   2 +
 8 files changed, 649 insertions(+), 5 deletions(-)


This is a giant patch that doesn't even begin to describe what it does.
So, what's it good for? We can already transcode video from the QSV
decoder to the QSV encoder entirely in GPU memory without 600+ lines of
new code. Admittedly that path currently has a few issues, but those
could be fixed. So why do we need 600 new lines of code?

1. At the GPU level, all frames are processed in tiled mode (which we call video memory mode), and the CPU cannot read or write that memory directly. Frame buffers must be allocated via vaCreateSurfaces. Any non-tiled memory must be copied to tiled memory before GPU acceleration can be used. MediaSDK performs this copy internally.

2. In the current implementation, frame buffers are allocated by ffmpeg in linear mode (which we call system memory mode), and the QSV decoder's output and the QSV encoder's input are both set to system memory mode (e.g. iopattern = MFX_IOPATTERN_OUT_SYSTEM_MEMORY in the QSV decoder). As a result, MediaSDK performs two memory copies: one from video memory to system memory on output from the HW decoder, and another from system memory to video memory on input to the HW encoder. This greatly decreases transcoding performance, especially at high resolutions such as 1080p and 4K.

3. The patches avoid these extra memory copies when every module in the transcoding pipeline can be accelerated by the GPU. To achieve this, iopattern must be set to video memory, and an external allocator must be implemented as MediaSDK requires and set on the QSV codec. Most of the 600 lines in the patches implement that external allocator. The patches also add code that checks whether every module in the transcoding pipeline can be accelerated by the GPU, so that the transcoder can select video memory or system memory automatically.

4. In our tests, the patches improve transcoding performance by about 20% or more, depending on resolution, and reach the performance stated in the QSV specification.
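
To illustrate the automatic selection described in point 3, here is a minimal sketch of the decision only. The names mem_mode, stage and select_mem_mode are hypothetical stand-ins invented for this example; they are not the identifiers used in the patches or in MediaSDK. In a real pipeline the result would presumably decide between the MFX_IOPATTERN_*_VIDEO_MEMORY and MFX_IOPATTERN_*_SYSTEM_MEMORY flags and whether an external allocator is registered with the session.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not FFmpeg or MediaSDK types. */
enum mem_mode { MEM_SYSTEM, MEM_VIDEO };

struct stage {
    const char *name;
    bool gpu_accelerated; /* stage can consume and produce video-memory surfaces */
};

/* Video memory is only usable when every stage is GPU accelerated.
 * A single software stage (e.g. a CPU filter) needs linear frames,
 * so the whole pipeline falls back to system memory. */
static enum mem_mode select_mem_mode(const struct stage *stages, size_t n)
{
    assert(stages != NULL || n == 0);

    if (n == 0)
        return MEM_SYSTEM; /* nothing to accelerate */

    for (size_t i = 0; i < n; i++) {
        if (!stages[i].gpu_accelerated)
            return MEM_SYSTEM;
    }
    return MEM_VIDEO;
}
```

With a decoder and an encoder that are both GPU accelerated, select_mem_mode returns MEM_VIDEO; adding one software stage between them makes it return MEM_SYSTEM.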

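To give a rough sense of the copy cost described in point 2, the following sketch counts the bytes moved per frame by the two extra copies. It assumes 8-bit NV12 frames with no row padding, which is an assumption made for this example and not something stated above; the function names are also hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one NV12 frame: a full-resolution luma plane plus an
 * interleaved chroma plane at half height.
 * Assumption: 8-bit samples, even dimensions, no row padding. */
static uint64_t nv12_frame_bytes(uint32_t width, uint32_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return (uint64_t)width * height * 3 / 2;
}

/* Bytes moved per frame by the extra copies in a system-memory pipeline:
 * one copy out of the decoder and one copy into the encoder. */
static uint64_t extra_copy_bytes_per_frame(uint32_t width, uint32_t height)
{
    const unsigned copies_per_frame = 2;
    return copies_per_frame * nv12_frame_bytes(width, height);
}
```

Under these assumptions, extra_copy_bytes_per_frame(1920, 1080) is 6220800 bytes and extra_copy_bytes_per_frame(3840, 2160) is 24883200 bytes, so the per-frame copy traffic is four times higher at 4K than at 1080p.
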
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
