On 23/01/18 15:14, Mironov, Mikhail wrote: >> -----Original Message----- >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf >> Of Mironov, Mikhail >> Sent: January 23, 2018 10:04 AM >> To: FFmpeg development discussions and patches <ffmpeg- >> de...@ffmpeg.org> >> Subject: Re: [FFmpeg-devel] [RFC] amfenc: Add support for OpenCL input >> >>> -----Original Message----- >>> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf >>> Of Mark Thompson >>> Sent: January 22, 2018 6:57 PM >>> To: FFmpeg development discussions and patches <ffmpeg- >>> de...@ffmpeg.org> >>> Subject: [FFmpeg-devel] [RFC] amfenc: Add support for OpenCL input >>> >>> --- >>> This allows passing OpenCL frames to AMF without a download/upload >>> step to get around AMD's lack of support for D3D11 mapping. >>> >>> For example: >>> >>> ./ffmpeg -hwaccel dxva2 -hwaccel_output_format dxva2_vld -i input.mp4 >>> -an -vf >>> >> 'hwmap=derive_device=opencl,program_opencl=source=examples.cl:kernel= >>> rotate_image' -c:v h264_amf output.mp4 >>> >>> * I can't find any documentation or examples for these functions, so >>> I'm guessing a bit exactly how they are meant to work. In particular, >>> there are some locking functions which I have ignored because I have >>> no idea under what circumstances something might want to be locked. >>> * I tried to write common parts with D3D11, but I might well have >>> broken >>> D3D11 support in the process - it doesn't work at all for me so I can't >>> test it. >>> * Not sure how to get non-NV12 to work. I may be missing something, >>> or it may just not be there - the trace messages suggest it doesn't >>> like the width of >>> RGB0 or the second plane of GRAY8. 
>>> >>> - Mark >>> >>> >>> libavcodec/amfenc.c | 178 >>> +++++++++++++++++++++++++++++++++++--------- >>> -------- >>> libavcodec/amfenc.h | 1 + >>> 2 files changed, 123 insertions(+), 56 deletions(-) >>> >>> diff --git a/libavcodec/amfenc.c b/libavcodec/amfenc.c index >>> 89a10ff253..220cdd278f 100644 >>> --- a/libavcodec/amfenc.c >>> +++ b/libavcodec/amfenc.c >>> @@ -24,6 +24,9 @@ >>> #if CONFIG_D3D11VA >>> #include "libavutil/hwcontext_d3d11va.h" >>> #endif >>> +#if CONFIG_OPENCL >>> +#include "libavutil/hwcontext_opencl.h" >>> +#endif >>> #include "libavutil/mem.h" >>> #include "libavutil/pixdesc.h" >>> #include "libavutil/time.h" >>> @@ -51,6 +54,9 @@ const enum AVPixelFormat ff_amf_pix_fmts[] = { #if >>> CONFIG_D3D11VA >>> AV_PIX_FMT_D3D11, >>> #endif >>> +#if CONFIG_OPENCL >>> + AV_PIX_FMT_OPENCL, >>> +#endif >>> AV_PIX_FMT_NONE >>> }; >>> >>> @@ -69,6 +75,7 @@ static const FormatMap format_map[] = >>> { AV_PIX_FMT_YUV420P, AMF_SURFACE_YUV420P }, >>> { AV_PIX_FMT_YUYV422, AMF_SURFACE_YUY2 }, >>> { AV_PIX_FMT_D3D11, AMF_SURFACE_NV12 }, >>> + { AV_PIX_FMT_OPENCL, AMF_SURFACE_NV12 }, >>> }; >>> >>> >>> @@ -154,8 +161,9 @@ static int amf_load_library(AVCodecContext *avctx) >>> >>> static int amf_init_context(AVCodecContext *avctx) { >>> - AmfContext *ctx = avctx->priv_data; >>> - AMF_RESULT res = AMF_OK; >>> + AmfContext *ctx = avctx->priv_data; >>> + AMF_RESULT res; >>> + AVHWDeviceContext *hwdev = NULL; >>> >>> // configure AMF logger >>> // the return of these functions indicates old state and do not >>> affect behaviour @@ -173,59 +181,91 @@ static int >>> amf_init_context(AVCodecContext *avctx) >>> >>> res = ctx->factory->pVtbl->CreateContext(ctx->factory, &ctx->context); >>> AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN, >>> "CreateContext() failed with error %d\n", res); >>> - // try to reuse existing DX device >>> -#if CONFIG_D3D11VA >>> + >>> + // Attempt to initialise from an existing D3D11 or OpenCL device. 
>>> if (avctx->hw_frames_ctx) { >>> - AVHWFramesContext *device_ctx = (AVHWFramesContext*)avctx- >>>> hw_frames_ctx->data; >>> - if (device_ctx->device_ctx->type == AV_HWDEVICE_TYPE_D3D11VA) { >>> - if (amf_av_to_amf_format(device_ctx->sw_format) != >>> AMF_SURFACE_UNKNOWN) { >>> - if (device_ctx->device_ctx->hwctx) { >>> - AVD3D11VADeviceContext *device_d3d11 = >>> (AVD3D11VADeviceContext *)device_ctx->device_ctx->hwctx; >>> - res = ctx->context->pVtbl->InitDX11(ctx->context, >> device_d3d11- >>>> device, AMF_DX11_1); >>> - if (res == AMF_OK) { >>> - ctx->hw_frames_ctx = >>> av_buffer_ref(avctx->hw_frames_ctx); >>> - if (!ctx->hw_frames_ctx) { >>> - return AVERROR(ENOMEM); >>> - } >>> - } else { >>> - if(res == AMF_NOT_SUPPORTED) >>> - av_log(avctx, AV_LOG_INFO, >>> "avctx->hw_frames_ctx has >>> D3D11 device which doesn't have D3D11VA interface, switching to >>> default\n"); >>> - else >>> - av_log(avctx, AV_LOG_INFO, >>> "avctx->hw_frames_ctx has >>> non-AMD device, switching to default\n"); >>> - } >>> - } >>> - } else { >>> - av_log(avctx, AV_LOG_INFO, "avctx->hw_frames_ctx has format >>> not uspported by AMF, switching to default\n"); >>> - } >>> + AVHWFramesContext *hwfc = >>> + (AVHWFramesContext*)avctx->hw_frames_ctx->data; >>> + >>> + if (amf_av_to_amf_format(hwfc->sw_format) == >>> AMF_SURFACE_UNKNOWN) { >>> + av_log(avctx, AV_LOG_VERBOSE, "Input hardware frame >>> + format (%s) >>> is not supported.\n", >>> + av_get_pix_fmt_name(hwfc->sw_format)); >>> + } else { >>> + hwdev = hwfc->device_ctx; >>> + >>> + ctx->hw_frames_ctx = av_buffer_ref(avctx->hw_frames_ctx); >>> + if (!ctx->hw_frames_ctx) >>> + return AVERROR(ENOMEM); >>> } >>> - } else if (avctx->hw_device_ctx) { >>> - AVHWDeviceContext *device_ctx = (AVHWDeviceContext*)(avctx- >>>> hw_device_ctx->data); >>> - if (device_ctx->type == AV_HWDEVICE_TYPE_D3D11VA) { >>> - if (device_ctx->hwctx) { >>> - AVD3D11VADeviceContext *device_d3d11 = >>> (AVD3D11VADeviceContext *)device_ctx->hwctx; >>> - res = 
ctx->context->pVtbl->InitDX11(ctx->context, >>> device_d3d11- >>>> device, AMF_DX11_1); >>> + } >>> + if (!hwdev && avctx->hw_device_ctx) { >>> + hwdev = (AVHWDeviceContext*)avctx->hw_device_ctx->data; >>> + >>> + ctx->hw_device_ctx = av_buffer_ref(avctx->hw_device_ctx); >>> + if (!ctx->hw_device_ctx) >>> + return AVERROR(ENOMEM); >>> + } >>> + if (hwdev) { >>> +#if CONFIG_D3D11VA >>> + if (hwdev->type == AV_HWDEVICE_TYPE_D3D11VA) { >>> + AVD3D11VADeviceContext *d3d11dev = hwdev->hwctx; >>> + >>> + res = ctx->context->pVtbl->InitDX11(ctx->context, >>> + d3d11dev->device, >>> AMF_DX11_1); >>> + if (res == AMF_OK) { >>> + av_log(avctx, AV_LOG_VERBOSE, "Initialised from " >>> + "external D3D11 device.\n"); >>> + return 0; >>> + } >>> + >>> + av_log(avctx, AV_LOG_INFO, "Failed to initialise from " >>> + "external D3D11 device: %d.\n", res); >>> + } else >>> +#endif >>> +#if CONFIG_OPENCL >>> + if (hwdev->type == AV_HWDEVICE_TYPE_OPENCL) { >>> + AVOpenCLDeviceContext *cldev = hwdev->hwctx; >>> + cl_int cle; >>> + >>> + ctx->cl_command_queue = >>> + clCreateCommandQueue(cldev->context, >>> + cldev->device_id, 0, >>> &cle); >>> + if (!ctx->cl_command_queue) { >>> + av_log(avctx, AV_LOG_INFO, "Failed to create OpenCL " >>> + "command queue: %d.\n", cle); >>> + } else { >>> + res = ctx->context->pVtbl->InitOpenCL(ctx->context, >>> + >>> + ctx->cl_command_queue); >>> if (res == AMF_OK) { >>> - ctx->hw_device_ctx = >>> av_buffer_ref(avctx->hw_device_ctx); >>> - if (!ctx->hw_device_ctx) { >>> - return AVERROR(ENOMEM); >>> - } >>> - } else { >>> - if (res == AMF_NOT_SUPPORTED) >>> - av_log(avctx, AV_LOG_INFO, "avctx->hw_device_ctx >>> has >> D3D11 >>> device which doesn't have D3D11VA interface, switching to default\n"); >>> - else >>> - av_log(avctx, AV_LOG_INFO, "avctx->hw_device_ctx >>> has non- >>> AMD device, switching to default\n"); >>> + av_log(avctx, AV_LOG_VERBOSE, "Initialised from " >>> + "external OpenCL device.\n"); >>> + return 0; >>> } >>> + av_log(avctx, 
AV_LOG_INFO, "Failed to initialise from " >>> + "external OpenCL device: %d.\n", res); >>> } >>> + } else >>> +#endif >>> + { >>> + av_log(avctx, AV_LOG_INFO, "Input device type %s is not >>> supported.\n", >>> + av_hwdevice_get_type_name(hwdev->type)); >>> } >>> } >>> -#endif >>> - if (!ctx->hw_frames_ctx && !ctx->hw_device_ctx) { >>> - res = ctx->context->pVtbl->InitDX11(ctx->context, NULL, >> AMF_DX11_1); >>> - if (res != AMF_OK) { >>> - res = ctx->context->pVtbl->InitDX9(ctx->context, NULL); >>> - AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN, >>> "InitDX9() failed with error %d\n", res); >>> + >>> + // Initialise from a new D3D11 device, or D3D9 if D3D11 is not >>> available. >>> + res = ctx->context->pVtbl->InitDX11(ctx->context, NULL, AMF_DX11_1); >>> + if (res == AMF_OK) { >>> + av_log(avctx, AV_LOG_VERBOSE, "Initialised from internal >>> + D3D11 >>> device.\n"); >>> + } else { >>> + av_log(avctx, AV_LOG_VERBOSE, "Failed to initialise from >>> + internal >>> D3D11 device: %d.\n", res); >>> + res = ctx->context->pVtbl->InitDX9(ctx->context, NULL); >>> + if (res == AMF_OK) { >>> + av_log(avctx, AV_LOG_VERBOSE, "Initialised from internal >>> + D3D9 >>> device.\n"); >>> + } else { >>> + av_log(avctx, AV_LOG_VERBOSE, "Failed to initialise from >>> + internal >>> D3D9 device: %d.\n", res); >>> + av_log(avctx, AV_LOG_ERROR, "Unable to initialise AMF.\n"); >>> + return AVERROR_UNKNOWN; >>> } >>> } >>> + >>> return 0; >>> } >>> >>> @@ -279,6 +319,11 @@ int av_cold ff_amf_encode_close(AVCodecContext >>> *avctx) >>> av_buffer_unref(&ctx->hw_device_ctx); >>> av_buffer_unref(&ctx->hw_frames_ctx); >>> >>> +#if CONFIG_OPENCL >>> + if (ctx->cl_command_queue) >>> + clReleaseCommandQueue(ctx->cl_command_queue); >>> +#endif >>> + >>> if (ctx->trace) { >>> ctx->trace->pVtbl->UnregisterWriter(ctx->trace, >>> FFMPEG_AMF_WRITER_ID); >>> } >>> @@ -485,17 +530,38 @@ int ff_amf_send_frame(AVCodecContext *avctx, >>> const AVFrame *frame) >>> 
(AVHWDeviceContext*)ctx->hw_device_ctx->data) >>> )) { >>> #if CONFIG_D3D11VA >>> - static const GUID AMFTextureArrayIndexGUID = { 0x28115527, >>> 0xe7c3, 0x4b66, { 0x99, 0xd3, 0x4f, 0x2a, 0xe6, 0xb4, 0x7f, 0xaf } }; >>> - ID3D11Texture2D *texture = (ID3D11Texture2D*)frame->data[0]; // >>> actual texture >>> - int index = (int)(size_t)frame->data[1]; // index is a slice >>> in texture >>> array is - set to tell AMF which slice to use >>> - texture->lpVtbl->SetPrivateData(texture, >>> &AMFTextureArrayIndexGUID, sizeof(index), &index); >>> - >>> - res = ctx->context->pVtbl->CreateSurfaceFromDX11Native(ctx- >>>> context, texture, &surface, NULL); // wrap to AMF surface >>> - AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR(ENOMEM), >>> "CreateSurfaceFromDX11Native() failed with error %d\n", res); >>> - >>> - // input HW surfaces can be vertically aligned by 16; tell AMF >>> the >> real >>> size >>> - surface->pVtbl->SetCrop(surface, 0, 0, frame->width, frame- >>> height); >>> + if (frame->format == AV_PIX_FMT_D3D11) { >>> + static const GUID AMFTextureArrayIndexGUID = { >>> + 0x28115527, >>> 0xe7c3, 0x4b66, { 0x99, 0xd3, 0x4f, 0x2a, 0xe6, 0xb4, 0x7f, 0xaf } }; >>> + ID3D11Texture2D *texture = >>> + (ID3D11Texture2D*)frame->data[0]; >>> // actual texture >>> + int index = (int)(size_t)frame->data[1]; // index is >>> + a slice in texture >>> array is - set to tell AMF which slice to use >>> + texture->lpVtbl->SetPrivateData(texture, >>> + &AMFTextureArrayIndexGUID, sizeof(index), &index); >>> + >>> + res = >>> + ctx->context->pVtbl->CreateSurfaceFromDX11Native(ctx- >>>> context, texture, &surface, NULL); // wrap to AMF surface >>> + AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, >>> + AVERROR(ENOMEM), "CreateSurfaceFromDX11Native() failed with error >>> + %d\n", res); >>> + >>> + // input HW surfaces can be vertically aligned by 16; >>> + tell AMF the >>> real size >>> + surface->pVtbl->SetCrop(surface, 0, 0, frame->width, >>> + frame- >>>> height); >>> + } else >>> +#endif >>> 
+#if CONFIG_OPENCL >>> + if (frame->format == AV_PIX_FMT_OPENCL) { >>> + void *planes[AV_NUM_DATA_POINTERS]; >>> + AMF_SURFACE_FORMAT format; >>> + int i; >>> + >>> + for (i = 0; i < AV_NUM_DATA_POINTERS; i++) >>> + planes[i] = frame->data[i]; >>> + >>> + format = amf_av_to_amf_format(frame->format); >>> + >>> + res = >>> + ctx->context->pVtbl->CreateSurfaceFromOpenCLNative(ctx- >>>> context, format, >>> + >>> frame->width, frame->height, >>> + >>> planes, &surface, NULL); >>> + AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, >>> AVERROR_UNKNOWN, >>> + "CreateSurfaceFromOpenCLNative() >>> + failed with error >>> %d\n", res); >>> + } else >>> #endif >>> + av_assert0(0 && "Invalid hardware input format."); >>> } else { >>> res = ctx->context->pVtbl->AllocSurface(ctx->context, >>> AMF_MEMORY_HOST, ctx->format, avctx->width, avctx->height, &surface); >>> AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR(ENOMEM), >>> "AllocSurface() failed with error %d\n", res); diff --git >>> a/libavcodec/amfenc.h b/libavcodec/amfenc.h index >>> 84f0aad2fa..bb8fd1807a 100644 >>> --- a/libavcodec/amfenc.h >>> +++ b/libavcodec/amfenc.h >>> @@ -61,6 +61,7 @@ typedef struct AmfContext { >>> >>> AVBufferRef *hw_device_ctx; ///< pointer to HW accelerator >>> (decoder) >>> AVBufferRef *hw_frames_ctx; ///< pointer to HW accelerator >>> (frame >>> allocator) >>> + void *cl_command_queue; ///< Command queue for use with >>> OpenCL input >>> >>> // helpers to handle async calls >>> int delayed_drain; >>> -- >>> 2.11.0 >>> _______________________________________________ >>> ffmpeg-devel mailing list >>> ffmpeg-devel@ffmpeg.org >>> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel >> >> AMF encoder works via D3D9 or D3D11 only. AMF OpenCL support is done >> for possible integration with external image processing. Passing regular >> OpenCL 2D images will cause mapping to system memory and copy. 
>> The fast way is to use interop:
>> - Allocate the last processing NV12 surface as a D3D11 texture
>> - interop it into OpenCL
>> - use it as output for the last OCL kernel
>> - un-interop back to D3D11
>> - submit to AMF.
>> There is not much value in initializing AMF with OpenCL unless the AMF
>> color space converter is used.
>> The converter would do the sequence described above.
>>
>> If AMF CSC is used, a few things have to be done:
>> 1. Device should be created by passing D3D11 device as a parameter. It is
>> done in hwcontext_opencl.c clGetDeviceIDsFromD3D11KHR().
>> 2. The D3D11 device used there should be passed to AMF via InitDX11()
>> preferably before InitOpenCL() call.
>> 3. Add RGB formats for submission.
>> Mikhail
>>
>
> Alternatively we could just allocate D3D11 surface, interop to OCL, copy
> using OCL, un-interop, and submit to AMF:
> Context->InitD3D11(device used for OCL device creation)
> Context->InitOpenCL(queue)
> Context->AllocSurface(AMF_MEMORY_D3D11,AMF_SURFACE_NV12,, &surface);
> surface->Convert(AMF_MEMORY_OPENCL); //interop
> cl_mem planeY = surface->GetPlaneAt(0)->GetNative();
> cl_mem planeUV = surface->GetPlaneAt(1)->GetNative();
>
> clEnqueueCopyImage() // Y
> clEnqueueCopyImage() // UV
> surface->Convert(AMF_MEMORY_D3D11); //un-interop
> encoder->SubmitInput(surface);

Right, that sequence would work; I might try it with D3D9.

Is there a reason why the driver doesn't use this path (or some equivalent) internally? Implementing the download/upload sequence inside the driver feels just as bad, and is significantly more misleading to the user.

(I assume the reason why the OpenCL images aren't usable directly is due to a restriction on tiling modes or some similar layout issue, so at least one copy is definitely required.)

- Mark
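
For reference, the ordering constraints in the quoted sequence (allocate as D3D11, interop into OpenCL, copy Y and UV, un-interop, submit) can be sketched as a small state model in C. This is only an illustration of the order of operations: every type and function name below (`SketchSurface`, `sketch_convert`, `sketch_copy_plane`, `sketch_submit`, `sketch_encode_frame`) is a hypothetical stand-in and none of them is part of AMF, OpenCL or FFmpeg. The rule that a submit fails unless both planes were copied is an assumption made for the model, not documented AMF behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; this is NOT the AMF API. */
typedef enum { MEM_D3D11, MEM_OPENCL } MemType;

typedef struct {
    MemType mem;        /* which API may currently access the surface */
    int planes_copied;  /* planes written through OpenCL so far */
    int submitted;      /* set once the surface was handed to the encoder */
} SketchSurface;

/* Models surface->Convert(): switch which API may access the surface. */
static void sketch_convert(SketchSurface *s, MemType target)
{
    s->mem = target;
}

/* Models one clEnqueueCopyImage() into a plane of the surface.
 * Only legal while the surface is interop'd into OpenCL. */
static int sketch_copy_plane(SketchSurface *s, int plane)
{
    (void)plane; /* the model only counts copies */
    if (s->mem != MEM_OPENCL)
        return -1;
    s->planes_copied++;
    return 0;
}

/* Models encoder->SubmitInput(): in this model the surface must be back
 * in D3D11 and both NV12 planes (Y and UV) must have been copied. */
static int sketch_submit(SketchSurface *s)
{
    if (s->mem != MEM_D3D11 || s->planes_copied != 2)
        return -1;
    s->submitted = 1;
    return 0;
}

/* The whole sequence for one frame, in the order given in the thread. */
static int sketch_encode_frame(SketchSurface *s)
{
    assert(s != NULL);

    /* AllocSurface(): the surface starts life as a D3D11 texture. */
    s->mem = MEM_D3D11;
    s->planes_copied = 0;
    s->submitted = 0;

    sketch_convert(s, MEM_OPENCL);      /* interop */
    if (sketch_copy_plane(s, 0) < 0)    /* Y */
        return -1;
    if (sketch_copy_plane(s, 1) < 0)    /* UV */
        return -1;
    sketch_convert(s, MEM_D3D11);       /* un-interop */

    return sketch_submit(s);
}
```

Calling `sketch_encode_frame()` on a `SketchSurface` returns 0 and leaves the surface in D3D11 with two planes copied. Copying a plane or submitting out of order returns -1, which is the point of the model: the single OpenCL copy has to sit between the two conversions.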