Hi! My GPU is a GTX 1080 Ti. I tried your command but got the same error. I tested with the Windows build downloaded from https://ffmpeg.zeranoe.com/builds/
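
For anyone skimming the log that follows: the line that matters is the CUDA error code reported by overlay_cuda. Here is a minimal sketch for pulling such codes out of a long ffmpeg log. It assumes a Unix-like shell (for example Git Bash or WSL on Windows) with a grep that supports -o. The function name cuda_errors is made up for illustration and is not part of ffmpeg.

```shell
# Hypothetical helper, not part of ffmpeg: read an ffmpeg log on stdin and
# print each distinct CUDA error code it mentions, one per line.
cuda_errors() {
    grep -o 'CUDA_ERROR_[A-Z0-9_]*' | sort -u
}

# Possible usage (ffmpeg writes its log to stderr, hence 2>&1):
#   ffmpeg <your options> 2>&1 | cuda_errors
```

Fed the log in this message, it should print only CUDA_ERROR_INVALID_IMAGE, which points at the cuModuleLoadData call in overlay_cuda rather than at decoding or encoding.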

Stream mapping:
  Stream #0:0 (h264) -> overlay_cuda:main
  Stream #1:0 (png) -> format
  overlay_cuda -> Stream #0:0 (h264_nvenc)
Press [q] to stop, [?] for help
[h264 @ 00000231eee7ce40] NVDEC capabilities:
[h264 @ 00000231eee7ce40] format supported: yes, max_mb_count: 65536
[h264 @ 00000231eee7ce40] min_width: 48, max_width: 4096
[h264 @ 00000231eee7ce40] min_height: 16, max_height: 4096
[h264 @ 00000231eee7ce40] Reinit context to 1280x720, pix_fmt: cuda
[graph 0 input from stream 1:0 @ 0000023182422180] w:1894 h:302 pixfmt:rgba tb:1/25 fr:25/1 sar:11811/11811
[graph 0 input from stream 0:0 @ 000002318bbe1540] w:1280 h:720 pixfmt:cuda tb:1/24000 fr:24000/1001 sar:1/1
[auto_scaler_0 @ 000002318bbe55c0] w:iw h:ih flags:'bilinear' interl:0
[Parsed_format_0 @ 00000231825e4bc0] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 1:0' and the filter 'Parsed_format_0'
[auto_scaler_0 @ 000002318bbe55c0] w:1894 h:302 fmt:rgba sar:11811/11811 -> w:1894 h:302 fmt:nv12 sar:1/1 flags:0x2
[overlay_cuda @ 0000023182798140] cu->cuModuleLoadData(&ctx->cu_module, vf_overlay_cuda_ptx) failed -> CUDA_ERROR_INVALID_IMAGE: device kernel image is invalid
[Parsed_overlay_cuda_2 @ 0000023182431d40] Failed to configure output pad on Parsed_overlay_cuda_2
Error reinitializing filters!
Failed to inject frame into filter network: Generic error in an external library
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0000023182437840] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 00000231eee87b80] Statistics: 409657 bytes read, 2 seeks
[AVIOContext @ 000002318248e700] Statistics: 67602 bytes read, 0 seeks
Conversion failed!

--- Original message ---
From: "Dennis Mungai" <dmng...@gmail.com>
Date: 1 April 2020, 16:51:16

On Wed, 1 Apr 2020 at 16:43, Alex <3.1...@ukr.net> wrote:

> Hi! Is it working?
> I tried everything but constantly get an error from overlay_cuda:
>
> ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid -c:v h264_cuvid -resize 1920x1080 -i 720p.mp4 -i watermark.png -filter_complex "[1:v]format=nv12,hwupload[img];[0:v][img]overlay_cuda=x=50:y=800[out]" -map [out] -c:v h264_nvenc -b:v 6M -an -preset fast -y out_nvenc_overlay.mp4
> ...
> ffmpeg version git-2020-04-01-afa5e38
> ...
> [h264_cuvid @ 000001dd1b356d00] CUVID capabilities for h264_cuvid:
> [h264_cuvid @ 000001dd1b356d00] 8 bit: supported: 1, min_width: 48, max_width: 4096, min_height: 16, max_height: 4096
> [h264_cuvid @ 000001dd1b356d00] 10 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
> [h264_cuvid @ 000001dd1b356d00] 12 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
> Stream mapping:
>   Stream #0:0 (h264_cuvid) -> overlay_cuda:main
>   Stream #1:0 (png) -> format
>   overlay_cuda -> Stream #0:0 (h264_nvenc)
> Press [q] to stop, [?] for help
> [h264_cuvid @ 000001dd1b356d00] Formats: Original: cuda | HW: cuda | SW: nv12
> [graph 0 input from stream 1:0 @ 000001dd2e84a100] w:1894 h:302 pixfmt:rgba tb:1/25 fr:25/1 sar:11811/11811
> [graph 0 input from stream 0:0 @ 000001dd2e84ae00] w:1920 h:1080 pixfmt:cuda tb:1/24000 fr:24000/1001 sar:1/1
> [auto_scaler_0 @ 000001dd2ebf4cc0] w:iw h:ih flags:'bilinear' interl:0
> [Parsed_format_0 @ 000001dd2e849780] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 1:0' and the filter 'Parsed_format_0'
> [auto_scaler_0 @ 000001dd2ebf4cc0] w:1894 h:302 fmt:rgba sar:11811/11811 -> w:1894 h:302 fmt:nv12 sar:1/1 flags:0x2
> [overlay_cuda @ 000001dd2ebc87c0] cu->cuModuleLoadData(&ctx->cu_module, vf_overlay_cuda_ptx) failed -> CUDA_ERROR_INVALID_IMAGE: device kernel image is invalid
> [Parsed_overlay_cuda_2 @ 000001dd2e84b6c0] Failed to configure output pad on Parsed_overlay_cuda_2
> Error reinitializing filters!
> Failed to inject frame into filter network: Generic error in an external library
> Error while processing the decoded data for stream #0:0
> ...
>
> --- Original message ---
> From: "Yaroslav Pogrebnyak" <yyyaros...@gmail.com>
> Date: 18 March 2020, 09:29:15
>
> This patch adds the 'vf_overlay_cuda' filter.
> It draws one picture on top of another on a CUDA GPU.
> For the end user, it is similar to 'vf_overlay_opencl' and other overlay
> filters.
>
> This filter would be especially useful for building video processing
> pipelines that execute fully on the CUDA GPU. For example, the following
> pipeline would be possible: decode -> scale -> overlay -> encode, without
> copying frames between CPU and GPU in between.
>
> Technical details.
>
> Supported sw input formats are NV12 and YUV420P for the main input, and
> NV12, YUV420P and YUVA420P for the overlay input.
> Main and overlay sw formats should match (i.e., overlaying YUVA420P on NV12
> is not implemented).
> All pixel format conversions must be done with the 'format' or 'scale_npp'
> filters before 'overlay_cuda'.
>
> It was necessary to slightly modify 'hwcontext_cuda.c' to allow overlays
> with an alpha channel:
> - Allow AV_PIX_FMT_YUVA420P, to enable hwuploading frames with an alpha
>   channel to the GPU.
> - Do not shift the height of the 4th plane (alpha) when uploading to the GPU.
>
> Examples.
>
> - Overlay picture on top of video (main: YUVJ420P->NV12, overlay: NV12)
> $ ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid \
>     -c:v h264_cuvid -i main.mp4 \
>     -i ~/overlay.jpg \
>     -filter_complex "[1:v]format=nv12, hwupload[overlay], [0:v][overlay]overlay_cuda=x=0:y=0:shortest=false" \
>     -an -c:v h264_nvenc -b:v 5M output.mp4
>
> - Overlay one video on top of another (main: NV12, overlay: NV12)
> $ ffmpeg -y \
>     -hwaccel cuvid -c:v h264_cuvid -i main.mp4 \
>     -hwaccel cuvid -c:v h264_cuvid -i overlay.mp4 \
>     -filter_complex "[1:v]scale_npp=512:-1[o], [v:0][o]overlay_cuda=x=100:y=100:shortest=true" \
>     -an -c:v h264_nvenc -b:v 5M output.mp4
>
> - Overlay picture with alpha channel on top of video (main: NV12->YUV420P,
>   overlay: RGBA->YUVA420P)
> $ ffmpeg -y \
>     -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid \
>     -c:v h264_cuvid -i ~/main.mp4 \
>     -i ~/overlay.png \
>     -filter_complex "[1:v]format=yuva420p, hwupload[o], [v:0]scale_npp=format=yuv420p[m], [m][o]overlay_cuda=x=0:y=0:shortest=false" \
>     -an -c:v h264_nvenc -b:v 5M output.mp4
>
> Patch attached.
>
> P.S. This is my first patch, I would be grateful for any feedback to know
> if I'm doing things correctly or not.
> Thanks!
>
> Signed-off-by: Yaroslav Pogrebnyak <yyyaros...@gmail.com>
> ---
>  configure                      |   2 +
>  libavfilter/Makefile           |   1 +
>  libavfilter/allfilters.c       |   1 +
>  libavfilter/vf_overlay_cuda.c  | 451 +++++++++++++++++++++++++++++++++
>  libavfilter/vf_overlay_cuda.cu |  54 ++++
>  libavutil/hwcontext_cuda.c     |   3 +-
>  6 files changed, 511 insertions(+), 1 deletion(-)
>  create mode 100644 libavfilter/vf_overlay_cuda.c
>  create mode 100644 libavfilter/vf_overlay_cuda.cu

How does the NVDEC path work out?
Try this:

ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuda -hwaccel_output_format cuda -i 720p.mp4 -i watermark.png -filter_complex "[1:v]format=nv12,hwupload[img];[0:v][img]overlay_cuda=x=50:y=800[out]" -map [out] -c:v h264_nvenc -b:v 6M -an -preset fast -y out_nvenc_overlay.mp4

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".