Hi! Is this working? I have tried everything, but I constantly get an error from overlay_cuda:


ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid -c:v 
h264_cuvid -resize 1920x1080 -i 720p.mp4 -i watermark.png -filter_complex 
"[1:v]format=nv12,hwupload[img];[0:v][img]overlay_cuda=x=50:y=800[out]" -map 
[out] -c:v h264_nvenc -b:v 6M -an -preset fast  -y out_nvenc_overlay.mp4
...
ffmpeg version git-2020-04-01-afa5e38
...
[h264_cuvid @ 000001dd1b356d00] CUVID capabilities for h264_cuvid:
[h264_cuvid @ 000001dd1b356d00] 8 bit: supported: 1, min_width: 48, max_width: 
4096, min_height: 16, max_height: 4096
[h264_cuvid @ 000001dd1b356d00] 10 bit: supported: 0, min_width: 0, max_width: 
0, min_height: 0, max_height: 0
[h264_cuvid @ 000001dd1b356d00] 12 bit: supported: 0, min_width: 0, max_width: 
0, min_height: 0, max_height: 0
Stream mapping:
  Stream #0:0 (h264_cuvid) -> overlay_cuda:main
  Stream #1:0 (png) -> format
  overlay_cuda -> Stream #0:0 (h264_nvenc)
Press [q] to stop, [?] for help
[h264_cuvid @ 000001dd1b356d00] Formats: Original: cuda | HW: cuda | SW: nv12
[graph 0 input from stream 1:0 @ 000001dd2e84a100] w:1894 h:302 pixfmt:rgba 
tb:1/25 fr:25/1 sar:11811/11811
[graph 0 input from stream 0:0 @ 000001dd2e84ae00] w:1920 h:1080 pixfmt:cuda 
tb:1/24000 fr:24000/1001 sar:1/1
[auto_scaler_0 @ 000001dd2ebf4cc0] w:iw h:ih flags:'bilinear' interl:0
[Parsed_format_0 @ 000001dd2e849780] auto-inserting filter 'auto_scaler_0' 
between the filter 'graph 0 input from stream 1:0' and the filter 
'Parsed_format_0'
[auto_scaler_0 @ 000001dd2ebf4cc0] w:1894 h:302 fmt:rgba sar:11811/11811 -> 
w:1894 h:302 fmt:nv12 sar:1/1 flags:0x2
[overlay_cuda @ 000001dd2ebc87c0] cu->cuModuleLoadData(&ctx->cu_module, 
vf_overlay_cuda_ptx) failed -> CUDA_ERROR_INVALID_IMAGE: device kernel image is 
invalid
[Parsed_overlay_cuda_2 @ 000001dd2e84b6c0] Failed to configure output pad on 
Parsed_overlay_cuda_2
Error reinitializing filters!
Failed to inject frame into filter network: Generic error in an external library
Error while processing the decoded data for stream #0:0
...



--- Original message ---
From: "Yaroslav Pogrebnyak" <yyyaros...@gmail.com>
Date: 18 March 2020, 09:29:15

This patch adds the 'vf_overlay_cuda' filter.
It draws one picture on top of another on a CUDA GPU.
For the end user, it is similar to 'vf_overlay_opencl' and the other overlay
filters.

This filter would be especially useful for building video processing pipelines
that execute fully on the CUDA GPU. For example, the following pipeline would
be possible: decode -> scale -> overlay -> encode, with no frame copies between
CPU and GPU along the way.

Technical details.

Supported sw input formats are NV12 and YUV420P for the main input, and NV12,
YUV420P and YUVA420P for the overlay input.
The main and overlay sw formats should match (for example, overlaying YUVA420P
on NV12 is not implemented).
Any pixel format conversion must be done with the 'format' or 'scale_npp'
filters before 'overlay_cuda'.
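
As a rough illustration of the matching rule, here is a minimal shell sketch.
The helper name overlay_cuda_formats_ok is hypothetical and not part of FFmpeg,
and the yuv420p/yuva420p pairing is an assumption inferred from the alpha
example in this message, not taken from the filter source.

```shell
#!/bin/sh
# Hypothetical helper (not part of FFmpeg): succeeds when a main/overlay
# software pixel format pair fits the rules described in this message.
overlay_cuda_formats_ok() {
    main_fmt=$1
    overlay_fmt=$2
    case "$main_fmt:$overlay_fmt" in
        nv12:nv12)        return 0 ;;
        yuv420p:yuv420p)  return 0 ;;
        yuv420p:yuva420p) return 0 ;;  # assumed from the alpha example
        *)                return 1 ;;  # anything else needs conversion first
    esac
}

# Try a few pairs; the last one is the unsupported case named above.
for pair in nv12:nv12 yuv420p:yuva420p nv12:yuva420p; do
    if overlay_cuda_formats_ok "${pair%%:*}" "${pair##*:}"; then
        echo "$pair: supported"
    else
        echo "$pair: convert first with format or scale_npp"
    fi
done
```

A check like this only mirrors the description; the filter itself remains the
authority on which format pairs it accepts.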

'hwcontext_cuda.c' had to be modified slightly to allow overlays with an
alpha channel:
 - Allow AV_PIX_FMT_YUVA420P, to enable hwuploading frames with an alpha
channel to the GPU.
 - Do not shift the height of the 4th plane (alpha) when uploading to the GPU.
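
To show why the alpha plane needs special handling, here is a small arithmetic
sketch of the per-plane row counts for a 4:2:0 frame with alpha. The helper
name plane_rows is hypothetical, and this illustrates the idea only; it is not
the code from 'hwcontext_cuda.c'.

```shell
#!/bin/sh
# Rows to copy for one plane of a 4:2:0 frame with alpha (e.g. YUVA420P).
# Usage: plane_rows HEIGHT PLANE_INDEX   (hypothetical helper, illustration only)
plane_rows() {
    height=$1
    plane=$2
    case "$plane" in
        0|3) echo "$height" ;;           # luma and alpha: full height
        *)   echo $(( height >> 1 )) ;;  # chroma: height shifted by 1 for 4:2:0
    esac
}

# Print the row count of each of the four planes of a 1080-row frame.
for p in 0 1 2 3; do
    echo "plane $p: $(plane_rows 1080 "$p") rows"
done
```

For a 1080-row frame this gives 1080, 540, 540 and 1080 rows for planes 0 to 3.
Shifting the 4th plane like a chroma plane would copy only the top half of the
alpha data.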

Examples.

- Overlay picture on top of video (main: YUVJ420P->NV12, overlay: NV12)
$ ffmpeg -y -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid \
  -c:v h264_cuvid -i main.mp4 \
  -i ~/overlay.jpg \
  -filter_complex "[1:v]format=nv12, hwupload[overlay], [0:v][overlay]overlay_cuda=x=0:y=0:shortest=false" \
  -an -c:v h264_nvenc -b:v 5M output.mp4

- Overlay one video on top of another (main: NV12, overlay: NV12)
$ ffmpeg -y \
  -hwaccel cuvid -c:v h264_cuvid -i main.mp4 \
  -hwaccel cuvid -c:v h264_cuvid -i overlay.mp4 \
  -filter_complex "[1:v]scale_npp=512:-1[o], [0:v][o]overlay_cuda=x=100:y=100:shortest=true" \
  -an -c:v h264_nvenc -b:v 5M output.mp4

- Overlay picture with alpha channel on top of video (main: NV12->YUV420P, 
overlay: RGBA->YUVA420P)
$ ffmpeg -y \
  -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid \
  -c:v h264_cuvid -i ~/main.mp4 \
  -i ~/overlay.png \
  -filter_complex "[1:v]format=yuva420p, hwupload[o], [0:v]scale_npp=format=yuv420p[m], [m][o]overlay_cuda=x=0:y=0:shortest=false" \
  -an -c:v h264_nvenc -b:v 5M output.mp4

Patch attached.

P.S. This is my first patch, I would be grateful for any feedback to know if 
I'm doing things correctly or not.
Thanks!


Signed-off-by: Yaroslav Pogrebnyak <yyyaros...@gmail.com>
---
 configure                      |   2 +
 libavfilter/Makefile           |   1 +
 libavfilter/allfilters.c       |   1 +
 libavfilter/vf_overlay_cuda.c  | 451 +++++++++++++++++++++++++++++++++
 libavfilter/vf_overlay_cuda.cu |  54 ++++
 libavutil/hwcontext_cuda.c     |   3 +-
 6 files changed, 511 insertions(+), 1 deletion(-)
 create mode 100644 libavfilter/vf_overlay_cuda.c
 create mode 100644 libavfilter/vf_overlay_cuda.cu


_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".