On 28.09.2021 18:22, Roman Arzumanyan wrote:
Hello,

This patch makes nvenc copy incoming hwaccel frames instead of increasing 
their reference count.
It fixes a bug which may happen when on-GPU transcoding is done and the encoder 
is set to use B frames.

How to reproduce the bug:
./ffmpeg \
   -hwaccel cuda -hwaccel_output_format cuda \
   -i input.mkv \
   -c:v h264_nvenc -preset p4 -tune hq -bf 3 \
   -y output.mkv

Expected output:
[h264 @ 0x55b14da4b4c0] No decoder surfaces left
[h264 @ 0x55b14da682c0] No decoder surfaces left
[h264 @ 0x55b14da850c0] No decoder surfaces left
[h264 @ 0x55b14daa1ec0] No decoder surfaces left
Error while decoding stream #0:0: Invalid data found when processing input
[h264 @ 0x55b14da2e6c0] No decoder surfaces left
Error while decoding stream #0:0: Invalid data found when processing input
     Last message repeated 1 times


Although the fix adds an extra CUDA DtoD memcopy, our internal testing results 
didn't show any noticeable difference in transcoding performance.


Hmm, so far my approach to deal with this was to inject a scale_cuda=passthrough=0 into the filter chain, which pretty much does exactly this, but only controllable by the user.

But I do agree that this is a bit of a crutch and not all that user friendly.

My main concern with this approach is that it will inevitably increase VRAM usage, and depending on bframe count and resolution even quite significantly. And it's surprisingly common for users to show up who are highly pressed for memory. When bframes were switched on by default, several people showed up who were suddenly running out of VRAM.
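
To put a rough number on that concern, here is a back-of-the-envelope sketch in C. It assumes 8-bit NV12 surfaces (1.5 bytes per pixel) and ignores pitch alignment and any driver overhead, so real usage will be somewhat higher. The helper names and the held_frames parameter are made up for illustration; how many input frames the encoder actually holds at once depends on its configuration and is not something this sketch can tell you.

```c
#include <assert.h>
#include <stddef.h>

/* Size of one 8-bit NV12 frame: a full-resolution luma plane plus a
 * half-size interleaved chroma plane, i.e. width * height * 3 / 2 bytes.
 * Assumes even dimensions and no row padding (real surfaces are padded). */
static size_t nv12_frame_bytes(int width, int height)
{
    assert(width > 0 && height > 0);
    return (size_t)width * (size_t)height * 3 / 2;
}

/* Extra memory needed if the encoder keeps its own copy of every input
 * frame it currently holds. held_frames is a hypothetical input here. */
static size_t extra_copy_bytes(int width, int height, int held_frames)
{
    assert(held_frames >= 0);
    return nv12_frame_bytes(width, height) * (size_t)held_frames;
}
```

Under these assumptions one 1920x1080 frame is 3110400 bytes (about 3 MiB), and four held 3840x2160 frames add 49766400 bytes (about 47 MiB) on top of what the decoder already uses.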

I do like this approach though, since it will for the average user make using a full hw chain a lot less bothersome.

So what I'd propose is:

- Add an option to retain the old behaviour of just holding a reference to the input frame no matter what.
- Instead of explicitly copying the frame like you do right now, call av_frame_make_writable() on the frame, right after the point where you currently replace av_frame_ref with av_hwframe_transfer_data. For one, that is very easy to disable conditionally, and it does not require you to guard all the unref calls. Plus, it will only actually copy the frame if needed (i.e. it won't do anything if the frame comes out of a filterchain and nothing else holds a ref).
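
The copy-only-if-shared behaviour proposed above can be sketched with a toy model. This is plain C with no FFmpeg dependency: ToyBuf, ToyFrame, all toy_* functions and the hold_ref_only flag are hypothetical stand-ins, not the nvenc code or the libavutil API. It only mirrors the documented contract of av_frame_make_writable(), which does nothing if the frame is already writable and otherwise allocates new buffers and copies the data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy reference-counted buffer, standing in for a GPU surface. */
typedef struct ToyBuf {
    int refcount;
    size_t size;
    unsigned char *data;
} ToyBuf;

/* Toy frame: just a handle onto a shared buffer. */
typedef struct ToyFrame {
    ToyBuf *buf;
} ToyFrame;

static ToyBuf *toy_buf_alloc(size_t size)
{
    ToyBuf *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->data = calloc(size ? size : 1, 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->refcount = 1;
    b->size = size;
    return b;
}

/* Take an additional reference to src's buffer. */
static void toy_frame_ref(ToyFrame *dst, const ToyFrame *src)
{
    assert(src->buf);
    dst->buf = src->buf;
    dst->buf->refcount++;
}

/* Drop this frame's reference; free the buffer when it was the last one. */
static void toy_frame_unref(ToyFrame *f)
{
    if (!f->buf)
        return;
    if (--f->buf->refcount == 0) {
        free(f->buf->data);
        free(f->buf);
    }
    f->buf = NULL;
}

/* Returns 0 if the frame was already the sole owner (no copy),
 * 1 if a private copy was made, -1 on allocation failure. */
static int toy_frame_make_writable(ToyFrame *f)
{
    ToyBuf *copy;

    assert(f->buf);
    if (f->buf->refcount == 1)
        return 0;
    copy = toy_buf_alloc(f->buf->size);
    if (!copy)
        return -1;
    memcpy(copy->data, f->buf->data, f->buf->size);
    f->buf->refcount--;   /* give the shared buffer back to its other owners */
    f->buf = copy;
    return 1;
}

/* Encoder intake: takes over the caller's reference (in is left empty).
 * hold_ref_only models the proposed option to keep the old behaviour. */
static int toy_encoder_take(ToyFrame *enc, ToyFrame *in, int hold_ref_only)
{
    enc->buf = in->buf;
    in->buf = NULL;
    if (hold_ref_only)
        return 0;
    return toy_frame_make_writable(enc);
}
```

In this model a frame that the decoder still references gets copied on intake, so the decoder's buffer is no longer pinned by the encoder, while a frame with a single owner is passed through untouched. With hold_ref_only set, the encoder simply keeps the shared reference, as before.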


Timo

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel