Thanks for the quick update, Timo. Honestly, I may simply have missed the input frame registration step, since I was tracking CUdeviceptrs in the map / unmap and memcpy calls. Strangely enough, the patch passed an extensive round of QA validation on different codecs and HW setups.
I think your approach is obviously safer and illustrates better API usage.

________________________________
From: Timo Rothenpieler
Sent: Tuesday, September 28, 2021 22:11
To: FFmpeg development discussions and patches; Roman Arzumanyan
Cc: Yogender Gupta; Ricardo Monteiro
Subject: Re: [FFmpeg-devel] [PATCH] libavcodec/nvenc.c: copy incoming hwaccel frames instead of ref count increase

On 28.09.2021 19:58, Timo Rothenpieler wrote:
> Hmm, so far my approach to deal with this was to inject a
> scale_cuda=passthrough=0 into the filter chain, which pretty much does
> exactly this, but only controllable by the user.
>
> But I do agree that this is a bit of a crutch and not all that user
> friendly.
>
> My main concern with this approach is that it will inevitably increase
> VRAM usage, depending on bframe count and resolution even quite
> significantly.
> And it's surprisingly common that users show up who are highly pressed
> for memory. When bframes were switched on by default, several people
> showed up who were suddenly running out of VRAM.
>
> I do like this approach though, since it will for the average user make
> using a full hw chain a lot less bothersome.
>
> So what I'd propose is:
>
> - Add an option to retain the old behaviour of just holding a reference
> to the input frame no matter what.
> - Instead of explicitly copying the frame like you do right now, call
> av_frame_make_writable() on the frame, right after the place where you
> are currently replacing av_frame_ref with av_hwframe_transfer_data.
> That is, for one, very easy to disable conditionally, and does not
> require you to guard all the unref calls.
> Plus, it will only actually copy the frame if needed (i.e. it won't do
> anything if it comes out of a filterchain and has nothing else holding a
> ref)
>
>
> Timo

See attached patch for that approach.
I just encoded a 5 minute sample using it, and I do see a marginal decrease in performance (it drops by literally 0.01x speed, so pretty much within margin of error, but it did show that consistently) and an increase in VRAM usage, as expected.

However, given that your patch does seem to work just fine, somehow, it would be interesting to know whether re-using a frame/CUDA buffer after registering it with nvenc is safe.
Given that the logic right now never unregisters buffers unless it runs out of free slots, it would seem weird to me if that was the case.
What if a buffer actually does get re-used, as is common with non-nvdec frames allocated from a hwframes ctx?
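
For anyone reading along without the attachment: the "only copy if something else still holds a ref" idea from the proposal can be sketched as a toy model. This is not the attached patch and it uses no FFmpeg code; Buffer, Frame and their methods are hypothetical stand-ins that only mimic the reference-counting behaviour discussed in this thread, and the byte strings are made-up placeholder data.

```python
# Toy model only: Buffer and Frame are hypothetical stand-ins, not FFmpeg types.
class Buffer:
    """A reference-counted block of (pretend) device memory."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refcount = 1


class Frame:
    """A frame that holds exactly one reference to a Buffer."""

    def __init__(self, buf):
        self.buf = buf

    def ref(self):
        # Create a second frame sharing the same buffer (cheap, no copy).
        self.buf.refcount += 1
        return Frame(self.buf)

    def unref(self):
        # Drop this frame's reference to the buffer.
        self.buf.refcount -= 1
        self.buf = None

    def is_writable(self):
        # Writable means nobody else references the buffer.
        return self.buf.refcount == 1

    def make_writable(self):
        """Copy the buffer only if it is shared. Returns True if a copy was made."""
        if self.is_writable():
            return False
        old = self.buf
        self.buf = Buffer(old.data)  # private copy: this is the extra memory cost
        old.refcount -= 1
        return True


# Case 1: frame comes out of a filter chain and nothing else keeps a ref.
upstream = Frame(Buffer(b"frame-0"))
enc_ref = upstream.ref()
upstream.unref()
copied_exclusive = enc_ref.make_writable()  # no copy needed

# Case 2: the producer keeps its ref and later re-uses the buffer.
producer = Frame(Buffer(b"frame-1"))
enc_ref2 = producer.ref()
copied_shared = enc_ref2.make_writable()    # shared, so a copy is made
producer.buf.data[:] = b"frame-2"           # producer overwrites its buffer
```

In the second case the encoder-side frame still sees the original contents after the producer overwrites its buffer, at the price of one extra allocation. That is the trade-off described above: more memory in use, but only when the frame really is shared.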