> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of
> James Almer
> Sent: Tuesday, November 22, 2022 3:41 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 3/4] avutil/cuda_check: propagate
> AVERROR_UNRECOVERABLE when needed
>
> On 11/22/2022 11:33 AM, Soft Works wrote:
> >
> >
> >> -----Original Message-----
> >> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of
> >> James Almer
> >> Sent: Tuesday, November 22, 2022 2:31 PM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH 3/4] avutil/cuda_check: propagate
> >> AVERROR_UNRECOVERABLE when needed
> >>
> >> On 11/22/2022 10:21 AM, Timo Rothenpieler wrote:
> >>> On 22/11/2022 14:07, James Almer wrote:
> >>>> Based on a patch by Soft Works.
> >>>>
> >>>> Signed-off-by: James Almer <jamr...@gmail.com>
> >>>> ---
> >>>> libavutil/cuda_check.h | 4 ++++
> >>>> 1 file changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/libavutil/cuda_check.h b/libavutil/cuda_check.h
> >>>> index f5a9234eaf..33aaf9c098 100644
> >>>> --- a/libavutil/cuda_check.h
> >>>> +++ b/libavutil/cuda_check.h
> >>>> @@ -49,6 +49,10 @@ static inline int ff_cuda_check(void *avctx,
> >>>> av_log(avctx, AV_LOG_ERROR, " -> %s: %s", err_name,
> >>>> err_string);
> >>>> av_log(avctx, AV_LOG_ERROR, "\n");
> >>>> + // Not recoverable
> >>>> + if (err == CUDA_ERROR_UNKNOWN)
> >>>> + return AVERROR_UNRECOVERABLE;
> >>>
> >>> Why does specifically CUDA_ERROR_UNKNOWN get mapped to unrecoverable?
> >>
> >> It's the code that Soft Works found out was returned repeatedly no
> >> matter how many packets you fed to the encoder, which meant it was
> >> stuck in an unrecoverable state. See
> >> http://ffmpeg.org/pipermail/ffmpeg-devel/2021-October/287153.html
> >>
> >> If you know of cases where this error could be returned in valid
> >> recoverable scenarios that are not already handled in some other way,
> >> what do you suggest could be done?
> >
> > Thanks James, for picking this up!
> >
> > All I can say is that my original patch is deployed to a quite a
> > number of systems and there hasn't been any case where this
> > would have had an adverse effect.
> >
> > I hadn't reported this to Nvidia because a solution was needed
> > and it was an erroneous file, so the best they could
> > have probably done was to return a different error code ;-)
> >
> > softworkz
>
> Can you be more specific about what kind of erroneous file it was? Are
> we talking about a completely broken stream where no packet was valid
> and even the software decoder would fail, or something that had one
> invalid packet that somehow chocked the nvdec...

I was able to find the conversations where this had been reported.
There were three cases; two were investigated, and both were quite
similar.

The first case was about an mpegts "recording" from some online stream
where the "recorder" was simply reconnecting on connection failure and
then continued writing to the same mpegts file. It seems the server
disconnected after 30 min and the streams changed from then on:

11:35:35.096 frame=107726 fps=371 q=29.0 size= 588032kB time=00:29:57.59 bitrate=2682.9kbits/s throttle=off speed=6.18x
11:35:35.596 frame=107907 fps=371 q=28.0 size= 589312kB time=00:30:00.62 bitrate=2684.2kbits/s throttle=off speed=6.18x
11:35:35.995 [mpeg2_cuvid @ 0x699a40] AVHWFramesContext is already initialized with incompatible parameters
11:35:35.995 [mpeg2_cuvid @ 0x699a40] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error
11:35:35.995 Error while decoding stream #0:0: Generic error in an external library
11:35:35.998 [mpeg2_cuvid @ 0x699a40] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error
11:35:35.998 Error while decoding stream #0:0: Generic error in an external library
11:35:36.003 [mpeg2_cuvid @ 0x699a40] ctx->cvdl->cuvidParseVideoData(ctx->cuparser, &cupkt) failed -> CUDA_ERROR_UNKNOWN: unknown error

We can't know what "incompatible parameters" actually means. It could
be the frame size, but it could also be a different codec (like H264
instead of MPEG2), or both, or interlaced vs. non-interlaced.

The other case was similar. The user eventually admitted: "I used
ffmpeg and a bash script to concat the 3x videos into a single episode"
and that the codecs might have been different. Here it fails right from
the start, as the "-ss 00:07:57.000" is probably jumping right into the
second segment, which differs from the probe results (total length
22 min).

I remember now that I had constructed test files like this, but with
much shorter "bad parts".
The ffmpeg parser could read over it (at least somewhat) and eventually
recover, while the cuvid parser never came back. But that was just to
find out whether the cuvid error state is terminal or not.

The ability to recover doesn’t help when a stream change is permanent
(= not an erroneous incident for a few seconds). As such, the
requirement was simply: when that happens, ffmpeg should exit instead
of feeding the cuvid zombie to infinity.

Best regards,
softworkz
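
To illustrate the requirement described above (stop feeding once the
decoder reports a terminal state, but keep going on ordinary errors),
here is a minimal, self-contained sketch. It does not use FFmpeg or
CUDA at all: MockDecoder, mock_decode(), feed_packets() and the
SKETCH_ERR_* values are hypothetical stand-ins invented for this
example, with SKETCH_ERR_UNRECOVERABLE playing the role that the patch
proposes for AVERROR_UNRECOVERABLE.

```c
#include <stdio.h>

/* Hypothetical stand-ins, NOT real FFmpeg error codes. */
#define SKETCH_ERR_EXTERNAL      (-1) /* ordinary, possibly transient error */
#define SKETCH_ERR_UNRECOVERABLE (-2) /* terminal decoder state */

/* Mock decoder: packet `glitch_at` fails once with an ordinary error;
 * every packet from `break_at` onwards fails with the terminal error,
 * mimicking a parser that never comes back after a stream change. */
typedef struct MockDecoder {
    int glitch_at;
    int break_at;
} MockDecoder;

static int mock_decode(const MockDecoder *dec, int pkt_index)
{
    if (pkt_index >= dec->break_at)
        return SKETCH_ERR_UNRECOVERABLE;
    if (pkt_index == dec->glitch_at)
        return SKETCH_ERR_EXTERNAL;
    return 0;
}

/* Submit packets 0..n_packets-1. Ordinary errors are logged and the
 * loop continues; the terminal error aborts the loop immediately.
 * *fed receives the number of packets that were submitted.
 * Returns 0 if all packets were submitted, or
 * SKETCH_ERR_UNRECOVERABLE if the loop was aborted. */
static int feed_packets(const MockDecoder *dec, int n_packets, int *fed)
{
    *fed = 0;
    for (int i = 0; i < n_packets; i++) {
        int ret = mock_decode(dec, i);
        (*fed)++;
        if (ret == SKETCH_ERR_UNRECOVERABLE) {
            fprintf(stderr, "packet %d: unrecoverable error, stopping\n", i);
            return ret;
        }
        if (ret < 0)
            fprintf(stderr, "packet %d: decode error, continuing\n", i);
    }
    return 0;
}
```

With a decoder that breaks at packet 5, feed_packets() stops after
submitting six packets no matter how many remain, which is the "exit
instead of feeding forever" behaviour; a single glitch on an earlier
packet only produces a log line.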