Hi there.
This is my first message to this list, so please excuse me if I unintentionally break some rule.

I've read the debate between Soft Works and others, and would like to add something to it. I don't have the deep knowledge of the libraries that other people here show; my knowledge comes from working with live streams for some years now. And I do understand the concern about modifying a public API for the use case under debate: I believe it's a legitimate line of questioning of Soft Works's patches. However, I also feel that we live-streaming people are often set aside as a "border case" when it comes to ffmpeg/libav usage, and this bias is present in many subtitles/captions debates.

I work with digital TV signals as input, and several different target outputs more related to live streaming (mobiles, PCs, and so on). The target location is Latin America, so I need subtitles/captions whenever we use English-spoken audio (we speak mostly Spanish in LATAM). TV people send you TV subtitle formats: SCTE-27, DVB subtitles, and so on. And live-streaming people use other subtitle formats, mostly VTT and TTML. I've found CEA-608 captions to be the most compatible caption format, as they're understood natively by smart TVs and other devices, as well as non-natively by any other device using popular player-side libraries. So I made my own filter for generating CEA-608 captions for live streams, using ffmpeg with the previously available OCR filter. I tried VTT first, but it was problematic for live-streaming packaging, and with CEA-608 I could simply skip that part of the process.
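For those unfamiliar with the format, part of what makes CEA-608 so portable is its simplicity: caption data rides along as two bytes per video frame, each byte carrying a 7-bit code transmitted with odd parity. A minimal sketch of that packing in Python (the function names are mine, not from my actual filter, and the ASCII mapping only holds for basic Latin letters; the real 608 character set has its own table for accents and special characters):

```python
def with_odd_parity(b7: int) -> int:
    """CEA-608 transmits 7-bit codes; bit 7 is set so that every byte
    on the wire has odd parity (an odd number of 1 bits)."""
    assert 0 <= b7 < 0x80
    ones = bin(b7).count("1")
    return b7 | 0x80 if ones % 2 == 0 else b7

def pack_pair(c1: str, c2: str = "\x00") -> tuple[int, int]:
    # For most basic Latin letters the 608 character set matches ASCII;
    # special characters need their own mapping, omitted in this sketch.
    return with_odd_parity(ord(c1) & 0x7F), with_odd_parity(ord(c2) & 0x7F)

# One character pair per video frame, e.g. the two letters "HI":
pair = pack_pair("H", "I")
print([hex(b) for b in pair])
```

Two bytes per frame also means the text rate is bounded and the data is frame-locked, which is exactly what makes it friendly to live packaging.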

While doing those filters, besides the whole work of implementing the text-to-CEA-608 conversion itself, I struggled with things like:
- the sparseness of input subtitles, leading to OOM on servers and stalled players;
- the "libavfilter doesn't take subtitle frames" and "it's all ASS internally" issues;
- the "caption timings vs. video frame timings vs. audio timings" problems (people talk a lot about syncing subs with video frames, but rarely against the actual dialogue audio);
- other (meta)data problems, like screen positioning or text encoding.
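To illustrate the sparseness point for anyone who hasn't hit it: subtitle packets can arrive minutes apart, so anything trying to interleave them with audio/video either buffers without bound or stalls. The "heartbeat" idea is simply to re-emit the current caption state at a fixed interval so the subtitle stream is never silent for long. A toy sketch in Python (my actual filter is nothing like this; `Cue` and `with_heartbeats` are illustrative names):

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Cue:
    pts: float   # presentation time in seconds
    text: str    # "" for an empty/cleared screen

def with_heartbeats(cues: Iterable[Cue], interval: float = 0.5) -> Iterator[Cue]:
    """Fill the gaps between sparse cues by re-emitting the current
    caption state every `interval` seconds."""
    last = Cue(0.0, "")
    for cue in cues:
        t = last.pts + interval
        while t < cue.pts:
            yield Cue(t, last.text)   # heartbeat: repeat current state
            t += interval
        yield cue
        last = cue

# Two cues, 2 seconds apart, become a stream with no gap over 0.5 s:
beats = list(with_heartbeats([Cue(0.0, "hello"), Cue(2.0, "world")]))
```

With something like this in front of the muxer, downstream consumers always have a recent subtitle timestamp to interleave against, which is why I find it the obvious fix for the stalling problem.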

These are all problems Soft Works seems to have faced as well.

But of all the problems with live-streaming subtitles in ffmpeg (and there are LOTS of them), the most annoying one is always this: almost every time someone has talked about implementing subtitles in filters (on mailing lists, in tickets, in other places like Stack Overflow, etc.), they assumed file inputs. When people specifically asked about live streams, their peers reasoned with a file mindset and dismissed live-streaming subtitles/captions as a "border case".

Let me be clear: these are not "border case" issues; they appear in the most common live-streaming transcoding use cases. They all show up *immediately* when you try to use subtitles/captions in live streams.

I got here (I mean, to this thread) while looking for ways to fix some issues in my setup. I was reconsidering VTT/TTML generation instead of CEA-608 (as rendering behaves significantly differently from device to device), and so I was about to generate subtitle-type output from a filter, was about to create my own standalone "heartbeat" filter to normalize the sparseness, and so on and so on: again, all things Soft Works seems to be handling as well. So I was quite happy to find someone working on this again; the last time I saw it on ffmpeg's mailing list/Patchwork (https://patchwork.ffmpeg.org/project/ffmpeg/patch/20161102220934.26010-...@pkh.me) the code seemed to die off, and I was too late to say anything about it. However, reading the other devs' reactions to Soft Works's work was worrying, as it felt as if history wanted to repeat itself (take a look at the discussions back then).

This situation has been going on for years now. This time I wanted to speak up while the conversation is still warm, in order to help Soft Works's code survive. So, dear devs: I love and respect your work, and your opinion is very important to me. I do not claim to know ffmpeg's code better than you do. I do not claim to know better what to do with libavfilter's API. Please understand: I'm not here to be right, but to offer my point of view. I'm not better than you; quite the contrary, most likely. But I also need to solve some very real problems, and I can't wait until everything else is in wonderful shape to do it. Nor can I pile on conditions just to fix the most immediate issues, as is the case with sparseness and heartbeat frames, which was a heated debate years ago and seems to still be one, while I find it the most obvious, common-sense, backwards-compatible solution.

"Clean" or "well designed" can't be more important than actually working use cases that don't break previously working ones: it's far easier to fix little blocks of "bad" code than to design something everybody's happy with (and the history of the project seems quite eloquent about that, especially when it comes to these particular use cases). Also, I have my own patches (which I would like to upstream some day), and I can tell the API does change quite regularly: I understand that should be a curated process, but adding a single property for live-streaming subtitles won't kill anyone either, and that shouldn't be the kind of issue that blocks big and important implementations like the ones Soft Works is working on. I just don't have the time to do all the work he/she is doing myself, and it could be another bunch of years until someone else does.
I can't tell whether Soft Works's code is good enough for you, or whether the ideas behind it are the best there are, but I can tell you the implementations are on the right track: as a live-streaming worker, I know the problems he/she mentions in the exchanges with you all, and I can tell you they're all blocking issues when dealing with live streaming. Soft Works is not "forcing it" into the API, and these are not "border cases" but normal, frequent live-streaming issues. So please, if you don't have the time Soft Works has, or the will to tackle the issues he/she is tackling, I beg you: at least don't kill the code this time, if it doesn't break working use cases.


Thanks,
Daniel.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
