On Sat, Dec 11, 2021 at 06:03:39PM +0000, Soft Works wrote: > > > > -----Original Message----- > > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Michael > > Niedermayer > > Sent: Saturday, December 11, 2021 6:21 PM > > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> > > Subject: Re: [FFmpeg-devel] [PATCH v20 02/20] avutil/frame: Prepare > > AVFrame\n > > for subtitle handling > > > > On Fri, Dec 10, 2021 at 03:02:32PM +0000, Soft Works wrote: > > > > > > > > > > -----Original Message----- > > > > From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Daniel > > > > Cantarín > > > > Sent: Thursday, December 9, 2021 10:33 PM > > > > To: ffmpeg-devel@ffmpeg.org > > > > Subject: Re: [FFmpeg-devel] [PATCH v20 02/20] avutil/frame: Prepare > > AVFrame\n > > > > for subtitle handling > > > > > > > > Hi there. > > > > This is my first message to this list, so please excuse me if I > > > > unintendedly break some rule. > > > > > > > > I've read the debate between Soft Works and others, and would like to > > > > add something to it. > > > > I don't have a deep knowledge of the libs as other people here show. My > > > > knowledge comes from working with live streams for some years now. And I > > > > do understand the issue about modifying a public API for some use case > > > > under debate: I believe it's a legit line of questioning to Soft Works > > > > patches. However, I also feel we live streaming people are often let > > > > aside as "border case" when it comes to ffmpeg/libav usage, and this > > > > bias is present in many subtitles/captions debates. > > > > > > > > I work with Digital TV signals as input, and several different target > > > > outputs more related to live streaming (mobiles, PCs, and so on). The > > > > target location is Latin America, and thus I need subtitles/captions for > > > > when we use english spoken audio (we speak mostly Spanish in LATAM). 
TV > > > > people send you TV subtitle formats: scte-27, dvb subs, and so on. And > > > > live streaming people uses other subtitles formats, mostly vtt and ttml. > > > > I've found that CEA-608 captions are the most compatible caption format, > > > > as it's understood natively by smart tvs and other devices, as well as > > > > non-natively by any other device using popular player-side libraries. > > > > So, I've made my own filter for generating CEA-608 captions for live > > > > streams, using ffmpeg with the previously available OCR filter. Tried > > > > VTT first, but it was problematic for live-streaming packaging, and with > > > > CEA-608 I could just ignore that part of the process. > > > > > > > > While doing those filters, besides the whole deal of implementing the > > > > conversion from text to CEA-608, I struggled with stuff like this: > > > > - the sparseness of input subtitles, leading to OOM in servers and > > > > stalled players. > > > > - the "libavfilter doesn't take subtitle frames" and "it's all ASS > > > > internally" issues. > > > > - the "captions timings vs video frame timings vs audio timings" > > > > problems (people talk a lot about syncing subs with video frames, but > > > > rarely against actual dialogue audio). > > > > - other (meta)data problems, like screen positioning or text encoding. > > > > > > > > This are all problems Soft Works seems to have faced as well. > > > > > > > > But of all the problems regarding live streaming subtitles with ffmpeg > > > > (and there are LOTS of it), the most annoying problem is always this: > > > > almost every time someone talked about implementing subtitles in filters > > > > (in mail lists, in tickets, in other places like stack overflow, > > > > etcetera), they always asumed input files. When the people specifically > > > > talked about live streams, their peers always reasoned with files > > > > mindset, and stated live streaming subtitles/captions as "border case". 
> > > > > > > > Let me be clear: this are not "border case" issues, but actually appear > > > > in the most common use cases of live streaming transcoding. They all > > > > appear *inmediatelly* when you try to use subtitles/captions in live > > > > streams. > > > > > > > > I got here (I mean this thread) while looking for ways to fixing some > > > > issues in my setup. I was reconsidering VTT/TTML generation instead of > > > > CEA-608 (as rendering behave significantly different from device to > > > > device), and thus I was about to generate subtitle type output from some > > > > filter, was about to create my own standalone "heartbeat" filter to > > > > normalize the sparseness, and so on and so on: again, all stuff Soft > > > > Works seems to be handling as well. So I was quite happy to find someone > > > > working on this again; last time I've seen it in ffmpeg's > > > > mailing/patchwork > > > > (https://patchwork.ffmpeg.org/project/ffmpeg/patch/20161102220934.26010- > > 1- > > > > u...@pkh.me) > > > > the code there seemed to die, and I was already late to say anything > > > > about it. However, reading the other devs reaction to Soft Works work > > > > was worrying, as it felt as history wanted to repeat itself (take a look > > > > at discussions back then). > > > > > > > > It has been years so far of this situation. This time I wanted to > > > > annotate this, as this conversation is still warm, in order to help Soft > > > > Works's code survive. So, dear devs: I love and respect your work, and > > > > your opinion is very important to me. I do not claim to know better than > > > > you do ffmpeg's code. I do not claim to know better what to do with > > > > libavfilter's API. Please understand: I'm not here to be right, but to > > > > note my point of view. I'm not better than you; quite on the contrary > > > > most likely. But I also need to solve some very real problems, and can't > > > > wait until everything else is in wonderful shape to do it. 
I can't also > > > > add lots of conditions in order to just fix the most immediate issues; > > > > like it's the case with sparseness and heartbeat frames, which was a > > > > heated debate years ago and seems to still be one, while I find it to be > > > > the most obvious common sense backwards-compatible solution > > > > implementation. Stuff like "clean" or "well designed" can't be more > > > > important than actually working use cases while not breaking previously > > > > implemented ones: because it's far easier to fix little blocks of "bad" > > > > code rather than design something everybody's happy with (and history of > > > > the project seems to be quite eloquent about that, specially when it > > > > comes to this particular use cases). Also, I have my own patches (which > > > > I would like to upstream some day), and I can tell the API do change > > > > quite normally: I understand that should be a curated process, but > > > > adding a single property for live-streaming subtitles isn't also > > > > anybody's death, and thus that shouldn't be the kind of issues that > > > > blocks big and important code implementations like the ones Soft Works > > > > is working on; I just don't have the time to do myself all that work > > > > he/she's doing, and it could be another bunch of years until someone > > > > else have it. I can't tell if Soft Works code is well enough for you, or > > > > if the ideas behind it are the best there are, but I can tell you the > > > > implementations are in the right track: as a live streaming worker, I > > > > know the problems he/she mentions in their exchanges with you all, and I > > > > can tell you they're all blocking issues when dealing with live > > > > streaming. Soft Work is not "forcing it" into the API, and this are not > > > > "border cases" but normal and frequent live streaming issues. 
So, > > > > please, if you don't have the time Soft Works have, or the will to > > > > tackle the issues he/she's tackling, I beg you at least don't kill the > > > > code this time if it does not breaks working use cases. > > > > > > > > > > > > Thanks, > > > > Daniel. > > > > > > Hi Daniel, > > > > > > thanks a lot for your kind words. I'm a "He-Man", and if I could turn > > > back time, I would have used my real name. Yet I started off as softworkz > > > and I can't change anymore without compromising the pseudonym. > > > > > > As you have realized, the ML can be a pool of sharks at time, > > > everybody following different motivations, sometimes personal, sometimes > > > commercial, you'll hardly ever know. From my side, I have benefitted > > > a lot from ffmpeg and it has always been a plan to contribute something > > > in return, with the subtitles subject finally being chosen. > > > The conclusion is that I have spent more time on ML interaction than > > > on the development itself, so it hasn't really been an economically > > > effective kind of work load. > > > Nonetheless, I have patiently applied all requested changes going over > > > many iterations so far. > > > From the remaining change requests, there's a major one that I'm rejecting > > > to change (duality of frame.pts and frame.subtitle_pts field), and I don't > > > know whether I haven't explained the requirement for the duality of those > > > sufficiently well, or whether it wasn't attempted to be understood and > > > just blindly objected as being a "gray" spot regarding the frame API. > > > The duality doesn't serve just edge cases, it is an important element > > > of the heartbeat mechanisms for dealing with sparse subtitles and also > > > important to retain muxing offsets (often subtitles are muxed a few > > > seconds ahead of time). > > > > > The other point that I'm rejecting to change are the time bases of the > > > involved fields. 
I have projected the existing subtitles functionality > > > to the new API in a direct and transparent way, to achieve a high > > > level of compatibility and stability for the transition. > > > Being able to use the result as an instant replacement in production > > > scenarios is a top-level requirement from my side and I cannot take > > > the risk of needing to fix regressions all over the place which > > > would be introduced by a change like making those fields adhering > > > to a non-constant time-base. > > > > This sounds a bit like you expect that the majority of cases to not > > change ? iam asking because > > most cases i tried do change with the part of the patchset which > > cleanly applies. In fact about half of the changes are the failure i already > > posted previously. I think you said its an issue elsewhere. Still that needs > > to be fixed before this patchset can be used as a > > "instant replacement in production scenarios" > > You had posted two cases that were failing. > > 1. > ./ffmpeg -i ~/tickets/1332/Starship_Troopers.vob -scodec xsub -qscale 2 > -an > > file1332.avi > > ==> Fixed since V18 > > > 2. > This breaks: > > ./ffmpeg -i ~/tickets/153/bbc_small.ts -filter_complex '[0:v][0:s]overlay' - > > qscale 2 -t 3 -y file.avi > > ==> It wasn't actually a regression. It was a bug in dvbdubdec that just got > covered up earlier by some sub2video hacks. > > I have submitted this fix for the error: > > https://patchwork.ffmpeg.org/project/ffmpeg/patch/dm8p223mb03655dee6ff0228743117178ba...@dm8p223mb0365.namp223.prod.outlook.com/
That doesn't fix it; the v23_plus set still fails:

./ffmpeg -ss 20 -i dvbsubtest.ts -filter_complex "[0:v][0:s]overlay[v]" -map '[v]' -map 0:a -acodec copy -vcodec mpeg4 -t 5 -bitexact /tmp/file.avi

Input #0, mpegts, from 'dvbsubtest.ts':
  Duration: 00:00:34.64, start: 79677.098467, bitrate: 4842 kb/s
  Program 1
    Stream #0:0[0x1901](eng): Video: mpeg2video (Main) ([2][0][0][0] / 0x0002), yuv420p(tv, top first), 720x576 [SAR 64:45 DAR 16:9], 25 fps, 25 tbr, 90k tbn
      Side data:
        cpb: bitrate max/min/avg: 15000000/0/0 buffer size: 1835008 vbv_delay: N/A
    Stream #0:1[0x19a1](eng): Audio: mp2 ([4][0][0][0] / 0x0004), 48000 Hz, stereo, s16p, 256 kb/s
    Stream #0:2[0x19b1](eng): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
Stream mapping:
  Stream #0:0 (mpeg2video) -> overlay
  Stream #0:2 (dvbsub) -> overlay
  overlay:default -> Stream #0:0 (mpeg4)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
subtitle input filter: decoding size 0x0
Auto-inserting graphicsub2video filter
[swscaler @ 0x5608969b4940] Value 0.000000 for parameter 'srcw' out of range [1 - 2.14748e+09]
[swscaler @ 0x5608969b4940] Value 0.000000 for parameter 'srch' out of range [1 - 2.14748e+09]
[swscaler @ 0x5608969b4940] Value 0.000000 for parameter 'dstw' out of range [1 - 2.14748e+09]
[swscaler @ 0x5608969b4940] Value 0.000000 for parameter 'dsth' out of range [1 - 2.14748e+09]
[graphicsub2video @ 0x560896bd7540] [IMGUTILS @ 0x7fff37fbd6d0] Picture size 0x0 is invalid
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!

Same failure with a different input:

./ffmpeg -i ~/tickets/4062/negative_pts_sub.ts -copyts -filter_complex '[0:0][0:3]overlay=shortest=1[outv0]' -map 0:1 -map '[outv0]' -bitexact /tmp/sadlybroken.ts

Input #0, mpegts, from 'tickets//4062/negative_pts_sub.ts':
  Duration: 00:00:04.89, start: -47.631967, bitrate: 6154 kb/s
  Program 1601
    Metadata:
      service_name    : Yle TV1 HD 7M
      service_provider: Yle
    Stream #0:0[0x137]: Video: h264 (Main) ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn
    Stream #0:1[0x3b6](fin): Audio: ac3 ([6][0][0][0] / 0x0006), 48000 Hz, stereo, fltp, 448 kb/s
    Stream #0:2[0x3b9](dut): Audio: ac3 ([6][0][0][0] / 0x0006), 48000 Hz, stereo, fltp, 192 kb/s (visual impaired) (descriptions)
    Stream #0:3[0x4cb](fin): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
    Stream #0:4[0x4e2](fin): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006) (hearing impaired)
    Stream #0:5[0x1450](fin): Subtitle: dvb_teletext ([6][0][0][0] / 0x0006), 492x250
Stream mapping:
  Stream #0:0 (h264) -> overlay (graph 0)
  Stream #0:3 (dvbsub) -> overlay (graph 0)
  Stream #0:1 -> #0:0 (ac3 (native) -> mp2 (native))
  overlay:default (graph 0) -> Stream #0:1 (mpeg2video)
Press [q] to stop, [?] for help
[h264 @ 0x55f22e15d840] reference picture missing during reorder
[h264 @ 0x55f22e15d840] Missing reference picture, default is 2147483647
[h264 @ 0x55f22df57d00] mmco: unref short failure
[h264 @ 0x55f22df7fb00] reference picture missing during reorder
    Last message repeated 1 times
[h264 @ 0x55f22df7fb00] Missing reference picture, default is 65774
    Last message repeated 1 times
[h264 @ 0x55f22e0cbb00] mmco: unref short failure
[h264 @ 0x55f22de22840] mmco: unref short failure
[h264 @ 0x55f22df89a00] reference picture missing during reorder
[h264 @ 0x55f22df89a00] Missing reference picture, default is 65782
[h264 @ 0x55f22e15d840] mmco: unref short failure
subtitle input filter: decoding size 0x0
Auto-inserting graphicsub2video filter
[swscaler @ 0x55f22fabad80] Value 0.000000 for parameter 'srcw' out of range [1 - 2.14748e+09]
[swscaler @ 0x55f22fabad80] Value 0.000000 for parameter 'srch' out of range [1 - 2.14748e+09]
[swscaler @ 0x55f22fabad80] Value 0.000000 for parameter 'dstw' out of range [1 - 2.14748e+09]
[swscaler @ 0x55f22fabad80] Value 0.000000 for parameter 'dsth' out of range [1 - 2.14748e+09]
[graphicsub2video @ 0x55f22fb10b40] [IMGUTILS @ 0x7fff903e52b0] Picture size 0x0 is invalid
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!

And another one:

./ffmpeg -i ~/tickets/4752/dump_dvbsubtitles.mp4 -ss 5 -t 1 -filter_complex '[0:v][0:s]overlay' -bitexact /tmp/withsubtitles.ts

Input #0, mpegts, from '/home/michael/tickets//4752/dump_dvbsubtitles.mp4':
  Duration: 00:01:05.45, start: 57364.369689, bitrate: 6849 kb/s
  Program 1163
    Stream #0:0[0xcb]: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first), 1920x1080 [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn
    Stream #0:1[0x193](eng): Audio: ac3 ([6][0][0][0] / 0x0006), 48000 Hz, 5.1(side), fltp, 448 kb/s
    Stream #0:2[0x25b](ara): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
    Stream #0:3[0x265](eng): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
Stream mapping:
  Stream #0:0 (h264) -> overlay (graph 0)
  Stream #0:2 (dvbsub) -> overlay (graph 0)
  overlay:default (graph 0) -> Stream #0:0 (mpeg2video)
  Stream #0:1 -> #0:1 (ac3 (native) -> mp2 (native))
Press [q] to stop, [?] for help
[h264 @ 0x5561c71b23c0] reference picture missing during reorder
[h264 @ 0x5561c71b23c0] Missing reference picture, default is 2147483647
[h264 @ 0x5561c7383340] mmco: unref short failure
    Last message repeated 1 times
[h264 @ 0x5561c7383340] number of reference frames (0+4) exceeds max (3; probably corrupt input), discarding one
[h264 @ 0x5561c7383340] reference picture missing during reorder
[h264 @ 0x5561c7383340] Missing reference picture, default is 66008
subtitle input filter: decoding size 0x0
Auto-inserting graphicsub2video filter
[swscaler @ 0x5561c87a0e80] Value 0.000000 for parameter 'srcw' out of range [1 - 2.14748e+09]
[swscaler @ 0x5561c87a0e80] Value 0.000000 for parameter 'srch' out of range [1 - 2.14748e+09]
[swscaler @ 0x5561c87a0e80] Value 0.000000 for parameter 'dstw' out of range [1 - 2.14748e+09]
[swscaler @ 0x5561c87a0e80] Value 0.000000 for parameter 'dsth' out of range [1 - 2.14748e+09]
[graphicsub2video @ 0x5561c7194340] [IMGUTILS @ 0x7ffce3e4ef80] Picture size 0x0 is invalid
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
Conversion failed!

Here's one that generates different output (I have not checked at all whether this is a bug, I am just seeing that it's different):

./ffmpeg -i ~/tickets/679/oversized_pgs_subtitles.mkv -filter_complex '[0:s:1]scale=848x480,[0:v]overlay=shortest=1' -bitexact /tmp/old-pgstest.avi

-rw-r----- 1 michael michael 657582 Dez 12 22:53 /tmp/new-pgstest.avi
-rw-r----- 1 michael michael 773460 Dez 12 22:54 /tmp/old-pgstest.avi

Similar differences occur with the following (again, I did not debug why there is a difference):

./ffmpeg -i \[a-s\]_full_metal_panic_fumoffu_-_01_-_the_man_from_the_south_-_a_hostage_with_no_compromises__rs2_\[1080p_bd-rip\]\[BBB48A25\].mkv -filter_complex '[0:s:1]scale=800:600' -t 15 -qscale 2 -bitexact /tmp/pgstest2.avi

./ffmpeg -i ~/tickets/2397/242_4.mkv -filter_complex '[0:v][0:s:1]overlay' -qscale 2 -bitexact /tmp/file2397.avi

./ffmpeg -f lavfi -i 'movie=/home/michael/videos/Closedcaption_rollup.ts[out0+subcc]' /tmp/rollup.srt

as well as others.

thx

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Frequently ignored answer#1 FFmpeg bugs should be sent to our bugtracker. User
questions about the command line tools should be sent to the ffmpeg-user ML.
And questions about how to use libav* should be sent to the libav-user ML.