> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of
> Devlist Archive
> Sent: Sunday, February 9, 2025 6:42 PM
> To: FFmpeg development discussions and patches <ffmpeg-
> de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] Captions SCC
> 
> >
> > Not to start an argument, but WebVTT is kind of a terrible format.
> > It's a lowest common denominator and loses most formatting
> > information available even in 608 (which is now more than 40 years
> > old).  Stuff like rollup captions for live programming, color (to
> > distinguish speakers) and caption positioning are pretty important
> > to the hearing impaired.
> 
> 
> From the reading I have done, the WebVTT does support some placement,
> italics, and appearance information, but not all players or ripping
> programs support those functions.  

Yes, that's right. It also supports colors. The unfortunate part is that colors
and styles need to be predefined as (CSS) classes. It's not possible to use
inline styles, which essentially forces two passes for precise colors, since
the style definitions have to be written before the cues that use them.
With only 8 colors in the case of 608, it's easy though.
During the subtitle filtering work I had actually started adding missing
features to the webvtt encoder, but the requirement for predefined styles
eventually put me off, since it makes accurate conversions hardly possible. One
idea was to create a predefined set of a few thousand styles and then always
pick the closest matching one, but I didn't pursue that.
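
For illustration, the "closest matching style" idea could look roughly like the
Python sketch below. This is not code from ffmpeg or its webvtt encoder: the
palette (eight basic colors, assumed here for the 608 case), the class names
and the helper names are all made up for the example.

```python
# Hypothetical predefined palette: class name -> RGB.
# Assumption: eight basic colors, as mentioned for the 608 case.
PALETTE = {
    "white": (255, 255, 255),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "cyan": (0, 255, 255),
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),
    "magenta": (255, 0, 255),
    "black": (0, 0, 0),
}


def closest_class(rgb):
    """Return the palette class with the smallest squared RGB distance."""
    return min(
        PALETTE,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, PALETTE[name])),
    )


def style_block():
    """Build a WebVTT STYLE block that predefines one class per palette entry."""
    lines = ["STYLE"]
    for name, (r, g, b) in PALETTE.items():
        lines.append(f"::cue(.{name}) {{ color: #{r:02x}{g:02x}{b:02x}; }}")
    return "\n".join(lines)


def wrap(text, rgb):
    """Wrap cue text in a class span using the nearest predefined color."""
    return f"<c.{closest_class(rgb)}>{text}</c>"
```

Because the STYLE block is fixed up front, a single pass is enough; the price
is that any source color gets rounded to the nearest palette entry. A larger
palette would reduce that error but not remove it.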


> On Sun, Feb 9, 2025 at 6:03 AM Devin Heitmueller <
> devin.heitmuel...@ltnglobal.com> wrote:
> 
> >
> > To my point: no, I don't think normalizing everything down to
> > WebVTT is a good idea.

Yes, WebVTT is not capable enough. ffmpeg internally uses the SSA/ASS format
(for all text subs), which is undoubtedly the most capable format that exists.
Any subtitle conversion in ffmpeg goes through this format, so when you convert
subtitle format A to B, it's always

A => ASS => B
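
As a toy illustration of that hub pattern (not ffmpeg code; HubEvent, the
registries and all function names are invented for this sketch, and HubEvent is
only a stand-in for a real ASS event), a converter needs just one decoder and
one encoder per format:

```python
from dataclasses import dataclass


@dataclass
class HubEvent:
    """Toy stand-in for the intermediate (ASS) representation."""
    start_ms: int
    end_ms: int
    text: str


def _parse_ts(ts):
    # Accepts HH:MM:SS,mmm (SRT style) or HH:MM:SS.mmm (WebVTT style).
    hms, ms = ts.replace(",", ".").split(".")
    h, m, s = (int(x) for x in hms.split(":"))
    return ((h * 60 + m) * 60 + s) * 1000 + int(ms)


def _format_ts(ms, sep):
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d}{sep}{ms:03d}"


def decode_cue(cue):
    """Decode a 'timing line + text' cue (index line omitted) into the hub form."""
    timing, text = cue.split("\n", 1)
    start, end = timing.split(" --> ")
    return HubEvent(_parse_ts(start), _parse_ts(end), text)


def encode_cue(ev, sep):
    """Encode a hub event as a cue, using the given millisecond separator."""
    return f"{_format_ts(ev.start_ms, sep)} --> {_format_ts(ev.end_ms, sep)}\n{ev.text}"


# One decoder and one encoder per format; no per-pair converters.
DECODERS = {"srt": decode_cue, "webvtt": decode_cue}
ENCODERS = {
    "srt": lambda ev: encode_cue(ev, ","),
    "webvtt": lambda ev: encode_cue(ev, "."),
}


def convert(cue, src, dst):
    """A => hub => B."""
    return ENCODERS[dst](DECODERS[src](cue))
```

Adding another format to such a scheme means adding one entry to each
registry, rather than a converter for every existing format.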


So, when it comes to the question about a normalization, ASS is the way to go 
and ffmpeg made a good choice to do so.
For those who haven't seen it yet:

https://github.com/softworkz/SubtitleFilteringDemos/tree/master/Demo1


In this demo, the input is DVB bitmap subtitles and the output is DVB bitmap 
subtitles as well.
But in-between, the OCR filter takes the bitmap and outputs ASS subs. The next 
filter manipulates the text and afterwards the ASS subs are rendered as bitmaps 
and encoded as DVB subs again.

When you see this, you might think that it's kind of like taking the source
bitmaps, and writing new text on them, but that's not the case. Right in the 
middle, there's just the ASS format - which allows you to replicate any text 
subtitle feature that other formats have.

> > Much of the goal, at least in the work that I do, is to conform to
> > the FCC requirements, which generally require that the original
> > 608/708 from the content provider be preserved.

All the above for getting to this answer: With ASS as storage/intermediate 
format it is possible to preserve the original content very precisely - without 
having to deal with a bitstream that cannot be safely applied to videos with 
different parameters than the original source.

It "just" requires an encoder for 608/708. Hopefully it's clearer now why I
had emphasized that earlier.


PS: Please note that this is not a proposal towards using ASS. The point is 
that ASS already _IS_ the intermediate format in ffmpeg and this won't and 
can't change (without re-implementing all text-subtitle encoders and decoders). 

sw



_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
