On Sat, Jul 29, 2017 at 09:27:47PM +0200, Clément Bœsch wrote: > This commit switches off forced correct nesting of tags and only keeps > it for font tags. See long explanations in the code for the rationale. > > This results in various FATE changes which I'll explain here: > > - various swapping in font attributes, this is mostly noise due to the > old reverse stack way of printing them. The new one is more correct as > the last attribute takes over the previous ones. > > - unrecognized tags disappears > > - invalid tags that were previously displayed aren't anymore (instead, > we have a warning). This is better for the end user > > The main benefit of this commit is to be more tolerant to error, leading > to a better handling of badly nested tags or random wrong formatting for > the end user. > --- > libavcodec/htmlsubtitles.c | 199 > ++++++++++++++++++++++++++------------- > tests/ref/fate/sub-sami2 | 4 +- > tests/ref/fate/sub-srt | 14 +-- > tests/ref/fate/sub-srt-badsyntax | 4 +- > tests/ref/fate/sub-textenc | 14 +-- > tests/ref/fate/sub-webvttenc | 14 +-- > 6 files changed, 157 insertions(+), 92 deletions(-) > > diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c > index e1fb7bc3cf..69d855df21 100644 > --- a/libavcodec/htmlsubtitles.c > +++ b/libavcodec/htmlsubtitles.c > @@ -1,5 +1,6 @@ > /* > * Copyright (c) 2010 Aurelien Jacobs <au...@gnuage.org> > + * Copyright (c) 2017 Clément Bœsch <u...@pkh.me> > * > * This file is part of FFmpeg. > * > @@ -18,6 +19,7 @@ > * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 > USA > */ > > +#include "libavutil/avassert.h" > #include "libavutil/avstring.h" > #include "libavutil/common.h" > #include "libavutil/parseutils.h" > @@ -31,19 +33,6 @@ static int html_color_parse(void *log_ctx, const char *str) > return rgba[0] | rgba[1] << 8 | rgba[2] << 16; > } > > -enum { > - PARAM_UNKNOWN = -1, > - PARAM_SIZE, > - PARAM_COLOR, > - PARAM_FACE, > - PARAM_NUMBER > -}; > - > -typedef struct SrtStack { > - char tag[128]; > - char param[PARAM_NUMBER][128]; > -} SrtStack; > - > static void rstrip_spaces_buf(AVBPrint *buf) > { > if (av_bprint_is_complete(buf)) > @@ -75,17 +64,48 @@ static void handle_open_brace(AVBPrint *dst, const char > **inp, int *an, int *clo > av_bprint_chars(dst, *in, 1); > } > > +struct font_tag { > + char face[128]; > + int size; > + uint32_t color; > +}; > + > +/* > + * The general politic of the convert is to mask unsupported tags or > formatting > + * errors (but still alert the user/subtitles writer with an error/warning) > + * without dropping any actual text content for the final user. > + */ > int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, const char *in) > { > - char *param, buffer[128], tmp[128]; > - int len, tag_close, sptr = 1, line_start = 1, an = 0, end = 0; > - SrtStack stack[16]; > + char *param, buffer[128]; > + int len, tag_close, sptr = 0, line_start = 1, an = 0, end = 0; > int closing_brace_missing = 0; > + int i, likely_a_tag; > > - stack[0].tag[0] = 0; > - strcpy(stack[0].param[PARAM_SIZE], "{\\fs}"); > - strcpy(stack[0].param[PARAM_COLOR], "{\\c}"); > - strcpy(stack[0].param[PARAM_FACE], "{\\fn}"); > + /* > + * state stack is only present for fonts since they are the only tags > where > + * the state is not binary. Here is a typical use case: > + * > + * <font color="red" size=10> > + * red 10 > + * <font size=50> RED AND BIG </font> > + * red 10 again > + * </font> > + * > + * On the other hand, using the state system for all the tags should be > + * avoided because it breaks wrongly nested tags such as: > + * > + * <b> foo <i> bar </b> bla </i> > + * > + * We don't want to break here; instead, we will treat all these tags as > + * binary state markers. Basically, "<b>" will activate bold, and "</b>" > + * will deactivate it, whatever the current state. > + * > + * This will also prevents cases where we have a random closing tag > + * remaining after the opening one was dropped. Yes, this happens and we > + * still don't want to print a "</b>" at the end of the dialog event. > + */ > + struct font_tag stack[16] = {0};
this seems to produce a compiler warning: ./libavcodec/htmlsubtitles.c: In function ‘ff_htmlmarkup_to_ass’: ./libavcodec/htmlsubtitles.c:112:12: warning: missing braces around initializer [-Wmissing-braces] gcc (Ubuntu 4.8.5-2ubuntu1~14.04.1) 4.8.5 [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB During times of universal deceit, telling the truth becomes a revolutionary act. -- George Orwell
signature.asc
Description: Digital signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel