On Sat, Aug 8, 2015 at 8:23 AM, Clément Bœsch <u...@pkh.me> wrote: > On Fri, Aug 07, 2015 at 11:03:29PM -0700, Yayoi wrote: >> --- >> libavcodec/samidec.c | 59 >> +++++++++++++++++++++++++++++-------------------- >> tests/ref/fate/sub-sami | 18 +++++++-------- >> 2 files changed, 44 insertions(+), 33 deletions(-) >> >> diff --git a/libavcodec/samidec.c b/libavcodec/samidec.c >> index 47850e2..41826a7 100644 >> --- a/libavcodec/samidec.c >> +++ b/libavcodec/samidec.c >> @@ -27,6 +27,7 @@ >> #include "ass.h" >> #include "libavutil/avstring.h" >> #include "libavutil/bprint.h" >> +#include "htmlsubtitles.h" >> >> typedef struct { >> AVBPrint source; >> @@ -41,11 +42,12 @@ static int sami_paragraph_to_ass(AVCodecContext *avctx, >> const char *src) >> char *tag = NULL; >> char *dupsrc = av_strdup(src); >> char *p = dupsrc; >> + char *pcopy = NULL; >> + int index = 0; >> + int second_paragraph = 0; >> >> - av_bprint_clear(&sami->content); >> for (;;) { >> char *saveptr = NULL; >> - int prev_chr_is_space = 0; >> AVBPrint *dst = &sami->content; >> >> /* parse & extract paragraph tag */ >> @@ -77,37 +79,46 @@ static int sami_paragraph_to_ass(AVCodecContext *avctx, >> const char *src) >> goto end; >> } >> > >> - /* extract the text, stripping most of the tags */ >> + /* check for the second paragrph */ > > Why change the comment? What does "check" mean here? What is the "second > paragraph"?
I answer it below with the other questions you have because you are basically asking the same things. And I will document better too. > >> + pcopy = av_strdup(p); >> while (*p) { >> if (*p == '<') { >> - if (!av_strncasecmp(p, "<P", 2) && (p[2] == '>' || >> av_isspace(p[2]))) >> + if (!av_strncasecmp(p, "<p", 2) && (p[2] == '>' || >> av_isspace(p[2]))) { >> + second_paragraph = 1; >> break; >> - if (!av_strncasecmp(p, "<BR", 3)) >> - av_bprintf(dst, "\\N"); >> - p++; >> - while (*p && *p != '>') >> - p++; >> - if (!*p) >> - break; >> - if (*p == '>') >> - p++; >> - continue; >> + } >> } >> - if (!av_isspace(*p)) >> - av_bprint_chars(dst, *p, 1); >> - else if (!prev_chr_is_space) >> - av_bprint_chars(dst, ' ', 1); >> - prev_chr_is_space = av_isspace(*p); >> p++; >> + index++; >> + } >> + p = p - index; >> + if (second_paragraph) { >> + p[index] = 0; >> } >> - } >> >> - av_bprint_clear(&sami->full); >> - if (sami->source.len) >> - av_bprintf(&sami->full, "{\\i1}%s{\\i0}\\N", sami->source.str); >> - av_bprintf(&sami->full, "%s", sami->content.str); >> + ff_htmlmarkup_to_ass(avctx, dst, p); >> + >> + /* add the source if there are any. */ >> + av_bprint_clear(&sami->full); >> + if (sami->source.len) { >> + av_bprintf(&sami->full, "{\\i1}%s{\\i0}\\N", sami->source.str); >> + av_bprintf(&sami->full, "%s", sami->content.str); >> + if (second_paragraph) { >> + second_paragraph = 0; >> + p = pcopy; >> + p += index; >> + index = 0; >> + continue; >> + } >> + } else { >> + av_bprintf(&sami->full, "%s", sami->content.str); >> + } >> + av_bprint_clear(&sami->content); >> + >> + } >> > > This looks clumsy at best: you are finalizing the subtitle event inside > the paragraph loop when it should be outside. It also seems there is a > duplicating "second paragraph" logic even though the loop is supposed to > handle one paragraph at a time. I know. It is clumsy.. I explain it below as well. > > If you are uncomfortable with the current logic or believe it's badly > designed for the logic you're trying to mangle, feel free to rewrite it; > it's better than trying to inhibit the old behaviour with hacks. It's not that your code was badly designed to begin with. I feel like it is the nature of this format. So let me explain what was going on your old code and what I had changed. Here is the example: (This one is from current fate sample.) <SYNC Start=73000> <P Class=ENUSCC ID=Source>End of: <P Class=ENUSCC>President John F. Kennedy Speech So this is one subtitle event and there are two paragraphs. One paragraph has a new source and the other paragraph has its content. Since it is decided that this code uses html parser(ff_htmlmarkup_to_ass) and use it to parse the contents of the paragraph. Each paragraph content needs to go to to html parser, regardless whether it is source or subtitle content. So I need to tell html_parser, ff_htmlmarkup_to_ass that where the destination is.. (whether it is source or content). I can't no longer specify the destination which I was able to do in your old code. I mean maybe I can change the function header and add additional flag to tell our parser where to put in the dst but I don't want to because it may require to change srtdec and I do not want to do that.. Anyways, so that's the basic idea. I honestly think it is good that subtitle event handle is also in loop because one loop iteration per one subtitle event and subtitle event does not guarantee that it contains only one paragraph to begin with. It was a possibility either adding additional loop or having goto to honor your original logic but after I noticed that subtitle event contains more than one paragraph, I gave up on the idea to keep your logic. The comment says "check second paragraph" but it should be able to handle multiple paragraphs more than 2 in one subtitle event. (I definitely tested while I was working on this part.. I can modify sample or attach test file if that makes you feel happier. And I make a better comment then.) So I want to leave it as it is if you do not see major flaws or you do not have very very strong opinion about how it needs to be modified. Well, given the information, if you have some ideas, you can tell me too. Besides, I can't cross out the possibility if you or me or someone wants to add more enhancement (maybe one day you decided to add support for img tag? who knows..) , and it may need to change code structure again. And as it is now, it is still enhancement that works. ;-). Well, please let me know if you still have concerns and questions etc. I will fix the rest of comments and misplaced things on you mentioned in other mails and I will send the updated patches either today or tomorrow. (or when it is done but hopefully definitely in a week.) Thank you! Yayoi > > [...] > > -- > Clément B. > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > http://ffmpeg.org/mailman/listinfo/ffmpeg-devel > _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel