2018-03-24 17:42 GMT+01:00, wm4 <nfx...@googlemail.com>: > On Sat, 24 Mar 2018 17:05:41 +0100 > Carl Eugen Hoyos <ceffm...@gmail.com> wrote: > >> 2018-03-24 15:48 GMT+01:00, wm4 <nfx...@googlemail.com>: >> > Subtitles which contained styled UTF-8 subtitles (i.e. not just 7 bit >> > ASCII characters) were not handled correctly. The spec mandates that >> > styling start/end ranges are in "characters". It's not quite clear what >> > a "character" is supposed to be, but maybe they mean unicode codepoints. >> > >> > FFmpeg's decoder treated the style ranges as byte idexes, which could >> > lead to UTF-8 sequences being broken, and the common code dropping the >> > whole subtitle line. >> > >> > Change this and count the codepoint instead. This also means that even >> > if this is somehow wrong, the decoder won't break UTF-8 sequences >> > anymore. The sample which led me to investigate this now appears to work >> > correctly. >> >> Could you confirm that this is also what QT does? > > I can't test with QT. VLC seems to behave like with this patch applied. > >> Or is it impossible that the patch breaks something? > > Could probably break movtext subtitles generated by ffmpeg (I didn't > fix the movtext encoder, and it seems to have the same bug). But these > will most likely be broken on other players too. Tough the worst case > is just that the styles get shifted.
Thank you. >> [...] >> >> > @@ -388,17 +405,24 @@ static int text_to_ass(AVBPrint *buf, const char >> > *text, const char *text_end, >> > } >> > } >> > >> > - switch (*text) { >> > - case '\r': >> > - break; >> > - case '\n': >> > - av_bprintf(buf, "\\N"); >> > - break; >> > - default: >> > - av_bprint_chars(buf, *text, 1); >> > - break; >> > + len = get_utf8_length_at(text, text_end); >> > + if (len < 1) { >> > + av_log(avctx, AV_LOG_ERROR, "invalid UTF-8 byte in >> > subtitle\n"); >> > + len = 1; >> > + } >> > + for (i = 0; i < len; i++) { >> >> > + switch (*text) { >> > + case '\r': >> > + break; >> > + case '\n': >> > + av_bprintf(buf, "\\N"); >> > + break; >> > + default: >> > + av_bprint_chars(buf, *text, 1); >> > + break; >> >> Imo, the reindentation is not ok but this isn't my code. > > Why not? Because the patch is much easier to read without it. Carl Eugen _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel