On Mon, Jul 03, 2017 at 11:44:26AM +0200, Paul B Mahol wrote: > On 7/2/17, Michael Niedermayer <mich...@niedermayer.cc> wrote: > > On Sun, Jul 02, 2017 at 01:14:00PM +0200, wm4 wrote: > >> On Sun, 2 Jul 2017 00:09:42 +0200 > >> Michael Niedermayer <mich...@niedermayer.cc> wrote: > >> > >> > This reduces the number of strstr() calls per byte > >> > This diasalows empty tags like '< >' as well as '<' in tags like > >> > '<ab<cd<<ef>' > >> > > >> > Fixes timeout > >> > Fixes: 1817/clusterfuzz-testcase-minimized-5104230530547712 > >> > > >> > Found-by: continuous fuzzing process > >> > https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg > >> > Signed-off-by: Michael Niedermayer <mich...@niedermayer.cc> > >> > --- > >> > libavcodec/htmlsubtitles.c | 4 ++-- > >> > 1 file changed, 2 insertions(+), 2 deletions(-) > >> > > >> > diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c > >> > index be5c9316ca..67abc94085 100644 > >> > --- a/libavcodec/htmlsubtitles.c > >> > +++ b/libavcodec/htmlsubtitles.c > >> > @@ -110,13 +110,13 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint > >> > *dst, const char *in) > >> > case '<': > >> > tag_close = in[1] == '/'; > >> > len = 0; > >> > - if (sscanf(in+tag_close+1, "%127[^>]>%n", buffer, &len) >= > >> > 1 && len > 0) { > >> > + if (sscanf(in+tag_close+1, "%127[^<>]>%n", buffer, &len) >= > >> > 1 && len > 0) { > >> > const char *tagname = buffer; > >> > while (*tagname == ' ') > >> > tagname++; > >> > if ((param = strchr(tagname, ' '))) > >> > *param++ = 0; > >> > - if ((!tag_close && sptr < FF_ARRAY_ELEMS(stack)) || > >> > + if ((!tag_close && sptr < FF_ARRAY_ELEMS(stack) && > >> > *tagname) || > >> > ( tag_close && sptr > 0 && > >> > !strcmp(stack[sptr-1].tag, tagname))) { > >> > int i, j, unknown = 0; > >> > in += len + tag_close; > >> > >> Invalid syntax is not unusual in SRT files. Are you sure this doesn't > >> make the output worse in files that do not use the syntax correctly? > > > > I do not know if this makes the output worse for files with invalid syntax > > I also do not know if this makes the output better for files with invalid > > syntax > > > > i dont have a large library of (real world invalid syntax) srt files > > whith which to find cases where it makes a difference. > > > > You seem to know alot about srt, maybe you can pick some srt files > > and add fate tests, so we have better coverage of odd and nasty cases > > > > about this patch specifically, what do you suggest ? > > should i try to fix this while maintaining existing behavior for > > invalid syntax exactly? (i dont know how that code would look) > > WHat's so wrong with code that you want it changed in bad way?
IIRC If you have tags like <<<<<<<<<<<<<<<<<<<<<<a> you end up checking <<<<<<<<<<<<<<<<<<<<<<a> <<<<<<<<<<<<<<<<<<<<<a> <<<<<<<<<<<<<<<<<<<<a> ... then form the closing tag </<<<<<<<<<<<<<<<<<<<<<a> </<<<<<<<<<<<<<<<<<<<<a> </<<<<<<<<<<<<<<<<<<<a> ... and then search the whole input for each closing tag On top of this there are special cases and limitations like fixed size buffers and a fixed size stack, so it will not always do all the steps above. But in practice this is slow. The same issue should happen on real files if they contain syntax issues and conatin a lot of unescaped '<' [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Never trust a computer, one day, it may think you are the virus. -- Compn
signature.asc
Description: Digital signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel