htmlsubtitles: Protect very slow redundant sscanf() calls by optimized use of strchr()

Michael Niedermayer Fri, 09 Jun 2017 10:41:06 -0700

On Fri, Jun 09, 2017 at 04:32:41PM +0200, wm4 wrote:
> On Thu,  8 Jun 2017 23:53:55 +0200
> Michael Niedermayer <[email protected]> wrote:
> 
> > Fixes Timeout
> > Fixes: 2127/clusterfuzz-testcase-minimized-6595787859427328
> > 
> > Signed-off-by: Michael Niedermayer <[email protected]>
> > ---
> >  libavcodec/htmlsubtitles.c | 12 ++++++++++--
> >  1 file changed, 10 insertions(+), 2 deletions(-)
> > 
> > diff --git a/libavcodec/htmlsubtitles.c b/libavcodec/htmlsubtitles.c
> > index 16295daa0c..ba4f269b3f 100644
> > --- a/libavcodec/htmlsubtitles.c
> > +++ b/libavcodec/htmlsubtitles.c
> > @@ -56,6 +56,7 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, 
> > const char *in)
> >      char *param, buffer[128], tmp[128];
> >      int len, tag_close, sptr = 1, line_start = 1, an = 0, end = 0;
> >      SrtStack stack[16];
> > +    const char *next_closep = NULL;
> >  
> >      stack[0].tag[0] = 0;
> >      strcpy(stack[0].param[PARAM_SIZE],  "{\\fs}");
> > @@ -83,8 +84,15 @@ int ff_htmlmarkup_to_ass(void *log_ctx, AVBPrint *dst, 
> > const char *in)
> >                          and all microdvd like styles such as {Y:xxx} */
> >              len = 0;
> >              an += sscanf(in, "{\\an%*1u}%n", &len) >= 0 && len > 0;
> > -            if ((an != 1 && (len = 0, sscanf(in, "{\\%*[^}]}%n", &len) >= 
> > 0 && len > 0)) ||
> > -                (len = 0, sscanf(in, "{%*1[CcFfoPSsYy]:%*[^}]}%n", &len) 
> > >= 0 && len > 0)) {
> > +
> > +            if(!next_closep || next_closep <= in) {
> > +                next_closep = strchr(in+1, '}');
> > +                if (!next_closep)
> > +                    next_closep = in + strlen(in);
> > +            }
> > +
> > +            if (*next_closep == '}' && (an != 1 && (len = 0, sscanf(in, 
> > "{\\%*[^}]}%n", &len) >= 0 && len > 0)) ||
> > +                *next_closep == '}' && (len = 0, sscanf(in, 
> > "{%*1[CcFfoPSsYy]:%*[^}]}%n", &len) >= 0 && len > 0)) {
> >                  in += len - 1;
> >              } else
> >                  av_bprint_chars(dst, *in, 1);
> 
> I'm not convinced that bad performance with an obscure fuzzed sample is
> worth the complexity increase here.


If we want to make ff_htmlmarkup_to_ass() not go into near infinite
loops with crafted data then this or a equivalent change is likely
needed.


> 
> I'd rather ask, why the heck is it using sscanf in the first place?

I suspect the use of scanf is fewer lines of code than the alternative
but iam not the author of this, i dont know for sure why it was written
as it was.


> The existing code is certainly unreadable already. (Could you tell what
> it does without staring at the scanf manpage for a while? And then
> guarantee correctness/performance/security?)

I know the scanf syntax well enough to not need the manpage much,
but i was toying around with the idea of replacing the scanf already.

I can do that if people want but it will be more code than the scanf
based one. Also the next_closep based code is needed as it is even
without scanf().
What that code does is prevent searching the closing '}' repeatly
for each byte which is very slow if its done for nearly every single
byte in a long string.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The real ebay dictionary, page 2
"100% positive feedback" - "All either got their money back or didnt complain"
"Best seller ever, very honest" - "Seller refunded buyer after failed scam"

signature.asc
Description: Digital signature

_______________________________________________
ffmpeg-devel mailing list
[email protected]
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

Re: [FFmpeg-devel] [PATCH 1/2] avcodec/htmlsubtitles: Protect very slow redundant sscanf() calls by optimized use of strchr()

Reply via email to