On Wed, Jan 23, 2013 at 3:28 PM, Konstantin Serebryany
<konstantin.s.serebry...@gmail.com> wrote:
>
> On Wed, Jan 23, 2013 at 3:13 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> > On Wed, Jan 23, 2013 at 02:31:57PM +0400, Konstantin Serebryany wrote:
> >> The attached patch is the libsanitizer merge from upstream r173241.
> >>
> >> Lots of changes. Among other things:
> >>   - slow CFI-based unwinder is on by default for fatal errors
> >> (fast_unwind_on_fatal=0, Linux-only)
> >>   - more interceptors in asan/tsan
> >>   - new asan allocator on Linux (faster and uses less memory than the old 
> >> one)
> >>   - Dropped dependency on CoreFoundation (Mac OS)
> >>
> >> Patch for libsanitizer is automatically generated by libsanitizer/merge.sh
> >> Tested with
> >> rm -rf */{*/,}libsanitizer \
> >>   && make -j 50 \
> >>   && make -C gcc check-g{cc,++}
> >> RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} asan.exp'
> >>
> >> Our internal LLVM bots (Linux, Mac and Android) are green.
> >>
> >> Ok to commit?
> >
> > Please mention
> >         PR sanitizer/55989
> > in the ChangeLog entry, as this should fix that issue.
>
> done.
>
> >
> > If you want to commit now, I'd prefer if you could for now change
> > # define SANITIZER_INTERCEPT_SCANF SI_NOT_WINDOWS
> > to
> > # define SANITIZER_INTERCEPT_SCANF 0
> > (see comments below).  Ok with those changes.
>
> Done, added Evgeniy Stepanov.
>
>
> >
> > No changes for the ppc64 address space issue?
>
> Not yet.
>
> >  Do you think that can be done
> > say before end of month for yet another merge?
>
> I hope so.
>
> > Also, I'd appreciate the
> > (optional) libtsan shadow mapping change to allow tracing non-PIEs (that
> > would really be a big usability plus for libtsan).
>
> I wouldn't promise that. I'd really want to get rid of the COMPAT
> mapping first, otherwise the code will get too complicated.
> Dmitry?
>
> >
> > I think we should stop doing the full libsanitizer merges at the beginning
> > of February, we can backport strict severe bugfixes then, but otherwise it
> > should be frozen and ABI stable.
>
> Sounds good.
>
>
> Committing this merge now.
>
> --kcc
>
>
>
> >
> > +static void scanf_common(void *ctx, const char *format, va_list ap_const) {
> > +  va_list aq;
> > +  va_copy(aq, ap_const);
> > +
> > +  const char *p = format;
> > +  unsigned size;
> > +
> > +  while (*p) {
> > +    if (*p != '%') {
> > +      ++p;
> > +      continue;
> > +    }
> > +    ++p;
> > +    if (*p == '*' || *p == '%' || *p == 0) {
> > +      ++p;
> > +      continue;
> > +    }
> > +    if (*p == '0' || (*p >= '1' && *p <= '9')) {
> > +      size = internal_atoll(p);
> > +      // +1 for the \0 at the end
> > +      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size + 1);
> > +      ++p;
> > +      continue;
> > +    }
> > +
> > +    if (*p == 'L' || *p == 'q') {
> > +      ++p;
> > +      size = match_spec(scanf_llspecs, scanf_llspecs_cnt, *p);
> > +      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
> > +      continue;
> > +    }
> > +
> > +    if (*p == 'l') {
> > +      ++p;
> > +      if (*p == 'l') {
> > +        ++p;
> > +        size = match_spec(scanf_llspecs, scanf_llspecs_cnt, *p);
> > +        COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
> > +        continue;
> > +      } else {
> > +        size = match_spec(scanf_lspecs, scanf_lspecs_cnt, *p);
> > +        COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
> > +        continue;
> > +      }
> > +    }
> > +
> > +    if (*p == 'h' && *(p + 1) == 'h') {
> > +      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), 
> > sizeof(char));
> > +      p += 2;
> > +      continue;
> > +    }
> > +
> > +    size = match_spec(scanf_specs, scanf_specs_cnt, *p);
> > +    if (size) {
> > +      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
> > +      ++p;
> > +      continue;
> > +    }
> > +  }
> > +  va_end(aq);
> > +}
> >
> > I think this isn't safe.  It should always parse completely the whole %
> > spec, and whenever it discovers one that it doesn't fully understand, it
> > should just give up completely, because it has no idea how much it needs to
> > read from the va_arg.
I'm not sure we can ever be sure this code is safe. It has to deal
with different libc implementations (ex. bionic), and, as you
mentioned, has no way to detect future changes like scanf hooks that
will break it completely. As the whole *sanitizer stuff, it is best
effort. Also, sanitizers have different requirements for this
interceptor: it should be conservative in asan, aggressive in msan.

>  E.g. it doesn't handle %5$d syntax (generally not a
> > problem), but it has to give up if it sees it.

Agreed, that's an issue.

>  So, e.g. whenever match_spec
> > returns 0, it should break out of the loop, rather than continue.
> > And for %hh it doesn't check following letters, no match_spec at all.

That's cause they don't change the write size. The same for %h, the
following letters don't matter to us. It seems safe to "continue" as
soon as we are sure we know what the current spec's write length is.

> > %[, %s, %c, %n, %ls, %lc, %p, %A and many others aren't handled FYI (again, 
> > not
> > a problem if the implementation is conservative).  Handling %c, %s and
> > %[0123456789] style would be worthwhile though, I bet most of the buffer
> > overflows are from the use of those scanf specifiers.

At least some of those are handled (%p, %n). Not sure what %A is, it's
not in my man page.

> > BTW, %as, %aS and %a[ should always result in giving up, as there is an
> > ambiguity between GNU extensions and C99/POSIX and libasan can't know how
> > glibc resolves that ambiguity.
> >
> > What if glibc adds a scanf hook (like it has already printf hooks), apps
> > could then register their own stuff and the above would then break.  It
> > really should be very conservative, and should be checked e.g. with all
> > glibc's *scanf tests (e.g. stdio-common/scanf[0-9]*.c,
> > stdio-common/tst-sscanf.c).

I'll add support for the missing %specs. About the testing, these
files seem like GPL, so I'd prefer not to look at them at all. We've
got a smallish test for the scanf implementation in sanitizer_common,
but it would be really great to run it on full glibc scanf tests.
Would you be willing to setup such testing gnu-side?

Reply via email to