On Wed, Jan 23, 2013 at 02:31:57PM +0400, Konstantin Serebryany wrote:
> The attached patch is the libsanitizer merge from upstream r173241.
> 
> Lots of changes. Among other things:
>   - slow CFI-based unwinder is on by default for fatal errors
> (fast_unwind_on_fatal=0, Linux-only)
>   - more interceptors in asan/tsan
>   - new asan allocator on Linux (faster and uses less memory than the old one)
>   - Dropped dependency on CoreFoundation (Mac OS)
> 
> Patch for libsanitizer is automatically generated by libsanitizer/merge.sh
> Tested with
> rm -rf */{*/,}libsanitizer \
>   && make -j 50 \
>   && make -C gcc check-g{cc,++}
> RUNTESTFLAGS='--target_board=unix\{-m32,-m64\} asan.exp'
> 
> Our internal LLVM bots (Linux, Mac and Android) are green.
> 
> Ok to commit?

Please mention
        PR sanitizer/55989
in the ChangeLog entry, as this should fix that issue.

If you want to commit now, I'd prefer if you could for now change
# define SANITIZER_INTERCEPT_SCANF SI_NOT_WINDOWS
to
# define SANITIZER_INTERCEPT_SCANF 0
(see comments below).  Ok with those changes.

No changes for the ppc64 address space issue?  Do you think that can be done
say before end of month for yet another merge?  Also, I'd appreciate the
(optional) libtsan shadow mapping change to allow tracing non-PIEs (that
would really be a big usability plus for libtsan).

I think we should stop doing the full libsanitizer merges at the beginning
of February, we can backport strict severe bugfixes then, but otherwise it
should be frozen and ABI stable.

+static void scanf_common(void *ctx, const char *format, va_list ap_const) {
+  va_list aq;
+  va_copy(aq, ap_const);
+
+  const char *p = format;
+  unsigned size;
+
+  while (*p) {
+    if (*p != '%') {
+      ++p;
+      continue;
+    }
+    ++p;
+    if (*p == '*' || *p == '%' || *p == 0) {
+      ++p;
+      continue;
+    }
+    if (*p == '0' || (*p >= '1' && *p <= '9')) {
+      size = internal_atoll(p);
+      // +1 for the \0 at the end
+      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size + 1);
+      ++p;
+      continue;
+    }
+
+    if (*p == 'L' || *p == 'q') {
+      ++p;
+      size = match_spec(scanf_llspecs, scanf_llspecs_cnt, *p);
+      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
+      continue;
+    }
+
+    if (*p == 'l') {
+      ++p;
+      if (*p == 'l') {
+        ++p;
+        size = match_spec(scanf_llspecs, scanf_llspecs_cnt, *p);
+        COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
+        continue;
+      } else {
+        size = match_spec(scanf_lspecs, scanf_lspecs_cnt, *p);
+        COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
+        continue;
+      }
+    }
+
+    if (*p == 'h' && *(p + 1) == 'h') {
+      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), sizeof(char));
+      p += 2;
+      continue;
+    }
+
+    size = match_spec(scanf_specs, scanf_specs_cnt, *p);
+    if (size) {
+      COMMON_INTERCEPTOR_WRITE_RANGE(ctx, va_arg(aq, void *), size);
+      ++p;
+      continue;
+    }
+  }
+  va_end(aq);
+}

I think this isn't safe.  It should always parse completely the whole %
spec, and whenever it discovers one that it doesn't fully understand, it
should just give up completely, because it has no idea how much it needs to
read from the va_arg.  E.g. it doesn't handle %5$d syntax (generally not a
problem), but it has to give up if it sees it.  So, e.g. whenever match_spec
returns 0, it should break out of the loop, rather than continue.
And for %hh it doesn't check following letters, no match_spec at all.
%[, %s, %c, %n, %ls, %lc, %p, %A and many others aren't handled FYI (again, not
a problem if the implementation is conservative).  Handling %c, %s and
%[0123456789] style would be worthwhile though, I bet most of the buffer
overflows are from the use of those scanf specifiers.

BTW, %as, %aS and %a[ should always result in giving up, as there is an
ambiguity between GNU extensions and C99/POSIX and libasan can't know how
glibc resolves that ambiguity.

What if glibc adds a scanf hook (like it has already printf hooks), apps
could then register their own stuff and the above would then break.  It
really should be very conservative, and should be checked e.g. with all
glibc's *scanf tests (e.g. stdio-common/scanf[0-9]*.c,
stdio-common/tst-sscanf.c).

        Jakub

Reply via email to