On Mon, Feb 11, 2013 at 3:38 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Wed, Jan 23, 2013 at 04:24:01PM +0400, Evgeniy Stepanov wrote:
>> > > What if glibc adds a scanf hook (like it has already printf hooks), apps
>> > > could then register their own stuff and the above would then break.  It
>> > > really should be very conservative, and should be checked e.g. with all
>> > > glibc's *scanf tests (e.g. stdio-common/scanf[0-9]*.c,
>> > > stdio-common/tst-sscanf.c).
>>
>> I'll add support for the missing %specs. About the testing, these
>> files seem like GPL, so I'd prefer not to look at them at all. We've
>> got a smallish test for the scanf implementation in sanitizer_common,
>> but it would be really great to run it on full glibc scanf tests.
>> Would you be willing to setup such testing gnu-side?
>
> Seems the code in llvm repo looks much better now to me than it used to,
> still it breaks on:
>
> #define _GNU_SOURCE 1
> #include <stdio.h>
> #include <string.h>
>
> int
> main ()
> {
>   char *p;
>   long double q;
>   if (sscanf ("abcdefghijklmno 1.0", "%ms %Lf", &p, &q) != 2
>       || strcmp (p, "abcdefghijklmno") != 0
>       || q != 1.0L)
>     return 1;
>   return 0;
> }
>
> because it mishandles %ms.
>
> Attached patch fixes that and also handles %a when possible.
> There are cases where it is unambiguous, e.g.
> s%Las is s %La s, reading long double, or %ar is %a r, reading
> float, other cases where we can check at least something and keep checking
> the rest of the format string, e.g. for:
> %as
> we know it will either store float, or char * pointer, so just assume
> conservatively the smaller size, and other cases where we have to punt,
> %a[a%d]
> (where % appears in the [...] list).
>
> I've tested various glibc *scanf testcases with the GCC libsanitizer
> + enabling of # define SANITIZER_INTERCEPT_SCANF SI_NOT_WINDOWS
> + llvm svn diff between last merge point to GCC and current llvm trunk
> + this patch, and it looked good.

Sounds great. I'll commit the patch to compiler-rt if you are ok with that.

> There is one further issue, I'd say you need to pass down to scanf_common
> the result from the intercepted call, and use it to see if it is valid
> to call strlen on the ptr, or if just 1 should be assumed for SSS_STRLEN.
> Note 'n' doesn't count towards that.  Because, it is unsafe to call strlen
> on arguments that haven't been assigned yet.  Testcase that still fails:

Definitely. Also, I'm undecided on this, but we probably don't want to
report access ranges for the arguments that were not written to. On
the other hand, those are potential writes that did not happen due to,
in most cases, user input - we might want to report them anyway.

>
> #define _GNU_SOURCE 1
> #include <stdio.h>
> #include <string.h>
>
> int
> main ()
> {
>   int a, b, c, d;
>   char e[3], f[10];
>   memset (e, ' ', sizeof e);
>   memset (f, ' ', sizeof f);
>   if (sscanf ("1 2 a", "%d%n%n%d %s %s", &a, &b, &c, &d, e, f) != 3
>       || a != 1
>       || b != 1
>       || c != 1
>       || d != 2
>       || strcmp (e, "a") != 0)
>     return 1;
>   return 0;
> }
>
> Oh, one more thing, on Linux for recent enough glibc's it would be
> desirable to also intercept:
> __isoc99_sscanf, __isoc99_scanf, __isoc99_vsscanf, __isoc99_fscanf,
> __isoc99_vfscanf, __isoc99_vscanf
> functions (guard the interception with
> #if defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 7)
> ).  All these work exactly like the corresponding non-__isoc99_
> functions, just %a followed by s, S or [ always writes float, thus
> there is no ambiguity.
>
>         Jakub

Reply via email to