On 09.12.2017 19:15, Robert Elz wrote:
>     Date:        Sat, 9 Dec 2017 15:46:42 +0100
>     From:        Kamil Rytarowski <[email protected]>
>     Message-ID:  <[email protected]>
>
>   | However there exist programs in the basesystem that shadow libc
>   | symbol routines as well,
>
> There is nothing wrong with that, in fact it is almost unavoidable,
> as programs need names to use, and libraries need names for functions
> they add later, and it is inevitable that they will clash from time to
> time.
>
>   | for example ps(1):
>   |
>   | bin/ps/extern.h:void uname(struct pinfo *, VARENT *, enum mode);
>
> I suspect that the BSD ps command has had a uname() function since long
> before the Sys V (or Sys III or wherever it originated) was added to the
> BSD libc - this is a perfect example.
>
> To handle this kind of issue, the libc functions only get to be defined
> when the relevant header file is included, in this case <sys/utsname.h>
> which ps does not do, hence, it is perfectly entitled to have a function
> called uname if it wants, or a "struct utsname" if it really wanted to
> be perverse.
>
>   | I'm going to rename the symbol routine names when I will hit them.
>
> There is nothing inherently wrong with that - they are just names after
> all, but it is the wrong solution, and one that would have no end.
>
> There could easily be a "usrname()" function added to libc next week,
> and the sanitizers could learn about it the week after, and then you're
> back with the exact same problem.
>
> The right way is for the sanitizers to learn which headers define the
> symbols that they want to take over, and only do that when the appropriate
> header is included (one way to do that would be to define shadow headers,
> so LLVM could define a sys/utsname.h and arrange for that one to be found
> ahead of /usr/include/sys/utsname.h when compiling.   Then that header does
> the magic needed to get the LLVM version of uname() - otherwise it simply
> does nothing with a function called uname() if the program happens to have
> one.
>
> And the same for all the other symbols that it feels the need to take over
> from libc (or other libraries.)
>
> Whether that's done with actual new header files, or simply by recognising
> the system headers being included and then adding the appropriate magic
> only in those cases when it observes the system header being included is
> just an implementation detail.
>
> kre
>

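The problem is not at the header (preprocessor) level, but at the linker
level.
We are linking prebuilt .a / .so files with a target application, so the
runtime's global uname symbol is present whether or not the program
includes <sys/utsname.h>.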
$ nm /usr/local/lib/clang/6.0.0/lib/netbsd/libclang_rt.msan-x86_64.a | grep uname
0000000000000000 B _ZN14__interception10real_unameE
0000000000000000 T __interceptor_uname
0000000000000000 T uname
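
To make the clash concrete, here is a minimal sketch of a program in the same position as ps(1). The function body and its parameters are hypothetical (they are not the ps(1) code); the only point is that the translation unit never includes <sys/utsname.h>, yet still emits a global symbol named uname.

```c
#include <assert.h>
#include <stdio.h>

/* Deliberately NOT including <sys/utsname.h>: as far as the C headers
 * are concerned, the name uname is free for this program to use. */

/* Hypothetical program-local helper that happens to be called uname().
 * It formats a login name into out; it has nothing to do with uname(3). */
void uname(const char *login, char *out, size_t outlen)
{
    assert(login != NULL && out != NULL && outlen > 0);
    /* snprintf() NUL-terminates the result whenever outlen > 0. */
    snprintf(out, outlen, "user:%s", login);
}
```

On its own this is fine. Once a prebuilt runtime that also exports a global uname (the "T uname" entry above) is linked in, both definitions compete for the same symbol name. How that shows up (a duplicate-definition error, or one definition silently taking precedence) depends on whether the runtime is linked statically or dynamically, but in neither case does the set of included headers play a role.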
We are intercepting uname(3) because behind the scenes it's a syscall
and we need to hardcode sanitizing rules (length of a field that is
being initialized).
INTERCEPTOR(int, uname, struct utsname *utsname) {
  ENSURE_MSAN_INITED();
  int res = REAL(uname)(utsname);
  if (!res)
    __msan_unpoison(utsname, __sanitizer::struct_utsname_sz);
  return res;
}
In the MSan case we mark the memory that the utsname pointer refers to
(the whole struct utsname) as initialized.
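
The interceptor's logic can be sketched in plain C without the sanitizer macros. Everything prefixed with toy_ is a hypothetical stand-in invented for this illustration: the real runtime keeps shadow memory for the whole address space and resolves the real function through REAL(uname), whereas this sketch tracks a single struct utsname and simply calls the libc uname().

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Toy shadow map (hypothetical): one shadow byte per byte of a single
 * struct utsname. A nonzero shadow byte means "uninitialized". */
static unsigned char toy_shadow[sizeof(struct utsname)];

/* Mark the whole tracked object as uninitialized. */
static void toy_poison(void)
{
    memset(toy_shadow, 0xff, sizeof toy_shadow);
}

/* Mark the first len bytes of the tracked object as initialized. */
static void toy_unpoison(size_t len)
{
    assert(len <= sizeof toy_shadow);
    memset(toy_shadow, 0, len);
}

/* Return 1 if every tracked byte is marked initialized, 0 otherwise. */
static int toy_is_initialized(void)
{
    for (size_t i = 0; i < sizeof toy_shadow; i++)
        if (toy_shadow[i] != 0)
            return 0;
    return 1;
}

/* Same shape as the interceptor: call the real function, and only on
 * success record that the kernel filled in the buffer. The length is
 * hardcoded knowledge about uname(3), which is why the sanitizer has
 * to know about this function at all. */
static int toy_intercepted_uname(struct utsname *buf)
{
    int res = uname(buf);
    if (res == 0)
        toy_unpoison(sizeof *buf);
    return res;
}
```

A caller would poison the buffer with toy_poison(), call toy_intercepted_uname(), and then find toy_is_initialized() returning 1 on success. Without the wrapper, the tool has no way to learn that the syscall wrote to the buffer and would report its fields as uninitialized.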
The impact on base system utilities is rather low so far (in sh(1) there
are 0 symbol clashes, in ksh(1) there is 1 clash), so renaming the
clashing symbols appears to be the least intrusive workaround.
I agree that this is not perfect, but I'm not aware of a better
solution that does not require a redesign and rewrite of the sanitizers.
