https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68016
--- Comment #8 from Maxim Ostapenko <chefmax at gcc dot gnu.org> ---
(In reply to Reid Kleckner from comment #7)
> (In reply to Jakub Jelinek from comment #6)
> > Because symbol size is part of the ABI, and LLVM emits different symbol size
> > between -fsanitize=address and -fno-sanitize=address.
> > E.g. COPY relocations use the st_size field, so you can have:
> > 1) shared library originally not ASAN instrumented, binary (e.g. non-ASAN)
> > linked against it, then the shared library recompiled with ASAN - the size
> > of the symbol in the binary will be the one without padding, but LLVM
> > incorrectly registers the variable using global symbol rather than local
> > alias and thus assumes there is padding which is not available (plus you can
> > get a runtime warning on the st_size mismatch from the dynamic linker)
>
> I thought COPY relocations only occurred with fPIE, but I must have been
> mistaken.
>
> > 2) even without COPY relocations, you could have the same variable defined
> > in multiple shared libraries, if some of them are -fsanitize=address and the
> > others are not, there is mismatch between the variable sizes, and depending
> > on which library comes earlier in the symbol search scope, you could have
> > either the version without or with padding used at runtime, but the
> > sanitized libraries could very well register the non-padded one, making it
> > fatal error to access e.g. variables after it
>
> LLVM ASan tries to instrument only global definitions with external linkage.
> The goal of this check is to ensure that we have found the one true
> definition of the global, and it isn't COMDAT, weak, a C string, or going to
> get replaced with something else at link time through some other means.
>
> It seems like you are describing interposition, which isn't something LLVM
> supports very well. LLVM has no equivalent of -fsemantic-interposition, for
> example. We always operate under something like -fno-semantic-interposition.
> (I know, it's ironic, because ASan interposes libc.)
>
> Anyway, I agree the COPY relocation issue is a real problem, but other than
> that I think our approach is at least internally consistent.

Jakub is right. Here is an example where I believe COPY relocations are not
involved:

max@max:/tmp$ cat libfoo.c
long h = 15;
long f = 4;
long foo (long *p) {
  return *p;
}
max@max:/tmp$ cat libbar.c
extern void abort (void);
long foo (long *);
long h = 12;
long i = 13;
long f = 5;

int bar () {
  if (foo (&f) != 5 || foo (&h) != 12 || foo (&i) != 13)
    abort ();
  return 0;
}
max@max:/tmp$ cat main.c
int bar ();
int main () {
  return bar ();
}
max@max:/tmp$ clang libfoo.c -shared -fpic -o libfoo.so -g
max@max:/tmp$ clang libbar.c -shared -fpic -o libbar.so -g
max@max:/tmp$ clang main.c -c -o main.o
max@max:/tmp$ clang main.o ./libbar.so ./libfoo.so -o main -fsanitize=address
max@max:/tmp$ ./main
max@max:/tmp$ clang libfoo.c -shared -fpic -o libfoo.so -g -fsanitize=address
max@max:/tmp$ ./main
=================================================================
==27105==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f28c26a0050 at pc 0x7f28c229d9c1 bp 0x7ffd1716a950 sp 0x7ffd1716a948
READ of size 8 at 0x7f28c26a0050 thread T0
    #0 0x7f28c229d9c0 in foo /tmp/libfoo.c:4:10
    #1 0x7f28c249f7bf in bar /tmp/libbar.c:8:7
    #2 0x4e1585 in main (/tmp/main+0x4e1585)
    #3 0x7f28c13b3ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
    #4 0x418f25 in _start (/tmp/main+0x418f25)

0x7f28c26a0050 is located 0 bytes inside of global variable 'f' defined in 'libfoo.c:2:6' (0x7f28c26a0050) of size 8
0x7f28c26a0050 is located 8 bytes to the right of global variable 'h' defined in 'libfoo.c:1:6' (0x7f28c26a0040) of size 8
SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/libfoo.c:4:10 in foo
Shadow bytes around the buggy address:
  0x0fe5984cbfb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbfc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbfd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbfe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0fe5984cc000: 00 00 00 00 00 00 00 00 00 f9[f9]f9 f9 f9 f9 f9
  0x0fe5984cc010: f9 f9 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==27105==ABORTING
max@max:/tmp$ readelf -r main | grep COPY

Here, the global symbols 'f' and 'h' resolve to libbar.so, which is not
sanitized. However, when libfoo.so registers its "own" globals, it actually
poisons libbar.so's 'f' and 'h', which are not properly padded.