| Issue | 90010 |
|---|---|
| Summary | Function sanitizer is extremely fragile on macOS |
| Labels | |
| Assignees | |
| Reporter | glandium |

Because clang/LLVM stores the function signature used for the runtime check as an integer immediately preceding the function in the text section, and because symbols don't have explicit sizes in Mach-O, the linker treats that data as part of the function that precedes the "annotated" function.
So if you have function `foo` followed by function `bar`, the signature for `bar` belongs to `foo` as far as the linker is concerned. If for some reason the linker removes `foo`, the signature for `bar` goes with it. The bytes that now precede `bar` are whatever the signature of `foo` was, and the runtime checks on indirect calls to `bar` fail.
There are two main situations where this can happen in practice: the `-dead_strip` linker flag (which removes dead code), and when `foo` is defined in multiple compilation units (e.g. template instantiations).
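The mechanism can be illustrated with a toy model of the text section. This is only a sketch, not how any real linker is implemented: the `Function`, `Atom` and `Image` types, the helper names, and the signature values are all hypothetical. Each atom runs from one symbol to the next, so the signature word emitted in front of a function lands at the tail of the preceding atom.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model: a function and the signature word the compiler emits
// immediately in front of it.
struct Function { std::string name; std::uint32_t signature; };

// One linker atom: everything from one symbol up to the next symbol.
// Without symbol sizes, the next function's signature is the tail of this atom.
struct Atom { std::string symbol; std::uint32_t tail_word; };

// prefix_word models text in front of the first symbol (owned by no symbol).
struct Image { std::uint32_t prefix_word; std::vector<Atom> atoms; };

// Lay the functions out in order, attaching each signature to the atom before it.
Image lay_out(const std::vector<Function>& fns) {
    Image img{0, {}};
    for (std::size_t i = 0; i < fns.size(); ++i) {
        if (i == 0) img.prefix_word = fns[i].signature;
        else img.atoms.back().tail_word = fns[i].signature;
        img.atoms.push_back({fns[i].name, 0});
    }
    return img;
}

// Remove one atom the way dead stripping would: body and tail disappear together.
void dead_strip(Image& img, const std::string& symbol) {
    auto it = std::find_if(img.atoms.begin(), img.atoms.end(),
                           [&](const Atom& a) { return a.symbol == symbol; });
    assert(it != img.atoms.end() && "symbol not present");
    img.atoms.erase(it);
}

// The word a runtime check would read: whatever now sits right before `symbol`.
std::uint32_t word_before(const Image& img, const std::string& symbol) {
    for (std::size_t i = 0; i < img.atoms.size(); ++i)
        if (img.atoms[i].symbol == symbol)
            return i == 0 ? img.prefix_word : img.atoms[i - 1].tail_word;
    assert(false && "symbol not present");
    return 0;
}
```

In this model, laying out `call`, `dead`, `foo<int>` and then stripping `dead` makes `word_before(img, "foo<int>")` return the signature that was assigned to `dead`, because that word lives at the tail of `call`.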
Here's a testcase for both cases:
`foo.h`:
```
template<class T>
int foo(T*) {
  return 0;
}
```
`foo.cpp`:
```
#include "foo.h"
int call(int(*cb)(int*), int* x) {
  return cb(x);
}
int dead() {
  return 42;
}
int(*bar)(int*) = &foo<int>;
int main() {
  return call(bar, 0);
}
```
`bar.cpp`:
```
#include "foo.h"
template int foo<void>(void*);
template int foo<int>(int*);
```
`baz.cpp`:
```
#include "foo.h"
template int foo<void>(void*);
```
First case: build with `clang++ -o foo foo.cpp -fsanitize=function -fno-sanitize-recover=function -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -Wl,-dead_strip`
Fails with:
```
foo.cpp:4:10: runtime error: call to function int foo<int>(int*) through pointer to incorrect function type 'int (*)(int *)'
(foo:arm64+0x100003f24): note: int foo<int>(int*) defined here
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior foo.cpp:4:10
```
It fails in this case because the signature for `foo<int>` is stored at the end of `dead`, which `-dead_strip` removes. `foo<int>` then ends up with the signature of `dead`, read from the last bytes of the function that precedes `dead`. If `dead` happened to have the same signature as `foo<int>`, the check would pass by mere chance.
Second case: build with `clang++ -o foo baz.cpp bar.cpp foo.cpp -fsanitize=function -fno-sanitize-recover=function -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk`
The failure output is the same as in the first case, but the cause is different. Because of the order of the .cpp files, `foo<int>` comes from the instantiation in `bar.cpp`, so its signature sits at the end of that file's `foo<void>`. But the linker takes `foo<void>` from the instantiation in `baz.cpp`, so in practice the `foo<void>` in `bar.cpp` behaves as if it were dead code. The signature for `foo<int>` therefore comes from the end of whatever preceded `foo<void>` in `bar.cpp`. In our case that is text associated with no symbol at all at the beginning of `bar.cpp`, which holds the signature of `foo<void>`.
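The second case can be sketched with a similar toy model, assuming the linker simply keeps the first definition it sees of each symbol and drops later duplicates along with their trailing bytes. The `ObjectFile` and `Linked` types, `link_first_wins`, and the signature values `0x11` and `0x22` are hypothetical and only illustrate the reasoning above.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: an atom spans from its symbol to the next symbol, so its
// tail holds the signature word of the function that follows it.
struct Atom { std::string symbol; std::uint32_t tail_word; };

// prefix_word is the text before the first symbol of the object file, i.e.
// the signature of that file's first function.
struct ObjectFile { std::uint32_t prefix_word; std::vector<Atom> atoms; };

// A kept symbol plus the word that ends up directly in front of it.
struct Linked { std::string symbol; std::uint32_t word_before; };

// Keep the first definition of each symbol; a dropped duplicate loses both its
// body and its tail, so the next kept atom sees whatever came before it.
std::vector<Linked> link_first_wins(const std::vector<ObjectFile>& objects) {
    std::vector<Linked> out;
    std::set<std::string> seen;
    for (const ObjectFile& obj : objects) {
        std::uint32_t preceding = obj.prefix_word;
        for (const Atom& atom : obj.atoms) {
            if (seen.insert(atom.symbol).second) {
                out.push_back({atom.symbol, preceding});
                preceding = atom.tail_word;
            }
        }
    }
    return out;
}

// Look up the word in front of `symbol` in the linked result.
std::uint32_t word_before(const std::vector<Linked>& image, const std::string& symbol) {
    for (const Linked& l : image)
        if (l.symbol == symbol) return l.word_before;
    assert(false && "symbol not present");
    return 0;
}
```

With `baz` modeled as `{0x11, {{"foo<void>", 0}}}` and `bar` as `{0x11, {{"foo<void>", 0x22}, {"foo<int>", 0}}}`, linking in the order `baz`, `bar` leaves `0x11` (the `foo<void>` signature) in front of `foo<int>`, while the order `bar`, `baz` leaves the expected `0x22`.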
Presumably, if clang emitted symbols for those function signatures, there wouldn't be a problem, because the linker wouldn't think the signature belongs to the preceding function.