lebedev.ri added a subscriber: tentzen. lebedev.ri added inline comments.
================ Comment at: clang/lib/CodeGen/CGCall.cpp:5402-5411 + if (SanOpts.has(SanitizerKind::ExceptionEscape) && + ExceptionEscapeUBLastInvokeSrcLoc) { + llvm::Constant *CheckSourceLocation = EmitCheckSourceLocation(Loc); + Builder.CreateStore( + CheckSourceLocation, + Address(ExceptionEscapeUBLastInvokeSrcLoc, + CheckSourceLocation->getType(), ---------------- lebedev.ri wrote: > lebedev.ri wrote: > > rjmccall wrote: > > > lebedev.ri wrote: > > > > lebedev.ri wrote: > > > > > @rjmccall > > > > > So there are two issues: > > > > > 1. `getInvokeDest()` isn't necessarily called just before creating an > > > > > `invoke`, see e.g. `-EHa` handling in > > > > > `CodeGenFunction::PopCleanupBlock()` > > > > > 2. Even if we ignore that, we need to do this for *every* `invoke`, > > > > > not just those going to the our UB landing pad, consider: > > > > > https://godbolt.org/z/qTeKor41a <- the invoke leads to a normal > > > > > landing pad, yet we immediately rethrow the just-caught exception, > > > > > and now end up in the UB landing pad. > > > > > > > > > > So i'm not really seeing an alternative path here? > > > > @rjmccall do you agree with my reasoning here? > > > > Perhaps there is some other solution that i'm not seeing, > > > > other than not having the source location? > > > I think there are four interesting questions posed by your example. > > > > > > The first is whether we should make an effort to report the original > > > source location for the call to `maybe_throws` instead of the rethrow > > > location. To me, the answer is no, and the main reason is that the user > > > is explicitly acknowledging the possibility of an exception and is > > > (probably) trying to handle it. That means that the proximate source of > > > programmer error is now that their attempt to handle it failed, not that > > > an exception was raised in the first place. That might seem weird for > > > your example specifically, > > > > > > The second is what source location we should report for invokes that > > > don't have an obvious source location. The answer is that I think we > > > should always try to have a source location, even if it might be an > > > imperfect one. Synthesized code almost always has a source location that > > > triggered the synthesis. Cleanups are typically associated with some > > > variable or temporary that itself should have a source location. Even > > > things like `__cxa_end_catch` are associated with the `catch` clause. > > > Hopefully UBSan has a flexible enough schema in how source locations are > > > expressed to it for us to communicate these things down to the runtime, > > > like "synthesized code at Foo.cpp:18" or "cleanup for variable at > > > Bar.cpp:107". > > > > > > The third is the more general question of whether we need to store a > > > source location before every `invoke`. To ensure that this variable is > > > meaningfully initialized, we need to store a source location before every > > > invoke that can lead to a UB scope. But the variable is stateful, so > > > it's equally important that we not emit stores that might overwrite the > > > source location that we actually want to report. For example, we want > > > the source location reported in your example to be the source location > > > for the `throw;`, not something associated with the `__cxa_end_catch` > > > cleanup. This basically boils down to not doing stores within a landing > > > pad before it enters a `catch` handler. Fortunately, landing pad code > > > like that should always be in a terminate scope, so I think you can just > > > look at the current EH stack and see whether the landing pad can possibly > > > reach a UB scope. (You might need some save-and-restore logic around > > > `@finally` blocks in ObjC.) > > > > > > The final question is whether we need to emit cleanups on paths that lead > > > to UB scopes at all. In the simplest cases, I think we don't. The > > > standard says that it's implementation-specified whether the stack is > > > unwound on an exception path that leads to termination. That rule > > > doesn't technically apply here, but only because we're dealing with a > > > language extension that makes unwinding UB in the first place. Since we > > > have room to choose the semantics, the standard gives us pretty strong > > > cover to be just as aggressive about skipping cleanups in contexts where > > > throwing is UB as it is in contexts where throwing terminates. That, in > > > turn, means that the landing pad in these contexts will usually just be a > > > UB scope with no cleanups in it. So even if we did need to worry about > > > cleanups overwriting the source location, it usually won't be possible > > > because we won't be running cleanups. In fact, if there's a catch-all or > > > a terminate scope, the UB scope won't be reachable, so the only situation > > > where we have to worry about cleanups being run before we reach a UB > > > scope is if there are cleanups inside a *partial* catch clause, like > > > `catch (const std::exception &e)`. And in principle we have the > > > information up front from the EH selector to know whether we're going to > > > end up in a catch handler, so we could actually just check immediately at > > > the start of the landing pad and then trigger UBSan. > > > The final question is whether we need to emit cleanups on paths that lead > > > to UB scopes at all. In the simplest cases, I think we don't. The > > > standard says that it's implementation-specified whether the stack is > > > unwound on an exception path that leads to termination. That rule doesn't > > > technically apply here, but only because we're dealing with a language > > > extension that makes unwinding UB in the first place. Since we have room > > > to choose the semantics, the standard gives us pretty strong cover to be > > > just as aggressive about skipping cleanups in contexts where throwing is > > > UB as it is in contexts where throwing terminates. That, in turn, means > > > that the landing pad in these contexts will usually just be a UB scope > > > with no cleanups in it. So even if we did need to worry about cleanups > > > overwriting the source location, it usually won't be possible because we > > > won't be running cleanups. In fact, if there's a catch-all or a terminate > > > scope, the UB scope won't be reachable, so the only situation where we > > > have to worry about cleanups being run before we reach a UB scope is if > > > there are cleanups inside a *partial* catch clause, like catch (const > > > std::exception &e). And in principle we have the information up front > > > from the EH selector to know whether we're going to end up in a catch > > > handler, so we could actually just check immediately at the start of the > > > landing pad and then trigger UBSan. > > > > The more i stare at this. The more worried i am. > > Is the idea that, when we are in termination/UB EH scope, > > we no longer have to play by the RAII rules, > > and destructors, for the moment, are no-ops and are side-effect free? > > > > Consider: > > ``` > > #include <cstdlib> > > #include <cstdio> > > > > void will_throw() { throw int(42); } > > void might_throw() { } > > > > struct abort_on_dtor { > > abort_on_dtor() {} > > [[noreturn]] ~abort_on_dtor() { printf("in ~abort_on_dtor()\n"); abort(); > > } > > }; > > > > void test0() noexcept { > > abort_on_dtor scope; > > will_throw(); > > might_throw(); > > } > > > > __attribute__ ((pure)) void test1() { > > abort_on_dtor scope; > > will_throw(); > > might_throw(); > > } > > > > int main(int argc, char* argv[]) { > > printf("argc = %i\n", argc); > > if(argc == 1) > > abort_on_dtor scope; > > if(argc == 2) > > test0(); > > if(argc == 3) > > test1(); > > return 0; > > } > > ``` > > You'd think, if call to `test0()`/`test1()` unwinds, > > we'd first destruct `abort_on_dtor`, and termination/UB no longer happens. > > But i guess termination/UB time-traveled/"happened" the moment > > `test0()`/`test1()` unwinds? > > > > The final question is whether we need to emit cleanups on paths that lead > > to UB scopes at all. In the simplest cases > > I believe, i have implemented these semantics, i.e. > * no cleanups in UB EH scope, just like how we deal with Terminate scope > * when in UB EH scope, and not sanitizing, all callees are implicitly nounwind > * If `catch()` blocks don't catch the exception, and we get to UB scope, we > go into `unreachable` block. > > Did i understand you correctly? > (test suite does not pass, there are a few bugs to deal with) > Ok, trying to continue looking into this > The first is whether we should make an effort to report the original source > location for the call to maybe_throws instead of the rethrow location. To me, > the answer is no, Sure, seems reasonable. > The second is what source location we should report for invokes that don't > have an obvious source location. The answer is that I think we should always > try to have a source location, even if it might be an imperfect one. It seems like this point is non-actionable. We simply always emit the source location, and if it is invalid, well, then that is a more general problem. > The third is the more general question of whether we need to store a source > location before every invoke. To ensure that this variable is meaningfully > initialized, we need to store a source location before every invoke that can > lead to a UB scope. But the variable is stateful, so it's equally important > that we not emit stores that might overwrite the source location that we > actually want to report. For example, we want the source location reported in > your example to be the source location for the throw;, not something > associated with the __cxa_end_catch cleanup. This basically boils down to not > doing stores within a landing pad before it enters a catch handler. > Fortunately, landing pad code like that should always be in a terminate > scope, so I think you can just look at the current EH stack and see whether > the landing pad can possibly reach a UB scope. (You might need some > save-and-restore logic around @finally blocks in ObjC.) I've fixed source location for `throw`, but i'm not quite following the rest of the comment here. Can you perhaps give a self-contained example where we'd end up with wrong line? I'm not sure about ObjC, so i'll ignore that bit for a moment at least. ================ Comment at: clang/test/CodeGen/windows-seh-EHa-CppCondiTemps.cpp:3 +// FIXME: this check appears to be miscompiled? +// XFAIL: * ---------------- This test broke once we always started adding (outermost) UB scope for nounwind functions. I don't quite get what is going wrong. It could be a bug in SEH handling. Can someone who has some idea about that code take a look and suggest a fix? @tentzen ? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D137381/new/ https://reviews.llvm.org/D137381 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits