lebedev.ri added a reviewer: efriedma. lebedev.ri added a comment. @rjmccall thank you for taking a look!
In D137381#3955100 <https://reviews.llvm.org/D137381#3955100>, @rjmccall wrote: > In D137381#3953178 <https://reviews.llvm.org/D137381#3953178>, @lebedev.ri > wrote: > >> ping. We need to get this going, reminder, this was caught because it caused >> a visible miscompilation. > > I'm sorry, I caught COVID-19 during the LLVM conference and have not had a > lot of energy recently, I'm sorry. > plus last week was the Thanksgiving holiday in the US. > Please don't describe this as a miscompilation. It may not match what you > want as a user, but what the the compiler is doing is not incorrect. Right. > What you're doing is adding a mitigation to try to protect against a source > of UB that recently affected you, which is commendable but doesn't have the > same level of urgency. I'm mainly worried that there are likely other "miscompilations" in the wild. Also, after this, we might want to reevaluate which clang attribute does what, ================ Comment at: clang/lib/CodeGen/CGCall.cpp:5402-5411 + if (SanOpts.has(SanitizerKind::ExceptionEscape) && + ExceptionEscapeUBLastInvokeSrcLoc) { + llvm::Constant *CheckSourceLocation = EmitCheckSourceLocation(Loc); + Builder.CreateStore( + CheckSourceLocation, + Address(ExceptionEscapeUBLastInvokeSrcLoc, + CheckSourceLocation->getType(), ---------------- rjmccall wrote: > lebedev.ri wrote: > > lebedev.ri wrote: > > > @rjmccall > > > So there are two issues: > > > 1. `getInvokeDest()` isn't necessarily called just before creating an > > > `invoke`, see e.g. `-EHa` handling in `CodeGenFunction::PopCleanupBlock()` > > > 2. Even if we ignore that, we need to do this for *every* `invoke`, not > > > just those going to the our UB landing pad, consider: > > > https://godbolt.org/z/qTeKor41a <- the invoke leads to a normal landing > > > pad, yet we immediately rethrow the just-caught exception, and now end up > > > in the UB landing pad. > > > > > > So i'm not really seeing an alternative path here? > > @rjmccall do you agree with my reasoning here? > > Perhaps there is some other solution that i'm not seeing, > > other than not having the source location? > I think there are four interesting questions posed by your example. > > The first is whether we should make an effort to report the original source > location for the call to `maybe_throws` instead of the rethrow location. To > me, the answer is no, and the main reason is that the user is explicitly > acknowledging the possibility of an exception and is (probably) trying to > handle it. That means that the proximate source of programmer error is now > that their attempt to handle it failed, not that an exception was raised in > the first place. That might seem weird for your example specifically, > > The second is what source location we should report for invokes that don't > have an obvious source location. The answer is that I think we should always > try to have a source location, even if it might be an imperfect one. > Synthesized code almost always has a source location that triggered the > synthesis. Cleanups are typically associated with some variable or temporary > that itself should have a source location. Even things like > `__cxa_end_catch` are associated with the `catch` clause. Hopefully UBSan > has a flexible enough schema in how source locations are expressed to it for > us to communicate these things down to the runtime, like "synthesized code at > Foo.cpp:18" or "cleanup for variable at Bar.cpp:107". > > The third is the more general question of whether we need to store a source > location before every `invoke`. To ensure that this variable is meaningfully > initialized, we need to store a source location before every invoke that can > lead to a UB scope. But the variable is stateful, so it's equally important > that we not emit stores that might overwrite the source location that we > actually want to report. For example, we want the source location reported > in your example to be the source location for the `throw;`, not something > associated with the `__cxa_end_catch` cleanup. This basically boils down to > not doing stores within a landing pad before it enters a `catch` handler. > Fortunately, landing pad code like that should always be in a terminate > scope, so I think you can just look at the current EH stack and see whether > the landing pad can possibly reach a UB scope. (You might need some > save-and-restore logic around `@finally` blocks in ObjC.) > > The final question is whether we need to emit cleanups on paths that lead to > UB scopes at all. In the simplest cases, I think we don't. The standard > says that it's implementation-specified whether the stack is unwound on an > exception path that leads to termination. That rule doesn't technically > apply here, but only because we're dealing with a language extension that > makes unwinding UB in the first place. Since we have room to choose the > semantics, the standard gives us pretty strong cover to be just as aggressive > about skipping cleanups in contexts where throwing is UB as it is in contexts > where throwing terminates. That, in turn, means that the landing pad in > these contexts will usually just be a UB scope with no cleanups in it. So > even if we did need to worry about cleanups overwriting the source location, > it usually won't be possible because we won't be running cleanups. In fact, > if there's a catch-all or a terminate scope, the UB scope won't be reachable, > so the only situation where we have to worry about cleanups being run before > we reach a UB scope is if there are cleanups inside a *partial* catch clause, > like `catch (const std::exception &e)`. And in principle we have the > information up front from the EH selector to know whether we're going to end > up in a catch handler, so we could actually just check immediately at the > start of the landing pad and then trigger UBSan. > The final question is whether we need to emit cleanups on paths that lead to > UB scopes at all. In the simplest cases, I think we don't. The standard says > that it's implementation-specified whether the stack is unwound on an > exception path that leads to termination. That rule doesn't technically apply > here, but only because we're dealing with a language extension that makes > unwinding UB in the first place. Since we have room to choose the semantics, > the standard gives us pretty strong cover to be just as aggressive about > skipping cleanups in contexts where throwing is UB as it is in contexts where > throwing terminates. That, in turn, means that the landing pad in these > contexts will usually just be a UB scope with no cleanups in it. So even if > we did need to worry about cleanups overwriting the source location, it > usually won't be possible because we won't be running cleanups. In fact, if > there's a catch-all or a terminate scope, the UB scope won't be reachable, so > the only situation where we have to worry about cleanups being run before we > reach a UB scope is if there are cleanups inside a *partial* catch clause, > like catch (const std::exception &e). And in principle we have the > information up front from the EH selector to know whether we're going to end > up in a catch handler, so we could actually just check immediately at the > start of the landing pad and then trigger UBSan. The more i stare at this. The more worried i am. Is the idea that, when we are in termination/UB EH scope, we no longer have to play by the RAII rules, and destructors, for the moment, are no-ops and are side-effect free? Consider: ``` #include <cstdlib> #include <cstdio> void will_throw() { throw int(42); } void might_throw() { } struct abort_on_dtor { abort_on_dtor() {} [[noreturn]] ~abort_on_dtor() { printf("in ~abort_on_dtor()\n"); abort(); } }; void test0() noexcept { abort_on_dtor scope; will_throw(); might_throw(); } __attribute__ ((pure)) void test1() { abort_on_dtor scope; will_throw(); might_throw(); } int main(int argc, char* argv[]) { printf("argc = %i\n", argc); if(argc == 1) abort_on_dtor scope; if(argc == 2) test0(); if(argc == 3) test1(); return 0; } ``` You'd think, if call to `test0()`/`test1()` unwinds, we'd first destruct `abort_on_dtor`, and termination/UB no longer happens. But i guess termination/UB time-traveled/"happened" the moment `test0()`/`test1()` unwinds? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D137381/new/ https://reviews.llvm.org/D137381 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits