On 11/07/18 21:46, Jeff Law wrote: > On 07/10/2018 10:43 AM, Richard Earnshaw (lists) wrote: >> On 10/07/18 16:42, Jeff Law wrote: >>> On 07/10/2018 02:49 AM, Richard Earnshaw (lists) wrote: >>>> On 10/07/18 00:13, Jeff Law wrote: >>>>> On 07/09/2018 10:38 AM, Richard Earnshaw wrote: >>>>>> >>>>>> To address all of the above, these patches adopt a new approach, based >>>>>> in part on a posting by Chandler Carruth to the LLVM developers list >>>>>> (https://lists.llvm.org/pipermail/llvm-dev/2018-March/122085.html), >>>>>> but which we have extended to deal with inter-function speculation. >>>>>> The patches divide the problem into two halves. >>>>> We're essentially turning the control dependency into a value that we >>>>> can then use to munge the pointer or the resultant data. >>>>> >>>>>> >>>>>> The first half is some target-specific code to track the speculation >>>>>> condition through the generated code to provide an internal variable >>>>>> which can tell us whether or not the CPU's control flow speculation >>>>>> matches the data flow calculations. The idea is that the internal >>>>>> variable starts with the value TRUE and if the CPU's control flow >>>>>> speculation ever causes a jump to the wrong block of code the variable >>>>>> becomes false until such time as the incorrect control flow >>>>>> speculation gets unwound. >>>>> Right. >>>>> >>>>> So one of the things that comes immediately to mind is you have to run >>>>> this early enough that you can still get to all the control flow and >>>>> build your predicates. Otherwise you have do undo stuff like >>>>> conditional move generation. >>>> >>>> No, the opposite, in fact. We want to run this very late, at least on >>>> Arm systems (AArch64 or AArch32). Conditional move instructions are >>>> fine - they're data-flow operations, not control flow (in fact, that's >>>> exactly what the control flow tracker instructions are). By running it >>>> late we avoid disrupting any of the earlier optimization passes as well. >>> Ack. I looked at the aarch64 implementation after sending my message >>> and it clearly runs very late. >>> >>> I haven't convinced myself that all the work generic parts of the >>> compiler to rewrite and eliminate conditionals is safe. But even if it >>> isn't, you're probably getting enough coverage to drastically reduce the >>> attack surface. I'm going to have to think about the early >>> transformations we make and how they interact here harder. But I think >>> the general approach can dramatically reduce the attack surface. >> >> My argument here would be that we are concerned about speculation that >> the CPU does with the generated program. We're not particularly >> bothered about the abstract machine description it's based upon. As >> long as the earlier transforms lead to a valid translation (it hasn't >> removed a necessary bounds check) then running late is fine. > I'm thinking about obfuscation of the bounds check or the pointer or > turning branchy into straightline code, possibly doing some speculation > in the process, if-conversion and the like. > > For example hoist_adjacent_loads which results in speculative loads and > likely a conditional move to select between the two loaded values. > > Or what if we've done something like > > if (x < maxval) > res = *p; > > And we've turned that into > > > t = *p; > res = (x < maxval) ? t : res;
Hmm, interesting. But for that to be safe, the compiler would have to be able to prove that dereferencing p was safe even if x >= maxval, otherwise the run-time code could fault (so if there's any chance that it could point to something vulnerable, then there must also be a chance that it points to unmapped memory). Given that requirement, I don't think this case can be a specific concern, since the requirement implies that p must already be within some known bounds for the type of object it points to. R. > > > That may be implemented as a conditional move at the RTL level, so > protecting that may be nontrivial. > > In those examples the compiler itself has introduced the speculation. > > I can't find the conditional obfuscation I was looking for, so it's hard > to rule it in our out as potentially problematical. > > WRT pointer obfuscation, we no longer propagate conditional equivalences > very agressively, so it may be a non-issue in the end. > > But again, even with these concerns I think what you're doing cuts down > the attack surface in meaningful ways. > > > >> >> I can't currently conceive a situation where the compiler would be able >> to remove a /necessary/ bounds check that could lead to unsafe >> speculation later on. A redundant bounds check removal shouldn't be a >> problem as the non-redundant check should remain and that will still get >> tracking code added. > It's less about removal and more about either compiler-generated > speculation or obfuscation of the patterns you're looking for. > > > jeff > > > >