On Tue, 9 Jan 2018, Richard Earnshaw (lists) wrote:
> > Sorry, I don't follow. On ARM, it is surprising that CSEL-CSDB-LDR sequence
> > wouldn't work (applying CSEL to the address rather than loaded value), and
> > if it wouldn't, then ARM-specific lowering of the builtin can handle that
> > anyhow, right? (by spilling the pointer)
> 
> The load has to feed /in/ to the csel/csdb sequence, not come after it.

Again, I'm sorry, but I have to insist that what you're saying here contradicts
the documentation linked from
<https://developer.arm.com/support/security-update>.
The PDF currently says, in "Details of the CSDB barrier":

    Until the barrier completes:
    1) For any load, store, data or instruction preload, RW2, appearing in
    program order *after the barrier* [...]

    2) For any indirect branch (B2), appearing in program order
    *after the barrier* [...]

    [...] the speculative execution of RW2/B2 does not influence the
    allocations of entries in a cache [...]

It doesn't say anything about the behavior of CSDB depending on loads
encountered prior to it.  (And IMHO it wouldn't make sense for a hardware
implementation to work that way.)
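
To make the ordering concrete, here is a rough sketch of the CSEL-CSDB-LDR
shape I have in mind, where the select is applied to the address and the load
comes after the barrier.  The names load_guarded and safe_dummy are made up for
illustration, the "hint #20" spelling of CSDB is used so that older assemblers
accept it, and the non-AArch64 branch is only a portable stand-in (a compiler
is free to turn it back into a branch).  Treat it as a sketch under those
assumptions, not as a proposed lowering:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical harmless location to load from when the index is out of range. */
static const int safe_dummy = 0;

/* Returns array[idx] if idx < len, otherwise the value of safe_dummy (0). */
static int load_guarded(const int *array, size_t len, size_t idx)
{
    /* Compute the candidate address with integer arithmetic so that an
       out-of-range idx does not form an invalid pointer in C terms. */
    uintptr_t addr = (uintptr_t)array + idx * sizeof *array;

#if defined(__aarch64__)
    const int *p = (const int *)addr;
    int out;
    __asm__ volatile(
        "cmp  %[idx], %[len]\n\t"
        "csel %[p], %[p], %[safe], lo\n\t" /* keep p only if idx < len */
        "hint #20\n\t"                     /* CSDB */
        "ldr  %w[out], [%[p]]"             /* load appears after the barrier */
        : [out] "=&r"(out), [p] "+&r"(p)
        : [idx] "r"(idx), [len] "r"(len), [safe] "r"(&safe_dummy)
        : "cc", "memory");
    return out;
#else
    /* Portable stand-in: all-ones mask when in range, zero otherwise. */
    uintptr_t mask = (uintptr_t)0 - (uintptr_t)(idx < len);
    addr = (addr & mask) | ((uintptr_t)&safe_dummy & ~mask);
    return *(const int *)addr;
#endif
}
```

Here the load is "RW2" in the PDF's terms: it appears in program order after
the barrier, and its address has already been forced to a safe location.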

> As I explained to Bernd last night, I think this is likely to be unsafe.
> If there's some control path before __builtin_nontransparent that allows
> 'predicate' to be simplified (eg by value range propagation), then your
> guard doesn't protect against the speculation that you think it does.
> Changing all the optimizers to guarantee that wouldn't happen (and
> guaranteeing that all future optimizers won't introduce new problems of
> that nature) is, I suspect, very non-trivial.

But note that in that case the compiler could have performed the same
simplification in the original code as well, emitting straight-line machine code
lacking speculatively executable parts in the first place.

Alexander
