On 01/09/2018 03:47 AM, Richard Earnshaw (lists) wrote:
> On 05/01/18 13:08, Alexander Monakov wrote:
>> On Fri, 5 Jan 2018, Richard Earnshaw (lists) wrote:
>>> This is quite tricky.  For ARM we have to have a speculated load.
>>
>> Sorry, I don't follow.  On ARM, it is surprising that CSEL-CSDB-LDR sequence
>> wouldn't work (applying CSEL to the address rather than loaded value), and
>> if it wouldn't, then ARM-specific lowering of the builtin can handle that
>> anyhow, right?  (by spilling the pointer)
>
> The load has to feed /in/ to the csel/csdb sequence, not come after it.
>
>>
>> (on x86 the current Intel's recommendation is to emit LFENCE prior to the
>> load)
>
> That can be supported in the way you expand the builtin.  The builtin
> expander is given a (MEM (ptr)), but it's up to the back-end where to
> put that in the expanded sequence to materialize the load, so you could
> write (sorry, don't know x86 asm very well, but I think this is how
> you'd put it)
>
>	lfence
>	mov	(ptr), dest
>
> with branches around that as appropriate to support the remainder of the
> builtin's behaviour.

I think the argument is going to be that they don't want the branches
around to support the other test + failval semantics.  Essentially the
same position as IBM has with PPC.

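For reference, here is a rough C-level sketch of what "lfence, then the load, with branches around it" could look like. It is only an illustration, not the proposed builtin or its actual expansion: the helper names `speculation_fence` and `load_no_speculate_sketch`, the simplified `int`-only signature, and the non-x86 fallback are all assumptions made for the example, and it assumes GNU-style inline asm.

```c
/* Speculation barrier: LFENCE on x86.  The fallback for other targets is
   only a compiler barrier (an assumption so the sketch builds elsewhere;
   it gives no hardware guarantee). */
static inline void speculation_fence(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("lfence" ::: "memory");
#else
    __asm__ volatile("" ::: "memory");
#endif
}

/* Hypothetical helper, not the real builtin: returns *ptr when
   lower <= ptr < upper, otherwise failval. */
static inline int load_no_speculate_sketch(const int *ptr, const int *lower,
                                           const int *upper, int failval)
{
    if (ptr >= lower && ptr < upper) {  /* the "branches around" the load */
        speculation_fence();            /* lfence */
        return *ptr;                    /* mov (ptr), dest */
    }
    return failval;                     /* the test + failval semantics */
}
```

The bounds test and the failval path are the extra branches that the objection above is about; a bare barrier would consist of the fence and the load only.
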
>
>> Is the main issue expressing the CSEL condition in the source code?  Perhaps
>> it is possible to introduce
>>
>>   int guard = __builtin_nontransparent(predicate);
>>
>>   if (predicate)
>>     foo = __builtin_load_no_speculate(&arr[addr], guard);
>>
>> ... or maybe even
>>
>>   if (predicate)
>>     foo = arr[__builtin_loadspecbarrier(addr, guard)];
>>
>> where internally __builtin_nontransparent is the same as
>>
>>   guard = predicate;
>>   asm volatile("" : "+g"(guard));
>>
>> although admittedly this is not perfect since it forces evaluation of 'guard'
>> before the branch.
>
> As I explained to Bernd last night, I think this is likely to be unsafe.
> If there's some control path before __builtin_nontransparent that allows
> 'predicate' to be simplified (eg by value range propagation), then your
> guard doesn't protect against the speculation that you think it does.
> Changing all the optimizers to guarantee that wouldn't happen (and
> guaranteeing that all future optimizers won't introduce new problems of
> that nature) is, I suspect, very non-trivial.

Agreed.  Whatever PREDICATE happens to be, the compiler is going to go
through extreme measures to try and collapse PREDICATE down to a
compile-time constant, including splitting paths to the point where
PREDICATE is used in the conditional so that on one side it's constant
and the other it's non-constant.  It seems like this approach is likely
to be compromised by the optimizers.

Jeff
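
A minimal sketch of the failure mode discussed above, assuming GCC-style inline asm. The names `nontransparent`, `guarded_load`, `arr` and `ARR_LEN` are hypothetical and exist only for this illustration; `nontransparent` stands in for the suggested `__builtin_nontransparent`. Whether a given compiler actually performs the folding depends on the version and optimization level.

```c
#include <stddef.h>

#define ARR_LEN 4
static int arr[ARR_LEN] = {10, 20, 30, 40};

/* Hypothetical stand-in for the suggested __builtin_nontransparent:
   hides the value of 'guard' from the optimizers after this point. */
static inline int nontransparent(int predicate)
{
    int guard = predicate;
    __asm__ volatile("" : "+g"(guard));
    return guard;
}

int guarded_load(size_t idx)
{
    if (idx >= ARR_LEN)   /* an earlier check on the same control path */
        return -1;

    /* Value range propagation now knows idx < ARR_LEN, so the compiler
       may fold 'predicate' to the constant 1 before the asm sees it. */
    int predicate = idx < ARR_LEN;
    int guard = nontransparent(predicate);

    if (predicate)
        /* If the fold happened, 'guard' was computed from a constant and
           carries no dependence on idx, so it cannot stop a load issued
           under a mispredicted earlier branch. */
        return guard ? arr[idx] : -1;
    return -1;
}
```

The function still returns the right values architecturally; the concern is only about what the hardware may do speculatively once the guard has been detached from the index.
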