On 22/11/2018 13:27, Jan Beulich wrote: >>>> On 22.11.18 at 13:38, <julien.gr...@arm.com> wrote: >> Hi Andrew, >> >> On 11/22/18 12:23 PM, Andrew Cooper wrote: >>> On 22/11/2018 12:03, Roger Pau Monne wrote: >>>> LLVM code generation can attempt to perform a load from a variable in >>>> the next condition of an expression under certain circumstances, thus >>>> turning the following condition: >>>> >>>> if ( system_state < SYS_STATE_active && opt_bootscrub == BOOTSCRUB_IDLE ) >>>> >>>> Into: >>>> >>>> 0xffff82d080223967 <+103>: cmpl $0x3,0x37b032(%rip) # 0xffff82d08059e9a0 >> <system_state> >>>> 0xffff82d08022396e <+110>: setb -0x29(%rbp) >>>> 0xffff82d080223972 <+114>: cmpl $0x2,0x228a8b(%rip) # 0xffff82d08044c404 >> <opt_bootscrub> >>> You're actually missing two pieces here. If I recall the disassembly >>> correctly, there was a `sete %al`, and an `and %al, -0x29(%rbp)` which >>> had the cumulative effect of calculating `idle_scrub` branchlessly >>> (which is no doubt the intended optimisation). >>> >>>> Such code will trigger a page fault if system_state >= >>>> SYS_STATE_active. >>>> >>>> In order to prevent such optimization signal to the compiler that >>>> accessing opt_bootscrub can have side effects by using ACCESS_ONCE. >>>> This has been reported and discussed with upstream LLVM: >>>> >>>> https://bugs.llvm.org/show_bug.cgi?id=39707 >>>> >>>> I haven't been able to find any other instances of such conditional >>>> expression that uses system_state together with an init variable or >>>> function. >>>> >>>> Signed-off-by: Roger Pau Monné <roger....@citrix.com> >>> To unblock the Clang build, Acked-by: Andrew Cooper >>> <andrew.coop...@citrix.com>, although ideally with the expanded disassembly. >> Actually is it enough in all the case? We are only preventing the >> re-ordering for the compiler. The processor is still free to re-order it >> and load opt_bootscrub before loading system_state. > The processor is fine to issue the load early, but it is not permitted > to raise an exception resulting from that read attempt before the > reading insn is the next one to retire.
I think Julien's point is that without explicitly barriers, CPU0's update to system_state may not be visible on CPU1, even though the mappings have been shot down. Therefore, from the processors point of view, it did everything correctly, and hit a real pagefault. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel