On 4 July 2014 16:35, Frederic Konrad <fred.kon...@greensocs.com> wrote:
> Hi everybody,
>
> We are experimenting with multi-core QEMU. We have Multiple QEMU cores
> running on multiple host threads and we are now looking at the issue of
> ‘atomic’ instructions.
>
> Our initial thought was to push some sort of set of flags out to the memory
> chain (eg: ATOMIC or NORMAL much as the H/W would do). However,
> this is a lot of changes and there seems to be a choice of a number of
> different ways of doing this.

Yes, I don't think you can emulate load-store exclusive instructions
like this. (Also, these days hardware doesn't generally use that
sort of "lock the bus" signal, which is awkward for SMP; indeed that's
why load-store exclusive paired instructions have taken over from SWP
on ARM.)

> We think the best approach overall is to leave the current mechanisms for
> guaranteeing the functionality of e.g. load/store exclusive in place.
> That is to say, right now, for instance for ARM, QEMU stores the addr/val of
> loads, and compares them to ensure they have not been changed on store.
> Effectively it does a load-compare-store for the store.
> This is a ‘belt and braces’ implementation of the H/W, but it’s good.

Actually what we do at the moment isn't architecturally valid
for ARM. If any other core writes the same value to the memory location
between the LDREX and the STREX, the architecture says that we must
fail the STREX, but our implementation does not. The architecture also
says that plain stores by other cores should break the lock, and we
don't implement that (we only handle STREX specially). I recommend
reading the ARM ARM sections on synchronization and exclusive accesses
(though they are rather heavy going...).

In a multi-threaded TCG world I would be inclined actually to
implement this in a manner somewhat closer to what hardware does:
on LDREX you mark the page as read-only, and (using a similar
method to what we do for watchpoints) arrange that if some other
core writes to that page then we un-read-protect it and note that
the STREX should fail.

> I believe it is valid to say that - so long as each core can guarantee that
> the ‘load-compare-store’ is somehow atomic, then it is perfectly satisfactory
> for each QEMU core to hold its own value for the ‘old value’ etc, and this
> mechanism will still work.
>
> The issue will only be to ensure that the load-compare-store is atomic - and
> only in the ‘store exclusive’.
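
To make the limitation concrete, here is a minimal sketch of the
load-compare-store scheme described above, written with C11 atomics.
It is not QEMU's actual code: ExclState, emu_ldrex and emu_strex are
hypothetical names, and guest memory is assumed to be an ordinary host
word. Even with the compare-and-store made atomic, the store can only
detect a changed value, not an intervening write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU exclusive state (illustrative, not QEMU's). */
typedef struct {
    _Atomic uint32_t *addr; /* address recorded by the emulated LDREX */
    uint32_t val;           /* value observed by the emulated LDREX */
    bool valid;             /* true between an LDREX and the next STREX */
} ExclState;

/* Emulated LDREX: remember the address and the value that was loaded. */
static uint32_t emu_ldrex(ExclState *s, _Atomic uint32_t *addr)
{
    assert(s != NULL && addr != NULL);
    s->addr = addr;
    s->val = atomic_load(addr);
    s->valid = true;
    return s->val;
}

/*
 * Emulated STREX as an atomic load-compare-store.
 * Returns 0 on success and 1 on failure, like the STREX status result.
 * It fails only if the current value differs from the remembered one,
 * so a write of the same value by another core goes unnoticed.
 */
static int emu_strex(ExclState *s, _Atomic uint32_t *addr, uint32_t newval)
{
    assert(s != NULL && addr != NULL);
    if (!s->valid || s->addr != addr) {
        s->valid = false;
        return 1;
    }
    uint32_t expected = s->val;
    bool ok = atomic_compare_exchange_strong(addr, &expected, newval);
    s->valid = false; /* the exclusive state is consumed either way */
    return ok ? 0 : 1;
}
```

If one vCPU calls emu_ldrex on a word holding 5, another thread then
stores 5 to the same word, and the first vCPU calls emu_strex, the
sketch reports success, whereas the architecture requires the STREX to
fail. A plain store of a different value is caught, but only because
the value changed, not because the write itself was tracked.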

You also need some means of ensuring that atomicity is preserved in
other cases: for instance, for ARM guests with LPAE, 64-bit loads and
stores must be atomic, but we don't currently guarantee that in TCG.

> Overall, this mechanism does not actually mirror the hardware we are
> modelling, so overall we think it would be easier and more re-usable to
> provide two new OP’s in the TCG, one to lock a mutex, one to release it.

What in particular are you proposing that these mutexes should
protect against? I suspect you may want to describe the semantics at a
slightly higher level (perhaps "do not allow any other vCPU to
execute while these TCG instructions are executing" markers?)

It would probably also be useful if you were to sketch out
how you would expect this to work for:
 * a simple atomic operation (eg ARM SWP or some of the x86
   LOCK-prefixed insns)
 * LDREX/STREX or load-locked/store-conditional pairs
 * any other atomicity requirements that might differ between
   guest and host, like whether 64-bit accesses are atomic

thanks
-- PMM
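
As a rough illustration of the first and third cases in the list above
(not a proposal for how TCG should do it), the sketch below maps a
guest SWP and guest 64-bit accesses onto host C11 atomics. The helper
names emu_swp, emu_ld64 and emu_st64 are hypothetical, and the sketch
assumes guest memory can be accessed directly as host atomic objects.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: guest SWP as one host atomic exchange.
 * Returns the old memory contents, as SWP does.
 */
static uint32_t emu_swp(_Atomic uint32_t *addr, uint32_t newval)
{
    assert(addr != NULL);
    return atomic_exchange(addr, newval);
}

/*
 * Hypothetical helpers: single-copy atomic 64-bit guest load and store.
 * On a host without native 64-bit atomics the C implementation may fall
 * back to an internal lock, which is still atomic but slower.
 */
static uint64_t emu_ld64(_Atomic uint64_t *addr)
{
    assert(addr != NULL);
    return atomic_load(addr);
}

static void emu_st64(_Atomic uint64_t *addr, uint64_t val)
{
    assert(addr != NULL);
    atomic_store(addr, val);
}
```

This only covers operations that have a direct host equivalent. The
LDREX/STREX case does not reduce to a single host primitive in the
same way, because the guest may execute arbitrary instructions between
the two halves of the pair.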