On Thu, Feb 01, 2018 at 02:29:09PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 01, 2018 at 09:27:50PM +0900, Stafford Horne wrote:
> > I tried to clarify some of this in the spec v1.2 [0] which help formalize
> > some of the techniques we used for the SMP implementation.  Its probably
> > not perfect, but I added a section "10. Multicore support" and tried to
> > clarify some things in section 7 on Atomicity.  But it seems I dont cover
> > exactly what are are mentioning here.  In general:
> >
> >  1 Secondary cores have memory snooping enabled meaning that any write to a
> >    cached address will cause the cache line to be invalidated.
> >  2 l.swa (store atomic word) implies a store buffer flush.
>
> What about l.lwa? Can that observe 'old' values, or rather, miss values
> stuck in a remote store buffer?
>
> This will then cause the first l.swa to fail, which, per the above,
> would then sync things up? Which means you get that one extra
> merry-go-round.

Sorry, I remembered incorrectly: l.lwa also implies a store buffer flush
(l.msync) for the local cpu.  However, in order to see something stuck in a
remote store buffer, a flush would need to be initiated on the remote core.
I think that is what we would expect though, right?

> >  3 l.msync is used to flush the store buffer
> >
> > Also, during the IPI controller review [1] Marc Z asked many similar
> > questions.  I believe he was ok in the end.
> >
> > Anyway,
> > Thanks for thanks for spotting the issue here.  For some reason I remember
> > we did have an l.msync for our mb().  Let me think about and test out this
> > patch (and the fix to actually define mb) to see if anything comes up.
> >
> > Also, I haven't seen any implementations that use WOM.  Stefan might know
> > better.
>
> So if the strong model has a store buffer, as I think the above says,
> then it is _NOT_ correct for l.msync to be treated as a NOP, it _must_
> flush the store buffer.
>
> At which point I think your 'strong' model is basically TSO. So it would
> be very good to get that spelled out somewhere.

Yes, I think the original author did not think of PSO/TSO and store buffers.
The author's intention is not clear, and it should be cleared up.

I would say:

 1 Weak order model with store buffers is PSO (must implement l.msync)
 2 Strong model with store buffers is TSO (must implement l.msync)
 3 Implementations without store buffers could be weak or strong?
   a weak, meaning the cpu could schedule loads/stores out of order;
     l.msync would cause all pending load/store instructions to be retired.
   b strong, meaning loads/stores would happen in instruction order; in this
     case l.msync could be a no-op as there is no buffering of stores or
     loads.

1 doesn't exist as far as I know, so it's probably better to remove it.
2 is what we have now in mor1kx.
3.b is possible, but we always have an l.msync implementation.  Maybe it
doesn't make sense when there is no store buffer.

-Stafford
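
P.S. As a rough illustration of the store buffer point above, here is a toy,
single-threaded C model of two cores with FIFO store buffers.  It is only a
sketch under stated assumptions: the names (toy_store, toy_load, toy_msync,
toy_demo) and the sizes are made up for this example, it is not taken from
mor1kx or from the kernel, and it models nothing beyond "stores sit in a
per-core buffer until that core flushes".  It shows that a flush on the
local core does not make a value stuck in the remote buffer visible; only a
flush on the remote core does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All sizes below are arbitrary toy values, not hardware parameters. */
#define NCORES   2
#define MEM_SIZE 4
#define SB_DEPTH 4

struct sb_entry { size_t addr; int val; };

struct core {
	struct sb_entry sb[SB_DEPTH];	/* FIFO store buffer */
	size_t sb_len;			/* number of pending stores */
};

static int memory[MEM_SIZE];
static struct core cores[NCORES];

static void toy_reset(void)
{
	memset(memory, 0, sizeof(memory));
	memset(cores, 0, sizeof(cores));
}

/* Hypothetical stand-in for l.msync: drain this core's buffer in FIFO order. */
static void toy_msync(int cpu)
{
	struct core *c = &cores[cpu];

	for (size_t i = 0; i < c->sb_len; i++)
		memory[c->sb[i].addr] = c->sb[i].val;
	c->sb_len = 0;
}

/* A store only enters the local buffer; drain first if the buffer is full. */
static void toy_store(int cpu, size_t addr, int val)
{
	struct core *c = &cores[cpu];

	assert(addr < MEM_SIZE);
	if (c->sb_len == SB_DEPTH)
		toy_msync(cpu);
	c->sb[c->sb_len].addr = addr;
	c->sb[c->sb_len].val = val;
	c->sb_len++;
}

/* A load sees the newest matching entry in its own buffer, else memory. */
static int toy_load(int cpu, size_t addr)
{
	const struct core *c = &cores[cpu];

	assert(addr < MEM_SIZE);
	for (size_t i = c->sb_len; i-- > 0; )
		if (c->sb[i].addr == addr)
			return c->sb[i].val;
	return memory[addr];
}

/* Record what cpu 1 observes at three points in time. */
static void toy_demo(int out[3])
{
	toy_reset();
	toy_store(0, 0, 1);		/* cpu 0: store stays in its buffer */
	out[0] = toy_load(1, 0);	/* cpu 1: still sees the old value */
	toy_msync(1);			/* local flush on cpu 1 */
	out[1] = toy_load(1, 0);	/* cpu 1: remote store still hidden */
	toy_msync(0);			/* remote core flushes its buffer */
	out[2] = toy_load(1, 0);	/* cpu 1: now sees the new value */
}
```

Calling toy_demo() from a small main() and printing out[0..2] shows the
old value twice and the new value once cpu 0 has flushed.  Because each
buffer drains in FIFO order, the model corresponds to the TSO-like case
(2 above) rather than PSO.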