On Mon, Sep 05, 2011 at 12:19:46PM +0300, Michael S. Tsirkin wrote:
> On Mon, Sep 05, 2011 at 02:43:16PM +1000, David Gibson wrote:
> > On Sun, Sep 04, 2011 at 12:16:43PM +0300, Michael S. Tsirkin wrote:
> > > On Sun, Sep 04, 2011 at 12:46:35AM +1000, David Gibson wrote:
> > > > On Fri, Sep 02, 2011 at 06:45:50PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Sep 01, 2011 at 04:31:09PM -0400, Paolo Bonzini wrote:
> > > > > > > > > Why not limit the change to ppc then?
> > > > > > > >
> > > > > > > > Because the bug is masked by the x86 memory model, but it is
> > > > > > > > still there even there conceptually.  It is not really true
> > > > > > > > that x86 does not need memory barriers, though it doesn't in
> > > > > > > > this case:
> > > > > > > >
> > > > > > > > http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
> > > > > > > >
> > > > > > > > Paolo
> > > > > > >
> > > > > > > Right.  To summarize, on x86 we probably want wmb and rmb to
> > > > > > > be compiler barriers only.  Only mb might in theory need to be
> > > > > > > an mfence.
> > > > > >
> > > > > > No, wmb needs to be sfence and rmb needs to be lfence.  GCC does
> > > > > > not provide those, so they should become __sync_synchronize()
> > > > > > too, or you should use inline assembly.
> > > > > >
> > > > > > > But there might be reasons why that is not an issue either if
> > > > > > > we look closely enough.
> > > > > >
> > > > > > Since the ring buffers are not using locked instructions (no
> > > > > > xchg or cmpxchg) the barriers simply must be there, even on x86.
> > > > > > Whether it works in practice is not interesting, only the formal
> > > > > > model is interesting.
> > > > > >
> > > > > > Paolo
> > > > >
> > > > > Well, can you describe an issue in virtio that lfence/sfence help
> > > > > solve, in terms of a memory model, please?
> > > > > Please note that the guest uses the smp_ variants for barriers.
> > > > Ok, so, I'm having a bit of trouble with the fact that I'm having
> > > > to argue the case that things the protocol requires to be memory
> > > > barriers actually *be* memory barriers on all platforms.
> > > >
> > > > I mean, argue for a richer set of barriers, with per-arch minimal
> > > > implementations instead of the large but portable hammer of
> > > > sync_synchronize, if you will.
> > >
> > > That's what I'm saying, really.  On x86 the richer set of barriers
> > > need not insert code at all for both the wmb and rmb macros.  All we
> > > might need is an 'optimization barrier' - e.g. Linux does
> > > 	__asm__ __volatile__("": : :"memory")
> > > ppc needs something like sync_synchronize there.
> >
> > But you're approaching this the wrong way around - correctness should
> > come first.  That is, we should first ensure that there is a
> > sufficient memory barrier to satisfy the protocol.  Then, *if* there
> > is a measurable performance improvement and *if* we can show that a
> > weaker barrier is sufficient on a given platform, then we can whittle
> > it down to a lighter barrier.
>
> You are only looking at ppc.  But on x86 this code ships in
> production.  So changes should be made in a way that reduces the
> potential for regressions, balancing risk against potential benefit.
> I'm trying to point out a way to do this.
Oh, please.  Adding a stronger barrier has a minuscule chance of
breaking things.  And this in a project that has build-breaking
regressions with tedious frequency.

> > > > But just leaving them out on x86!?  Seriously, wtf?  Do you enjoy
> > > > having software that works chiefly by accident?
> > >
> > > I'm surprised at the controversy too.  People seem to argue that the
> > > x86 cpu does not reorder stores and that we need an sfence between
> > > stores to prevent the guest from seeing them out of order, at the
> > > same time.
> >
> > I don't know the x86 storage model well enough to definitively say
> > that the barrier is not necessary there - nor to say that it is
> > necessary.  All I know is that the x86 model is quite strongly
> > ordered, and I assume that is why the lack of barrier has not caused
> > an observed problem on x86.
>
> Please review Documentation/memory-barriers.txt as one reference,
> then look at how SMP barriers are implemented on various systems.
> In particular, note how it says 'Mandatory barriers should not be
> used to control SMP effects'.

No, again, correctness first; the onus of showing it's safe is on those
who want weaker barriers.

> > Again, correctness first.  sync_synchronize should be a sufficient
> > barrier for wmb() on all platforms.  If you really don't want it, the
> > onus is on you
>
> Just for fun, I did a quick hack replacing all barriers with mb()
> in the userspace virtio test.  This is on x86.
>
> Before:
> [mst@tuck virtio]$ sudo time ./virtio_test
> spurious wakeus: 0x1da
> 24.53user 14.63system 0:41.91elapsed 93%CPU (0avgtext+0avgdata
> 464maxresident)k
> 0inputs+0outputs (0major+154minor)pagefaults 0swaps
>
> After:
> [mst@tuck virtio]$ sudo time ./virtio_test
> spurious wakeus: 0x218
> 33.97user 6.22system 0:42.10elapsed 95%CPU (0avgtext+0avgdata
> 464maxresident)k
> 0inputs+0outputs (0major+154minor)pagefaults 0swaps
>
> So user time went up significantly, as expected.
> Surprisingly, the kernel side started working more efficiently -
> surprising since the kernel was not changed - with the net effect
> close to evening out.

Right.  So, a small overall performance impact, and that's on a
dedicated testcase which does nothing but the virtio protocol.  I
*strongly* suspect the extra cost of the memory barriers will be well
and truly lost in the rest of the overhead of the qemu networking code.

> So the risk of performance regressions from unnecessary fencing seems
> to be non-zero, assuming that time doesn't lie.  This might be worth
> investigating, but I'm out of time right now.
>
> > to show that (a) it's safe to do so and (b) it's actually worth it.
>
> Worth what?  I'm asking you to minimise disruption to other platforms
> while you fix ppc.

I'm not "fixing ppc".  I'm fixing a fundamental flaw in the protocol
implementation.  _So far_ I've only observed the effects on ppc, but
that doesn't mean they don't exist.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson