On Mon, May 21, 2012 at 07:16:27PM +1000, Benjamin Herrenschmidt wrote: > On Mon, 2012-05-21 at 19:07 +1000, Benjamin Herrenschmidt wrote: > > > One thing that might alleviate some of your concerns would possibly be > > to "remember" in a global (to be replaced with a thread var eventually) > > the last transfer direction and use a simple test to chose the barrier, > > ie, store + store -> wmb, load + load -> rmb, other -> mb.
But how do you know guest did a store? > > > > But first I'd be curious if some x86 folks could actually measure the > > impact of the patch as I proposed it. That would give us an idea of how > > bad the performance problem is and how far we need to go to address it. > > Another option.... go back to something more like the original patch, > ie, put the barrier in the new dma_* accessors (and provide a > non-barrier one while at it) rather than the low level cpu_physical_* > accessor. > > That makes it a lot easier for selected driver to be converted to avoid > the barrier in thing like code running in the vcpu context. It also > means that virtio doesn't get any added barriers which is what we want > as well. > > IE. Have something along the lines (based on the accessors added by the > iommu series) (using __ kernel-style, feel free to provide a better > naming) > > static inline int __dma_memory_rw( ... args ... ) > { > if (!dma_has_iommu(dma)) { > /* Fast-path for no IOMMU */ > cpu_physical_memory_rw( ... args ...); > return 0; > } else { > return iommu_dma_memory_rw( ... args ...); > } > } > > static inline int dma_memory_rw( ... args ... ) > { > smp_mb(); /* Or use finer grained as discussied earlier */ > > return __dma_memory_rw( ... args ... ) Heh. But don't we need an mb afterwards too? > } > > And corresponding __dma_memory_read/__dma_memory_write (again, feel > free to suggest a more "qemu'ish" naming if you don't like __, it's > a kernel habit, not sure what you guys do in qemu land). > > Cheers, > Ben. And my preference is to first convert everyone to __ variants and carefully switch devices to the barrier version after a bit of consideration. -- MST