On Mon, 2012-05-21 at 18:58 +1000, Benjamin Herrenschmidt wrote:
> Except that you mostly don't know at that level what you can or cannot
> do, it depends on the caller. We should have the standard accessors do
> it the "safe" way and have performance sensitive stuff do map/unmap, at
> least that's the result of the discussions with Anthony.
>
> If we can address the virtqueue_map_sg problem, I think we should be
> good, I'll look at it tomorrow. Maybe the right way for now is to remove
> the barrier I added to "map" and only leave the one in _rw

One thing that might alleviate some of your concerns would be to
"remember" in a global (to be replaced with a thread var eventually) the
last transfer direction and use a simple test to choose the barrier,
i.e. store + store -> wmb, load + load -> rmb, other -> mb.

But first I'd be curious whether some x86 folks could actually measure
the impact of the patch as I proposed it. That would give us an idea of
how bad the performance problem is and how far we need to go to address
it.

Cheers,
Ben.
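
To make the "remember the last direction" idea concrete, here is a rough
sketch, not the actual patch. All names (dma_dir, barrier_kind,
pick_barrier, dma_barrier_before) are made up for illustration, and the
C11 fences are only approximate stand-ins for wmb/rmb/mb; real code would
use whatever barrier primitives the tree provides.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names, for illustration only. */
typedef enum { DMA_DIR_NONE, DMA_DIR_LOAD, DMA_DIR_STORE } dma_dir;
typedef enum { BARRIER_MB, BARRIER_RMB, BARRIER_WMB } barrier_kind;

/* Last transfer direction. A plain global for now; it would become a
 * per-thread variable eventually. */
static dma_dir last_dir = DMA_DIR_NONE;

/* store + store -> wmb, load + load -> rmb, anything else -> mb */
static barrier_kind pick_barrier(dma_dir prev, dma_dir next)
{
    if (prev == DMA_DIR_STORE && next == DMA_DIR_STORE) {
        return BARRIER_WMB;
    }
    if (prev == DMA_DIR_LOAD && next == DMA_DIR_LOAD) {
        return BARRIER_RMB;
    }
    return BARRIER_MB; /* mixed directions, or no previous transfer */
}

/* Issue the chosen barrier before a transfer in direction 'next',
 * then remember that direction. Returns the kind that was issued. */
static barrier_kind dma_barrier_before(dma_dir next)
{
    assert(next != DMA_DIR_NONE);

    barrier_kind kind = pick_barrier(last_dir, next);

    switch (kind) {
    case BARRIER_WMB:
        /* Assumption: release fence used as a stand-in for wmb. */
        atomic_thread_fence(memory_order_release);
        break;
    case BARRIER_RMB:
        /* Assumption: acquire fence used as a stand-in for rmb. */
        atomic_thread_fence(memory_order_acquire);
        break;
    default:
        /* Full barrier. */
        atomic_thread_fence(memory_order_seq_cst);
        break;
    }

    last_dir = next;
    return kind;
}
```

The first transfer always gets the full barrier since nothing is known
about what came before it. Whether the extra test and the global access
cost less than an unconditional mb is exactly what would need measuring.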