On Wed, Oct 10, 2007 at 02:25:50AM -0700, David Miller wrote:
> The chip I was working with at the time (UltraSPARC-IIi) compressed
> all the linear stores into 64-byte full cacheline transactions via
> the store buffer.

That's a pretty old CPU. Conclusions on more modern ones might be different.

> In fact, such a thing might not pan out well, because most of the time
> you write a single descriptor or two, and that isn't a full cacheline,
> which means a read/modify/write is the only coherent way to make such
> a write to RAM.

x86 WC does R-M-W and is coherent, of course. The main difference is just
that the result is not cached. When the hardware accesses the cache line,
the cache should also be invalidated.

> Sure you could batch, but I'd rather give the chip work to do unless
> I unequivocably knew I'd have enough pending to fill a cacheline's
> worth of descriptors. And since you suggest we shouldn't queue in
> software... :-)

Hmm, you're right, it would probably need to be coupled with batched
submission when multiple packets are available. Probably not worth doing
explicit queueing though.

I suppose it would be an interesting experiment at least.

-Andi
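
For illustration only, here is a rough sketch of the batching idea discussed
in this thread: hold descriptors back while the caller says more packets are
already available, and push them out either when a cacheline's worth has
accumulated or when nothing more is pending. Everything in it is an
assumption made for the example, not code from any real driver: the 16-byte
descriptor layout, the 64-byte cacheline, the names tx_ring, tx_queue and
tx_flush, and the plain array standing in for a write-combining mapped ring.
The doorbell is only a counter, and ring-full and completion handling are
left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes: 64-byte cacheline, 16-byte descriptor. */
#define CACHELINE_BYTES 64u
#define DESC_BYTES      16u
#define DESCS_PER_LINE  (CACHELINE_BYTES / DESC_BYTES)
#define RING_SIZE       64u   /* multiple of DESCS_PER_LINE */

/* Hypothetical descriptor layout, not taken from any real NIC. */
struct tx_desc {
    uint64_t addr;
    uint32_t len;
    uint32_t flags;
};
_Static_assert(sizeof(struct tx_desc) == DESC_BYTES, "descriptor size");

struct tx_ring {
    struct tx_desc staged[DESCS_PER_LINE]; /* held back in cached memory */
    unsigned nstaged;
    struct tx_desc ring[RING_SIZE];  /* stand-in for the WC-mapped ring */
    unsigned tail;                   /* next free slot in ring */
    unsigned doorbells;              /* simulated doorbell writes */
    unsigned full_line_flushes;      /* flushes that covered a whole line */
};

static void tx_ring_init(struct tx_ring *r)
{
    memset(r, 0, sizeof(*r));
}

/* Copy the staged descriptors to the ring and ring the doorbell once. */
static void tx_flush(struct tx_ring *r)
{
    assert(r->nstaged <= DESCS_PER_LINE);
    if (r->nstaged == 0)
        return;

    /* A full, aligned batch is the case that could go out as one
     * cacheline-sized write on a write-combining mapping. */
    if (r->nstaged == DESCS_PER_LINE && r->tail % DESCS_PER_LINE == 0)
        r->full_line_flushes++;

    for (unsigned i = 0; i < r->nstaged; i++)
        r->ring[(r->tail + i) % RING_SIZE] = r->staged[i];

    r->tail = (r->tail + r->nstaged) % RING_SIZE;
    r->nstaged = 0;
    r->doorbells++;
}

/* Stage one packet; flush when a line is full or nothing more is pending. */
static void tx_queue(struct tx_ring *r, uint64_t addr, uint32_t len,
                     int more_pending)
{
    struct tx_desc d = { addr, len, 0 };

    r->staged[r->nstaged++] = d;
    if (r->nstaged == DESCS_PER_LINE || !more_pending)
        tx_flush(r);
}
```

With these assumed sizes, four back-to-back packets queued with more_pending
set result in a single flush and a single doorbell, while a lone packet with
more_pending clear is submitted immediately, so the chip is never left idle
waiting for a batch that may not arrive.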