Re: [ofa-general] Re: [PATCH 2/3][NET_BATCH] net core use batching

David Miller Wed, 10 Oct 2007 02:26:12 -0700

From: Andi Kleen <[EMAIL PROTECTED]>
Date: Wed, 10 Oct 2007 11:16:44 +0200


> > A 256 entry TX hw queue fills up trivially on 1Gb and 10Gb, but if you
> 
> With TSO really? 

Yes.

> > increase the size much more performance starts to go down due to L2
> > cache thrashing.
> 
> Another possibility would be to consider using cache avoidance
> instructions while updating the TX ring (e.g. write combining 
> on x86) 

The chip I was working with at the time (UltraSPARC-IIi) compressed
all the linear stores into 64-byte full cacheline transactions via
the store buffer.

It's true that it would allocate in the L2 cache on a miss, which
is different from your suggestion.

In fact, such a thing might not pan out well, because most of the time
you write a single descriptor or two, and that isn't a full cacheline,
which means a read/modify/write is the only coherent way to make such
a write to RAM.
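
To make the arithmetic concrete, here is a minimal sketch of the check involved. The 64-byte cacheline comes from the store-buffer behaviour described above; the 16-byte descriptor size and the helper names are assumptions for illustration only, not taken from any real driver.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHELINE_BYTES 64u  /* 64-byte cacheline, as in the message above */
#define DESC_BYTES      16u  /* assumption: hypothetical TX descriptor size */
#define DESCS_PER_LINE  (CACHELINE_BYTES / DESC_BYTES)

/* True when the byte range [off, off + len) consists only of whole,
 * aligned cachelines, i.e. no line in the range is partially written. */
static bool covers_full_cachelines(size_t off, size_t len)
{
    return len != 0 &&
           off % CACHELINE_BYTES == 0 &&
           len % CACHELINE_BYTES == 0;
}

/* Same check expressed in ring terms: writing ndesc descriptors
 * starting at ring index first_desc (hypothetical helper). */
static bool desc_write_is_full_line(unsigned first_desc, unsigned ndesc)
{
    return covers_full_cachelines((size_t)first_desc * DESC_BYTES,
                                  (size_t)ndesc * DESC_BYTES);
}
```

Under these assumed sizes, a write of one or two descriptors never passes the check, so it touches only part of a line; only an aligned run of DESCS_PER_LINE descriptors (or a multiple of that) could go out as full-line stores.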

Sure you could batch, but I'd rather give the chip work to do unless
I unequivocally knew I'd have enough pending to fill a cacheline's
worth of descriptors.  And since you suggest we shouldn't queue in
software... :-)
