On Wed, 2007-08-08 at 21:42 +0800, Herbert Xu wrote:
> On Wed, Aug 08, 2007 at 03:49:00AM -0700, David Miller wrote:
> >
> > Not because I think it obviates your work, but rather because I'm
> > curious, could you test a TSO-in-hardware driver converted to
> > batching and see how TSO alone compares to batching for a pure
> > TCP workload?
>
> You could even lower the bar by disabling TSO and enabling
> software GSO.
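For anyone who wants to try that combination, both knobs can be flipped
at runtime with ethtool; a minimal sketch, assuming the interface under
test is named eth0:

    # turn off hardware TSO, turn on software GSO
    ethtool -K eth0 tso off
    ethtool -K eth0 gso on

    # verify the resulting offload settings
    ethtool -k eth0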
From my observation, for TCP packets slightly above the MTU (up to 2K),
GSO gives worse performance than non-GSO throughput-wise. Actually this
has nothing to do with batching; the behavior is consistent with or
without the batching changes.

> > I personally don't think it will help for that case at all as
> > TSO likely does better job of coalescing the work _and_ reducing
> > bus traffic as well as work in the TCP stack.
>
> I agree.
> I suspect the bulk of the effort is in getting
> these skb's created and processed by the stack so that by
> the time that they're exiting the qdisc there's not much
> to be saved anymore.

pktgen shows a clear win if you test the driver path - which is what
you should test, because that's where the batching changes are (a
minimal sketch is appended below the footnote). Using TCP or UDP adds
other variables[1] that need to be isolated first in order to quantify
the effect of batching. For throughput and CPU utilization, the benefit
will be clear when there are a lot more flows.

cheers,
jamal

[1] I think there are unfortunately too many other variables in play
when you are dealing with a path that starts above the driver and one
that covers end-to-end effects: the traffic/app source, system clock
sources as per my recent discovery, the congestion control algorithms
used, tuning of the receiver, etc.
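P.S. a minimal sketch of the kind of pktgen run mentioned above,
assuming the pktgen module is available and eth0 is the device under
test; the destination IP/MAC and the packet count are placeholders:

    modprobe pktgen

    # bind the device under test to the first pktgen kernel thread
    echo "rem_device_all" > /proc/net/pktgen/kpktgend_0
    echo "add_device eth0" > /proc/net/pktgen/kpktgend_0

    # one million minimum-size frames, sent back to back (delay 0)
    echo "count 1000000" > /proc/net/pktgen/eth0
    echo "pkt_size 60"   > /proc/net/pktgen/eth0
    echo "delay 0"       > /proc/net/pktgen/eth0
    echo "dst 10.0.0.2"              > /proc/net/pktgen/eth0
    echo "dst_mac 00:04:23:08:91:dc" > /proc/net/pktgen/eth0

    # start the run, then read back pps and throughput from the
    # device's result line
    echo "start" > /proc/net/pktgen/pgctrl
    cat /proc/net/pktgen/eth0

Since pktgen injects packets directly at the driver's transmit path,
it exercises exactly the code the batching patches touch, without the
stack-level variables noted in [1].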