> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Andi Kleen
> Sent: Friday, August 19, 2005 9:33 AM
> To: John Heffner
> Cc: Wael Noureddine; David S. Miller; [EMAIL PROTECTED];
> [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
> netdev@vger.kernel.org; [EMAIL PROTECTED]
> Subject: Re: [PATCH] TCP Offload (TOE) - Chelsio
>
> > I'm personally not a big fan of TSO or TOE. They both add a lot of
> > complexity to the network stack, and have other downsides. The *best*
> > way to solve these problems is to engineer technologies to use larger
> > packet sizes. Even at 9k (or better yet 16k) the advantage of these
> > offload schemes is vanishingly small. (Though if a TOE can do
> > zero-copy receive, this is a win over what currently exists, but I
> > think there are other ways to do that as well.) The Linux kernel may
> > not be able to do too much to encourage deployment of larger MTUs,
> > but NIC vendors probably can.
This is already done, both on the hardware and on the OS side. All 10GbE
NICs, and the vast majority of GbE NICs and switches/routers, support 9k
jumbo frames in a fully interoperable fashion in LAN and WAN environments.
A 16k MTU is more controversial due to CRC32 and other issues, but you are
correct that a 9k MTU (or even 8k, if one wants to stay within a two-page
allocation) captures the sweet spot. All operating systems (except one,
and hopefully not for long) support jumbo frames out of the box.
So the hardware capability is there; it is just that in some rare cases
users can't, or are unwilling to, configure jumbo frames for the entire
path - and this is the case that stateless and state-aware NICs (as well
as TOE engines) are trying to address.

> Hmm - but is a 9k or 16k packet on the wire not equivalent to a micro
> burst? (Actually it is not that micro compared to 1.5k packets.) At
> least against burstiness they don't help, and they make things even
> worse because the bursts cannot be split up anymore.
>
> Actually I think there is still much potential to lower the CPU
> overhead of individual packets (e.g. by optimizing the cache latencies
> of fetching headers and writing TX rings, and using per-CPU MSIs
> aggressively for TX completion interrupts). So it might be possible to
> do much better even with small packets, even for TX. For RX there is
> even more relatively low-hanging fruit, given some NIC support (however
> it will need some limited amount of state in the NIC).
>
> -Andi
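
On the interrupt point above: the existing knob for steering a device's
(MSI) interrupt onto a particular CPU is /proc/irq/<nr>/smp_affinity; what
Andi describes would in effect give each CPU its own TX completion vector.
A minimal, purely illustrative sketch of that existing knob follows - the
IRQ number and CPU mask are placeholders, and the real IRQ has to be looked
up in /proc/interrupts for the NIC in question:

/* Illustrative only: steer one IRQ to CPU 1 by writing a hex CPU mask
 * to /proc/irq/<nr>/smp_affinity (needs root). */
#include <stdio.h>

int main(void)
{
	int irq = 50;            /* placeholder: the NIC's MSI vector, see /proc/interrupts */
	unsigned int mask = 0x2; /* CPU 1 */
	char path[64];
	FILE *f;

	snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return 1;
	}
	fprintf(f, "%x\n", mask);
	return fclose(f) ? 1 : 0;
}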
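
And on the jumbo-frame side, for anyone wondering what "configuring jumbo
frames for the entire path" amounts to on the host: raising an interface's
MTU boils down to the SIOCSIFMTU ioctl, which is what ifconfig/ip issue
under the hood. A rough sketch, with the interface name as a placeholder
and no claim that any particular driver accepts 9000:

/* Illustrative only: set an interface's MTU to 9000 via SIOCSIFMTU and
 * read it back via SIOCGIFMTU.  The driver and the rest of the path
 * still have to support jumbo frames for this to be useful. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

int main(int argc, char **argv)
{
	const char *ifname = argc > 1 ? argv[1] : "eth0"; /* placeholder */
	struct ifreq ifr;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0) {
		perror("socket");
		return 1;
	}
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
	ifr.ifr_mtu = 9000;                    /* jumbo MTU */
	if (ioctl(fd, SIOCSIFMTU, &ifr) < 0) { /* needs CAP_NET_ADMIN */
		perror("SIOCSIFMTU");
		close(fd);
		return 1;
	}
	if (ioctl(fd, SIOCGIFMTU, &ifr) == 0)
		printf("%s mtu %d\n", ifr.ifr_name, ifr.ifr_mtu);
	close(fd);
	return 0;
}

Whether jumbo frames then actually make it end to end can be checked with
something like "ping -M do -s 8972 <host>" (8972 = 9000 minus the IP and
ICMP headers, with DF set); a silent drop there usually means some hop in
the path is still at 1500.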