On Wed, Jun 06, 2007 at 04:52:15PM -0700, David Miller wrote:
> For the locking it makes a ton of sense.
>
> If you have sendmsg() calls going on N cpus, would you rather
> they:
>
> 1) All queue up to the single netdev->tx_lock
>
> or
>
> 2) All take local per-hw-queue locks
>
> to transmit the data they are sending?
>
> I thought this was obvious... guess not :-)
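Restating that choice in code, here is a minimal userspace sketch of the difference between a single device-wide tx_lock and per-hw-queue locks. It is only an illustration: pthread mutexes stand in for kernel spinlocks, and the struct and function names (fake_netdev, fake_hw_queue, xmit_*) are made up for the example, not real driver API.

/*
 * Illustrative userspace sketch (not driver code): contrast a single
 * per-device transmit lock with per-hardware-queue locks.
 */
#include <pthread.h>
#include <stdio.h>

#define NUM_TX_QUEUES 4

struct fake_hw_queue {
	pthread_mutex_t lock;		/* per-queue lock: option 2 */
	unsigned long pkts;
};

struct fake_netdev {
	pthread_mutex_t tx_lock;	/* single device-wide lock: option 1 */
	struct fake_hw_queue txq[NUM_TX_QUEUES];
};

/* Option 1: every sender on every CPU contends on the same lock. */
static void xmit_single_lock(struct fake_netdev *dev, int queue)
{
	pthread_mutex_lock(&dev->tx_lock);
	dev->txq[queue].pkts++;		/* stand-in for queueing + doorbell */
	pthread_mutex_unlock(&dev->tx_lock);
}

/* Option 2: senders hitting different queues never touch the same lock. */
static void xmit_per_queue_lock(struct fake_netdev *dev, int queue)
{
	struct fake_hw_queue *txq = &dev->txq[queue];

	pthread_mutex_lock(&txq->lock);
	txq->pkts++;
	pthread_mutex_unlock(&txq->lock);
}

int main(void)
{
	struct fake_netdev dev;
	int i;

	pthread_mutex_init(&dev.tx_lock, NULL);
	for (i = 0; i < NUM_TX_QUEUES; i++) {
		pthread_mutex_init(&dev.txq[i].lock, NULL);
		dev.txq[i].pkts = 0;
	}

	xmit_single_lock(&dev, 0);
	xmit_per_queue_lock(&dev, 1);
	printf("q0=%lu q1=%lu\n", dev.txq[0].pkts, dev.txq[1].pkts);
	return 0;
}

With N senders on N CPUs, option 1 serializes every transmit behind one lock, while option 2 only serializes senders that land on the same hardware queue.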
Agreed ++

For my part, I definitely want to see parallel Tx as well as parallel Rx. It's the only thing that makes sense for modern multi-core CPUs.

Two warning flags are raised in my brain, though:

1) You need (a) well-designed hardware _and_ (b) a smart driver writer to avoid bottlenecking on internal driver locks. As you can see, we have both (a) and (b) for tg3 ;-) But it is up in the air whether a multi-TX-queue scheme can be sanely locked internally on other hardware. At the moment we have to hope Intel gets it right in their driver...

2) I fear that the getting-it-into-the-Tx-queue part will take some thought to make this happen, too. Just like the SMP/SMT/multi-core scheduler schedules various resources, surely we will want some smarts so that sockets are not bouncing wildly across CPUs, absent other factors outside our control. Otherwise you negate a lot of the value of the nifty multi-TX-lock driver API by bouncing data across CPUs on each transmit anyway. IOW, you will have to sanely fill each of the TX queues (see the sketch below).

	Jeff
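As a rough illustration of that queue-filling point, here is a minimal sketch of picking a TX queue from a flow hash, so one socket's packets keep landing on the same queue (and ideally the same CPU) instead of bouncing around. The hash, the field names, and select_tx_queue() are assumptions for the example only, not the actual stack code.

/*
 * Illustrative sketch (not kernel code): map a flow's 4-tuple to a TX
 * queue so the same flow always hits the same queue.
 */
#include <stdint.h>
#include <stdio.h>

#define NUM_TX_QUEUES 4

struct flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Toy 4-tuple hash; real code would use a proper hash function. */
static unsigned int flow_hash(const struct flow *f)
{
	uint32_t h = f->saddr ^ f->daddr;

	h ^= ((uint32_t)f->sport << 16) | f->dport;
	h ^= h >> 16;
	h *= 0x9e3779b1u;	/* mix the bits a little */
	return h;
}

static unsigned int select_tx_queue(const struct flow *f)
{
	return flow_hash(f) % NUM_TX_QUEUES;
}

int main(void)
{
	struct flow f = { 0x0a000001, 0x0a000002, 12345, 80 };

	/* The same flow always maps to the same queue. */
	printf("flow -> txq %u\n", select_tx_queue(&f));
	printf("flow -> txq %u (again)\n", select_tx_queue(&f));
	return 0;
}

The point is only that queue selection has to be sticky per flow; whether the stickiness comes from a hash, from the socket remembering its queue, or from the scheduler keeping the task on one CPU is exactly the open question above.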