David S. Miller wrote:
No, all of your CPUs are racing to acquire the transmit lock of the tg3 driver. Whoever wins the race gets to queue the packet; the others have to back off.
I believe tg3_tx() holds the tx_lock for too long in the case where 200 or so skbs are delivered. Maybe adding a netif_stop_queue(dev) at the beginning of this function would help? Or adding yet another bit in struct net_device to tell that one qdisc_restart() is in flight:

int qdisc_restart(struct net_device *dev)
{
	struct Qdisc *q = dev->qdisc;
	struct sk_buff *skb;

+	if (dev->qrestart_active)
+		return 1;

	/* Dequeue packet */
	if ((skb = q->dequeue(q)) != NULL) {
...
		{
+			dev->qrestart_active = 1;
			/* And release queue */
			spin_unlock(&dev->queue_lock);

			if (!netif_queue_stopped(dev)) {
				int ret;

				if (netdev_nit)
					dev_queue_xmit_nit(skb, dev);

				ret = dev->hard_start_xmit(skb, dev);
				if (ret == NETDEV_TX_OK) {
					if (!nolock) {
						dev->xmit_lock_owner = -1;
						spin_unlock(&dev->xmit_lock);
					}
					spin_lock(&dev->queue_lock);
+					dev->qrestart_active = 0;
					return -1;
				}
				if (ret == NETDEV_TX_LOCKED && nolock) {
					spin_lock(&dev->queue_lock);
+					dev->qrestart_active = 0;
					goto collision;
				}
			}

			/* NETDEV_TX_BUSY - we need to requeue */
			/* Release the driver */
			if (!nolock) {
				dev->xmit_lock_owner = -1;
				spin_unlock(&dev->xmit_lock);
			}
			spin_lock(&dev->queue_lock);
+			dev->qrestart_active = 0;
			q = dev->qdisc;
It does indeed look stupid that we resend the packet to the network taps repeatedly if we need to requeue. I wonder what a clean way to fix that would be. Probably the best idea is to grab a reference to the SKB if netdev_nit is set, before we send it off to the driver, and if the transmit succeeds we actually call dev_queue_xmit_nit(). Actually, that won't work, due to skb_header_cloned(). If we pass the packet to the network taps, or will later, we have to make sure skb_header_cloned() returns one, else TSO mangling of the TCP and IP headers by the driver will be seen by the network taps. So, this isn't easy to fix at all :)
Oh well... :( Given that only one skb (the first in the queue) could have the property "already given to taps", I believe we could use one bit at the queue level: if set (by a previous requeue), then the first packet *should not* be resent to the taps.