On Wed, 6 Sep 2006 10:58:15 -0700 (PDT)
[EMAIL PROTECTED] wrote:

> Hello All,
>
> I have a question about the use of the tx_ring->next_to_use variable in
> the e1000. Specifically, I'm wondering about a race between the use of
> next_to_use in e1000_xmit_frame and the clearing of next_to_use in
> e1000_down via e1000_clean_tx_ring.
>
> Thread 1 (_xmit) -> first = adapter->tx_ring.next_to_use;
>                     e1000_tx_map();
> Thread 2 (_down) -> e1000_clean_tx_ring();
>                     tx_ring->next_to_use = 0;
> Thread 1 (_xmit) -> e1000_tx_queue();
>
> It seems that tx_ring.next_to_use could change between the time the skbuff
> is mapped in e1000_tx_map and the time it is reported to the hardware in
> e1000_tx_queue. While I don't see any memory leaks or possible oops, it
> does seem possible that an skbuff could be "lost" in the ring, as it
> will not be queued in the subsequent e1000_tx_queue.
>
> If the race is possible, perhaps this could be the culprit behind the tx
> timeouts we've seen reported on this list? The watchdog will eventually
> find the "lost" skbuff, mistakenly conclude that the hardware transmit has
> hung, and stop the queue.
>
> Could one of you please tell me how this race is avoided, if indeed it is?
>
> Thanks,
> Shaw

e1000_down calls netif_stop_queue(), which stops new transmit requests.
It doesn't handle the case of a transmit already in flight during
e1000_down. Shouldn't clean_tx_ring acquire tx_ring->tx_lock to avoid that?

-- 
Stephen Hemminger <[EMAIL PROTECTED]>
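
To make the locking suggestion concrete, here is a rough user-space sketch, not driver code. Everything prefixed toy_ is hypothetical, a pthread mutex stands in for tx_ring->tx_lock, and the map and queue steps are collapsed into a few assignments. It assumes the transmit path holds the lock from the read of next_to_use through the queue step, so a teardown that takes the same lock cannot reset the index in between.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define TOY_RING_SIZE 8

/* Hypothetical user-space stand-in for the driver's tx ring. */
struct toy_tx_ring {
    pthread_mutex_t tx_lock;      /* plays the role of tx_ring->tx_lock */
    unsigned int next_to_use;     /* next free descriptor slot */
    unsigned int tail;            /* last index reported to the "hardware" */
    bool buffer_mapped[TOY_RING_SIZE];
    bool stopped;                 /* models the effect of netif_stop_queue() */
};

static void toy_ring_init(struct toy_tx_ring *r)
{
    pthread_mutex_init(&r->tx_lock, NULL);
    r->next_to_use = 0;
    r->tail = 0;
    r->stopped = false;
    for (unsigned int i = 0; i < TOY_RING_SIZE; i++)
        r->buffer_mapped[i] = false;
}

/*
 * Transmit path: read next_to_use, "map" the buffer, and "queue" it
 * while holding the lock, so the index cannot be reset midway.
 * Returns false if the ring has already been taken down.
 */
static bool toy_xmit_frame(struct toy_tx_ring *r)
{
    bool queued = false;

    pthread_mutex_lock(&r->tx_lock);
    if (!r->stopped) {
        unsigned int first = r->next_to_use;

        assert(first < TOY_RING_SIZE);
        r->buffer_mapped[first] = true;                 /* map step */
        r->next_to_use = (first + 1) % TOY_RING_SIZE;
        r->tail = r->next_to_use;                       /* queue step */
        queued = true;
    }
    pthread_mutex_unlock(&r->tx_lock);
    return queued;
}

/*
 * Teardown path: stop new transmits, then clean the ring under the
 * same lock, waiting out any transmit that is already in flight.
 */
static void toy_down(struct toy_tx_ring *r)
{
    pthread_mutex_lock(&r->tx_lock);
    r->stopped = true;
    for (unsigned int i = 0; i < TOY_RING_SIZE; i++)
        r->buffer_mapped[i] = false;
    r->next_to_use = 0;
    r->tail = 0;
    pthread_mutex_unlock(&r->tx_lock);
}
```

With both paths serialized on one lock, a frame is either fully queued before the clean or rejected after it, so no mapped buffer is left behind unreported. Whether taking tx_lock in the real clean path is safe in every context e1000_down runs in is a separate question this model does not answer.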