I have run into an issue with DPDK 2.1.0 that occasionally prevents the link 
status interrupt callback from being called after an interface is started for 
the first time. I traced the problem to eth_igb_link_update(), which 
determines whether the link has changed state since the previous call. It 
appears that this function can be called concurrently from two different 
threads:

(1) From the main application/configuration thread, via rte_eth_dev_start(), 
which calls it through (*dev->dev_ops->link_update).
(2) From the EAL interrupt thread, via eth_igb_interrupt_action(), to check 
whether the link has transitioned up or down. The user callback is only 
executed if the link has changed state.

The race condition manifests itself as follows:
 - Main thread configures the interface with link status interrupt (LSI) 
enabled, sets up the queues etc.
 - Main thread calls rte_eth_dev_start(). The interface is started and then 
eth_igb_link_update() is called.
 - While in this call, the link goes up. Accordingly, the call detects the 
transition and writes the new link state (up) into the global rte_eth_dev 
struct.
 - The interrupt fires. The handler also calls eth_igb_link_update(), which 
finds that the global link status has already been set to up (no change).
 - Therefore, the handler thinks the interrupt was spurious, and the user 
callback doesn't get called.
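
To make the sequence concrete, here is a minimal single-threaded sketch in C 
that replays the interleaving above in a fixed order. It is only a model, not 
DPDK code: struct model_dev, model_link_update(), model_interrupt_action() and 
model_replay() are hypothetical names, and the bool "changed" return value is 
a simplification I chose for the model rather than the driver's actual return 
convention.

```c
#include <stdbool.h>

/* Hypothetical model of the shared per-port state; not a DPDK structure. */
struct model_dev {
    bool hw_link_up;     /* what the (simulated) hardware currently reports */
    bool stored_link_up; /* last status written by a link update */
    int  callbacks_run;  /* how many times the user callback has run */
};

/*
 * Model of a link update: store the current hardware status and report
 * whether it differs from the previously stored status.
 */
static bool model_link_update(struct model_dev *dev)
{
    bool changed = (dev->stored_link_up != dev->hw_link_up);

    dev->stored_link_up = dev->hw_link_up;
    return changed;
}

/* Model of the interrupt path: run the callback only on a state change. */
static void model_interrupt_action(struct model_dev *dev)
{
    if (model_link_update(dev))
        dev->callbacks_run++;
}

/*
 * Replay the problematic ordering. start_polls_link selects whether the
 * start path performs its own link update before the interrupt is handled.
 * Returns the number of user callbacks that ran.
 */
static int model_replay(bool start_polls_link)
{
    struct model_dev dev = { false, false, 0 };

    /* The link comes up while the start path is still running. */
    dev.hw_link_up = true;

    /* Main thread: the start path optionally polls the link. */
    if (start_polls_link)
        (void)model_link_update(&dev);

    /* Interrupt thread: handles the link status interrupt afterwards. */
    model_interrupt_action(&dev);

    return dev.callbacks_run;
}
```

In this model, model_replay(true) returns 0 because the start path has already 
consumed the transition, while model_replay(false) returns 1 because the 
interrupt path is the only one that observes it. The second case corresponds 
to skipping the link update in the start path when the link status interrupt 
is enabled, which is what the patch at the end of this mail does.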

I suspect that rte_eth_dev_start() shouldn't be checking the link state if 
the link status interrupt is enabled. Would someone mind taking a quick look 
at the patch below?

Thanks!
Tim

--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1300,7 +1300,7 @@ rte_eth_dev_start(uint8_t port_id)

        rte_eth_dev_config_restore(port_id);

-       if (dev->data->dev_conf.intr_conf.lsc != 0) {
+       if (dev->data->dev_conf.intr_conf.lsc == 0) {
                FUNC_PTR_OR_ERR_RET(*dev->dev_ops->link_update, -ENOTSUP);
                (*dev->dev_ops->link_update)(dev, 0);
        }

