On Thu, Feb 08, 2018 at 04:34:13PM +0000, Matan Azrad wrote: > Following are the failure steps: > 1. The physical device is removed due to change in one of PF parameters > (e.g. MTU) 2. The interrupt thread flags the device 3. Within 2 seconds > Interrupt thread initializes the actual device removal, then every 2 > seconds it tries to re-sync (plug in) the device. The trials fail as > long as VF parameter mismatches the PF parameter. > 4. A control thread initiates a control operation on failsafe which > initiates this operation on the device. > 5. A race condition occurs between the control thread and interrupt > thread when accessing the device data structures. > > This patch mitigates the race condition in step 5. > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > Cc: sta...@dpdk.org > > Signed-off-by: Matan Azrad <ma...@mellanox.com> > --- > drivers/net/failsafe/failsafe.c | 2 +- > drivers/net/failsafe/failsafe_eal.c | 2 +- > drivers/net/failsafe/failsafe_ether.c | 2 +- > drivers/net/failsafe/failsafe_ops.c | 26 +++++++++++++++++-------- > drivers/net/failsafe/failsafe_private.h | 34 > ++++++++++++++++++++++++++------- > 5 files changed, 48 insertions(+), 18 deletions(-) > > diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c > index 7b2cdbb..6cdefd0 100644 > --- a/drivers/net/failsafe/failsafe.c > +++ b/drivers/net/failsafe/failsafe.c > @@ -187,7 +187,7 @@ > * If MAC address was provided as a parameter, > * apply to all probed slaves. > */ > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_PROBED) {
No need for the UNSAFE here. The ports should have been just initialized, and sdev->remove should be 0. If sdev->remove is 1, then it means it has been set already by a plugout event, meaning that rte_eth_dev_default_mac_addr_set should not even be called on it. > ret = rte_eth_dev_default_mac_addr_set(PORT_ID(sdev), > mac); > if (ret) { > diff --git a/drivers/net/failsafe/failsafe_eal.c > b/drivers/net/failsafe/failsafe_eal.c > index c3d6731..b3b9c32 100644 > --- a/drivers/net/failsafe/failsafe_eal.c > +++ b/drivers/net/failsafe/failsafe_eal.c > @@ -126,7 +126,7 @@ > int sdev_ret; > int ret = 0; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_PROBED) { > sdev_ret = rte_eal_hotplug_remove(sdev->bus->name, > sdev->dev->name); > if (sdev_ret) { > diff --git a/drivers/net/failsafe/failsafe_ether.c > b/drivers/net/failsafe/failsafe_ether.c > index ca42376..f2a52c9 100644 > --- a/drivers/net/failsafe/failsafe_ether.c > +++ b/drivers/net/failsafe/failsafe_ether.c > @@ -325,7 +325,7 @@ > struct sub_device *sdev; > uint8_t i; > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) > if (sdev->remove && fs_rxtx_clean(sdev)) { > fs_dev_stats_save(sdev); > fs_dev_remove(sdev); > diff --git a/drivers/net/failsafe/failsafe_ops.c > b/drivers/net/failsafe/failsafe_ops.c > index a7c2dba..3d2cb32 100644 > --- a/drivers/net/failsafe/failsafe_ops.c > +++ b/drivers/net/failsafe/failsafe_ops.c > @@ -181,6 +181,9 @@ > FOREACH_SUBDEV(sdev, i, dev) { > if (sdev->state != DEV_ACTIVE) > continue; > + if (sdev->remove == 1 && PRIV(dev)->state < DEV_STARTED) > + /* Application shouldn't start removed sub-devices. */ > + continue; FOREACH_SUBDEV should already have avoided sub-devices which remove flag is 1. If not, then the fs_err(sdev, ret) stanza right after will let the loop continue, and the port should be handled by the next slave cleanup. > DEBUG("Starting sub_device %d", i); > ret = rte_eth_dev_start(PORT_ID(sdev)); > if (ret) { > @@ -265,10 +268,17 @@ > uint8_t i; > > failsafe_hotplug_alarm_cancel(dev); > - if (PRIV(dev)->state == DEV_STARTED) > + if (PRIV(dev)->state == DEV_STARTED) { > + /* > + * Clean remove flags to allow stop for all sub-devices because > + * there is not hot-plug alarm to stop the removed sub-devices. > + */ > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_STARTED) > + sdev->remove = 0; Why make this conditional to state == DEV_STARTED? > dev->dev_ops->dev_stop(dev); > + } > PRIV(dev)->state = DEV_ACTIVE - 1; > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { > DEBUG("Closing sub_device %d", i); > rte_eth_dev_close(PORT_ID(sdev)); > sdev->state = DEV_ACTIVE - 1; --> /* * Clean remove flags to allow stop for all sub-devices because * there is no hot-plug alarm to clean the removed sub-devices. */ FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) sdev->remove = 0; if (PRIV(dev)->state == DEV_STARTED) dev->dev_ops->dev_stop(dev); PRIV(dev)->state = DEV_ACTIVE - 1; FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { DEBUG("Closing sub_device %d", i); rte_eth_dev_close(PORT_ID(sdev)); sdev->state = DEV_ACTIVE - 1; > @@ -309,7 +319,7 @@ > if (rxq->event_fd > 0) > close(rxq->event_fd); > dev = rxq->priv->dev; > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) No need here, as you would have reset sdev->remove if the port was closing, or it would be dealt with by fs_dev_remove if the alarm is still running. > SUBOPS(sdev, rx_queue_release) > (ETH(sdev)->data->rx_queues[rxq->qid]); > dev->data->rx_queues[rxq->qid] = NULL; > @@ -376,7 +386,7 @@ you really should update your git, it is difficult to verify these changes without the function contexts. > return ret; > rxq->event_fd = intr_handle.efds[0]; > dev->data->rx_queues[rx_queue_id] = rxq; > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { Why should we setup queues on ports marked for removal? > ret = rte_eth_rx_queue_setup(PORT_ID(sdev), > rx_queue_id, > nb_rx_desc, socket_id, > @@ -493,7 +503,7 @@ > return; > txq = queue; > dev = txq->priv->dev; > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) Same as for rx_queue_release: either the device is closing, and the flag has been reset, or the alarm is still running and will take care of this. > SUBOPS(sdev, tx_queue_release) > (ETH(sdev)->data->tx_queues[txq->qid]); Actually, now that I think about it, there seems to be an issue with queues not released on plugout? In fs_dev_remove, we only do the general dev_stop and dev_close on the sub_device. shouldn't we call tx_queue_release as well before? > dev->data->tx_queues[txq->qid] = NULL; > @@ -548,7 +558,7 @@ > txq->info.nb_desc = nb_tx_desc; > txq->priv = PRIV(dev); > dev->data->tx_queues[tx_queue_id] = txq; > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { Why using the UNSAFE operator for a setup operation? (Same as for rx_queue_setup.) > ret = rte_eth_tx_queue_setup(PORT_ID(sdev), > tx_queue_id, > nb_tx_desc, socket_id, > @@ -663,7 +673,7 @@ > int ret; > > rte_memcpy(stats, &PRIV(dev)->stats_accumulator, sizeof(*stats)); > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_ACTIVE) { Why do you want to attempt a stat read on port bound for removal? > struct rte_eth_stats *snapshot = &sdev->stats_snapshot.stats; > uint64_t *timestamp = &sdev->stats_snapshot.timestamp; > > @@ -746,7 +756,7 @@ > > rx_offload_capa = default_infos.rx_offload_capa; > rxq_offload_capa = default_infos.rx_queue_offload_capa; > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_PROBED) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, DEV_PROBED) { same here. > rte_eth_dev_info_get(PORT_ID(sdev), > &PRIV(dev)->infos); > rx_offload_capa &= PRIV(dev)->infos.rx_offload_capa; > diff --git a/drivers/net/failsafe/failsafe_private.h > b/drivers/net/failsafe/failsafe_private.h > index f3be152..7ddd63a 100644 > --- a/drivers/net/failsafe/failsafe_private.h > +++ b/drivers/net/failsafe/failsafe_private.h > @@ -244,16 +244,31 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id, > ((sdev)->sid) > > /** > - * Stateful iterator construct over fail-safe sub-devices: > + * Stateful iterator construct over fail-safe sub-devices, > + * including the removed sub-devices: "including sub-devices marked for removal" is more correct here, as the device is not actually removed yet, only scheduled for. > + * s: (struct sub_device *), iterator > + * i: (uint8_t), increment > + * dev: (struct rte_eth_dev *), fail-safe ethdev > + * state: (enum dev_state), minimum acceptable device state > + */ > + Here the same documentation as for other macros: parameters type, quick description of what it does. > +#define FOREACH_SUBDEV_STATE_UNSAFE(s, i, dev, state) \ > + for (s = fs_find_next((dev), 0, state, 0, &i); \ > + s != NULL; \ > + s = fs_find_next((dev), i + 1, state, 0, &i)) > + > +/** > + * Stateful iterator construct over fail-safe sub-devices, > + * except the removed sub-devices: > * s: (struct sub_device *), iterator > * i: (uint8_t), increment > * dev: (struct rte_eth_dev *), fail-safe ethdev > * state: (enum dev_state), minimum acceptable device state > */ > #define FOREACH_SUBDEV_STATE(s, i, dev, state) \ > - for (s = fs_find_next((dev), 0, state, &i); \ > + for (s = fs_find_next((dev), 0, state, 1, &i); \ > s != NULL; \ > - s = fs_find_next((dev), i + 1, state, &i)) > + s = fs_find_next((dev), i + 1, state, 1, &i)) > > /** > * Iterator construct over fail-safe sub-devices: > @@ -262,7 +277,7 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id, > * dev: (struct rte_eth_dev *), fail-safe ethdev > */ > #define FOREACH_SUBDEV(s, i, dev) \ > - FOREACH_SUBDEV_STATE(s, i, dev, DEV_UNDEFINED) > + FOREACH_SUBDEV_STATE_UNSAFE(s, i, dev, DEV_UNDEFINED) No actually, the default case should be using the "SAFE" iterator, so no change needed here. > > /* dev: (struct rte_eth_dev *) fail-safe device */ > #define PREFERRED_SUBDEV(dev) \ > @@ -328,6 +343,7 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id, > fs_find_next(struct rte_eth_dev *dev, > uint8_t sid, > enum dev_state min_state, > + uint8_t check_remove, skip_remove? Seems more descriptive. > uint8_t *sid_out) > { > struct sub_device *subs; > @@ -336,8 +352,12 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id, > subs = PRIV(dev)->subs; > tail = PRIV(dev)->subs_tail; > while (sid < tail) { > - if (subs[sid].state >= min_state) > - break; > + if (subs[sid].state >= min_state) { > + if (check_remove == 0) > + break; > + if (PRIV(dev)->subs[sid].remove == 0) > + break; > + } > sid++; > } > *sid_out = sid; > @@ -376,7 +396,7 @@ int failsafe_eth_lsc_event_callback(uint16_t port_id, > uint8_t i; > > /* Using acceptable device */ > - FOREACH_SUBDEV_STATE(sdev, i, dev, req_state) { > + FOREACH_SUBDEV_STATE_UNSAFE(sdev, i, dev, req_state) { Why should we switch emitting device to one marked for removal? > if (sdev == banned) > continue; > DEBUG("Switching tx_dev to sub_device %d", > -- > 1.8.3.1 > Regards, -- Gaëtan Rivet 6WIND