On Thu, Nov 02, 2017 at 03:42:03PM +0000, Matan Azrad wrote:
> Fail-safe PMD expects to get -ENODEV error value if sub PMD control
> command fails because of device removal.
>
> Make control callbacks return with -ENODEV when the device has
> disappeared.
>
> Signed-off-by: Matan Azrad <ma...@mellanox.com>

I think there are several inconsistencies regarding the places where
mlx4_removed() is used; this could lead to mistakes or redundant calls to
this function later on.

You have to choose between low-level internal functions
(e.g. mlx4_set_sysfs_ulong()) or user-facing ones from the eth_dev_ops
interface (e.g. mlx4_dev_set_link_up()), but neither intermediate functions
nor a mix of all approaches.

Standardizing on low-level functions is not practical as it means you'd have
to check for a device removal after each ibv_*() call. Therefore my
suggestion is to check it at the highest level, in all functions exposed
through mlx4_dev_ops in case of error, even innocuous ones like
mlx4_stats_get() and those returning void (rte_errno can still be set), all
in the name of consistency.

The mlx4_removed() documentation should be updated to reflect the places
it's supposed to be called as well.

All this means a larger patch is necessary. See below for coding style
issues.

> ---
>  drivers/net/mlx4/mlx4.h        |  1 +
>  drivers/net/mlx4/mlx4_ethdev.c | 38 ++++++++++++++++++++++++++++++++++----
>  drivers/net/mlx4/mlx4_flow.c   |  2 ++
>  drivers/net/mlx4/mlx4_intr.c   |  5 ++++-
>  drivers/net/mlx4/mlx4_rxq.c    |  1 +
>  drivers/net/mlx4/mlx4_txq.c    |  1 +
>  6 files changed, 43 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/mlx4/mlx4.h b/drivers/net/mlx4/mlx4.h
> index e0a9853..cac9654 100644
> --- a/drivers/net/mlx4/mlx4.h
> +++ b/drivers/net/mlx4/mlx4.h
> @@ -149,6 +149,7 @@ int mlx4_flow_ctrl_get(struct rte_eth_dev *dev,
>                         struct rte_eth_fc_conf *fc_conf);
>  int mlx4_flow_ctrl_set(struct rte_eth_dev *dev,
>                         struct rte_eth_fc_conf *fc_conf);
> +int mlx4_removed(const struct priv *priv);
>
>  /* mlx4_intr.c */
>
> diff --git a/drivers/net/mlx4/mlx4_ethdev.c b/drivers/net/mlx4/mlx4_ethdev.c
> index b0acd12..76914b0 100644
> --- a/drivers/net/mlx4/mlx4_ethdev.c
> +++ b/drivers/net/mlx4/mlx4_ethdev.c
> @@ -312,6 +312,8 @@
>
>          ret = mlx4_sysfs_write(priv, name, value_str, (sizeof(value_str) - 1));
>          if (ret < 0) {
> +                if (mlx4_removed(priv))
> +                        ret = -ENODEV;
>                  DEBUG("cannot write %s `%s' (%lu) to sysfs: %s",
>                        name, value_str, value, strerror(rte_errno));
>                  return ret;
> @@ -340,15 +342,19 @@
>
>          if (sock == -1) {
>                  rte_errno = errno;
> -                return -rte_errno;
> +                goto error;
>          }
>          ret = mlx4_get_ifname(priv, &ifr->ifr_name);
>          if (!ret && ioctl(sock, req, ifr) == -1) {
>                  rte_errno = errno;
> -                ret = -rte_errno;
> +                close(sock);
> +                goto error;
>          }
>          close(sock);
>          return ret;
> +error:
> +        mlx4_removed(priv);
> +        return -rte_errno;
>  }
>
>  /**
> @@ -473,13 +479,17 @@
>          if (up) {
>                  err = mlx4_set_flags(priv, ~IFF_UP, IFF_UP);
>                  if (err)
> -                        return err;
> +                        goto error;
>          } else {
>                  err = mlx4_set_flags(priv, ~IFF_UP, ~IFF_UP);
>                  if (err)
> -                        return err;
> +                        goto error;
>          }
>          return 0;
> +error:
> +        if (mlx4_removed(priv))
> +                return -ENODEV;
> +        return err;
>  }
>
>  /**
> @@ -947,6 +957,7 @@ enum rxmode_toggle {
>
>          ifr.ifr_data = (void *)&ethpause;
>          if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> +                mlx4_removed(priv);
>                  ret = rte_errno;
>                  WARN("ioctl(SIOCETHTOOL, ETHTOOL_GPAUSEPARAM)"
>                       " failed: %s",
> @@ -1002,6 +1013,7 @@ enum rxmode_toggle {
>          else
>                  ethpause.tx_pause = 0;
>          if (mlx4_ifreq(priv, SIOCETHTOOL, &ifr)) {
> +                mlx4_removed(priv);
>                  ret = rte_errno;
>                  WARN("ioctl(SIOCETHTOOL, ETHTOOL_SPAUSEPARAM)"
>                       " failed: %s",
> @@ -1013,3 +1025,21 @@ enum rxmode_toggle {
>          assert(ret >= 0);
>          return -ret;
>  }

Missing empty line.

> +/**
> + * Check if mlx4 device was removed.

"mlx4" is somewhat redundant given the PMD name. A separate paragraph should
describe where this function is supposed to be called.

> + *
> + * @param priv
> + *   Pointer to private structure.
> + *
> + * @return
> + *   -ENODEV when device is removed and rte_errno is set, otherwise 0.
> + */
> +int
> +mlx4_removed(const struct priv *priv)
> +{
> +        struct ibv_device_attr device_attr;
> +
> +        if (ibv_query_device(priv->ctx, &device_attr) == EIO)
> +                return -(rte_errno = ENODEV);

Although a nice shortcut, coding rules don't allow this. You have to assign
rte_errno on its own separate line. My suggestion if you want to avoid a
block would be to return 0 directly when != EIO.

> +        return 0;
> +}
> diff --git a/drivers/net/mlx4/mlx4_flow.c b/drivers/net/mlx4/mlx4_flow.c
> index 8b87b29..606c888 100644
> --- a/drivers/net/mlx4/mlx4_flow.c
> +++ b/drivers/net/mlx4/mlx4_flow.c
> @@ -1069,6 +1069,8 @@ struct mlx4_drop {
>          err = errno;
>          msg = "flow rule rejected by device";
>  error:
> +        if (mlx4_removed(priv))
> +                err = ENODEV;
>          return rte_flow_error_set
>                  (error, err, RTE_FLOW_ERROR_TYPE_HANDLE, flow, msg);
>  }
> diff --git a/drivers/net/mlx4/mlx4_intr.c b/drivers/net/mlx4/mlx4_intr.c
> index b17d109..0ebdb28 100644
> --- a/drivers/net/mlx4/mlx4_intr.c
> +++ b/drivers/net/mlx4/mlx4_intr.c
> @@ -359,7 +359,10 @@
>                  ret = EINVAL;
>          }
>          if (ret) {
> -                rte_errno = ret;
> +                if (mlx4_removed(dev->data->dev_private))
> +                        ret = ENODEV;
> +                else
> +                        rte_errno = ret;
>                  WARN("unable to disable interrupt on rx queue %d",
>                       idx);
>          } else {
> diff --git a/drivers/net/mlx4/mlx4_rxq.c b/drivers/net/mlx4/mlx4_rxq.c
> index 7fe21b6..43dad26 100644
> --- a/drivers/net/mlx4/mlx4_rxq.c
> +++ b/drivers/net/mlx4/mlx4_rxq.c
> @@ -832,6 +832,7 @@ void mlx4_rss_detach(struct mlx4_rss *rss)
>          ret = rte_errno;
>          mlx4_rx_queue_release(rxq);
>          rte_errno = ret;
> +        mlx4_removed(priv);
>          assert(rte_errno > 0);
>          return -rte_errno;
>  }
> diff --git a/drivers/net/mlx4/mlx4_txq.c b/drivers/net/mlx4/mlx4_txq.c
> index a9c5bd2..09bdfd8 100644
> --- a/drivers/net/mlx4/mlx4_txq.c
> +++ b/drivers/net/mlx4/mlx4_txq.c
> @@ -372,6 +372,7 @@ struct txq_mp2mr_mbuf_check_data {
>          ret = rte_errno;
>          mlx4_tx_queue_release(txq);
>          rte_errno = ret;
> +        mlx4_removed(priv);
>          assert(rte_errno > 0);
>          return -rte_errno;
>  }
> --
> 1.8.3.1
>

-- 
Adrien Mazarguil
6WIND
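
P.S. To make the two suggestions above concrete, here is a rough sketch, not
a tested patch. It shows mlx4_removed() returning 0 directly when the query
result is not EIO, with rte_errno assigned on its own line, plus a top-level
callback that only checks for removal on error. Everything needed to compile
it standalone is a stand-in and an assumption on my part: rte_errno is a
plain int here, struct priv is reduced to one field, and stub_query_device()
and example_dev_op() are hypothetical names replacing ibv_query_device() and
a real mlx4_dev_ops callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins so the sketch builds on its own; NOT the real DPDK/verbs API. */
static int rte_errno;              /* assumption: plain int instead of DPDK's rte_errno */
struct ibv_device_attr { int unused; };
struct priv { int query_result; }; /* hypothetical: value the stub query returns */

/* Hypothetical stub standing in for ibv_query_device(priv->ctx, &attr). */
static int
stub_query_device(const struct priv *priv, struct ibv_device_attr *attr)
{
	attr->unused = 0;
	return priv->query_result;
}

/*
 * Check if the device was removed.
 * Returns 0 directly when the query does not report EIO; otherwise
 * rte_errno is assigned on its own line before returning -ENODEV.
 */
static int
mlx4_removed(const struct priv *priv)
{
	struct ibv_device_attr device_attr;

	assert(priv != NULL);
	if (stub_query_device(priv, &device_attr) != EIO)
		return 0;
	rte_errno = ENODEV;
	return -rte_errno;
}

/*
 * Hypothetical top-level callback: low_level_ret is the result of some
 * internal helper (negative errno value on failure). The removal check
 * happens only here, and only in case of error.
 */
static int
example_dev_op(const struct priv *priv, int low_level_ret)
{
	if (low_level_ret < 0 && mlx4_removed(priv))
		return -ENODEV;
	return low_level_ret;
}
```

With this shape the internal helpers stay free of removal checks, and each
callback exposed through mlx4_dev_ops ends with the same short pattern.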