> -----Original Message-----
> From: Saurabh Singhal <saura...@arista.com>
> Sent: Monday, September 4, 2023 9:01 PM
> To: Thomas Monjalon <tho...@monjalon.net>; Wu, Jingjing
> <jingjing...@intel.com>; Xing, Beilei <beilei.x...@intel.com>
> Cc: dev@dpdk.org; Singhal, Saurabh <saura...@arista.com>
> Subject: [PATCH] net/iavf: unregister intr handler before FD close
>
> Unregister VFIO interrupt handler before the interrupt fd gets closed in case
> iavf_dev_init() returns an error.
>
> dpdk creates a standalone thread named eal-intr-thread for processing
> interrupts for the PCI devices. The interrupt handler callbacks are registered
> by the VF driver(iavf, in this case).
>
> When we do a PCI probe of the network interfaces, we register an interrupt
> handler, open a vfio-device fd using ioctl, and an eventfd in dpdk. These
> interrupt sources are registered in a global linked list that the
> eal-intr-thread
> keeps iterating over for handling the interrupts. In our internal testing, we
> see
> eal-intr-thread crash in these two ways:
>
> Error adding fd 660 epoll_ctl, Operation not permitted
>
> or
>
> Error adding fd 660 epoll_ctl, Bad file descriptor
>
> epoll_ctl() returns EPERM if the target fd does not support poll.
> It returns EBADF when the epoll fd itself is closed or the target fd is
> closed.
>
> When the first type of crash happens, we see that the fd 660 is
> anon_inode:[vfio-device] which does not support poll.
>
> When the second type of crash happens, we could see from the fd map of
> the crashing process that the fd 660 was already closed.
>
> This means the said fd has been closed and in certain cases may have been
> reassigned to a different device by the operating system but the eal-intr-
> thread does not know about it.
>
> We observed that these crashes were always accompanied by an error in
> iavf_dev_init() after rte_intr_callback_register() and
> iavf_enable_irq0() have already happened. In the error path, the
> intr_handle_fd was being closed but the interrupt handler wasn't being
> unregistered.
>
> The fix is to unregister the interrupt handle in the
> iavf_dev_init() error path.
Thanks for all these explanations!
>
> Signed-off-by: Saurabh Singhal <saura...@arista.com>
> ---
> .mailmap | 1 +
> drivers/net/iavf/iavf_ethdev.c | 15 +++++++++++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/.mailmap b/.mailmap
> index 864d33ee46..4dac53011b 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -1227,6 +1227,7 @@ Satananda Burla <sbu...@marvell.com> Satha Rao
> <skotesh...@marvell.com> <skotesh...@caviumnetworks.com> Satheesh
> Paul <psathe...@marvell.com> Sathesh Edara <sed...@marvell.com>
> +Saurabh Singhal <saura...@arista.com>
> Savinay Dharmappa <savinay.dharma...@intel.com> Scott Branden
> <scott.bran...@broadcom.com> Scott Daniels <dani...@research.att.com>
> diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
> index f2fc5a5621..df87a553db 100644
> --- a/drivers/net/iavf/iavf_ethdev.c
> +++ b/drivers/net/iavf/iavf_ethdev.c
> @@ -133,6 +133,8 @@ static int iavf_dev_rx_queue_intr_enable(struct
> rte_eth_dev *dev,
> uint16_t queue_id);
> static int iavf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev,
> uint16_t queue_id);
> +static void iavf_dev_interrupt_handler(void *param); static inline void
> +iavf_disable_irq0(struct iavf_hw *hw);
> static int iavf_dev_flow_ops_get(struct rte_eth_dev *dev,
> const struct rte_flow_ops **ops);
> static int iavf_set_mc_addr_list(struct rte_eth_dev *dev, @@ -2490,9
> +2492,22 @@ iavf_uninit_vf(struct rte_eth_dev *dev) {
> struct iavf_hw *hw = IAVF_DEV_PRIVATE_TO_HW(dev->data-
> >dev_private);
> struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(dev->data-
> >dev_private);
> + struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
> + struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
>
> iavf_shutdown_adminq(hw);
>
> + if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) {
> + /* disable uio intr before callback unregiser */
> + rte_intr_disable(intr_handle);
> +
> + /* unregister callback func from eal lib */
> + rte_intr_callback_unregister(intr_handle,
> + iavf_dev_interrupt_handler, dev);
> + } else {
> + rte_eal_alarm_cancel(iavf_dev_alarm_handler, dev);
> + }
> +
I assume the error handling should follow the reversed order.
So, can we move above code right after the goto label "flow_init_err", like
below?
....
iavf_enable_irq0(hw);
ret = iavf_flow_init(adapter);
if (ret) {
PMD_INIT_LOG(ERR, "Failed to initialize flow");
goto flow_init_err;
}
...
flow_init_err:
iavf_disable_irq0(hw)
if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR) {
...
}
Btw, I saw missing error handling in iavf_ipsec_crypto_supported branch which
should be fixed, if you are hesitating to apply above fix because this
inconsistent, please ignore this.
Thanks
Qi
> rte_free(vf->vf_res);
> vf->vsi_res = NULL;
> vf->vf_res = NULL;
> --
> 2.41.0