[PATCH v4] net/iavf: unregister intr handler before FD close
Zhang, Qi Z
qi.z.zhang at intel.com
Mon Sep 11 02:55:40 CEST 2023
> -----Original Message-----
> From: Saurabh Singhal <saurabhs at arista.com>
> Sent: Thursday, September 7, 2023 11:15 AM
> To: Thomas Monjalon <thomas at monjalon.net>; Wu, Jingjing
> <jingjing.wu at intel.com>; Xing, Beilei <beilei.xing at intel.com>
> Cc: dev at dpdk.org; Singhal, Saurabh <saurabhs at arista.com>
> Subject: [PATCH v4] net/iavf: unregister intr handler before FD close
>
> Unregister VFIO interrupt handler before the interrupt fd gets closed in case
> iavf_dev_init() returns an error.
>
> DPDK creates a standalone thread named eal-intr-thread for processing
> interrupts for the PCI devices. The interrupt handler callbacks are registered
> by the VF driver (iavf, in this case).
>
> When we do a PCI probe of the network interfaces, we register an interrupt
> handler, open a vfio-device fd using ioctl, and create an eventfd in DPDK. These
> interrupt sources are registered in a global linked list that the eal-intr-thread
> keeps iterating over for handling the interrupts. In our internal testing, we see
> eal-intr-thread crash in these two ways:
>
> Error adding fd 660 epoll_ctl, Operation not permitted
>
> or
>
> Error adding fd 660 epoll_ctl, Bad file descriptor
>
> epoll_ctl() returns EPERM if the target fd does not support poll.
> It returns EBADF when the epoll fd itself is closed or the target fd is closed.
>
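The two errno values described above can be reproduced outside DPDK. The following is a minimal sketch for Linux, not code from the patch: the helper names (try_epoll_add, demo_closed_fd, demo_unpollable_fd) are made up for illustration, and a directory stands in for the vfio-device fd as an example of a file that does not support poll.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Try to add fd to epfd; return 0 on success, otherwise the errno value. */
static int try_epoll_add(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN };

    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
        return errno;
    return 0;
}

/* Adding an fd that was already closed is expected to fail with EBADF. */
static int demo_closed_fd(void)
{
    /* Create the epoll fd first so it cannot reuse the closed fd number. */
    int epfd = epoll_create1(0);
    int fd = eventfd(0, 0);
    int err;

    if (epfd < 0 || fd < 0)
        return -1;
    close(fd);
    err = try_epoll_add(epfd, fd);
    close(epfd);
    return err;
}

/* Adding an fd that does not support poll is expected to fail with EPERM. */
static int demo_unpollable_fd(void)
{
    int epfd = epoll_create1(0);
    int fd = open("/", O_RDONLY); /* a directory has no poll support */
    int err;

    if (epfd < 0 || fd < 0)
        return -1;
    err = try_epoll_add(epfd, fd);
    close(fd);
    close(epfd);
    return err;
}
```

Note that demo_closed_fd() creates the epoll fd before the eventfd on purpose: the kernel hands out the lowest free fd number, so a closed fd number is reused by the next open, which is the same reuse hazard the commit message describes.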
> When the first type of crash happens, we see that the fd 660 is
> anon_inode:[vfio-device] which does not support poll.
>
> When the second type of crash happens, we could see from the fd map of
> the crashing process that the fd 660 was already closed.
>
> This means the said fd has been closed and, in certain cases, may have been
> reassigned to a different device by the operating system, but the
> eal-intr-thread does not know about it.
>
> We observed that these crashes were always accompanied by an error in
> iavf_dev_init() after rte_intr_callback_register() and
> iavf_enable_irq0() have already happened. In the error path, the
> intr_handle_fd was being closed but the interrupt handler wasn't being
> unregistered.
>
> The fix is to unregister the interrupt handler in the
> iavf_dev_init() error path.
>
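The ordering issue can be illustrated with a toy model. This is a sketch under stated assumptions, not the iavf code: the watch list and every function name below are hypothetical stand-ins for the global interrupt source list and the register/unregister calls, and the "failed init" is simulated. It only shows why closing the fd without unregistering first leaves a stale entry for the polling thread.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define MAX_WATCHED 16

/* Hypothetical stand-in for the global list of interrupt sources. */
static int watched[MAX_WATCHED];
static int n_watched;

static void watch_register(int fd)
{
    if (n_watched < MAX_WATCHED)
        watched[n_watched++] = fd;
}

static void watch_unregister(int fd)
{
    for (int i = 0; i < n_watched; i++) {
        if (watched[i] == fd) {
            /* Overwrite the removed slot with the last entry. */
            watched[i] = watched[--n_watched];
            return;
        }
    }
}

/* Count entries whose fd is no longer open; a poller would fail on these. */
static int count_stale_entries(void)
{
    int stale = 0;

    for (int i = 0; i < n_watched; i++)
        if (fcntl(watched[i], F_GETFD) == -1 && errno == EBADF)
            stale++;
    return stale;
}

/*
 * Simulate an init routine that fails after registering its handler.
 * Returns the number of stale entries left behind, or -1 on setup failure.
 */
static int failed_init_stale_entries(bool unregister_before_close)
{
    int fd = eventfd(0, 0);
    int stale;

    if (fd < 0)
        return -1;
    n_watched = 0;
    watch_register(fd);

    /* ... a later init step fails here, so the error path runs ... */
    if (unregister_before_close)
        watch_unregister(fd);
    close(fd);

    stale = count_stale_entries();
    n_watched = 0; /* reset the toy list for the next call */
    return stale;
}
```

In this model, failed_init_stale_entries(false) leaves one stale entry behind, while failed_init_stale_entries(true) leaves none. In the model the stale fd is merely closed; in a real process its number could also be handed out again for an unrelated file.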
> Ensure proper cleanup if iavf_security_init() or
> iavf_security_ctx_create() fails. Earlier, we were leaking memory by simply
> returning from iavf_dev_init().
Fixes: 22b123a36d07 ("net/avf: initialize PMD")
Fixes: 6bc987ecb860 ("net/iavf: support IPsec inline crypto")
Cc: stable at dpdk.org
>
> Signed-off-by: Saurabh Singhal <saurabhs at arista.com>
Acked-by: Qi Zhang <qi.z.zhang at intel.com>
Applied to dpdk-next-net-intel.
Thanks
Qi