[PATCH v4] net/iavf: unregister intr handler before FD close

Zhang, Qi Z qi.z.zhang at intel.com
Mon Sep 11 02:55:40 CEST 2023



> -----Original Message-----
> From: Saurabh Singhal <saurabhs at arista.com>
> Sent: Thursday, September 7, 2023 11:15 AM
> To: Thomas Monjalon <thomas at monjalon.net>; Wu, Jingjing
> <jingjing.wu at intel.com>; Xing, Beilei <beilei.xing at intel.com>
> Cc: dev at dpdk.org; Singhal, Saurabh <saurabhs at arista.com>
> Subject: [PATCH v4] net/iavf: unregister intr handler before FD close
> 
> Unregister VFIO interrupt handler before the interrupt fd gets closed in case
> iavf_dev_init() returns an error.
> 
> dpdk creates a standalone thread named eal-intr-thread for processing
> interrupts for the PCI devices. The interrupt handler callbacks are registered
> by the VF driver(iavf, in this case).
> 
> When we do a PCI probe of the network interfaces, we register an interrupt
> handler, open a vfio-device fd using ioctl, and an eventfd in dpdk. These
> interrupt sources are registered in a global linked list that the eal-intr-thread
> keeps iterating over for handling the interrupts. In our internal testing, we see
> eal-intr-thread crash in these two ways:
> 
> Error adding fd 660 epoll_ctl, Operation not permitted
> 
> or
> 
> Error adding fd 660 epoll_ctl, Bad file descriptor
> 
> epoll_ctl() returns EPERM if the target fd does not support poll.
> It returns EBADF when the epoll fd itself is closed or the target fd is closed.
> 
> When the first type of crash happens, we see that the fd 660 is
> anon_inode:[vfio-device] which does not support poll.
> 
> When the second type of crash happens, we could see from the fd map of
> the crashing process that the fd 660 was already closed.
> 
> This means the said fd has been closed and in certain cases may have been
> reassigned to a different device by the operating system but the eal-intr-
> thread does not know about it.
> 
> We observed that these crashes were always accompanied by an error in
> iavf_dev_init() after rte_intr_callback_register() and
> iavf_enable_irq0() have already happened. In the error path, the
> intr_handle_fd was being closed but the interrupt handler wasn't being
> unregistered.
> 
> The fix is to unregister the interrupt handle in the
> iavf_dev_init() error path.
> 
> Ensure proper cleanup if iavf_security_init() or
> iavf_security_ctx_create() fail. Earlier, we were leaking memory by simply
> returning from iavf_dev_init().

Fixes: 22b123a36d07 ("net/avf: initialize PMD")
Fixes: 6bc987ecb860 ("net/iavf: support IPsec inline crypto")
Cc: stable at dpdk.org

> 
> Signed-off-by: Saurabh Singhal <saurabhs at arista.com>

Acked-by: Qi Zhang <qi.z.zhang at intel.com>

Applied to dpdk-next-net-intel.

Thanks
Qi



More information about the dev mailing list