[PATCH v4 3/3] net/mlx5: check devx disconnect/error interrupt events
Slava Ovsiienko
viacheslavo at nvidia.com
Tue Mar 3 17:16:57 CET 2026
> -----Original Message-----
> From: Kevin Traynor <ktraynor at redhat.com>
> Sent: Thursday, February 19, 2026 4:39 PM
> To: dev at dpdk.org
> Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas at monjalon.net>;
> david.marchand at redhat.com; Dariusz Sosnowski <dsosnowski at nvidia.com>;
> Slava Ovsiienko <viacheslavo at nvidia.com>; hkalra at marvell.com; Kevin
> Traynor <ktraynor at redhat.com>; stable at dpdk.org
> Subject: [PATCH v4 3/3] net/mlx5: check devx disconnect/error interrupt events
>
> A busy-loop may occur when there are disconnect/error events such as
> EPOLLERR, EPOLLHUP or EPOLLRDHUP on Linux for the devx interrupt fd.
>
> This may happen if the interrupt fd is deleted, if the device is unbound from
> mlx5_core kernel driver or if the device is removed by the mlx5 kernel driver as
> part of LAG setup.
>
> As the interrupt is not removed and the condition is not reset, this causes an
> interrupt processing busy-loop, which drives the dpdk-intr thread to 100% CPU.
>
> e.g.
> epoll_wait(6, [{events=EPOLLIN|EPOLLRDHUP, data={u32=28, u64=28}}], 8, -1) = 1
> read(28, 0x7f1f5c7fc2f0, 40) = -1 EAGAIN (Resource temporarily unavailable)
> epoll_wait(6, [{events=EPOLLIN|EPOLLRDHUP, data={u32=28, u64=28}}], 8, -1) = 1
> read(28, 0x7f1f5c7fc2f0, 40) = -1 EAGAIN (Resource temporarily unavailable)
>
> To prevent the busy-loop, use the EAL API rte_intr_active_events() to get
> the interrupt events and check for disconnect/error.
>
> If there is a disconnect/error event, unregister the devx callback.
>
> Bugzilla ID: 1873
> Fixes: f15db67df09c ("net/mlx5: accelerate DV flow counter query")
> Cc: stable at dpdk.org
>
> Signed-off-by: Kevin Traynor <ktraynor at redhat.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo at nvidia.com>
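
For readers following the thread, the failure mode and the fix described in the
commit message can be reproduced in miniature with plain Linux epoll. This is
only a rough sketch under stated assumptions: it does not use the mlx5 devx
code or rte_intr_active_events(), a socketpair whose peer is closed stands in
for the devx interrupt fd, and the names DISCONNECT_EVENTS, handle_events() and
demo_wakeups() are hypothetical helpers invented for this illustration.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/socket.h>

/* Events treated as "the other side went away" (same set as in the patch text). */
#define DISCONNECT_EVENTS (EPOLLERR | EPOLLHUP | EPOLLRDHUP)

/*
 * Hypothetical handler: if a disconnect/error event is reported, stop
 * watching the fd. Returns 1 if the fd was removed, 0 otherwise.
 */
static int
handle_events(int epfd, int fd, uint32_t events)
{
	if (events & DISCONNECT_EVENTS) {
		epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
		return 1;
	}
	return 0;
}

/*
 * Poll a disconnected fd up to 5 times and count the wakeups.
 * check_disconnect == 0 mimics the busy-loop, != 0 applies the check.
 * Returns the number of wakeups, or -1 on setup failure.
 */
static int
demo_wakeups(int check_disconnect)
{
	struct epoll_event ev = { 0 }, out;
	int sv[2], epfd, i, wakeups = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	ev.events = EPOLLIN | EPOLLRDHUP;
	ev.data.fd = sv[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) < 0) {
		close(sv[0]);
		close(sv[1]);
		close(epfd);
		return -1;
	}

	close(sv[1]); /* peer disappears; the condition is never reset */

	for (i = 0; i < 5; i++) {
		/* Timeout 0: return immediately if nothing is pending. */
		if (epoll_wait(epfd, &out, 1, 0) != 1)
			break;
		wakeups++;
		if (check_disconnect)
			handle_events(epfd, out.data.fd, out.events);
	}

	close(sv[0]);
	close(epfd);
	return wakeups;
}
```

Calling demo_wakeups(0) wakes up on every iteration because the level-triggered
condition persists, while demo_wakeups(1) wakes up once, removes the fd and
then sees no further events. In the patch itself the equivalent step is
unregistering the devx callback rather than calling epoll_ctl() directly.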