[PATCH v2 1/2] net/mlx5: check for no data read in devx interrupt
Kevin Traynor
ktraynor at redhat.com
Tue Feb 10 16:05:55 CET 2026
On 07/02/2026 06:09, Stephen Hemminger wrote:
> On Fri, 6 Feb 2026 17:20:53 +0000
> Kevin Traynor <ktraynor at redhat.com> wrote:
>
>> A busy-loop may occur when there are EPOLLERR, EPOLLHUP or
>> EPOLLRDHUP epoll events for the devx interrupt fd.
>>
>> This may happen if the interrupt fd is deleted, if the device
>> is unbound from mlx5_core kernel driver or if the device is
>> removed by the mlx5 kernel driver as part of LAG setup.
>>
>> When that occurs, there is no data to be read and in the devx
>> interrupt handler an EAGAIN is returned on the first call to
>> devx_get_async_cmd_comp, but this is not checked.
>>
>> As the interrupt is not removed or condition reset, it causes
>> an interrupt processing busy-loop, which leads to the dpdk-intr
>> thread going to 100% CPU.
>>
>> e.g.
>> epoll_wait
>> (6, [{events=EPOLLIN|EPOLLRDHUP, data={u32=28, u64=28}}], 8, -1) = 1
>> read(28, 0x7f1f5c7fc2f0, 40)
>> = -1 EAGAIN (Resource temporarily unavailable)
>> epoll_wait
>> (6, [{events=EPOLLIN|EPOLLRDHUP, data={u32=28, u64=28}}], 8, -1) = 1
>> read(28, 0x7f1f5c7fc2f0, 40)
>> = -1 EAGAIN (Resource temporarily unavailable)
>>
>> Add a check for an EAGAIN return from devx_get_async_cmd_comp on the
>> first read. If that happens, unregister the callback to prevent looping.
>>
>> Bugzilla ID: 1873
>> Fixes: f15db67df09c ("net/mlx5: accelerate DV flow counter query")
>> Cc: stable at dpdk.org
>>
>> Signed-off-by: Kevin Traynor <ktraynor at redhat.com>
>
> AI spotted this, I didn't...
>
>
> Errors:
>
> Line 139: Unnecessary semicolon after closing brace
>
> };
>
> Should be:
> }
>
> Lines 142-146: Block comment uses incorrect style. Block comments in C code should use /* and */ style, not /**, which is reserved for documentation comments.
>
> /**
> * no data and EAGAIN indicate there is an error or
> * disconnect state. Unregister callback to prevent
> * interrupt busy-looping.
> */
>
> Should be:
> /*
> * no data and EAGAIN indicate there is an error or
> * disconnect state. Unregister callback to prevent
> * interrupt busy-looping.
> */
>
> Warnings:
>
> Logic clarity: The variable data_read is set to true inside the while loop but never checked when data WAS read. Consider if data_read is the clearest way to express this condition.
>
Ack above. Thanks. Will be fixed in v3.
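
For reference, the check described in the commit message can be sketched as a standalone program fragment. This is not the mlx5 code: drain_fd() and the drain_result enum are hypothetical names, and a plain read() on a non-blocking fd stands in for devx_get_async_cmd_comp. Treating EOF as a reason to unregister is also an assumption of this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch, not taken from the mlx5 PMD. */
enum drain_result {
	DRAIN_OK,         /* at least one item was read before EAGAIN */
	DRAIN_UNREGISTER, /* nothing to read at all: stop polling this fd */
	DRAIN_ERROR       /* any other read error */
};

/*
 * Drain a non-blocking fd. read() is a stand-in for the call to
 * devx_get_async_cmd_comp in the interrupt handler.
 */
static enum drain_result
drain_fd(int fd)
{
	char buf[40];
	bool data_read = false;

	for (;;) {
		ssize_t n = read(fd, buf, sizeof(buf));

		if (n > 0) {
			/* Got data; keep draining until the fd is empty. */
			data_read = true;
			continue;
		}
		if (n == 0) {
			/* EOF: assumed here to mean the peer is gone. */
			return DRAIN_UNREGISTER;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			/*
			 * EAGAIN on the first read means the wakeup carried
			 * no data, so the caller should unregister the
			 * callback instead of returning to epoll_wait.
			 */
			return data_read ? DRAIN_OK : DRAIN_UNREGISTER;
		}
		return DRAIN_ERROR;
	}
}
```

A caller would act on DRAIN_UNREGISTER by removing the interrupt callback for that fd, which is what stops the epoll_wait/read loop shown in the strace output.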