[PATCH] lib/ethdev: fix segfault in secondary process by validating dev_private pointer

Ivan Malov ivan.malov at arknetworks.am
Tue Jul 22 21:05:06 CEST 2025


There is a difference between the control path and the data path. Always has
been. Yes, on the data path, DPDK has historically sought better performance,
but on the slow path, checks have typically been implemented, even in the flow
API, with the only exception being the "asynchronous flow" APIs, which are meant
to be fast-path.

Yes, the idea of having a "secondary process reference counter" in 'rte_device'
may not be a hill to die on, to be honest, and might be wrong, so I have no
strong opinion. The counter would be either guarded by its own lock or accessed
atomically: 'rte_dev_probe' would increment it, 'rte_dev_remove' would
decrement/check it, and 'rte_eth_dev_close' and 'rte_eth_dev_reset' would
decrement/check it as well.
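
For illustration only, here is a minimal sketch of the atomic variant of such a
counter, written with plain C11 atomics rather than DPDK's own primitives. The
'demo_*' names and the 'secondary_refcnt' field are hypothetical and not part
of DPDK; the sketch assumes the counter lives in memory shared by all processes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for 'rte_device'; the real struct has no such field. */
struct demo_device {
	atomic_uint secondary_refcnt; /* secondary processes currently attached */
};

/* What a probe issued by a secondary process would do: take a reference. */
static void demo_secondary_attach(struct demo_device *dev)
{
	atomic_fetch_add_explicit(&dev->secondary_refcnt, 1,
				  memory_order_acq_rel);
}

/* What a remove issued by a secondary process would do: drop the reference. */
static void demo_secondary_detach(struct demo_device *dev)
{
	unsigned int prev = atomic_fetch_sub_explicit(&dev->secondary_refcnt, 1,
						      memory_order_acq_rel);

	assert(prev > 0); /* catch an unbalanced detach */
	(void)prev;
}

/* What close/reset would consult before tearing down shared state. */
static bool demo_close_allowed(struct demo_device *dev)
{
	return atomic_load_explicit(&dev->secondary_refcnt,
				    memory_order_acquire) == 0;
}
```

Note that the check and the teardown that follows it are not atomic as a pair,
so a real design would still have to decide what happens if a secondary
attaches in between.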

What scares me most about this idea is that one may still end up with certain
entry points overlooked, rendering the whole effort worthless.

Thank you.

On Tue, 22 Jul 2025, Stephen Hemminger wrote:

> On Tue, 22 Jul 2025 22:53:08 +0500
> Khadem Ullah <14pwcse1224 at uetpeshawar.edu.pk> wrote:
>
>> Right, but performance and reliability are both important. While DPDK
>> rightly prioritizes performance, some level of reliability should still be
>> ensured, especially to catch known issues that could lead to instability.
>>
>> On Tue, Jul 22, 2025, 22:38 Stephen Hemminger <stephen at networkplumber.org>
>> wrote:
>>
>>> On Tue, 22 Jul 2025 22:04:32 +0500
>>> Khadem Ullah <14pwcse1224 at uetpeshawar.edu.pk> wrote:
>>>
>>>> Agree, but I think it's also a good practice to guard against known cases
>>>> that are prone to crashes.
>>>
>>>
>>> Right but DPDK chooses performance over API safety.
>>> For example rx/tx burst doesn't check args.
>>>
>>> The point is that, as a library, if the application is doing something
>>> wrong, returning an error doesn't always help.
>>>
>
> The problem is that all those values, dev->data and private, are shared
> between processes without any locking. If the APIs are going to be MP safe
> then they would require locking. DPDK has made an explicit decision
> to not use locking in ethdev control or data path.
>
> You can get away with checking for dev->data being NULL on x86 where
> there is data consistency. But on weakly ordered platforms that is not going
> to work.
>
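
To make the ordering point concrete, here is a minimal sketch in plain C11 of
what a NULL check would need on a weakly ordered platform: the writer publishes
the pointer with release semantics and the reader loads it with acquire
semantics. The 'demo_*' names and the 'nb_queues' field are hypothetical, not
ethdev structures. The sketch covers only the ordering of publication; it says
nothing about the pointer being cleared and the memory released while a reader
is still using it, which is where locking or reference counting comes in.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the shared per-device data. */
struct demo_data {
	int nb_queues;
};

struct demo_dev {
	_Atomic(struct demo_data *) data; /* NULL until published */
};

/* Writer: fill in the data first, then publish the pointer. */
static void demo_publish(struct demo_dev *dev, struct demo_data *d,
			 int nb_queues)
{
	d->nb_queues = nb_queues;
	/* Release: the store above becomes visible before the pointer does. */
	atomic_store_explicit(&dev->data, d, memory_order_release);
}

/* Reader: returns -1 if nothing has been published yet. */
static int demo_read_nb_queues(struct demo_dev *dev)
{
	/* Acquire: pairs with the release store in demo_publish(). */
	struct demo_data *d = atomic_load_explicit(&dev->data,
						   memory_order_acquire);

	if (d == NULL)
		return -1;
	return d->nb_queues;
}
```

With a plain, non-atomic load of the pointer, nothing in the language prevents
a reader on a weakly ordered CPU from seeing a non-NULL pointer before the
fields behind it are visible.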

