[PATCH] [PATCH v3] lib/ethdev: fix segfault in secondary process by validating dev_private pointer

Stephen Hemminger stephen at networkplumber.org
Wed Jul 23 16:22:16 CEST 2025


On Wed, 23 Jul 2025 18:34:04 +0500
Khadem Ullah <14pwcse1224 at uetpeshawar.edu.pk> wrote:

> Hi Ivan, agreed. I think we can at least guard all the currently known
> crashes.
> 
> Sure, I will check the macro and get back to you.
> 
> Thank you!
> 
> On Wed, Jul 23, 2025, 18:19 Ivan Malov <ivan.malov at arknetworks.am> wrote:
> 
> > Hi Khadem,
> >
> > On Wed, 23 Jul 2025, Khadem Ullah wrote:
> >  
> > > In secondary processes, directly accessing 'dev->data->dev_private' can
> > > cause a segmentation fault if the primary process has exited or if the
> > > shared memory is no longer accessible.
> > >
> > > > The secondary application not only breaks on device close, but also
> > > > segfaults when we run "show device info all" from the secondary
> > > > after the primary closes.
> > >
> > > This patch adds safety checks while using rte_mem_virt2phy(), with an
> > > unlikely() branch hint to minimize performance impact in the fast path.
> > > This ensures 'dev_private' is still valid before accessing it.
> > >
> > > > Fixes: bdad90d12ec8 ("ethdev: change device info get callback to return int")
> > > Cc: stable at dpdk.org
> > >
> > > Signed-off-by: Khadem Ullah <14pwcse1224 at uetpeshawar.edu.pk>
> > > ---
> > > lib/ethdev/rte_ethdev.c | 15 ++++++++++++++-
> > > 1 file changed, 14 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> > > index dd7c00bc94..343e156a4f 100644
> > > --- a/lib/ethdev/rte_ethdev.c
> > > +++ b/lib/ethdev/rte_ethdev.c
> > > @@ -4079,6 +4079,13 @@ rte_eth_dev_info_get(uint16_t port_id, struct  
> > rte_eth_dev_info *dev_info)  
> > >
> > >       if (dev->dev_ops->dev_infos_get == NULL)
> > >               return -ENOTSUP;
> > > +     if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
> > > +             unlikely(rte_mem_virt2phy(dev->data->dev_private) ==  
> > RTE_BAD_PHYS_ADDR)) {  
> > > +                     RTE_ETHDEV_LOG_LINE(ERR,
> > > +                     "Secondary: dev_private not accessible (primary  
> > exited?)");  
> > > +                     rte_errno = ENODEV;
> > > +                     return -rte_errno;
> > > +     }
> > >       diag = dev->dev_ops->dev_infos_get(dev, dev_info);
> > >       if (diag != 0) {
> > >               /* Cleanup already filled in device information */
> > > @@ -4307,7 +4314,13 @@ rte_eth_macaddr_get(uint16_t port_id, struct  
> > rte_ether_addr *mac_addr)  
> > >                       port_id);
> > >               return -EINVAL;
> > >       }
> > > -
> > > +     if (rte_eal_process_type() == RTE_PROC_SECONDARY &&
> > > +             (dev->data->mac_addrs == NULL)) {
> > > +                     RTE_ETHDEV_LOG_LINE(ERR,
> > > +                     "Secondary: dev_private not accessible (primary  
> > exited?)");  
> > > +                     rte_errno = ENODEV;
> > > +                     return -rte_errno;
> > > +     }
> > >       rte_ether_addr_copy(&dev->data->mac_addrs[0], mac_addr);
> > >
> > >       rte_eth_trace_macaddr_get(port_id, mac_addr);  
> >
> > I see one more API has been augmented with the check. But community members
> > may still argue this is not robust, as many other APIs will also fail. So,
> > even if the task was to augment as many APIs as possible with the check,
> > then the check would still be required to be factorised/generalised
> > somehow. What do you think?
> >
> > Please also note that there are already macro invocations in many of these
> > APIs, for example, RTE_ETH_VALID_PORTID_OR_ERR_RET. Could be convenient.
> >
> > Thank you.
> >  
> > > --
> > > 2.43.0
> > >
> > >  
> >  

No top posting.

How are you monitoring the primary? Let's fix that.
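
For illustration only, here is a minimal, self-contained sketch of how a
primary-liveness check could be factored into one guard macro, in the style of
RTE_ETH_VALID_PORTID_OR_ERR_RET as Ivan suggested, rather than repeated per API.
Every demo_* / DEMO_* name is hypothetical and is not part of DPDK. The liveness
probe is a stub; in a real secondary process it might be backed by
rte_eal_primary_proc_alive(), but that wiring is an assumption and is not
something the patch does.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the shared ethdev data; NOT the real struct rte_eth_dev_data. */
struct demo_dev_data {
	void *dev_private;
};

/* Stub liveness state. Assumption: a real secondary would query the EAL
 * (for example via rte_eal_primary_proc_alive()) instead of a flag. */
static bool demo_primary_alive = true;

static bool
demo_primary_is_alive(void)
{
	return demo_primary_alive;
}

/* Hypothetical guard macro: return 'retval' from the calling function
 * when the primary process is no longer alive. */
#define DEMO_PRIMARY_ALIVE_OR_ERR_RET(retval) do { \
	if (!demo_primary_is_alive()) { \
		fprintf(stderr, "primary process not alive\n"); \
		return retval; \
	} \
} while (0)

/* Example API: the guard runs before any shared memory is dereferenced. */
static int
demo_dev_info_get(const struct demo_dev_data *data, const void **out)
{
	if (data == NULL || out == NULL)
		return -EINVAL;

	DEMO_PRIMARY_ALIVE_OR_ERR_RET(-ENODEV);

	*out = data->dev_private;
	return 0;
}
```

With a single macro, each API needs only one extra line, and the policy for
detecting a dead primary lives in one place.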

