[External] : Re: [dpdk-dev] [PATCH 2/2] net/failsafe: fix primary/secondary mutex
Madhuker Mythri
madhuker.mythri at oracle.com
Mon Oct 17 12:40:12 CEST 2022
> On Wed, Jun 9, 2021, at 12:04, Andrew Rybchenko wrote:
>> On 6/8/21 11:48 PM, Stephen Hemminger wrote:
>> > On Tue, 8 Jun 2021 18:55:17 +0300
>> > Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru> wrote:
>> >
>> >> On 6/8/21 6:42 PM, Stephen Hemminger wrote:
> >>> On Tue, 8 Jun 2021 11:00:37 +0300
> >>> Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru> wrote:
> >>>
> >>>> On 4/19/21 8:08 PM, Thomas Monjalon wrote:
> >>>>> About the title, better to speak about multi-process, it is less
> >>>>> confusing than primary/secondary.
> >>>>>
> >>>>> 15/03/2021 20:27, Stephen Hemminger:
> >>>>>> Set mutex used in failsafe driver to protect when used by both
> >>>>>> primary and secondary process. Without this fix, the failsafe
> >>>>>> lock is not really locking when there are multiple secondary processes.
> >>>>>>
> >>>>>> Bugzilla ID: 662
> >>>>>> Signed-off-by: Stephen Hemminger <stephen at networkplumber.org>
> >>>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>>>> Cc: matan at mellanox.com
> >>>>>
> >>>>> The correct order for above lines is:
> >>>>>
> >>>>> Bugzilla ID: 662
> >>>>> Fixes: 655fcd68c7d2 ("net/failsafe: fix hotplug races")
> >>>>>
> >>>>> Signed-off-by: Stephen Hemminger <stephen at networkplumber.org>
> >>>>>
> >>>>>> ---
> > >>>>>> --- a/drivers/net/failsafe/failsafe.c
> > >>>>>> +++ b/drivers/net/failsafe/failsafe.c
> >>>>>> @@ -140,6 +140,11 @@ fs_mutex_init(struct fs_priv *priv)
> > >>>>>> ERROR("Cannot initiate mutex attributes - %s", strerror(ret));
> > >>>>>> return ret;
> >>>>>> }
> > >>>>>> + /* Allow mutex to protect primary/secondary */
> > >>>>>> + ret = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> > >>>>>> + if (ret)
> > >>>>>> + ERROR("Cannot set mutex shared - %s", strerror(ret));
> >>>>>
> > >>>>> Why not returning an error here?
> >>>>
> > >>>> +1
> >>>>
> > >>>> I think it would be safer to return an error here.
> >>>
> > >>> Ok but it never happens.
> >>>
> >>
> > >> May I ask why? 'man pthread_mutexattr_setpshared' says that it is
> > >> possible.
> >>
> >
> > > The glibc implementation of pthread_mutexattr_setpshared is:
> >
> >
> > > int
> > > pthread_mutexattr_setpshared (pthread_mutexattr_t *attr, int
> > > pshared) {
> > > struct pthread_mutexattr *iattr;
> >
> > > int err = futex_supports_pshared (pshared);
> > > if (err != 0)
> > > return err;
> > >
> > > iattr = (struct pthread_mutexattr *) attr;
> > >
> > > if (pshared == PTHREAD_PROCESS_PRIVATE)
> > > iattr->mutexkind &= ~PTHREAD_MUTEXATTR_FLAG_PSHARED;
> > > else
> > > iattr->mutexkind |= PTHREAD_MUTEXATTR_FLAG_PSHARED;
> > >
> > > return 0;
> > > }
> > >
> > > And
> > >
> > > /* FUTEX_SHARED is always supported by the Linux kernel. */ static
> > > __always_inline int futex_supports_pshared (int pshared) {
> > > if (__glibc_likely (pshared == PTHREAD_PROCESS_PRIVATE))
> > > return 0;
> > > else if (pshared == PTHREAD_PROCESS_SHARED)
> > > return 0;
> > > else
> > > return EINVAL;
> > }
> > >
> >
> > > There for the code as written can not return an error.
> > > The check was only because someone could report a bogus issue from a
> > > broken c library.
> > >
> >
> > Many thanks for detailed description.
> > I thought that it is better to follow API definition and it is not
> > that hard to check return code and handle it. Yes, glibc is not the
> > only C library.
> >
>
> On principle the API spec should be respected without assuming a specific implementation.
>
> Another way to think about it is that a future dev having zero knowledge of this thread, reading this code and checking the POSIX manual, will also need to check that usual c lib implementations are unlikely > to generate an error before concluding that this code is alright. It should not be necessary.
>
We are also facing similar issue, while probe of fail-safe PMD b/w multi-process.
rte_eth_dev_attach_secondary(), API return's error, while probing from secondary process in rte_pmd_tap_probe().
So, can you please let us know, if any fix available on such issue ?
Thanks,
Madhuker.
More information about the dev
mailing list