[PATCH V2 3/7] net/mlx5: add new devargs to control probe optimization
Stephen Hemminger
stephen at networkplumber.org
Wed Oct 30 20:05:17 CET 2024
On Wed, 30 Oct 2024 08:16:58 +0000
Slava Ovsiienko <viacheslavo at nvidia.com> wrote:
> Hi,
>
>
> > -----Original Message-----
> > From: Stephen Hemminger <stephen at networkplumber.org>
> > Sent: Tuesday, October 29, 2024 6:07 PM
> > To: Minggang(Gavin) Li <gavinl at nvidia.com>
> > Cc: Slava Ovsiienko <viacheslavo at nvidia.com>; Matan Azrad
> > <matan at nvidia.com>; Ori Kam <orika at nvidia.com>; NBU-Contact-Thomas
> > Monjalon (EXTERNAL) <thomas at monjalon.net>; Dariusz Sosnowski
> > <dsosnowski at nvidia.com>; Bing Zhao <bingz at nvidia.com>; Suanming Mou
> > <suanmingm at nvidia.com>; dev at dpdk.org; Raslan Darawsheh
> > <rasland at nvidia.com>; rongwei liu <rongweil at nvidia.com>
> > Subject: Re: [PATCH V2 3/7] net/mlx5: add new devargs to control probe
> > optimization
> >
> > On Tue, 29 Oct 2024 16:27:25 +0800
> > "Minggang(Gavin) Li" <gavinl at nvidia.com> wrote:
> >
> > > On 10/28/2024 11:47 PM, Stephen Hemminger wrote:
> > > > On Mon, 28 Oct 2024 11:18:18 +0200
> > > > "Minggang Li(Gavin)" <gavinl at nvidia.com> wrote:
> > > >
> > > >> +- ``probe_opt_en`` parameter [int]
> > > >> +
> > > >> + A non-zero value optimizes the probe process, especially for large scale.
> > > >> + PMD will hold the IB device information internally and reuse it.
> > > >> +
> > > >> + By default, the PMD will set this value to 0.
> > > >> +
> > > > Is there ever a case where this should not be used?
> > > >
> > > > It would be better to just detect and use it if available.
> > > > This driver does not need more options...
> > > The new mechanism is required by only a few users. To avoid breaking
> > > production, the option lets only those who actually need the new way
> > > opt in. Once we see that the new way is reliable, we will change the
> > > default value.
> >
> > I understand that philosophy but it leads to a maze of technical debt.
>
> This specific case is not about philosophy in general.
>
> We have users with a huge number of SFs/VFs configured who are experiencing
> gigantic probing times (literally tens of minutes). This story lasted a
> long time: we tried different approaches, then admitted we had to update the kernel,
> etc., and eventually we got things done, which resulted in this series.
>
> The new approach is event-driven and based on handling new kernel-generated events.
> So it relies on the system-wide environment and might be problematic on some hosts (we do not
> expect this to happen often, though).
>
> At the same time, the existing probe approach provides acceptable performance and
> satisfies the vast majority of users. So our main objective is not to break anything
> in production (most users); the second objective is to resolve the issues of some users with
> specific configurations (few users). That's why we would prefer to have the devarg
> (with all its pros and cons) and set the devarg default value to false. Later, once the new kernel
> API spreads and we have good production statistics, we can consider changing the default
> value to true or making the devarg obsolete altogether. Does this approach look reasonable?
OK, I was just hoping that all this could be transparent to the users.
Ideally, the driver could detect whether the right versions of the components (rdma, kernel)
are available at run time and just do the fast thing.
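
A minimal sketch of what such run-time selection might look like, offered
only as an illustration and not as the mlx5 PMD's actual code. Everything
named here is hypothetical: the minimum kernel version (the thread does not
state one), the probe_mode enum, and the convention that -1 means the
probe_opt_en devarg was not given. A kernel release check is also only a crude
proxy; a real driver would more likely test for the kernel events themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical minimum kernel version; the thread does not name one. */
#define FAST_PROBE_MIN_MAJOR 6
#define FAST_PROBE_MIN_MINOR 0

/* Hypothetical names for the two probe strategies discussed above. */
enum probe_mode { PROBE_LEGACY = 0, PROBE_EVENT_DRIVEN = 1 };

/* Parse "major.minor..." and compare it with the assumed minimum. */
static bool kernel_release_ok(const char *release)
{
    int major = 0, minor = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d", &major, &minor) != 2)
        return false; /* unparsable release: stay on the safe path */
    if (major != FAST_PROBE_MIN_MAJOR)
        return major > FAST_PROBE_MIN_MAJOR;
    return minor >= FAST_PROBE_MIN_MINOR;
}

/*
 * devarg_value: -1 if the user did not pass probe_opt_en (assumed
 * convention), otherwise the value given. An explicit setting always
 * wins; detection is used only when the user expressed no preference.
 */
static enum probe_mode select_probe_mode(int devarg_value, const char *release)
{
    if (devarg_value == 0)
        return PROBE_LEGACY;
    if (devarg_value > 0)
        return PROBE_EVENT_DRIVEN;
    return kernel_release_ok(release) ? PROBE_EVENT_DRIVEN : PROBE_LEGACY;
}

/* Same decision, using the release string of the running kernel. */
static enum probe_mode select_probe_mode_auto(int devarg_value)
{
    struct utsname u;

    if (uname(&u) != 0) /* cannot detect: honor the devarg, else legacy */
        return devarg_value > 0 ? PROBE_EVENT_DRIVEN : PROBE_LEGACY;
    return select_probe_mode(devarg_value, u.release);
}
```

With this shape the devarg would remain available as an override for hosts
where detection gives the wrong answer, while users who pass nothing get the
fast path whenever the environment appears to support it.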