[RFC v3 2/2] ethdev: add API to set process to active or standby

Jerin Jacob jerinjacobk at gmail.com
Wed Dec 21 13:44:58 CET 2022


On Wed, Dec 21, 2022 at 5:35 PM Rongwei Liu <rongweil at nvidia.com> wrote:
>
> Hi Jerin:
>
> BR
> Rongwei
>
> > -----Original Message-----
> > From: Jerin Jacob <jerinjacobk at gmail.com>
> > Sent: Wednesday, December 21, 2022 19:00
> > To: Rongwei Liu <rongweil at nvidia.com>
> > Cc: Matan Azrad <matan at nvidia.com>; Slava Ovsiienko
> > <viacheslavo at nvidia.com>; Ori Kam <orika at nvidia.com>; NBU-Contact-
> > Thomas Monjalon (EXTERNAL) <thomas at monjalon.net>; Ferruh Yigit
> > <ferruh.yigit at amd.com>; Andrew Rybchenko
> > <andrew.rybchenko at oktetlabs.ru>; dev at dpdk.org; Raslan Darawsheh
> > <rasland at nvidia.com>
> > Subject: Re: [RFC v3 2/2] ethdev: add API to set process to active or standby
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On Wed, Dec 21, 2022 at 3:02 PM Rongwei Liu <rongweil at nvidia.com> wrote:
> > >
> > > Hi Jerin:
> > >
> >
> > Hi Rongwei
> >
> > > BR
> > > Rongwei
> > >
> > > > -----Original Message-----
> > > > From: Jerin Jacob <jerinjacobk at gmail.com>
> > > > Sent: Wednesday, December 21, 2022 17:13
> > > > To: Rongwei Liu <rongweil at nvidia.com>
> > > > Cc: Matan Azrad <matan at nvidia.com>; Slava Ovsiienko
> > > > <viacheslavo at nvidia.com>; Ori Kam <orika at nvidia.com>; NBU-Contact-
> > > > Thomas Monjalon (EXTERNAL) <thomas at monjalon.net>; Ferruh Yigit
> > > > <ferruh.yigit at amd.com>; Andrew Rybchenko
> > > > <andrew.rybchenko at oktetlabs.ru>; dev at dpdk.org; Raslan Darawsheh
> > > > <rasland at nvidia.com>
> > > > Subject: Re: [RFC v3 2/2] ethdev: add API to set process to active
> > > > or standby
> > > >
> > > >
> > > > On Wed, Dec 21, 2022 at 2:31 PM Rongwei Liu <rongweil at nvidia.com> wrote:
> > > > >
> > > > > Users may want to change the DPDK process to different versions
> > > >
> > > > A different version of DPDK? If there is any ABI change, how is this supported?
> > > >
> > > A new member was introduced into rte_eth_dev_info, but it should not break
> > > the ABI because it uses reserved fields.
> >
> > That is just for rte_eth_dev_info. What about ABI changes in other
> > ethdev and rte_flow structures across different DPDK ABI versions?
> >
> Besides this, there is no other dependency on ABI changes.
>
> Assume DPDK process A is running version v21.11 and we plan to upgrade to
> version v22.11. Let's call the v22.11 instance process B.

OK. That's a relief. I understand the use case now.

Why not simply use the standard DPDK multiprocess model, then?
The primary process acts as a server for the slow path API. Secondary processes
can come and go (i.e. they can be updated at runtime)
and act as clients that update rules via the primary-secondary communication mechanism.


>
> Now, process A has been running for a long time and has a lot of rules configured. It's in the "active" role per this API definition.
> Process B starts, calls this API to set itself to the "standby" role, and the user can then program flow rules as they want.
> Different NIC vendors may have different recommendations; Nvidia currently suggests programming process B with group 0 rules only.
>
> The user should sync all desired configuration from process A to process B. Process A then starts to yield traffic, for example by
> deleting all group 0 rules (for Nvidia's NICs), or by quitting.
> After that, process B calls this API to set itself to the "active" role, and the hot upgrade is finished.
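
The hand-over sequence described above can be sketched as a small, self-contained
toy model. This is only an illustration of the intended ordering, not the proposed
API: every name here (proc_role, struct proc, traffic_owner, hot_upgrade_demo) is
hypothetical, no DPDK function is called, and the rule "standby rules are hit only
when no active-role process holds rules" is an assumption drawn from the
description in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration only; none of these are DPDK APIs. */
enum proc_role { ROLE_ACTIVE, ROLE_STANDBY };

struct proc {
    enum proc_role role;
    bool alive;
    size_t nb_rules; /* number of flow rules this process has programmed */
};

/*
 * Toy model of which process receives traffic (assumption): rules of an
 * active-role process win; rules of a standby-role process are queued and
 * only take effect when no live active-role process holds rules.
 * Returns NULL when nobody holds rules.
 */
static const struct proc *
traffic_owner(const struct proc *procs[], size_t n)
{
    const struct proc *standby = NULL;

    assert(procs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!procs[i]->alive || procs[i]->nb_rules == 0)
            continue;
        if (procs[i]->role == ROLE_ACTIVE)
            return procs[i];
        if (standby == NULL)
            standby = procs[i];
    }
    return standby;
}

/* Walks through the hot-upgrade steps; returns 0 if every step behaves as described. */
int
hot_upgrade_demo(void)
{
    struct proc a = { .role = ROLE_ACTIVE, .alive = true, .nb_rules = 3 };
    struct proc b = { .role = ROLE_STANDBY, .alive = true, .nb_rules = 0 };
    const struct proc *procs[] = { &a, &b };

    /* Step 1: B syncs the rules while in standby; A still owns the traffic. */
    b.nb_rules = a.nb_rules;
    if (traffic_owner(procs, 2) != &a)
        return 1;

    /* Step 2: A yields by flushing its rules; B's queued rules take effect. */
    a.nb_rules = 0;
    if (traffic_owner(procs, 2) != &b)
        return 2;

    /* Step 3: B switches to the active role and A quits; the upgrade is done. */
    b.role = ROLE_ACTIVE;
    a.alive = false;
    if (traffic_owner(procs, 2) != &b)
        return 3;

    return 0;
}
```

In this model traffic never goes unowned between steps 1 and 3, which is the
downtime property the proposal is aiming for.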
>
> > > > > such as hot upgrade.
> > > > > There is a strong requirement to simplify the logic and shorten
> > > > > the traffic downtime as much as possible.
> > > > >
> > > > > This update introduces new rte_eth process role definitions:
> > > > > active or standby.
> > > > >
> > > > > The active role means rules are programmed to HW immediately, and
> > > > > no
> > > >
> > > > Why does it have to be specific to rte_flow rules only? If it is specific to
> > > > rte_flow, why is it in the rte_eth_process_ name space?
> > > For now, this design focuses on flow rule offloading and traffic
> > > redirection.
> > > When switching process versions, it's important to be sure which
> > > application receives and handles the traffic.
> >
> > Changing the DPDK version at runtime goes well beyond the rte_flow driver.
>
> It's not about changing the DPDK version, but about upgrading DPDK from one PMD version to another.
> Does the preceding example answer your question?
> >
> > > The change should take effect across all probed eth devices; that's why it
> > > was put under the rte_eth_process_ name space (for all rte_eth_dev).
> > > >
> > > > Also, if we are moving to standby, what about rules whose ABI has
> > > > changed between versions?
> > >
> > > As the comments mention: "Before role transition, all the rules set by
> > > the active process should be flushed first."
> >
> > What happens to the existing rte_flow flow handles that were created with
> > version X?
> > Also, what if the new version Y has an ABI change in the rte_flow_pattern and
> > rte_flow_action structures?
> >
> > For me, if a DPDK version change is needed, simply reload the application. This
> > API will soon bloat, and it will be a mess if it starts handling different DPDK
> > versions which are not ABI compatible at all.
> >
> Yes, you are right. Reloading the application is the easiest way, but it may leave a long time
> window in which traffic is lost: no traffic arrives at process A or process B.
> We are trying to simplify the reloading logic and minimize the traffic downtime as much as possible.
> The approach may differ hugely between NIC vendors, so I think it would be better if
> DPDK provided an abstract API.
>
> If process A and process B have different ABIs, it doesn't matter.
> 1. Calling this API from process A uses the older ABI.
> 2. Calling this API from process B uses the newer ABI.
> The API has a per-process concept and working scope.
>
> >
> >
> >
> > > > > behavior changed. This is the default state.
> > > > > The standby role means rules are queued in the HW. If no active-role
> > > > > process is alive, or the process goes back to active, the rules take effect immediately.
> > > > >
> > > > > Signed-off-by: Rongwei Liu <rongweil at nvidia.com>

