[dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile

Dumitrescu, Cristian cristian.dumitrescu at intel.com
Wed Mar 3 21:35:25 CET 2021


Hi Matan,

> -----Original Message-----
> From: Matan Azrad <matan at nvidia.com>
> Sent: Tuesday, March 2, 2021 6:10 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; Li Zhang
> <lizh at nvidia.com>; Dekel Peled <dekelp at nvidia.com>; Ori Kam
> <orika at nvidia.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> Cc: dev at dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas at monjalon.net>; Raslan Darawsheh <rasland at nvidia.com>;
> mb at smartsharesystems.com; ajit.khaparde at broadcom.com; Yigit, Ferruh
> <ferruh.yigit at intel.com>; Singh, Jasvinder <jasvinder.singh at intel.com>
> Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> 
> Hi Cristian
> 
> Good discussion, thank you for that!
> 
> From: Dumitrescu, Cristian
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: Matan Azrad <matan at nvidia.com>
> > > Sent: Tuesday, March 2, 2021 12:37 PM
> > > To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; Li Zhang
> > > <lizh at nvidia.com>; Dekel Peled <dekelp at nvidia.com>; Ori Kam
> > > <orika at nvidia.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> > > Cc: dev at dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas at monjalon.net>;
> > > Raslan Darawsheh <rasland at nvidia.com>; mb at smartsharesystems.com;
> > > ajit.khaparde at broadcom.com; Yigit, Ferruh <ferruh.yigit at intel.com>;
> > > Singh, Jasvinder <jasvinder.singh at intel.com>
> > > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> > >
> > > HI Cristian
> > >
> > > From: Dumitrescu, Cristian
> > > > Hi Matan,
> > > >
> > > > > -----Original Message-----
> > > > > From: Matan Azrad <matan at nvidia.com>
> > > > > Sent: Tuesday, March 2, 2021 7:02 AM
> > > > > To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; Li Zhang
> > > > > <lizh at nvidia.com>; Dekel Peled <dekelp at nvidia.com>; Ori Kam
> > > > > <orika at nvidia.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> > > > > Cc: dev at dpdk.org; NBU-Contact-Thomas Monjalon
> > > <thomas at monjalon.net>;
> > > > > Raslan Darawsheh <rasland at nvidia.com>;
> mb at smartsharesystems.com;
> > > > > ajit.khaparde at broadcom.com; Yigit, Ferruh
> > > > > <ferruh.yigit at intel.com>; Singh, Jasvinder
> > > > > <jasvinder.singh at intel.com>
> > > > > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> > > > >
> > > > >
> > > > >
> > > > > Hi Cristian
> > > > >
> > > > > Thank you for review, please see inline.
> > > > >
> > > > > From: Dumitrescu, Cristian
> > > > > > > From: dev <dev-bounces at dpdk.org> On Behalf Of Li Zhang
> > > > > <snip>
> > > > > > We had this same problem earlier for the rte_tm.h API, where
> > > > > > people
> > > > > asked to
> > > > > > add support for WRED and shaper rates specified in packets to
> > > > > > the existing
> > > > > byte
> > > > > > rate support. I am more than happy to support adding the same
> > > > > > here, but please let's adopt the same solution here rather than
> > > > > > invent a different approach.
> > > > > >
> > > > > > Please refer to struct rte_tm_wred_params and struct
> > > > > rte_tm_shaper_params
> > > > > > from rte_tm.h: the packets vs. bytes mode is explicitly
> > > > > > specified through
> > > > > the use
> > > > > > of a flag called packet_mode that is added to the WRED and
> > > > > > shaper
> > > profile.
> > > > > > When packet_mode is 0, the profile rates and bucket sizes are
> > > > > > specified in bytes per second and bytes, respectively; when
> > > > > > packet_mode is not 0, the profile rates and bucket sizes are
> > > > > > specified in packets and packets per
> > > > > second,
> > > > > > respectively. The same profile parameters are used, no need to
> > > > > > invent additional algorithms (such as srTCM - packet mode) or
> > > > > > profile data
> > > > > structures.
> > > > > > Can we do the same here, please?
> > > > >
> > > > > This flag approach is very intuitive suggestion and it has advantages.
> > > > >
> > > > > The main problem with the flag approach is that it breaks ABI and API.
> > > > > The profile structure size is changed due to a new field - ABI
> breakage.
> > > > > The user must initialize the flag with zero to get old behavior -
> > > > > API
> > > breakage.
> > > > >
> > > >
> > > > The rte_mtr API is experimental, all the API functions are correctly
> > > > marked with __rte_experimental in rte_mtr.h file, so we can safely
> > > > change the API
> > > and
> > > > the ABI breakage is not applicable here. Therefore, this problem
> > > > does not
> > > exist,
> > > > correct?
> > >
> > > Yes, but still meter is not new API and I know that a lot of user uses
> > > it for a long time.
> > > Forcing them to change while we have good solution that don't force
> > > it, looks me problematic.
> > >
> >
> > Not really, only 3 drivers are currently implementing this API.
> 
> The user is not the PMD, the PMDs are the providers.
> I'm talking about all our customers, all the current DPDK based applications
> like OVS and others (I familiar with at least 4 ConnectX customer applications)
> which use the meter API and I'm sure there are more around the world.
> 
> > Even to these drivers, the required changes are none or extremely small:
> as Ajit
> > was also noting, as the default value of 0 continues to represent the
> existing
> > byte mode, all you have to do is make sure the new flag is set to zero in the
> > profile params structure, which is already done implicitly in most places as
> this
> > structure is initialized to all-zeros.
> 
> Are you sure all the world initialize the struct to 0? and also in this case,
> without new compilation, not all the struct will be zeroes(the old size is
> smaller).
> 
> > A simple search exercise for struct rte_mtr_meter_profile is all that is
> needed.
> > You also agreed the flag approach is very intuitive, hence better and nicer,
> with
> > no additional work needed for you, so why not do it?
> 
> Do you understand that any current application that use the meter API must
> recompile the code of the application? Part of them also probably need to
> set the flag to 0....
> Do you understand also the potential issues for the applications which are
> not aware to the change? Debug time, etc....
> 
> > > > > I don't see issues with Li suggestion, Do you think Li suggestion
> > > > > has critical issues?
> > > >
> > > > It is probably better to keep the rte_mtr and the rte_tm APIs
> > > > aligned, it simplifies the code maintenance and improves the user
> > > > experience, which always pays off in the long run. Both APIs
> > > > configure token buckets in either packet mode or byte mode, and it
> > > > is desirable to have them work in the
> > > same
> > > > way. Also, I think we should avoid duplicating configuration data
> > > > structures
> > > for
> > > > to support essentially the same algorithms (such as srTCM or trTCM)
> > > > if we
> > > can.
> > > >
> > >
> > > Yes, but I don't think this motivation is critical.
> >
> > I really disagree. As API maintainer, making every effort to keep the APIs
> clear
> > and consistent is a critical task for me.
> 
> New pps profile is also clear and simple.
> 
> > We don't want to proliferate the API
> > data structures and parameters if there is a good way to avoid it. Especially
> in
> > cases like this, when the drivers are just beginning to pick up this (still
> > experimental) API,  we have the rare chance to make things right and
> therefore
> > we should do it. Please also keep in mind that, as more feature are added
> to
> > the API, small corner cuts like this one that might not look like a big deal
> now,
> > eventually come back as unnecessary complexity in the drivers themselves.
> 
> I don't see a complexity in the current suggestion.
> 
> > So, please, let's try to keep the quality of the APIs high.
> 
> Also by this way is high.
> 
> 
> Look, the flag approach is also good and makes the job.
> The two approaches are clear, simple and in high quality.
> I don't care which one from the 2 to take but I want to be sure we are all
> understand the pros and cons.
> 
> If you understand my concern on flag approach and still insist to take the flag
> approach we will align.

Yes, thanks for summarizing the pros and cons, I confirm that I do understand your concerns.

Yes, sorry to disappoint you, I still think the packet_mode based approach is better for the long run, as it keeps the APIs clean and consistent. We are not adding new algorithms here, we're just adding a new mode to an existing algorithm, so IMO we should not duplicate configuration data structures and proliferate the number of algorithms artificially.

Yes, I do realize that in some limited cases the users will have to explicitly set the new packet_mode flag to zero or one, in case they need to enable the packet mode, but I think this is an acceptable cost because: (A) This API is clearly marked as experimental; (B) It is better to take a small incremental hit now to keep the APIs in good order rather than taking a bit hit in a few years as more features are added in the wrong way and the APIs become unmanageable.

> 
> And if we so, and we are going to break the API\ABI, we are going to
> introduce new meter policy API soon and there, breaking API can help, lets
> see in other discussion later.
> 

Yes, as you point out API changes are unavoidable as new features are added, we have to manage the API evolution correctly.

> One more point:
> Currently, the meter_id is managed by the user, I think it is better to let the
> PMDs to manage the meter_id.
> 
> Searching the PMD meter handler inside the PMD is very expensive for the
> API call rate when the meter_id is managed by the user.
> 
> Same for profile_id.
> 
> Also all the rte_flow API including the shared action API taking the PMD
> management approach.
> 
> What do you think?
> 

Yes, we have carefully considered and discussed both approaches a few years back when the API was introduced, this is not done by accident :), there are pros and cons for each of them.

If the object IDs are generated by the driver (outputs of the API), then it is the user application that needs to keep track of them, which can be very painful. Basically, for each API object the user application needs to create its own wrapper to store this ID. We basically transfer this problem to the user app.

If the object IDs are generated by the user application (inputs into the API), then we simplify the application by removing and indirection layer. Yes, it is true that this indirection layer now moves into the driver, but we should try to make the life easier for the appl developers as opposed to us, the driver developers. This indirection layer in the driver can be made a bit smarter than just a slow "for" loop; the search operation can be made faster with a small bit of effort, such as keeping this list sorted based on the object ID, splitting this list into buckets (similar to a hash table), etc, right?

Having the user app provide the object ID is especially important in the case of rte_tm API, where we have to deal with a tree of nodes, with thousands of nodes for each level. Having the app to store and manages this tree of IDs is a really bad idea, as the user app needs to mirror the tree of nodes on its side for no real benefit. As an added benefit, the user can generate these IDs using a rule, such as: given the specific path through the tree, the value of the ID can be computed.

But again, as you also mention above, there is a list of pros and cons for every approach, no approach is perfect. We took this approach for the good reasons listed above.

> > > > The flag proposal is actually reducing the amount of work that you
> > > > guys
> > > need to
> > > > do to implement your proposal. There is no negative impact to your
> > > proposal
> > > > and no big change, right?
> > >
> > > Yes you right, but the implementation effect is not our concern.
> > >
> > >
> > > > > > This is a quick summary of the required API changes to add
> > > > > > support for the packet mode, they are minimal:
> > > > > > a) Introduce the packet_mode flag in the profile parameters data
> > > > > structure.
> > > > > > b) Change the description (comment) of the rate and bucket size
> > > > > parameters in
> > > > > > the meter profile parameters data structures to reflect that
> > > > > > their values represents either bytes or packets, depending on
> > > > > > the value of the new flag packet_mode from the same structure.
> > > > > > c) Add the relevant capabilities: just search for "packet" in
> > > > > > the rte_tm.h capabilities data structures and apply the same to
> > > > > > the rte_mtr.h
> > > > > capabilities,
> > > > > > when applicable.
> > > > >
> > > > > > Regards,
> > > > > > Cristian
> > > >
> > > > Regards,
> > > > Cristian
> >
> > Regards,
> > Cristian

Regards,
Cristian


More information about the dev mailing list