[dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile

Matan Azrad matan at nvidia.com
Thu Mar 4 07:34:18 CET 2021


Hi Cristian

From: Dumitrescu, Cristian
> Hi Matan,
> 
> > -----Original Message-----
> > From: Matan Azrad <matan at nvidia.com>
> > Sent: Tuesday, March 2, 2021 6:10 PM
> > To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; Li Zhang
> > <lizh at nvidia.com>; Dekel Peled <dekelp at nvidia.com>; Ori Kam
> > <orika at nvidia.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> > Cc: dev at dpdk.org; NBU-Contact-Thomas Monjalon <thomas at monjalon.net>;
> > Raslan Darawsheh <rasland at nvidia.com>; mb at smartsharesystems.com;
> > ajit.khaparde at broadcom.com; Yigit, Ferruh <ferruh.yigit at intel.com>;
> > Singh, Jasvinder <jasvinder.singh at intel.com>
> > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> >
> > Hi Cristian
> >
> > Good discussion, thank you for that!
> >
> > From: Dumitrescu, Cristian
> > > Hi Matan,
> > >
> > > > -----Original Message-----
> > > > From: Matan Azrad <matan at nvidia.com>
> > > > Sent: Tuesday, March 2, 2021 12:37 PM
> > > > To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; Li Zhang
> > > > <lizh at nvidia.com>; Dekel Peled <dekelp at nvidia.com>; Ori Kam
> > > > <orika at nvidia.com>; Slava Ovsiienko <viacheslavo at nvidia.com>
> > > > Cc: dev at dpdk.org; NBU-Contact-Thomas Monjalon
> > <thomas at monjalon.net>;
> > > > Raslan Darawsheh <rasland at nvidia.com>; mb at smartsharesystems.com;
> > > > ajit.khaparde at broadcom.com; Yigit, Ferruh
> > > > <ferruh.yigit at intel.com>; Singh, Jasvinder
> > > > <jasvinder.singh at intel.com>
> > > > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> > > >
> > > > Hi Cristian
> > > >
> > > > From: Dumitrescu, Cristian
> > > > > Hi Matan,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Matan Azrad <matan at nvidia.com>
> > > > > > Sent: Tuesday, March 2, 2021 7:02 AM
> > > > > > To: Dumitrescu, Cristian <cristian.dumitrescu at intel.com>; Li
> > > > > > Zhang <lizh at nvidia.com>; Dekel Peled <dekelp at nvidia.com>; Ori
> > > > > > Kam <orika at nvidia.com>; Slava Ovsiienko
> > > > > > <viacheslavo at nvidia.com>
> > > > > > Cc: dev at dpdk.org; NBU-Contact-Thomas Monjalon
> > > > <thomas at monjalon.net>;
> > > > > > Raslan Darawsheh <rasland at nvidia.com>;
> > mb at smartsharesystems.com;
> > > > > > ajit.khaparde at broadcom.com; Yigit, Ferruh
> > > > > > <ferruh.yigit at intel.com>; Singh, Jasvinder
> > > > > > <jasvinder.singh at intel.com>
> > > > > > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS
> > > > > > profile
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hi Cristian
> > > > > >
> > > > > > Thank you for review, please see inline.
> > > > > >
> > > > > > From: Dumitrescu, Cristian
> > > > > > > > From: dev <dev-bounces at dpdk.org> On Behalf Of Li Zhang
> > > > > > <snip>
> > > > > > > We had this same problem earlier for the rte_tm.h API, where
> > > > > > > people asked to add support for WRED and shaper rates specified
> > > > > > > in packets to the existing byte rate support. I am more than
> > > > > > > happy to support adding the same here, but please let's adopt
> > > > > > > the same solution here rather than invent a different approach.
> > > > > > >
> > > > > > > Please refer to struct rte_tm_wred_params and struct
> > > > > > > rte_tm_shaper_params from rte_tm.h: the packets vs. bytes mode
> > > > > > > is explicitly specified through the use of a flag called
> > > > > > > packet_mode that is added to the WRED and shaper profile. When
> > > > > > > packet_mode is 0, the profile rates and bucket sizes are
> > > > > > > specified in bytes per second and bytes, respectively; when
> > > > > > > packet_mode is not 0, the profile rates and bucket sizes are
> > > > > > > specified in packets per second and packets, respectively. The
> > > > > > > same profile parameters are used, no need to invent additional
> > > > > > > algorithms (such as srTCM - packet mode) or profile data
> > > > > > > structures. Can we do the same here, please?
> > > > > >
> > > > > > The flag approach is a very intuitive suggestion and it has
> > > > > > advantages.
> > > > > >
> > > > > > The main problem with the flag approach is that it breaks the ABI
> > > > > > and the API.
> > > > > > The profile structure size is changed due to a new field - ABI
> > > > > > breakage.
> > > > > > The user must initialize the flag to zero to get the old
> > > > > > behavior - API breakage.
> > > > > >
> > > > >
> > > > > The rte_mtr API is experimental; all the API functions are
> > > > > correctly marked with __rte_experimental in the rte_mtr.h file, so
> > > > > we can safely change the API, and the ABI breakage is not
> > > > > applicable here. Therefore, this problem does not exist, correct?
> > > >
> > > > Yes, but still, the meter API is not new, and I know that a lot of
> > > > users have been using it for a long time.
> > > > Forcing them to change while we have a good solution that doesn't
> > > > force it looks problematic to me.
> > > >
> > >
> > > Not really, only 3 drivers are currently implementing this API.
> >
> > The user is not the PMD; the PMDs are the providers.
> > I'm talking about all our customers, all the current DPDK-based
> > applications like OVS and others (I am familiar with at least 4 ConnectX
> > customer applications) which use the meter API, and I'm sure there are
> > more around the world.
> >
> > > Even for these drivers, the required changes are none or extremely
> > > small: as Ajit was also noting, as the default value of 0 continues to
> > > represent the existing byte mode, all you have to do is make sure the
> > > new flag is set to zero in the profile params structure, which is
> > > already done implicitly in most places as this structure is initialized
> > > to all-zeros.
> >
> > Are you sure everyone in the world initializes the struct to 0? Also, in
> > this case, without recompilation, not all of the struct will be
> > zeroed (the old size is smaller).
> >
> > > A simple search exercise for struct rte_mtr_meter_profile is all that
> > > is needed. You also agreed the flag approach is very intuitive, hence
> > > better and nicer, with no additional work needed for you, so why not
> > > do it?
> >
> > Do you understand that any current application that uses the meter API
> > must recompile the application code? Some of them will probably also
> > need to set the flag to 0...
> > Do you also understand the potential issues for applications that are
> > not aware of the change? Debug time, etc...
> >
> > > > > > I don't see issues with Li's suggestion. Do you think Li's
> > > > > > suggestion has critical issues?
> > > > >
> > > > > It is probably better to keep the rte_mtr and the rte_tm APIs
> > > > > aligned; it simplifies the code maintenance and improves the user
> > > > > experience, which always pays off in the long run. Both APIs
> > > > > configure token buckets in either packet mode or byte mode, and it
> > > > > is desirable to have them work in the same way. Also, I think we
> > > > > should avoid duplicating configuration data structures to support
> > > > > essentially the same algorithms (such as srTCM or trTCM) if we can.
> > > > >
> > > >
> > > > Yes, but I don't think this motivation is critical.
> > >
> > > I really disagree. As API maintainer, making every effort to keep the
> > > APIs clear and consistent is a critical task for me.
> >
> > The new PPS profile is also clear and simple.
> >
> > > We don't want to proliferate the API data structures and parameters
> > > if there is a good way to avoid it. Especially in cases like this, when
> > > the drivers are just beginning to pick up this (still experimental)
> > > API, we have the rare chance to make things right and therefore we
> > > should do it. Please also keep in mind that, as more features are
> > > added to the API, small corner cuts like this one that might not look
> > > like a big deal now eventually come back as unnecessary complexity in
> > > the drivers themselves.
> >
> > I don't see a complexity in the current suggestion.
> >
> > > So, please, let's try to keep the quality of the APIs high.
> >
> > It is high this way as well.
> >
> >
> > Look, the flag approach is also good and does the job.
> > Both approaches are clear, simple, and of high quality.
> > I don't care which of the two we take, but I want to be sure we all
> > understand the pros and cons.
> >
> > If you understand my concerns about the flag approach and still insist
> > on taking the flag approach, we will align.
> 
> Yes, thanks for summarizing the pros and cons, I confirm that I do understand
> your concerns.
> 
> Yes, sorry to disappoint you, I still think the packet_mode based approach is
> better for the long run, as it keeps the APIs clean and consistent. We are not
> adding new algorithms here, we're just adding a new mode to an existing
> algorithm, so IMO we should not duplicate configuration data structures and
> proliferate the number of algorithms artificially.

Actually, a PPS meter is a new algorithm - you can see that the current algorithms' RFCs don't talk about PPS.

> Yes, I do realize that in some limited cases the users will have to explicitly set
> the new packet_mode flag to zero or one, in case they need to enable the
> packet mode, but I think this is an acceptable cost because: (A) This API is
> clearly marked as experimental; (B) It is better to take a small incremental hit
> now to keep the APIs in good order rather than taking a big hit in a few years as
> more features are added in the wrong way and the APIs become
> unmanageable.

I don't think the current suggestion goes the wrong way.
In any case, if you insist, we will align.

> > And if so, since we are going to break the API/ABI anyway: we are going
> > to introduce a new meter policy API soon, and there, breaking the API
> > can help; let's discuss that separately later.
> >
> 
> Yes, as you point out, API changes are unavoidable as new features are
> added; we have to manage the API evolution correctly.
> 
> > One more point:
> > Currently, the meter_id is managed by the user; I think it is better
> > to let the PMDs manage the meter_id.
> >
> > Searching the PMD meter handler inside the PMD is very expensive for
> > the API call rate when the meter_id is managed by the user.
> >
> > Same for profile_id.
> >
> > Also, all the rte_flow API, including the shared action API, takes the
> > PMD management approach.
> >
> > What do you think?
> >
> 
> Yes, we have carefully considered and discussed both approaches a few years
> back when the API was introduced, this is not done by accident :), there are
> pros and cons for each of them.
> 
> If the object IDs are generated by the driver (outputs of the API), then it is the
> user application that needs to keep track of them, which can be very painful.
> Basically, for each API object the user application needs to create its own
> wrapper to store this ID. We basically transfer this problem to the user app.

Not exactly, or not for all: the app would get the meter_id at the same point where it currently decides it.
 
> If the object IDs are generated by the user application (inputs into the API),
> then we simplify the application by removing an indirection layer. Yes, it is
> true that this indirection layer now moves into the driver, but we should try to
> make life easier for the application developers as opposed to us, the driver
> developers. This indirection layer in the driver can be made a bit smarter than
> just a slow "for" loop; the search operation can be made faster with a small bit
> of effort, such as keeping this list sorted based on the object ID, splitting this list
> into buckets (similar to a hash table), etc, right?

Yes, there are even better solutions than a hash table from a "rate" perspective.

But any solution costs a lot of memory just for this mapping...
When we talk about 4M meters supported (in the next mlx5 release), it becomes an issue.
 
> Having the user app provide the object ID is especially important in the case of
> rte_tm API, where we have to deal with a tree of nodes, with thousands of
> nodes for each level. Having the app store and manage this tree of IDs is a
> really bad idea, as the user app needs to mirror the tree of nodes on its side for
> no real benefit. As an added benefit, the user can generate these IDs using a
> rule, such as: given the specific path through the tree, the value of the ID can
> be computed.

rte_tm is not rte_mtr - I think the meter is different and used differently.
For example, as far as I know, none of our DPDK meter customers (at least 5) use trees for meter management. OVS, for example, just randomizes some meter_id and doesn't care about it.

Also, all the rte_flow API basics work with the PMD ID/handle management approach.

> But again, as you also mention above, there is a list of pros and cons for every
> approach, no approach is perfect. We took this approach for the good reasons
> listed above.

If you are familiar with tree usage for meters, maybe we can easily combine the two approaches on this topic:

The meter_id argument can be passed by reference: if it is 0, the PMD sets it; if not, the PMD uses it.

> > > > > The flag proposal is actually reducing the amount of work that you
> > > > > guys need to do to implement your proposal. There is no negative
> > > > > impact to your proposal and no big change, right?
> > > >
> > > > Yes, you are right, but the implementation effort is not our concern.
> > > >
> > > >
> > > > > > > This is a quick summary of the required API changes to add
> > > > > > > support for the packet mode; they are minimal:
> > > > > > > a) Introduce the packet_mode flag in the profile parameters
> > > > > > > data structure.
> > > > > > > b) Change the description (comment) of the rate and bucket size
> > > > > > > parameters in the meter profile parameters data structures to
> > > > > > > reflect that their values represent either bytes or packets,
> > > > > > > depending on the value of the new packet_mode flag from the
> > > > > > > same structure.
> > > > > > > c) Add the relevant capabilities: just search for "packet" in
> > > > > > > the rte_tm.h capabilities data structures and apply the same to
> > > > > > > the rte_mtr.h capabilities, when applicable.
> > > > > >
> > > > > > > Regards,
> > > > > > > Cristian
> > > > >
> > > > > Regards,
> > > > > Cristian
> > >
> > > Regards,
> > > Cristian
> 
> Regards,
> Cristian

