[dpdk-dev] [RFC] Generic flow director/filtering/classification API

Lu, Wenzhuo wenzhuo.lu at intel.com
Thu Jul 21 05:18:11 CEST 2016


Hi Adrien,

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil at 6wind.com]
> Sent: Wednesday, July 20, 2016 6:41 PM
> To: Lu, Wenzhuo
> Cc: dev at dpdk.org; Thomas Monjalon; Zhang, Helin; Wu, Jingjing; Rasesh Mody;
> Ajit Khaparde; Rahul Lakkireddy; Jan Medala; John Daley; Chen, Jing D; Ananyev,
> Konstantin; Matej Vido; Alejandro Lucero; Sony Chacko; Jerin Jacob; De Lara
> Guarch, Pablo; Olga Shern
> Subject: Re: [RFC] Generic flow director/filtering/classification API
> 
> Hi Wenzhuo,
> 
> On Wed, Jul 20, 2016 at 02:16:51AM +0000, Lu, Wenzhuo wrote:
> [...]
> > > So, today an application cannot combine N-tuple and FDIR flow rules
> > > and get a reliable outcome, unless it is designed for specific
> > > devices with a known behavior.
> > >
> > > > What's the right behavior of the PMD if the application wants to create
> > > > a flow director rule
> > > with a higher or even equal priority compared to an existing N-tuple
> > > rule? Should the PMD return failure?
> > >
> > > First remember applications only deal with the generic API, PMDs are
> > > responsible for choosing the most appropriate HW implementation to
> > > use according to the requested flow rules (FDIR, N-tuple or anything else).
> > >
> > > For the specific case of FDIR vs N-tuple, if the underlying HW
> > > supports both I do not see why the PMD would create a N-tuple rule.
> > > Doesn't FDIR support everything N-tuple can do and much more?
> > Talking about the filters, FDIR can cover N-tuple; I think that's why i40e
> supports only FDIR and not N-tuple. But N-tuple has its own strength: at least
> on Intel NICs, FDIR only supports a per-device mask, while N-tuple supports a
> per-rule mask.
> > As every pattern has both a spec and a mask, we cannot guarantee the masks
> are the same. I think ixgbe will try N-tuple first if it can, because even if
> the masks differ, we can support them all.
> 
> OK, makes sense. In that case existing rules may indeed prevent subsequent
> ones from getting created if their priority is wrong. I do not think there is a way
> around that if the application needs this exact ordering.
Agreed. I don't see any workaround either; the PMD has to return failure sometimes.
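To picture what that failure could look like from the application side, here is
a sketch only: rte_flow_create() is assumed to take the same arguments as
rte_flow_validate() discussed below, return NULL on failure and report the
cause through rte_errno, none of which is settled yet.

  #include <errno.h>
  #include <stdint.h>
  #include <rte_errno.h>

  static struct rte_flow *
  app_try_create(uint8_t port_id,
                 const struct rte_flow_pattern *pattern,
                 const struct rte_flow_actions *actions)
  {
      struct rte_flow *flow = rte_flow_create(port_id, pattern, actions);

      if (flow == NULL && rte_errno == EEXIST) {
          /* Collision: e.g. an existing rule created through a higher
           * priority HW filter type prevents this one. The application
           * must drop the rule or destroy the conflicting one first. */
      }
      return flow;
  }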

> 
> > > Assuming such a thing happened anyway, that the PMD had to create a
> > > rule using a high priority filter type and that the application
> > > requests the creation of a rule that can only be done using a lower
> > > priority filter type, but also requested a higher priority for that rule, then yes,
> it should obviously fail.
> > >
> > > That is, unless the PMD can perform some kind of workaround to have both.
> > >
> > > > If so, do we need more failure reasons? According to this RFC, I
> > > > think we need to
> > > return "EEXIST: collision with an existing rule.", but that is not
> > > very clear; the application doesn't know the problem is the priority, so a
> more detailed reason would be helpful.
> > >
> > > Possibly, I've defined a basic set of errors, there are quite a
> > > number of errno values to choose from. However I think we should not
> define too many values.
> > > In my opinion the basic set covers every possible failure:
> > >
> > > - EINVAL: invalid format, rule is broken or cannot be understood by the PMD
> > >   anyhow.
> > >
> > > - ENOTSUP: pattern/actions look fine but something in the requested rule is
> > >   not supported and thus cannot be applied.
> > >
> > > - EEXIST: pattern/actions are fine and could have been applied if only some
> > >   other rule did not prevent the PMD from doing it (I see it as the closest thing
> > >   to "ETOOBAD" which unfortunately does not exist).
> > >
> > > - ENOMEM: like EEXIST, except it is due to the lack of resources not because
> > >   of another rule. I wasn't sure which of ENOMEM or ENOSPC was better but
> > >   settled on ENOMEM as it is well known. Still open to debate.
> > >
> > > Errno values are only useful to get a rough idea of the reason, and
> > > another mechanism is needed to pinpoint the exact problem for
> > > debugging/reporting purposes, something like:
> > >
> > >  enum rte_flow_error_type {
> > >      RTE_FLOW_ERROR_TYPE_NONE,
> > >      RTE_FLOW_ERROR_TYPE_UNKNOWN,
> > >      RTE_FLOW_ERROR_TYPE_PRIORITY,
> > >      RTE_FLOW_ERROR_TYPE_PATTERN,
> > >      RTE_FLOW_ERROR_TYPE_ACTION,
> > >  };
> > >
> > >  struct rte_flow_error {
> > >      enum rte_flow_error_type type;
> > >      void *offset; /* Points to the exact pattern item or action. */
> > >      const char *message;
> > >  };
> > When a CLI command fails, it normally tells us which parameter is
> > inappropriate. So I think it's a good idea to have this error
> > structure :)
> 
> Agreed.
> 
> > > Then either provide an optional struct rte_flow_error pointer to
> > > rte_flow_validate(), or a separate function (rte_flow_analyze()?),
> > > since processing this may be quite expensive and applications may
> > > not care about the exact reason.
> > Agreed, the processing may be too expensive, so maybe returning error details
> should be optional. And it's a good question what the application should do if
> creating the rule fails; I believe it will normally handle the failure by
> itself. But more feedback doesn't hurt, and if the application wants to adjust
> its rules, it cannot do so without that information.
> 
> All right then, I'll add it to the specification.
> 
>  int
>  rte_flow_validate(uint8_t port_id,
>                    const struct rte_flow_pattern *pattern,
>                    const struct rte_flow_actions *actions,
>                    struct rte_flow_error *error);
> 
> With error possibly NULL if the application does not care. Is it fine for you?
Yes, it looks good to me. Thanks for that :)
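A minimal usage sketch based on the prototype above (how much detail the PMD
fills into the error structure, and the nonzero return on failure, are
assumptions at this stage):

  #include <stdint.h>
  #include <stdio.h>

  static int
  app_check_rule(uint8_t port_id,
                 const struct rte_flow_pattern *pattern,
                 const struct rte_flow_actions *actions)
  {
      struct rte_flow_error error = { .type = RTE_FLOW_ERROR_TYPE_NONE };

      /* Passing NULL instead of &error would skip the (possibly
       * expensive) analysis. */
      if (rte_flow_validate(port_id, pattern, actions, &error)) {
          fprintf(stderr, "rule rejected: %s (error type %d)\n",
                  error.message ? error.message : "unspecified",
                  (int)error.type);
          return -1;
      }
      return 0;
  }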

> 
> [...]
> > > > > > > - PMDs, not applications, are responsible for maintaining flow rules
> > > > > > >   configuration when stopping and restarting a port or performing
> other
> > > > > > >   actions which may affect them. They can only be destroyed explicitly.
> > > > > > I don't understand "They can only be destroyed explicitly."
> > > > >
> > > > > This part says that as long as an application has not called
> > > > > rte_flow_destroy() on a flow rule, it never disappears, whatever
> > > > > happens to the port (stopped, restarted). The application is not
> > > > > responsible for re-creating rules after that.
> > > > >
> > > > > Note that according to the specification, this may translate to
> > > > > not being able to stop a port as long as a flow rule is present,
> > > > > depending on how nice the PMD intends to be with applications.
> > > > > Implementation can be done in small steps with minimal amount of
> > > > > code on
> > > the PMD side.
> > > > Does it mean the PMD should store and maintain all the rules? Why not
> > > > let RTE do
> > > that? If every PMD maintains all the rules, every kind of NIC needs its
> > > own copy of the code for them. But if RTE does it, only one copy of the
> > > code needs to be maintained, right?
> > >
> > > I've considered having rules stored in a common format understood at
> > > the RTE level and not specific to each PMD and decided that the
> > > opaque rte_flow pointer was a better choice for the following reasons:
> > >
> > > - Even though flow rules management is done in the control path, processing
> > >   must be as fast as possible. Letting PMDs store flow rules using their own
> > >   internal representation gives them the chance to achieve better
> > >   performance.
> > I don't quite understand. I think we're talking about maintaining the rules in
> SW; I don't think anything needs to be optimized for specific NICs. If we need
> to optimize the code, we should consider the CPU, OS... and other common
> factors. Am I wrong?
> 
> Perhaps we were talking about different things, here I was only explaining why
> rte_flow (the result of creating a flow rule) should be opaque and fully managed
> by the PMD. More on the SW side of things below.
> 
> > > - An opaque context managed by PMDs would probably have to be stored
> > >   somewhere as well anyway.
> > >
> > > - PMDs may not need to allocate/store anything at all if they exclusively
> > >   rely on HW state for everything. In my opinion, the generic API has enough
> > >   constraints for this to work and maintain consistency between flow
> > >   rules. Note this is currently how most PMDs implement FDIR and other
> > >   filter types.
> > Yes, the rules are stored in HW. But across a device stop/start, the rules
> in HW will be lost, so we have to store the rules in SW and re-program them
> when restarting the device.
> 
> Assume a HW capable of keeping flow rules programmed even during a
> stop/start cycle (e.g. mlx4/mlx5 may be able to do it from DPDK point of view),
> don't you think it is more efficient to standardize on this behavior and let PMDs
> restore flow rules for HW that do not support it regardless of whether it would
> be done by RTE or the application (SW)?
I didn't know that. Since some NICs already keep their rules across a stop/start cycle, maybe it will become a trend :)

> 
> > In the existing code we already store the filters in SW, at least on Intel
> NICs. But I don't think we can reuse that, because considering priorities and
> which category of filter should be chosen, we need a whole new table for the
> generic API. I think that's what is designed now, right?
> 
> So I understand you'd want RTE to help your PMD keep track of the flow rules it
> created?
Yes. But as you said before, it's not a good idea for mlx4/mlx5, because their HW doesn't need SW to re-program the rules after a stop/start. Making it a common mechanism would just waste time on mlx4/mlx5.

> 
> Nothing wrong with that, all I'm saying is that it should be entirely optional. RTE
> should not automatically maintain a list. PMDs have to call RTE helpers if they
> need help to maintain a context. These helpers are not defined in this API yet
> because it is difficult to know what will be useful in advance.
> 
> > > - RTE can (and will) provide helpers to avoid most of the code redundancy,
> > >   PMDs are free to use them or manage everything by themselves.
> > >
> > > - Given that the opaque rte_flow pointer associated with a flow rule is to
> > >   be stored by the application, PMDs do not even have to keep references to
> > >   them.
> > I don't understand. Could you give more details?
> 
> In an application:
> 
>  rte_flow *foo = rte_flow_create(...);
> 
> In the above example, foo cannot be dereferenced by the application or RTE;
> only the PMD is aware of its contents. This object can only be used with
> rte_flow*() functions.
> 
> PMDs are thus free to make this object grow as needed when adding internal
> features without breaking any kind of public API/ABI.
> 
> What I meant is, given that the application is supposed to store foo somewhere
> in order to destroy it later, the PMD does not have to keep track of that pointer
> assuming it does not need to access it later on its own for some reason.
> 
> > > - The flow rules format described in this specification (pattern / actions)
> > >   will be used by applications directly, and will be free to arrange them in
> > >   lists, trees or in any other way if they need to keep flow specifications
> > >   around for further processing.
> > Who will create the lists, trees or whatever else? From the previous
> discussion, I understand the application programs the rules one by one, so if
> the application organizes the rules into lists, trees..., the PMD doesn't know
> about it.
> > And you said "Given that the opaque rte_flow pointer associated with a flow
> rule is to be stored by the application". I'm lost here.
> 
> I guess that's because we're discussing two different things, flow rule
> specifications and flow rule objects. Let me sum it up:
> 
> - Flow rule specifications are the patterns/actions combinations provided by
>   applications to rte_flow_create(). Applications can store those as needed
>   and organize them as they wish (hash, tree, list). Neither PMDs nor RTE
>   will do it for them.
> 
> - Flow rule objects (struct rte_flow *) are generated when a flow rule is
>   created. Applications must keep these around if they want to manipulate
>   them later (i.e. destroy or query existing rules).
Thanks for this clarification. So the specifications can differ from the objects, right? The specifications are what the application asks for, while the objects are what it actually gets, since rte_flow_create() can fail. Right?
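If I read it correctly, in code terms it would be something like this (a
sketch, with the pattern/actions contents omitted):

  /* The specification: built and kept by the application as it wishes. */
  struct rte_flow_pattern pattern;  /* filled with pattern items */
  struct rte_flow_actions actions;  /* filled with actions */

  /* The object: what the application actually gets if creation
   * succeeds; opaque, only usable with rte_flow_*() functions. */
  struct rte_flow *flow = rte_flow_create(port_id, &pattern, &actions);
  if (flow == NULL) {
      /* No object exists; only the specification remains. */
  }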

> 
> Then PMDs *may* need to keep and arrange flow rule objects internally for
> management purposes. Could be because HW requires it, detecting conflicting
> rules, managing priorities and so on. Possible reasons are not described in this
> API because these are considered PMD-specific needs.
Got it.

> 
> > > > When the port is stopped and restarted, RTE can reconfigure the
> > > > rules. Is the
> > > concern that the PMD may reorder the rules according to priority, so
> > > every NIC ends up with a different list of rules? The PMD can adjust
> > > them again when RTE reconfigures the rules.
> > >
> > > What about PMDs able to stop and restart ports without destroying
> > > their own flow rules? If we assume flow rules must be destroyed when
> > > stopping a port, these PMDs are needlessly penalized with slower
> > > stop/start cycles. Think about it assuming thousands of flow rules.
> > I believe the rules maintained in SW should not be destroyed, because they
> are used to re-program the device when it starts again.
> 
> Do we agree that applications should not care? Flow rules configured before
> stopping a port must still be there after restarting it.
Yes, agree.

> 
> What we seem to not agree about is that you think RTE should be responsible
> for restoring flow rules of devices that lose them when stopped. I think doing so
> is unfair to devices for which it is not the case and not really nice to applications,
> so my opinion is that the PMD is responsible for restoring flow rules however it
> wants. It is free to use RTE helpers to keep their track, as long as it's all managed
> internally.
What I had in mind is that RTE could store the flow rules and recreate them after a restart, in a function like rte_eth_dev_start(), so the application knows nothing about it. But from the discussion above, I think the design doesn't support that, right?
RTE doesn't store the flow rule objects, and even if it did, there is no designed way to re-program them. Also, considering that some HW doesn't need to be re-programmed, I think it's OK to let the PMD maintain the rules, since re-programming is a NIC-specific requirement.
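For HW that does lose its rules, I imagine the PMD-internal side could look
roughly like this (a sketch only; the names and the TAILQ-based list are
illustrative, not part of the proposed API):

  #include <sys/queue.h>

  /* Hypothetical PMD-private bookkeeping, invisible to applications. */
  struct pmd_flow {
      TAILQ_ENTRY(pmd_flow) next;
      /* HW-specific rule state kept for re-programming goes here. */
  };

  TAILQ_HEAD(pmd_flow_list, pmd_flow);

  /* Called from the PMD's dev_start path: HW that loses its rules on a
   * stop/start cycle replays them here; HW that keeps them skips this. */
  static void
  pmd_replay_flows(struct pmd_flow_list *list)
  {
      struct pmd_flow *flow;

      TAILQ_FOREACH(flow, list, next) {
          /* re-program this rule into HW */
      }
  }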

> 
> > > Thus from an application point of view, whatever happens when
> > > stopping and restarting a port should not matter. If a flow rule was
> > > present before, it must still be present afterwards. If the PMD had
> > > to destroy flow rules and re-create them, it does not actually matter if they
> differ slightly at the HW level, as long as:
> > >
> > > - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
> > >   and refer to the same rules.
> > >
> > > - The overall behavior of all rules is the same.
> > >
> > > The list of rules you think of (patterns / actions) is maintained by
> > > applications (not RTE), and only if they need them. RTE would needlessly
> duplicate this.
> > As said before, I need more details to understand this. Maybe an example
> > would be better :)
> 
> The generic format both RTE and applications might understand is the one
> described in this API (struct rte_flow_pattern and struct rte_flow_actions).
> 
> If we wanted RTE to maintain some sort of per-port state for flow rule
> specifications, it would have to be a copy of these structures arranged somehow
> (list or something else).
> 
> If we consider that PMDs need to keep a context object associated to a flow
> rule (the opaque struct rte_flow *), then RTE would most likely have to store it
> along with the flow specification.
> 
> Such a list may not be useful to applications (list lookups take time), so they
> would implement their own redundant method. They might also require extra
> room to attach some application context to flow rules. A generic list cannot plan
> for it.
> 
> Applications know what they want to do with flow rules and are responsible for
> managing them efficiently with RTE out of the way.
> 
> I'm not sure if this answered your question, if not, please describe a scenario
> where a RTE-managed list of flow rules would be mandatory.
Got your point and agree :)
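Right, e.g. the application can wrap flow rules with whatever context it needs,
something a generic RTE-managed list could not plan for (hypothetical names
throughout):

  #include <stdint.h>

  /* Application-defined wrapper around the opaque flow rule object. */
  struct app_rule {
      struct rte_flow *flow;   /* opaque object from rte_flow_create() */
      uint16_t target_queue;   /* application-specific context */
      struct app_rule *next;   /* arrangement chosen by the application */
  };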

> 
> --
> Adrien Mazarguil
> 6WIND

