[dpdk-dev] [RFC] Generic flow director/filtering/classification API

Adrien Mazarguil adrien.mazarguil at 6wind.com
Wed Jul 20 12:41:15 CEST 2016


Hi Wenzhuo,

On Wed, Jul 20, 2016 at 02:16:51AM +0000, Lu, Wenzhuo wrote:
[...]
> > So, today an application cannot combine N-tuple and FDIR flow rules and get a
> > reliable outcome, unless it is designed for specific devices with a known
> > behavior.
> > 
> > > What's the right behavior of the PMD if the APP wants to create a flow
> > > director rule which has a higher or even equal priority compared to an
> > > existing n-tuple rule? Should the PMD return failure?
> > 
> > First remember that applications only deal with the generic API; PMDs are
> > responsible for choosing the most appropriate HW implementation to use
> > according to the requested flow rules (FDIR, N-tuple or anything else).
> > 
> > For the specific case of FDIR vs N-tuple, if the underlying HW supports both I do
> > not see why the PMD would create an N-tuple rule. Doesn't FDIR support
> > everything N-tuple can do and much more?
> Talking about the filters, fdir can cover n-tuple. I think that's why i40e only supports fdir and not n-tuple. But n-tuple has its own strength: as far as I know, at least on Intel NICs, fdir only supports a per-device mask, while n-tuple supports a per-rule mask.
> As every pattern has both a spec and a mask, we cannot guarantee the masks are the same. I think ixgbe will try to use n-tuple first when it can, because even when the masks differ, we can support them all.

OK, makes sense. In that case existing rules may indeed prevent subsequent
ones from getting created if their priority is wrong. I do not think there
is a way around that if the application needs this exact ordering.
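
To illustrate per-rule masks for readers following the thread: each rule
carries its own mask next to its spec, so two rules can match the same field
with different widths, which a single per-device mask cannot express. A
minimal sketch (the ipv4_match layout is invented for this example, it is
not a structure from this RFC):

 #include <stdint.h>

 struct ipv4_match {
     uint32_t src_addr;
 };

 /* Rule A: match 10.0.0.1 exactly (per-rule /32 mask). */
 struct ipv4_match spec_a = { .src_addr = 0x0a000001 };
 struct ipv4_match mask_a = { .src_addr = 0xffffffff };

 /* Rule B: match all of 192.168.0.0/16 (per-rule /16 mask). */
 struct ipv4_match spec_b = { .src_addr = 0xc0a80000 };
 struct ipv4_match mask_b = { .src_addr = 0xffff0000 };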

> > Assuming such a thing happened anyway, that the PMD had to create a rule
> > using a high-priority filter type and that the application then requested
> > the creation of a rule that can only be done using a lower-priority filter
> > type, while also requesting a higher priority for that rule, then yes, it
> > should obviously fail.
> > 
> > That is, unless the PMD can perform some kind of workaround to have both.
> > 
> > > If so, do we need more failure reasons? According to this RFC, I think we
> > > would return "EEXIST: collision with an existing rule.", but it's not very
> > > clear; the APP doesn't know the problem is the priority, so maybe a more
> > > detailed reason would be helpful.
> > 
> > Possibly. I've defined a basic set of errors; there are quite a number of
> > errno values to choose from. However, I think we should not define too many
> > values. In my opinion the basic set covers every possible failure:
> > 
> > - EINVAL: invalid format, rule is broken or cannot be understood by the PMD
> >   anyhow.
> > 
> > - ENOTSUP: pattern/actions look fine but something in the requested rule is
> >   not supported and thus cannot be applied.
> > 
> > - EEXIST: pattern/actions are fine and could have been applied if only some
> >   other rule had not prevented the PMD from doing it (I see it as the
> >   closest thing to "ETOOBAD" which unfortunately does not exist).
> > 
> > - ENOMEM: like EEXIST, except it is due to the lack of resources not because
> >   of another rule. I wasn't sure which of ENOMEM or ENOSPC was better but
> >   settled on ENOMEM as it is well known. Still open to debate.
> > 
> > Errno values are only useful to get a rough idea of the reason, and another
> > mechanism is needed to pinpoint the exact problem for debugging/reporting
> > purposes, something like:
> > 
> >  enum rte_flow_error_type {
> >      RTE_FLOW_ERROR_TYPE_NONE,
> >      RTE_FLOW_ERROR_TYPE_UNKNOWN,
> >      RTE_FLOW_ERROR_TYPE_PRIORITY,
> >      RTE_FLOW_ERROR_TYPE_PATTERN,
> >      RTE_FLOW_ERROR_TYPE_ACTION,
> >  };
> > 
> >  struct rte_flow_error {
> >      enum rte_flow_error_type type;
> >      void *offset; /* Points to the exact pattern item or action. */
> >      const char *message;
> >  };
> When we are using a CLI and it fails, normally it will let us know which parameter is not appropriate. So, I think it’s a good idea to have this error structure :)

Agreed.

> > Then either provide an optional struct rte_flow_error pointer to
> > rte_flow_validate(), or a separate function (rte_flow_analyze()?), since
> > processing this may be quite expensive and applications may not care about the
> > exact reason.
> Agree the processing may be too expensive. Maybe we can say returning error details is optional. And that's a good question, what the APP should do if creating the rule fails. I believe normally it will choose to handle the rule by itself, but I think it's not bad to feed back more. And even if the APP wanted to adjust the rules, that would not be an option for lack of info.

All right then, I'll add it to the specification.

 int
 rte_flow_validate(uint8_t port_id,
                   const struct rte_flow_pattern *pattern,
                   const struct rte_flow_actions *actions,
                   struct rte_flow_error *error);

With error possibly NULL if the application does not care. Is that fine with
you?
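
For completeness, a usage sketch assuming the rte_flow_error layout quoted
earlier in this thread (the error handling is only illustrative):

 #include <stdio.h>

 /* Inside some application function; port_id, pattern and actions are
  * assumed to be defined as in the prototype above. */
 struct rte_flow_error error;
 int ret;

 /* Fast path: the application only wants a yes/no answer. */
 ret = rte_flow_validate(port_id, pattern, actions, NULL);

 /* Debug path: ask the PMD to pinpoint the offending part. */
 if (ret != 0 && rte_flow_validate(port_id, pattern, actions, &error) != 0)
     printf("rule rejected (type %d): %s\n",
            (int)error.type, error.message ? error.message : "unknown");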

[...]
> > > > > > - PMDs, not applications, are responsible for maintaining flow rules
> > > > > >   configuration when stopping and restarting a port or performing other
> > > > > >   actions which may affect them. They can only be destroyed explicitly.
> > > > > I don't understand "They can only be destroyed explicitly."
> > > >
> > > > This part says that as long as an application has not called
> > > > rte_flow_destroy() on a flow rule, it never disappears, whatever
> > > > happens to the port (stopped, restarted). The application is not
> > > > responsible for re-creating rules after that.
> > > >
> > > > Note that according to the specification, this may translate to not
> > > > being able to stop a port as long as a flow rule is present,
> > > > depending on how nice the PMD intends to be with applications.
> > > > Implementation can be done in small steps with a minimal amount of code
> > > > on the PMD side.
> > > Does it mean the PMD should store and maintain all the rules? Why not let
> > > rte do that? I think if the PMD maintains all the rules, it means every
> > > kind of NIC needs its own copy of the rule-management code. But if rte
> > > does that, only one copy of the code needs to be maintained, right?
> > 
> > I've considered having rules stored in a common format understood at the RTE
> > level and not specific to each PMD and decided that the opaque rte_flow pointer
> > was a better choice for the following reasons:
> > 
> > - Even though flow rules management is done in the control path, processing
> >   must be as fast as possible. Letting PMDs store flow rules using their own
> >   internal representation gives them the chance to achieve better
> >   performance.
> I don't quite understand. I think we're talking about maintaining the rules in SW. I don't think there's anything that needs to be optimized according to specific NICs. If we need to optimize the code, I think we need to consider the CPU, the OS ... and other common factors. Am I wrong?

Perhaps we were talking about different things; here I was only explaining
why rte_flow (the result of creating a flow rule) should be opaque and fully
managed by the PMD. More on the SW side of things below.

> > - An opaque context managed by PMDs would probably have to be stored
> >   somewhere as well anyway.
> > 
> > - PMDs may not need to allocate/store anything at all if they exclusively
> >   rely on HW state for everything. In my opinion, the generic API has enough
> >   constraints for this to work and maintain consistency between flow
> >   rules. Note this is currently how most PMDs implement FDIR and other
> >   filter types.
> Yes, the rules are stored by HW. But when stopping/starting the device, the rules in HW will be lost. We have to store the rules in SW and re-program them when restarting the device.

Assume HW capable of keeping flow rules programmed even across a stop/start
cycle (e.g. mlx4/mlx5 may be able to do it from a DPDK point of view). Don't
you think it is more efficient to standardize on this behavior and let PMDs
restore flow rules for HW that does not support it, rather than having RTE
or the application (SW) do it?

> And in the existing code, we store the filters in SW, at least on Intel NICs. But I think we cannot reuse that, because considering the priority and which category of filter should be chosen, I think we need a whole new table for the generic API. I think that's what's designed now, right?

So I understand you'd want RTE to help your PMD keep track of the flow rules
it created?

Nothing wrong with that, all I'm saying is that it should be entirely
optional. RTE should not automatically maintain a list. PMDs have to call
RTE helpers if they need help to maintain a context. These helpers are not
defined in this API yet because it is difficult to know what will be useful
in advance.
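
To make the idea concrete, such helpers might end up looking like the
following; names and signatures are entirely hypothetical since, as said,
nothing like them is defined yet:

 #include <stdint.h>

 /* Register/unregister a flow rule object with an RTE-maintained,
  * per-port context (hypothetical). */
 int rte_flow_ctx_register(uint8_t port_id, struct rte_flow *flow);
 int rte_flow_ctx_unregister(uint8_t port_id, struct rte_flow *flow);

 /* Iterate over registered rules, e.g. to re-program HW after a
  * restart; NULL starts and ends the iteration (hypothetical). */
 struct rte_flow *rte_flow_ctx_next(uint8_t port_id, struct rte_flow *prev);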

> > - RTE can (and will) provide helpers to avoid most of the code redundancy,
> >   PMDs are free to use them or manage everything by themselves.
> > 
> > - Given that the opaque rte_flow pointer associated with a flow rule is to
> >   be stored by the application, PMDs do not even have to keep references to
> >   them.
> Don’t understand. More details?

In an application:

 struct rte_flow *foo = rte_flow_create(...);

In the above example, foo can be dereferenced neither by the application nor
by RTE; only the PMD is aware of its contents. This object can only be used
with rte_flow*() functions.

PMDs are thus free to make this object grow as needed when adding internal
features without breaking any kind of public API/ABI.

What I meant is, given that the application is supposed to store foo
somewhere in order to destroy it later, the PMD does not have to keep track
of that pointer assuming it does not need to access it later on its own for
some reason.
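
As an illustration, a PMD could define the object internally along these
lines (a hypothetical layout, each driver would use its own):

 #include <stdint.h>
 #include <sys/queue.h>

 /* Since the public header only exposes an opaque "struct rte_flow *",
  * the PMD may add, remove or reorder fields at will without breaking
  * the public API/ABI. */
 struct rte_flow {
     LIST_ENTRY(rte_flow) next; /* optional internal tracking */
     uint32_t hw_rule_id;       /* handle programmed into HW */
     uint32_t priority;
 };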

> > - The flow rules format described in this specification (pattern / actions)
> >   will be used by applications directly, and will be free to arrange them in
> >   lists, trees or in any other way if they need to keep flow specifications
> >   around for further processing.
> Who will create the lists, trees or whatever else? According to the previous discussion, I think the APP will program the rules one by one. So if the APP organizes the rules into lists, trees..., the PMD doesn't know that.
> And you said "Given that the opaque rte_flow pointer associated with a flow rule is to be stored by the application". I'm lost here.

I guess that's because we're discussing two different things, flow rule
specifications and flow rule objects. Let me sum it up:

- Flow rule specifications are the patterns/actions combinations provided by
  applications to rte_flow_create(). Applications can store those as needed
  and organize them as they wish (hash, tree, list). Neither PMDs nor RTE
  will do it for them.

- Flow rule objects (struct rte_flow *) are generated when a flow rule is
  created. Applications must keep these around if they want to manipulate
  them later (i.e. destroy or query existing rules).

Then PMDs *may* need to keep and arrange flow rule objects internally for
management purposes: because HW requires it, to detect conflicting rules, to
manage priorities and so on. Possible reasons are not described in this API
because they are thought of as PMD-specific needs.
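
A short sketch of the distinction, assuming rte_flow_create() mirrors the
rte_flow_validate() prototype above (app_rule is an invented
application-side structure):

 /* Inside some application function. The specification is
  * application-owned and reusable; the object is an opaque,
  * PMD-owned handle. */
 struct app_rule {
     const struct rte_flow_pattern *pattern; /* specification */
     const struct rte_flow_actions *actions; /* specification */
     struct rte_flow *flow;                  /* object */
 };

 struct app_rule rule = { .pattern = pattern, .actions = actions };

 rule.flow = rte_flow_create(port_id, rule.pattern, rule.actions);
 if (rule.flow == NULL)
     return -1; /* the specification remains available for analysis */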

> > > When the port is stopped and restarted, rte can reconfigure the rules. Is
> > > the concern that the PMD may adjust the sequence of the rules according to
> > > the priority, so every NIC has a different list of rules? But the PMD can
> > > adjust them again when rte reconfigures the rules.
> > 
> > What about PMDs able to stop and restart ports without destroying their own
> > flow rules? If we assume flow rules must be destroyed when stopping a port,
> > these PMDs are needlessly penalized with slower stop/start cycles. Think about
> > it assuming thousands of flow rules.
> I believe the rules maintained by SW should not be destroyed, because they are needed to re-program the device when it starts again.

Do we agree that applications should not care? Flow rules configured before
stopping a port must still be there after restarting it.

What we do not seem to agree on is that you think RTE should be responsible
for restoring flow rules of devices that lose them when stopped. I think
doing so is unfair to devices for which it is not the case and not really
nice to applications, so my opinion is that the PMD is responsible for
restoring flow rules however it wants. It is free to use RTE helpers to keep
track of them, as long as it is all managed internally.
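
For example, a driver whose HW forgets rules when stopped could replay an
internal list from its start callback. Everything below except struct
rte_flow is invented for illustration and reuses the hypothetical layout
sketched earlier in this mail:

 #include <sys/queue.h>

 struct foo_priv {
     LIST_HEAD(, rte_flow) flows; /* rules to restore on start */
 };

 int foo_hw_program(struct foo_priv *priv, struct rte_flow *flow);

 static int
 foo_dev_start(struct foo_priv *priv)
 {
     struct rte_flow *flow;

     /* ... usual start-up work ... */
     LIST_FOREACH(flow, &priv->flows, next)
         if (foo_hw_program(priv, flow) < 0)
             return -1; /* refuse to start with missing rules */
     return 0;
 }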

> > Thus from an application point of view, whatever happens when stopping and
> > restarting a port should not matter. If a flow rule was present before, it must
> > still be present afterwards. If the PMD had to destroy flow rules and re-create
> > them, it does not actually matter if they differ slightly at the HW level, as long as:
> > 
> > - Existing opaque flow rule pointers (rte_flow) are still valid to the PMD
> >   and refer to the same rules.
> > 
> > - The overall behavior of all rules is the same.
> > 
> > The list of rules you think of (patterns / actions) is maintained by applications
> > (not RTE), and only if they need them. RTE would needlessly duplicate this.
> As said before, need more details to understand this. Maybe an example is better :)

The generic format both RTE and applications might understand is the one
described in this API (struct rte_flow_pattern and struct
rte_flow_actions).

If we wanted RTE to maintain some sort of per-port state for flow rule
specifications, it would have to be a copy of these structures arranged
somehow (list or something else).

If we consider that PMDs need to keep a context object associated to a flow
rule (the opaque struct rte_flow *), then RTE would most likely have to
store it along with the flow specification.

Such a list may not be useful to applications (list lookups take time), so
they would implement their own redundant method. They might also require
extra room to attach some application context to flow rules. A generic list
cannot plan for it.
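
For instance, an application might attach its own context to each rule,
something a generic RTE-managed list could not anticipate (names other than
struct rte_flow are the application's own):

 #include <stdint.h>

 struct app_flow_entry {
     struct rte_flow *flow; /* opaque PMD object */
     uint64_t packets;      /* application statistics */
     void *session;         /* application context (e.g. tenant, VM) */
 };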

Applications know what they want to do with flow rules and are responsible
for managing them efficiently with RTE out of the way.

I'm not sure if this answered your question; if not, please describe a
scenario where an RTE-managed list of flow rules would be mandatory.

-- 
Adrien Mazarguil
6WIND

