[dpdk-dev] [RFC] Generic flow director/filtering/classification API

Adrien Mazarguil adrien.mazarguil at 6wind.com
Wed Aug 10 15:37:55 CEST 2016
Previous message: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
Next message: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Tue, Aug 09, 2016 at 02:47:44PM -0700, John Fastabend wrote:
> On 16-08-04 06:24 AM, Adrien Mazarguil wrote:
> > On Wed, Aug 03, 2016 at 12:11:56PM -0700, John Fastabend wrote:
[...]
> >> The problem is keeping priorities in order and/or possibly breaking
> >> rules apart (e.g. you have an L2 table and an L3 table) becomes very
> >> complex to manage at driver level. I think its easier for the
> >> application which has some context to do this. The application "knows"
> >> if its a router for example will likely be able to pack rules better
> >> than a PMD will.
> > 
> > I don't think most applications know they are L2 or L3 routers. They may not
> > know more than the pattern provided to the PMD, which may indeed end at a L2
> > or L3 protocol. If the application simply chooses a table based on this
> > information, then the PMD could have easily done the same.
> > 
> 
> But when we start thinking about encap/decap then its natural to start
> using this interface to implement various forwarding dataplanes. And one
> common way to organize a switch is into a TEP, router, switch
> (mac/vlan), ACL tables, etc. In fact we see this topology starting to
> show up in the NICs now.
> 
> Further each table may be "managed" by a different entity. In which
> case the software will want to manage the physical and virtual networks
> separately.
> 
> It doesn't make sense to me to require a software aggregator object to
> marshal the rules into a flat table then for a PMD to split them apart
> again.

OK, my point was mostly about handling basic cases easily and making sure
applications do not have to bother with petty HW details when they do not
want to, yet still get maximum performance by having the PMD make the most
appropriate choices automatically.

You've convinced me that in many cases PMDs won't be able to optimize
efficiently and that conscious applications will know better. The API has to
provide the ability to do so. I think it's fine as long as it is not
mandatory.

> > I understand the issue is what happens when applications really want to
> > define e.g. L2/L3/L2 rules in this specific order (or any ordering that
> > cannot be satisfied by HW due to table constraints).
> > 
> > By exposing tables, in such a case applications should move all rules from
> > L2 to a L3 table themselves (assuming this is even supported) to guarantee
> > ordering between rules, or fail to add them. This is basically what the PMD
> > could have done, possibly in a more efficient manner in my opinion.
> 
> I disagree with the more efficient comment :)
> 
> If the software layer is working on L2/TEP/ACL/router layers merging
> them just to pull them back apart is not going to be more efficient.

Moving flow rules around cannot be efficient by definition, however I think
that attempting to describe table capabilities may be as complicated as
describing HW bit-masking features. Applications may get it wrong as a
result while a PMD would not make any mistake.

Your use case is valid though, if the application already groups rules, then
sharing this information with the PMD would make sense from a performance
standpoint.

> > Let's assume two opposite scenarios for this discussion:
> > 
> > - App #1 is a command-line interface directly mapped to flow rules, which
> >   basically gets slow random input from users depending on how they want to
> >   configure their traffic. All rules differ considerably (L2, L3, L4, some
> >   with incomplete bit-masks, etc). All in all, few but complex rules with
> >   specific priorities.
> > 
> 
> Agree with this and in this case the application should be behind any
> network physical/virtual and not giving rules like encap/decap/etc. This
> application either sits on the physical function and "owns" the hardware
> resource or sits behind a virtual switch.
> 
> 
> > - App #2 is something like OVS, creating and deleting a large number of very
> >   specific (without incomplete bit-masks) and mostly identical
> >   single-priority rules automatically and very frequently.
> > 
> 
> Maybe for OVS but not all virtual switches are built with flat tables
> at the bottom like this. Nor is it optimal it necessarily optimal.
> 
> Another application (the one I'm concerned about :) would be build as
> a pipeline, something like
> 
> 	ACL -> TEP -> ACL -> VEB -> ACL
> 
> If I have hardware that supports a TEP hardware block an ACL hardware
> block and a VEB  block for example I don't want to merge my control
> plane into a single table. The merging in this case is just pure
> overhead/complexity for no gain.

It could be done by dedicating priority ranges for each item in the
pipeline but then it would be clunky. OK then, let's discuss the best
approach to implement this.

[...]
> >> Its not about mask vs no mask. The devices with multiple tables that I
> >> have don't have this mask limitations. Its about how to optimally pack
> >> the rules and who implements that logic. I think its best done in the
> >> application where I have the context.
> >>
> >> Is there a way to omit the table field if the PMD is expected to do
> >> a best effort and add the table field if the user wants explicit
> >> control over table mgmt. This would support both models. I at least
> >> would like to have explicit control over rule population in my pipeline
> >> for use cases where I'm building a pipeline on top of the hardware.
> > 
> > Yes that's a possibility. Perhaps the table ID to use could be specified as
> > a meta pattern item? We'd still need methods to report how many tables exist
> > and perhaps some way to report their limitations, these could be later
> > through a separate set of functions.
> 
> Sure I think a meta pattern item would be fine or put it in the API call
> directly, something like
> 
>   rte_flow_create(port_id, pattern, actions);
>   rte_flow_create_table(port_id, table_id, pattern, actions);

I suggest using a common method for both cases, either seems fine to me, as
long as a default table value can be provided (zero) when applications do
not care.

Now about tables management, I think there is no need to not expose table
capabilities (in case they have different capabilities) but instead provide
guidelines as part of the specification to encourage applications writers to
group similar rules in tables. A previously discussed, flow rules priorities
would be specific to the table they are affected to.

Like flow rules, table priorities could be handled through their index with
index 0 having the highest priority. Like flow rule priorities, table
indices wouldn't have to be contiguous.

If this works for you, how about renaming "tables" to "groups"?

-- 
Adrien Mazarguil
6WIND
Previous message: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
Next message: [dpdk-dev] [RFC] Generic flow director/filtering/classification API
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the dev mailing list