[dpdk-dev] [RFC] Generic flow director/filtering/classification API

Adrien Mazarguil adrien.mazarguil at 6wind.com
Wed Aug 3 18:44:11 CEST 2016


Replying to everything at once, please see below.

On Tue, Jul 26, 2016 at 03:37:35PM +0530, Rahul Lakkireddy wrote:
> On Monday, July 07/25/16, 2016 at 09:40:02 -0700, John Fastabend wrote:
> > On 16-07-25 04:32 AM, Rahul Lakkireddy wrote:
> > > Hi Adrien,
> > > 
> > > On Thursday, July 07/21/16, 2016 at 19:07:38 +0200, Adrien Mazarguil wrote:
> > >> Hi Rahul,
> > >>
> > >> Please see below.
> > >>
> > >> On Thu, Jul 21, 2016 at 01:43:37PM +0530, Rahul Lakkireddy wrote:
> > >>> Hi Adrien,
> > >>>
> > >>> The proposal looks very good.  It satisfies most of the features
> > >>> supported by Chelsio NICs.  We are looking for suggestions on exposing
> > >>> more additional features supported by Chelsio NICs via this API.
> > >>>
> > >>> Chelsio NICs have two regions in which filters can be placed -
> > >>> Maskfull and Maskless regions.  As their names imply, maskfull region
> > >>> can accept masks to match a range of values; whereas, maskless region
> > >>> don't accept any masks and hence perform a more strict exact-matches.
> > >>> Filters without masks can also be placed in maskfull region.  By
> > >>> default, maskless region have higher priority over the maskfull region.
> > >>> However, the priority between the two regions is configurable.
> > >>
> > >> I understand this configuration affects the entire device. Just to be clear,
> > >> assuming some filters are already configured, are they affected by a change
> > >> of region priority later?
> > >>
> > > 
> > > Both the regions exist at the same time in the device.  Each filter can
> > > either belong to maskfull or the maskless region.
> > > 
> > > The priority is configured at time of filter creation for every
> > > individual filter and cannot be changed while the filter is still
> > > active. If priority needs to be changed for a particular filter then,
> > > it needs to be deleted first and re-created.
> > 
> > Could you model this as two tables and add a table_id to the API? This
> > way user space could populate the table it chooses. We would have to add
> > some capabilities attributes to "learn" if tables support masks or not
> > though.
> > 
> 
> This approach sounds interesting.

Now I understand the idea behind these tables, however from an application
point of view I still think it's better if the PMD could take care of flow
rules optimizations automatically. Think about it, PMDs have exactly a
single kind of device they know perfectly well to manage, while applications
want the best possible performance out of any device in the most generic
fashion.

> > I don't see how the PMD can sort this out in any meaningful way and it
> > has to be exposed to the application that has the intelligence to 'know'
> > priorities between masks and non-masks filters. I'm sure you could come
> > up with something but it would be less than ideal in many cases I would
> > guess and we can't have the driver getting priorities wrong or we may
> > not get the correct behavior.

It may be solved by having the PMD maintain a SW state to quickly know which
rules are currently created and in what state the device is so basically the
application doesn't have to perform this work.

This API allows applications to express basic needs such as "redirect
packets matching this pattern to that queue". It must not deal with HW
details and limitations in my opinion. If a request cannot be satisfied,
then the rule cannot be created. No help from the application must be
expected by PMDs, otherwise it opens the door to the same issues as the
legacy filtering APIs.

[...]
> > > Unfortunately, our maskfull region is extremely small too compared to
> > > maskless region.
> > > 
> > 
> > To me this means a userspace application would want to pack it
> > carefully to get the full benefit. So you need some mechanism to specify
> > the "region" hence the above table proposal.
> > 
> 
> Right. Makes sense.

I do not agree, applications should not be aware of it. Note this case can
be handled differently, so that rules do not have to be moved back and forth
between both tables. If the first created rule requires a maskfull entry,
then all subsequent rules will be entered into that table. Otherwise no
maskfull entry can be created as long as there is one maskless entry. When
either table is full, no more rules may be added. Would that work for you?

> [...]
> > >> Now about this "promisc" match criteria, it can be added as a new meta
> > >> pattern item (4.1.3 Meta item types). Do you want it to be defined from the
> > >> start or add it later with the related code in your PMD?
> > >>
> > > 
> > > It could be added as a meta item.  If there are other interested
> > > parties, it can be added now.  Otherwise, we'll add it with our filtering
> > > related code.
> > > 
> > 
> > hmm I guess by "promisc" here you mean match packets received from the
> > wire before they have been switched by the silicon?
> > 
> 
> Match packets received from wire before they have been switched by
> silicon, and which also includes packets not destined for DUT and were
> still received due to interface being in promisc mode.

I think it's fine, but we'll have to precisely define what happens when a
packet matched with such pattern is part of a terminating rule. For instance
if it is duplicated by HW, then the rule cannot be terminating.

[...]
> > > This raises another interesting question.  What should the PMD do
> > > if it has support to only a subset of fields in the particular item?
> > > 
> > > For example, if a rule has been sent to match IP fragmentation along
> > > with several other IPv4 fields, and if the underlying hardware doesn't
> > > support matching based on IP fragmentation, does the PMD reject the
> > > complete rule although it could have done the matching for rest of the
> > > IPv4 fields?
> > 
> > I think it has to fail the command other wise user space will not have
> > any way to understand that the full match criteria can not be met and
> > we will get different behavior for the same applications on different
> > nics depending on hardware feature set. This will most likely break
> > applications so we need the error IMO.
> > 
> 
> Ok. Makes sense.

Yes, I fully agree with this.

> > >>> - Match range of physical ports on the NIC in a single rule via masks.
> > >>>   For ex: match all UDP packets coming on ports 3 and 4 out of 4
> > >>>   ports available on the NIC.
> > >>
> > >> Applications create flow rules per port, I'd thus suggest that the PMD
> > >> should detect identical rules created on different ports and aggregate them
> > >> as a single HW rule automatically.
> > >>
> > >> If you think this approach is not right, the alternative is a meta pattern
> > >> item that provides a list of ports. I'm not sure this is the right approach
> > >> considering it would most likely not be supported by most NICs. Applications
> > >> may not request it explicitly.
> > >>
> > > 
> > > Aggregating via PMD will be expensive operation since it would involve:
> > > - Search of existing filters.
> > > - Deleting those filters.
> > > - Creating a single combined filter.
> > > 
> > > And all of above 3 operations would need to be atomic so as not to
> > > affect existing traffic which is hitting above filters.

Atomicity may not be a problem if the PMD makes sure the new combined rule
is inserted before the others, so they do not need to be removed either.

> > > Adding a
> > > meta item would be a simpler solution here.

Yes, clearly.

> > For this adding a meta-data item seems simplest to me. And if you want
> > to make the default to be only a single port that would maybe make it
> > easier for existing apps to port from flow director. Then if an
> > application cares it can create a list of ports if needed.
> > 
> 
> Agreed.

However although I'm not opposed to adding dedicated meta items, remember
applications will not automatically benefit from the increased performance
if a single PMD implements this feature, their maintainers will probably not
bother with it.

> > >>> - Match range of Physical Functions (PFs) on the NIC in a single rule
> > >>>   via masks. For ex: match all traffic coming on several PFs.
> > >>
> > >> The PF and VF pattern items assume there is a single PF associated with a
> > >> DPDK port. VFs are identified with an ID. I basically took the same
> > >> definitions as the existing filter types, perhaps this is not enough for
> > >> Chelsio adapters.
> > >>
> > >> Do you expose more than one PF for a DPDK port?
> > >>
> > >> Anyway, I'd suggest the same approach as above, automatic aggregation of
> > >> rules for performance reasons, otherwise new or updated PF/VF pattern items,
> > >> in which case it would be great if you could provide ideal structure
> > >> definitions for this use case.
> > >>
> > > 
> > > In Chelsio hardware, all the ports of a device are exposed via single
> > > PF4. There could be many VFs attached to a PF.  Physical NIC functions
> > > are operational on PF4, while VFs can be attached to PFs 0-3.
> > > So, Chelsio hardware doesn't remain tied on a PF-to-Port, one-to-one
> > > mapping assumption.
> > > 
> > > There already seems to be a PF meta-item, but it doesn't seem to accept
> > > any "spec" and "mask" field.  Similarly, the VF meta-item doesn't
> > > seem to accept a "mask" field.  We could probably enable these fields
> > > in the PF and VF meta-items to allow configuration.
> > 
> > Maybe a range field would help here as well? So you could specify a VF
> > range. It might be one of the things to consider adding later though if
> > there is no clear use for it now.
> > 
> 
> VF-value and VF-mask would help to achieve the desired filter.
> VF-mask would also enable to specify a range of VF values.

Like John, I think a range or even a list instead of a mask would be better,
the PMD can easily create a mask from that if necessary. Reason is that
we've always had bad experiences with bit-fields, they're always too short
at some point and we would like to avoid having to break the ABI to update
existing pattern items later.

Also while I don't think this is the case yet, perhaps it will be a good
idea for PFs/VFs to have global unique IDs, just like DPDK ports.

Thanks.

-- 
Adrien Mazarguil
6WIND


More information about the dev mailing list