[dpdk-dev] [RFC] Generic flow director/filtering/classification API

John Fastabend john.fastabend at gmail.com
Wed Aug 10 18:35:36 CEST 2016


On 16-08-10 04:02 AM, Adrien Mazarguil wrote:
> On Tue, Aug 09, 2016 at 02:24:26PM -0700, John Fastabend wrote:
> [...]
>>> Just an idea, could some kind of meta pattern items specifying time
>>> constraints for a rule address this issue? Say, how long (cycles/ms) the PMD
>>> may take to query/apply/delete the rule. If it cannot be guaranteed, the
>>> rule cannot be created. Applications could maintain statistics counters about
>>> failed rules to determine if performance issues are caused by the inability
>>> to create them.
>>
>> It seems a bit heavy to me to have each PMD driver implement
>> something like this. But it would be interesting to explore once the
>> basic support is implemented.
> 
> OK, let's keep this for later.
> 
> [...]
>>> Such a pattern could be generated from a separate function before feeding it
>>> to rte_flow_create(), or translated by the PMD afterwards assuming a
>>> separate meta item such as RAW_END exists to signal the end of a RAW layer.
>>> Of course processing this would be more expensive.
>>>
>>
>> Or the supported parse graph could be fetched from the hardware along with
>> the values for each protocol so that the programming interface stays the
>> same. The well-known protocols could keep their 'enum values' in the
>> rte_flow_item_type enum in the header so that users would not be required
>> to walk the parse graph, but for new or experimental protocols we could
>> query the parse graph and get the pattern-matching IDs for them.
>>
>> The normal flow would be unchanged, but we would not get stuck upgrading
>> everything just to add our own protocol. So the flow would be:
>>
>>  rte_get_parse_graph(graph);
>>  flow_item_proto = is_my_proto_supported(graph);
>>
>>  pattern = build_flow_match(flow_item_proto, value, mask);
>>  action = build_action();
>>  rte_flow_create(my_port, pattern, action);
>>
>> The only change to the proposed API needed to support this would be to
>> allow unsupported RTE_FLOW_ values to be pushed to the hardware and to
>> define a range of values reserved for use by the parse graph discovery.
>>
>> This would not have to be any more expensive.
> 
> Makes sense. Unless made entirely out of RAW items, however, the ensuing
> pattern would not be portable across DPDK ports, instances and versions if
> dumped in binary form for later use.
> 

Right.

> Since those would have to be recognized by PMDs and applications regardless
> of the API version, I suggest making generated item types negative (enums
> are signed, let's use that).

That works; then the normal positive enums maintain the list of
known/accepted protocols.
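
To illustrate (a rough sketch only; RTE_FLOW_ITEM_TYPE_DYN() and the exact
reserved range are invented for this example):

        /* Known protocols keep their fixed, positive values in
         * enum rte_flow_item_type (VOID, ETH, IPV4, ...); item types handed
         * out at run time for discovered protocols would come from a
         * reserved negative range, so they can never collide with future
         * additions to the enum.
         */
        #define RTE_FLOW_ITEM_TYPE_DYN(n) (-1 - (n))

        int flow_item_proto = is_my_proto_supported(graph);

        /* flow_item_proto >= 0: well-known type from the enum.
         * flow_item_proto < 0:  type assigned through parse graph discovery.
         * Either way it can be passed as-is when building the pattern.
         */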

> 
> DPDK would have to maintain a list of allocated values to avoid collisions
> between PMDs. A method should be provided to release them.

The 'middle layer' could have a non-public API for PMDs to get new
values; call it get_flow_type_item_id() or something.
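
Roughly along these lines (the names here are invented, not part of the
proposal):

        /* Non-public helpers owned by the generic flow layer.  A PMD would
         * call the first at init time to reserve an item type from the
         * negative range and the second on close to release it, so values
         * never collide between PMDs.
         */
        int get_flow_type_item_id(void);         /* returns a value < 0 */
        void put_flow_type_item_id(int item_id); /* releases it for reuse */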

> 
> [...]
>> Hmm, for performance reasons building an entire graph up using RAW items
>> seems a bit heavy. Another alternative to the above parse graph notion
>> would be to allow users to add RAW node definitions at init time and have
>> the PMD give an ID back for those. Then the new node could be used just
>> like any other RTE_FLOW_ITEM_TYPE in a pattern.
>>
>> Something like,
>>
>> 	ret_flow_item_type_foo = rte_create_raw_node(foo_raw_pattern)
>> 	ret_flow_item_type_bar = rte_create_raw_node(bar_raw_pattern)
>>
>> then allow ret_flow_item_type_{foo|bar} to be used in subsequent
>> pattern-matching items. And if the hardware cannot support this, return
>> an error from the initial rte_create_raw_node() API call.
>>
>> Does either of those proposals sound like a reasonable extension?
> 
> Both seem acceptable in my opinion as they fit in the described API. However,
> I think it would be better for this function to spit out a pattern made of
> any number of items instead of a single new item type. That way, existing
> fixed items can be reused as well, the entire pattern may even become
> portable as a result, and it could be considered a method to optimize a
> RAW pattern.
> 
> The RAW approach has the advantage of not requiring much additional code in
> the public API besides a couple of function declarations. A proper
> full-blown graph would require a lot more, as described in your original
> reply. Not sure which is better.
> 
> Either way they won't be part of the initial specification but it looks like
> they can be added later without affecting the basics.
> 

Right, it's not needed in the initial spec as long as we have a path to get
there, and it looks like we have two usable possibilities, so that works
for me.
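
For the record, the second possibility could look roughly like this from the
application side, reusing the loose pseudocode from earlier in the thread
(exact signatures and error handling still to be defined):

        /* Register the raw header definition once at init time; the PMD
         * hands back an item type from the reserved range, or fails right
         * away if the hardware cannot parse the new header.
         */
        ret_flow_item_type_foo = rte_create_raw_node(foo_raw_pattern);

        /* From here on the returned type behaves like any fixed
         * RTE_FLOW_ITEM_TYPE_* value.
         */
        pattern = build_flow_match(ret_flow_item_type_foo, value, mask);
        action = build_action();
        rte_flow_create(my_port, pattern, action);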


>>> [...]
>>>> The two open items from me are: do we need to support adding new
>>>> variable-length headers, and how do we handle multiple tables? I'll take
>>>> the latter up in the other thread.
>>>
>>> I think variable-length headers may eventually be supported through pattern
>>> tricks or a separate conversion layer.
>>>
>>
>> A parse graph notion, though, would support this naturally without pattern
>> tricks, hence my suggestions above.
> 
> All right, I agree a method to let applications precisely define what they
> want to match can be useful, now that I understand what you mean by
> "dynamically".
> 
>> Also, in the current scheme, how would I match an IPv6 option, a specific
>> NSH option, or an MPLS tag?
> 
> Ideally through specific pattern items defined for this purpose, which is
> how I thought the API would evolve. Of course it wouldn't be fully dynamic
> and you'd have to wait for a DPDK release that implements them.
> 

The only trouble is that if you don't know exactly where the option sits in
the list of options (which you won't in general), it's a bit hard to get
right with the existing spec as best I can tell: RAW patterns require you to
know the option's position in the list, and an ANY pattern wouldn't guarantee
the match lands in the correct header once headers are stacked. At least if
I'm reading the spec correctly, it seems to be an issue.
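
For example, matching a particular IPv6 destination option today comes down
to something like the following (loose pseudocode, not the exact RAW item
fields), and neither variant is really correct:

        /* Variant 1: RAW item anchored at a fixed offset.  Only correct
         * when nothing variable precedes the option we are after, which we
         * cannot assume in general.
         */
        pattern = ETH / IPV6 / RAW(offset = <assumed offset>, spec = option_bytes)

        /* Variant 2: use ANY to skip a variable number of headers.  The
         * option bytes may then be found in the wrong header once extension
         * headers are stacked, since nothing ties the match to a specific
         * header type.
         */
        pattern = ETH / IPV6 / ANY / RAW(spec = option_bytes)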

.John

