[dpdk-dev] [RFC] ethdev: add fragment attribute to IPv6 item

Andrew Rybchenko arybchenko at solarflare.com
Sun Jul 5 15:13:30 CEST 2020
Previous message: [dpdk-dev] [Bug 501] igb: segfault in rte_eth_tx_done_cleanup() when using advanced descriptors
Next message: [dpdk-dev] [PATCH 1/2] ethdev: allow negative values in flow rule types
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 6/2/20 10:04 PM, Adrien Mazarguil wrote:
> Hi Ori, Andrew, Delek,
> 
> (been a while eh?)
> 
> On Tue, Jun 02, 2020 at 06:28:41PM +0000, Ori Kam wrote:
>> Hi Andrew,
>>
>> PSB,
> [...]
>>>> diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
>>>> index b0e4199..3bc8ce1 100644
>>>> --- a/lib/librte_ethdev/rte_flow.h
>>>> +++ b/lib/librte_ethdev/rte_flow.h
>>>> @@ -787,6 +787,8 @@ struct rte_flow_item_ipv4 {
>>>>   */
>>>>  struct rte_flow_item_ipv6 {
>>>>  	struct rte_ipv6_hdr hdr; /**< IPv6 header definition. */
>>>> +	uint32_t is_frag:1; /**< Is IPv6 packet fragmented/non-fragmented. */
>>>> +	uint32_t reserved:31; /**< Reserved, must be zero. */
>>>
>>> The solution is simple, but hardly generic and adds an
>>> example for the future extensions. I doubt that it is a
>>> right way to go.
>>>
>> I agree with you that this is not the most generic way possible,
>> but the IPV6 extensions are very unique. So the solution is also unique.
>> In general, I'm always in favor of finding the most generic way, but sometimes
>> it is better to keep things simple, and see how it goes.
> 
> Same feeling here, it doesn't look right.
> 
>>> May be we should add 256-bit string with one bit for each
>>> IP protocol number and apply it to extension headers only?
>>> If bit A is set in the mask:
>>>  - if bit A is set in spec as well, extension header with
>>>    IP protocol (1 << A) number must present
>>>  - if bit A is clear in spec, extension header with
>>>    IP protocol (1 << A) number must absent
>>> If bit is clear in the mask, corresponding extension header
>>> may present and may absent (i.e. don't care).
>>>
>> There are only 12 possible extension headers and currently none of them
>> are supported in rte_flow. So adding a logic to parse the 256 just to get a max of 12 
>> possible values is an overkill. Also, if we disregard the case of the extension, 
>> the application must select only one next proto. For example, the application
>> can't select udp + tcp. There is the option to add a flag for each of the
>> possible extensions, does it makes more sense to you?
> 
> Each of these extension headers has its own structure, we first need the
> ability to match them properly by adding the necessary pattern items.
> 
>>> The RFC indirectly touches IPv6 proto (next header) matching
>>> logic.
>>>
>>> If logic used in ETH+VLAN is applied on IPv6 as well, it would
>>> make pattern specification and handling complicated. E.g.:
>>>   eth / ipv6 / udp / end
>>> should match UDP over IPv6 without any extension headers only.
>>>
>> The issue with VLAN I agree is different since by definition VLAN is 
>> layer 2.5. We can add the same logic also to the VLAN case, maybe it will
>> be easier. 
>> In any case, in your example above and according to the RFC we will
>> get all ipv6 udp traffic with and without extensions.
>>
>>> And how to specify UPD over IPv6 regardless extension headers?
>>
>> Please see above the rule will be eth / ipv6 /udp.
>>
>>>   eth / ipv6 / ipv6_ext / udp / end
>>> with a convention that ipv6_ext is optional if spec and mask
>>> are NULL (or mask is empty).
>>>
>> I would guess that this flow should match all ipv6 that has one ext and the next 
>> proto is udp.
> 
> In my opinion RTE_FLOW_ITEM_TYPE_IPV6_EXT is a bit useless on its own. It's
> only for matching packets that contain some kind of extension header, not a
> specific one, more about that below.
> 
>>> I'm wondering if any driver treats it this way?
>>>
>> I'm not sure, we can support only the frag ext by default, but if required we can support other 
>> ext.
>>  
>>> I agree that the problem really comes when we'd like match
>>> IPv6 frags or even worse not fragments.
>>>
>>> Two patterns for fragments:
>>>   eth / ipv6 (proto=FRAGMENT) / end
>>>   eth / ipv6 / ipv6_ext (next_hdr=FRAGMENT) / end
>>>
>>> Any sensible solution for not-fragments with any other
>>> extension headers?
>>>
>> The one propose in this mail 😊 
>>
>>> INVERT exists, but hardly useful, since it simply says
>>> that patches which do not match pattern without INVERT
>>> matches the pattern with INVERT and
>>>   invert / eth / ipv6 (proto=FRAGMENT) / end
>>> will match ARP, IPv4, IPv6 with an extension header before
>>> fragment header and so on.
>>>
>> I agree with you, INVERT in this doesn’t help.
>> We were considering adding some kind of not mask / item per item.
>> some think around this line:
>> user request ipv6 unfragmented udp packets. The flow would look something
>> like this:
>> Eth / ipv6 / Not (Ipv6.proto = frag_proto) / udp
>> But it makes the rules much harder to use, and I don't think that there
>> is any HW that support not, and adding such feature to all items is overkill.
>>
>>  
>>> Bit string suggested above will allow to match:
>>>  - UDP over IPv6 with any extension headers:
>>>     eth / ipv6 (ext_hdrs mask empty) / udp / end
>>>  - UDP over IPv6 without any extension headers:
>>>     eth / ipv6 (ext_hdrs mask full, spec empty) / udp / end
>>>  - UDP over IPv6 without fragment header:
>>>     eth / ipv6 (ext.spec & ~FRAGMENT, ext.mask | FRAGMENT) / udp / end
>>>  - UDP over IPv6 with fragment header
>>>     eth / ipv6 (ext.spec | FRAGMENT, ext.mask | FRAGMENT) / udp / end
>>>
>>> where FRAGMENT is 1 << IPPROTO_FRAGMENT.
>>>
>> Please see my response regarding this above.
>>
>>> Above I intentionally keep 'proto' unspecified in ipv6
>>> since otherwise it would specify the next header after IPv6
>>> header.
>>>
>>> Extension headers mask should be empty by default.
> 
> This is a deliberate design choice/issue with rte_flow: an empty pattern
> matches everything; adding items only narrows the selection. As Andrew said
> there is currently no way to provide a specific item to reject, it can only
> be done globally on a pattern through INVERT that no PMD implements so far.
> 
> So we have two requirements here: the ability to specifically match IPv6
> fragment headers and the ability to reject them.
> 

I think that one of key requirements here is an ability to say
that an extension header may be anywhere (or it must be no
extension header anywhere), since specification using a pattern
item suggests specified order of items, but it could be other
extension headers before frag header and after it before UDP protocol
header.

> To match IPv6 fragment headers, we need a dedicated pattern item. The
> generic RTE_FLOW_ITEM_TYPE_IPV6_EXT is useless for that on its own, it must
> be completed with RTE_FLOW_ITEM_TYPE_IPV6_EXT_FRAG and associated object
> to match individual fields if needed (like all the others
> protocols/headers).
> 

Yes, I agree, but it is strictly required if we want to match
on fragment header content or see it in exact place in next
protocols chain.

> Then to reject a pattern item... My preference goes to a new "NOT" meta item
> affecting the meaning of the item coming immediately after in the pattern
> list. That would be ultra generic, wouldn't break any ABI/API and like
> INVERT, wouldn't even require a new object associated with it.
> 

Yes, that's true, but I'm not sure if it is easy to do in HW.
Also, *NOT* scope could be per item field in fact, not whole
item. It sounds like it is getting more and more complicated.

> To match UDPv6 traffic when there is no fragment header, one could then do
> something like:
> 
>  eth / ipv6 / not / ipv6_ext_frag / udp
> 
> PMD support would be trivial to implement (I'm sure!)
> 

The problem is an interpretation of the above pattern.
Strictly speaking only UDP packets with exactly one
not frag extension header match the pattern.
What about packets without any extension headers?
Or packet with two (more) extension headers when the first
one is not frag header?

> We may later implement other kinds of "operator" items as Andrew suggested,
> for bit-wise stuff and so on. Let's keep adding features on a needed basis
> though.
> 

Thanks,
Andrew.
Previous message: [dpdk-dev] [Bug 501] igb: segfault in rte_eth_tx_done_cleanup() when using advanced descriptors
Next message: [dpdk-dev] [PATCH 1/2] ethdev: allow negative values in flow rule types
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the dev mailing list