[dpdk-dev] [PATCH] ethdev: add flow tag

Yigit, Ferruh ferruh.yigit at linux.intel.com
Tue Oct 8 14:57:22 CEST 2019


On 7/11/2019 2:59 AM, Yongseok Koh wrote:
> On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
>> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
>>>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil at 6wind.com> wrote:
>>>>
>>>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>>>>> A tag is transient data which can be used during flow matching. It can be
>>>>> used to store the match result from a previous table so that the same pattern
>>>>> need not be matched again in the next table. Even if the outer header is
>>>>> decapsulated on the previous match, the match result can be kept.
>>>>>
>>>>> Some devices expose internal registers of their flow processing pipeline, and
>>>>> those registers are quite useful for stateful connection tracking as they
>>>>> keep the status of flow matching. Multiple tags are supported by specifying
>>>>> an index.
>>>>>
>>>>> Example testpmd commands are:
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>>>>            eth ... / end
>>>>>    actions ... jump group 2 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>>>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>>  flow create 0 ingress group 2
>>>>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>> Signed-off-by: Yongseok Koh <yskoh at mellanox.com>
>>>>
>>>> Hi Yongseok,
>>>>
>>>> Only high-level questions for now. While it unquestionably looks useful,
>>>> from a user standpoint exposing the separate index seems redundant and not
>>>> necessarily convenient. Using the following example to illustrate:
>>>>
>>>> actions set_tag index 3 value 0x123456 mask 0xffffff
>>>>
>>>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
>>>>
>>>> I might be missing something, but why isn't this enough:
>>>>
>>>> pattern tag index is 3 # match whatever is stored at index 3
>>>>
>>>> Assuming it can work, then why bother with providing value spec/mask on
>>>> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
>>>> be matched by a subsequent flow rule and that's it. It even seems like
>>>> relying on the index only on both occasions is enough for identification.
>>>>
>>>> Same question for the opposite approach; relying on the value, never
>>>> mentioning the index.
>>>>
>>>> I'm under the impression that the index is a hardware-specific constraint
>>>> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
>>>> PMD could keep track of used indices without having them exposed through the
>>>> public API.
>>>
>>>
>>> Thank you for the review, Adrien.
>>> Hope you are doing well. It's been a long time since we talked to each other. :-)
>>
>> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
>> answer these days...
>>
>>  <dev at dpdk.org> hey!
>>  <dev at dpdk.org> no private talks!
>>
>> Back to the topic:
>>
>>> Your approach will work too in general, but we have a request from a customer
>>> who wants to partition this limited tag storage. Assume that HW exposes 32-bit
>>> tags (those are 'registers' in the HW pipeline in mlx5 HW). Customers then want
>>> to store multiple pieces of data even in one 32-bit storage, for example a
>>> 16-bit vlan tag, an 8-bit table id and an 8-bit flow id. As they want to split
>>> one 32-bit storage, I thought it better to provide a mask when setting/matching
>>> the value. Some customers even want to store multiple flags bit by bit, like
>>> ol_flags. They do want to alter only some of the bits.
>>>
>>> As for the index, it references an entry of the tags array, as HW can provide
>>> more register space than 32 bits. For example, mlx5 HW would provide four 32b
>>> storages which users can use for their own purposes.
>>> 	tag[0], tag[1], tag[2], tag[3]
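
To make the mask and index semantics described above concrete, here is a
minimal C sketch. It is not taken from the patch: the helper names set_tag()
and match_tag(), the NUM_TAGS constant, the field-layout macros and the
read-modify-write behaviour are all assumptions for illustration, modelling
the four 32-bit tags as a plain array.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: four 32-bit tags, tag[0]..tag[3], as in the mlx5 example. */
#define NUM_TAGS 4

/* Hypothetical partition of one 32-bit tag into three fields. */
#define VLAN_MASK  0xffff0000u /* 16-bit vlan tag */
#define TABLE_MASK 0x0000ff00u /* 8-bit table id  */
#define FLOW_MASK  0x000000ffu /* 8-bit flow id   */

static uint32_t tag[NUM_TAGS];

/*
 * Assumed set_tag semantics: only the bits selected by mask are written,
 * all other bits of tag[index] keep their previous content.
 * Returns false if index is out of range.
 */
static bool set_tag(uint8_t index, uint32_t value, uint32_t mask)
{
	if (index >= NUM_TAGS)
		return false;
	tag[index] = (tag[index] & ~mask) | (value & mask);
	return true;
}

/*
 * Assumed tag item semantics: only the bits selected by mask are compared
 * against spec; bits outside the mask are ignored.
 */
static bool match_tag(uint8_t index, uint32_t spec, uint32_t mask)
{
	if (index >= NUM_TAGS)
		return false;
	return (tag[index] & mask) == (spec & mask);
}
```

With this model, three successive calls such as
set_tag(2, 100u << 16, VLAN_MASK), set_tag(2, 1u << 8, TABLE_MASK) and
set_tag(2, 0x2a, FLOW_MASK) fill the three fields of tag[2] independently,
and a later table can test just one of them, e.g.
match_tag(2, 1u << 8, TABLE_MASK), without caring about the other bits.
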
>>
>> OK, looks like I missed the point then. I initially took it for a funky
>> alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
>> (ingress extended [1]) but while it could be used like that, it's more of a
>> way to temporarily store and retrieve a small amount of data, correct?
> 
> Correct.
> 
>> Out of curiosity, are these registers independent from META and other
>> items/actions in mlx5, otherwise what happens if they are combined?
> 
> I thought about combining them but I chose this way because it is transient. META
> can be set by the packet descriptor on Tx and can be delivered to the host via mbuf
> on Rx, but this TAG item can't. If I combined them, users would have to query this
> capability for each 32b storage. Also, there would have to be a way to request data
> from such storages (i.e. a new action, e.g. copy_meta). Let's say there are 4x32b
> storages - meta[4]. If a user wants to get one 32b value (meta[i]) out of them into
> mbuf->metadata, it should be something like,
> 	ingress / pattern .. /
> 	actions ... set_meta index i data x / copy_meta_to_rx index i
> And if a user wants to set meta[i] via mbuf on Tx,
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> And of course, the user is also responsible for querying these capabilities for
> each meta[] storage.
> 
> As copy_meta_to_tx/rx isn't a real action, this example would confuse users.
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> Users might misunderstand the order of the two things - the meta item and the
> copy_meta action. I also thought about having capability bits for each meta[]
> storage but that also looked complex.
> 
> I do think rte_flow items/actions are better kept simple, atomic and intuitive.
> That's why I made this choice.
> 
>> Are there other uses for these registers? Say, referencing their contents
>> from other places in a flow rule so they don't have to be hard-coded?
> 
> Possible.
> Actually, this feature is needed by connection tracking of OVS-DPDK.
> 
>> Right now I'm still uncomfortable with such a feature in the public API
>> because compared to META [1], this approach looks very hardware-specific and
>> seemingly difficult to map on different HW architectures.
> 
> I wouldn't say it is HW-specific. Like I explained above, I just define this new
> item/action to make things easy-to-use and intuitive.
> 
>> However, the main problem is that as described, its end purpose seems
>> redundant with META, which I think can cover the use cases you gave. So what
>> can an application do with this that couldn't be done in a more generic
>> fashion through META?
>>
>> I may still be missing something and I'm open to ideas, but assuming it
>> doesn't make it into the public rte_flow API, it remains an interesting
>> feature on its own merit which could be added to DPDK as PMD-specific
>> pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
>> for PMDs to expose a public header that dedicated applications can include
>> to use this kind of feature (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
>> No problem with that.
> 
> That's good info. Thanks. But still, considering connection-tracking-like
> use cases, this transient storage in a multi-table flow pipeline is quite useful.
> 
> 
> thanks,
> Yongseok
> 
>> [1] "[PATCH] ethdev: extend flow metadata"
>>     https://mails.dpdk.org/archives/dev/2019-July/137305.html
>>
>> [2] "Negative types"
>>     https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types

Is this RFC still valid? Will there be any follow-up?
If not, I will mark it as rejected in the next few days.

