[RFC V1 1/1] net: extend VXLAN header to support more extensions
Thomas Monjalon
thomas at monjalon.net
Fri Feb 9 14:44:32 CET 2024
09/02/2024 13:11, Ferruh Yigit:
> On 2/9/2024 10:12 AM, Thomas Monjalon wrote:
> > 09/02/2024 00:54, Ferruh Yigit:
> >> On 1/30/2024 11:25 AM, Gavin Li wrote:
> >>> Currently, DPDK supports VXLAN and VXLAN-GPE with similar header
> >>> structures, and we are working on adding support for VXLAN-GBP, which is
> >>> another extension to VXLAN. More extensions of VXLAN may be added in the
> >>> future.
> >>>
> >>> VXLAN and VXLAN-GBP use the same UDP port (4789), while VXLAN-GPE uses
> >>> a different one, 4790. The three protocols have the same header length
> >>> and an overall similar header structure, as shown below.
> >>>  0                   1                   2                   3
> >>>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |R|R|R|R|I|R|R|R|            Reserved                           |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |                VXLAN Network Identifier (VNI) |   Reserved    |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>>
> >>> Figure 1: VXLAN Header
> >>>
> >>>  0                   1                   2                   3
> >>>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |R|R|Ver|I|P|B|O|          Reserved             |Next Protocol  |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |                VXLAN Network Identifier (VNI) |   Reserved    |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>>
> >>> Figure 2: VXLAN-GPE Header
> >>>
> >>>  0                   1                   2                   3
> >>>  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |G|R|R|R|I|R|R|R|R|D|R|R|A|R|R|R|        Group Policy ID        |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>> |                VXLAN Network Identifier (VNI) |   Reserved    |
> >>> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> >>>
> >>> Figure 3: VXLAN-GBP Extension
> >>>
> >>> Both VXLAN-GPE and VXLAN-GBP extend VXLAN by redefining its reserved
> >>> bits, which means the packets can be processed with the same pattern and
> >>> most of the code can be reused. Instead of adding more new items by
> >>> copying/pasting code for future VXLAN extensions, it's better to use the
> >>> existing VXLAN infrastructure and add support code in it.
> >>>
> >>
> >> Hi Gavin,
> >>
> >> The motivation is to prevent code duplication, and the code mentioned is
> >> the driver code, right?
> >
> > The motivation is mainly to provide a unified and more explicit API.
> >
>
> From the user perspective, I think the existing approach is more explicit,
> because it sets VXLAN or VXLAN_GPE flow types.
>
> I am trying to understand the benefit: how does unifying the flow type in
> the API help the user?
>
> >> Overall it is OK to unify "struct rte_vxlan_hdr", although it makes the
> >> struct a little complex; perhaps we can consider extracting some nested
> >> structs as named structs. No strong opinion.
> >>
> >>
> >> But I am not sure about removing the flow item type for VXLAN-GPE, or
> >> not adding one for VXLAN-GBP.
> >>
> >> Think about a case where a user adds a rule with VXLAN as the item type
> >> and some bits set in the protocol header, let's say the last byte of the
> >> first word. How will the driver know whether to take it as the GPE "next
> >> protocol" or the GBP "group policy ID"?
> >
> > The driver may decide depending on the UDP port and some distinguishing flags.
> > If you want to match on GBP, you should include the GBP flag in your pattern,
> > no need to use a separate item.
> >
>
> Why not be more explicit?
> It helps the driver to know more about the pattern so that it can create a
> proper flow rule. If there is an obvious way for the driver to differentiate
> these protocol extensions, and the flow item type is redundant, I can
> understand the proposal, but otherwise I think it is better to keep flow
> items for the extensions.
In any case we need the simple VXLAN item.
If we have specialized GPE and GBP items,
what does a match on the simple VXLAN item mean?
Does it include packets with other extensions or exclude them?
Matching the bits in the protocol makes such a decision explicit.
> When a rule is set in HW, HW may not care about the protocol: as long as
> the bits in the rule match the packet, HW can apply the action.
> But for the driver to be able to set the rule properly, it needs more
> explicit information.
Yes, the information is in the pattern to match.
> Let's assume the driver API gets a pattern with 'RTE_FLOW_ITEM_TYPE_VXLAN'
> type and "struct rte_flow_item_vxlan"; at this point the driver doesn't
> know whether it is VXLAN or any of the extensions.
Yes it knows, because of the matched bits in the pattern.
If the rule specifies a match on GBP flag = 1, it is GBP only.
If the rule specifies a match on GBP flag = 0, it excludes GBP.
If the rule does not mask the GBP flag, it includes GBP.
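For instance, using the byte-wise view of the header, the three cases could
be expressed like this (an illustrative sketch only: the G bit value 0x80
follows the GBP draft, and the constants below are not defined in DPDK today):

#include <rte_flow.h>

#define VXLAN_FLAG_I 0x08 /* standard "VNI present" flag */
#define VXLAN_FLAG_G 0x80 /* assumed GBP flag position (MSB of first byte) */

/* GBP only: G = 1 in the value, and G is present in the mask. */
static const struct rte_flow_item_vxlan gbp_spec = {
        .hdr.flags = VXLAN_FLAG_I | VXLAN_FLAG_G,
};
static const struct rte_flow_item_vxlan gbp_mask = {
        .hdr.flags = VXLAN_FLAG_I | VXLAN_FLAG_G,
};

/* Exclude GBP: G = 0 in the value, but still present in the mask. */
static const struct rte_flow_item_vxlan no_gbp_spec = {
        .hdr.flags = VXLAN_FLAG_I,
};
static const struct rte_flow_item_vxlan no_gbp_mask = {
        .hdr.flags = VXLAN_FLAG_I | VXLAN_FLAG_G,
};

/* Include GBP and plain VXLAN alike: G is left out of the mask. */
static const struct rte_flow_item_vxlan any_spec = {
        .hdr.flags = VXLAN_FLAG_I,
};
static const struct rte_flow_item_vxlan any_mask = {
        .hdr.flags = VXLAN_FLAG_I,
};

The pattern item is then { RTE_FLOW_ITEM_TYPE_VXLAN, &gbp_spec, NULL,
&gbp_mask }, with no dedicated GBP item needed.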
> Should the driver go and check the GBP flags to deduce whether it is GBP, and
Yes
> what if they are all zero? How can the driver tell whether this is a GBP
> flag that is zero, or a VXLAN header where the bit is reserved?
As explained above, value & mask does the matching.
> Or I will just make up a sample; it may not be accurate, but please take
> it as a case to prove a point.
> Let's assume the driver API again gets a "struct rte_flow_item_vxlan" whose
> first word's last bit is set. Which union in the struct will the driver
> check to detect this, the GPE one or the GBP one?
It depends on the G flag.
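On the driver side, the deduction can be as simple as this (a hypothetical
helper, not an existing API; 0x80 is the assumed G bit per Figure 3 above):

/* The pattern targets GBP only if the G bit is both present in the
 * mask and set in the value. */
static bool
vxlan_pattern_is_gbp(const struct rte_flow_item_vxlan *spec,
                     const struct rte_flow_item_vxlan *mask)
{
        return (mask->hdr.flags & 0x80) && (spec->hdr.flags & 0x80);
}

If G is matched as 1, the last two bytes of the first word are the group
policy ID; otherwise, with the outer UDP port 4790, the last byte is the
GPE next protocol.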
> This can be the GPE "next protocol" or the GBP "group policy ID"; if the
> driver knows this, it can set the mask better for the field in the rule.
> The driver may try to deduce this extension information from other
> information, but why not a proper specific flow type for each extension
> instead?
I believe relying on bit matching is a cleaner approach
than creating a new item each time a protocol has a variation.
And it makes the include/exclude policy easier to understand
thanks to the value/mask logic.
> >> And if a specific HW detects VXLAN-GPE and VXLAN-GBP as two separate
> >> protocols, won't having only the capability to describe the VXLAN
> >> protocol in SW be a limitation?
> >
> > I think the driver may know based on the flow rule.
> >
> > That's a good question about the item granularity.
> > What is the separation between a different protocol and protocol options?
> > How should a flow item match protocol variations?
> > My current thinking is that we should minimize the number of items.
> >
> >> If the target is to remove code duplication in the driver, can it be
> >> accomplished by extracting code that parses the common fields of these
> >> protocols?
> >
> > Driver code is not a concern as long as any driver can implement the API.
> >
>
> Got it, but I want to clarify this one.
> Where does the code duplication via copy/paste that the commit log
> mentions occur?
>
> >
> >>> In this patch, all the VXLAN extension headers will be merged with VXLAN
> >>> as unions when the overlapping fields have different formats among the
> >>> protocols. The existing VXLAN-GPE item will be marked as deprecated, and
> >>> new extensions of VXLAN should be added to VXLAN instead of as a new RTE
> >>> item.
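For the record, the merge of the first 32-bit word could look something like
this (an illustrative sketch following Figures 1-3 above, not the actual
patch; all names are made up):

#include <rte_byteorder.h>

/* First 32-bit word of the VXLAN header, merged per Figures 1-3. */
struct vxlan_first_word {
        union {
                rte_be32_t vx_flags;          /* plain VXLAN: flags + reserved */
                struct {
                        uint8_t flags;        /* I flag; GPE flags live here too */
                        uint8_t rsvd[2];      /* GPE: reserved */
                        uint8_t proto;        /* GPE: next protocol */
                };
                struct {
                        rte_be16_t gbp_flags; /* GBP: G/I/D/A among reserved bits */
                        rte_be16_t policy_id; /* GBP: group policy ID */
                };
        };
};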
> >>>
> >>> Signed-off-by: Gavin Li <gavinl at nvidia.com>
> >>> ---
> >>>  doc/guides/rel_notes/deprecation.rst |  5 +++
> >>>  lib/ethdev/rte_flow.h                | 13 +++++-
> >>>  lib/net/rte_vxlan.h                  | 67 ++++++++++++++++++++++++++--
> >>>  3 files changed, 80 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> >>> index 81b93515cb..f9cf931b77 100644
> >>> --- a/doc/guides/rel_notes/deprecation.rst
> >>> +++ b/doc/guides/rel_notes/deprecation.rst
> >>> @@ -95,6 +95,11 @@ Deprecation Notices
> >>> - ``rte_flow_item_pppoe``
> >>> - ``rte_flow_item_pppoe_proto_id``
> >>>
> >>> +* ethdev: The flow item ``RTE_FLOW_ITEM_TYPE_VXLAN_GPE`` is replaced with ``RTE_FLOW_ITEM_TYPE_VXLAN``.
> >>> + The item ``RTE_FLOW_ITEM_TYPE_VXLAN_GPE``, the struct ``rte_flow_item_vxlan_gpe``, its mask ``rte_flow_item_vxlan_gpe_mask``,
> >>> + and the header struct ``rte_vxlan_gpe_hdr`` with the macro ``RTE_ETHER_VXLAN_GPE_HLEN``
> >>> + will be removed in DPDK 25.11.
> >>> +
> >>>
> >>
> >> We have 24.11 in between.
> >
> > Isn't it too soon?
> > Should we remove it at all?