[dpdk-dev] [PATCH v3 2/4] ethdev: Add tunnel encap/decap actions

Shahaf Shuler shahafs at mellanox.com
Tue Apr 10 13:06:43 CEST 2018


Hi,

Adding a small comment on top of Adrien's.

Tuesday, April 10, 2018 1:20 PM, Adrien Mazarguil:
> On Mon, Apr 09, 2018 at 05:10:35PM +0100, Mohammad Abdul Awal wrote:
> > On 06/04/2018 21:26, Adrien Mazarguil wrote:
> > > On Fri, Apr 06, 2018 at 01:24:01PM +0100, Declan Doherty wrote:
> > > > Add new flow action types and associated action data structures to
> > > > support the encapsulation and decapsulation of the virtual tunnel
> > > > endpoints.
> > > >
> > > > The RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP action will cause the
> > > > matching flow to be encapsulated in the virtual tunnel endpoint
> > > > overlay defined in the tunnel_encap action data.
> > > >
> > > > The RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP action will cause all
> > > > virtual tunnel endpoint overlays up to and including the first
> > > > instance of the flow item type defined in the tunnel_decap action
> > > > data to be removed from the matching flows.
> > > >
> > > > Signed-off-by: Declan Doherty <declan.doherty at intel.com>
> > > This generic approach looks flexible enough to cover the use cases
> > > that immediately come to mind (VLAN, VXLAN), its design is sound.
> > >
> > > However, while I'm aware it's not a concern at this point, it won't
> > > be able to deal with stateful tunnel or encapsulation types (e.g.
> > > IPsec or TCP) which will require additional meta data or some
> > > run-time assistance from the application.
> > >
> > > Eventually for more complex use cases, dedicated encap/decap actions
> > > will have to appear, so the issue I wanted to raise before going
> > > further is this:
> > >
> > > Going generic inevitably trades some of the usability; flat
> > > structures dedicated to VXLAN encap/decap with only the needed info
> > > to get the job done would likely be easier to implement in PMDs and
> > > use in applications. Any number of such actions can be added to
> > > rte_flow without ABI impact.
> > >
> > > If VXLAN is the only use case at this point, my suggestion would be
> > > to go with simpler RTE_FLOW_ACTION_TYPE_VXLAN_(ENCAP|DECAP)
> > > actions, with fixed L2/L3/L4/L5 header definitions to prepend
> > > according to RFC 7348.
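
Just to make that alternative concrete, here is roughly what I would expect such a dedicated action to look like: a flat structure carrying only what RFC 7348 needs. The name and field set below are only a sketch, not from any patch.

#include <stdint.h>
#include <rte_ether.h>
#include <rte_byteorder.h>

/* Sketch of a hypothetical flat VXLAN encap action, nothing generic:
 * just the outer L2/L3/L4 fields plus the VNI needed to build the
 * RFC 7348 headers.
 */
struct rte_flow_action_vxlan_encap_sketch {
	struct ether_addr dst_mac; /* outer Ethernet destination */
	struct ether_addr src_mac; /* outer Ethernet source */
	rte_be32_t ipv4_src;       /* outer IPv4 source address */
	rte_be32_t ipv4_dst;       /* outer IPv4 destination address */
	rte_be16_t udp_src_port;   /* outer UDP source port */
	rte_be16_t udp_dst_port;   /* outer UDP destination port (4789) */
	uint8_t vni[3];            /* 24-bit VXLAN network identifier */
};

A matching VXLAN_DECAP would then need no configuration data at all.
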
> > We can go this way, but it will keep increasing the number of actions
> > as more and more tunneling protocols are added. The current proposal is
> > already a generic approach which treats all the tunneling protocols as
> > a tunnel.
> 
> Right, on the other hand there are not that many standard encapsulations
> offloaded by existing devices. rte_flow could easily handle dedicated actions
> for each of them without problem.
> 
> My point is that many of those (will eventually) have their own quirks to
> manage when doing encap/decap, it's not just a matter of prepending or
> removing a bunch of header definitions, otherwise we could as well let
> applications simply provide an arbitrary buffer to prepend.
> 
> Consider that the "generic" part is already built into rte_flow as the way
> patterns and actions are handled. Adding another generic layer on top of that
> could make things more inconvenient than necessary to applications (my
> main concern).
> 
> You'd need another layer of validation/error reporting machinery to properly
> let applications know they cannot encap VXLAN on top of TCP on top of
> QinQinQinQinQ for instance. Either a single bounded encapsulation definition
> or a combination at the action list level is needed to avoid that.
> 
> > > Now we can start with the generic approach, see how it fares and add
> > > dedicated encap/decap later as needed.
> > >
> > > More comments below.
> <snip>
> > > > +Action: ``TUNNEL_ENCAP``
> > > > +^^^^^^^^^^^^^^^^^^^^^^

The ENCAP/DECAP actions don't have to be in the context of a tunnel.
For example, take GRE: an application may want to decap the GRE and encap with L2. That L2 encapsulation is not related to any tunnel.
Same for the other direction: a VM sends an Ethernet frame, and we want to decap the Ethernet header and encap with GRE.

I think those actions should be free of the tunnel association and simply provide the flow items we want to encap/decap, or, more generically, an offset into the packet headers and a buffer to encap (not sure how many devices support that, it may be overkill at this point).
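
To make that concrete, here is a strawman of what such tunnel-agnostic actions could look like (completely hypothetical names and layout):

#include <stdint.h>

/* Hypothetical tunnel-agnostic actions: ENCAP prepends an opaque header
 * buffer supplied by the application, DECAP strips a given number of
 * bytes from the start of the packet. No tunnel semantics involved.
 */
struct rte_flow_action_hdr_encap_sketch {
	const uint8_t *data; /* header bytes to prepend verbatim */
	uint16_t size;       /* number of bytes pointed to by data */
};

struct rte_flow_action_hdr_decap_sketch {
	uint16_t size;       /* number of bytes to strip from packet start */
};

The GRE-to-L2 example above would then simply be a DECAP of the GRE header length followed by an ENCAP of a prebuilt Ethernet header.
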

> > > > +
> > > > +Performs an encapsulation action by encapsulating the flows
> > > > +matched by the pattern items according to the network overlay
> > > > +defined in the ``rte_flow_action_tunnel_encap`` pattern items.
> > > > +
> > > > +This action modifies the payload of matched flows. The pattern
> > > > +items specified in the ``rte_flow_action_tunnel_encap`` action
> > > > +structure must define a valid set of overlay headers, from the
> > > > +Ethernet header up to the overlay header. The pattern must be
> terminated with the RTE_FLOW_ITEM_TYPE_END item type.
> > > Regarding the use of a pattern list, if you consider PMDs are
> > > already iterating on a list of actions when encountering
> > > RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP, it adds yet another inner
> > > loop.
> > We understand that this is implementation-specific. If we do not go for
> > another inner loop, all the bundling needs to be handled in the same
> > function, which seems clumsier to me. This also breaks the tunnel
> > endpoint concept.
> > >
> > > How about making each encountered
> > > RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP provide exactly one item
> > > instead (in encap, i.e. reverse order)?
> > Again, if we have a tunnel action, a security action, and other
> > actions, all the processing and tracking needs to be done in one
> > function. Now we will need ETH_ENCAP/DECAP, UDP_ENCAP/DECAP,
> > NVGRE_ENCAP/DECAP, etc.
> 
> Well, the number of DECAP actions doesn't need to perfectly reflect that of
> ENCAP since it implies all preceding layers. No problem with that.
> 
> Regarding multiple dedicated actions, my suggestion was for a single generic
> one as in this patch, but each instance on the ENCAP side would deal with a
> single protocol layer, instead of having a single ENCAP action with multiple
> inner layers (and thus an inner loop).
> 
> PMDs also gain the ability to precisely report which encap step fails by making
> rte_flow_error point to the problematic object to ease debugging of flow
> rules on the application side.
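
If I read this suggestion correctly, the VXLAN case would then be expressed as one ENCAP action per layer, each one carrying a standard rte_flow_item (combining this with the rte_flow_item suggestion further down). Roughly, and this is only a sketch, the ordering and the exact conf layout are still open:

#include <rte_flow.h>
#include <rte_byteorder.h>

/* One generic ENCAP action per protocol layer, each conf pointing to a
 * standard rte_flow_item. Header values below are placeholders; whether
 * the list goes outer-to-inner (as here) or the reverse would need to
 * be pinned down by the API.
 */
struct rte_flow_item_eth eth_spec = { 0 };   /* outer MACs, filled by app */
struct rte_flow_item_ipv4 ipv4_spec = { 0 }; /* outer IP addresses */
struct rte_flow_item_udp udp_spec = { .hdr.dst_port = RTE_BE16(4789) };
struct rte_flow_item_vxlan vxlan_spec = { .vni = { 0x00, 0x00, 0x2a } };

struct rte_flow_item layers[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH,   .spec = &eth_spec },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4,  .spec = &ipv4_spec },
	{ .type = RTE_FLOW_ITEM_TYPE_UDP,   .spec = &udp_spec },
	{ .type = RTE_FLOW_ITEM_TYPE_VXLAN, .spec = &vxlan_spec },
};

struct rte_flow_action actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP, .conf = &layers[0] },
	{ .type = RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP, .conf = &layers[1] },
	{ .type = RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP, .conf = &layers[2] },
	{ .type = RTE_FLOW_ACTION_TYPE_TUNNEL_ENCAP, .conf = &layers[3] },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
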
> 
> Why would that break the tunnel idea and more importantly, how would it
> prevent PMD developers from splitting their processing into multiple
> functions?
> 
> > >
> > > In which case perhaps "GENERIC" would be a better fit than "TUNNEL".
> > >
> <snip>
> > > > +
> > > > +   +-------+--------------------------+------------+
> > > > +   | Index | Flow Item Type           | Flow Item  |
> > > > +   +=======+==========================+============+
> > > > +   | 0     | RTE_FLOW_ITEM_TYPE_ETH   | eth item   |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 1     | RTE_FLOW_ITEM_TYPE_IPV4  | ipv4 item  |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 2     | RTE_FLOW_ITEM_TYPE_UDP   | udp item   |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 3     | RTE_FLOW_ITEM_TYPE_VXLAN | vxlan item |
> > > > +   +-------+--------------------------+------------+
> > > > +   | 4     | RTE_FLOW_ITEM_TYPE_END   | NULL       |
> > > > +   +-------+--------------------------+------------+
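
For reference, this is my reading of how an application would fill the proposed action for the table above. It is a sketch based on the structure shown further down in the patch; the exact layout of the top-level struct (the .pattern field in particular) may differ.

#include <rte_flow.h>
#include <rte_byteorder.h>

/* Sketch: the proposed single TUNNEL_ENCAP action carrying the whole
 * item list from the table above. Header values are placeholders.
 */
struct rte_flow_item_eth eth = { 0 };   /* outer MACs */
struct rte_flow_item_ipv4 ipv4 = { 0 }; /* outer IP addresses */
struct rte_flow_item_udp udp = { .hdr.dst_port = RTE_BE16(4789) };
struct rte_flow_item_vxlan vxlan = { .vni = { 0x00, 0x00, 0x2a } };

struct rte_flow_action_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_ETH,   .item = &eth },
	{ .type = RTE_FLOW_ITEM_TYPE_IPV4,  .item = &ipv4 },
	{ .type = RTE_FLOW_ITEM_TYPE_UDP,   .item = &udp },
	{ .type = RTE_FLOW_ITEM_TYPE_VXLAN, .item = &vxlan },
	{ .type = RTE_FLOW_ITEM_TYPE_END,   .item = NULL },
};

struct rte_flow_action_tunnel_encap encap = { .pattern = pattern };
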
> > > One possible issue is that it relies on objects normally found on
> > > the pattern side of flow rules. Those are supposed to match
> > > something, they are not intended for packet header generation. While
> > > their "spec" and "mask" fields might make sense in this context, the
> > > "last" field is odd.
> > >
> > > You must define them without leaving anything open for
> > > interpretation by PMDs and users alike. Defining things as
> > > "undefined" is fine as long as it's covered.
> > Please note that the "void *item" in the
> > "rte_flow_action_tunnel_encap.pattern" points to the data structure
> > defined for the corresponding rte_flow_item_type instead of a
> > rte_flow_item structure. As an example, for the rte_flow_item_eth
> > type, the "void *item" will point to a "struct rte_flow_item_eth"
> > instance. That's why we have defined struct rte_flow_action_item
> > inside struct rte_flow_action_tunnel_encap. So, no question of spec,
> > mask, last anymore.
> 
> Right, I noticed that after commenting its structure definition below.
> 
> I think I won't be the only one confused by this approach, also because a
> mask is needed in addition to a specification structure, otherwise how do you
> plan to tell what fields are relevant in application-provided protocol headers?
> 
> An application might set unusual IPv4/UDP/VXLAN fields and expect them to
> be part of the encapsulated traffic. Without a mask, a PMD must take
> headers verbatim, and I don't think many devices are ready for that yet.
> 
> Hence my other suggestion: defining inflexible $PROTOCOL_(ENCAP|DECAP)
> actions that do not allow more than what's defined by official RFCs for
> $PROTOCOL.
> 
> <snip>
> > > > + */
> > > > +struct rte_flow_action_tunnel_encap {
> > > > +	struct rte_flow_action_item {
> > > > +		enum rte_flow_item_type type;
> > > > +		/**< Flow item type. */
> > > > +		const void *item;
> > > > +		/**< Flow item definition which points to the data of
> > > > +		 * corresponding rte_flow_item_type.
> > > > +		 */
> > > I see it's a new action type, albeit a bit confusing (there is no
> > > RTE_FLOW_ACTION_TYPE_ITEM).
> > >
> > > I suggest the standard pattern item type since you're going with
> > > enum rte_flow_item_type anyway. Keep in mind you need some kind of
> > > mask to tell what fields are relevant. An application might
> > > otherwise want to encap with unsupported properties (e.g. specific
> > > IPv4 ToS field and whatnot).
> > >
> > > How about a single "struct rte_flow_pattern_item item", neither
> > > const and neither a pointer. It's generic enough, enclosed
> > > spec/last/mask pointers take care of the specifics. You just need to
> > > define what's supposed to happen when "last" is set.
> > Please see the comment above regarding this field.
> 
> Point still stands, if you need to distinguish spec and mask, a more complete
> structure is needed. Instead of adding a new (confusing) type, you should
> use rte_flow_item and define what happens when "last" is set.
> 
> You should define it as reserved for now, any non-NULL value is an error. This
> field might also be useful later.
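
To illustrate what the mask would convey if the standard item structure is reused (sketch only, values are placeholders):

#include <rte_flow.h>
#include <rte_byteorder.h>

/* Sketch: spec carries the header values to write, mask tells the PMD
 * which of them it must honor; unmasked fields are left to the PMD or
 * reported as unsupported. "last" stays NULL (reserved for now).
 */
struct rte_flow_item_ipv4 ipv4_enc_spec = {
	.hdr = {
		.src_addr = RTE_BE32(0xc0a80001), /* 192.168.0.1 (example) */
		.dst_addr = RTE_BE32(0xc0a80002), /* 192.168.0.2 (example) */
	},
};
struct rte_flow_item_ipv4 ipv4_enc_mask = {
	.hdr = {
		.src_addr = RTE_BE32(0xffffffff), /* addresses are mandatory */
		.dst_addr = RTE_BE32(0xffffffff),
		/* ToS, TTL, etc. left zero: PMD may use its own defaults */
	},
};
struct rte_flow_item encap_ipv4 = {
	.type = RTE_FLOW_ITEM_TYPE_IPV4,
	.spec = &ipv4_enc_spec,
	.mask = &ipv4_enc_mask,
	.last = NULL, /* reserved, a non-NULL value is an error */
};
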
> 
> <snip>
> > > > +};
> > > > +
> > > > +/**
> > > > + * RTE_FLOW_ACTION_TYP_TUNNEL_DECAP
> > > > + *
> > > > + * Virtual tunnel end-point decapsulation action data.
> > > > + *
> > > > + * Non-terminating action by default.
> > > > + */
> > > > +struct rte_flow_action_tunnel_decap {
> > > > +	enum rte_flow_item_type type;
> > > > +	/**<
> > > > +	 * Flow item type of virtual tunnel end-point to be decapsulated
> > > > +	 */
> > > > +};
> > > Note that contrary to ENCAP, DECAP wouldn't necessarily need
> > > repeated actions to peel each layer off. The current definition is fine.
> > To clarify, the decap is up to the PMD to remove all the headers for
> > a specified type. For example, for
> >
> > rte_flow_item_type type=RTE_FLOW_ITEM_TYPE_VXLAN, the PMD will
> > peel off the (ETH, IPV4, UDP, VXLAN) headers all together.
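
In other words, with the proposed definition something like this would be enough to strip the whole ETH/IPV4/UDP/VXLAN stack (my understanding of the patch):

#include <rte_flow.h>

/* Sketch: a single DECAP naming the outermost tunnel item; the PMD
 * strips everything up to and including it.
 */
struct rte_flow_action_tunnel_decap decap = {
	.type = RTE_FLOW_ITEM_TYPE_VXLAN,
};

struct rte_flow_action decap_actions[] = {
	{ .type = RTE_FLOW_ACTION_TYPE_TUNNEL_DECAP, .conf = &decap },
	{ .type = RTE_FLOW_ACTION_TYPE_END },
};
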
> 
> Yep, that's fine, whether we use multiple actions or a single one doing
> multiple things, a single DECAP can peel them off all at once :)
> 
> > >
> > > > +
> > > > +/**
> > > >    * Definition of a single action.
> > > >    *
> > > >    * A list of actions is terminated by a END action.
> > > > --
> > > > 2.7.4
> > > >
> 
> If the reasons I gave did not manage to convince you about choosing
> between
> either fixed (VLAN|VXLAN)_(ENCAP|DECAP) actions or generic encap/decap
> actions that deal with a single protocol layer at once instead of the
> proposed approach, I'm ready to try it out as an experimental API (all new
> objects tagged as experimental) *if* you address the lack of mask, which
> remains an open issue.
> 
> --
> Adrien Mazarguil
> 6WIND

