[dpdk-dev] [PATCH v3 0/3] ethdev: add generic L2/L3 tunnel encapsulation actions

Adrien Mazarguil adrien.mazarguil at 6wind.com
Thu Oct 11 15:12:21 CEST 2018

Hey Ori,

(removing most of the discussion, I'll only reply to the summary)

On Thu, Oct 11, 2018 at 08:48:05AM +0000, Ori Kam wrote:
> Hi Adrian,
> Thanks for your comments, please see my answers below and inline.
> Due to a very short time limit and the fact that we have more than
> 4 patches that are based on this we need to close it fast.
> As I see it, there are a number of options:
> * the old approach that neither of us likes, which means that for
>    every tunnel we create a new command.

Just to be sure, you mean that for each new tunnel *type* a new rte_flow
action *type* must be added to DPDK right? Because the above reads like with
your proposal, a single flow rule can manage any number of TEPs and flow
rule creation for subsequent tunnels can be somehow bypassed.

One flow *rule* is still needed per TEP or did I miss something?

> * My proposed suggestion as is. Which is easier for at least a number of applications
>    to implement and faster in most cases.
> * My suggestion with a different name, but then we also need to find a name
>    for the decap and a name for decap_l3. This approach is also problematic
>    since we have 2 APIs that are doing the same thing. For example, in test-pmd encap
>    vxlan, which API shall we use?

Since you're doing this for MPLSoUDP and MPLSoGRE, you could leave
VXLAN/NVGRE encap as is, especially since (AFAIK) there are series still
relying on their API floating on the ML.

> * Combine my suggestion and the current one by replacing the raw
>    buffer with a list of items. Less code duplication, easier on the validation (though I
>    don't think we need to validate the encap data), but we lose insertion rate.

Already suggested in the past [1], this led to VXLAN and NVGRE encap as we
know them.

> * your suggestion of a list of actions where each action is one item. Main problems
>    are speed, complexity from the application side and time to implement.

Speed matters a lot to me also (go figure) but I still doubt this approach
is measurably faster. On the usability side, compared to one action per
protocol layer which better fits the rte_flow model, I'm also not

If we put aside usability and performance on which we'll never agree, there
is still one outstanding issue: the lack of mask. Users cannot tell which
fields are relevant and to be kept as is, and which are not.

How do applications know what blanks are filled in by HW? How do PMDs know
what applications expect? There's a risk of sending incomplete or malformed
packets depending on the implementation.

One may expect PMDs and HW to just "do the sensible thing" but some
applications won't know that some fields are not offloaded and will be
emitted with an unexpected value, while others will attempt to force a
normally offloaded field to some specific value and expect it to be left
unmodified. The PMD cannot predict this, so something is needed.

Assuming you add a mask pointer to address this, generic encap should be
functionally complete but not all that different from what we currently have
for VXLAN/NVGRE and from Declan's earlier proposal for generic encap [1];
the PMD must parse the buffer (using a proper packet parser with your approach),
collect relevant fields, see if anything's unsupported while doing so before
proceeding with the flow rule.

Anyway, if you add that mask and rename these actions (since they should work
with pretty much anything, not necessarily tunnels, i.e. lazy applications
could ask HW to prepend missing Ethernet headers to pure IP traffic), they
can make sense. How about labeling this "raw" encap/decap?


 struct rte_flow_action_raw_encap {
     uint8_t *data; /**< Encapsulation data. */
     uint8_t *preserve; /**< Bit-mask of @p data to preserve on output. */
     size_t size; /**< Size of @p data and @p preserve. */
 };

I guess decap could use the same object. Since there is no way to define a
sensible default behavior that works across multiple vendors when "preserve"
is not provided, I think this field cannot be NULL.
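
To make the intended meaning of "preserve" concrete, here is a minimal
software sketch, assuming a set bit means "emit the application's value
as is" and a cleared bit means "the device may fill this in". The names
raw_encap_conf, raw_encap_resolve and hw_fill are made up for
illustration and are not part of DPDK; a real PMD would program HW
rather than merge buffers like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the struct proposed above; hypothetical, not a DPDK definition. */
struct raw_encap_conf {
	const uint8_t *data;     /* Encapsulation data. */
	const uint8_t *preserve; /* Bits of data to emit unchanged. */
	size_t size;             /* Size of data and preserve. */
};

/*
 * Hypothetical helper: compute the header that would end up on the wire.
 * Bits set in preserve are taken from the application's data, all other
 * bits are taken from hw_fill (values assumed to be computed by the
 * device, e.g. lengths or checksums).
 */
static void
raw_encap_resolve(const struct raw_encap_conf *conf,
		  const uint8_t *hw_fill, uint8_t *out)
{
	size_t i;

	/* As argued above, preserve is assumed to be mandatory. */
	assert(conf->data != NULL && conf->preserve != NULL);
	for (i = 0; i < conf->size; i++) {
		uint8_t keep = conf->preserve[i];

		out[i] = (uint8_t)((conf->data[i] & keep) |
				   (hw_fill[i] & (uint8_t)~keep));
	}
}
```

With this reading, an application forcing a normally offloaded field
sets the matching bits in preserve, and one that wants HW to handle the
field leaves them cleared, so neither side has to guess.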

As for "L3 decap", well, can't one just provide a separate encap action?
I mean a raw decap action, followed by another action doing raw encap of the
intended L2? A separate set of actions seems unnecessary for that.
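
A rough software model of that sequence, only to illustrate the idea:
raw_decap() and raw_encap() below are hypothetical helpers working on a
flat buffer, not DPDK functions, and they ignore the preserve mask for
brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of a raw decap action: drop the first hdr_len bytes.
 * Returns the new packet length, or 0 if the packet is shorter than the
 * header to remove.
 */
static size_t
raw_decap(uint8_t *pkt, size_t len, size_t hdr_len)
{
	if (hdr_len > len)
		return 0;
	memmove(pkt, pkt + hdr_len, len - hdr_len);
	return len - hdr_len;
}

/*
 * Hypothetical model of a raw encap action: prepend hdr_len bytes of hdr.
 * cap is the buffer capacity. Returns the new packet length, or 0 if the
 * result would not fit.
 */
static size_t
raw_encap(uint8_t *pkt, size_t len, size_t cap,
	  const uint8_t *hdr, size_t hdr_len)
{
	if (hdr_len > cap || len > cap - hdr_len)
		return 0;
	memmove(pkt + hdr_len, pkt, len);
	memcpy(pkt, hdr, hdr_len);
	return len + hdr_len;
}
```

"L3 decap" then amounts to calling raw_decap() with the size of the
outer headers followed by raw_encap() with the intended Ethernet
header, which is why a dedicated pair of actions looks redundant.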

[1] "[PATCH v3 2/4] ethdev: Add tunnel encap/decap actions"

Adrien Mazarguil
