[dpdk-dev] [PATCH v3 1/7] ethdev: introduce Tx generic tunnel L3/L4 offload

Olivier Matz olivier.matz at 6wind.com
Wed Mar 28 14:52:31 CEST 2018


Hi Xueming, Yongseok,

On Thu, Mar 22, 2018 at 01:55:05PM +0000, Xueming(Steven) Li wrote:
> 
> 
> > -----Original Message-----
> > From: Yongseok Koh
> > Sent: Wednesday, March 21, 2018 9:41 AM
> > To: Xueming(Steven) Li <xuemingl at mellanox.com>
> > Cc: Wenzhuo Lu <wenzhuo.lu at intel.com>; Jingjing Wu <jingjing.wu at intel.com>;
> > Thomas Monjalon <thomas at monjalon.net>; Olivier MATZ
> > <olivier.matz at 6wind.com>; Shahaf Shuler <shahafs at mellanox.com>; Ferruh
> > Yigit <ferruh.yigit at intel.com>; dev at dpdk.org
> > Subject: Re: [PATCH v3 1/7] ethdev: introduce Tx generic tunnel L3/L4
> > offload
> > 
> > On Mon, Mar 05, 2018 at 10:51:15PM +0800, Xueming Li wrote:
> > > This patch introduce new TX offload flags for device that supports
> > > tunnel agnostic L3/L4 checksum and TSO offload.
> > >
> > > The support from the device is for inner and outer checksums on
> > > IPV4/TCP/UDP and TSO for *any packet with the following format*:
> > >
> > > < some headers > / [optional IPv4/IPv6] / [optional TCP/UDP] / <some
> > > headers> / [optional inner IPv4/IPv6] / [optional TCP/UDP]
> > >
> > > For example the following packets can use this feature:
> > >
> > > 1. eth / ipv4 / udp / VXLAN / ip / tcp 2. eth / ipv4 / GRE / MPLS /
> > > ipv4 / udp
> > >
> > > Signed-off-by: Xueming Li <xuemingl at mellanox.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.h | 24 ++++++++++++++++++++++++
> > >  lib/librte_mbuf/rte_mbuf.c    |  5 +++++
> > >  lib/librte_mbuf/rte_mbuf.h    | 18 ++++++++++++++++--
> > >  3 files changed, 45 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > b/lib/librte_ether/rte_ethdev.h index 036153306..66d12d3e0 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -980,6 +980,30 @@ struct rte_eth_conf {
> > >   *   the same mempool and has refcnt = 1.
> > >   */
> > >  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> > > +/**< Generic tunnel L3/L4 checksum offload. To enable this offload
> > > +feature
> > > + * for a packet to be transmitted on hardware supporting generic
> > > +tunnel L3/L4
> > > + * checksum offload:
> > > + *  - fill outer_l2_len and outer_l3_len in mbuf
> > > + *  - fill l2_len and l3_len in mbuf
> > > + *  - set the flags PKT_TX_TUNNEL_xxx (use PKT_TX_TUNNEL_UNKNOWN if
> > > +undefined)
> > > + *  - set the flags PKT_TX_OUTER_IP_CKSUM
> > > + *  - set the flags PKT_TX_IP_CKSUM
> > > + *  - set the flags PKT_TX_TCP_CKSUM, PKT_TX_SCTP_CKSUM or
> > > +PKT_TX_UDP_CKSUM  */
> > > +#define DEV_TX_OFFLOAD_GENERIC_TNL_CKSUM	0x00040000
> > 
> > It looks redundant to me. Isn't it same as having DEV_TX_OFFLOAD_*_CKSUM?
> > According to the API document, when PKT_TX_*_CKSUM is set, all the
> > necessary length fields should be filled in (e.g. mbuf->outer_l?_len and
> > mbuf->l?_len). It doesn't need to specify any tunnel type for checksum.
> > For example, in order to request checksum offload for an unknown tunnel
> > type which is similar to VxLAN, what should app do? Even without defining
> > this new offload flag, it is currently possible by setting
> > 	(PKT_TX_OUTER_IP_CKSUM | PKT_TX_OUTER_IPV4 | PKT_TX_IP_CKSUM |
> > 	 PKT_TX_IPV4 | PKT_TX_UDP_CKSUM)
> > along with various length fields.
> 
> Agree it almost same as existing checksum offloading requirement, besides:
> 1. it does not demand pseudo checksum
> 2. no need to clear IP checksum
> I guess the requirement on API doc is a super set and differences could be 
> documented on PMD doc.

Since commit 4fb7e803eb1a ("ethdev: add Tx preparation"), setting the
IP checksum to 0 and the pseudo header checksum in TCP is not required.
The user just need to set the flags, then before transmitting the packet,
call tx_prepare() which will do the proper work, according to the hw
requirements.

It looks that the API documentation of PKT_TX_TCP_SEG should be updated.
I'll submit a patch for this.


> For mlx5 original chksum offloading, hw will parse everything w/o demanding
> mbuf headers, a little better performance than SWP. The question is how to 
> choose between SWP and HW checksum offloading - assuming PKT_TX_TUNNEL_UNKNOWN
> is the key. Take L3 VXLAN(no inner L2 header) as example, only SWP could 
> offload it correctly and PKT_TX_TUNNEL_UNKOWN should be used here, not 
> PKT_TX_TUNNEL_VXLAN, make sense?
> 
> Another question, how does user app know the NIC capability of generic tunnel 
> checksum offload?

After reading the patchset, I still don't see what you want to achieve.
In patch 1, we talk about these 2 packets:

1. eth / ipv4 / udp / VXLAN / ip / tcp     (L3 VXLAN as you said above)
2. eth / ipv4 / GRE / MPLS / ipv4 / udp

What is not doable with current API?

I didn't try, but I think the following would work currently
with tx_prepare() + tx_burst().

- inner TSO on 1.

  flags = PKT_TX_TUNNEL_VXLAN | PKT_TX_IPV4 | PKT_TX_IP_CKSUM |
          PKT_TX_TCP_CKSUM | PKT_TX_TCP_SEG
  outer_l2_len = size(eth)
  outer_l3_len = size(ipv4)
  l2_len = size(udp + vxlan)
  l3_len = size(ip)

- inner TCP checksum on 1.

  flags = PKT_TX_TUNNEL_VXLAN | PKT_TX_IPV4 | PKT_TX_IP_CKSUM |
          PKT_TX_TCP_CKSUM
  outer_l2_len = size(eth)
  outer_l3_len = size(ipv4)
  l2_len = size(udp + vxlan)
  l3_len = size(ip)

- inner UDP checksum on 2.

  flags = PKT_TX_TUNNEL_GRE | PKT_TX_IPV4 | PKT_TX_IP_CKSUM |
          PKT_TX_UDP_CKSUM
  outer_l2_len = size(eth)
  outer_l3_len = size(ipv4)
  l2_len = size(gre + mpls)
  l3_len = size(ipv4)

Yes, writing the tunnel len in l2_len is a bit strange, but it
should do the trick.

It's still unclear for me why you need to add a "generic" flag.
Is there something you can't do today with the current API?


> 
> > 
> > > +/**< Generic tunnel segmentation offload. To enable it, the user needs
> > to:
> > > + *  - fill outer_l2_len and outer_l3_len in mbuf
> > > + *  - fill l2_len and l3_len in mbuf
> > > + *  - set the flags PKT_TX_TUNNEL_xxx (use PKT_TX_TUNNEL_UNKNOWN if
> > > +undefined)
> > > + *  - set the flags PKT_TX_OUTER_IPV4 or PKT_TX_OUTER_IPV6
> > > + *  - if it's UDP tunnel, set the flags PKT_TX_OUTER_UDP
> > > + *  - set the flags PKT_TX_IPV4 or PKT_TX_IPV6
> > > + *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > > + *    PKT_TX_OUTER_IP_CKSUM, PKT_TX_IP_CKSUM and PKT_TX_TCP_CKSUM)
> > > + * Hardware that supports generic tunnel TSO offload only update
> > > +outer/inner
> > > + * L3/L4 fields, tunnel fields are not touched.
> > > + */
> > > +#define DEV_TX_OFFLOAD_GENERIC_TNL_TSO		0x00080000
> > 
> > Practically, for the existing TSO offloads for tunneled packets, tunnel
> > type
> > (PKT_TX_TUNNEL_*) is needed for knowing if transport is UDP-based or IP-
> > based because the length field of UDP header should be updated for TSO.
> > AFAIK, there's no TCP-based tunnel. For PKT_TX_TUNNEL_UNKNOWN, the only
> > thing _unknown_ seems to be the type of transport and by defining
> > PKT_TX_OUTER_UDP, you seem to recognize the type of transport between UDP
> > and IP (it's not non-UDP). Given that, the new flags still look redundant
> > and ambiguous. If PKT_TX_TUNNEL_VXLAN is set, there's no need to set
> > PKT_TX_OUTER_UDP redundantly.
> > 
> > Instead, I think defining PKT_TX_TUNNEL_UDP_UNKNOWN and
> > PKT_TX_TUNNEL_IP_UNKNOWN could be a good alternative. It is only my humble
> > two cents and I'd like to hear from more people regarding this.
> > 
> 

I agree with Yongseok here. The flag PKT_TX_TUNNEL_VXLAN implies that
the protocol is UDP. Having a flag PKT_TX_TUNNEL_UDP or PKT_TX_TUNNEL_IP
could make sense if we find a use case that cannot be done currently with
the existing API (for instance, another tunnel type over UDP).

> This flag looks more like a feature notification. Current tunnel offloading 
> capability like "DEV_TX_OFFLOAD_VXLAN_TNL_TSO" ties to VXLAN tunnel type,
> DEV_TX_OFFLOAD_GENERIC_TNL_TSO should be able to let users know that NIC's
> capability of offloading generic tunnel type like MPLS-in-UDP that not defined
> yet. PKT_TX_OUTER_UDP is something must to have to flag an exists of outer UDP
> in such case.

So, maybe, instead of DEV_TX_OFFLOAD_GENERIC_TNL_TSO it could be
DEV_TX_OFFLOAD_UDP_TUNNEL_TSO. Because it is not possible for a hardware
to support TSO on top of any kind of tunnel type.

For instance, if the tunnel header has a checksum or a sequence id, the
hardware must be aware of it.



More information about the dev mailing list