[dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Nov 27 18:01:33 CET 2014



> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Thursday, November 27, 2014 2:56 PM
> To: Liu, Jijiang; Olivier Matz (olivier.matz at 6wind.com)
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
> 
> 
> 
> >
> > -----Original Message-----
> > From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> > Sent: Thursday, November 27, 2014 6:00 PM
> > To: Liu, Jijiang; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 1/3] mbuf:add two TX offload flags and change three fields
> >
> > Hi Jijiang,
> >
> > Please see some comments below.
> >
> > On 11/27/2014 09:18 AM, Jijiang Liu wrote:
> > > In place of removing the PKT_TX_VXLAN_CKSUM, we introduce 2 new flags: PKT_TX_OUT_IP_CKSUM,
> PKT_TX_UDP_TUNNEL_PKT,
> > and a new field: l4_tun_len.
> > > Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field.
> > >
> > > PKT_TX_OUT_IP_CKSUM: is not used for non-tunnelling packet;hardware outer checksum for tunnelling packet.
> > > PKT_TX_UDP_TUNNEL_PKT: is used to tell PMD that the transmit packet is a UDP tunneling packet.
> > > l4_tun_len: for VXLAN packet, it should be udp header length plus VXLAN header length.
> > >
> > > Signed-off-by: Jijiang Liu <jijiang.liu at intel.com>
> > > ---
> > >   lib/librte_mbuf/rte_mbuf.c |    2 +-
> > >   lib/librte_mbuf/rte_mbuf.h |   23 ++++++++++++++---------
> > >   2 files changed, 15 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > index 87c2963..e89c310 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > @@ -240,7 +240,7 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > >   	case PKT_TX_SCTP_CKSUM: return "PKT_TX_SCTP_CKSUM";
> > >   	case PKT_TX_UDP_CKSUM: return "PKT_TX_UDP_CKSUM";
> > >   	case PKT_TX_IEEE1588_TMST: return "PKT_TX_IEEE1588_TMST";
> > > -	case PKT_TX_VXLAN_CKSUM: return "PKT_TX_VXLAN_CKSUM";
> > > +	case PKT_TX_UDP_TUNNEL_PKT: return "PKT_TX_UDP_TUNNEL_PKT";
> > >   	case PKT_TX_TCP_SEG: return "PKT_TX_TCP_SEG";
> > >   	default: return NULL;
> >
> > As I said as a reply to the cover letter, I suggest to use PKT_TX_OUT_UDP_CKSUM instead of PKT_TX_UDP_TUNNEL_PKT.
> 
> HW don't support outer L4 checksum offload.
> But to calculate inner checksums correctly, it needs a hint from SW about L4 Tunneling Type.
> Currently the following values are recognised by HW:
> 
> L4 Tunneling Type (Teredo / GRE header / VXLAN header) indication:
> 00b - No UDP / GRE tunneling (field must be set to zero if EIPT equals to zero)
> 01b - UDP tunneling header (any UDP tunneling, VXLAN and Geneve).
> 10b - GRE tunneling header
> Else - reserved
> 
> You can check yourself:
> http://www.intel.com/content/www/us/en/embedded/products/networking/xl710-10-40-controller-datasheet.html
> Sections 8.4.2.2.1 and 8.4.4.2
> 
> >
> > Also, the PKT_TX_OUT_IP_CKSUM case is missing here.
> >
> > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > index 367fc56..48cd8e1 100644
> > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > @@ -99,10 +99,9 @@ extern "C" {
> > >   #define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */
> > >   #define PKT_RX_FDIR_ID       (1ULL << 13) /**< FD id reported if FDIR match. */
> > >   #define PKT_RX_FDIR_FLX      (1ULL << 14) /**< Flexible bytes reported if FDIR match. */
> > > -/* add new RX flags here */
> > >
> >
> > We should probably not remove this line.
> >
> >
> > >   /* add new TX flags here */
> > > -#define PKT_TX_VXLAN_CKSUM   (1ULL << 50) /**< TX checksum of VXLAN computed by NIC */
> > > +#define PKT_TX_UDP_TUNNEL_PKT (1ULL << 50) /**< TX packet is an UDP
> > > +tunneling packet */
> > >   #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to
> > > timestamp. */
> > >
> > >   /**
> > > @@ -125,13 +124,20 @@ extern "C" {
> > >   #define PKT_TX_IP_CKSUM      (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
> > >   #define PKT_TX_IPV4_CSUM     PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
> > >
> > > +#define PKT_TX_VLAN_PKT      (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +
> > >   /** Tell the NIC it's an IPv4 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV4          PKT_RX_IPV4_HDR
> > > +#define PKT_TX_IPV4          (1ULL << 56)
> > >
> > >   /** Tell the NIC it's an IPv6 packet. Required for L4 checksum offload or TSO. */
> > > -#define PKT_TX_IPV6          PKT_RX_IPV6_HDR
> > > +#define PKT_TX_IPV6          (1ULL << 57)
> >
> > The description in comment does not match the description in the cover letter.
> >
> > Also, I think replacing PKT_RX_IPV[46]_HDR by the value may go in another commit.
> >
> >
> > > -#define PKT_TX_VLAN_PKT      (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
> > > +/** Outer IP cksum of TX pkt. computed by NIC for tunneling packet */
> > > +#define PKT_TX_OUTER_IP_CKSUM   (1ULL << 58)
> > > +#define PKT_TX_OUTER_IPV4_CSUM  PKT_TX_OUTER_IP_CKSUM /**< Alias of
> > > +PKT_TX_OUTER_IP_CKSUM. */
> >
> > Why do we need an alias?
> >
> > By the way, I think the alias of PKT_TX_IP_CKSUM is also uneeded and can be removed. But it's not the topic of your series.
> >
> > Also, the name PKT_TX_OUTER_IP_CKSUM does not match the name in the cover letter and commit logs.
> >
> >
> > > +
> > > +/** Tell the NIC it's an outer IPv6 packet for tunneling packet.*/
> > > +#define PKT_TX_OUTER_IPV6    (1ULL << 59)
> > >
> >
> > This flag is not in the cover letter or commit log. What is its purpose?
> 
> 
> My bad, forgot that for outer IP, will also need to specify it's type.
> So same story here as for inner IP.
> So in total, we might need 3 flags for outer IP:
> 
> /* Tells HW that outer IP is IPV4 and checksum for it should be calculated by HW. */
> PKT_TX_OUTER_IP_CKSUM
> 
> /* Tells HW that outer IP is IPV4 and checksum for it should not be calculated by HW. */
> PKT_TX_OUTER_IPV4
> 
> /* Tells HW that outer IP is IPV6. */
> PKT_TX_OUTER_IPV6
> 
> >
> >
> > >   /**
> > >    * TCP segmentation offload. To enable this offload feature for a @@
> > > -266,10 +272,9 @@ struct rte_mbuf {
> > >   			uint64_t tso_segsz:16; /**< TCP TSO segment size */
> > >
> > >   			/* fields for TX offloading of tunnels */
> > > -			uint64_t inner_l3_len:9; /**< inner L3 (IP) Hdr Length. */
> > > -			uint64_t inner_l2_len:7; /**< inner L2 (MAC) Hdr Length. */
> > > -
> > > -			/* uint64_t unused:8; */
> > > +			uint64_t outer_l3_len:9; /**< outer L3 (IP) Hdr Length. */
> > > +			uint64_t outer_l2_len:7; /**< outer L2 (MAC) Hdr Length. */
> > > +			uint64_t l4_tun_len:8; /**< L4 tunnelling header length */
> > >   		};
> > >   	};
> > >   } __rte_cache_aligned;
> > >
> >
> > About l4_tun_len, I have another comment I forgot to add in the cover letter. Can we remove it and include its length in
> outer_l2_len
> > instead? For instance, replace:
> >
> >       mb->l2_len =  eth_hdr_in;
> >       mb->l3_len = ipv4_hdr_in;
> >       mb->outer_l2_len = eth_hdr_out;
> >       mb->outer_l3_len = ipv4_hdr_out;
> >       mb->l4tun_len = vxlan_hdr;
> >       mb->ol_flags |= PKT_TX_OUT_IP_CKSUM  | PKT_TX_UDP_TUNNEL |
> >         PKT_TX_IP_CKSUM |  PKT_TX_TCP_CKSUM;
> >
> > by:
> >
> >       mb->l2_len =  eth_hdr_in;
> >       mb->l3_len = ipv4_hdr_in;
> >       mb->outer_l2_len = eth_hdr_out + vxlan_hdr;
> >       mb->outer_l3_len = ipv4_hdr_out;
> >       mb->ol_flags |= PKT_TX_OUT_IP_CKSUM  | PKT_TX_UDP_TUNNEL |
> >         PKT_TX_IP_CKSUM |  PKT_TX_TCP_CKSUM;
> >
> > I think it won't bother the driver, and it's coherent with case B.2 of your cover letter.
> 
> You probably meant:
> mb->l2_len =  eth_hdr_in + vxlan_hdr;
> ?
> Yes, I think it could be done that way too.
> Though I still prefer to keep l4tun_len - it makes things a bit cleaner (at least to me).
> After all  we do have space for it in mbuf's tx_offload.

As one more thing in favour of separate l4tun_len field:
l2_len is 7 bit long, so in theory it might be not enough, as for FVL:
12:18 L4TUNLEN L4 Tunneling Length (Teredo / GRE header / VXLAN header) defined in Words. 


> Konstantin
> 
> >
> > Regards,
> > Olivier



More information about the dev mailing list