[dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet types

Zhang, Helin helin.zhang at intel.com
Mon Feb 2 02:43:25 CET 2015


Hi Olivier

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Friday, January 30, 2015 9:56 PM
> To: Zhang, Helin; dev at dpdk.org
> Cc: Stephen Hemminger
> Subject: Re: [dpdk-dev] [PATCH 01/17] mbuf: add definitions of unified packet
> types
> 
> Hi Helin,
> 
> On 01/29/2015 04:15 AM, Helin Zhang wrote:
> > As there are only 6 bit flags in ol_flags for indicating packet types,
> > which is not enough to describe all the possible packet types hardware
> > can recognize. For example, i40e hardware can recognize more than 150
> > packet types. Unified packet type is composed of tunnel type, L3 type,
> > L4 type and inner L3 type fields, and can be stored in 16 bits mbuf
> > field of 'packet_type'.
> >
> > Signed-off-by: Helin Zhang <helin.zhang at intel.com>
> > Signed-off-by: Cunming Liang <cunming.liang at intel.com>
> > Signed-off-by: Jijiang Liu <jijiang.liu at intel.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 74
> > ++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 74 insertions(+)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 16059c6..94ae344 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -165,6 +165,80 @@ extern "C" {
> >  /* Use final bit of flags to indicate a control mbuf */
> >  #define CTRL_MBUF_FLAG       (1ULL << 63) /**< Mbuf contains control
> data */
> >
> > +/*
> > + * Sixteen bits are divided into several fields to mark packet types.
> > +Note that
> > + * each field is indexical.
> > + * - Bit 3:0 is for tunnel types.
> > + * - Bit 7:4 is for L3 or outer L3 (for tunneling case) types.
> > + * - Bit 10:8 is for L4 types. It can also be used for inner L4 types for
> > + *   tunneling packets.
> > + * - Bit 13:11 is for inner L3 types.
> > + * - Bit 15:14 is reserved.
> 
> Is there a reason why using this specific order?
Yes, to support ixgbe Vector PMD, outer L3 types and L4 types need to be contiguous
and in this order.

> 
> Also, there are 4 bits for outer L3 types and 3 bits for inner L3 types, but both of
> them have 6 different supported types. Is it intentional?
Yes, it is to support ixgbe Vector PMD. Contiguous 7 bits are needed, though 1 bit wasted.

> 
> > + *
> > + * To be compitable with Vector PMD, RTE_PTYPE_L3_IPV4,
> > + RTE_PTYPE_L3_IPV4_EXT,
> 
> compitable -> compatible
Good catch! It will be fixed in next version. Thanks!

> 
> > + * RTE_PTYPE_L3_IPV6, RTE_PTYPE_L3_IPV6_EXT, RTE_PTYPE_L4_TCP,
> > +RTE_PTYPE_L4_UDP
> > + * and RTE_PTYPE_L4_SCTP should be kept as below in a contiguous 7 bits.
> > + *
> > + * Note that L3 types values are selected for checking IPV4/IPV6
> > +header from
> > + * performance point of view. Reading annotations of
> > +RTE_ETH_IS_IPV4_HDR and
> > + * RTE_ETH_IS_IPV6_HDR is needed for any future changes of L3 type
> values.
> > + */
> > +#define RTE_PTYPE_UNKNOWN                   0x0000 /*
> 0b0000000000000000 */
> > +/* bit 3:0 for tunnel types */
> > +#define RTE_PTYPE_TUNNEL_IP                 0x0001 /*
> 0b0000000000000001 */
> > +#define RTE_PTYPE_TUNNEL_TCP                0x0002 /*
> 0b0000000000000010 */
> > +#define RTE_PTYPE_TUNNEL_UDP                0x0003 /*
> 0b0000000000000011 */
> > +#define RTE_PTYPE_TUNNEL_GRE                0x0004 /*
> 0b0000000000000100 */
> > +#define RTE_PTYPE_TUNNEL_VXLAN              0x0005 /*
> 0b0000000000000101 */
> > +#define RTE_PTYPE_TUNNEL_NVGRE              0x0006 /*
> 0b0000000000000110 */
> > +#define RTE_PTYPE_TUNNEL_GENEVE             0x0007 /*
> 0b0000000000000111 */
> > +#define RTE_PTYPE_TUNNEL_GRENAT             0x0008 /*
> 0b0000000000001000 */
> > +#define RTE_PTYPE_TUNNEL_GRENAT_MAC         0x0009 /*
> 0b0000000000001001 */
> > +#define RTE_PTYPE_TUNNEL_GRENAT_MACVLAN     0x000a /*
> 0b0000000000001010 */
> > +#define RTE_PTYPE_TUNNEL_MASK               0x000f /*
> 0b0000000000001111 */
> > +/* bit 7:4 for L3 types */
> > +#define RTE_PTYPE_L3_IPV4                   0x0010 /*
> 0b0000000000010000 */
> > +#define RTE_PTYPE_L3_IPV4_EXT               0x0030 /*
> 0b0000000000110000 */
> > +#define RTE_PTYPE_L3_IPV6                   0x0040 /*
> 0b0000000001000000 */
> > +#define RTE_PTYPE_L3_IPV4_EXT_UNKNOWN       0x0090 /*
> 0b0000000010010000 */
> > +#define RTE_PTYPE_L3_IPV6_EXT               0x00c0 /*
> 0b0000000011000000 */
> > +#define RTE_PTYPE_L3_IPV6_EXT_UNKNOWN       0x00e0 /*
> 0b0000000011100000 */
> > +#define RTE_PTYPE_L3_MASK                   0x00f0 /*
> 0b0000000011110000 */
> 
> can we expect that when RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV4_EXT or
> RTE_PTYPE_L3_IPV4_EXT_UNKNOWN is set, the hardware also verified the
> L3 checksum?
RTE_PTYPE_L3_IPV4 means there is NONE-EXT. Each time only one of above 3 can be set.
These bits don't indicate any checksum, checksum should be indicated by other flags.
They are just for packet types hardware can recognized.

> 
> My understanding is:
> 
> - if packet_type is IPv4* and PKT_RX_IP_CKSUM_BAD is 0
>    -> checksum was checked by hw and is good
> - if packet_type is IPv4* and PKT_RX_IP_CKSUM_BAD is 1
>    -> checksum was checked by hw and is bad
> - if packet_type is not IPv4*
>    -> checksum was not checked by hw
> 
> I think it would solve the problem asked by Stephen
> http://dpdk.org/ml/archives/dev/2015-January/011550.html
> 
> > +/* bit 10:8 for L4 types */
> > +#define RTE_PTYPE_L4_TCP                    0x0100 /*
> 0b0000000100000000 */
> > +#define RTE_PTYPE_L4_UDP                    0x0200 /*
> 0b0000001000000000 */
> > +#define RTE_PTYPE_L4_FRAG                   0x0300 /*
> 0b0000001100000000 */
> > +#define RTE_PTYPE_L4_SCTP                   0x0400 /*
> 0b0000010000000000 */
> > +#define RTE_PTYPE_L4_ICMP                   0x0500 /*
> 0b0000010100000000 */
> > +#define RTE_PTYPE_L4_NONFRAG                0x0600 /*
> 0b0000011000000000 */
> > +#define RTE_PTYPE_L4_MASK                   0x0700 /*
> 0b0000011100000000 */
> 
> Same question for L4.
> 
> Note: it would means that if a hardware is able to recognize a TCP packet but
> not to verify the checksum, it has to set RTE_PTYPE_L4 to unknown.
> 
> > +/* bit 13:11 for inner L3 types */
> > +#define RTE_PTYPE_INNER_L3_IPV4             0x0800 /*
> 0b0000100000000000 */
> > +#define RTE_PTYPE_INNER_L3_IPV4_EXT         0x1000 /*
> 0b0001000000000000 */
> > +#define RTE_PTYPE_INNER_L3_IPV6             0x1800 /*
> 0b0001100000000000 */
> > +#define RTE_PTYPE_INNER_L3_IPV6_EXT         0x2000 /*
> 0b0010000000000000 */
> > +#define RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN 0x2800 /*
> > +0b0010100000000000 */ #define
> RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN 0x3000 /* 0b0011000000000000
> */
We cannot define the hardware behaviors, it just reports the hardware recognized
packet information directly to the mbuf.
Based on my experiment on i40e hardware, if a IPV4 packet with wrong checksum,
by default, the PMD driver cannot see the packet at all. So we don't need to care
about it too much!
Thanks for the good comments!

> > +#define RTE_PTYPE_INNER_L3_MASK             0x3800 /*
> 0b0011100000000000 */
> > +/* bit 15:14 reserved */
> > +
> > +/**
> > + * Check if the (outer) L3 header is IPv4. To avoid comparing IPv4
> > +types one by
> > + * one, bit 4 is selected to be used for IPv4 only. Then checking bit
> > +4 can
> > + * determin if it is an IPV4 packet.
> > + */
> > +#define  RTE_ETH_IS_IPV4_HDR(ptype) ((ptype) & RTE_PTYPE_L3_IPV4)
> > +
> > +/**
> > + * Check if the (outer) L3 header is IPv4. To avoid comparing IPv4
> > +types one by
> > + * one, bit 6 is selected to be used for IPv4 only. Then checking bit
> > +6 can
> > + * determin if it is an IPV4 packet.
> > + */
> > +#define  RTE_ETH_IS_IPV6_HDR(ptype) ((ptype) & RTE_PTYPE_L3_IPV6)
> > +
> > +/* Check if it is a tunneling packet */ #define
> > +RTE_ETH_IS_TUNNEL_PKT(ptype) ((ptype) & RTE_PTYPE_TUNNEL_MASK)
> > +
> >  /**
> >   * Get the name of a RX offload flag
> >   *
> >
> 
> Thanks,
> Olivier


More information about the dev mailing list