[dpdk-dev] [EXT] Re: [PATCH 1/3] mbuf: add Tx offloads for packet marking

Nithin Dabilpuram ndabilpuram at marvell.com
Mon May 4 12:04:57 CEST 2020


On Mon, May 04, 2020 at 11:16:40AM +0200, Olivier Matz wrote:
> On Mon, May 04, 2020 at 01:57:06PM +0530, Nithin Dabilpuram wrote:
> > Hi Olivier,
> > 
> > On Mon, May 04, 2020 at 10:06:34AM +0200, Olivier Matz wrote:
> > > Hi,
> > > 
> > > On Fri, May 01, 2020 at 04:48:21PM +0530, Jerin Jacob wrote:
> > > > On Fri, Apr 17, 2020 at 12:53 PM Nithin Dabilpuram
> > > > <nithind1988 at gmail.com> wrote:
> > > > >
> > > > > From: Nithin Dabilpuram <ndabilpuram at marvell.com>
> > > > >
> > > > > Introduce PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_ECN
> > > > > and PKT_TX_MARK_VLAN_DEI Tx offload flags to support
> > > > > packet marking.
> > > > >
> > > > > > When the packet marking feature in Traffic Manager is enabled,
> > > > > > the application has to use the three new flags to indicate
> > > > > to PMD on whether packet marking needs to be enabled on the
> > > > > specific mbuf or not. By setting the three flags, it is
> > > > > assumed by PMD that application has already verified the
> > > > > applicability of marking on that specific packet and
> > > > > PMD need not perform further checks as per RFC.
> > > > >
> > > > > Signed-off-by: Krzysztof Kanas <kkanas at marvell.com>
> > > > > Signed-off-by: Nithin Dabilpuram <ndabilpuram at marvell.com>
> > > > 
> > > > None of the ethdev TM driver implementations supports packet
> > > > marking so far.
> > > > rte_tm and rte_mbuf maintainers (Cristian, Olivier), could you review this patch?
> > > 
> > > As you know, the number of mbuf flags is limited (only 18 bits are
> > > remaining), so I think we should use them with care, i.e. for features
> > > that are generic enough.
> > 
> > I agree, but I believe this is one of the basic flags needed like other 
> > Tx checksum offload flags (like PKT_TX_IP_CKSUM, PKT_TX_IPV4, etc) which 
> > are needed to identify on which packets HW should/can apply packet marking.
> 
> PKT_TX_IP_CKSUM tells the hardware to offload the checksum
> calculation. This is pretty straightforward and there is no other
> dependency than the offload feature advertised by the PMD.
> 
> I'm sorry, I don't have a lot of experience with rte_tm.h, so it's
> difficult for me to have a global view of what is done, for instance when
> PKT_TX_MARK_VLAN_DEI is set, and what happens when it is not set.
> 
> Can you confirm that my understanding below is correct? (or correct me
> where I'm wrong)
> 
> Before your patch:
> - the application enables the port and traffic manager on it
> - the application calls rte_tm_mark_vlan_dei() to select which traffic
>   class must be marked
> - when a packet is transmitted, the traffic class is determined by the
>   hardware, and if the hardware recognizes a VLAN packet, the VLAN DEI
>   bit is set depending on traffic class
> 
> The problem is for packets that cannot be recognized by the hardware,
> correct?

Yes. For performance reasons, Octeontx2 HW always relies on the application's
knowledge of the packet instead of walking through all the layers of packet
data on Tx to identify what packet it is and where the L2, L3 and L4 headers start.

I believe there is other hardware with the same expectation, and
hence we have a need for flags such as PKT_TX_IPV4 and PKT_TX_IPV6.

Hence we want to make use of the mbuf:tx_offload field and PKT_TX_* flags
to identify the packet and to know its L2, L3 and L4 offsets.
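
To make the intended contract concrete, here is a rough sketch of the
checks an application (or a debug build of a PMD) could run before Tx,
following the assumptions listed in the patch comment. The SK_* names and
the helper are hypothetical and exist only for this illustration; the
three marking bits mirror the values proposed in this patch, and the
bit positions used for the IPv4/IPv6/IP checksum stand-ins are an
assumption, so real code should use the PKT_TX_* macros directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions proposed by this patch (not part of any released DPDK). */
#define SK_TX_MARK_VLAN_DEI (1ULL << 38)
#define SK_TX_MARK_IP_DSCP  (1ULL << 39)
#define SK_TX_MARK_IP_ECN   (1ULL << 40)
/* Stand-ins for PKT_TX_IP_CKSUM / PKT_TX_IPV4 / PKT_TX_IPV6.
 * The bit positions are an assumption made for this sketch. */
#define SK_TX_IP_CKSUM (1ULL << 54)
#define SK_TX_IPV4     (1ULL << 55)
#define SK_TX_IPV6     (1ULL << 56)

/* Return true if ol_flags and l2_len satisfy the patch comment's rules. */
static inline bool
sk_mark_flags_valid(uint64_t ol_flags, uint16_t l2_len)
{
    const uint64_t ip_mark = SK_TX_MARK_IP_DSCP | SK_TX_MARK_IP_ECN;

    /* VLAN DEI: l2_len must cover MAC addresses (12B), the VLAN
     * header (4B) and the ether type (2B). */
    if ((ol_flags & SK_TX_MARK_VLAN_DEI) && l2_len < 18)
        return false;

    if (ol_flags & ip_mark) {
        bool v4 = (ol_flags & SK_TX_IPV4) != 0;
        bool v6 = (ol_flags & SK_TX_IPV6) != 0;

        /* Exactly one of IPv4/IPv6 has to be indicated. */
        if (v4 == v6)
            return false;
        /* DSCP/ECN live in the IPv4 header, which is covered by the
         * header checksum, so checksum offload must be requested. */
        if (v4 && !(ol_flags & SK_TX_IP_CKSUM))
            return false;
        /* l2_len is the L3 start offset; it cannot be shorter than
         * a plain Ethernet header. */
        if (l2_len < 14)
            return false;
    }
    return true;
}
```

For example, sk_mark_flags_valid(SK_TX_MARK_IP_DSCP | SK_TX_IPV4, 14)
is rejected because the IPv4 checksum offload flag is missing.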

> 
> So your patch is a way to force the hardware to set the
> VLAN DEI on packets that are not recognized as VLAN packets?
> 
> How is the traffic class of the packet determined?

The packet is colored based on the Single Rate[1] or Dual Rate[2] shaping
result, and the packet color determines the traffic class. The exact mapping
from packet color to traffic class is described in the TM spec, which is
based on a few other RFCs.

[1] https://tools.ietf.org/html/rfc2697
[2] https://tools.ietf.org/html/rfc2698
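
For readers not familiar with [1], the color-blind single rate three
color marker can be sketched roughly as below. This only illustrates the
RFC 2697 algorithm; it is not how Octeontx2 HW or any PMD implements it.
All sk_* names are hypothetical, and the refill helper takes whole
seconds purely to keep the arithmetic simple.

```c
#include <stdint.h>

enum sk_color { SK_GREEN, SK_YELLOW, SK_RED };

struct sk_srtcm {
    uint64_t cir; /* committed information rate, bytes per second */
    uint64_t cbs; /* committed burst size, bytes */
    uint64_t ebs; /* excess burst size, bytes */
    uint64_t tc;  /* tokens currently in the committed bucket */
    uint64_t te;  /* tokens currently in the excess bucket */
};

/* Both buckets start full, as specified by RFC 2697. */
static inline void
sk_srtcm_init(struct sk_srtcm *m, uint64_t cir, uint64_t cbs, uint64_t ebs)
{
    m->cir = cir;
    m->cbs = cbs;
    m->ebs = ebs;
    m->tc = cbs;
    m->te = ebs;
}

/* Add CIR tokens per elapsed second: fill the committed bucket first,
 * spill the remainder into the excess bucket, drop what does not fit. */
static inline void
sk_srtcm_refill(struct sk_srtcm *m, uint64_t elapsed_sec)
{
    uint64_t tokens = m->cir * elapsed_sec;
    uint64_t room_c = m->cbs - m->tc;
    uint64_t room_e = m->ebs - m->te;

    if (tokens <= room_c) {
        m->tc += tokens;
        return;
    }
    m->tc = m->cbs;
    tokens -= room_c;
    m->te += (tokens < room_e) ? tokens : room_e;
}

/* Color-blind mode: green if the committed bucket can pay for the
 * packet, yellow if the excess bucket can, red otherwise. */
static inline enum sk_color
sk_srtcm_color(struct sk_srtcm *m, uint32_t pkt_len)
{
    if (m->tc >= pkt_len) {
        m->tc -= pkt_len;
        return SK_GREEN;
    }
    if (m->te >= pkt_len) {
        m->te -= pkt_len;
        return SK_YELLOW;
    }
    return SK_RED;
}
```

The dual rate marker in [2] differs mainly in that the second bucket is
filled at its own peak rate instead of receiving the overflow.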

> 
> 
> > > From what I understand, this feature is bound to octeontx2, so using a
> > > mbuf dynamic flag would make more sense here. There are some examples in
> > > dpdk repository, just grep for "dynflag".
> > 
> > This is not an octeontx2-specific flag; any PMD with the "packet marking
> > feature" enabled would need these flags to identify which packets need to
> > be marked. This is the first PMD that supports the packet marking feature,
> > and hence it was not exposed earlier.
> > 
> > For example, to mark VLAN DEI, the PMD cannot always assume that there is a
> > preexisting VLAN header at Byte 12, as there is no guarantee that the ethernet
> > header always starts at Byte 0 (custom headers before the ethernet hdr).
> > 
> > > 
> > > Also, I think that the feature availability should be advertised through
> > > an ethdev offload, so an application can know at initialization time
> > > that these flags can be used.
> > 
> > Feature availability is already part of the TM spec in rte_tm.h 
> > struct rte_tm_capabilities:mark_vlan_dei_supported
> > struct rte_tm_capabilities:mark_ip_ecn_[sctp|tcp]_supported
> > struct rte_tm_capabilities:mark_ip_dscp_supported
> 
> Does this mean that any driver advertising this existing feature flag
> has to support the new mbuf flags too? Shouldn't we have a specific
> feature for it?

Yes, I thought PMDs need to support both.
I'm fine with adding a specific feature flag for the offload flags alone
if you insist, or if there are other PMDs which don't need the offload flags
for packet marking. I was not able to find out about other PMDs, as
none of the existing PMDs support packet marking.

> 
> Please also see few comments below.
> 
> > > > > ---
> > > > >  doc/guides/nics/features.rst    | 14 ++++++++++++++
> > > > >  lib/librte_mbuf/rte_mbuf.c      |  6 ++++++
> > > > >  lib/librte_mbuf/rte_mbuf_core.h | 36 ++++++++++++++++++++++++++++++++++--
> > > > >  3 files changed, 54 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> > > > > index edd21c4..bc978fb 100644
> > > > > --- a/doc/guides/nics/features.rst
> > > > > +++ b/doc/guides/nics/features.rst
> > > > > @@ -913,6 +913,20 @@ Supports to get Rx/Tx packet burst mode information.
> > > > >  * **[implements] eth_dev_ops**: ``rx_burst_mode_get``, ``tx_burst_mode_get``.
> > > > >  * **[related] API**: ``rte_eth_rx_burst_mode_get()``, ``rte_eth_tx_burst_mode_get()``.
> > > > >
> > > > > +.. _nic_features_traffic_manager_packet_marking_offload:
> > > > > +
> > > > > +Traffic Manager Packet marking offload
> > > > > +--------------------------------------
> > > > > +
> > > > > > +Supports enabling packet marking offload on a specific mbuf.
> > > > > +
> > > > > +* **[uses]     mbuf**: ``mbuf.ol_flags:PKT_TX_MARK_IP_DSCP``,
> > > > > +  ``mbuf.ol_flags:PKT_TX_MARK_IP_ECN``, ``mbuf.ol_flags:PKT_TX_MARK_VLAN_DEI``,
> > > > > +  ``mbuf.ol_flags:PKT_TX_IPV4``, ``mbuf.ol_flags:PKT_TX_IPV6``.
> > > > > +* **[uses]     mbuf**: ``mbuf.l2_len``.
> > > > > +* **[related] API**: ``rte_tm_mark_ip_dscp()``, ``rte_tm_mark_ip_ecn()``,
> > > > > +  ``rte_tm_mark_vlan_dei()``.
> > > > > +
> > > > >  .. _nic_features_other:
> > > > >
> > > > >  Other dev ops not represented by a Feature
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c
> > > > > index cd5794d..5c6896d 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf.c
> > > > > +++ b/lib/librte_mbuf/rte_mbuf.c
> > > > > @@ -880,6 +880,9 @@ const char *rte_get_tx_ol_flag_name(uint64_t mask)
> > > > >         case PKT_TX_SEC_OFFLOAD: return "PKT_TX_SEC_OFFLOAD";
> > > > >         case PKT_TX_UDP_SEG: return "PKT_TX_UDP_SEG";
> > > > >         case PKT_TX_OUTER_UDP_CKSUM: return "PKT_TX_OUTER_UDP_CKSUM";
> > > > > +       case PKT_TX_MARK_VLAN_DEI: return "PKT_TX_MARK_VLAN_DEI";
> > > > > +       case PKT_TX_MARK_IP_DSCP: return "PKT_TX_MARK_IP_DSCP";
> > > > > +       case PKT_TX_MARK_IP_ECN: return "PKT_TX_MARK_IP_ECN";
> > > > >         default: return NULL;
> > > > >         }
> > > > >  }
> > > > > @@ -916,6 +919,9 @@ rte_get_tx_ol_flag_list(uint64_t mask, char *buf, size_t buflen)
> > > > >                 { PKT_TX_SEC_OFFLOAD, PKT_TX_SEC_OFFLOAD, NULL },
> > > > >                 { PKT_TX_UDP_SEG, PKT_TX_UDP_SEG, NULL },
> > > > >                 { PKT_TX_OUTER_UDP_CKSUM, PKT_TX_OUTER_UDP_CKSUM, NULL },
> > > > > +               { PKT_TX_MARK_VLAN_DEI, PKT_TX_MARK_VLAN_DEI, NULL },
> > > > > +               { PKT_TX_MARK_IP_DSCP, PKT_TX_MARK_IP_DSCP, NULL },
> > > > > +               { PKT_TX_MARK_IP_ECN, PKT_TX_MARK_IP_ECN, NULL },
> > > > >         };
> > > > >         const char *name;
> > > > >         unsigned int i;
> > > > > diff --git a/lib/librte_mbuf/rte_mbuf_core.h b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > index b9a59c8..d9f1290 100644
> > > > > --- a/lib/librte_mbuf/rte_mbuf_core.h
> > > > > +++ b/lib/librte_mbuf/rte_mbuf_core.h
> > > > > @@ -187,11 +187,40 @@ extern "C" {
> > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE */
> > > > >
> > > > >  #define PKT_FIRST_FREE (1ULL << 23)
> > > > > -#define PKT_LAST_FREE (1ULL << 40)
> > > > > +#define PKT_LAST_FREE (1ULL << 37)
> > > > >
> > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE  */
> > > > >
> > > > >  /**
> > > > > > + * Packet marking offload flags. These flags indicate what kind
> > > > > > + * of packet marking needs to be applied on a given mbuf when
> > > > > > + * appropriate Traffic Manager configuration is in place.
> > > > > > + * When the user sets these flags on an mbuf, the assumptions below are made
> > > > > + * 1) When PKT_TX_MARK_VLAN_DEI is set,
> > > > > + * a) PMD assumes pkt to be a 802.1q packet.
> 
> What does that imply?

I meant that by setting the flag, the application indicates that the packet
has a VLAN header adhering to the IEEE 802.1Q spec.

> 
> > > > > + * b) Application should also set mbuf.l2_len where 802.1Q header is
> > > > > + *    at (mbuf.l2_len - 6) offset.
> 
> Why mbuf.l2_len - 6 ?

When a VLAN header is present, the L2 header will be
{custom header 'X' Bytes}:{Ethernet SRC+DST (12B)}:{VLAN Header (4B)}:{Ether Type (2B)}
l2_len = X + 12 + 4 + 2
So, the VLAN header starts at (l2_len - 6) bytes.
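
As a small illustration of that arithmetic, the sketch below computes
the VLAN header offset from l2_len and sets the DEI bit in software.
The helper names are hypothetical and only show what the offset means;
in the proposed offload the HW does the marking, not this code. The DEI
position (bit 4 of the first TCI byte, right after the 2-byte TPID)
follows the IEEE 802.1Q tag layout.

```c
#include <stdint.h>

#define SK_VLAN_HDR_LEN   4 /* TPID (2B) + TCI (2B) */
#define SK_ETHER_TYPE_LEN 2
#define SK_MIN_VLAN_L2    18 /* 12B MAC addresses + 4B VLAN + 2B type */

/* Offset of the VLAN header from the start of packet data,
 * or -1 if l2_len is too small to contain one. */
static inline int
sk_vlan_hdr_offset(uint16_t l2_len)
{
    if (l2_len < SK_MIN_VLAN_L2)
        return -1;
    return l2_len - (SK_VLAN_HDR_LEN + SK_ETHER_TYPE_LEN);
}

/* Set the DEI bit in place. Returns 0 on success, -1 on bad l2_len. */
static inline int
sk_set_vlan_dei(uint8_t *pkt, uint16_t l2_len)
{
    int off = sk_vlan_hdr_offset(l2_len);

    if (off < 0)
        return -1;
    /* The TCI follows the 2-byte TPID; DEI is bit 4 of its first byte
     * (PCP occupies the top three bits, VID the remaining twelve). */
    pkt[off + 2] |= 0x10;
    return 0;
}
```

With no custom header (X = 0), l2_len is 18 and the offset is the usual
Byte 12; with a 4-byte custom header, l2_len is 22 and the offset is 16.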

> 
> > > > > + * 2) When PKT_TX_MARK_IP_DSCP is set,
> > > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6
> > > > > > + *    to indicate whether it is an IPv4 packet or an IPv6 packet
> > > > > > + *    for DSCP marking. It should also set PKT_TX_IP_CKSUM if it is
> > > > > > + *    an IPv4 pkt.
> > > > > + * b) Application should also set mbuf.l2_len that indicates
> > > > > + *    start offset of L3 header.
> > > > > + * 3) When PKT_TX_MARK_IP_ECN is set,
> > > > > + * a) Application should also set either PKT_TX_IPV4 or PKT_TX_IPV6.
> > > > > + *    It should also set PKT_TX_IP_CKSUM if it is IPv4 pkt.
> > > > > + * b) PMD will assume pkt L4 protocol is either TCP or SCTP and
> > > > > + *    ECN is set to 2'b01 or 2'b10 as per RFC 3168 and hence HW
> > > > > + *    can mark the packet for a configured color.
> > > > > + * c) Application should also set mbuf.l2_len that indicates
> > > > > + *    start offset of L3 header.
> > > > > + */
> > > > > +#define PKT_TX_MARK_VLAN_DEI           (1ULL << 38)
> > > > > +#define PKT_TX_MARK_IP_DSCP            (1ULL << 39)
> > > > > +#define PKT_TX_MARK_IP_ECN             (1ULL << 40)
> 
> We should have one comment per define.
Ack, will fix in V2.

> 
> 
> > > > > +
> > > > > +/**
> > > > >   * Outer UDP checksum offload flag. This flag is used for enabling
> > > > >   * outer UDP checksum in PMD. To use outer UDP checksum, the user needs to
> > > > >   * 1) Enable the following in mbuf,
> > > > > @@ -384,7 +413,10 @@ extern "C" {
> > > > >                 PKT_TX_MACSEC |          \
> > > > >                 PKT_TX_SEC_OFFLOAD |     \
> > > > >                 PKT_TX_UDP_SEG |         \
> > > > > -               PKT_TX_OUTER_UDP_CKSUM)
> > > > > +               PKT_TX_OUTER_UDP_CKSUM | \
> > > > > +               PKT_TX_MARK_VLAN_DEI |   \
> > > > > +               PKT_TX_MARK_IP_DSCP |    \
> > > > > +               PKT_TX_MARK_IP_ECN)
> > > > >
> > > > >  /**
> > > > >   * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
> > > > > --
> > > > > 2.8.4
> > > > >

