[dpdk-dev] [PATCH v1] gso: fix marking TCP checksum flag in TCP segments

Ophir Munk ophirmu at mellanox.com
Tue Apr 24 13:45:36 CEST 2018


Hi Konstantin,
Please see inline

> -----Original Message-----
> From: Ananyev, Konstantin [mailto:konstantin.ananyev at intel.com]
> Sent: Tuesday, April 24, 2018 1:56 PM
> To: Ophir Munk <ophirmu at mellanox.com>; Hu, Jiayu <jiayu.hu at intel.com>;
> dev at dpdk.org
> Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> stable at dpdk.org
> Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP segments
> 
> 
> 
> > -----Original Message-----
> > From: Ophir Munk [mailto:ophirmu at mellanox.com]
> > Sent: Tuesday, April 24, 2018 10:44 AM
> > To: Hu, Jiayu <jiayu.hu at intel.com>; dev at dpdk.org; Ananyev, Konstantin
> > <konstantin.ananyev at intel.com>
> > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> > stable at dpdk.org
> > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > segments
> >
> > Hi Jiayu,
> > Please find comments inline
> >
> > > -----Original Message-----
> > > From: Hu, Jiayu [mailto:jiayu.hu at intel.com]
> > > Sent: Monday, April 23, 2018 7:14 AM
> > > To: Ophir Munk <ophirmu at mellanox.com>; dev at dpdk.org; Ananyev,
> > > Konstantin <konstantin.ananyev at intel.com>
> > > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > > <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> > > stable at dpdk.org
> > > Subject: RE: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > segments
> > >
> > > Hi Ophir,
> > >
> > > In the GSO design, the GSO library doesn't care about checksums,
> > > which means it doesn't check if input packets have correct
> > > checksums, and it doesn't do any checksum related work for the
> > > output GSO segments. It depends on the callers to use HW or SW
> > > checksum calculation for output packets. This is why the GSO library
> > > doesn't set PKT_TX_TCP_CKSUM. So I don't think it's a bug.
> > >
> >
> > Can you please reconsider this design? I think the GSO library should
> > imitate the HW behavior where TCP segments checksum is automatically
> > calculated without explicitly requesting it. I am not saying that GSO library
> itself should calculate the checksums - but at least it should mark each
> segment as requiring this calculation.
> 
> But gso has no idea how this packet will be processed after it.

GSO shouldn't know. It should only mark the fact that a new TCP segment was created without a TCP checksum. 

> Caller can choose to calculate L3/L4 cksum in SW or might be going to use
> HW offloads.

Assuming TSO is configured then you suggest that TAP PMD will mark by itself the TCP_CKSUM flag for all packets returned from GSO library?

> In later case nothing stops the caller to update mbuf->ol_flags in a way he
> likes (TCP_CKSUM, IP_CKSUM, etc.).
> Konstantin
> 

Please note that TCP_SEG flag is cleared by GSO library in 2 different cases:
1. Packet length equals to or is bigger than GSO size. In this case new TCP segments are created with no TCP checksum. 
2. Packet length is smaller than GSO size. In this case no TCP segmentation is required. The original packet is returned and its existing TCP checksum is OK.

In the latter case TAP PMD will always calculate TCP checksum in SW (performance concerns) where this could have been saved. 
I am thinking of a practical scenario where TSO is configured but all packets are smaller than GSO size, however TAP PMD unnecessarily recalculates their checksums.

How do you suggest to avoid this scenario?

> >
> > > In my opinion, it's not a good idea to enable HW TCP checksum
> > > calculation silently, and without the aware of the caller. In fact,
> > > the caller always know it does SW TSO (i.e. GSO), instead of real HW TSO.
> >
> > This is not correct. Consider net_failsafe with 2 sub-devices: one is
> > a HW PCI device, the other one is a SW TAP device. Failsafe must work
> transparently with these two sub-devices and the caller cannot tell if TSO is
> done in SW or HW.
> >
> > > If the caller wants HW
> > > checksum calculation, it can add PKT_TX_TCP_CKSUM to ol_flags before
> > > or after calling the GSO library.
> > >
> >
> > FYI - TAP TSO patches were submitted to dpdk.org mailing list. These
> patches use the GSO library.
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdpd
> >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38666%2F&data=02%7C01%7Coph
> irmu%40me
> >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> 9ba6a4d1
> >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=CF7EvhXG%2BrH%
> 2BPiQEbvM0
> > mC%2FSpqobneKaoV03j5VrSDw%3D&reserved=0
> >
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdpd
> >
> k.org%2Fdev%2Fpatchwork%2Fpatch%2F38667%2F&data=02%7C01%7Coph
> irmu%40me
> >
> llanox.com%7C7455c8e31c7a4364bc7108d5a9d20008%7Ca652971c7d2e4d
> 9ba6a4d1
> >
> 49256f461b%7C0%7C0%7C636601641779974217&sdata=j9WVIj%2FKq6EN
> WXu3mr5By1
> > toSowU8bqJRGZ19SxiGoI%3D&reserved=0
> >
> > Running testpmd with TAP TSO is currently broken without the suggested
> librte_gso patch.
> > Please note testpmd implementation (app/test-pmd/csumonly.c
> > b/app/test-pmd/csumonly.c) in case *both* TSO and TCP CKSUM are
> > configured:
> >
> >   if (tso_segsz)
> >       ol_flags |= PKT_TX_TCP_SEG;    // *** if TSO is applicable - the packet
> flags are only marked with PKT_TX_TCP_SEG and no
> > PKT_TX_TCP_CKSUM ***
> >    else if (tx_offloads & DEV_TX_OFFLOAD_TCP_CKSUM)
> >        ol_flags |= PKT_TX_TCP_CKSUM;     // *** PKT_TX_TCP_CKSUM is
> marked only if TSO is not applicable ***
> >   else {
> >       tcp_hdr->cksum =
> >          get_udptcp_checksum(l3_hdr, tcp_hdr,
> >
> > In other words - testpmd does not set TCP_CKSUM along with TCP_SEG
> > therefore using testpmd with TAP/TSO will result in TCP segments with 0
> (incorrect) TCP checksums.
> >
> > In addition - please note the comments in lib/librte_mbuf/rte_mbuf.h
> > which specify that PKT_TX_TCP_SEG flag implies the PKT_TX_TCP_CKSUM
> > (hence it is not required to be explicitly set by the caller)
> >
> > /**
> > * TCP segmentation offload. To enable this offload feature for a
> > * packet to be transmitted on hardware supporting TSO:
> > *  - set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
> > *    PKT_TX_TCP_CKSUM)
> > ...
> >
> > > Add Konstantin for more suggestions.
> > >
> > > Thanks,
> > > Jiayu
> > >
> > > > -----Original Message-----
> > > > From: Ophir Munk [mailto:ophirmu at mellanox.com]
> > > > Sent: Sunday, April 22, 2018 10:21 PM
> > > > To: dev at dpdk.org; Hu, Jiayu <jiayu.hu at intel.com>
> > > > Cc: Thomas Monjalon <thomas at monjalon.net>; Olga Shern
> > > > <olgas at mellanox.com>; Pascal Mazon <pascal.mazon at 6wind.com>;
> > > Ophir
> > > > Munk <ophirmu at mellanox.com>; stable at dpdk.org
> > > > Subject: [PATCH v1] gso: fix marking TCP checksum flag in TCP
> > > > segments
> > > >
> > > > Large TCP packets which are marked with PKT_TX_TCP_SEG flag are
> > > > segmented and the flag is cleared in the resulting segments,
> > > > however, the segments checksum is not updated. It is therefore
> > > > required to set the PKT_TX_TCP_CKSUM flag in each TCP segment in
> > > > order to mark for the sending driver the need to update the TCP
> > > > checksum before transmitting the segment.
> > > >
> > > > Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> > > > Cc: stable at dpdk.org
> > > >
> > > > Signed-off-by: Ophir Munk <ophirmu at mellanox.com>
> > > > ---
> > > >  lib/librte_gso/rte_gso.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > >
> > > > diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> > > > index a44e3d4..e9ce9ce 100644
> > > > --- a/lib/librte_gso/rte_gso.c
> > > > +++ b/lib/librte_gso/rte_gso.c
> > > > @@ -50,12 +50,14 @@ rte_gso_segment(struct rte_mbuf *pkt,
> > > >  			((IS_IPV4_GRE_TCP4(pkt->ol_flags) &&
> > > >  			 (gso_ctx->gso_types &
> > > > DEV_TX_OFFLOAD_GRE_TNL_TSO)))) {
> > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > >  		ret = gso_tunnel_tcp4_segment(pkt, gso_size, ipid_delta,
> > > >  				direct_pool, indirect_pool,
> > > >  				pkts_out, nb_pkts_out);
> > > >  	} else if (IS_IPV4_TCP(pkt->ol_flags) &&
> > > >  			(gso_ctx->gso_types &
> > > > DEV_TX_OFFLOAD_TCP_TSO)) {
> > > >  		pkt->ol_flags &= (~PKT_TX_TCP_SEG);
> > > > +		pkt->ol_flags |= PKT_TX_TCP_CKSUM;
> > > >  		ret = gso_tcp4_segment(pkt, gso_size, ipid_delta,
> > > >  				direct_pool, indirect_pool,
> > > >  				pkts_out, nb_pkts_out);
> > > > --
> > > > 2.7.4



More information about the dev mailing list