[dpdk-dev] [PATCH v3 2/5] gso: add TCP/IPv4 GSO support

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Sep 14 11:10:56 CEST 2017



> -----Original Message-----
> From: Kavanagh, Mark B
> Sent: Thursday, September 14, 2017 10:01 AM
> To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Hu, Jiayu <jiayu.hu at intel.com>
> Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> 
> >From: Ananyev, Konstantin
> >Sent: Thursday, September 14, 2017 9:40 AM
> >To: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; Hu, Jiayu
> ><jiayu.hu at intel.com>
> >Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >
> >
> >
> >> -----Original Message-----
> >> From: Kavanagh, Mark B
> >> Sent: Thursday, September 14, 2017 9:35 AM
> >> To: Hu, Jiayu <jiayu.hu at intel.com>; Ananyev, Konstantin
> ><konstantin.ananyev at intel.com>
> >> Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >>
> >> >From: Hu, Jiayu
> >> >Sent: Thursday, September 14, 2017 2:00 AM
> >> >To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Kavanagh, Mark B
> >> ><mark.b.kavanagh at intel.com>
> >> >Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
> >> >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >
> >> >Hi Konstantin,
> >> >
> >> >> -----Original Message-----
> >> >> From: Ananyev, Konstantin
> >> >> Sent: Wednesday, September 13, 2017 11:13 PM
> >> >> To: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; Hu, Jiayu
> >> >> <jiayu.hu at intel.com>
> >> >> Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
> >> >> Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >>
> >> >> Hi Mark,
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: Kavanagh, Mark B
> >> >> > Sent: Wednesday, September 13, 2017 3:52 PM
> >> >> > To: Ananyev, Konstantin <konstantin.ananyev at intel.com>; Hu, Jiayu
> >> >> <jiayu.hu at intel.com>
> >> >> > Cc: dev at dpdk.org; Tan, Jianfeng <jianfeng.tan at intel.com>
> >> >> > Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >> >
> >> >> > >From: Ananyev, Konstantin
> >> >> > >Sent: Wednesday, September 13, 2017 10:38 AM
> >> >> > >To: Hu, Jiayu <jiayu.hu at intel.com>
> >> >> > >Cc: dev at dpdk.org; Kavanagh, Mark B <mark.b.kavanagh at intel.com>;
> >> >> Tan, Jianfeng
> >> >> > ><jianfeng.tan at intel.com>
> >> >> > >Subject: RE: [PATCH v3 2/5] gso: add TCP/IPv4 GSO support
> >> >> > >
> >> >> > >
> >> >> > >
> >> >> > >> > > +
> >> >> > >> > > +int
> >> >> > >> > > +gso_tcp4_segment(struct rte_mbuf *pkt,
> >> >> > >> > > +		uint16_t gso_size,
> >> >> > >> > > +		uint8_t ipid_delta,
> >> >> > >> > > +		struct rte_mempool *direct_pool,
> >> >> > >> > > +		struct rte_mempool *indirect_pool,
> >> >> > >> > > +		struct rte_mbuf **pkts_out,
> >> >> > >> > > +		uint16_t nb_pkts_out)
> >> >> > >> > > +{
> >> >> > >> > > +	struct ipv4_hdr *ipv4_hdr;
> >> >> > >> > > +	uint16_t tcp_dl;
> >> >> > >> > > +	uint16_t pyld_unit_size;
> >> >> > >> > > +	uint16_t hdr_offset;
> >> >> > >> > > +	int ret = 1;
> >> >> > >> > > +
> >> >> > >> > > +	ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *)
> >> >> +
> >> >> > >> > > +			pkt->l2_len);
> >> >> > >> > > +	/* Don't process the fragmented packet */
> >> >> > >> > > +	if (unlikely((ipv4_hdr->fragment_offset & rte_cpu_to_be_16(
> >> >> > >> > > +
> >> >> 	IPV4_HDR_DF_MASK)) == 0)) {
> >> >> > >> >
> >> >> > >> >
> >> >> > >> > It is not a check for fragmented packet - it is a check that
> >> >> fragmentation
> >> >> > >is allowed for that packet.
> >> >> > >> > Should be IPV4_HDR_DF_MASK - 1,  I think.
> >> >> > >
> >> >> > >DF bit doesn't indicate is packet fragmented or not.
> >> >> > >It forbids to fragment packet any further.
> >> >> > >To check is packet already fragmented or not, you have to check MF bit
> >> >> and
> >> >> > >frag_offset.
> >> >> > >Both have to be zero for un-fragmented packets.
> >> >> > >
> >> >> > >>
> >> >> > >> IMO, IPV4_HDR_DF_MASK whose value is (1 << 14) is used to get DF
> >bit.
> >> >> It's a
> >> >> > >> little-endian value. But ipv4_hdr->fragment_offset is big-endian
> >order.
> >> >> > >> So the value of DF bit should be "ipv4_hdr->fragment_offset &
> >> >> > >rte_cpu_to_be_16(
> >> >> > >> IPV4_HDR_DF_MASK)". If this value is 0, the input packet is
> >fragmented.
> >> >> > >>
> >> >> > >> >
> >> >> > >> > > +		pkts_out[0] = pkt;
> >> >> > >> > > +		return ret;
> >> >> > >> > > +	}
> >> >> > >> > > +
> >> >> > >> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt-
> >> >> >l3_len -
> >> >> > >> > > +		pkt->l4_len;
> >> >> > >> >
> >> >> > >> > Why not use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len?
> >> >> > >>
> >> >> > >> Yes, we can use pkt->pkt_len - pkt->l2_len -pkt_l3_len - pkt_l4_len
> >> >here.
> >> >> > >>
> >> >> > >> >
> >> >> > >> > > +	/* Don't process the packet without data */
> >> >> > >> > > +	if (unlikely(tcp_dl == 0)) {
> >> >> > >> > > +		pkts_out[0] = pkt;
> >> >> > >> > > +		return ret;
> >> >> > >> > > +	}
> >> >> > >> > > +
> >> >> > >> > > +	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
> >> >> > >> > > +	pyld_unit_size = gso_size - hdr_offset - ETHER_CRC_LEN;
> >> >> > >> >
> >> >> > >> > Hmm, why do we need to count CRC_LEN here?
> >> >> > >>
> >> >> > >> Yes, we shouldn't count ETHER_CRC_LEN here. Its length should be
> >> >> > >> included in gso_size.
> >> >> > >
> >> >> > >Why?
> >> >> > >What is the point to account crc len into this computation?
> >> >> > >Why not just assume that gso_size is already a max_frame_size -
> >crc_len
> >> >> > >As I remember, when we RX packet crc bytes will be already stripped,
> >> >> > >when user populates the packet, he doesn't care about crc bytes too.
> >> >> >
> >> >> > Hi Konstantin,
> >> >> >
> >> >> > When packet is tx'd, the 4B for CRC are added back into the packet; if
> >the
> >> >> payload is already at max capacity, then the actual segment size
> >> >> > will be 4B larger than expected (e.g. 1522B, as opposed to 1518B).
> >> >> > To prevent that from happening, we account for the CRC len in this
> >> >> calculation.
> >> >>
> >> >>
> >> >> Ok, and what prevents you to set gso_ctx.gso_size = 1514;  /*ether frame
> >> >> size without crc bytes */
> >> >> ?
> >>
> >> Hey Konstantin,
> >>
> >> If the user sets the gso_size to 1514, the resultant output segments' size
> >should be 1514, and not 1518.
> 
> Just to clarify - I meant here that the final output segment, including CRC len, should be 1514. I think this is where we're crossing wires ;)
> 
> >
> >Yes and then NIC HW will add CRC bytes for you.
> >You are not filling CRC bytes in HW, and when providing to the HW size to send
> >- it is a payload size
> >(CRC bytes are not accounted).
> >Konstantin
> 
> Yes, exactly - in that case though, the gso_size specified by the user is not the actual final output segment size, but (segment size - 4B),
> right?

CRC bytes will be add by HW, it is totally transparent for user.

> 
> We can set that expectation in documentation, but from an application's/user's perspective, do you think that this might be
> confusing/misleading?

I think it would be much more confusing to make user account for CRC bytes.
Let say when in DPDK you form a packet and send it out via rte_eth_tx_burst()
you specify only your payload size, not payload size plus crc bytes that HW will add for you.
Konstantin

> 
> Thanks again,
> Mark
> 
> >
> > Consequently, the payload capacity
> >> of each segment would be reduced accordingly.
> >> The user only cares about the output segment size (i.e. gso_ctx.gso_size);
> >we need to ensure that the size of the segments that are
> >> produced is consistent with that. As a result, we need to ensure that any
> >packet overhead is accounted for in the segment size, before we
> >> can determine how much space remains for data.
> >>
> >> Hope this makes sense.
> >>
> >> Thanks,
> >> Mark
> >>
> >> >
> >> >Exactly, applications can set 1514 to gso_segsz instead of 1518, if the
> >lower
> >> >layer
> >> >will add CRC to the packet.
> >> >
> >> >Jiayu
> >> >
> >> >> Konstantin
> >> >>
> >> >> >
> >> >> > If I've missed anything, please do let me know!
> >> >> >
> >> >> > Thanks,
> >> >> > Mark
> >> >> >
> >> >> > >
> >> >> > >Konstantin


More information about the dev mailing list