[dpdk-dev] [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support

Jiayu Hu jiayu.hu at intel.com
Mon Sep 4 05:31:50 CEST 2017


Hi Konstantin,

About the IP identifier, I check the linux codes and have some feedbacks inline.

On Wed, Aug 30, 2017 at 09:38:33AM +0800, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: Hu, Jiayu
> > Sent: Thursday, August 24, 2017 3:16 PM
> > To: dev at dpdk.org
> > Cc: Kavanagh, Mark B <mark.b.kavanagh at intel.com>; Ananyev, Konstantin <konstantin.ananyev at intel.com>; Tan, Jianfeng
> > <jianfeng.tan at intel.com>; Hu, Jiayu <jiayu.hu at intel.com>
> > Subject: [PATCH 2/5] gso/lib: add TCP/IPv4 GSO support
> > 
> > This patch adds GSO support for TCP/IPv4 packets. Supported packets
> > may include a single VLAN tag. TCP/IPv4 GSO assumes that all input
> > packets have correct checksums, and doesn't update checksums for output
> > packets (the responsibility for this lies with the application).
> > Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.
> > 
> > TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indrect
> > MBUF, to organize an output packet. Note that we refer to these two
> > chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
> > header, while the indirect mbuf simply points to a location within the
> > original packet's payload. Consequently, use of the GSO library requires
> > multi-segment MBUF support in the TX functions of the NIC driver.
> > 
> > If a packet is GSOed, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
> > result, when all of its GSOed segments are freed, the packet is freed
> > automatically.
> > 
> > Signed-off-by: Jiayu Hu <jiayu.hu at intel.com>
> > Signed-off-by: Mark Kavanagh <mark.b.kavanagh at intel.com>
> > ---
> > +void
> > +gso_update_pkt_headers(struct rte_mbuf *pkt, uint16_t nb_segments,
> > +		struct rte_mbuf **out_segments)
> > +{
> > +	struct ipv4_hdr *ipv4_hdr;
> > +	struct tcp_hdr *tcp_hdr;
> > +	struct rte_mbuf *seg;
> > +	uint32_t sent_seq;
> > +	uint16_t offset, i;
> > +	uint16_t tail_seg_idx = nb_segments - 1, id;
> > +
> > +	switch (pkt->packet_type) {
> > +	case ETHER_VLAN_IPv4_TCP_PKT:
> > +	case ETHER_IPv4_TCP_PKT:
> 
> Might be worth to put code below in a separate function:
> update_inner_tcp_hdr(..) or so.
> Then you can reuse it for tunneled cases too.
> 
> > +		ipv4_hdr = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) 
> > +				pkt->l2_len);
> > +		tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
> > +		id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
> > +		sent_seq = rte_be_to_cpu_32(tcp_hdr->sent_seq);
> > +
> > +		for (i = 0; i < nb_segments; i++) {
> > +			seg = out_segments[i];
> > +
> > +			offset = seg->l2_len;
> > +			update_ipv4_header(rte_pktmbuf_mtod(seg, char *),
> > +					offset, seg->pkt_len, id);
> > +			id++;
> 
> Who would be responsible to make sure that we wouldn't have consecutive packets with the IPV4 id?
> Would be the upper layer that forms the packet or gso library or ...?

Linux supports two kinds of IP identifier: fixed identifier and incremental identifier, and
which one to use depends on upper protocol modules. Specifically, if the protocol module
wants fixed identifiers, it will set SKB_GSO_TCP_FIXEDID to skb->gso_type, and then
inet_gso_segment() will keep identifiers the same. Otherwise, all segments will have
incremental identifiers. The reason for this design is that some protocols may choose fixed
IP identifiers, like TCP (from RFC791). This design also shows that linux ignores the issue
of repeated IP identifiers.

>From the perspective of DPDK, we need to solve two problems. One is if ignore the issue of
repeated IP identifiers. The other is if the GSO library provides an interface to upper
applications to enable them to choose fixed or incremental identifiers, or simply uses
incremental IP identifiers.

Do you have any suggestions?

Thanks,
Jiayu

> 
> > +
> > +			offset += seg->l3_len;
> > +			update_tcp_header(rte_pktmbuf_mtod(seg, char *),
> > +					offset, sent_seq, i < tail_seg_idx);
> > +			sent_seq += seg->next->data_len;
> > +		}
> > +		break;
> > +	}
> > +}
> > --
> > 2.7.4


More information about the dev mailing list