[dpdk-dev] [PATCH v7 2/3] lib/gro: add TCP/IPv4 GRO support

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Jun 30 14:07:13 CEST 2017


Hi Jiayu,

> > > +
> > > +int32_t
> > > +gro_tcp4_reassemble(struct rte_mbuf *pkt,
> > > +		struct gro_tcp_tbl *tbl,
> > > +		uint32_t max_packet_size)
> > > +{
> > > +	struct ether_hdr *eth_hdr;
> > > +	struct ipv4_hdr *ipv4_hdr;
> > > +	struct tcp_hdr *tcp_hdr;
> > > +	uint16_t ipv4_ihl, tcp_hl, tcp_dl;
> > > +
> > > +	struct tcp_key key;
> > > +	uint32_t cur_idx, prev_idx, item_idx;
> > > +	uint32_t i, key_idx;
> > > +
> > > +	eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
> > > +	ipv4_hdr = (struct ipv4_hdr *)(eth_hdr + 1);
> > > +	ipv4_ihl = IPv4_HDR_LEN(ipv4_hdr);
> > > +
> > > +	/* check if the packet should be processed */
> > > +	if (ipv4_ihl < sizeof(struct ipv4_hdr))
> > > +		goto fail;
> > > +	tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + ipv4_ihl);
> > > +	tcp_hl = TCP_HDR_LEN(tcp_hdr);
> > > +	tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - ipv4_ihl
> > > +		- tcp_hl;
> > > +	if (tcp_dl == 0)
> > > +		goto fail;
> > > +
> > > +	/* find a key and traverse all packets in its item group */
> > > +	key.eth_saddr = eth_hdr->s_addr;
> > > +	key.eth_daddr = eth_hdr->d_addr;
> > > +	key.ip_src_addr[0] = rte_be_to_cpu_32(ipv4_hdr->src_addr);
> > > +	key.ip_dst_addr[0] = rte_be_to_cpu_32(ipv4_hdr->dst_addr);
> >
> > Your key.ip_src_addr[1-3] still contains some junk.
> > How memcmp below supposed to worj properly?
> 
> When allocate an item, we already guarantee the content of its
> memory space is 0. So memcpy won't be error.

key is allocated on the stack.
Obviously fileds that are not initialized manually will contain undefined values,
i.e.: ip_src-addr[1-3].
Then below you are doing:
memcp((&(tbl->keys[i].key), &key, sizeof(struct tcp_key));
...
memcpy(&(tbl->keys[key_idx].key), &key, sizeof(struct tcp_key));

So I still think you are comparing/copying some junk here.

> 
> > BTW why do you need 4 elems here, why just not uint32_t ip_src_addr;?
> > Same for ip_dst_addr.
> 
> I think tcp6 and tcp4 can share the same table structure. So I use
> 128bit IP address here. You mean we need to use different structures
> for tcp4 and tcp6?

That would be my preference.

> 
> >
> > > +	key.src_port = rte_be_to_cpu_16(tcp_hdr->src_port);
> > > +	key.dst_port = rte_be_to_cpu_16(tcp_hdr->dst_port);
> > > +	key.recv_ack = rte_be_to_cpu_32(tcp_hdr->recv_ack);
> > > +	key.tcp_flags = tcp_hdr->tcp_flags;
> > > +
> > > +	for (i = 0; i < tbl->max_key_num; i++) {
> > > +		if (tbl->keys[i].is_valid &&
> > > +				(memcmp(&(tbl->keys[i].key), &key,
> > > +						sizeof(struct tcp_key))
> > > +				 == 0)) {
> > > +			cur_idx = tbl->keys[i].start_index;
> > > +			prev_idx = cur_idx;
> > > +			while (cur_idx != INVALID_ARRAY_INDEX) {
> > > +				if (check_seq_option(tbl->items[cur_idx].pkt,
> > > +							tcp_hdr,
> > > +							tcp_hl) > 0) {
> >
> > As I remember linux gro also check ipv4 packet_id - it should be consecutive.
> 
> IP fragmented packets have the same IP ID, but others are consecutive.

Yes, you assume that they are consecutive.
But the only way to know for sure that they are - check it.
Another thing - I think we need to perform GRO only for TCP packets with only ACK bit set
(i.e. - no GRO for FIN/SYN/PUSH/URG/, etc.).

> As we  suppose GRO can merge IP fragmented packets, so I think we shouldn't check if
> the IP ID is consecutive here. How do you think?
> 
> >
> > > +					if (merge_two_tcp4_packets(
> > > +								tbl->items[cur_idx].pkt,
> > > +								pkt,
> > > +								max_packet_size) > 0) {
> > > +						/* successfully merge two packets */
> > > +						tbl->items[cur_idx].is_groed = 1;
> > > +						return 1;
> > > +					}
> >
> > If you allow more then packet per flow to be stored in the table, then you should be
> > prepared that new segment can fill a gap between 2 packets.
> > Probably the easiest thing - don't allow more then one 'item' per flow.
> 
> We allow the table to store same flow but out-of-order arriving packets. For
> these packets, they will occupy different 'item' and we won't re-merge them.
> For example, there are three same flow tcp packets: p1, p2 and p3. And p1 arrives
> first, then p3, and last is p2. So TCP GRO will allocate one 'item' for p1 and one
> 'item' for p3, and when p2 arrives, p2 will be merged with p1. Therefore, in the
> table, we will have two 'item': item1 to store merged p1 and p2, item2 to store p3.
> 
> As you can see, TCP GRO can only merges sequential arriving packets. If we want to
> merge all out-of-order arriving packets, we need to re-process these packets which
> are already processed and have one 'item'. IMO, this procedure will be very complicated.
> So we don't do that.
> 
> Sorry, I don't understand how to allow one 'item' per-flow. Because packets are arriving
> out-of-order. If we don't re-process these packets which already have one 'item', how to
> guarantee it?

As I understand you'll have an input array:
<seq=2, seq=1> - you wouldn't be able to merge it.
So I think your merge need be prepared to merge both smaller and bigger sequence.
About one item my thought : instead of allowing N items per key(flow) - for simplicity
just allow one item per flow.
In that case we wouldn't allow holes, but still would be able to merge reordered packets.
Alternatively you can store items ordered by seq, and after merge, try to merge neighbor
Items too.

> 
> >
> > > +					/**
> > > +					 * fail to merge two packets since
> > > +					 * it's beyond the max packet length.
> > > +					 * Insert it into the item group.
> > > +					 */
> > > +					goto insert_to_item_group;
> > > +				} else {
> > > +					prev_idx = cur_idx;
> > > +					cur_idx = tbl->items[cur_idx].next_pkt_idx;
> > > +				}
> > > +			}
> > > +			/**
> > > +			 * find a corresponding item group but fails to find
> > > +			 * one packet to merge. Insert it into this item group.
> > > +			 */
> > > +insert_to_item_group:
> > > +			item_idx = find_an_empty_item(tbl);
> > > +			/* the item number is greater than the max value */
> > > +			if (item_idx == INVALID_ARRAY_INDEX)
> > > +				return -1;
> > > +			tbl->items[prev_idx].next_pkt_idx = item_idx;
> > > +			tbl->items[item_idx].pkt = pkt;
> > > +			tbl->items[item_idx].is_groed = 0;
> > > +			tbl->items[item_idx].next_pkt_idx = INVALID_ARRAY_INDEX;
> > > +			tbl->items[item_idx].is_valid = 1;
> > > +			tbl->items[item_idx].start_time = rte_rdtsc();
> > > +			tbl->item_num++;
> > > +			return 0;
> > > +		}
> > > +	}
> > > +
> > > +	/**
> > > +	 * merge fail as the given packet has a new key.
> > > +	 * So insert a new key.
> > > +	 */
> > > +	item_idx = find_an_empty_item(tbl);
> > > +	key_idx = find_an_empty_key(tbl);
> > > +	/**
> > > +	 * if current key or item number is greater than the max
> > > +	 * value, don't insert the packet into the table and return
> > > +	 * immediately.
> > > +	 */
> > > +	if (item_idx == INVALID_ARRAY_INDEX ||
> > > +			key_idx == INVALID_ARRAY_INDEX)
> > > +		return -1;
> > > +	tbl->items[item_idx].pkt = pkt;
> > > +	tbl->items[item_idx].next_pkt_idx = INVALID_ARRAY_INDEX;
> > > +	tbl->items[item_idx].is_groed = 0;
> > > +	tbl->items[item_idx].is_valid = 1;
> > > +	tbl->items[item_idx].start_time = rte_rdtsc();
> >


More information about the dev mailing list