[dpdk-dev] [PATCH v5] ip_frag: remove padding length of fragment

luyicai luyicai at huawei.com
Tue Dec 15 04:18:50 CET 2020



> -----Original Message-----
> From: Ananyev, Konstantin [mailto:konstantin.ananyev at intel.com] 
> Sent: Monday, December 14, 2020 10:45 PM
> To: luyicai <luyicai at huawei.com>; dev at dpdk.org
> Cc: Zhoujingbin (Robin, Russell Lab) <zhoujingbin at huawei.com>; chenchanghu <chenchanghu at huawei.com>; Lilijun (Jerry) <jerry.lilijun at huawei.com>; Linhaifeng <haifeng.lin at huawei.com>; Guohongzhi (Russell Lab) <guohongzhi1 at huawei.com>; wangyunjian <wangyunjian at huawei.com>; stable at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v5] ip_frag: remove padding length of fragment


> Hi Yicai,
 
> In some situations, we would get several ip fragments, which total 
> data length is less than min_ip_len(64) and padding with zeros.
> We simulated intermediate fragments by modifying the MTU.
> To illustrate the problem, we simplify the packet format and ignore 
> the impact of the packet header.In namespace2, a packet whose data 
> length is 1520 is sent.
> When the packet passes tap2, the packet is divided into two
> fragments: fragment A and B, similar to (1520 = 1510 + 10).
> When the packet passes tap3, the larger fragment packet A is divided 
> into two fragments A1 and A2, similar to (1510 = 1500 + 10).
> Finally, the bond interface receives three fragments:
> A1, A2, and B (1520 = 1500 + 10 + 10).
> One fragmented packet A2 is smaller than the minimum Ethernet frame 
> length, so it needs to be padded.
> 
> |---------------------------------------------------|
> |                      HOST                         |
> | |--------------|   |----------------------------| |
> | |      ns2     |   |      |--------------|      | |
> | |  |--------|  |   |  |--------|    |--------|  | |
> | |  |  tap1  |  |   |  |  tap2  | ns1|  tap3  |  | |
> | |  |mtu=1510|  |   |  |mtu=1510|    |mtu=1500|  | |
> | |--|1.1.1.1 |--|   |--|1.1.1.2 |----|2.1.1.1 |--| |
> |    |--------|         |--------|    |--------|    |
> |         |                 |              |        |
> |         |-----------------|              |        |
> |                                          |        |
> |                                      |--------|   |
> |                                      |  bond  |   |
> |--------------------------------------|mtu=1500|---|
>                                        |--------|
> 
> When processing the preceding packets above, DPDK would aggregate 
> fragmented packets A2 and B.
> And error packets are generated, which padding(zero) is displayed in 
> the middle of the packet.
> 
> A2 + B:
> 0000   fa 16 3e 9f fb 82 fa 47 b2 57 dc 20 08 00 45 00
> 0010   00 33 b4 66 00 ba 3f 01 c1 a5 01 01 01 01 02 01
> 0020   01 02 c0 c1 c2 c3 c4 c5 c6 c7 00 00 00 00 00 00
> 0030   00 00 00 00 00 00 00 00 00 00 00 00 c8 c9 ca cb
> 0040   cc cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db
> 0050   dc dd de df e0 e1 e2 e3 e4 e5 e6
> 
> So, we would calculate the length of padding, and remove the padding 
> in pkt_len and data_len before aggregation.
> 
> Fixes: 7f0983ee331c ("ip_frag: check fragment length of incoming 
> packet")
> Cc: stable at dpdk.org
> 
> Signed-off-by: Yicai Lu <luyicai at huawei.com>
> ---
> v4 -> v5: Update the comments and description.
> ---
>  lib/librte_ip_frag/rte_ipv4_reassembly.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_ip_frag/rte_ipv4_reassembly.c 
> b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> index 1dda8ac..fdf66a4 100644
> --- a/lib/librte_ip_frag/rte_ipv4_reassembly.c
> +++ b/lib/librte_ip_frag/rte_ipv4_reassembly.c
> @@ -104,6 +104,7 @@ struct rte_mbuf *
>  	const unaligned_uint64_t *psd;
>  	uint16_t flag_offset, ip_ofs, ip_flag;
>  	int32_t ip_len;
> +	int32_t trim;
> 
>  	flag_offset = rte_be_to_cpu_16(ip_hdr->fragment_offset);
>  	ip_ofs = (uint16_t)(flag_offset & RTE_IPV4_HDR_OFFSET_MASK); @@ 
> -117,14 +118,15 @@ struct rte_mbuf *
> 
>  	ip_ofs *= RTE_IPV4_HDR_OFFSET_UNITS;
>  	ip_len = rte_be_to_cpu_16(ip_hdr->total_length) - mb->l3_len;
> +	trim  = mb->pkt_len - (ip_len + mb->l3_len + mb->l2_len);
> 
>  	IP_FRAG_LOG(DEBUG, "%s:%d:\n"
> -		"mbuf: %p, tms: %" PRIu64
> -		", key: <%" PRIx64 ", %#x>, ofs: %u, len: %d, flags: %#x\n"
> +		"mbuf: %p, tms: %" PRIu64 ", key: <%" PRIx64 ", %#x>"
> +		"ofs: %u, len: %d, padding: %d, flags: %#x\n"
>  		"tbl: %p, max_cycles: %" PRIu64 ", entry_mask: %#x, "
>  		"max_entries: %u, use_entries: %u\n\n",
>  		__func__, __LINE__,
> -		mb, tms, key.src_dst[0], key.id, ip_ofs, ip_len, ip_flag,
> +		mb, tms, key.src_dst[0], key.id, ip_ofs, ip_len, trim, ip_flag,
>  		tbl, tbl->max_cycles, tbl->entry_mask, tbl->max_entries,
>  		tbl->use_entries);
> 
> @@ -134,6 +136,10 @@ struct rte_mbuf *
>  		return NULL;
>  	}
> 
> +	if (unlikely(trim > 0)) {
> +		rte_pktmbuf_trim(mb, trim);
> +	}

> As a nit {} braces are not required for single expression.
> LGTM in general, just one thing: shouldn't we have the same fix for ipv6 then?
> Konstantin

Hi Konstantin,

Thanks!

During the problem analysis, we have discussed on ipv6 
and concluded that it does not exist in ipv6.

For ipv6, it consists of the following parts: 
basic header = 40(bytes)
DMAC = 6(bytes)
SMAC = 6(bytes)
Type = 2(bytes)
CRC = 4(bytes)
fragment header = 8(bytes)
...

40 + 6 + 6 + 2 + 4 + 8 = 66 (bytes)

Total is already greater than min_ip_len(64). So it doesn't 
need to be padded with zeros.

> +
>  	/* try to find/add entry into the fragment's table. */
>  	if ((fp = ip_frag_find(tbl, dr, &key, tms)) == NULL) {
>  		IP_FRAG_MBUF2DR(dr, mb);
> --
> 1.9.5.msysgit.1



More information about the dev mailing list