Re: [DPDK i40e XL710] tcp packet loss occurs occasionally when use dpdk19.11 i40e NIC

jiangheng (G) jiangheng14 at huawei.com
Mon Oct 24 04:57:09 CEST 2022


> Hi team
> I am using an XL710 (i40e) NIC with DPDK 19.11. I found that the NIC occasionally loses packets when the TSO (not GSO) feature is enabled.
>
> For example, I pass the following mbuf to rte_eth_tx_burst:
>
> m is 0x2a166cf40, pkt_len=32822, ol_flags=d4000000000000, nb_segs=23, port=10007
> dump mbuf at 0x2a166cf40, iova=2e166d008, buf_len=2176
>   pkt_len=32822, ol_flags=d4000000000000, nb_segs=23, in_port=65535
>   segment at 0x2a166cf40, data=0x2a166d088, data_len=1514
>   Dump data at [0x2a166d088], len=1514
> 00000000: 3C FD FE 9E 99 29 3C FD FE 9E 98 59 08 00 45 00 | <....)<....Y..E.
> 00000010: 80 28 03 C8 00 00 FF 06 00 00 42 42 42 0D 42 42 | .(........BBB.BB
> 00000020: 42 0C 27 17 79 89 00 07 79 6E 11 B5 93 DC 50 18 | B.'.y...yn....P.
> 00000030: FF FF 08 A4 00 00 23 2A 2A 2A 2A 2A 2A 2A 2A 2A | ......#*********
> 00000040: 2A 2A 2A 2A 2A 2A 2A 00 33 30 30 32 3A 36 36 2E | *******.3002:66.
>   segment at 0x2a166c580, data=0x2a166c6fe, data_len=1460
>   Dump data at [0x2a166c6fe], len=1460
>   segment at 0x2a166bbc0, data=0x2a166bd3e, data_len=1460
>   Dump data at [0x2a166bd3e], len=1460
>   segment at 0x2a166b200, data=0x2a166b37e, data_len=1460
>   Dump data at [0x2a166b37e], len=1460
>   segment at 0x2a166a840, data=0x2a166a9be, data_len=1460
>   Dump data at [0x2a166a9be], len=1460
>   segment at 0x2a1669e80, data=0x2a1669ffe, data_len=1460
>   Dump data at [0x2a1669ffe], len=1460
>   segment at 0x2a16694c0, data=0x2a166963e, data_len=1460
>   Dump data at [0x2a166963e], len=1460
>   segment at 0x2a1668b00, data=0x2a1668c7e, data_len=1460
>   Dump data at [0x2a1668c7e], len=1460
>   segment at 0x2a1668140, data=0x2a16682be, data_len=1460
>   Dump data at [0x2a16682be], len=1460
>   segment at 0x2a1667780, data=0x2a16678fe, data_len=1460
>   Dump data at [0x2a16678fe], len=1460
>   segment at 0x2a1666dc0, data=0x2a1666f3e, data_len=1460
>   Dump data at [0x2a1666f3e], len=1460
>   segment at 0x2a1666400, data=0x2a166657e, data_len=1460
>   Dump data at [0x2a166657e], len=1460
>   segment at 0x2a14b9400, data=0x2a14b957e, data_len=1460
>   Dump data at [0x2a14b957e], len=1460
>   segment at 0x2a14b9dc0, data=0x2a14b9f3e, data_len=1460
>   Dump data at [0x2a14b9f3e], len=1460
>   segment at 0x2a14ba780, data=0x2a14ba8fe, data_len=1460
>   Dump data at [0x2a14ba8fe], len=1460
>   segment at 0x2a14bb140, data=0x2a14bb2be, data_len=1460
>   Dump data at [0x2a14bb2be], len=1460
>   segment at 0x2a14bbb00, data=0x2a14bbc7e, data_len=1460
>   Dump data at [0x2a14bbc7e], len=1460
>   segment at 0x2a14bc4c0, data=0x2a14bc63e, data_len=1460
>   Dump data at [0x2a14bc63e], len=1460
>   segment at 0x2a14bce80, data=0x2a14bcffe, data_len=1460
>   Dump data at [0x2a14bcffe], len=1460
>   segment at 0x2a14bd840, data=0x2a14bd9be, data_len=1460
>   Dump data at [0x2a14bd9be], len=1460
>   segment at 0x2a14be200, data=0x2a14be37e, data_len=1460
>   Dump data at [0x2a14be37e], len=1460
>   segment at 0x2a14bebc0, data=0x2a14bed3e, data_len=1460
>   Dump data at [0x2a14bed3e], len=1460
>   segment at 0x2a14bf580, data=0x2a14bf6fe, data_len=648
>   Dump data at [0x2a14bf6fe], len=648
>
>  rte_eth_tx_burst returns 1, indicating that the send succeeded.
>  I used tcpdump to capture packets at the peer. The captured data is 29200 bytes long, but the actual length is 32768 bytes, so 3568 bytes were lost.

>  The number of while-loop iterations equals the mbuf's nb_segs, so everything seems fine.
>  https://github.com/DPDK/dpdk/blob/v19.11/drivers/net/i40e/i40e_rxtx.c#L1180
>
>  I am not familiar with the i40e driver, and when I add too many debug messages the issue no longer reproduces. Is there a better way to debug this? Please give me some ideas. Thanks a lot!
>
>
>  TSO config:
>  mbufs->ol_flags = d4000000000000
>  mbufs->tso_segsz = 1460
>  mbufs->l2_len = 14
>  mbufs->l3_len = 20
>  mbufs->l4_len = 20

Has nobody replied to this email?
The number of sent packets reported by the dpdk-procinfo "tx_good_packets" counter is the same as the number measured by the upper-layer application.
Does this indicate that neither the NIC nor the software PMD discarded the data?

