<div dir="ltr">Hi Thomas Monjalon,<br><br>First, here is the scenario in which we discovered the problem. The error can be reproduced as follows:<br>1. In the client ECS with an MTU of 1500, start traffic with the command "iperf3 -c {dst ip} -b 1m -M 125 -t 8000". The small MSS triggers TCP segmentation.<br>2. On the host machine, TCP segmentation is performed by the 'rte_gso_segment' function.<br>3. After GSO, a packet that was held in one mbuf is split into multiple segments.<br>4. When the TCP checksum is calculated with the 'rte_raw_cksum_mbuf' function, it enters the 'hard case: process checksum of several segments' branch of the function. At this point, a calculation error may occur.<br>5. In the destination ECS, the InCsumErrors statistic can be viewed with the command "netstat -st | grep -i csum". The erroneous packets can also be confirmed with tcpdump.<br><br>The following is a detailed description of one captured erroneous packet.<br><br>The hex stream of the packet is as follows:<br>00163e0b6bd2eeffffffffff0800450000a50d7a40004006b94bc0a8f91dc0a8f91ed5d2145146f9d990e10d6a2d8010020040a200000101080a95ac86ba091145d3ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff<br><br>This is a packet in the format Eth + IPv4 + TCP + Payload.<br><br>Taking this packet as an example, the calculation in 'rte_raw_cksum_mbuf' proceeds as follows:<br>1. Due to the small MSS, TCP segmentation was triggered, producing a chain of 3 mbufs.<br>2. The data_len of the first mbuf is 66 bytes, containing the Ethernet header, IPv4 header, and TCP header.<br>3. The data_len of the second mbuf is 61 bytes.<br>4. The data_len of the third mbuf is 52 bytes.<br>5. 
When the checksum of the TCP header is calculated for this mbuf chain with the rte_raw_cksum_mbuf function, the 'tmp' value obtained while processing the third mbuf is 0x19FFE6.<br>6. The cast to uint16_t before rte_bswap16 reduces this to 0xFFE6, so tmp becomes 0xE6FF and the 0x19 is discarded. This results in an incorrect final checksum.<br><br>Second, not all multiseg packets cause calculation errors in the 'rte_raw_cksum_mbuf' function. Two cases can lead to an incorrect final result:<br>1. If the value of 'tmp' is greater than 0xFFFF, 'tmp = rte_bswap16((uint16_t)tmp)' drops the high 16 bits.<br>2. Both 'tmp' and 'sum' are uint32_t. If the true value of 'sum' would exceed 0xFFFFFFFF, 'sum += tmp' drops the carry on overflow.<br><br>Third, in our online cloud network, we found that the problem only occurs when there are 3 or more segments. I believe the issue may be triggered whenever there are 3 or more segments, and a test case with 3 segments is sufficient to detect it.</div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, Aug 11, 2025 at 10:42 PM Thomas Monjalon &lt;<a href="mailto:thomas@monjalon.net">thomas@monjalon.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello,<br>
<br>
04/08/2025 05:54, Su Sai:<br>
> The rte_raw_cksum_mbuf function is used to compute<br>
> the raw checksum of a packet.<br>
> If the packet payload is stored in multiple mbufs, the function<br>
> goes to the hard case. In the hard case,<br>
> the variable 'tmp' is of type uint32_t,<br>
> so rte_bswap16 drops the high 16 bits.<br>
> Meanwhile, the variable 'sum' is of type uint32_t,<br>
> so 'sum += tmp' drops the carry on overflow.<br>
> Both losses make the checksum incorrect.<br>
> This commit fixes the above bug.<br>
<br>
Thank you for the fix and the associated test.<br>
<br>
Could you please describe the exact condition to get a wrong checksum?<br>
Does it happen with all multiseg packets?<br>
Are 3 segments the minimum? Is there any other constraint to reproduce?<br>
<br>
<br>
</blockquote></div>
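<div dir="ltr">For reference, the two loss cases described above can be sketched in plain C without DPDK. This is only an illustration under stated assumptions: bswap16 is a local stand-in for rte_bswap16, and fold16, swap_truncating, swap_folded, add_wrapping, add_wide and run_examples are hypothetical helper names that do not exist in DPDK. The "folded" and "wide" variants show one possible way to keep the carries; they are not necessarily what the patch does.<br><br></div>

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for rte_bswap16 so the sketch builds without DPDK. */
static inline uint16_t bswap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Hypothetical helper: fold a wide one's-complement accumulator down to
 * 16 bits, adding every carry back into the low word. */
static inline uint16_t fold16(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Case 1 as described: the cast happens first, so bits 16 and up are lost. */
static inline uint16_t swap_truncating(uint32_t tmp)
{
	return bswap16((uint16_t)tmp);
}

/* Case 1 with the carries folded in before the byte swap. */
static inline uint16_t swap_folded(uint32_t tmp)
{
	return bswap16(fold16(tmp));
}

/* Case 2 as described: the 32-bit addition wraps and the carry is lost. */
static inline uint16_t add_wrapping(uint32_t sum, uint32_t tmp)
{
	sum += tmp;
	return fold16(sum);
}

/* Case 2 with a 64-bit accumulator, so the carry survives until folding. */
static inline uint16_t add_wide(uint32_t sum, uint32_t tmp)
{
	return fold16((uint64_t)sum + tmp);
}

/* Checks using the 'tmp' value from the captured packet (0x19FFE6) and an
 * arbitrary pair of operands chosen to overflow 32 bits. */
void run_examples(void)
{
	assert(swap_truncating(0x19FFE6u) == 0xE6FF); /* 0x19 discarded */
	assert(swap_folded(0x19FFE6u) == 0xFFFF);     /* 0xFFE6 + 0x19 kept */
	assert(add_wrapping(0xFFFFFFF0u, 0x20u) == 0x0010);
	assert(add_wide(0xFFFFFFF0u, 0x20u) == 0x0011);
}
```

<div dir="ltr">Calling run_examples() from a small main() exercises both cases. The operands 0xFFFFFFF0 and 0x20 in the second case are made-up values for illustration, not taken from the captured packet.</div>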