<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:DengXian;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Aptos;
panose-1:2 11 0 4 2 2 2 2 2 4;}
@font-face
{font-family:"\@DengXian";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:10.0pt;
font-family:"Aptos",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:10.0pt;
font-family:"Aptos",sans-serif;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1995453284;
mso-list-type:hybrid;
mso-list-template-ids:-1023768702 134807569 134807577 134807579 134807567 134807577 134807579 134807567 134807577 134807579;}
@list l0:level1
{mso-level-text:"%1\)";
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-18.0pt;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style>
</head>
<body lang="en-DE" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt">Hi Dariusz,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Thank you for looking into the issue and for your suggestions. We repeated the tests with two
<b>directly</b> connected machines, this time focusing on ICMP (IPv4) packets in addition to the ICMPv6 packets mentioned in the original problem description. The issue remains the same. I would like to highlight two points about our setup:<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<ol style="margin-top:0cm" start="1" type="1">
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l0 level1 lfo1"><span style="font-size:11.0pt">ICMP packets can no longer be captured on PF1 immediately after the NIC is set to multiport e-switch mode. If I switch off multiport e-switch
mode with the following two commands, ICMP communication resumes immediately, which indicates that other system configuration, such as the firewall, is correct. I would also assume that the issue has little to do with a running DPDK application, as communication is already
broken before an application such as testpmd is started.<br>
<br>
</span><span style="font-size:10.5pt">sudo devlink dev param set pci/0000:3b:00.0 name esw_multiport value
<b>false</b> cmode runtime<o:p></o:p></span></li></ol>
<p class="MsoListParagraph"><span style="font-size:10.5pt">sudo devlink dev param set pci/0000:3b:00.1 name esw_multiport value
<b>false</b> cmode runtime<br>
<br>
<o:p></o:p></span></p>
<ol style="margin-top:0cm" start="2" type="1">
<li class="MsoListParagraph" style="margin-left:0cm;mso-list:l0 level1 lfo1"><span style="font-size:11.0pt">In this setup, we do not use the MLNX_OFED drivers but rely on the upstream Mellanox drivers from Linux kernel 6.5.0 (which is newer than the suggested
kernel version 6.3). Would that make a difference? Could you share more detailed information about the environment setup on your side? The firmware version we are using for the Mellanox
</span><span style="font-size:11.0pt">ConnectX-6 is 22.39.1002.</span><span style="font-size:11.0pt"><o:p></o:p></span></li></ol>
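<p class="MsoNormal"><span style="font-size:11.0pt">To make the check from point 1 easy to repeat, the toggle could be scripted roughly as follows. This is only a sketch: the PCI addresses are the ones from the commands above, the interface name ens2f1np1 is taken from the original report and may differ in this setup, and DRY_RUN, run, set_multiport and capture_icmp are hypothetical helper names, not part of devlink or tcpdump. Setting the value back to true assumes that the other multiport e-switch prerequisites are already configured.<o:p></o:p></span></p>

```shell
# PCI addresses of PF0 and PF1, taken from the commands above.
PFS="pci/0000:3b:00.0 pci/0000:3b:00.1"

# run CMD...: execute the command, or only print it when DRY_RUN=1
# (DRY_RUN is a hypothetical helper variable for this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# set_multiport VALUE: set esw_multiport to "true" or "false" on every PF.
set_multiport() {
    for pf in $PFS; do
        run sudo devlink dev param set "$pf" name esw_multiport value "$1" cmode runtime
    done
}

# capture_icmp IFACE [COUNT]: capture a few ICMP/ICMPv6 packets on the
# kernel interface (assumed name, e.g. ens2f1np1 for PF1); default is 5.
capture_icmp() {
    run sudo tcpdump -i "$1" -n -c "${2:-5}" 'icmp or icmp6'
}
```

<p class="MsoNormal"><span style="font-size:11.0pt">For example, running set_multiport false followed by capture_icmp ens2f1np1 while pinging from the peer machine would repeat the check described in point 1.<o:p></o:p></span></p>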
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
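<p class="MsoNormal"><span style="font-size:11.0pt">To compare the environments mentioned in point 2, something like the following could be run on both sides. Again, this is a sketch under assumptions: the interface name and PCI address passed in are placeholders for the local setup, and run, collect_env and DRY_RUN are hypothetical helper names.<o:p></o:p></span></p>

```shell
# run CMD...: execute the command, or only print it when DRY_RUN=1
# (DRY_RUN is a hypothetical helper variable for this sketch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# collect_env IFACE PCI: print kernel version, driver and firmware
# information, and the current esw_multiport setting for one PF.
collect_env() {
    iface="$1"
    pci="$2"
    run uname -r                                          # running kernel version
    run ethtool -i "$iface"                               # driver and firmware-version
    run devlink dev info "$pci"                           # device and firmware details
    run devlink dev param show "$pci" name esw_multiport  # current multiport state
}
```

<p class="MsoNormal"><span style="font-size:11.0pt">For PF1 in our setup this would be called as collect_env ens2f1np1 pci/0000:3b:00.1, assuming that interface name still applies.<o:p></o:p></span></p>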
<p class="MsoNormal"><span style="font-size:11.0pt">I look forward to your reply. Thanks in advance.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Best regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt">Tao Li<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
<div id="mail-editor-reference-message-container">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">Dariusz Sosnowski &lt;dsosnowski@nvidia.com&gt;<br>
<b>Date: </b>Friday, 19. April 2024 at 19:30<br>
<b>To: </b>Tao Li &lt;byteocean@hotmail.com&gt;, users@dpdk.org &lt;users@dpdk.org&gt;<br>
<b>Cc: </b>tao.li06@sap.com &lt;tao.li06@sap.com&gt;<br>
<b>Subject: </b>RE: Packets cannot reach host's kernel in multiport e-switch mode (mlx5 driver)<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt">Hi,<br>
<br>
I could not reproduce the issue locally with testpmd, with flow isolation enabled. I can see ICMP packets passing both ways to kernel interfaces of PF0 and PF1.<br>
Without flow isolation, it is expected that traffic coming to the host will be hijacked by DPDK (depending on the MAC address, multicast config and promiscuous mode).<br>
<br>
Could you please run testpmd with the following command line parameters and execute the following commands?<br>
<br>
Testpmd command line:<br>
dpdk-testpmd -a 3b:00.0,dv_flow_en=2,representor=pf0-1vf0 -- --flow-isolate-all -i<br>
<br>
Testpmd commands:<br>
port stop all<br>
flow configure 0 queues_number 4 queues_size 64<br>
flow configure 1 queues_number 4 queues_size 64<br>
flow configure 2 queues_number 4 queues_size 64<br>
flow configure 3 queues_number 4 queues_size 64<br>
port start 0<br>
port start 1<br>
port start 2<br>
port start 3<br>
set verbose 1<br>
set fwd rxonly<br>
start<br>
<br>
With this testpmd running, could you please test if both PF0 and PF1 kernel interfaces are reachable and all packets pass?<br>
<br>
Best regards,<br>
Dariusz Sosnowski<br>
<br>
> From: Tao Li &lt;byteocean@hotmail.com&gt; <br>
> Sent: Wednesday, April 10, 2024 10:18<br>
> To: users@dpdk.org<br>
> Cc: tao.li06@sap.com<br>
> Subject: Packets cannot reach host's kernel in multiport e-switch mode (mlx5 driver)<br>
> <br>
> Hi All,<br>
> <br>
> I am currently experimenting with multiport e-switch mode, a feature newly supported by DPDK 23.11 (</span><a href="https://doc.dpdk.org/guides/nics/mlx5.html#multiport-e-switch"><span style="font-size:11.0pt">https://doc.dpdk.org/guides/nics/mlx5.html#multiport-e-switch</span></a><span style="font-size:11.0pt">),
to improve communication reliability on the server side. During the trials, I encountered an issue in which activating multiport e-switch mode on the NIC disrupts the hypervisor's software running on the second PF interface (PF1). More specifically, packets
coming from the second PF (PF1) cannot be delivered to the hypervisor's kernel network stack right after multiport e-switch mode is set for the NIC as described in the documentation. A snapshot of the packet trace comparison on the second PF (PF1, ens2f1np1) before
and after setting multiport e-switch mode is attached here. Packets marked in gray/italic in the second trace are missing under multiport e-switch mode.<br>
> <br>
> ----&lt;test environment&gt;-----<br>
> ConnectX-6 Dx with firmware version 22.39.1002<br>
> Linux kernel version: 6.6.16<br>
> DPDK: 23.11<br>
> ----&lt;/test environment&gt;------<br>
> <br>
> ----&lt;packet trace after setting multiport e-switch mode&gt;------<br>
> 14:37:24.835716 04:3f:72:e8:cf:cb > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 78: fe80::63f:72ff:fee8:cfcb > ff02::1: ICMP6, router advertisement, length 24<br>
> <br>
> 14:37:28.527829 90:3c:b3:33:83:fb > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 78: fe80::923c:b3ff:fe33:83fb > ff02::1: ICMP6, router advertisement, length 24<br>
> <br>
> 14:37:28.528359 04:3f:72:e8:cf:cb > 90:3c:b3:33:83:fb, ethertype IPv6 (0x86dd), length 94: fe80::63f:72ff:fee8:cfcb.54096 > fe80::923c:b3ff:fe33:83fb.179: Flags [S], seq 2779843599, win 33120, options [mss 1440,sackOK,TS val 1610632473 ecr 0,nop,wscale 7],
length 0 // link-local addresses are used<br>
> <br>
> 14:37:29.559918 04:3f:72:e8:cf:cb > 90:3c:b3:33:83:fb, ethertype IPv6 (0x86dd), length 94: fe80::63f:72ff:fee8:cfcb.54096 > fe80::923c:b3ff:fe33:83fb.179: Flags [S], seq 2779843599, win 33120, options [mss 1440,sackOK,TS val 1610633505 ecr 0,nop,wscale 7],
length 0<br>
> <br>
> 14:37:30.583925 04:3f:72:e8:cf:cb > 90:3c:b3:33:83:fb, ethertype IPv6 (0x86dd), length 94: fe80::63f:72ff:fee8:cfcb.54096 > fe80::923c:b3ff:fe33:83fb.179: Flags [S], seq 2779843599, win 33120, options [mss 1440,sackOK,TS val 1610634529 ecr 0,nop,wscale 7],
length 0<br>
> ----&lt;/packet trace after setting multiport e-switch mode&gt;------<br>
> <br>
> ----&lt;packet trace before setting multiport e-switch mode&gt; ------<br>
> 16:09:40.375865 90:3c:b3:33:83:fb > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 78: fe80::923c:b3ff:fe33:83fb > ff02::1: ICMP6, router advertisement, length 24<br>
> <br>
> 16:09:40.376473 fa:e4:cf:2d:11:b9 > 90:3c:b3:33:83:fb, ethertype IPv6 (0x86dd), length 94: fe80::f8e4:cfff:fe2d:11b9.36168 > fe80::923c:b3ff:fe33:83fb.179: Flags [S], seq 3409227589, win 33120, options [mss 1440,sackOK,TS val 2302010436 ecr 0,nop,wscale 7],
length 0<br>
> <br>
> 16:09:40.376692 90:3c:b3:33:83:fb > fa:e4:cf:2d:11:b9, ethertype IPv6 (0x86dd), length 94: fe80::923c:b3ff:fe33:83fb.179 > fe80::f8e4:cfff:fe2d:11b9.36168: Flags [S.], seq 3495571820, ack 3409227590, win 63196, options [mss 9040,sackOK,TS val 1054058675 ecr
2302010436,nop,wscale 9], length 0<br>
> <br>
> 16:09:40.376711 fa:e4:cf:2d:11:b9 > 90:3c:b3:33:83:fb, ethertype IPv6 (0x86dd), length 86: fe80::f8e4:cfff:fe2d:11b9.36168 > fe80::923c:b3ff:fe33:83fb.179: Flags [.], ack 1, win 259, options [nop,nop,TS val 2302010436 ecr 1054058675], length 0<br>
> <br>
> 16:09:40.376865 fa:e4:cf:2d:11:b9 > 90:3c:b3:33:83:fb, ethertype IPv6 (0x86dd), length 193: fe80::f8e4:cfff:fe2d:11b9.36168 > fe80::923c:b3ff:fe33:83fb.179: Flags [P.], seq 1:108, ack 1, win 259, options [nop,nop,TS val 2302010436 ecr 1054058675], length
107: BGP<br>
> <br>
> 16:09:40.376986 90:3c:b3:33:83:fb > fa:e4:cf:2d:11:b9, ethertype IPv6 (0x86dd), length 86: fe80::923c:b3ff:fe33:83fb.179 > fe80::f8e4:cfff:fe2d:11b9.36168: Flags [.], ack 108, win 124, options [nop,nop,TS val 1054058676 ecr 2302010436], length 0<br>
> ----&lt;/packet trace before setting multiport e-switch mode&gt; ------<br>
> <br>
> Attempts to ping this hypervisor from another directly connected host also resulted in incoming ICMP packets not being captured, which is reproducible in another test environment. In the end, I was able to restore communication on the second PF
by using a vdev TAP device and forwarding packets between the TAP device and PF1, as shown in our public example repository,
</span><a href="https://github.com/byteocean/multiport-eswitch-example"><span style="font-size:11.0pt">https://github.com/byteocean/multiport-eswitch-example</span></a><span style="font-size:11.0pt">.<br>
> <br>
> Enabling isolation mode on PF1, either by starting testpmd or programmatically with `rte_flow_isolate()`, does not change the behavior described above; it only affects whether packets can be captured and processed by the DPDK application.<br>
> ----&lt;command to start testpmd&gt; ------<br>
> sudo ./dpdk-testpmd -a 3b:00.0,dv_flow_en=2,dv_esw_en=1,fdb_def_rule_en=1,representor=pf0-1vf0 -- -i --rxq=1 --txq=1 --flow-isolate-all<br>
> ----&lt;/command to start testpmd&gt; ------<br>
> <br>
> Any shared experience or comments on the issue described above would be much appreciated. Thanks a lot in advance.<br>
> <br>
> Best regards,<br>
> Tao Li<o:p></o:p></span></p>
</div>
</div>
</div>
</div>
</body>
</html>