Trex with mlx5 driver - Multiple streams with different VLAN priority causes high CPU utilization

Dariusz Sosnowski dsosnowski at nvidia.com
Fri Apr 19 14:31:19 CEST 2024


Thank you for the info.

You might have run into an issue where the NIC generates backpressure to the software because of very frequent switching between different VLAN priorities on the Tx datapath.

Could you please apply the following QoS configuration on all interfaces and rerun the test with different VLAN priorities?

sudo mlnx_qos -i <iface> --trust=dscp
for dscp in {0..63}; do sudo mlnx_qos -i <iface> --dscp2prio set,$dscp,0; sleep 0.001;done

These commands switch the interface to DSCP trust mode and map every DSCP value (0-63) to priority 0 internally. This workaround should reduce the backpressure without affecting the generated traffic.
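
The commands above can also be generated per interface with a short script. The following is a minimal sketch, assuming Python 3.8 or newer; the function name `build_qos_commands` and the interface names are hypothetical placeholders that do not come from this thread. It only prints the commands so they can be reviewed before being run.

```python
import shlex


def build_qos_commands(iface, num_dscp=64, target_prio=0):
    """Build the mlnx_qos invocations quoted above as argv lists.

    The first command switches the interface to DSCP trust mode; the rest
    map each DSCP value in 0..num_dscp-1 to target_prio.
    """
    cmds = [["sudo", "mlnx_qos", "-i", iface, "--trust=dscp"]]
    for dscp in range(num_dscp):
        cmds.append(
            ["sudo", "mlnx_qos", "-i", iface,
             "--dscp2prio", f"set,{dscp},{target_prio}"]
        )
    return cmds


if __name__ == "__main__":
    # Hypothetical interface names; replace with the real PF/VF netdevs.
    for iface in ["enp59s0f0", "enp59s0f1"]:
        for argv in build_qos_commands(iface):
            # Dry run: print each command instead of executing it.
            print(shlex.join(argv))
```

To apply the settings instead of printing them, each argv list could be passed to `subprocess.run(argv, check=True)`.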

Best regards,
Dariusz Sosnowski

> -----Original Message-----
> From: Rubens Figueiredo <rubens.figueiredo at bisdn.de>
> Sent: Thursday, April 18, 2024 15:21
> To: Dariusz Sosnowski <dsosnowski at nvidia.com>
> Cc: users at dpdk.org
> Subject: Re: Trex with mlx5 driver - Multiple streams with different VLAN
> priority causes high CPU utilization
> 
> Hi Dariusz,
> 
> Thank you for the help.
> 
> If the two parallel streams use the same VLAN priority, the issue is no
> longer visible. The Trex output is shown below.
> 
> Different priority
> CPU util.  |            96.97% |              0.0% |
> Tx bps L2  |         3.59 Gbps |          0.29 bps | 3.59 Gbps
> Tx bps L1  |         3.64 Gbps |          0.37 bps | 3.64 Gbps
> Tx pps     |       298.49 Kpps |             0 pps | 298.49 Kpps
> Line Util. |            3.64 % |               0 % |
> 
> Same priority
> CPU util.  |             0.54% |              0.0% |
> Tx bps L2  |        23.98 Gbps |             0 bps | 23.98 Gbps
> Tx bps L1  |         24.3 Gbps |             0 bps | 24.3 Gbps
> Tx pps     |         1.99 Mpps |             0 pps | 1.99 Mpps
> Line Util. |            24.3 % |               0 % |
> 
> I have attached the requested output to the email.
> 
> Best,
> Rubens
> 
> On 4/18/24 14:48, Dariusz Sosnowski wrote:
> > Hi Rubens,
> >
> > Would you be able to provide the output of "ethtool -S <iface>" for both VFs before and after the test?
> > Does the same issue appear on this system if both parallel streams use the same VLAN priority?
> >
> > Best regards,
> > Dariusz Sosnowski
> >
> >> From: Rubens Figueiredo <rubens.figueiredo at bisdn.de>
> >> Sent: Wednesday, April 17, 2024 19:07
> >> To: users at dpdk.org
> >> Subject: Trex with mlx5 driver - Multiple streams with different VLAN
> >> priority causes high CPU utilization
> >>
> >> Hello community,
> >> I am facing a strange issue in the Trex stateless code, versions v3.02 and v3.04. I am using the Mellanox Cx-5 and have created two VFs on top of PF 0. The mlx5_core version I am using is 5.7-1.0.2, and the OFED version is MLNX_OFED_LINUX-5.7-1.0.2.0 (OFED-5.7-1.0.2).
> >> I have created the following issue in the trex-core repository [here](https://github.com/cisco-system-traffic-generator/trex-core/issues/1124), and was recommended to post the issue here. The GitHub issue includes screenshots of the problem I am facing.
> >> I am trying to create two parallel streams with different VLAN priorities, but the generated load is not what I expect, and CPU utilization seems incredibly high (~99%).
> >> I have reproduced this issue both with and without the --software option.
> >> The script I used is below.
> >> import stl_path
> >> from trex.stl.api import *
> >>
> >> import time
> >> import pprint
> >> from ipaddress import ip_address, ip_network
> >>
> >> import argparse
> >> import configparser
> >> import os
> >> import json
> >>
> >>
> >> def get_packet(tos, mac_dst, ip_src, size):
> >>      # pkt = Ether(src="02:00:00:00:00:01",dst="00:00:00:01:00:01") / IP(src="10.0.0.2", tos=tos) / UDP(sport=4444, dport=4444)
> >>
> >>      pkt = (
> >>          Ether(src="00:01:00:00:00:02", dst=mac_dst)
> >>          # Ether(dst="11:11:11:11:11:11")
> >>          # / Dot1AD(vlan=0)
> >>          / Dot1Q(vlan=0, prio=tos)
> >>          / IP(src=ip_src)
> >>          / UDP(sport=4444, dport=4444)
> >>      )
> >>      pad = max(0, size - len(pkt)) * "x"
> >>
> >>      return pkt / pad
> >>
> >> def main():
> >>      """Send two continuous VLAN-tagged streams with different priorities from tx_port."""
> >>      tx_port = 0
> >>      rx_port = 1
> >>
> >>      c = STLClient()
> >>
> >>      # connect to server
> >>      c.connect()
> >>
> >>      # prepare our ports
> >>      c.reset(ports=[tx_port, rx_port])
> >>
> >>      streams = []
> >>      s = STLStream(
> >>          packet=STLPktBuilder(
> >>              pkt=get_packet(4,"00:11:22:33:44:55", "10.1.0.2",512),
> >>              # vm = vm,
> >>          ),
> >>          isg=0 * 1000000,
> >>          mode=STLTXCont(pps=1.2*10**6),
> >>          # flow_stats = STLFlowLatencyStats(pg_id = 0)
> >>          flow_stats = STLFlowStats(pg_id=0),
> >>      )
> >>
> >>      streams.append(s)
> >>
> >>      s2 = STLStream(
> >>          packet=STLPktBuilder(
> >>              pkt=get_packet(2,"00:11:22:33:44:55", "10.1.0.2",512),
> >>              # vm = vm,
> >>          ),
> >>          isg=0 * 1000000,
> >>          mode=STLTXCont(pps=1.2*10**6),
> >>          # flow_stats = STLFlowLatencyStats(pg_id = 0)
> >>          flow_stats = STLFlowStats(pg_id=1),
> >>      )
> >>
> >>      streams.append(s2)
> >>
> >>      c.add_streams(streams, ports=[tx_port])
> >>
> >>      c.clear_stats()
> >>
> >>      c.start(ports=[tx_port], duration=60, mult="25gbpsl1")
> >>
> >>      c.wait_on_traffic(ports=[tx_port, rx_port])
> >>
> >>      stats = c.get_stats()
> >>      print(stats)
> >>
> >> if __name__ == "__main__":
> >>      main()
> >>
> >>
> >> And the configuration is
> >> - port_limit: 2
> >>    version: 2
> >>    port_bandwidth_gb: 100
> >>    interfaces: ["3b:00.2", "3b:00.3"]
> >>    port_info:
> >>      - dest_mac: 00:00:00:00:00:01
> >>        src_mac: 00:01:00:00:00:01
> >>      - dest_mac: 00:00:00:00:00:02
> >>        src_mac: 00:01:00:00:00:02
> >>    c: 14
> >>    platform:
> >>      master_thread_id: 8
> >>      latency_thread_id: 27
> >>      dual_if:
> >>        - socket: 0
> >>          threads: [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26]
> >>
> >>
> >> BISDN GmbH
> >> Körnerstraße 7-10
> >> 10785 Berlin
> >> Germany
> >>
> >> Phone: +49-30-6108-1-6100
> >>
> >> Managing Directors:
> >> Dr.-Ing. Hagen Woesner, Andreas Köpsel
> >>
> >> Commercial register:
> >> Amtsgericht Berlin-Charlottenburg HRB 141569 B VAT ID No: DE283257294
> >> ________________________________________