[dpdk-dev] [Bug 759] testpmd: bonding mode 4 on mlx5 doesn't work

bugzilla at dpdk.org bugzilla at dpdk.org
Fri Jul 16 14:40:05 CEST 2021


https://bugs.dpdk.org/show_bug.cgi?id=759

            Bug ID: 759
           Summary: testpmd: bonding mode 4 on mlx5 doesn't work
           Product: DPDK
           Version: 20.11
          Hardware: x86
                URL: https://mails.dpdk.org/archives/dev/2021-July/213360.html
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: testpmd
          Assignee: dev at dpdk.org
          Reporter: xhavli56 at stud.fit.vutbr.cz
  Target Milestone: ---

Bonding mode 4 on mlx5 with dedicated queues disabled doesn't work. LACP
configuration does not happen correctly and the application does not receive
any packets.
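
For context, when dedicated queues are disabled in mode 4 the bonding PMD
processes LACPDUs inside the Rx/Tx burst calls on the bonded port, so the
application has to poll the bonded port regularly (the bonding PMD guide asks
for a period shorter than 100 ms); testpmd's forwarding loop normally takes
care of this. Below is a minimal sketch of such a loop, for illustration only:
bonded port id 2 matches the setup further down, while BURST_SIZE and the
helper name are arbitrary, not taken from testpmd.

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BONDED_PORT_ID 2
#define BURST_SIZE 32

static void
poll_bonded_port(void)
{
    struct rte_mbuf *pkts[BURST_SIZE];

    for (;;) {
        /* Rx burst also lets the bonding PMD consume incoming LACPDUs. */
        uint16_t nb_rx = rte_eth_rx_burst(BONDED_PORT_ID, 0, pkts, BURST_SIZE);

        /* Tx burst (even with nothing to forward) gives the PMD a chance
         * to transmit pending LACPDUs on the slaves. */
        uint16_t nb_tx = rte_eth_tx_burst(BONDED_PORT_ID, 0, pkts, nb_rx);

        /* Free anything that was not transmitted. */
        for (uint16_t i = nb_tx; i < nb_rx; i++)
            rte_pktmbuf_free(pkts[i]);
    }
}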

1) Create virtual ethdevs for dumping traffic [Terminal 1]
# ip link add veth0a type veth peer name veth0b
# ip link add veth1a type veth peer name veth1b
# ip link set veth0a up
# ip link set veth0b up
# ip link set veth1a up
# ip link set veth1b up

2) Run testpmd with dedicated queues disabled [Terminal 1]
(closely following
https://doc.dpdk.org/dts/test_plans/pmd_bonded_8023ad_test_plan.html#test-case-basic-behavior-start-stop)
# dpdk-testpmd -v -- -i
...
EAL: RTE Version: 'DPDK 20.11.2'
...
EAL: Probe PCI driver: mlx5_pci (15b3:101d) device: 0000:c4:00.0 (socket 0)
EAL: Probe PCI driver: mlx5_pci (15b3:101d) device: 0000:c4:00.1 (socket 0)
...
Port 0: 04:3F:72:C7:B8:8C
...
Port 1: 04:3F:72:C7:B8:8D
...
testpmd> port stop all
testpmd> create bonded device 4 0
testpmd> add bonding slave 0 2
testpmd> add bonding slave 1 2
testpmd> set bonding lacp dedicated_queues 2 disable
testpmd> set allmulti 0 on
testpmd> set allmulti 1 on
testpmd> set allmulti 2 on
testpmd> set portlist 2
testpmd> port start all
testpmd> start
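
For reference, a rough sketch of the bonding API calls behind the testpmd
commands above (not testpmd's actual code: the device name is arbitrary,
error handling is mostly omitted, and queue setup / port start of the bonded
device is not shown):

#include <rte_ethdev.h>
#include <rte_eth_bond.h>
#include <rte_eth_bond_8023ad.h>

static int
setup_bond_mode4(void)
{
    /* "create bonded device 4 0": mode 4 = 802.3ad, socket 0 */
    int bond_port = rte_eth_bond_create("net_bonding0", BONDING_MODE_8023AD, 0);
    if (bond_port < 0)
        return bond_port;

    /* "add bonding slave 0 2" / "add bonding slave 1 2" */
    rte_eth_bond_slave_add(bond_port, 0);
    rte_eth_bond_slave_add(bond_port, 1);

    /* "set bonding lacp dedicated_queues 2 disable" */
    rte_eth_bond_8023ad_dedicated_queues_disable(bond_port);

    /* "set allmulti 0/1/2 on" */
    rte_eth_allmulticast_enable(0);
    rte_eth_allmulticast_enable(1);
    rte_eth_allmulticast_enable(bond_port);

    return bond_port;
}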

3) Run dpdk-pdump to clone traffic to virtual ethdevs [Terminal 2]
# dpdk-pdump -- --multi --pdump 'port=0,queue=*,rx-dev=veth0a,tx-dev=veth0a' \
  --pdump 'port=1,queue=*,rx-dev=veth1a,tx-dev=veth1a'

4) Run tcpdump to listen on port 0 [Terminal 3]
# tcpdump -v -i veth0b

5) Run tcpdump to listen on port 1 [Terminal 4]
# tcpdump -v -i veth1b

EXPECTED RESULTS in Terminal 3 / Terminal 4 (in other words, port 0 / port 1):
/* Correct LACP packets such as this one */
09:19:54.458868 LACPv1, length 110
        Actor Information TLV (0x01), length 20
          System 0c:29:ef:e4:5a:00 (oui Unknown), System Priority 32768, Key
10, Port 600, Port Priority 32768
          State Flags [Activity, Aggregation, Synchronization, Collecting,
Distributing]
        Partner Information TLV (0x02), length 20
          System 04:3f:72:c7:b8:8c (oui Unknown), System Priority 65535, Key
65535, Port 2, Port Priority 255
          State Flags [Activity, Aggregation, Synchronization, Collecting,
Distributing]
        Collector Information TLV (0x03), length 16
          Max Delay 0
        Terminator TLV (0x00), length 0

/* Packets from generator are received and they are distributed across
   both ports. The sample below is from Terminal 3 (port 0).
   Packets with src IP 192.168.0.103-104 can be seen in Terminal 4
   (port 1).*/
09:42:43.802538 IP (tos 0x0, ttl 255, id 37307, offset 0, flags [none], proto
UDP (17), length 106)
    192.168.0.101.1024 > 10.0.0.1.1024: UDP, length 78
09:42:43.902533 IP (tos 0x0, ttl 255, id 37307, offset 0, flags [none], proto
UDP (17), length 106)
    192.168.0.102.1024 > 10.0.0.1.1024: UDP, length 78
09:42:44.202533 IP (tos 0x0, ttl 255, id 37307, offset 0, flags [none], proto
UDP (17), length 106)
    192.168.0.105.1024 > 10.0.0.1.1024: UDP, length 78

ACTUAL RESULTS in Terminal 3 / Terminal 4:
/* LACP packets with incorrect actor/partner MAC address */
09:28:47.668358 LACPv1, length 110
        Actor Information TLV (0x01), length 20
          System 0c:29:ef:e4:5a:00 (oui Unknown), System Priority 32768, Key
10, Port 600, Port Priority 32768
          State Flags [Activity, Aggregation, Synchronization, Collecting,
Distributing]
        Partner Information TLV (0x02), length 20
          System 00:00:00:00:00:00 (oui Unknown), System Priority 65535, Key
65535, Port 2, Port Priority 255
          State Flags [Activity, Aggregation, Synchronization, Collecting,
Distributing]
        Collector Information TLV (0x03), length 16
          Max Delay 0
        Terminator TLV (0x00), length 0
/* No packets received from generator */


NIC: ConnectX-6
OS: CentOS 8 with 4.18.0-240.10.1.el8_3.x86_64 kernel
OFED: MLNX_OFED_LINUX-5.3-1.0.0.1
firmware-version: 22.30.1004
Link aggregation on the switch should be configured correctly (tested with a
custom try-out application, which works; that is where the expected results
above come from).

I suspect there might be an implicit dependency on dedicated queues being
enabled. So what happens if we enable dedicated queues? We get this error:
bond_ethdev_8023ad_flow_set(267) - bond_ethdev_8023ad_flow_set: port not
started (slave_port=0 queue_id=1)
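
For context, with dedicated queues enabled bond_ethdev_8023ad_flow_set()
installs an rte_flow rule on each slave that steers LACP frames (slow
protocols ethertype 0x8809) to the dedicated Rx queue, and the error above
indicates this is attempted while the slave port is not started yet. A
simplified sketch of that kind of rule (not the PMD's exact code; the helper
name and queue id are illustrative):

#include <rte_flow.h>
#include <rte_ether.h>
#include <rte_byteorder.h>
#include <rte_errno.h>

static int
lacp_flow_create(uint16_t slave_port_id, uint16_t lacp_queue_id)
{
    struct rte_flow_attr attr = { .ingress = 1 };
    struct rte_flow_item_eth eth_spec = {
        .type = RTE_BE16(RTE_ETHER_TYPE_SLOW), /* 0x8809: LACP/marker PDUs */
    };
    struct rte_flow_item_eth eth_mask = {
        .type = RTE_BE16(0xFFFF),
    };
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH,
          .spec = &eth_spec, .mask = &eth_mask },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_action_queue queue = { .index = lacp_queue_id };
    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };
    struct rte_flow_error error;

    if (rte_flow_create(slave_port_id, &attr, pattern, actions, &error) == NULL)
        return -rte_errno;
    return 0;
}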

A link to the mailing list thread where the issue causing this error and a
potential fix were discussed is in the URL field above.
