[dpdk-dev] nvgre inner rss problem in mlx5

wenxu wenxu at ucloud.cn
Tue Aug 3 10:44:53 CEST 2021


Hi NVIDIA team,


I am testing VXLAN encap offload on upstream DPDK with dpdk-testpmd.


# lspci | grep Ether
19:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
19:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]



FW version is 16.31.1014:
# ethtool -i net2
driver: mlx5_core
version: 5.13.0-rc3+
firmware-version: 16.31.1014 (MT_0000000080)
expansion-rom-version: 
bus-info: 0000:19:00.0




Start the eswitch:


echo 0 > /sys/class/net/net2/device/sriov_numvfs
echo 1 > /sys/class/net/net2/device/sriov_numvfs
echo 0000:19:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
devlink dev eswitch set pci/0000:19:00.0  mode switchdev
echo 0000:19:00.2 > /sys/bus/pci/drivers/mlx5_core/bind


ip link shows:
4: net2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 1c:34:da:77:fb:d8 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 4e:41:8f:92:41:44, spoof checking off, link-state disable, trust off, query_rss off
    vf 1 MAC 00:00:00:00:00:00, spoof checking off, link-state disable, trust off, query_rss off
8: pf0vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 4e:41:8f:92:41:44 brd ff:ff:ff:ff:ff:ff
10: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 46:87:9e:9e:c8:23 brd ff:ff:ff:ff:ff:ff


net2 is the PF, pf0vf0 is the VF representor, and eth0 is the VF.




Start the PMD:
./dpdk-testpmd -c 0x1f  -n 4 -m 4096 --file-prefix=ovs -a "0000:19:00.0,representor=pf0vf0,dv_flow_en=1,dv_esw_en=1,dv_xmeta_en=1"  --huge-dir=/mnt/ovsdpdk  -- -i --flow-isolate-all --forward-mode=rxonly --rxq=4 --txq=4 --auto-start --nb-cores=4




testpmd> set vxlan ip-version ipv4 vni 1000 udp-src 0 udp-dst 4789 ip-src 172.168.152.50 ip-dst 172.168.152.73 eth-src 1c:34:da:77:fb:d8 eth-dst 3c:fd:fe:bb:1c:0c
testpmd> flow create 1 ingress priority 0 group 0 transfer pattern eth src is 46:87:9e:9e:c8:23 dst is 5a:9e:0f:74:6c:5e type is 0x0800 / ipv4 tos spec 0x0 tos mask 0x3 / end actions count / vxlan_encap / port_id original 0 id 0 / end
port_flow_complain(): Caught PMD error type 16 (specific action): port does not belong to E-Switch being configured: Invalid argument


Adding the rule fails with "port does not belong to E-Switch being configured".


I checked the DPDK code. The error comes from this check in the function
flow_dv_validate_action_port_id:


if (act_priv->domain_id != dev_priv->domain_id)
                return rte_flow_error_set
                                (error, EINVAL,
                                 RTE_FLOW_ERROR_TYPE_ACTION, NULL, 
                                 "port does not belong to"
                                 " E-Switch being configured");


The domain_id of the VF representor is not the same as the domain_id of the PF.


Looking at mlx5_dev_spawn, the domain_id values of the VF representor and the PF will always be different.


mlx5_dev_spawn:
        /*
         * Look for sibling devices in order to reuse their switch domain
         * if any, otherwise allocate one.
         */
        MLX5_ETH_FOREACH_DEV(port_id, NULL) {
                const struct mlx5_priv *opriv =
                        rte_eth_devices[port_id].data->dev_private;


                if (!opriv ||
                    opriv->sh != priv->sh ||
                        opriv->domain_id ==
                        RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID)
                        continue;
                priv->domain_id = opriv->domain_id;
                break;
        }
        if (priv->domain_id == RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID) {
                err = rte_eth_switch_domain_alloc(&priv->domain_id);



MLX5_ETH_FOREACH_DEV will never return the PF eth_dev:
mlx5_eth_find_next(uint16_t port_id, struct rte_device *odev)
{
        while (port_id < RTE_MAX_ETHPORTS) {
                struct rte_eth_dev *dev = &rte_eth_devices[port_id];


                if (dev->state != RTE_ETH_DEV_UNUSED &&
                    dev->device &&
                    (dev->device == odev ||
                     (dev->device->driver &&
                     dev->device->driver->name &&
                     ((strcmp(dev->device->driver->name,
                              MLX5_PCI_DRIVER_NAME) == 0) ||
                      (strcmp(dev->device->driver->name,
                              MLX5_AUXILIARY_DRIVER_NAME) == 0)))))



Although the state of the PF eth_dev is ATTACHED, its driver is not set yet.
The driver is only set in rte_pci_probe_one_driver, once all ports
on the same device have been probed.
So at this point the VF representor will never find the PF, which
leads the VF representor to pick another domain_id.


So in this case the pci_driver should be set in mlx5_driver_probe (mlx5_os_pci_probe), before the ports are spawned.
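
To make the sequence easier to follow, here is a small standalone model of the sibling lookup. It is only a sketch and not DPDK code: all the model_* names, the MODEL_DRIVER_NAME string and the driver_set_before_spawn switch are made up for illustration. It assumes, as described above, that the PF and its representor share one device and one shared context, and that the lookup skips any port whose device has no driver attached yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_PORTS      4
#define MODEL_DOMAIN_INVALID UINT16_MAX
#define MODEL_DRIVER_NAME    "model_mlx5"  /* hypothetical stand-in name */

/* Simplified stand-ins for rte_driver, rte_device and rte_eth_dev + mlx5_priv. */
struct model_driver { const char *name; };
struct model_device { const struct model_driver *driver; };
struct model_port {
	int used;                     /* models state != RTE_ETH_DEV_UNUSED */
	struct model_device *device;
	const void *sh;               /* shared context, same for PF and representor */
	uint16_t domain_id;
};

static struct model_port ports[MODEL_MAX_PORTS];
static uint16_t next_domain;

/* Same filter shape as the quoted mlx5_eth_find_next(): a port is returned
 * only if its device equals odev or its driver name matches. */
static unsigned int
model_find_next(unsigned int port_id, const struct model_device *odev)
{
	for (; port_id < MODEL_MAX_PORTS; port_id++) {
		const struct model_port *p = &ports[port_id];

		if (p->used && p->device &&
		    (p->device == odev ||
		     (p->device->driver && p->device->driver->name &&
		      strcmp(p->device->driver->name, MODEL_DRIVER_NAME) == 0)))
			break;
	}
	return port_id;
}

/* Same idea as the sibling loop in mlx5_dev_spawn(): reuse the domain of a
 * port on the same shared context, otherwise allocate a new one. */
static uint16_t
model_pick_domain(const void *sh)
{
	unsigned int id;

	for (id = model_find_next(0, NULL); id < MODEL_MAX_PORTS;
	     id = model_find_next(id + 1, NULL)) {
		if (ports[id].sh != sh ||
		    ports[id].domain_id == MODEL_DOMAIN_INVALID)
			continue;
		return ports[id].domain_id;
	}
	return next_domain++;
}

/* Spawn the PF (port 0), then the representor (port 1).
 * Returns 1 if both ended up in the same switch domain. */
static int
model_domains_match(int driver_set_before_spawn)
{
	static const struct model_driver drv = { MODEL_DRIVER_NAME };
	static struct model_device dev;
	static int shared_ctx;

	dev.driver = driver_set_before_spawn ? &drv : NULL;
	memset(ports, 0, sizeof(ports));
	next_domain = 0;

	ports[0] = (struct model_port){ 1, &dev, &shared_ctx, MODEL_DOMAIN_INVALID };
	ports[0].domain_id = model_pick_domain(&shared_ctx);
	assert(ports[0].domain_id != MODEL_DOMAIN_INVALID);

	ports[1] = (struct model_port){ 1, &dev, &shared_ctx, MODEL_DOMAIN_INVALID };
	ports[1].domain_id = model_pick_domain(&shared_ctx);

	return ports[0].domain_id == ports[1].domain_id;
}
```

In this model, model_domains_match(0) corresponds to the current order (driver still unset while the ports are spawned) and the two ports get different domains. model_domains_match(1) corresponds to setting the driver first, and the representor then reuses the PF domain.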






BR
wenxu








