[dpdk-dev] [PATCH 01/14] net/mlx5: add representor recognition on kernels 5.x

Stephen Hemminger stephen at networkplumber.org
Thu Mar 21 20:08:34 CET 2019


On Thu, 21 Mar 2019 15:31:36 +0000
Slava Ovsiienko <viacheslavo at mellanox.com> wrote:

> Hi, Stephen
> 
> > -----Original Message-----
> > From: Stephen Hemminger <stephen at networkplumber.org>
> > Sent: Thursday, March 21, 2019 17:09
> > To: Shahaf Shuler <shahafs at mellanox.com>
> > Cc: Slava Ovsiienko <viacheslavo at mellanox.com>; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH 01/14] net/mlx5: add representor recognition
> > on kernels 5.x
> > 
> > On Thu, 21 Mar 2019 12:13:50 +0000
> > Shahaf Shuler <shahafs at mellanox.com> wrote:
> >   
> > > Hi Slava,
> > >
> > > Small comments below. Once fixed you can put my acked-by on the next  
> > version.  
> > >
> > > Thursday, March 21, 2019 10:11 AM, Viacheslav Ovsiienko:  
> > > > Subject: [PATCH 01/14] net/mlx5: add representor recognition on
> > > > kernels 5.x
> > > >
> > > > The master device and VF representors were distinguished by presence
> > > > of port name, master device did not have one. The new Linux kernels
> > > > starting from 5.0 provide the port name for master device and the
> > > > implemented representor recognizing method does not work.
> > > > The new recognizing method is based on quiering the VF number,
> > > > created on the base of the device.
> > > >
> > > > The IFLA_NUM_VF attribute is returned by kernel if IFLA_EXT_MASK
> > > > attribute is specified in the Netlink request message.
> > > >
> > > > Also the presence of device symlink in device sysfs folder is added
> > > > to distinguish representors with sysfs based method.
> > > >
> > > > Signed-off-by: Viacheslav Ovsiienko <viacheslavo at mellanox.com>
> > > >
> > > > ---
> > > >
> > > > v3: - rebased over new port naming  
> > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatche
> > s.dpdk.org%2Fpatch%2F51245%2F&data=02%7C01%7Cviacheslavo%40
> > mellanox.com%7C94cc885cbb8d4aade9dd08d6ae0f26cd%7Ca652971c7d2e4
> > d9ba6a4d149256f461b%7C0%7C0%7C636887777455248723&sdata=FDq
> > 950ksokxsNac8cBM293W263uPfVeY1xA7Cx%2F4FLk%3D&reserved=0  
> > > >     - master recognition is reinforced by checking vport for -1
> > > >       for new port naming schema
> > > >
> > > > v2: - fopen replaced with opendir to detect whether directory exists
> > > >
> > > > v1:
> > > > https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpat
> > > >  
> > ches.dpdk.org%2Fpatch%2F50411%2F&data=02%7C01%7Cviacheslavo%
> > 40me  
> > > >  
> > llanox.com%7C94cc885cbb8d4aade9dd08d6ae0f26cd%7Ca652971c7d2e4d9b
> > a6a4  
> > > >  
> > d149256f461b%7C0%7C0%7C636887777455248723&sdata=JkWKbb6LV
> > diIHW%2  
> > > > FpJEQHcD7hvFLWdGmM%2BTVhM%2F%2F80Uk%3D&reserved=0
> > > > ---
> > > >  drivers/net/mlx5/Makefile      | 10 ++++++++++
> > > >  drivers/net/mlx5/meson.build   |  4 ++++
> > > >  drivers/net/mlx5/mlx5.c        |  2 +-
> > > >  drivers/net/mlx5/mlx5.h        |  1 +
> > > >  drivers/net/mlx5/mlx5_ethdev.c | 13 +++++++++++--
> > > >  drivers/net/mlx5/mlx5_nl.c     | 36
> > > > +++++++++++++++++++++++++++++++++---
> > > >  6 files changed, 60 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
> > > > index
> > > > 1ed299d..3dd7e38 100644
> > > > --- a/drivers/net/mlx5/Makefile
> > > > +++ b/drivers/net/mlx5/Makefile
> > > > @@ -231,6 +231,16 @@ mlx5_autoconf.h.new:
> > > > $(RTE_SDK)/buildtools/auto- config-h.sh
> > > >  		enum RDMA_NLDEV_ATTR_NDEV_INDEX \
> > > >  		$(AUTOCONF_OUTPUT)
> > > >  	$Q sh -- '$<' '$@' \
> > > > +		HAVE_IFLA_NUM_VF \
> > > > +		linux/if_link.h \
> > > > +		enum IFLA_NUM_VF \
> > > > +		$(AUTOCONF_OUTPUT)
> > > > +	$Q sh -- '$<' '$@' \
> > > > +		HAVE_IFLA_EXT_MASK \
> > > > +		linux/if_link.h \
> > > > +		enum IFLA_EXT_MASK \
> > > > +		$(AUTOCONF_OUTPUT)
> > > > +	$Q sh -- '$<' '$@' \
> > > >  		HAVE_IFLA_PHYS_SWITCH_ID \
> > > >  		linux/if_link.h \
> > > >  		enum IFLA_PHYS_SWITCH_ID \
> > > > diff --git a/drivers/net/mlx5/meson.build
> > > > b/drivers/net/mlx5/meson.build index 0cf2f08..e3cb9bc 100644
> > > > --- a/drivers/net/mlx5/meson.build
> > > > +++ b/drivers/net/mlx5/meson.build
> > > > @@ -133,6 +133,10 @@ if build
> > > >  		'ETHTOOL_LINK_MODE_50000baseCR2_Full_BIT' ],
> > > >  		[ 'HAVE_ETHTOOL_LINK_MODE_100G', 'linux/ethtool.h',
> > > >  		'ETHTOOL_LINK_MODE_100000baseKR4_Full_BIT' ],
> > > > +		[ 'HAVE_IFLA_NUM_VF', 'linux/if_link.h',
> > > > +		'IFLA_NUM_VF' ],
> > > > +		[ 'HAVE_IFLA_EXT_MASK', 'linux/if_link.h',
> > > > +		'IFLA_EXT_MASK' ],
> > > >  		[ 'HAVE_IFLA_PHYS_SWITCH_ID', 'linux/if_link.h',
> > > >  		'IFLA_PHYS_SWITCH_ID' ],
> > > >  		[ 'HAVE_IFLA_PHYS_PORT_NAME', 'linux/if_link.h', diff --git
> > > > a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> > > > ad1975c..ea3d00c 100644
> > > > --- a/drivers/net/mlx5/mlx5.c
> > > > +++ b/drivers/net/mlx5/mlx5.c
> > > > @@ -13,7 +13,6 @@
> > > >  #include <errno.h>
> > > >  #include <net/if.h>
> > > >  #include <sys/mman.h>
> > > > -#include <linux/netlink.h>
> > > >  #include <linux/rtnetlink.h>
> > > >
> > > >  /* Verbs header. */
> > > > @@ -1001,6 +1000,7 @@
> > > >  	priv->nl_socket_route =	mlx5_nl_init(NETLINK_ROUTE);
> > > >  	priv->nl_sn = 0;
> > > >  	priv->representor = !!switch_info->representor;
> > > > +	priv->master = !!switch_info->master;
> > > >  	priv->domain_id = RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID;
> > > >  	priv->representor_id =
> > > >  		switch_info->representor ? switch_info->port_name : -1; diff
> > > > --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > > > a88cb4a..58bc37f 100644
> > > > --- a/drivers/net/mlx5/mlx5.h
> > > > +++ b/drivers/net/mlx5/mlx5.h
> > > > @@ -214,6 +214,7 @@ struct mlx5_priv {
> > > >  	uint16_t mtu; /* Configured MTU. */
> > > >  	unsigned int isolated:1; /* Whether isolated mode is enabled. */
> > > >  	unsigned int representor:1; /* Device is a port representor. */
> > > > +	unsigned int master:1; /* Device is a E-Switch master. */
> > > >  	uint16_t domain_id; /* Switch domain identifier. */
> > > >  	int32_t representor_id; /* Port representor identifier. */
> > > >  	/* RX/TX queues. */
> > > > diff --git a/drivers/net/mlx5/mlx5_ethdev.c
> > > > b/drivers/net/mlx5/mlx5_ethdev.c index 84d761c..81f2a42 100644
> > > > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > > > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > > > @@ -1362,8 +1362,10 @@ int mlx5_fw_version_get(struct rte_eth_dev
> > > > *dev, char *fw_ver, size_t fw_size)
> > > >  		.port_name = 0,
> > > >  		.switch_id = 0,
> > > >  	};
> > > > +	DIR *dir;
> > > >  	bool port_name_set = false;
> > > >  	bool port_switch_id_set = false;
> > > > +	bool device_dir = false;
> > > >  	char c;
> > > >
> > > >  	if (!if_indextoname(ifindex, ifname)) { @@ -1375,6 +1377,8 @@ int
> > > > mlx5_fw_version_get(struct rte_eth_dev *dev, char *fw_ver, size_t
> > > > fw_size)
> > > >  	      ifname);
> > > >  	MKSTR(phys_switch_id, "/sys/class/net/%s/phys_switch_id",
> > > >  	      ifname);
> > > > +	MKSTR(pci_device, "/sys/class/net/%s/device",
> > > > +	      ifname);
> > > >
> > > >  	file = fopen(phys_port_name, "rb");
> > > >  	if (file != NULL) {
> > > > @@ -1391,8 +1395,13 @@ int mlx5_fw_version_get(struct rte_eth_dev
> > > > *dev, char *fw_ver, size_t fw_size)
> > > >  		fscanf(file, "%" SCNx64 "%c", &data.switch_id, &c) == 2 &&
> > > >  		c == '\n';
> > > >  	fclose(file);
> > > > -	data.master = port_switch_id_set && !port_name_set;
> > > > -	data.representor = port_switch_id_set && port_name_set;
> > > > +	dir = opendir(pci_device);
> > > > +	if (dir != NULL) {
> > > > +		closedir(dir);
> > > > +		device_dir = true;
> > > > +	}
> > > > +	data.master = port_switch_id_set && (!port_name_set ||
> > > > device_dir);
> > > > +	data.representor = port_switch_id_set && port_name_set &&
> > > > !device_dir;  
> > >
> > > Add assert that device cannot be both master and representor.  
> > 
> > Error checking would be but assert() is usually not a useful in drivers.
> > It causes crash, and is often compiled out.  
> 
> PMD is a user mode driver, so standard assert() seems to be relevant.
> But I agree, it would be good for portable code to have its own
> definition of assert. Say, "rte_assert". It would allow us to define/redefine
> the code behavior if assertion fails.
> 
> As for me, I think asserts are EXTREMELY useful, it saves a lot of time while
> debugging, and it is proved by my own practice of mlx5 PMD debugging
> (beside other projects).  Assert inserted in right place stops the quite wrong
> situation evolving and allows us to have a good catch and find
> the root of the problem quickly.
> 
> WBR,
> Slava

You may misunderstand what I meant. Asserts are useful, but they need to be
used only when real error handling is not possible. In general, it is better
to log and return an error on invalid data than crash the whole application.
Especially since DPDK now supports hot plug and it could be that device is
added to a working application.

For me, the worry is that an application on Azure starts up using synthetic
datapath successfully, and the hot plug of VF (accelerated networking) might
expose a driver bug. In that case, there still is a fallback to ignore the
mlx device.


More information about the dev mailing list