[dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port representors

Adrien Mazarguil adrien.mazarguil at 6wind.com
Wed Jun 27 15:32:28 CEST 2018


On Sun, Jun 17, 2018 at 10:15:07AM +0000, Shahaf Shuler wrote:
> Hi Adrien,
> 
> Saturday, June 16, 2018 11:58 AM, Xueming(Steven) Li:
> > Subject: RE: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port representors
> > 
> > > -----Original Message-----
> > > From: dev <dev-bounces at dpdk.org> On Behalf Of Adrien Mazarguil
> > > Sent: Thursday, June 14, 2018 4:35 PM
> > > To: Shahaf Shuler <shahafs at mellanox.com>
> > > Cc: dev at dpdk.org
> > > Subject: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port
> > > representors
> > >
> > > Probe existing port representors in addition to their master device and
> > associate them automatically.
> > >
> > > To avoid name collision between Ethernet devices, their names use the
> > > same convention as ixgbe and i40e PMDs, that is, instead of only a PCI
> > address in DBDF notation:
> > >
> > > - "net_{DBDF}_0" for master/switch devices.
> 
> This is breaking compatibility for application using the device names in order to attach them to the application (e.g. OVS-DPDK). 
> Before this patch the naming scheme for non-representor port is "{DBDF}". 
> 
> Can we preserve the compatibility and add appropriate suffix for the representor case? 

There's one issue if representors are hot-plugged. The name of the master
device, which happens to be that of the switch domain, cannot be
updated. The form "net_{DBDF}_0" seems expected for PMDs that support
representors (see ixgbe and i40e).

Now since representor hot-plugging is not supported yet, I guess we could
postpone this problem by keeping the old format in the meantime, however
ideally, these applications should not rely on it. The only safe assumption
they can make is the uniqueness of any given name among ethdevs.

PCI bus addresses, if needed, should be retrieved by looking at the
underlying bus object.

By the way, while thinking again about a past comment from Xueming [1],
maybe it's finally time to remove support for multiple Verbs ports on mlx5
after all. This should drop another unnecessary loop and the need for the
unused "port %u" suffix at all while naming the device.

So how about the following plan for v3:

- Adding a patch that drops support for multiple Verbs ports (note for
  Xueming, yes I changed my mind *again* :)

- If you really think this will break OVS (please confirm), then when no
  "representor" parameter is provided (regardless of the presence of any
  representors), name format will use the usual "{DBDF}" notation as you
  suggested.

- Otherwise as soon as a "representor" is found on the command line, the new
  format will be used, again regardless of the presence of any representors.

- In both cases, representors if any, will be named according to the format
  specified in this patch.

[1] https://mails.dpdk.org/archives/dev/2018-June/104015.html

<snip>
> > >  	for (i = 0; i < attr.orig_attr.phys_port_cnt; ++i) {
> > > -		eth_list[i] = mlx5_dev_spawn_one(dpdk_dev, ibv_dev, vf,
> > > -						 &attr, i + 1);
> > > -		if (eth_list[i])
> > > -			continue;
> > > -		/* Save rte_errno and roll back in case of failure. */
> > > -		ret = rte_errno;
> > > -		while (i--) {
> > > -			mlx5_dev_close(eth_list[i]);
> > > -			if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > > -				rte_free(eth_list[i]->data->dev_private);
> > > -			claim_zero(rte_eth_dev_release_port(eth_list[i]));
> > > -		}
> > > -		free(eth_list);
> > > -		rte_errno = ret;
> > > -		return NULL;
> > > +		eth_list[n] = mlx5_dev_spawn_one(dpdk_dev, ibv_dev[j],
> > vf,
> > > +						 &attr, i + 1,
> > > +						 j ? eth_list[0] : NULL,
> > > +						 j - 1);
> 
> The representor id is according to the sort made by qsort (based on device names).
> A better way may be to set it according to the sysfs information, like you do in the mlx5_get_ifname function.
> What do you think? 

I agree that the current approach sucks, hence the big fat warnings I left
around (see discussion with Xueming [2]). Problem is that the needed
information is not yet known at this stage; there is no private structure to
rely on to use mlx5_get_ifname() directly.

I'd also rather see these assumptions go in any case. I'll attempt to
improve things for v3 in preparation of allowing representors to be probed
on their own anytime, possibly even before the master device.

[2] https://mails.dpdk.org/archives/dev/2018-June/104059.html

<snip>
> > > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> > > 997b04a33..0fe467140 100644
> > > --- a/drivers/net/mlx5/mlx5.h
> > > +++ b/drivers/net/mlx5/mlx5.h
> > > @@ -161,6 +161,10 @@ struct priv {
> > >  	uint16_t mtu; /* Configured MTU. */
> > >  	uint8_t port; /* Physical port number. */
> > >  	unsigned int isolated:1; /* Whether isolated mode is enabled. */
> > > +	unsigned int representor:1; /* Device is a port representor. */
> 
> Why we need above flag? Why can't we use RTE_ETH_DEV_REPRESENTOR from eth_dev->data->dev_flags. 

Problem is that this flag can only be set once the ethdev is fully
instantiated and can't be relied on internally where needed (e.g. during
clean up in error handling code). It's reported to applications but not used
internally.

As a device property, it's actually pretty similar to the VF bit or
offloaded capabilities where checking exposed information would be
needlessly complex.

Now maybe it could be part of struct mlx5_dev_config as well. I initially
assumed this object was only for user-provided parameters but looks like
it's not the case. I intend to move it there for v3.

-- 
Adrien Mazarguil
6WIND


More information about the dev mailing list