[dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port representors

Xueming(Steven) Li xuemingl at mellanox.com
Wed Jun 27 19:30:34 CEST 2018



> -----Original Message-----
> From: Adrien Mazarguil <adrien.mazarguil at 6wind.com>
> Sent: Wednesday, June 27, 2018 9:32 PM
> To: Shahaf Shuler <shahafs at mellanox.com>
> Cc: Xueming(Steven) Li <xuemingl at mellanox.com>; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port representors
> 
> On Sun, Jun 17, 2018 at 10:15:07AM +0000, Shahaf Shuler wrote:
> > Hi Adrien,
> >
> > Saturday, June 16, 2018 11:58 AM, Xueming(Steven) Li:
> > > Subject: RE: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port
> > > representors
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces at dpdk.org> On Behalf Of Adrien Mazarguil
> > > > Sent: Thursday, June 14, 2018 4:35 PM
> > > > To: Shahaf Shuler <shahafs at mellanox.com>
> > > > Cc: dev at dpdk.org
> > > > Subject: [dpdk-dev] [PATCH v2 6/7] net/mlx5: probe all port
> > > > representors
> > > >
> > > > Probe existing port representors in addition to their master
> > > > device and
> > > associate them automatically.
> > > >
> > > > To avoid name collision between Ethernet devices, their names use
> > > > the same convention as ixgbe and i40e PMDs, that is, instead of
> > > > only a PCI
> > > address in DBDF notation:
> > > >
> > > > - "net_{DBDF}_0" for master/switch devices.
> >
> > This is breaking compatibility for application using the device names in order to attach them to the
> application (e.g. OVS-DPDK).
> > Before this patch the naming scheme for non-representor port is "{DBDF}".
> >
> > Can we preserve the compatibility and add appropriate suffix for the representor case?
> 
> There's one issue if representors are hot-plugged. The name of the master device, which happens to be
> that of the switch domain, cannot be updated. The form "net_{DBDF}_0" seems expected for PMDs that
> support representors (see ixgbe and i40e).
> 
> Now since representor hot-plugging is not supported yet, I guess we could postpone this problem by
> keeping the old format in the meantime, however ideally, these applications should not rely on it. The
> only safe assumption they can make is the uniqueness of any given name among ethdevs.
> 
> PCI bus addresses, if needed, should be retrieved by looking at the underlying bus object.
> 
> By the way, while thinking again about a past comment from Xueming [1], maybe it's finally time to
> remove support for multiple Verbs ports on mlx5 after all. This should drop another unnecessary loop
> and the need for the unused "port %u" suffix at all while naming the device.
> 
> So how about the following plan for v3:
> 
> - Adding a patch that drops support for multiple Verbs ports (note for
>   Xueming, yes I changed my mind *again* :)
> 
> - If you really think this will break OVS (please confirm), then when no
>   "representor" parameter is provided (regardless of the presence of any
>   representors), name format will use the usual "{DBDF}" notation as you
>   suggested.
> 
> - Otherwise as soon as a "representor" is found on the command line, the new
>   format will be used, again regardless of the presence of any representors.

The port creation sequence of upcoming hot plug looks like this:
0000:81:00.1
0000:81:00.1,representor=0
0000:81:00.1,representor=1

So the PF attaching comes always w/o "representor" parameter.

> 
> - In both cases, representors if any, will be named according to the format
>   specified in this patch.
> 
> [1]
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2
> 018-
> June%2F104015.html&data=02%7C01%7Cxuemingl%40mellanox.com%7Cad9a1b32e5e241e375d208d5dc32778b%7Ca652971
> c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636657031666639542&sdata=w4oNeWXwKXS0%2BNSZsYQaneW%2BkFxvWIHZFHLoM
> fLOxkg%3D&reserved=0
> 
> <snip>
> > > >  	for (i = 0; i < attr.orig_attr.phys_port_cnt; ++i) {
> > > > -		eth_list[i] = mlx5_dev_spawn_one(dpdk_dev, ibv_dev, vf,
> > > > -						 &attr, i + 1);
> > > > -		if (eth_list[i])
> > > > -			continue;
> > > > -		/* Save rte_errno and roll back in case of failure. */
> > > > -		ret = rte_errno;
> > > > -		while (i--) {
> > > > -			mlx5_dev_close(eth_list[i]);
> > > > -			if (rte_eal_process_type() == RTE_PROC_PRIMARY)
> > > > -				rte_free(eth_list[i]->data->dev_private);
> > > > -			claim_zero(rte_eth_dev_release_port(eth_list[i]));
> > > > -		}
> > > > -		free(eth_list);
> > > > -		rte_errno = ret;
> > > > -		return NULL;
> > > > +		eth_list[n] = mlx5_dev_spawn_one(dpdk_dev, ibv_dev[j],
> > > vf,
> > > > +						 &attr, i + 1,
> > > > +						 j ? eth_list[0] : NULL,
> > > > +						 j - 1);
> >
> > The representor id is according to the sort made by qsort (based on device names).
> > A better way may be to set it according to the sysfs information, like you do in the mlx5_get_ifname
> function.
> > What do you think?
> 
> I agree that the current approach sucks, hence the big fat warnings I left around (see discussion with
> Xueming [2]). Problem is that the needed information is not yet known at this stage; there is no
> private structure to rely on to use mlx5_get_ifname() directly.
> 
> I'd also rather see these assumptions go in any case. I'll attempt to improve things for v3 in
> preparation of allowing representors to be probed on their own anytime, possibly even before the
> master device.
> 
> [2]
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2
> 018-
> June%2F104059.html&data=02%7C01%7Cxuemingl%40mellanox.com%7Cad9a1b32e5e241e375d208d5dc32778b%7Ca652971
> c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636657031666639542&sdata=5eOWb69duEB%2BkIW1ZGkv%2FLxkZfwErQOd%2FV7
> nDpN2jOg%3D&reserved=0
> 
> <snip>
> > > > diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> > > > index
> > > > 997b04a33..0fe467140 100644
> > > > --- a/drivers/net/mlx5/mlx5.h
> > > > +++ b/drivers/net/mlx5/mlx5.h
> > > > @@ -161,6 +161,10 @@ struct priv {
> > > >  	uint16_t mtu; /* Configured MTU. */
> > > >  	uint8_t port; /* Physical port number. */
> > > >  	unsigned int isolated:1; /* Whether isolated mode is enabled. */
> > > > +	unsigned int representor:1; /* Device is a port representor. */
> >
> > Why we need above flag? Why can't we use RTE_ETH_DEV_REPRESENTOR from eth_dev->data->dev_flags.
> 
> Problem is that this flag can only be set once the ethdev is fully instantiated and can't be relied on
> internally where needed (e.g. during clean up in error handling code). It's reported to applications
> but not used internally.
> 
> As a device property, it's actually pretty similar to the VF bit or offloaded capabilities where
> checking exposed information would be needlessly complex.
> 
> Now maybe it could be part of struct mlx5_dev_config as well. I initially assumed this object was only
> for user-provided parameters but looks like it's not the case. I intend to move it there for v3.
> 
> --
> Adrien Mazarguil
> 6WIND


More information about the dev mailing list