[dpdk-dev] [RFC PATCH 5/6] bus/mlx5_pci: register a PCI driver

Parav Pandit parav at mellanox.com
Thu Jun 18 12:03:06 CEST 2020


> From: Gaëtan Rivet <grive at u256.net>
> Sent: Tuesday, June 16, 2020 3:17 AM
> 
> On 10/06/20 17:17 +0000, Parav Pandit wrote:
> > Create a mlx5 bus driver framework for invoking drivers of multiple
> > classes who have registered with the mlx5_pci bus driver.
> >
> > Validate user class arguments for supported class combinations.
> >
> > Signed-off-by: Parav Pandit <parav at mellanox.com>
> > ---
> >  drivers/bus/mlx5_pci/Makefile           |   1 +
> >  drivers/bus/mlx5_pci/meson.build        |   2 +-
> >  drivers/bus/mlx5_pci/mlx5_pci_bus.c     | 253
> ++++++++++++++++++++++++
> >  drivers/bus/mlx5_pci/rte_bus_mlx5_pci.h |   1 +
> >  4 files changed, 256 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/bus/mlx5_pci/Makefile
> > b/drivers/bus/mlx5_pci/Makefile index b36916e52..327076fe4 100644
> > --- a/drivers/bus/mlx5_pci/Makefile
> > +++ b/drivers/bus/mlx5_pci/Makefile
> > @@ -15,6 +15,7 @@ CFLAGS += -I$(BUILDDIR)/drivers/common/mlx5
> CFLAGS
> > += -I$(RTE_SDK)/drivers/bus/pci  CFLAGS += -Wno-strict-prototypes
> > LDLIBS += -lrte_eal
> > +LDLIBS += -lrte_kvargs
> >  LDLIBS += -lrte_common_mlx5
> >  LDLIBS += -lrte_pci -lrte_bus_pci
> >
> > diff --git a/drivers/bus/mlx5_pci/meson.build
> > b/drivers/bus/mlx5_pci/meson.build
> > index cc4a84e23..5111baa4e 100644
> > --- a/drivers/bus/mlx5_pci/meson.build
> > +++ b/drivers/bus/mlx5_pci/meson.build
> > @@ -1,6 +1,6 @@
> >  # SPDX-License-Identifier: BSD-3-Clause  # Copyright(c) 2020 Mellanox
> > Technologies Ltd
> >
> > -deps += ['pci', 'bus_pci', 'common_mlx5']
> > +deps += ['pci', 'bus_pci', 'common_mlx5', 'kvargs']
> >  install_headers('rte_bus_mlx5_pci.h')
> >  sources = files('mlx5_pci_bus.c')
> > diff --git a/drivers/bus/mlx5_pci/mlx5_pci_bus.c
> > b/drivers/bus/mlx5_pci/mlx5_pci_bus.c
> > index 66db3c7b0..8108e35ea 100644
> > --- a/drivers/bus/mlx5_pci/mlx5_pci_bus.c
> > +++ b/drivers/bus/mlx5_pci/mlx5_pci_bus.c
> > @@ -3,12 +3,265 @@
> >   */
> >
> >  #include "rte_bus_mlx5_pci.h"
> > +#include <mlx5_common_utils.h>
> >
> >  static TAILQ_HEAD(mlx5_pci_bus_drv_head, rte_mlx5_pci_driver) drv_list
> =
> >  				TAILQ_HEAD_INITIALIZER(drv_list);
> >
> > +struct class_map {
> > +	const char *name;
> > +	enum mlx5_class dev_class;
> > +};
> 
> Defining a type here does not seem useful.
> You could return an "enum mlx5_class" from the function is_valid_class()
> further below, instead of a class_map entry. You have the
> MLX5_CLASS_INVALID sentinel value to mark the inexistant mapping.
> 
> Making this type anonymous, you should merge it with the array below:
> 
> static const struct {
> 	const char *name;
> 	enum mlx5_class class; // Remember to change this enum to a
> 			       // fixed width type by the way.
Yes, I acked to changed to define, but I forgot that I use the enum here.
So I am going to keep the enum as code reads more clear with enum.

> } mlx5_classes[] = {
> 	...
> };
> 
> > +
> > +const struct class_map mlx5_classes[] = {
> 
> This is not really a class list, but as the type describes, a mapping between
> names and binary id.
> 
> I think mlx5_class_names would fit better.
o.k.
> 
> > +	{ .name = "vdpa", .dev_class = MLX5_CLASS_VDPA },
> > +	{ .name = "net", .dev_class = MLX5_CLASS_NET }, };
> 
> Globals should be static even if not exposed through header.
Yes.
> 
> > +
> > +static const enum mlx5_class mlx5_valid_class_combo[] = {
> > +	MLX5_CLASS_NET,
> > +	MLX5_CLASS_VDPA,
> 
> How do you describe future combos?
> Should we expect MLX5_CLASS_NET | MLX5_CLASS_REGEX for example?
> 
Correct.

> > +	/* New class combination should be added here */ };
> > +
> > +static const struct class_map *is_valid_class(const char *class_name)
> 
> I don't think the function name conveys its actual use.
> mlx5_class_from_name() for example would align with other DPDK APIs.
Make sense. Will change.
> 
> > +{
> > +	unsigned int i;
> > +
> > +	for (i = 0; i < sizeof(mlx5_classes) / sizeof(struct class_map);
> 
> RTE_DIM(mlx5_classes) must be used instead.
> 
> > +	     i++) {
> > +		if (strcmp(class_name, mlx5_classes[i].name) == 0)
> > +			return &mlx5_classes[i];
> > +
> > +	}
> > +	return NULL;
> > +}
> > +
> > +static int is_valid_class_combo(const char *class_names) {
> > +	enum mlx5_class user_classes = 0;
> > +	char *nstr = strdup(class_names);
> 
> Not a fan of assignment with malloc directly at var declaration, you are
> missing some checks here.
> 
Ok. will do assignment explicitly and do the check.

> > +	const struct class_map *entry;
> > +	char *copy = nstr;
> 
> copy is nondescript and dangerous. I'd use nstr_orig instead.
Ok. will change varialble name from copy to nstr_orig.

> 
> > +	int invalid = 0;
> > +	unsigned int i;
> > +	char *found;
> 
> Reading the code below, it seems that the kvlist should only be defined if the
> devargs length is > 0, so class_names here should be defined.
> 
> However you do not handle the OOM case explicitly here, if nstr cannot be
> allocated on strdup().
> 
Will return error on OOM.

> > +
> > +	while (nstr) {
> > +		/* Extract each individual class name */
> > +		found = strsep(&nstr, ":");
> 
> I have not seen the feature test macros (_DEFAULT_SOURCE) in the
> Makefile, it seems required for strsep()?
> 
If its mandatory meson build should have complained?

> > +		if (!found)
> > +			continue;
> > +
> > +		/* Check if its a valid class */
> > +		entry = is_valid_class(found);
> 
> As said earlier, you have no use for the full map entry, you could return the
> mlx5_class type instead, as you have the MLX5_CLASS_INVALID sentinel
> available to mark the !found case.
> 
> > +		if (!entry) {
> > +			invalid = EINVAL;
> 
> Just toggling invalid = 1; here would be better.
> Return EINVAL explicitly if (invalid).
> 
> > +			break;
> > +		}
> > +		user_classes |= entry->dev_class;
> > +	}
> > +	if (copy)
> > +		free(copy);
> > +	if (invalid)
> > +		return invalid;
> 
> You are returning EINVAL here, but -EINVAL below.
> It should be aligned, in DPDK usually error return values are negative.
> positive errno should only be used for rte_errno.
> 
> > +
> > +	/* Verify if user specified valid supported combination */
> > +	for (i = 0;
> > +	     i < sizeof(mlx5_valid_class_combo) / sizeof(enum mlx5_class);
> 
> RTE_DIM() here;
> 
Ack.

> > +	     i++) {
> > +		if (mlx5_valid_class_combo[i] == user_classes)
> 
> return 0; directly?
> 
Yes.

> > +			break;
> > +	}
> > +	/* Not found any valid class combination */
> > +	if (i == sizeof(mlx5_valid_class_combo) / sizeof(enum mlx5_class))
> 
> This would simplify this check, where you can directly return -ENODEV for
> example to differentiate from invalid class name and invalid combo.
> 
> > +		return -EINVAL;
> > +	else
> > +		return 0;
> > +}
> > +
> > +static int
> > +mlx5_bus_opt_handler(__rte_unused const char *key, const char *value,
> > +		       void *opaque)
> > +{
> > +	int *ret = opaque;
> > +
> > +	*ret = is_valid_class_combo(value);
> > +	if (*ret)
> > +		DRV_LOG(ERR, "Invalid mlx5 classes %s. Maybe typo in
> device"
> > +			" class argument setting?", value);
> 
> Error message should not be cut in half, it makes it difficult to grep.
> If you differentiate between typo in name and invalid combo you could
> directly warn the user about the proper error.
> 
Will have it in one line.

> (You can ignore the warning from checkpatch.sh about the long lines on a
> string if there is one.)
> 
> > +	return 0;
> > +}
> > +
> > +static int
> > +mlx5_bus_options_valid(const struct rte_devargs *devargs) {
> > +	struct rte_kvargs *kvlist;
> > +	const char *key = MLX5_CLASS_ARG_NAME;
> > +	int ret = 0;
> > +
> > +	if (devargs == NULL)
> > +		return 0;
> > +	kvlist = rte_kvargs_parse(devargs->args, NULL);
> > +	if (kvlist == NULL)
> > +		return 0;
> > +	if (rte_kvargs_count(kvlist, key))
> > +		rte_kvargs_process(kvlist, key, mlx5_bus_opt_handler,
> &ret);
> > +	rte_kvargs_free(kvlist);
> > +	return ret;
> > +}
> > +
> >  void
> >  rte_mlx5_pci_driver_register(struct rte_mlx5_pci_driver *driver)  {
> >  	TAILQ_INSERT_TAIL(&drv_list, driver, next);  }
> > +
> > +static bool
> > +mlx5_bus_match(const struct rte_mlx5_pci_driver *drv,
> > +		 const struct rte_pci_device *pci_dev) {
> > +	const struct rte_pci_id *id_table;
> > +
> > +	for (id_table = drv->id_table; id_table->vendor_id != 0; id_table++) {
> > +		/* check if device's ids match the class driver's ones */
> > +		if (id_table->vendor_id != pci_dev->id.vendor_id &&
> > +				id_table->vendor_id != PCI_ANY_ID)
> > +			continue;
> > +		if (id_table->device_id != pci_dev->id.device_id &&
> > +				id_table->device_id != PCI_ANY_ID)
> > +			continue;
> > +		if (id_table->subsystem_vendor_id !=
> > +		    pci_dev->id.subsystem_vendor_id &&
> > +		    id_table->subsystem_vendor_id != PCI_ANY_ID)
> > +			continue;
> > +		if (id_table->subsystem_device_id !=
> > +		    pci_dev->id.subsystem_device_id &&
> > +		    id_table->subsystem_device_id != PCI_ANY_ID)
> > +			continue;
> > +		if (id_table->class_id != pci_dev->id.class_id &&
> > +				id_table->class_id != RTE_CLASS_ANY_ID)
> > +			continue;
> > +
> > +		return true;
> > +	}
> > +	return false;
> > +}
> > +
> > +/**
> > + * DPDK callback to register to probe multiple PCI class devices.
> > + *
> > + * @param[in] pci_drv
> > + *   PCI driver structure.
> > + * @param[in] dev
> > + *   PCI device information.
> > + *
> > + * @return
> > + *   0 on success, 1 to skip this driver, a negative errno value otherwise
> > + *   and rte_errno is set.
> > + */
> > +static int
> > +mlx5_bus_pci_probe(struct rte_pci_driver *drv __rte_unused,
> > +		     struct rte_pci_device *dev)
> > +{
> > +	struct rte_mlx5_pci_driver *class;
> > +	int ret = 0;
> > +
> > +	ret = mlx5_bus_options_valid(dev->device.devargs);
> > +	if (ret)
> > +		return ret;
> > +
> > +	TAILQ_FOREACH(class, &drv_list, next) {
> > +		if (!mlx5_bus_match(class, dev))
> > +			continue;
> > +
> > +		if (!mlx5_class_enabled(dev->device.devargs, class-
> >dev_class))
> > +			continue;
> > +
> > +		ret = class->probe(drv, dev);
> > +		if (!ret)
> > +			class->loaded = true;
> > +	}
> > +	return 0;
> > +}
> > +
> > +/**
> > + * DPDK callback to remove one or more class devices for a PCI device.
> > + *
> > + * This function removes all class devices belong to a given PCI device.
> > + *
> > + * @param[in] pci_dev
> > + *   Pointer to the PCI device.
> > + *
> > + * @return
> > + *   0 on success, the function cannot fail.
> > + */
> > +static int
> > +mlx5_bus_pci_remove(struct rte_pci_device *dev) {
> > +	struct rte_mlx5_pci_driver *class;
> > +
> > +	/* Remove each class driver in reverse order */
> 
> Why use a revese order specifically?
> 
> > +	TAILQ_FOREACH_REVERSE(class, &drv_list, mlx5_pci_bus_drv_head,
> next) {
> > +		if (!mlx5_class_enabled(dev->device.devargs, class-
> >dev_class))
> 
> You are parsing the devargs each time to check a single class.
> If you return the proper bitmap instead, do not parse the devargs within this
> loop, do it beforehand and check only that (devargs_class_bits & class-
> >dev_class).
> 
> Same thing in the probe().
> 
> > +			continue;
> > +
> > +		if (class->loaded)
> > +			class->remove(dev);
> > +	}
> > +	return 0;
> > +}
> > +
> > +static int
> > +mlx5_bus_pci_dma_map(struct rte_pci_device *dev, void *addr,
> > +		     uint64_t iova, size_t len)
> > +{
> > +	struct rte_mlx5_pci_driver *class;
> > +	int ret = -EINVAL;
> > +
> > +	TAILQ_FOREACH(class, &drv_list, next) {
> > +		if (!class->dma_map)
> > +			continue;
> > +
> > +		return class->dma_map(dev, addr, iova, len);
> 
> Is there a specific class that could have priority for the DMA?
> 
No.

> > +	}
> > +	return ret;
> > +}
> > +
> > +static int
> > +mlx5_bus_pci_dma_unmap(struct rte_pci_device *dev, void *addr,
> > +		       uint64_t iova, size_t len)
> > +{
> > +	struct rte_mlx5_pci_driver *class;
> > +	int ret = -EINVAL;
> > +
> > +	TAILQ_FOREACH_REVERSE(class, &drv_list, mlx5_pci_bus_drv_head,
> next) {
> > +		if (!class->dma_unmap)
> > +			continue;
> > +
> > +		return class->dma_unmap(dev, addr, iova, len);
> 
> If you have two classes A -> B having dma_map() + dma_unmap(), you will
> dma_map() with A then dma_unmap() with B, due to the _REVERSE()
> iteration?
> 
There isn't plan for two drivers to do so.
If two classes do that its source of an error.
Will enhance the bus when that need arise.

> Why use reversed iteration at all by the way for dinit? If your ops is sound
> any order should be ok.
> 
Because deinit must be always reverse of init() code regardless.

> > +	}
> > +	return ret;
> > +}


More information about the dev mailing list