[dpdk-dev] [EXT] Re: [PATCH v3 1/1] bus/pci: optimise scanning with whitelist/blacklist

Sunil Kumar Kori skori at marvell.com
Tue Apr 28 15:52:52 CEST 2020


>-----Original Message-----
>From: Gaëtan Rivet <grive at u256.net>
>Sent: Tuesday, April 28, 2020 12:14 AM
>To: Sunil Kumar Kori <skori at marvell.com>
>Cc: stephen at networkplumber.org; david.marchand at redhat.com; Jerin Jacob
>Kollanukkaran <jerinj at marvell.com>; dev at dpdk.org
>Subject: [EXT] Re: [dpdk-dev] [PATCH v3 1/1] bus/pci: optimise scanning with
>whitelist/blacklist
>
>Hello Sunil,
>
>As it seems that this patch does not overly alarm people using the PCI
>bus, I have a few comments on this version. Sending those a little
>earlier will allow you to change as needed before proceeding with your
>tests.
>
>On 20/04/20 12:25 +0530, Sunil Kumar Kori wrote:
>> rte_bus_scan API scans all the available PCI devices irrespective of white
>> or black listing parameters then further devices are probed based on white
>> or black listing parameters. So unnecessary CPU cycles are wasted during
>> rte_pci_scan.
>>
>> For Octeontx2 platform with core frequency 2.4 Ghz, rte_bus_scan consumes
>> around 26ms to scan around 90 PCI devices but all may not be used by the
>> application. So for the application which uses 2 NICs, rte_bus_scan
>> consumes few microseconds and rest time is saved with this patch.
>>
>> Patch restricts devices to be scanned as per below mentioned conditions:
>>  - All devices will be scanned if no parameters are passed.
>>  - Only white listed devices will be scanned if white list is available.
>>  - All devices, except black listed, will be scanned if black list is
>>    available.
>>
>> Signed-off-by: Sunil Kumar Kori <skori at marvell.com>
>> ---
>> v3:
>>  - remove __rte_experimental from private function.
>>  - remove entry from map file too.
>> v2:
>>  - Added function to validate ignorance of device based on PCI address.
>>  - Marked device validation function as experimental.
>>
>>  drivers/bus/pci/bsd/pci.c    | 13 ++++++++++++-
>>  drivers/bus/pci/linux/pci.c  |  3 +++
>>  drivers/bus/pci/pci_common.c | 34 ++++++++++++++++++++++++++++++++++
>>  drivers/bus/pci/private.h    | 11 +++++++++++
>>  4 files changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/pci/bsd/pci.c b/drivers/bus/pci/bsd/pci.c
>> index ebbfeb13a..c8d954751 100644
>> --- a/drivers/bus/pci/bsd/pci.c
>> +++ b/drivers/bus/pci/bsd/pci.c
>> @@ -338,6 +338,7 @@ rte_pci_scan(void)
>>  			.match_buf_len = sizeof(matches),
>>  			.matches = &matches[0],
>>  	};
>> +	struct rte_pci_addr pci_addr;
>>
>>  	/* for debug purposes, PCI can be disabled */
>>  	if (!rte_eal_has_pci())
>> @@ -357,9 +358,19 @@ rte_pci_scan(void)
>>  			goto error;
>>  		}
>>
>> -		for (i = 0; i < conf_io.num_matches; i++)
>> +		for (i = 0; i < conf_io.num_matches; i++) {
>> +			pci_addr.domain = matches[i].pc_sel.pc_domain;
>> +			pci_addr.bus = matches[i].pc_sel.pc_bus;
>> +			pci_addr.devid = matches[i].pc_sel.pc_dev;
>> +			pci_addr.function = matches[i].pc_sel.pc_func;
>> +
>> +			/* Check that device should be ignored or not  */
>
>This comment is unnecessary, the function name should be sufficient to
>describe the check done.
>
Ack

>> +			if (pci_addr_ignore_device(&pci_addr))
>> +				continue;
>> +
>
>As this function is almost a copy of pci_ignore_device(), with a twist
>on the addr, I think the name pci_ignore_device_addr() would be more
>helpful.
>
>I think in general however, that exposed symbols, even internals,
>should be prefixed with rte_. It was (almost) ok for
>pci_ignore_device() to forego the namespace as it is a static function
>that will be mangled on export, but that won't be the case for your
>function.
>
>Please add rte_ prefix.
>
Ack

>>  			if (pci_scan_one(fd, &matches[i]) < 0)
>>  				goto error;
>> +		}
>>
>>  		dev_count += conf_io.num_matches;
>>  	} while(conf_io.status == PCI_GETCONF_MORE_DEVS);
>> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
>> index 71b0a3053..92bdad826 100644
>> --- a/drivers/bus/pci/linux/pci.c
>> +++ b/drivers/bus/pci/linux/pci.c
>> @@ -487,6 +487,9 @@ rte_pci_scan(void)
>>  		if (parse_pci_addr_format(e->d_name, sizeof(e->d_name), &addr) != 0)
>>  			continue;
>>
>> +		if (pci_addr_ignore_device(&addr))
>> +			continue;
>> +
>>  		snprintf(dirname, sizeof(dirname), "%s/%s",
>>  				rte_pci_get_sysfs_path(), e->d_name);
>>
>> diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
>> index 3f5542076..a350a1993 100644
>> --- a/drivers/bus/pci/pci_common.c
>> +++ b/drivers/bus/pci/pci_common.c
>> @@ -589,6 +589,40 @@ pci_dma_unmap(struct rte_device *dev, void *addr, uint64_t iova, size_t len)
>>  	return -1;
>>  }
>>
>> +static struct rte_devargs *
>> +pci_addr_devargs_lookup(const struct rte_pci_addr *pci_addr)
>> +{
>> +	struct rte_devargs *devargs;
>> +	struct rte_pci_addr addr;
>> +
>> +	RTE_EAL_DEVARGS_FOREACH("pci", devargs) {
>> +		devargs->bus->parse(devargs->name, &addr);
>
>Why not use rte_pci_addr_parse directly there? The bus->parse() API,
>while stable, is one-level of indirection removed from what's done,
>it's simpler for the reader to see the intent by using the proper function.
>
>Return value should be checked. If the devargs name is not parseable,
>there are other issues at hand (memory corruption), we should skip the
>device as well or crash, but not proceed with comparison.
>
Ack

>> +		if (!rte_pci_addr_cmp(pci_addr, &addr))
>> +			return devargs;
>> +	}
>> +	return NULL;
>> +}
>> +
>> +bool
>> +pci_addr_ignore_device(const struct rte_pci_addr *pci_addr)
>> +{
>> +	struct rte_devargs *devargs = pci_addr_devargs_lookup(pci_addr);
>> +
>> +	switch (rte_pci_bus.bus.conf.scan_mode) {
>> +	case RTE_BUS_SCAN_WHITELIST:
>> +		if (devargs && devargs->policy == RTE_DEV_WHITELISTED)
>> +			return false;
>> +		break;
>> +	case RTE_BUS_SCAN_UNDEFINED:
>> +	case RTE_BUS_SCAN_BLACKLIST:
>> +		if (devargs == NULL ||
>> +		    devargs->policy != RTE_DEV_BLACKLISTED)
>> +			return false;
>> +		break;
>> +	}
>> +	return true;
>> +}
>> +
>>  static bool
>>  pci_ignore_device(const struct rte_pci_device *dev)
>>  {
>
>The logic seems ok to me.
>
>However, the logic is the same as the one in rte_pci_probe(). During
>probe, any device on the bus would have already been vetted during scan.
>It should be ok to probe all existing rte_pci_device.
>
>The same argument can be made for rte_pci_get_iommu_class() then, no
>need to use pci_ignore_device(). It is done after the scan() so it
>should be ok.
>
>And if pci_ignore_device() can be completely removed, then you should
>rename your function from rte_pci_ignore_device_addr() to
>rte_pci_ignore_device() altogether.
>
Ack
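
For reference, the decision shared by the scan-time and probe-time checks reduces to a function of the bus scan mode and the policy of the matching devargs entry, if any. The sketch below is illustrative only and is not the patch code: the sketch_ enums and sketch_ignore_device() are hypothetical stand-ins for the rte_ types, with SKETCH_POLICY_NONE meaning that no devargs entry matched the address.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bus scan mode and devargs policy. */
enum sketch_scan_mode {
	SKETCH_SCAN_UNDEFINED,
	SKETCH_SCAN_WHITELIST,
	SKETCH_SCAN_BLACKLIST,
};

enum sketch_policy {
	SKETCH_POLICY_NONE,		/* no devargs entry for this address */
	SKETCH_POLICY_WHITELISTED,
	SKETCH_POLICY_BLACKLISTED,
};

/* Return true if a device with this policy should be skipped during scan. */
static bool
sketch_ignore_device(enum sketch_scan_mode mode, enum sketch_policy policy)
{
	switch (mode) {
	case SKETCH_SCAN_WHITELIST:
		/* Only explicitly whitelisted devices are scanned. */
		return policy != SKETCH_POLICY_WHITELISTED;
	case SKETCH_SCAN_UNDEFINED:
	case SKETCH_SCAN_BLACKLIST:
		/* Everything is scanned except blacklisted devices. */
		return policy == SKETCH_POLICY_BLACKLISTED;
	}
	/* Unknown mode: ignore the device, as the quoted switch does. */
	return true;
}
```

With a single helper of this shape applied at scan time, every device left on the bus has already passed the check, which is why repeating it in probe adds nothing.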

>> diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
>> index a205d4d9f..75af786f7 100644
>> --- a/drivers/bus/pci/private.h
>> +++ b/drivers/bus/pci/private.h
>> @@ -42,6 +42,17 @@ int rte_pci_scan(void);
>>  void
>>  pci_name_set(struct rte_pci_device *dev);
>>
>> +/**
>> + * Validate whether a device with given pci address should be ignored or not.
>> + *
>> + * @param pci_addr
>> + *	PCI address of device to be validated
>> + * @return
>> + *	1: if device is to be ignored,
>> + *	0: if device is to be scanned,
>> + */
>> +bool pci_addr_ignore_device(const struct rte_pci_addr *pci_addr);
>> +
>>  /**
>>   * Add a PCI device to the PCI Bus (append to PCI Device list). This function
>>   * also updates the bus references of the PCI Device (and the generic device
>> --
>> 2.17.1
>
>Best regards,
>--
>Gaëtan

