[dpdk-dev] [EXT] Re: [PATCH v3 1/1] bus/pci: optimise scanning with whitelist/blacklist
Sunil Kumar Kori
skori at marvell.com
Tue Apr 28 15:52:52 CEST 2020
>-----Original Message-----
>From: Gaëtan Rivet <grive at u256.net>
>Sent: Tuesday, April 28, 2020 12:14 AM
>To: Sunil Kumar Kori <skori at marvell.com>
>Cc: stephen at networkplumber.org; david.marchand at redhat.com; Jerin Jacob
>Kollanukkaran <jerinj at marvell.com>; dev at dpdk.org
>Subject: [EXT] Re: [dpdk-dev] [PATCH v3 1/1] bus/pci: optimise scanning with
>whitelist/blacklist
>
>External Email
>
>----------------------------------------------------------------------
>Hello Sunil,
>
>As it seems that this patch does not overly alarm people using the PCI
>bus, I have a few comments on this version. Sending those a little
>earlier will allow you to change as needed before proceeding with your
>tests.
>
>On 20/04/20 12:25 +0530, Sunil Kumar Kori wrote:
>> rte_bus_scan API scans all the available PCI devices irrespective of white
>> or black listing parameters then further devices are probed based on white
>> or black listing parameters. So unnecessary CPU cycles are wasted during
>> rte_pci_scan.
>>
>> For Octeontx2 platform with core frequency 2.4 Ghz, rte_bus_scan
>consumes
>> around 26ms to scan around 90 PCI devices but all may not be used by the
>> application. So for the application which uses 2 NICs, rte_bus_scan
>> consumes few microseconds and rest time is saved with this patch.
>>
>> Patch restricts devices to be scanned as per below mentioned conditions:
>> - All devices will be scanned if no parameters are passed.
>> - Only white listed devices will be scanned if white list is available.
>> - All devices, except black listed, will be scanned if black list is
>> available.
>>
>> Signed-off-by: Sunil Kumar Kori <skori at marvell.com>
>> ---
>> v3:
>> - remove __rte_experimental from private function.
>> - remove entry from map file too.
>> v2:
>> - Added function to validate ignorance of device based on PCI address.
>> - Marked device validation function as experimental.
>>
>> drivers/bus/pci/bsd/pci.c | 13 ++++++++++++-
>> drivers/bus/pci/linux/pci.c | 3 +++
>> drivers/bus/pci/pci_common.c | 34
>++++++++++++++++++++++++++++++++++
>> drivers/bus/pci/private.h | 11 +++++++++++
>> 4 files changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/bus/pci/bsd/pci.c b/drivers/bus/pci/bsd/pci.c
>> index ebbfeb13a..c8d954751 100644
>> --- a/drivers/bus/pci/bsd/pci.c
>> +++ b/drivers/bus/pci/bsd/pci.c
>> @@ -338,6 +338,7 @@ rte_pci_scan(void)
>> .match_buf_len = sizeof(matches),
>> .matches = &matches[0],
>> };
>> + struct rte_pci_addr pci_addr;
>>
>> /* for debug purposes, PCI can be disabled */
>> if (!rte_eal_has_pci())
>> @@ -357,9 +358,19 @@ rte_pci_scan(void)
>> goto error;
>> }
>>
>> - for (i = 0; i < conf_io.num_matches; i++)
>> + for (i = 0; i < conf_io.num_matches; i++) {
>> + pci_addr.domain = matches[i].pc_sel.pc_domain;
>> + pci_addr.bus = matches[i].pc_sel.pc_bus;
>> + pci_addr.devid = matches[i].pc_sel.pc_dev;
>> + pci_addr.function = matches[i].pc_sel.pc_func;
>> +
>> + /* Check that device should be ignored or not */
>
>This comment is unnecessary, the function name should be sufficient to
>describe the check done.
>
Ack
>> + if (pci_addr_ignore_device(&pci_addr))
>> + continue;
>> +
>
>As this function is almost a copy of pci_ignore_device(), with a twist
>on the addr, I think the name pci_ignore_device_addr() would be more
>helpful.
>
>I think in general however, that exposed symbols, even internals,
>should be prefixed with rte_. It was (almost) ok for
>pci_ignore_device() to forego the namespace as it is a static function
>that will be mangled on export, but that won't be the case for your
>function.
>
>Please add rte_ prefix.
>
Ack
>> if (pci_scan_one(fd, &matches[i]) < 0)
>> goto error;
>> + }
>>
>> dev_count += conf_io.num_matches;
>> } while(conf_io.status == PCI_GETCONF_MORE_DEVS);
>> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
>> index 71b0a3053..92bdad826 100644
>> --- a/drivers/bus/pci/linux/pci.c
>> +++ b/drivers/bus/pci/linux/pci.c
>> @@ -487,6 +487,9 @@ rte_pci_scan(void)
>> if (parse_pci_addr_format(e->d_name, sizeof(e->d_name),
>&addr) != 0)
>> continue;
>>
>> + if (pci_addr_ignore_device(&addr))
>> + continue;
>> +
>> snprintf(dirname, sizeof(dirname), "%s/%s",
>> rte_pci_get_sysfs_path(), e->d_name);
>>
>> diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
>> index 3f5542076..a350a1993 100644
>> --- a/drivers/bus/pci/pci_common.c
>> +++ b/drivers/bus/pci/pci_common.c
>> @@ -589,6 +589,40 @@ pci_dma_unmap(struct rte_device *dev, void
>*addr, uint64_t iova, size_t len)
>> return -1;
>> }
>>
>> +static struct rte_devargs *
>> +pci_addr_devargs_lookup(const struct rte_pci_addr *pci_addr)
>> +{
>> + struct rte_devargs *devargs;
>> + struct rte_pci_addr addr;
>> +
>> + RTE_EAL_DEVARGS_FOREACH("pci", devargs) {
>> + devargs->bus->parse(devargs->name, &addr);
>
>Why not use rte_pci_addr_parse directly there? The bus->parse() API,
>while stable, is one-level of indirection removed from what's done,
>it's simpler for the reader to see the intent by using the proper function.
>
>Return value should be checked. If the devargs name is not parseable,
>there are other issues at hand (memory corruption), we should skip the
>device as well or crash, but not proceed with comparison.
>
Ack
>> + if (!rte_pci_addr_cmp(pci_addr, &addr))
>> + return devargs;
>> + }
>> + return NULL;
>> +}
>> +
>> +bool
>> +pci_addr_ignore_device(const struct rte_pci_addr *pci_addr)
>> +{
>> + struct rte_devargs *devargs = pci_addr_devargs_lookup(pci_addr);
>> +
>> + switch (rte_pci_bus.bus.conf.scan_mode) {
>> + case RTE_BUS_SCAN_WHITELIST:
>> + if (devargs && devargs->policy == RTE_DEV_WHITELISTED)
>> + return false;
>> + break;
>> + case RTE_BUS_SCAN_UNDEFINED:
>> + case RTE_BUS_SCAN_BLACKLIST:
>> + if (devargs == NULL ||
>> + devargs->policy != RTE_DEV_BLACKLISTED)
>> + return false;
>> + break;
>> + }
>> + return true;
>> +}
>> +
>> static bool
>> pci_ignore_device(const struct rte_pci_device *dev)
>> {
>
>The logic seems ok to me.
>
>However, the logic is the same as the one in rte_pci_probe(). During
>probe, any device on the bus would have already been vetted during scan.
>It should be ok to probe all existing rte_pci_device.
>
>The same argument can be made for rte_pci_get_iommu_class() then, no
>need to use pci_ignore_device(). It is done after the scan() so it
>should be ok.
>
>And if pci_ignore_device() can be completely removed, then you should
>rename your function from rte_pci_ignore_device_addr() to
>rte_pci_ignore_device() altogether.
>
Ack
>> diff --git a/drivers/bus/pci/private.h b/drivers/bus/pci/private.h
>> index a205d4d9f..75af786f7 100644
>> --- a/drivers/bus/pci/private.h
>> +++ b/drivers/bus/pci/private.h
>> @@ -42,6 +42,17 @@ int rte_pci_scan(void);
>> void
>> pci_name_set(struct rte_pci_device *dev);
>>
>> +/**
>> + * Validate whether a device with given pci address should be ignored or
>not.
>> + *
>> + * @param pci_addr
>> + * PCI address of device to be validated
>> + * @return
>> + * 1: if device is to be ignored,
>> + * 0: if device is to be scanned,
>> + */
>> +bool pci_addr_ignore_device(const struct rte_pci_addr *pci_addr);
>> +
>> /**
>> * Add a PCI device to the PCI Bus (append to PCI Device list). This function
>> * also updates the bus references of the PCI Device (and the generic device
>> --
>> 2.17.1
>
>Best regards,
>--
>Gaëtan
More information about the dev
mailing list