[dpdk-dev] [PATCH v7 8/9] ethdev: representor iterator compare complete info
Xueming(Steven) Li
xuemingl at nvidia.com
Tue Mar 2 16:18:14 CET 2021
>-----Original Message-----
>From: Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>
>Sent: Tuesday, March 2, 2021 10:10 PM
>To: Xueming(Steven) Li <xuemingl at nvidia.com>
>Cc: dev at dpdk.org; Slava Ovsiienko <viacheslavo at nvidia.com>; Asaf Penso <asafp at nvidia.com>; NBU-Contact-Thomas Monjalon
><thomas at monjalon.net>; Ferruh Yigit <ferruh.yigit at intel.com>; Ray Kinsella <mdr at ashroe.eu>; Neil Horman
><nhorman at tuxdriver.com>
>Subject: Re: [PATCH v7 8/9] ethdev: representor iterator compare complete info
>
>On 3/2/21 2:43 PM, Xueming Li wrote:
>> The NIC can have multiple PCIe links and can be attached to multiple
>> hosts, for example the same single NIC can be shared for multiple
>> server units in the rack. On each PCIe link NIC can provide multiple
>> PFs and VFs/SFs based on these ones. The full representor identifier
>> consists of three indices - controller index, PF index, and VF or SF index (if any).
>>
>> SR-IOV and SubFunction are created on top of PF. PF index is
>> introduced because there might be multiple PFs in the bonding
>> configuration and only bonding device is probed.
>>
>> In eth representor comparator callback, ethdev representor ID was
>> compared with devarg. Since controller index and PF index not
>> compared, callback returned representor from other PF or controller.
>>
>> This patch adds new API to convert representor controller, pf and
>> vf/sf index to representor ID. Representor comparer callback convert
>> representor info into ID and compare with device representor ID.
>>
>> Signed-off-by: Xueming Li <xuemingl at nvidia.com>
>> ---
>> lib/librte_ethdev/ethdev_driver.h | 32 +++++++++++
>> lib/librte_ethdev/rte_class_eth.c | 38 ++++++++++---
>> lib/librte_ethdev/rte_ethdev.c | 91 +++++++++++++++++++++++++++++++
>> lib/librte_ethdev/version.map | 1 +
>> 4 files changed, 153 insertions(+), 9 deletions(-)
>>
>> diff --git a/lib/librte_ethdev/ethdev_driver.h
>> b/lib/librte_ethdev/ethdev_driver.h
>> index 7b0f392e34..3ff1b90b2f 100644
>> --- a/lib/librte_ethdev/ethdev_driver.h
>> +++ b/lib/librte_ethdev/ethdev_driver.h
>> @@ -1244,6 +1244,38 @@ struct rte_eth_devargs {
>> enum rte_eth_representor_type type; /* type of representor */ };
>>
>> +/**
>> + * PMD helper function to convert representor ID from location detail
>
>Full stop is missing above.
>
>Also I'm not sure in term "convert" here. It sounds a bit misleading to me. May be just "get". I.e PMD helper function to get
>representor ID by location detail.
>and
>rte_eth_representor_id_get()
>
>What do you think?
Looks good, I was struggling on the naming, 'get' looks straightforward, no conversion but lookup, thanks!
>
>> + *
>> + * Convert representor ID from controller, pf and (sf or vf).
>> + * The mapping is retrieved from rte_eth_representor_info_get().
>> + *
>> + * For backward compatibility, if no representor info, direct
>> + * map legacy VF (no controller and pf).
>> + *
>> + * @param ethdev
>> + * Handle of ethdev port.
>> + * @param repr_id
>> + * Pointer to converted representor ID.
>> + * @param type
>> + * Representor type.
>> + * @param controller
>> + * Controller ID, -1 if unspecified.
>> + * @param pf
>> + * PF port ID, -1 if unspecified.
>> + * @param representor_port
>> + * VF or SF representor port number, -1 if unspecified.
>
>Mixing input and output parameters looks bad to me.
>May I suggest to put repr_id the last?
>I.e. to have all input parameters first.
>(May be I've already mentioned it, if I missed your reply, please, repeat it once again).
Make sense
>
>> + *
>> + * @return
>> + * Negative errno value on error, 0 on success.
>> + */
>> +__rte_internal
>> +int
>> +rte_eth_representor_id_convert(const struct rte_eth_dev *ethdev,
>> + uint16_t *repr_id,
>> + enum rte_eth_representor_type type,
>> + int controller, int pf, int representor_port);
>> +
>> /**
>> * PMD helper function to parse ethdev arguments
>> *
>> diff --git a/lib/librte_ethdev/rte_class_eth.c
>> b/lib/librte_ethdev/rte_class_eth.c
>> index 051c892b40..f7b7e659e7 100644
>> --- a/lib/librte_ethdev/rte_class_eth.c
>> +++ b/lib/librte_ethdev/rte_class_eth.c
>> @@ -65,9 +65,10 @@ eth_representor_cmp(const char *key __rte_unused,
>> {
>> int ret;
>> char *values;
>> - const struct rte_eth_dev_data *data = opaque;
>> - struct rte_eth_devargs representors;
>> - uint16_t index;
>> + const struct rte_eth_dev *edev = opaque;
>> + const struct rte_eth_dev_data *data = edev->data;
>> + struct rte_eth_devargs eth_da;
>> + uint16_t id, nc, np, nf, i, c, p, f;
>>
>> if ((data->dev_flags & RTE_ETH_DEV_REPRESENTOR) == 0)
>> return -1; /* not a representor port */ @@ -76,17 +77,36 @@
>> eth_representor_cmp(const char *key __rte_unused,
>> values = strdup(value);
>> if (values == NULL)
>> return -1;
>> - memset(&representors, 0, sizeof(representors));
>> - ret = rte_eth_devargs_parse_representor_ports(values, &representors);
>> + memset(ð_da, 0, sizeof(eth_da));
>> + ret = rte_eth_devargs_parse_representor_ports(values, ð_da);
>> free(values);
>> if (ret != 0)
>> return -1; /* invalid devargs value */
>>
>> + if (eth_da.nb_mh_controllers == 0 && eth_da.nb_ports == 0 &&
>> + eth_da.nb_representor_ports == 0)
>> + return -1;
>> + nc = eth_da.nb_mh_controllers > 0 ? eth_da.nb_mh_controllers : 1;
>> + np = eth_da.nb_ports > 0 ? eth_da.nb_ports : 1;
>> + nf = eth_da.nb_representor_ports > 0 ? eth_da.nb_representor_ports :
>> +1;
>> +
>> /* Return 0 if representor id is matching one of the values. */
>> - for (index = 0; index < representors.nb_representor_ports; index++)
>> - if (data->representor_id ==
>> - representors.representor_ports[index])
>> + for (i = 0; i < nc * np * nf; ++i) {
>> + c = i / (np * nf);
>> + p = (i / nf) % np;
>> + f = i % nf;
>> + if (rte_eth_representor_id_convert(edev,
>> + &id,
>> + eth_da.type,
>> + eth_da.nb_mh_controllers == 0 ? -1 :
>> + eth_da.mh_controllers[c],
>> + eth_da.nb_ports == 0 ? -1 : eth_da.ports[p],
>> + eth_da.nb_representor_ports == 0 ? -1 :
>> + eth_da.representor_ports[f]) < 0)
>> + continue;
>> + if (data->representor_id == id)
>> return 0;
>> + }
>> return -1; /* no match */
>> }
>>
>> @@ -112,7 +132,7 @@ eth_dev_match(const struct rte_eth_dev *edev,
>>
>> ret = rte_kvargs_process(kvlist,
>> eth_params_keys[RTE_ETH_PARAM_REPRESENTOR],
>> - eth_representor_cmp, edev->data);
>> + eth_representor_cmp, (void *)(uintptr_t)edev);
>> if (ret != 0)
>> return -1;
>> /* search for representor key */
>> diff --git a/lib/librte_ethdev/rte_ethdev.c
>> b/lib/librte_ethdev/rte_ethdev.c index c88e345e7d..78cdef11be 100644
>> --- a/lib/librte_ethdev/rte_ethdev.c
>> +++ b/lib/librte_ethdev/rte_ethdev.c
>> @@ -5623,6 +5623,97 @@ rte_eth_devargs_parse(const char *dargs, struct rte_eth_devargs *eth_da)
>> return result;
>> }
>>
>> +int
>> +rte_eth_representor_id_convert(const struct rte_eth_dev *ethdev,
>> + uint16_t *repr_id,
>> + enum rte_eth_representor_type type,
>> + int controller, int pf, int representor_port) {
>> + int ret, n, i, count;
>> + struct rte_eth_representor_info *info = NULL;
>> + size_t size;
>> +
>> + if (type == RTE_ETH_REPRESENTOR_NONE)
>> + return 0;
>> + if (repr_id == NULL)
>> + return -EINVAL;
>> +
>> + /* Get PMD representor range info. */
>> + ret = rte_eth_representor_info_get(ethdev->data->port_id, NULL);
>> + if (ret < 0) {
>> + if (type == RTE_ETH_REPRESENTOR_VF && controller == -1 &&
>> + pf == -1) {
>> + /* Direct mapping for legacy VF representor. */
>> + *repr_id = representor_port;
>> + return 0;
>> + } else {
>> + return ret;
>> + }
>> + }
>> + n = ret;
>> + size = sizeof(*info) + n * sizeof(info->ranges[0]);
>> + info = calloc(1, size);
>> + if (info == NULL)
>> + return -ENOMEM;
>> + ret = rte_eth_representor_info_get(ethdev->data->port_id, info);
>> + if (ret < 0)
>> + goto out;
>> +
>> + /* Default controller and pf to caller. */
>> + if (controller == -1)
>> + controller = info->controller;
>> + if (pf == -1)
>> + pf = info->pf;
>> +
>> + /* Locate representor ID. */
>> + ret = -ENOENT;
>> + for (i = 0; i < n; ++i) {
>> + if (info->ranges[i].type != type)
>> + continue;
>> + /* PMD hit: ignore controller if -1. */
>> + if (info->ranges[i].controller != -1 &&
>> + info->ranges[i].controller != (uint16_t)controller)
>
>First of all I don't understand why 'controller' is cast to uint16_t here. Both 'controller' and range->controller have 'int' type.
>
Yes, I should have removed the cast.
>Second, I'm sorry, but I still don't understand the ignore logic. Why information retrieval may return -1 for controller and/or PF? What
>does it mean?
In some circumstances, PMD don't need the controller and PF, i.e. no bonding and multi-host, it locate device from PCI address.
Since openstack can't tell different case, these info are sent without difference, that's why PMD want to ignore them.
But I have a workaround in PMD now, will remove it in next version. But keeping flexibility might be a good choice IMHO.
>
>Above fallback to the device controller and pf if unspecified by the caller look good and make sense.
>
>> + continue;
>> + count = info->ranges[i].id_end - info->ranges[i].id_base + 1;
>> + switch (info->ranges[i].type) {
>> + case RTE_ETH_REPRESENTOR_PF:
>> + if (pf >= info->ranges[i].pf + count)
>> + continue;
>
>Condition must be stricter. We must ensure that requested port within both boundaries of the range.
>I.e. representor_port should not be smaller than
>info->ranges[i].pf.
>It is required for below subtraction.
Good catch!
>
>> + *repr_id = info->ranges[i].id_base +
>> + (pf - info->ranges[i].pf);
>> + ret = 0;
>> + goto out;
>> + case RTE_ETH_REPRESENTOR_VF:
>> + /* PMD hit: ignore pf if -1. */
>> + if (info->ranges[i].pf != -1 &&
>> + info->ranges[i].pf != (uint16_t)pf)
>
>Same as above. Cast seems to be not required.
>
>> + continue;
>> + if (representor_port >= info->ranges[i].vf + count)
>
>Same as above.
>
>> + continue;
>> + *repr_id = info->ranges[i].id_base +
>> + (representor_port - info->ranges[i].vf);
>> + ret = 0;
>> + goto out;
>> + case RTE_ETH_REPRESENTOR_SF:
>> + /* PMD hit: ignore pf if -1. */
>> + if (info->ranges[i].pf != -1 &&
>> + info->ranges[i].pf != (uint16_t)pf)
>
>Same as above. Cast seems to be not required.
>
>> + continue;
>> + if (representor_port >= info->ranges[i].sf + count)
>> + continue;
>
>Same as above.
>
>> + *repr_id = info->ranges[i].id_base +
>> + (representor_port - info->ranges[i].sf);
>> + ret = 0;
>> + goto out;
>> + default:
>> + break;
>> + }
>> + }
>> +out:
>> + free(info);
>> + return ret;
>> +}
>> +
>> static int
>> eth_dev_handle_port_list(const char *cmd __rte_unused,
>> const char *params __rte_unused,
>> diff --git a/lib/librte_ethdev/version.map
>> b/lib/librte_ethdev/version.map index bb6f7436c2..2891f5734e 100644
>> --- a/lib/librte_ethdev/version.map
>> +++ b/lib/librte_ethdev/version.map
>> @@ -268,6 +268,7 @@ INTERNAL {
>> rte_eth_hairpin_queue_peer_bind;
>> rte_eth_hairpin_queue_peer_unbind;
>> rte_eth_hairpin_queue_peer_update;
>> + rte_eth_representor_id_convert;
>> rte_eth_switch_domain_alloc;
>> rte_eth_switch_domain_free;
>> };
>>
More information about the dev
mailing list