[dpdk-dev] [PATCH v2] ethdev: fix xstats retrieve by id API

Ferruh Yigit ferruh.yigit at intel.com
Thu Oct 19 19:31:37 CEST 2017


On 10/19/2017 8:03 AM, Ivan Malov wrote:
> On Thu, Oct 19, 2017 at 11:06:37AM +0000, Van Haaren, Harry wrote:
>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ferruh Yigit
>>> Sent: Wednesday, October 18, 2017 10:49 PM
>>> To: Ivan Malov <Ivan.Malov at oktetlabs.ru>; Daly, Lee <lee.daly at intel.com>;
>>> thomas at monjalon.net
>>> Cc: dev at dpdk.org; Kozak, KubaX <kubax.kozak at intel.com>
>>> Subject: Re: [dpdk-dev] [PATCH v2] ethdev: fix xstats retrieve by id API
>>>
>>> On 10/18/2017 4:10 AM, Ivan Malov wrote:
>>>> On 10/12/2017 2:31 PM, Lee Daly wrote:
>>>>> From: Lee <lee.daly at intel.com>
>>>>>
>>>>> Fix xstats functions, rte_eth_xstats_get_names_by_id()
>>>>> and rte_eth_xstats_get_by_id(), in current implementation
>>>>> ethdev level reads all xstat values and filters out
>>>>> the ones requested by the application. This behavior doesn't
>>>>> benefit from PMD ops and doesn't provide the benefit the
>>>>> API was created in the first place for. APIs are also unnecessarily
>>>>> complicated. Both APIs have different returns for the same params.
>>>>>
>>>>> In this fix, instead of reading all the stats and finding the
>>>>> requested value, drivers can provide ops to get selected xstats.
>>>>> API no longer crashes with certain params,
>>>>>
>>>>> rte_eth_get_by_id returned seg fault with
>>>>> "ids = NULL && values != NULL && n<max"
>>>>> rte_eth_get_names_by_id returned seg fault with
>>>>> "ids = NULL && values != NULL && n=0"
>>>>> These now return max number of stats available, matching the other API.
>>>>>
>>>>> rte_eth_get_by_id returned seg fault with
>>>>> "ids != NULL && values = NULL && n<max"
>>>>> This now returns -22,(EINVAL).
>>>>>
>>>>> Standardized variable/parameter names between the 2 APIs.
>>>>>
>>>>> Overall code complexity reduced.
>>>>>
>>>>> Fixes: 79c913a42f0e ("ethdev: retrieve xstats by ID")
>>>>> Cc: kubax.kozak at intel.com
>>>>>
>>>>> Signed-off-by: Lee Daly <lee.daly at intel.com>
>>>>> Reviewed-by: Ferruh Yigit <ferruh.yigit at intel.com>
>>>>
>>>> I have a serious concern regarding the patch. There is a common scenario
>>>> of rte_eth_xstats_get_names_by_id() usage, and the patch breaks it.
>>>> Typically, the function is called with 'ids = NULL' and 'xstats_names =
>>> NULL'
>>>> in order to get the number of figures. Then the function is called one
>>> more
>>>> time with appropriate storage for 'xstats_names'. According to the patch,
>>>> on the former step get_xstats_count() is called to count the figures
>>> available.
>>>> The resulting number counts for PMD statistics + RTE_NB_STATS + some
>>> amount of
>>>> per-queue figures. However, on the latter step the following may take
>>> place:
>>>>
>>>>> +	if (dev->dev_ops->xstats_get_names_by_id != NULL)
>>>>> +		return (*dev->dev_ops->xstats_get_names_by_id)(
>>>>> +				dev, xstats_names, ids, size);
>>>>
>>>> This obviously means that in the case when 'xstats_get_names_by_id' is
>>> present,
>>>> it will be called directly, and RTE statistics will not be filled in the
>>> storage
>>>> before the PMD figures. Accordingly, the value returned by the function in
>>> this
>>>> case will count for PMD figures only (i.e. will be less than the value
>>> obtained
>>>> on the first step). This is an obvious malfunction, and it would be
>>> desirable
>>>> to provide a fix for that. I hope for your understanding.
>>>
>>> Hi Ivan,
>>>
>>> You are right, this is broken.
>>>
>>> But the problem is, ethdev API works with ids which are "basic + xstat" and
>>> PMD dev_ops
>>> works with ids which are "xstat". This mismatch ends up
>>> xstats_get_names_by_id() dev_ops
>>> being called to get all values, and filtering done in ethdev API, as done in
>>> previous
>>> implementation.
>>>
>>> If we use xstats_get_names_by_id() to get all values, why we have it at all,
>>> there is
>>> already a dev_ops to get all values.
>>>
>>> I believe we have two options,
>>>
>>> 1- Remove this by _by_id() dev_ops, but keep ethdev APIs, and APIs does the
>>> filtering of
>>> the ids. This may be missing if any PMD has an optimal way of getting subset
>>> of ids, this
>>> is the main reason of the devops.
>>
>>
>> Lets not remove the _by_id() opts, because that removes the possibility for PMDs to optimize the get_by_id() case.
>>
>>
>>> 2- Convert ids for PMD usage, what do you think about following fix:
>>> (same thing for rte_eth_xstats_get_by_id() also of course)
>>
>>
>> A quick test of this patch allows proc-info to work again to dump xstats of a primary, eg testpmd. Run testpmd first and then proc-info:
>>
>> testpmd: ./testpmd -- -i
>> proc-info: ./dpdk-procinfo  -- --xstats
>>
>> Before patch:
>> Cannot get xstat names
>> Cannot get xstat names
>>
>> After patch:
>> <correct xstats output>
>>
>> I suggest we use Ferruh's fix, and apply ASAP unless @Ivan has a better solution/suggestion?
>>
>>
>>
>>>   diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>>>   index 0b1e92894..1145539ee 100644
>>>   --- a/lib/librte_ether/rte_ethdev.c
>>>   +++ b/lib/librte_ether/rte_ethdev.c
>>>   @@ -1631,6 +1631,7 @@ rte_eth_xstats_get_names_by_id(uint16_t port_id,
>>>    {
>>>           struct rte_eth_xstat_name *xstats_names_copy;
>>>           unsigned int expected_entries;
>>>   +       unsigned int all_ids_from_pmd = 1;
>>>           struct rte_eth_dev *dev;
>>>           unsigned int i;
>>>
>>>   @@ -1649,9 +1650,28 @@ rte_eth_xstats_get_names_by_id(uint16_t port_id,
>>>           if (ids && !xstats_names)
>>>                   return -EINVAL;
>>>
>>>   -       if (dev->dev_ops->xstats_get_names_by_id != NULL)
>>>   -               return (*dev->dev_ops->xstats_get_names_by_id)(
>>>   -                               dev, xstats_names, ids, size);
>>>   +       if (ids && dev->dev_ops->xstats_get_names_by_id != NULL && size >
>>> 0) {
>>>   +               uint64_t ids_copy[size];
>>>   +               unsigned int pmd_count;
>>>   +               unsigned int count;
>>>   +
>>>   +               pmd_count = (*dev->dev_ops->xstats_get_names_by_id)(dev,
>>> NULL,
>>>   +                                       NULL, 0);
>>>   +               count = expected_entries - pmd_count;
>>>   +
>>>   +               for (i = 0; i < size; i++) {
>>>   +                       if (ids[i] < count) {
>>>   +                               all_ids_from_pmd = 0;
>>>   +                               break;
>>>   +                       }
>>>   +
>>>   +                       ids_copy[i] = ids[i] - count;
>>>   +               }
>>>   +
>>>   +               if (all_ids_from_pmd)
>>>   +                       return (*dev->dev_ops-
>>>> xstats_get_names_by_id)(dev,
>>>   +                                       xstats_names, ids_copy, size);
>>>   +       }
>>>
>>>           /* Retrieve all stats */
>>>           if (!ids) {
> 
> The point is that I've been doing a custom check since the time when the "xstats by ID" API had been added and it have always been OK.
> Now the custom check fails.
> 
> I'm afraid the workaround suggested by Ferruh is not quite comprehensive because it only
> affects rte_eth_xstats_get_names_by_id(). At the same time rte_eth_xstats_get_by_id() also needs a fix since it contains
> the same mistake:

I already put a note into email to say rte_eth_xstats_get_by_id() also needs to
be updated, the diff was just to give the idea. So that part is OK.

> 
>> +	if (dev->dev_ops->xstats_get_by_id != NULL)
>> +		return (*dev->dev_ops->xstats_get_by_id)(dev, ids, values,
>> +			size);
> 
> Anyhow, the overall design might be slightly poor, I presume.

This is to let _by_id() dev_ops to be used to get partial data, which they are
introduced for, any better idea is welcome :)

> 
> First of all, calling get_xstats_count() at the beginning of either function to obtain 'expected_entries' and then
> calling (*dev->dev_ops->xstats_get_names_by_id) *just* to calculate 'count = expected_entries - pmd_count' (like in the hotfix)
> is so bizarre because, if I recall correctly, the aim of the API is to optimise accessing arbitrary figures. It might be expensive to call
> (*dev->dev_ops->xstats_get_names_by_id) multiple times since PMD's implementation may have an internal loop counting stats
> supported by a particular adapter.
> 
> To put it together, why isn't it possible to move
> 
>>        count += RTE_NB_STATS;
>>        count += RTE_MIN(dev->data->nb_rx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
>>                 RTE_NB_RXQ_STATS;
>>        count += RTE_MIN(dev->data->nb_tx_queues, RTE_ETHDEV_QUEUE_STAT_CNTRS) *
>>                 RTE_NB_TXQ_STATS;
> 
> from get_xstats_count() and make it an individual inline function to be used in such cases (like in the hotfix) to get the "count" for basic stats?

That was something I thought, but wanted to make _by_id() function simple.
But converting above into inline function and use make sense.

> 
> Secondly, the for() loop (both in the hotfix and in the very original version of the API) is dubious as well.
> I mean, it would be better to make an agreement (restriction) in the API that all "basic" xstat IDs (if any) must be specified *first* in 'ids' array in ascending order.
> Then it would be possible to use a for() loop which will fill in basic stats or stat_names (if any), then break on the first "non-basic" ID (i.e. 'ids[i] == count'),
> and then (*dev->dev_ops->xstats_get_names_by_id) could be called with &ids[i], 'size - i', '&xstat_names[i]' arguments AND an additional (new) argument 'count' (or 'basic_count')
> so that the PMD would be able to fill in the rest of the array. The new 'basic_count' argument would provide an idea of the first ID owned by PMD.

There is no agreement in the API to ask ids should be ordered, and at this point
we can't add this, this will be API break.

And I was about mention about same thing, as far as I can see sfc has this
assumption that ids will be sorted, just that is not guaranteed. Since these
APIs were not really functioning well and always used to get all data, you may
not recognized any problem yet.

> 
> I admit that my suggestion might sound like a big controversy. So if nobody likes it, the proposed hotfix will be OK for me provided that
> it covers rte_eth_xstats_get_by_id() as well. Right now the hotfix is incomplete.

I will send a proper patch.

Thanks,
ferruh


More information about the dev mailing list