[dpdk-dev] [PATCH 2/2] lpm: hide internal data

Medvedkin, Vladimir vladimir.medvedkin at intel.com
Wed Oct 14 15:10:15 CEST 2020



On 13/10/2020 20:48, Michel Machado wrote:
> On 10/13/20 3:06 PM, Medvedkin, Vladimir wrote:
>>
>>
>> On 13/10/2020 18:46, Michel Machado wrote:
>>> On 10/13/20 11:41 AM, Medvedkin, Vladimir wrote:
>>>> Hi Michel,
>>>>
>>>> Could you please describe a condition when LPM gets inconsistent? As 
>>>> I can see if there is no free tbl8 it will return -ENOSPC.
>>>
>>>     Consider this simple example, we need to add the following two 
>>> prefixes with different next hops: 10.99.0.0/16, 18.99.99.128/25. If 
>>> the LPM table is out of tbl8s, the second prefix is not added and 
>>> Gatekeeper will make decisions in violation of the policy. The data 
>>> structure of the LPM table is consistent, but its content 
>>> inconsistent with the policy.
>>
>> Aha, thanks. So do I understand correctly that you need to add a set 
>> of routes atomically (either the entire set is installed or nothing)?
> 
>     Yes.
> 
>> If so, then I would suggest having 2 lpm and switching them atomically 
>> after a successful addition. As for now, even if you have enough 
>> tbl8's, routes are installed non atomically, i.e. there will be a time 
>> gap between adding two routes, so in this time interval the table will 
>> be inconsistent with the policy.
>> Also, if new lpm algorithms are added to the DPDK, they won't have 
>> such a thing as tbl8.
> 
>     Our code already deals with synchronization.

OK, so my suggestion here would be to add new routes to the shadow copy 
of the lpm, and if it returns -ENOSPC, than create a new LPM with double 
amount of tbl8's and add all the routes to it. Then switch the 
active-shadow LPM pointers. In this case you'll always add a bulk of 
routes atomically.

> 
>>>     We minimize the need of replacing a LPM table by allocating LPM 
>>> tables with the double of what we need (see example here 
>>> https://github.com/AltraMayor/gatekeeper/blob/95d1d6e8201861a0d0c698bfd06ad606674f1e07/lua/examples/policy.lua#L172-L183), 
>>> but the code must be ready for unexpected needs that may arise in 
>>> production.
>>>
>>
>> Usually, the table is initialized with a large enough number of 
>> entries, enough to add a possible number of routes. One tbl8 group 
>> takes up 1Kb of memory which is nothing comparing to the size of tbl24 
>> which is 64Mb.
> 
>     When the prefixes come from BGP, initializing a large enough table 
> is fine. But when prefixes come from threat intelligence, the number of 
> prefixes can vary wildly and the number of prefixes above 24 bits are 
> way more common.
> 
>> P.S. consider using rte_fib library, it has a number of advantages 
>> over LPM. You can replace the loop in __lookup_fib_bulk() with a bulk 
>> lookup call and this will probably increase the speed.
> 
>     I'm not aware of the rte_fib library. The only documentation that I 
> found on Google was https://doc.dpdk.org/api/rte__fib_8h.html and it 
> just says "FIB (Forwarding information base) implementation for IPv4 
> Longest Prefix Match".

That's true, I'm going to add programmer's guide soon.
Although the fib API is very similar to existing LPM.

> 
>>>>
>>>> On 13/10/2020 15:58, Michel Machado wrote:
>>>>> Hi Kevin,
>>>>>
>>>>>     We do need fields max_rules and number_tbl8s of struct rte_lpm, 
>>>>> so the removal would force us to have another patch to our local 
>>>>> copy of DPDK. We'd rather avoid this new local patch because we 
>>>>> wish to eventually be in sync with the stock DPDK.
>>>>>
>>>>>     Those fields are needed in Gatekeeper because we found a 
>>>>> condition in an ongoing deployment in which the entries of some LPM 
>>>>> tables may suddenly change a lot to reflect policy changes. To 
>>>>> avoid getting into a state in which the LPM table is inconsistent 
>>>>> because it cannot fit all the new entries, we compute the needed 
>>>>> parameters to support the new entries, and compare with the current 
>>>>> parameters. If the current table doesn't fit everything, we have to 
>>>>> replace it with a new LPM table.
>>>>>
>>>>>     If there were a way to obtain the struct rte_lpm_config of a 
>>>>> given LPM table, it would cleanly address our need. We have the 
>>>>> same need in IPv6 and have a local patch to work around it (see 
>>>>> https://github.com/cjdoucette/dpdk/commit/3eaf124a781349b8ec8cd880db26a78115cb8c8f). 
>>>>> Thus, an IPv4 and IPv6 solution would be best.
>>>>>
>>>>>     PS: I've added Qiaobin Fu, another Gatekeeper maintainer, to 
>>>>> this disscussion.
>>>>>
>>>>> [ ]'s
>>>>> Michel Machado
>>>>>
>>>>> On 10/13/20 9:53 AM, Kevin Traynor wrote:
>>>>>> Hi Gatekeeper maintainers (I think),
>>>>>>
>>>>>> fyi - there is a proposal to remove some members of a struct in 
>>>>>> DPDK LPM
>>>>>> API that Gatekeeper is using [1]. It would be only from DPDK 20.11 
>>>>>> but
>>>>>> as it's an LTS I guess it would probably hit Debian in a few months.
>>>>>>
>>>>>> The full thread is here:
>>>>>> http://inbox.dpdk.org/dev/20200907081518.46350-1-ruifeng.wang@arm.com/ 
>>>>>>
>>>>>>
>>>>>> Maybe you can take a look and tell us if they are needed in 
>>>>>> Gatekeeper
>>>>>> or you can workaround it?
>>>>>>
>>>>>> thanks,
>>>>>> Kevin.
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/AltraMayor/gatekeeper/blob/master/gt/lua_lpm.c#L235-L248 
>>>>>>
>>>>>>
>>>>>> On 09/10/2020 07:54, Ruifeng Wang wrote:
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Kevin Traynor <ktraynor at redhat.com>
>>>>>>>> Sent: Wednesday, September 30, 2020 4:46 PM
>>>>>>>> To: Ruifeng Wang <Ruifeng.Wang at arm.com>; Medvedkin, Vladimir
>>>>>>>> <vladimir.medvedkin at intel.com>; Bruce Richardson
>>>>>>>> <bruce.richardson at intel.com>
>>>>>>>> Cc: dev at dpdk.org; Honnappa Nagarahalli
>>>>>>>> <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>
>>>>>>>> Subject: Re: [dpdk-dev] [PATCH 2/2] lpm: hide internal data
>>>>>>>>
>>>>>>>> On 16/09/2020 04:17, Ruifeng Wang wrote:
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Medvedkin, Vladimir <vladimir.medvedkin at intel.com>
>>>>>>>>>> Sent: Wednesday, September 16, 2020 12:28 AM
>>>>>>>>>> To: Bruce Richardson <bruce.richardson at intel.com>; Ruifeng Wang
>>>>>>>>>> <Ruifeng.Wang at arm.com>
>>>>>>>>>> Cc: dev at dpdk.org; Honnappa Nagarahalli
>>>>>>>>>> <Honnappa.Nagarahalli at arm.com>; nd <nd at arm.com>
>>>>>>>>>> Subject: Re: [PATCH 2/2] lpm: hide internal data
>>>>>>>>>>
>>>>>>>>>> Hi Ruifeng,
>>>>>>>>>>
>>>>>>>>>> On 15/09/2020 17:02, Bruce Richardson wrote:
>>>>>>>>>>> On Mon, Sep 07, 2020 at 04:15:17PM +0800, Ruifeng Wang wrote:
>>>>>>>>>>>> Fields except tbl24 and tbl8 in rte_lpm structure have no 
>>>>>>>>>>>> need to
>>>>>>>>>>>> be exposed to the user.
>>>>>>>>>>>> Hide the unneeded exposure of structure fields for better ABI
>>>>>>>>>>>> maintainability.
>>>>>>>>>>>>
>>>>>>>>>>>> Suggested-by: David Marchand <david.marchand at redhat.com>
>>>>>>>>>>>> Signed-off-by: Ruifeng Wang <ruifeng.wang at arm.com>
>>>>>>>>>>>> Reviewed-by: Phil Yang <phil.yang at arm.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>    lib/librte_lpm/rte_lpm.c | 152
>>>>>>>>>>>> +++++++++++++++++++++++---------------
>>>>>>>>>> -
>>>>>>>>>>>>    lib/librte_lpm/rte_lpm.h |   7 --
>>>>>>>>>>>>    2 files changed, 91 insertions(+), 68 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>> <snip>
>>>>>>>>>>>> diff --git a/lib/librte_lpm/rte_lpm.h 
>>>>>>>>>>>> b/lib/librte_lpm/rte_lpm.h
>>>>>>>>>>>> index 03da2d37e..112d96f37 100644
>>>>>>>>>>>> --- a/lib/librte_lpm/rte_lpm.h
>>>>>>>>>>>> +++ b/lib/librte_lpm/rte_lpm.h
>>>>>>>>>>>> @@ -132,17 +132,10 @@ struct rte_lpm_rule_info {
>>>>>>>>>>>>
>>>>>>>>>>>>    /** @internal LPM structure. */
>>>>>>>>>>>>    struct rte_lpm {
>>>>>>>>>>>> -    /* LPM metadata. */
>>>>>>>>>>>> -    char name[RTE_LPM_NAMESIZE];        /**< Name of the 
>>>>>>>>>>>> lpm. */
>>>>>>>>>>>> -    uint32_t max_rules; /**< Max. balanced rules per lpm. */
>>>>>>>>>>>> -    uint32_t number_tbl8s; /**< Number of tbl8s. */
>>>>>>>>>>>> -    struct rte_lpm_rule_info rule_info[RTE_LPM_MAX_DEPTH]; 
>>>>>>>>>>>> /**<
>>>>>>>>>> Rule info table. */
>>>>>>>>>>>> -
>>>>>>>>>>>>        /* LPM Tables. */
>>>>>>>>>>>>        struct rte_lpm_tbl_entry 
>>>>>>>>>>>> tbl24[RTE_LPM_TBL24_NUM_ENTRIES]
>>>>>>>>>>>>                __rte_cache_aligned; /**< LPM tbl24 table. */
>>>>>>>>>>>>        struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
>>>>>>>>>>>> -    struct rte_lpm_rule *rules_tbl; /**< LPM rules. */
>>>>>>>>>>>>    };
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Since this changes the ABI, does it not need advance notice?
>>>>>>>>>>>
>>>>>>>>>>> [Basically the return value point from rte_lpm_create() will be
>>>>>>>>>>> different, and that return value could be used by 
>>>>>>>>>>> rte_lpm_lookup()
>>>>>>>>>>> which as a static inline function will be in the binary and 
>>>>>>>>>>> using
>>>>>>>>>>> the old structure offsets.]
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Agree with Bruce, this patch breaks ABI, so it can't be accepted
>>>>>>>>>> without prior notice.
>>>>>>>>>>
>>>>>>>>> So if the change wants to happen in 20.11, a deprecation notice 
>>>>>>>>> should
>>>>>>>>> have been added in 20.08.
>>>>>>>>> I should have added a deprecation notice. This change will have 
>>>>>>>>> to wait for
>>>>>>>> next ABI update window.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Do you plan to extend? or is this just speculative?
>>>>>>> It is speculative.
>>>>>>>
>>>>>>>>
>>>>>>>> A quick scan and there seems to be several projects using some 
>>>>>>>> of these
>>>>>>>> members that you are proposing to hide. e.g. BESS, NFF-Go, DPVS,
>>>>>>>> gatekeeper. I didn't look at the details to see if they are 
>>>>>>>> really needed.
>>>>>>>>
>>>>>>>> Not sure how much notice they'd need or if they update DPDK 
>>>>>>>> much, but I
>>>>>>>> think it's worth having a closer look as to how they use lpm and 
>>>>>>>> what the
>>>>>>>> impact to them is.
>>>>>>> Checked the projects listed above. BESS, NFF-Go and DPVS don't 
>>>>>>> access the members to be hided.
>>>>>>> They will not be impacted by this patch.
>>>>>>> But Gatekeeper accesses the rte_lpm internal members that to be 
>>>>>>> hided. Its compilation will be broken with this patch.
>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>> Ruifeng
>>>>>>>>>>>>    /** LPM RCU QSBR configuration structure. */
>>>>>>>>>>>> -- 
>>>>>>>>>>>> 2.17.1
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> Regards,
>>>>>>>>>> Vladimir
>>>>>>>
>>>>>>
>>>>
>>

-- 
Regards,
Vladimir


More information about the dev mailing list