[dpdk-dev] [PATCH v3 1/3] lib/lpm: not inline unnecessary functions

Medvedkin, Vladimir vladimir.medvedkin at intel.com
Fri Jul 5 12:31:36 CEST 2019


Hi Stephen,

On 28/06/2019 16:35, Stephen Hemminger wrote:
> On Fri, 28 Jun 2019 15:16:30 +0100
> "Medvedkin, Vladimir" <vladimir.medvedkin at intel.com> wrote:
>
>> Hi Honnappa,
>>
>> On 28/06/2019 14:57, Honnappa Nagarahalli wrote:
>>>> Hi all,
>>>>
>>>> On 28/06/2019 05:34, Stephen Hemminger wrote:
>>>>> On Fri, 28 Jun 2019 02:44:54 +0000
>>>>> "Ruifeng Wang (Arm Technology China)"<Ruifeng.Wang at arm.com>  wrote:
>>>>>   
>>>>>>>> Tests showed that the function inlining caused performance drop on
>>>>>>>> some x86 platforms with the memory ordering patches applied.
>>>>>>>> By force no-inline functions, the performance was better than
>>>>>>>> before on x86 and no impact to arm64 platforms.
>>>>>>>>
>>>>>>>> Suggested-by: Medvedkin Vladimir<vladimir.medvedkin at intel.com>
>>>>>>>> Signed-off-by: Ruifeng Wang<ruifeng.wang at arm.com>
>>>>>>>> Reviewed-by: Gavin Hu<gavin.hu at arm.com>
>>>>>>>     {
>>>>>>>
>>>>>>> Do you actually need to force noinline or is just taking of inline enough?
>>>>>>> In general, letting compiler decide is often best practice.
>>>>>> The force noinline is an optimization for x86 platforms to keep
>>>>>> rte_lpm_add() API performance with memory ordering applied.
>>>>> I don't think you answered my question. What does a recent version of
>>>>> GCC do if you drop the inline.
>>>>>
>>>>> Actually all the functions in rte_lpm should drop inline.
>>>> I'm agree with Stephen. If it is not a fastpath and size of function is not
>>>> minimal it is good to remove inline qualifier for other control plane functions
>>>> such as rule_add/delete/find/etc and let the compiler decide to inline it
>>>> (unless it affects performance).
>>> IMO, the rule needs to be simple. If it is control plane function, we should leave it to the compiler to decide. I do not think we need to worry too much about performance for control plane functions.
>> Control plane is not as important as data plane speed but it is still
>> important. For lpm we are talking not about initialization, but runtime
>> routes add/del related functions. If it is very slow the library will be
>> totally unusable because after it receives a route update it will be
>> blocked for a long time and route update queue would overflow.
> Control plane performance is more impacted by algorithmic choice.
> The original LPM had terrible (n^2?) control path. Current code is better.
> I had a patch using RB tree, but it was rejected because it used the
> /usr/include/bsd/sys/tree.h which added a dependency.

You're absolutely right, control plane performance mostly depends on the
algorithm. The current LPM implementation has a number of problems there.
One problem is rules_tbl[], a flat array that holds routes for control
plane purposes. Replacing it with an rb-tree solves this problem, but
there are other problems. For example, when you add a route 10.0.0.0/8
while a number of subroutes already exist in the table (for example
10.20.0.0/16), the current implementation will load tbl_entry -> do some
checks (depth, ext entry) -> conditionally store the new entry. Under
some circumstances this takes a lot of time. In fact, it only needs to
unconditionally rewrite two ranges: from 10.0.0.0 to 10.19.255.255 and
from 10.21.0.0 to 10.255.255.255. The control plane could help us get
these two ranges. The best structure for this is an lc-tree, because it
is relatively easy to traverse the subtree (described by 10.0.0.0/8) and
get the subroutes. We are working on a new implementation; hopefully it
will be ready soon.
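
To illustrate the range computation only, here is a rough sketch. It is
not the planned implementation and not existing rte_lpm code: struct
prefix, struct ip_range, depth_to_mask() and uncovered_ranges() are
hypothetical names, and the subroutes are assumed to be already sorted by
address, non-overlapping and contained in the parent prefix (in the real
design the tree traversal would supply them).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper types, not part of rte_lpm. */
struct prefix {
	uint32_t ip;    /* host byte order */
	uint8_t depth;  /* prefix length, 0..32 */
};

struct ip_range {
	uint32_t first;
	uint32_t last;  /* inclusive */
};

/* Network mask for a prefix length; depth 0 is handled separately
 * because a 32-bit shift by 32 is undefined in C. */
static uint32_t
depth_to_mask(uint8_t depth)
{
	return depth == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - depth));
}

/*
 * Fill out[] with the address ranges of (ip, depth) that are not
 * covered by any of the more specific prefixes in subs[].
 * Assumes subs[] is sorted by address, non-overlapping and inside the
 * parent prefix. out[] must have room for n_subs + 1 entries.
 * Returns the number of ranges written.
 */
static size_t
uncovered_ranges(uint32_t ip, uint8_t depth,
		const struct prefix *subs, size_t n_subs,
		struct ip_range *out)
{
	uint32_t mask = depth_to_mask(depth);
	uint32_t first = ip & mask;
	uint32_t last = first | (uint32_t)~mask;
	/* 64-bit cursor so that "one past 255.255.255.255" does not wrap. */
	uint64_t cursor = first;
	size_t n = 0;
	size_t i;

	for (i = 0; i < n_subs; i++) {
		uint32_t s_mask = depth_to_mask(subs[i].depth);
		uint32_t s_first = subs[i].ip & s_mask;
		uint32_t s_last = s_first | (uint32_t)~s_mask;

		/* Gap between the previous subroute and this one. */
		if (cursor < s_first) {
			out[n].first = (uint32_t)cursor;
			out[n].last = s_first - 1;
			n++;
		}
		cursor = (uint64_t)s_last + 1;
	}

	/* Tail of the parent prefix after the last subroute. */
	if (cursor <= last) {
		out[n].first = (uint32_t)cursor;
		out[n].last = last;
		n++;
	}
	return n;
}
```

For 10.0.0.0/8 (0x0A000000, depth 8) with the single subroute
10.20.0.0/16 (0x0A140000, depth 16) this yields the two ranges mentioned
above, and each of them can then be written unconditionally.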

-- 
Regards,
Vladimir


