[dpdk-dev] [PATCH v7 4/5] hash: add lock-free read-write concurrency

Jerin Jacob jerin.jacob at caviumnetworks.com
Fri Nov 9 10:28:37 CET 2018


-----Original Message-----
> Date: Fri, 9 Nov 2018 01:34:25 +0000
> From: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> To: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> CC: "bruce.richardson at intel.com" <bruce.richardson at intel.com>,
>  "pablo.de.lara.guarch at intel.com" <pablo.de.lara.guarch at intel.com>,
>  "dev at dpdk.org" <dev at dpdk.org>, "yipeng1.wang at intel.com"
>  <yipeng1.wang at intel.com>, Dharmik Thakkar <Dharmik.Thakkar at arm.com>,
>  "Gavin Hu (Arm Technology China)" <Gavin.Hu at arm.com>, nd <nd at arm.com>,
>  "thomas at monjalon.net" <thomas at monjalon.net>, "ferruh.yigit at intel.com"
>  <ferruh.yigit at intel.com>, "hemant.agrawal at nxp.com"
>  <hemant.agrawal at nxp.com>, "chaozhu at linux.vnet.ibm.com"
>  <chaozhu at linux.vnet.ibm.com>, "Kapoor, Prasun" <Prasun.Kapoor at cavium.com>,
>  nd <nd at arm.com>
> Subject: RE: [dpdk-dev] [PATCH v7 4/5] hash: add lock-free read-write
>  concurrency
> 
> > > Agree. There are multiple micro-architectures in Arm eco-system. We
> > should establish few simple rules to make sure algorithms perform well on all
> > the available platforms. I established few rules in VPP and they are working
> > fine so far.
> >
> > Can you share that rules for everyone's benefit?
> >
> These are just few simple rules anyone can think of, but avoid the surprises.
> We identified a owner for each platform (we have this already in DPDK, even across platforms)
> Each patch submitted for Arm platforms is marked with -2 (VPP uses Gerrit)
> Every platform owner tests on her/his platform. -2 will be removed only if it does not cause regression on any platforms. Platform owner helps out with optimization where required as they understand their micro-architecture best. I guess this is what is supposed to happen through the review process in DPDK. But making sure everyone tests it before it gets merged avoids the surprises.

I think, The very same philosophy can be implemented with exiting mailing list
method, if

1) Author Cc all the architecture maintainers and platform owners for
any fast path change where it can introduce performance regression.
2) Author can CC the same list to request for performance check
along with test command if area of performance regression known before. 

I agree with last minute surprises are bad for both Author and platform
owner. I think, it can fixed by above scheme.

> 
> > > >
> > > > IMO, This scheme won't work. I think, we are introducing such
> > > > performance critical feature, we need to put under function pointer
> > > > scheme so that if an application does not need such feature it can use
> > plain loads.
> > > >
> > > IMO, we should do some more debugging before going into exploring other
> > options.
> >
> > Yes. But, how do we align with v18.11 release?
> >
> I think I have spent enough time optimizing the code. Please provide the feedback and I will work on completing the fix.
> 
> However, if the new patch is not satisfactory enough, we need another plan.

Based on the public release meeting held yesterday, RC3 date is on next Monday.

I would suggest:
- Send your exiting tested patch in mailing list for review. In my
  setup, The regression reduced to 5.7% from 24%
- Extend the patch for hash bulk case as well and check NO_HASH_MULTI_LOOKUP
  as zero with  EM_HASH_LOOKUP_COUNT value as 16 or 8 for arm64.

> 
> You had mentioned about using function pointers. I suggest, we use the function pointer only for lookup function. Otherwise, it will be too much of code duplication.
> When lock-free is not used, the function with no memory-orderings will be called. However, I am not sure about the function pointer overhead. But this will be a simple change.

It may not be very simple change as we need to take care secondary
process case as well, see struct rte_mempool::ops_index scheme.

Since rte_hash_lookup() already NOT a inline function, so making
it as inline and calling a function pointer inside may not attract much
overhead. But we can tell only after testing(Which may be not possible for
RC3)

I think, in future it make sense to have function pointer scheme to
avoid new APIs for different hash library and we can plugin other
proven hash library like urcu based one etc to DPDK.

https://lwn.net/Articles/573431/


More information about the dev mailing list