[dpdk-dev] [RFC 0/6] New sync modes for ring

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Tue Feb 25 01:58:37 CET 2020


<snip>

> 
> On Mon, Feb 24, 2020 at 10:29 PM Stephen Hemminger
> <stephen at networkplumber.org> wrote:
> >
> > On Mon, 24 Feb 2020 11:35:09 +0000
> > Konstantin Ananyev <konstantin.ananyev at intel.com> wrote:
> >
> > > Upfront note - that RFC is not a complete patch.
> > > It introduces an ABI breakage, plus it doesn't update ring_elem code
> > > properly, etc.
> > > I plan to deal with all these things in later versions.
> > > Right now I seek an initial feedback about proposed ideas.
> > > Would also ask people to repeat performance tests (see below) on
> > > their platforms to confirm the impact.
> > >
> > > More and more customers use(/try to use) DPDK based apps within
> > > overcommitted systems (multiple acttive threads over same pysical cores):
> > > VM, container deployments, etc.
> > > One quite common problem they hit: Lock-Holder-Preemption with
> rte_ring.
> > > LHP is quite a common problem for spin-based sync primitives
> > > (spin-locks, etc.) on overcommitted systems.
> > > The situation gets much worse when some sort of fair-locking
> > > technique is used (ticket-lock, etc.).
> > > As now not only lock-owner but also lock-waiters scheduling order
> > > matters a lot.
> > > This is a well-known problem for kernel within VMs:
> > > http://www-archive.xenproject.org/files/xensummitboston08/LHP.pdf
> > > https://www.cs.hs-rm.de/~kaiser/events/wamos2017/Slides/selcuk.pdf
These slides seem to indicate that the problems are mitigated through the Hypervisor configuration. Do we still need to address the issues?

> > > The problem with rte_ring is that while head accusion is sort of
> > > un-fair locking, waiting on tail is very similar to ticket lock
> > > schema - tail has to be updated in particular order.
> > > That makes current rte_ring implementation to perform really pure on
> > > some overcommited scenarios.
> >
> > Rather than reform rte_ring to fit this scenario, it would make more
> > sense to me to introduce another primitive. The current lockless ring
> > performs very well for the isolated thread model that DPDK was built
> > around. This looks like a case of customers violating the usage model
> > of the DPDK and then being surprised at the fallout.
> 
> I agree with Stephen here.
> 
> I think, adding more runtime check in the enqueue() and dequeue() will have a
> bad effect on the low-end cores too.
> But I agree with the problem statement that in the virtualization use case, It
> may be possible to have N virtual cores runs on a physical core.
It is hard to imagine that there are data plane applications deployed in such environments. Wouldn't this affect the performance terribly?

> 
> IMO, The best solution would be keeping the ring API same and have a
> different flavor in "compile-time". Something like liburcu did for
> accommodating different flavors.
> 
> i.e urcu-qsbr.h and urcu-bp.h will identical definition of API. The application
> can simply include ONE header file in a C file based on the flavor.
> If need both at runtime. Need to have function pointer or so in the application
> and define the function in different c file by including the approaite flavor in C
> file.
> 
> #include <urcu-qsbr.h> /* QSBR RCU flavor */ #include <urcu-bp.h> /*
> Bulletproof RCU flavor */
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> >


More information about the dev mailing list