[dpdk-dev] [PATCH v1 0/3] MCS queued lock implementation
Honnappa Nagarahalli
Honnappa.Nagarahalli at arm.com
Wed Jun 5 22:48:35 CEST 2019
+David
(had similar questions)
> -----Original Message-----
> From: Stephen Hemminger <stephen at networkplumber.org>
> Sent: Wednesday, June 5, 2019 11:48 AM
> To: Phil Yang (Arm Technology China) <Phil.Yang at arm.com>
> Cc: dev at dpdk.org; thomas at monjalon.net; jerinj at marvell.com;
> hemant.agrawal at nxp.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu at arm.com>; nd <nd at arm.com>
> Subject: Re: [dpdk-dev] [PATCH v1 0/3] MCS queued lock implementation
>
> On Wed, 5 Jun 2019 23:58:45 +0800
> Phil Yang <phil.yang at arm.com> wrote:
>
> > This patch set adds the MCS lock library and its unit test.
> >
> > The MCS lock (proposed by John M. Mellor-Crummey and Michael L. Scott)
> > provides scalability by spinning on a CPU/thread-local variable, which
> > avoids expensive cache line bouncing. It provides fairness by maintaining
> > a list of acquirers and passing the lock to each CPU/thread in the order
> > in which they requested it.
> >
> > References:
> > 1. http://web.mit.edu/6.173/www/currentsemester/readings/R06-scalable-synchronization-1991.pdf
> > 2. https://lwn.net/Articles/590243/
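
For readers less familiar with the algorithm, below is a minimal sketch of the
MCS queueing idea using C11 atomics. It is for illustration only; the type and
function names are made up here and are not the rte_mcslock API proposed in
this series.

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct mcs_node {
        _Atomic(struct mcs_node *) next; /* successor in the wait queue */
        atomic_bool locked;              /* flag each waiter spins on locally */
};

/* The lock itself is just a pointer to the tail of the waiter queue. */
typedef _Atomic(struct mcs_node *) mcs_lock_t;

static void
mcs_lock(mcs_lock_t *lock, struct mcs_node *me)
{
        struct mcs_node *prev;

        atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
        atomic_store_explicit(&me->locked, true, memory_order_relaxed);

        /* Append ourselves to the tail of the queue. */
        prev = atomic_exchange_explicit(lock, me, memory_order_acq_rel);
        if (prev == NULL)
                return; /* lock was free, we own it now */

        /* Link in behind our predecessor, then spin on our own node
         * instead of a shared location - no cache line bouncing. */
        atomic_store_explicit(&prev->next, me, memory_order_release);
        while (atomic_load_explicit(&me->locked, memory_order_acquire))
                ;
}

static void
mcs_unlock(mcs_lock_t *lock, struct mcs_node *me)
{
        struct mcs_node *succ =
                atomic_load_explicit(&me->next, memory_order_acquire);

        if (succ == NULL) {
                /* No visible successor: try to swing the tail back to empty. */
                struct mcs_node *expected = me;
                if (atomic_compare_exchange_strong_explicit(lock, &expected,
                        NULL, memory_order_acq_rel, memory_order_acquire))
                        return; /* queue empty, lock released */
                /* A successor is in the middle of enqueuing; wait for it. */
                while ((succ = atomic_load_explicit(&me->next,
                                memory_order_acquire)) == NULL)
                        ;
        }
        /* Hand the lock over to the next waiter in FIFO order. */
        atomic_store_explicit(&succ->locked, false, memory_order_release);
}

Each waiter spins only on the 'locked' flag inside its own node, which keeps
the cache traffic local, and the queue guarantees FIFO handover.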
> >
> > Micro-benchmarking result:
> > ------------------------------------------------------------------------------------------------
> >            MCS lock           |            spinlock            |          ticket lock
> > ------------------------------+--------------------------------+-------------------------------
> > Test with lock on 13 cores... | Test with lock on 14 cores...  | Test with lock on 14 cores...
> > Core [15] Cost Time = 22426 us| Core [14] Cost Time = 47974 us | Core [14] cost time = 66761 us
> > Core [16] Cost Time = 22382 us| Core [15] Cost Time = 46979 us | Core [15] cost time = 66766 us
> > Core [17] Cost Time = 22294 us| Core [16] Cost Time = 46044 us | Core [16] cost time = 66761 us
> > Core [18] Cost Time = 22412 us| Core [17] Cost Time = 28793 us | Core [17] cost time = 66767 us
> > Core [19] Cost Time = 22407 us| Core [18] Cost Time = 48349 us | Core [18] cost time = 66758 us
> > Core [20] Cost Time = 22436 us| Core [19] Cost Time = 19381 us | Core [19] cost time = 66766 us
> > Core [21] Cost Time = 22414 us| Core [20] Cost Time = 47914 us | Core [20] cost time = 66763 us
> > Core [22] Cost Time = 22405 us| Core [21] Cost Time = 48333 us | Core [21] cost time = 66766 us
> > Core [23] Cost Time = 22435 us| Core [22] Cost Time = 38900 us | Core [22] cost time = 66749 us
> > Core [24] Cost Time = 22401 us| Core [23] Cost Time = 45374 us | Core [23] cost time = 66765 us
> > Core [25] Cost Time = 22408 us| Core [24] Cost Time = 16121 us | Core [24] cost time = 66762 us
> > Core [26] Cost Time = 22380 us| Core [25] Cost Time = 42731 us | Core [25] cost time = 66768 us
> > Core [27] Cost Time = 22395 us| Core [26] Cost Time = 29439 us | Core [26] cost time = 66768 us
> >                               | Core [27] Cost Time = 38071 us | Core [27] cost time = 66767 us
> > ------------------------------+--------------------------------+-------------------------------
> > Total Cost Time = 291195 us   | Total Cost Time = 544403 us    | Total cost time = 934687 us
> > ------------------------------------------------------------------------------------------------
>
> From the user's point of view, there need to be clear recommendations on
> which lock type to use.
I think data about fairness needs to be added to this comparison as well, especially for the ticket lock. (The per-core times above already hint at this: the ticket lock's times are nearly identical across cores, while the spinlock's range from 16121 us to 48349 us.) IMO, we should consider fairness and space (cache lines) along with cycles.
>
> And if one of the lock types is always slower it should be deprecated long term.
The ticket lock can be a drop-in replacement for spinlock. Gavin is working on a patch that will make the ticket lock the backend for spinlock through a configuration option, but the performance impact needs to be considered.
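
As a rough illustration of the substitution idea (the build option name and
wrapper macros below are hypothetical, not Gavin's actual patch), call sites
could keep using one lock API while the backend is selected at compile time:

#include <rte_spinlock.h>
#include <rte_ticketlock.h>

#ifdef RTE_USE_TICKETLOCK_BACKEND /* assumed config option, name made up */
typedef rte_ticketlock_t lock_backend_t;
#define lock_backend_init(l)   rte_ticketlock_init(l)
#define lock_backend_lock(l)   rte_ticketlock_lock(l)
#define lock_backend_unlock(l) rte_ticketlock_unlock(l)
#else
typedef rte_spinlock_t lock_backend_t;
#define lock_backend_init(l)   rte_spinlock_init(l)
#define lock_backend_lock(l)   rte_spinlock_lock(l)
#define lock_backend_unlock(l) rte_spinlock_unlock(l)
#endif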