[dpdk-dev] [PATCH] librte_eal: fix mcslock hang on weak memory

Phil Yang Phil.Yang at arm.com
Fri Aug 28 11:19:58 CEST 2020


Hi Diogo,



Thanks for your explanation.



As documented in the Arm ARM<https://developer.arm.com/documentation/ddi0487/fc>, section B2.9.5 "Load-Exclusive and Store-Exclusive instruction usage restrictions":

" Between the Load-Exclusive and the Store-Exclusive, there are no explicit memory accesses, preloads,

direct or indirect System register writes, address translation instructions, cache or TLB maintenance

instructions, exception generating instructions, exception returns, or indirect branches."



So it is not permitted to insert operations (1) and (4) between (2) and (3); the cmpxchg operation is atomic.



Thanks,

Phil Yang



> -----Original Message-----

> From: Diogo Behrens <diogo.behrens at huawei.com>

> Sent: Thursday, August 27, 2020 4:57 PM

> To: Phil Yang <Phil.Yang at arm.com>

> Cc: dev at dpdk.org; nd <nd at arm.com>; Honnappa Nagarahalli

> <Honnappa.Nagarahalli at arm.com>

> Subject: RE: [PATCH] librte_eal: fix mcslock hang on weak memory

>

> Hi Phil,

>

> thanks for taking a look at this. Indeed, the cmpxchg will have release

> semantics, but implicit barriers do not provide the same ordering guarantees

> as DMB ISH, i.e., atomic_thread_fence(__ATOMIC_ACQ_REL).

>

> Let me try to explain the scenario. Here is a pseudo-ARM assembly with the

> instructions involved:

>

> STR me->locked  = 1   // 1: initialize the node

> LDAXR pred = tail  // 2: cmpxchg

> STLXR tail = me    // 3: cmpxchg

> STR pred->next = me // 4: the problematic store

>

> Now, ARM allows instruction 1 and 2 to be reordered, and also instructions 3

> and 4 to be reordered:

>

> LDAXR pred = tail  // 2: cmpxchg  (acquires can be moved up)

> STR me->locked = 1   // 1: initialize the node

> STR pred->next = me // 4: the problematic store

> STLXR tail = me    // 3: cmpxchg (releases can be moved down)

>

> And since stores 1 and 4 can be reordered too, the following execution is

> possible:

>

> LDAXR pred = tail  // 2: cmpxchg

> STR pred->next = me // 4: the problematic store

> STR me->locked = 1   // 1: initialize the node

> STLXR tail = me    // 3: cmpxchg

>

> In this subtle situation, the locking thread can overwrite the store to

> next->locked performed by the unlocking thread (if it occurs between 4 and

> 1), and subsequently hang forever waiting for me->locked to become 0.

>

> We couldn't reproduce this with our available hardware, but we

> "reproduced" it with the RMEM model checker, which precisely models the

> ARM memory model (https://github.com/rems-project/rmem). The GenMC

> model checker (https://github.com/MPI-SWS/genmc) also finds the issue,

> but its intermediate memory model is slightly more general than the ARM

> model.

>

> The release attached to store (4) prevents (1) from reordering with it.

>

> Please, let me know if that makes sense or if I am missing something.

>

> Thanks,

> -Diogo

>

> -----Original Message-----

> From: Phil Yang [mailto:Phil.Yang at arm.com]

> Sent: Wednesday, August 26, 2020 12:17 PM

> To: Diogo Behrens <diogo.behrens at huawei.com<mailto:diogo.behrens at huawei.com>>

> Cc: dev at dpdk.org<mailto:dev at dpdk.org>; nd <nd at arm.com<mailto:nd at arm.com>>; Honnappa Nagarahalli

> <Honnappa.Nagarahalli at arm.com<mailto:Honnappa.Nagarahalli at arm.com>>; nd <nd at arm.com<mailto:nd at arm.com>>

> Subject: RE: [PATCH] librte_eal: fix mcslock hang on weak memory

>

> Diogo Behrens <diogo.behrens at huawei.com<mailto:diogo.behrens at huawei.com>> writes:

>

> > Subject: [PATCH] librte_eal: fix mcslock hang on weak memory

> >

> >     The initialization me->locked=1 in lock() must happen before

> >     next->locked=0 in unlock(), otherwise a thread may hang forever,

> >     waiting for me->locked to become 0. On weak memory systems (such as

> ARMv8),

> >     the current implementation allows me->locked=1 to be reordered with

> >     announcing the node (pred->next=me) and, consequently, to be

> >     reordered with next->locked=0 in unlock().

> >

> >     This fix adds a release barrier to pred->next=me, forcing

> >     me->locked=1 to happen before this operation.

> >

> > Signed-off-by: Diogo Behrens <diogo.behrens at huawei.com<mailto:diogo.behrens at huawei.com>>

> > ---

> >  lib/librte_eal/include/generic/rte_mcslock.h | 9 ++++++++-

> >  1 file changed, 8 insertions(+), 1 deletion(-)

> >

> > diff --git a/lib/librte_eal/include/generic/rte_mcslock.h

> > b/lib/librte_eal/include/generic/rte_mcslock.h

> > index 2bef28351..ce553f547 100644

> > --- a/lib/librte_eal/include/generic/rte_mcslock.h

> > +++ b/lib/librte_eal/include/generic/rte_mcslock.h

> > @@ -68,7 +68,14 @@ rte_mcslock_lock(rte_mcslock_t **msl,

> rte_mcslock_t

> > *me)

> >                           */

> >                          return;

> >          }

> > -       __atomic_store_n(&prev->next, me, __ATOMIC_RELAXED);

> > +      /* The store to me->next above should also complete before the node

> > +      * is visible to the predecessor thread releasing the lock. Hence,

> > +      * the store to prev->next also requires release semantics. Note

> > +      * that, for example, on ARM, the release semantics in the exchange

> > +      * operation is not as strong as a release fence and is not

> > +      * sufficient to enforce the desired order here.

>

>

> Hi Diogo,

>

> I didn't quite understand why the exchange operation with ACQ_REL

> memory ordering is not sufficient.

> It emits 'stlxr' on armv8.0-a and 'swpal' on armv8.1-a (with LSE support).

> Both of these instructions carry release semantics.

>

> Please check the URL for the detail.

> https://gcc.godbolt.org/z/Mc4373

>

> BTW, if you captured a deadlock issue on your testbed, could you please

> share your test case here?

>

> Thanks,

> Phil

>

>

> > +      */

> > +      __atomic_store_n(&prev->next, me, __ATOMIC_RELEASE);

> >

> >          /* The while-load of me->locked should not move above the

> previous

> >           * store to prev->next. Otherwise it will cause a deadlock. Need a

> > --

> > 2.28.0

> >
