[PATCH v2 09/19] rcu: use rte optional stdatomic API

Ruifeng Wang Ruifeng.Wang at arm.com
Thu Oct 26 06:24:54 CEST 2023


> -----Original Message-----
> From: Tyler Retzlaff <roretzla at linux.microsoft.com>
> Sent: Thursday, October 26, 2023 6:38 AM
> To: Ruifeng Wang <Ruifeng.Wang at arm.com>
> Cc: dev at dpdk.org; Akhil Goyal <gakhil at marvell.com>; Anatoly Burakov
> <anatoly.burakov at intel.com>; Andrew Rybchenko <andrew.rybchenko at oktetlabs.ru>; Bruce
> Richardson <bruce.richardson at intel.com>; Chenbo Xia <chenbo.xia at intel.com>; Ciara Power
> <ciara.power at intel.com>; David Christensen <drc at linux.vnet.ibm.com>; David Hunt
> <david.hunt at intel.com>; Dmitry Kozlyuk <dmitry.kozliuk at gmail.com>; Dmitry Malloy
> <dmitrym at microsoft.com>; Elena Agostini <eagostini at nvidia.com>; Erik Gabriel Carrillo
> <erik.g.carrillo at intel.com>; Fan Zhang <fanzhang.oss at gmail.com>; Ferruh Yigit
> <ferruh.yigit at amd.com>; Harman Kalra <hkalra at marvell.com>; Harry van Haaren
> <harry.van.haaren at intel.com>; Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>;
> jerinj at marvell.com; Konstantin Ananyev <konstantin.v.ananyev at yandex.ru>; Matan Azrad
> <matan at nvidia.com>; Maxime Coquelin <maxime.coquelin at redhat.com>; Narcisa Ana Maria Vasile
> <navasile at linux.microsoft.com>; Nicolas Chautru <nicolas.chautru at intel.com>; Olivier Matz
> <olivier.matz at 6wind.com>; Ori Kam <orika at nvidia.com>; Pallavi Kadam
> <pallavi.kadam at intel.com>; Pavan Nikhilesh <pbhagavatula at marvell.com>; Reshma Pattan
> <reshma.pattan at intel.com>; Sameh Gobriel <sameh.gobriel at intel.com>; Shijith Thotton
> <sthotton at marvell.com>; Sivaprasad Tummala <sivaprasad.tummala at amd.com>; Stephen Hemminger
> <stephen at networkplumber.org>; Suanming Mou <suanmingm at nvidia.com>; Sunil Kumar Kori
> <skori at marvell.com>; thomas at monjalon.net; Viacheslav Ovsiienko <viacheslavo at nvidia.com>;
> Vladimir Medvedkin <vladimir.medvedkin at intel.com>; Yipeng Wang <yipeng1.wang at intel.com>;
> nd <nd at arm.com>
> Subject: Re: [PATCH v2 09/19] rcu: use rte optional stdatomic API
> 
> On Wed, Oct 25, 2023 at 09:41:22AM +0000, Ruifeng Wang wrote:
> > > -----Original Message-----
> > > From: Tyler Retzlaff <roretzla at linux.microsoft.com>
> > > Sent: Wednesday, October 18, 2023 4:31 AM
> > > To: dev at dpdk.org
> > > Cc: Akhil Goyal <gakhil at marvell.com>; Anatoly Burakov
> > > <anatoly.burakov at intel.com>; Andrew Rybchenko
> > > <andrew.rybchenko at oktetlabs.ru>; Bruce Richardson
> > > <bruce.richardson at intel.com>; Chenbo Xia <chenbo.xia at intel.com>;
> > > Ciara Power <ciara.power at intel.com>; David Christensen
> > > <drc at linux.vnet.ibm.com>; David Hunt <david.hunt at intel.com>; Dmitry
> > > Kozlyuk <dmitry.kozliuk at gmail.com>; Dmitry Malloy
> > > <dmitrym at microsoft.com>; Elena Agostini <eagostini at nvidia.com>; Erik
> > > Gabriel Carrillo <erik.g.carrillo at intel.com>; Fan Zhang
> > > <fanzhang.oss at gmail.com>; Ferruh Yigit <ferruh.yigit at amd.com>;
> > > Harman Kalra <hkalra at marvell.com>; Harry van Haaren
> > > <harry.van.haaren at intel.com>; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli at arm.com>; jerinj at marvell.com; Konstantin
> > > Ananyev <konstantin.v.ananyev at yandex.ru>; Matan Azrad
> > > <matan at nvidia.com>; Maxime Coquelin <maxime.coquelin at redhat.com>;
> > > Narcisa Ana Maria Vasile <navasile at linux.microsoft.com>; Nicolas
> > > Chautru <nicolas.chautru at intel.com>; Olivier Matz
> > > <olivier.matz at 6wind.com>; Ori Kam <orika at nvidia.com>; Pallavi Kadam
> > > <pallavi.kadam at intel.com>; Pavan Nikhilesh
> > > <pbhagavatula at marvell.com>; Reshma Pattan <reshma.pattan at intel.com>;
> > > Sameh Gobriel <sameh.gobriel at intel.com>; Shijith Thotton
> > > <sthotton at marvell.com>; Sivaprasad Tummala
> > > <sivaprasad.tummala at amd.com>; Stephen Hemminger
> > > <stephen at networkplumber.org>; Suanming Mou <suanmingm at nvidia.com>;
> > > Sunil Kumar Kori <skori at marvell.com>; thomas at monjalon.net;
> > > Viacheslav Ovsiienko <viacheslavo at nvidia.com>; Vladimir Medvedkin
> > > <vladimir.medvedkin at intel.com>; Yipeng Wang
> > > <yipeng1.wang at intel.com>; Tyler Retzlaff
> > > <roretzla at linux.microsoft.com>
> > > Subject: [PATCH v2 09/19] rcu: use rte optional stdatomic API
> > >
> > > Replace the use of gcc builtin __atomic_xxx intrinsics with
> > > corresponding rte_atomic_xxx optional stdatomic API
> > >
> > > Signed-off-by: Tyler Retzlaff <roretzla at linux.microsoft.com>
> > > ---
> > >  lib/rcu/rte_rcu_qsbr.c | 48 +++++++++++++++++------------------
> > >  lib/rcu/rte_rcu_qsbr.h | 68
> > > +++++++++++++++++++++++++-------------------------
> > >  2 files changed, 58 insertions(+), 58 deletions(-)
> > >
> > > diff --git a/lib/rcu/rte_rcu_qsbr.c b/lib/rcu/rte_rcu_qsbr.c index
> > > 17be93e..4dc7714 100644
> > > --- a/lib/rcu/rte_rcu_qsbr.c
> > > +++ b/lib/rcu/rte_rcu_qsbr.c
> > > @@ -102,21 +102,21 @@
> > >  	 * go out of sync. Hence, additional checks are required.
> > >  	 */
> > >  	/* Check if the thread is already registered */
> > > -	old_bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > -					__ATOMIC_RELAXED);
> > > +	old_bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > +					rte_memory_order_relaxed);
> > >  	if (old_bmap & 1UL << id)
> > >  		return 0;
> > >
> > >  	do {
> > >  		new_bmap = old_bmap | (1UL << id);
> > > -		success = __atomic_compare_exchange(
> > > +		success = rte_atomic_compare_exchange_strong_explicit(
> > >  					__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > -					&old_bmap, &new_bmap, 0,
> > > -					__ATOMIC_RELEASE, __ATOMIC_RELAXED);
> > > +					&old_bmap, new_bmap,
> > > +					rte_memory_order_release, rte_memory_order_relaxed);
> > >
> > >  		if (success)
> > > -			__atomic_fetch_add(&v->num_threads,
> > > -						1, __ATOMIC_RELAXED);
> > > +			rte_atomic_fetch_add_explicit(&v->num_threads,
> > > +						1, rte_memory_order_relaxed);
> > >  		else if (old_bmap & (1UL << id))
> > >  			/* Someone else registered this thread.
> > >  			 * Counter should not be incremented.
> > > @@ -154,8 +154,8 @@
> > >  	 * go out of sync. Hence, additional checks are required.
> > >  	 */
> > >  	/* Check if the thread is already unregistered */
> > > -	old_bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > -					__ATOMIC_RELAXED);
> > > +	old_bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > +					rte_memory_order_relaxed);
> > >  	if (!(old_bmap & (1UL << id)))
> > >  		return 0;
> > >
> > > @@ -165,14 +165,14 @@
> > >  		 * completed before removal of the thread from the list of
> > >  		 * reporting threads.
> > >  		 */
> > > -		success = __atomic_compare_exchange(
> > > +		success = rte_atomic_compare_exchange_strong_explicit(
> > >  					__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > -					&old_bmap, &new_bmap, 0,
> > > -					__ATOMIC_RELEASE, __ATOMIC_RELAXED);
> > > +					&old_bmap, new_bmap,
> > > +					rte_memory_order_release, rte_memory_order_relaxed);
> > >
> > >  		if (success)
> > > -			__atomic_fetch_sub(&v->num_threads,
> > > -						1, __ATOMIC_RELAXED);
> > > +			rte_atomic_fetch_sub_explicit(&v->num_threads,
> > > +						1, rte_memory_order_relaxed);
> > >  		else if (!(old_bmap & (1UL << id)))
> > >  			/* Someone else unregistered this thread.
> > >  			 * Counter should not be incremented.
> > > @@ -227,8 +227,8 @@
> > >
> > >  	fprintf(f, "  Registered thread IDs = ");
> > >  	for (i = 0; i < v->num_elems; i++) {
> > > -		bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > -					__ATOMIC_ACQUIRE);
> > > +		bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > +					rte_memory_order_acquire);
> > >  		id = i << __RTE_QSBR_THRID_INDEX_SHIFT;
> > >  		while (bmap) {
> > >  			t = __builtin_ctzl(bmap);
> > > @@ -241,26 +241,26 @@
> > >  	fprintf(f, "\n");
> > >
> > >  	fprintf(f, "  Token = %" PRIu64 "\n",
> > > -			__atomic_load_n(&v->token, __ATOMIC_ACQUIRE));
> > > +			rte_atomic_load_explicit(&v->token, rte_memory_order_acquire));
> > >
> > >  	fprintf(f, "  Least Acknowledged Token = %" PRIu64 "\n",
> > > -			__atomic_load_n(&v->acked_token, __ATOMIC_ACQUIRE));
> > > +			rte_atomic_load_explicit(&v->acked_token,
> > > +rte_memory_order_acquire));
> > >
> > >  	fprintf(f, "Quiescent State Counts for readers:\n");
> > >  	for (i = 0; i < v->num_elems; i++) {
> > > -		bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > -					__ATOMIC_ACQUIRE);
> > > +		bmap = rte_atomic_load_explicit(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
> > > +					rte_memory_order_acquire);
> > >  		id = i << __RTE_QSBR_THRID_INDEX_SHIFT;
> > >  		while (bmap) {
> > >  			t = __builtin_ctzl(bmap);
> > >  			fprintf(f, "thread ID = %u, count = %" PRIu64 ", lock count = %u\n",
> > >  				id + t,
> > > -				__atomic_load_n(
> > > +				rte_atomic_load_explicit(
> > >  					&v->qsbr_cnt[id + t].cnt,
> > > -					__ATOMIC_RELAXED),
> > > -				__atomic_load_n(
> > > +					rte_memory_order_relaxed),
> > > +				rte_atomic_load_explicit(
> > >  					&v->qsbr_cnt[id + t].lock_cnt,
> > > -					__ATOMIC_RELAXED));
> > > +					rte_memory_order_relaxed));
> > >  			bmap &= ~(1UL << t);
> > >  		}
> > >  	}
> > > diff --git a/lib/rcu/rte_rcu_qsbr.h b/lib/rcu/rte_rcu_qsbr.h index
> > > 87e1b55..9f4aed2 100644
> > > --- a/lib/rcu/rte_rcu_qsbr.h
> > > +++ b/lib/rcu/rte_rcu_qsbr.h
> > > @@ -63,11 +63,11 @@
> > >   * Given thread id needs to be converted to index into the array and
> > >   * the id within the array element.
> > >   */
> > > -#define __RTE_QSBR_THRID_ARRAY_ELM_SIZE (sizeof(uint64_t) * 8)
> > > +#define __RTE_QSBR_THRID_ARRAY_ELM_SIZE
> > > +(sizeof(RTE_ATOMIC(uint64_t)) *
> > > +8)
> > >  #define __RTE_QSBR_THRID_ARRAY_SIZE(max_threads) \
> > >  	RTE_ALIGN(RTE_ALIGN_MUL_CEIL(max_threads, \
> > >  		__RTE_QSBR_THRID_ARRAY_ELM_SIZE) >> 3, RTE_CACHE_LINE_SIZE)
> > > -#define __RTE_QSBR_THRID_ARRAY_ELM(v, i) ((uint64_t *) \
> > > +#define __RTE_QSBR_THRID_ARRAY_ELM(v, i) ((uint64_t __rte_atomic *)
> > > +\
> >
> > Is it equivalent to ((RTE_ATOMIC(uint64_t) *)?
> 
> i'm not sure if you're asking about the resultant type of the expression or not?

I see other places are using specifier hence the question.

> 
> in this context we aren't specifying an atomic type but rather adding the atomic qualifier
> to what should already be a variable that has an atomic specified type with a cast which
> is why we use __rte_atomic.

I read from document [1] that atomic qualified type may have a different size from the original type.
If that is the case, the size difference could cause issue in bitmap array accessing.
Did I misunderstand?

[1] https://en.cppreference.com/w/c/language/atomic

> 
> >
> > >  	((struct rte_rcu_qsbr_cnt *)(v + 1) + v->max_threads) + i)
> > > #define __RTE_QSBR_THRID_INDEX_SHIFT 6  #define
> > > __RTE_QSBR_THRID_MASK 0x3f @@ -75,13 +75,13 @@
> > >
> >
> > <snip>


More information about the dev mailing list