[dpdk-dev] [PATCH v3 11/12] service: optimize with c11 one-way barrier
Phil Yang
Phil.Yang at arm.com
Wed Apr 8 12:15:43 CEST 2020
> -----Original Message-----
> From: Van Haaren, Harry <harry.van.haaren at intel.com>
> Sent: Friday, April 3, 2020 7:58 PM
> To: Phil Yang <Phil.Yang at arm.com>; thomas at monjalon.net; Ananyev,
> Konstantin <konstantin.ananyev at intel.com>;
> stephen at networkplumber.org; maxime.coquelin at redhat.com;
> dev at dpdk.org
> Cc: david.marchand at redhat.com; jerinj at marvell.com;
> hemant.agrawal at nxp.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; Gavin Hu <Gavin.Hu at arm.com>;
> Ruifeng Wang <Ruifeng.Wang at arm.com>; Joyce Kong
> <Joyce.Kong at arm.com>; nd <nd at arm.com>
> Subject: RE: [PATCH v3 11/12] service: optimize with c11 one-way barrier
>
> > -----Original Message-----
> > From: Phil Yang <phil.yang at arm.com>
> > Sent: Tuesday, March 17, 2020 1:18 AM
> > To: thomas at monjalon.net; Van Haaren, Harry
> <harry.van.haaren at intel.com>;
> > Ananyev, Konstantin <konstantin.ananyev at intel.com>;
> > stephen at networkplumber.org; maxime.coquelin at redhat.com;
> dev at dpdk.org
> > Cc: david.marchand at redhat.com; jerinj at marvell.com;
> hemant.agrawal at nxp.com;
> > Honnappa.Nagarahalli at arm.com; gavin.hu at arm.com;
> ruifeng.wang at arm.com;
> > joyce.kong at arm.com; nd at arm.com
> > Subject: [PATCH v3 11/12] service: optimize with c11 one-way barrier
> >
> > The num_mapped_cores and execute_lock are synchronized with
> rte_atomic_XX
> > APIs which is a full barrier, DMB, on aarch64. This patch optimized it with
> > c11 atomic one-way barrier.
> >
> > Signed-off-by: Phil Yang <phil.yang at arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu at arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
>
> Based on discussion on-list, it seems the consensus is to not use
> GCC builtins, but instead use C11 APIs "proper"? If my conclusion is
> correct, the v+1 of this patchset would require updates to that style API.
>
> Inline comments for context below, -Harry
>
>
> > ---
> > lib/librte_eal/common/rte_service.c | 50
> ++++++++++++++++++++++++++----------
> > -
> > 1 file changed, 35 insertions(+), 15 deletions(-)
> >
> > diff --git a/lib/librte_eal/common/rte_service.c
> > b/lib/librte_eal/common/rte_service.c
> > index 0843c3c..c033224 100644
> > --- a/lib/librte_eal/common/rte_service.c
> > +++ b/lib/librte_eal/common/rte_service.c
> > @@ -42,7 +42,7 @@ struct rte_service_spec_impl {
> > * running this service callback. When not set, a core may take the
> > * lock and then run the service callback.
> > */
> > - rte_atomic32_t execute_lock;
> > + uint32_t execute_lock;
> >
> > /* API set/get-able variables */
> > int8_t app_runstate;
> > @@ -54,7 +54,7 @@ struct rte_service_spec_impl {
> > * It does not indicate the number of cores the service is running
> > * on currently.
> > */
> > - rte_atomic32_t num_mapped_cores;
> > + int32_t num_mapped_cores;
>
> Any reason why "int32_t" or "uint32_t" is used over another?
> execute_lock is a uint32_t above, num_mapped_cores is an int32_t?
It should be uint32_t for num_mapped_cores.
This value will not be negative after __atomic_sub_fetch operation, because of the sequence of writer and reader accesses are guaranteed by the memory ordering.
I will update it in the next version.
Thanks,
Phil
<snip>
More information about the dev
mailing list