[dpdk-dev] [PATCH v3 00/12] generic rte atomic APIs deprecate proposal

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Fri Mar 20 05:51:21 CET 2020


<snip>

> > Subject: [PATCH v3 00/12] generic rte atomic APIs deprecate proposal
> >
> > DPDK provides generic rte_atomic APIs to do several atomic operations.
> > These APIs are using the deprecated __sync built-ins and enforce full
> > memory barriers on aarch64. However, full barriers are not necessary
> > in many use cases. In order to address such use cases, C language
> > offers
> > C11 atomic APIs. The C11 atomic APIs provide finer memory barrier
> > control by making use of the memory ordering parameter provided by the
> user.
> > Various patches submitted in the past [2] and the patches in this
> > series indicate significant performance gains on multiple aarch64 CPUs
> > and no performance loss on x86.
> >
> > But the existing rte_atomic API implementations cannot be changed as
> > the APIs do not take the memory ordering parameter. The only choice
> > available is replacing the usage of the rte_atomic APIs with C11
> > atomic APIs. In order to make this change, the following steps are proposed:
> >
> > [1] deprecate rte_atomic APIs so that future patches do not use
> > rte_atomic APIs (a script is added to flag the usages).
> > [2] refactor the code that uses rte_atomic APIs to use c11 atomic APIs.
> 
> On [1] above, I feel deprecating DPDKs atomic functions and failing checkpatch
> is a bit sudden. Perhaps noting that in a future release (20.11?) DPDK will
> move to a
> C11 based atomics model is a more gradual step to achieving the goal, and at
> that point add a checkpatch warning for additions of rte_atomic*?
We have been working on changing existing usages of rte_atomic APIs in DPDK to use C11 atomics. Usually, the x.11 releases have significant amount of changes (not sure how many would use rte_atomic APIs). I would prefer that in 20.11 no additional code is added using rte_atomics APIs. However, I am open to suggestions on the exact time frame.
Once we decide on the release, I think it makes sense to add a 'warning' in the checkpatch to indicate the deprecation timeline and add an 'error' after the release.

> 
> More on [2] in context below.
> 
> The above is my point-of-view, of course I'd like more people from the DPDK
> community to provide their input too.
> 
> 
> > This patchset contains:
> > 1) the checkpatch script changes to flag rte_atomic API usage in patches.
> > 2) changes to programmer guide describing writing efficient code for
> aarch64.
> > 3) changes to various libraries to make use of c11 atomic APIs.
> >
> > We are planning to replicate this idea across all the other libraries,
> > drivers, examples, test applications. In the next phase, we will add
> > changes to the mbuf, the EAL interrupts and the event timer adapter
> libraries.
> 
> About ~6/12 patches of this C11 set are targeting the Service Cores area of
> DPDK. I have some concerns over increased complexity of C11 implementation
> vs the (already complex) rte_atomic implementation today.
I agree that it C11 changes are complex, especially if one is starting out to understand what these APIs provide. From my experience, once few underlying concepts are understood, reviewing or making changes do not take too much time.

> I see other patchsets enabling C11 across other DPDK components, so maybe
> we should also discuss C11 enabling in a wider context that just service cores?
Yes, agree. We are in the process of making changes to other areas as well.

> 
> I don't think it fair to expect all developers to be well versed in C11 atomic
> semantics like understanding the complex interactions between the various
> C11 RELEASE, AQUIRE barriers requires.
C11 has been around from sometime now. To my surprise, OVS already uses C11 APIs extensively. VPP has been accepting C11 related changes from past couple of years. Having said that, I agree in general that not everyone is well versed.

> 
> As maintainer of Service Cores I'm hesitant to accept the large-scale refactor
Right now, the patches are split into multiple commits. If required I can host a call to go over simple C11 API usages (sufficient to cover the usage in service core) and the changes in this patch. If you find that particular areas need more understanding I can work on providing additional information such as memory order ladder diagrams. Please let me know what you think.

> of atomic-implementation, as it could lead to racey bugs that are likely
> extremely difficult to track down. (The recent race-on-exit has proven the
> difficulty in reproducing, and that's with an atomics model I'm quite familiar
> with).
> 
> Let me be very clear: I don't wish to block a C11 atomic implementation, but
> I'd like to discuss how we (DPDK community) can best port-to and maintain a
> complex multi-threaded service library with best-in-class performance for the
> workload.
> 
> To put some discussions/solutions on the table:
> - Shared Maintainership of a component?
>      Split in functionality and C11 atomics implementation
>      Obviously there would be collaboration required in such a case.
> - Maybe shared maintainership is too much?
>      A gentlemans/womans agreement of "helping out" with C11 atomics debug
> is enough?
I think shared maintainer ship could be too much as there are many changes. But, I and other engineers from Arm (I would include Arm ecosystem as well) can definitely help out on debug and reviews involving C11 APIs. (I see Thomas's reply on this topic).

> 
> 
> Hope my concerns are understandable, and of course input/feedback
> welcomed! -Harry
No worries at all. We are here to help and make this as easy as possible.

> 
> 
> PS: Apologies for the delay in reply - was OOO on Irish national holiday.
Same here, was on vacation for 3 days.

> 
> 
> > v3:
> > add libatomic dependency for 32-bit clang
> >
> > v2:
> > 1. fix Clang '-Wincompatible-pointer-types' WARNING.
> > 2. fix typos.
> >
> > Honnappa Nagarahalli (2):
> >   service: avoid race condition for MT unsafe service
> >   service: identify service running on another core correctly
> >
> > Phil Yang (10):
> >   doc: add generic atomic deprecation section
> >   devtools: prevent use of rte atomic APIs in future patches
> >   eal/build: add libatomic dependency for 32-bit clang
> >   build: remove redundant code
> >   vhost: optimize broadcast rarp sync with c11 atomic
> >   ipsec: optimize with c11 atomic for sa outbound sqn update
> >   service: remove rte prefix from static functions
> >   service: remove redundant code
> >   service: optimize with c11 one-way barrier
> >   service: relax barriers with C11 atomic operations
> >
> >  devtools/checkpatches.sh                         |   9 ++
> >  doc/guides/prog_guide/writing_efficient_code.rst |  60 +++++++-
> >  drivers/event/octeontx/meson.build               |   5 -
> >  drivers/event/octeontx2/meson.build              |   5 -
> >  drivers/event/opdl/meson.build                   |   5 -
> >  lib/librte_eal/common/rte_service.c              | 175 ++++++++++++----------
> > -
> >  lib/librte_eal/meson.build                       |   6 +
> >  lib/librte_ipsec/ipsec_sqn.h                     |   3 +-
> >  lib/librte_ipsec/sa.h                            |   2 +-
> >  lib/librte_rcu/meson.build                       |   5 -
> >  lib/librte_vhost/vhost.h                         |   2 +-
> >  lib/librte_vhost/vhost_user.c                    |   7 +-
> >  lib/librte_vhost/virtio_net.c                    |  16 ++-
> >  13 files changed, 181 insertions(+), 119 deletions(-)
> >
> > --
> > 2.7.4



More information about the dev mailing list