[dpdk-dev] Arm roadmap for 20.05

Mattias Rönnblom mattias.ronnblom at ericsson.com
Sat Mar 21 09:23:05 CET 2020


On 2020-03-21 09:17, Mattias Rönnblom wrote:
> On 2020-03-20 21:45, Honnappa Nagarahalli wrote:
>> <snip>
>>
>>> Subject: Re: [dpdk-dev] Arm roadmap for 20.05
>>>
>>> On 2020-03-10 17:42, Honnappa Nagarahalli wrote:
>>>> Hello,
>>>> 	Following are the work items planned for 20.05:
>>>>
>>>> 1) Use C11 atomic APIs in timer library
>>>> 2) Use C11 atomic APIs in service cores
>>>> 3) Use C11 atomics in VirtIO split ring
>>>> 4) Performance optimizations in i40e and MLX drivers for Arm platforms
>>>> 5) RCU defer API
>>>> 6) Enable Travis CI with no huge-page tests - ~25 test cases
>>>>
>>>> Thank you,
>>>> Honnappa
>>> Maybe you should have a look at legacy DPDK atomics as well? Avoiding a full
>>> barrier for the add operation, for example.
>> By legacy, I believe you meant rte_atomic APIs. Those APIs do not take memory order as a parameter. So, it is difficult to change the implementation for those APIs. For example, the add operation could take a RELEASE or RELAXED order depending on the use case.
>> So, the proposal is to deprecate the rte_atomic APIs and use C11 APIs directly. The proposal is here: https://patches.dpdk.org/cover/66745/
> Even though rte_atomic lacks the flexibility of C11 atomics, there might
> still be areas of improvement. Such improvements will have an instant
> effect, as opposed to waiting for all the rte_atomic users to change.
>
>
> The rte_atomic API leaves ordering unspecified, unfortunately. In the
> Linux kernel, from which DPDK seems to borrow much of the atomics and
> memory order related semantics, an atomic add doesn't imply any memory
> barriers. The current __sync_fetch_and_add()-based implementation
> implies a full barrier (ldadd+dmb) or release (ldaddal, on v8.1-a). If
> you used C11 atomics to implement rte_atomic on Arm, you could use
> a relaxed memory order on rte_atomic*_add() (assuming you agree those
> are the implicit semantics of the legacy API) and just get an ldadd
> instruction. An alternative would be to implement the same thing in
> assembler, of course.
>
>
Another approach might be to just scrap all of the intrinsics and inline 
assembler used for all the functions in rte_atomic, on all 
architectures, and use C11 atomics instead.
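
To make the idea concrete, here is a minimal sketch of what an add built on
the GCC/Clang __atomic builtins might look like, next to a
__sync_fetch_and_add()-based one for comparison. The demo_* names are
hypothetical stand-ins and not the actual rte_atomic definitions, and the
relaxed ordering assumes one agrees that those are the implicit semantics of
the legacy API. Which instructions the compiler emits depends on the
toolchain and the -march setting.

```c
#include <stdint.h>

/* Hypothetical stand-in for a legacy 32-bit atomic counter type. */
typedef struct {
	volatile int32_t cnt;
} demo_atomic32_t;

/*
 * Relaxed add: the update itself is atomic, but it imposes no ordering
 * on the surrounding loads and stores.
 */
static inline void
demo_atomic32_add(demo_atomic32_t *v, int32_t inc)
{
	__atomic_fetch_add(&v->cnt, inc, __ATOMIC_RELAXED);
}

/* For comparison: the legacy builtin, which implies a full barrier. */
static inline void
demo_atomic32_add_sync(demo_atomic32_t *v, int32_t inc)
{
	__sync_fetch_and_add(&v->cnt, inc);
}

/* Relaxed read of the counter value. */
static inline int32_t
demo_atomic32_read(const demo_atomic32_t *v)
{
	return __atomic_load_n(&v->cnt, __ATOMIC_RELAXED);
}
```

Callers that currently rely on the add acting as a barrier would need an
explicit fence or a stronger memory order, which is the main risk of changing
the implementation underneath existing users.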



