[RFC v2 5/6] eal: add atomic bit operations
Morten Brørup
mb at smartsharesystems.com
Fri Apr 26 14:00:22 CEST 2024
> From: Mattias Rönnblom [mailto:hofors at lysator.liu.se]
> Sent: Friday, 26 April 2024 11.39
>
> On 2024-04-25 18:18, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:hofors at lysator.liu.se]
> >> Sent: Thursday, 25 April 2024 16.36
> >>
> >> On 2024-04-25 12:25, Morten Brørup wrote:
> >>>> +#define rte_bit_atomic_test(addr, nr, memory_order)
> >> \
> >>>> + _Generic((addr), \
> >>>> + uint32_t *: __rte_bit_atomic_test32, \
> >>>> + uint64_t *: __rte_bit_atomic_test64)(addr, nr,
> >> memory_order)
> >>>
> >>> I wonder if these should have RTE_ATOMIC qualifier:
> >>>
> >>> + RTE_ATOMIC(uint32_t) *: __rte_bit_atomic_test32,
> >> \
> >>> + RTE_ATOMIC(uint64_t) *: __rte_bit_atomic_test64)(addr, nr,
> >> memory_order)
> >>>
> >>>
> >>>> +#define __RTE_GEN_BIT_ATOMIC_TEST(size)
> >> \
> >>>> + static inline bool \
> >>>> + __rte_bit_atomic_test ## size(const uint ## size ## _t *addr,
> >> \
> >>>
> >>> I wonder if the "addr" parameter should have RTE_ATOMIC qualifier:
> >>>
> >>> + __rte_bit_atomic_test ## size(const RTE_ATOMIC(uint ## size ## _t)
> >> *addr, \
> >>>
> >>> instead of casting into a_addr.
> >>>
> >>
> >> Check the cover letter for the rationale for the cast.
> >
> > Thanks, that clarifies it. Then...
> > For the series:
> > Acked-by: Morten Brørup <mb at smartsharesystems.com>
> >
> >>
> >> Where I'm at now is that I think C11 _Atomic is rather poor design. The
> >> assumption that an object which allows for atomic access always should
> >> require all operations upon it to be atomic, regardless of where it is
> >> in its lifetime, and which thread is accessing it, does not hold, in the
> >> general case.
> >
> > It might be slow, but I suppose the C11 standard prioritizes correctness
> over performance.
> >
>
> That's a false dichotomy, in this case. You can have both.
In theory you shouldn't need non-atomic access to atomic variables.
In reality, we want it anyway, because real CPUs are faster at non-atomic operations.
>
> > It seems locks are automatically added to _Atomic types larger than what is
> natively supported by the architecture.
> > E.g. MSVC adds locks to _Atomic types larger than 8 byte. [1]
> >
> > [1]: https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio-
> 2022-version-17-5-preview-2/
> >
> >>
> >> The only reason for _Atomic being as it is, as far as I can see, is to
> >> accommodate for ISAs which does not have the appropriate atomic machine
> >> instructions, and thus require a lock or some other data associated with
> >> the actual user-data-carrying bits. Neither GCC nor DPDK supports any
> >> such ISAs, to my knowledge. I suspect neither never will. So the cast
> >> will continue to work.
> >
> > I tend to agree with you on this.
> >
> > We should officially decide that DPDK treats RTE_ATOMIC types as a union of
> _Atomic and non-atomic, i.e. operations on RTE_ATOMIC types can be both atomic
> and non-atomic.
> >
>
> I think this is a subject which needs to be further explored.
Yes. It's easier exploring and deciding now, when our options are open, than after we have locked down the affected APIs.
>
> Objects that can be accessed both atomically and non-atomically should
> be without _Atomic. With my current understanding of this issue, that
> seems like the best option.
Agree.
The alterative described below is certainly no good!
It would be nice if they were marked as sometimes-atomic by a qualifier or special type, like rte_be32_t marks the network byte order variant of an uint32_t.
Furthermore, large atomic objects need the _Atomic qualifier for the compiler to add (and use) the associated lock.
Alternatively, we could specify that sometimes-atomic objects cannot be larger than 8 byte, which is what MSVC can handle without locking.
>
> You could turn it around as well, and have such marked _Atomic and have
> explicit casts to their non-_Atomic cousins when operated upon by
> non-atomic functions. Not sure how realistic that is, since
> non-atomicity is the norm. All generic selection-based "functions" must
> take this into account.
>
> >>
> >>>> + unsigned int nr, int memory_order) \
> >>>> + { \
> >>>> + RTE_ASSERT(nr < size); \
> >>>> + \
> >>>> + const RTE_ATOMIC(uint ## size ## _t) *a_addr = \
> >>>> + (const RTE_ATOMIC(uint ## size ## _t) *)addr; \
> >>>> + uint ## size ## _t mask = (uint ## size ## _t)1 << nr; \
> >>>> + return rte_atomic_load_explicit(a_addr, memory_order) &
> >> mask; \
> >>>> + }
> >>>
> >>>
> >>> Similar considerations regarding volatile qualifier for the "once"
> >> operations.
> >>>
More information about the dev
mailing list