[PATCH 3/9] eal: use barrier intrinsics when compiling with msvc

Konstantin Ananyev konstantin.v.ananyev at yandex.ru
Mon Apr 10 22:02:00 CEST 2023


06/04/2023 01:07, Tyler Retzlaff пишет:
> On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote:
>>
>>>>>>> Inline assembly is not supported for msvc x64 instead use
>>>>>>> _{Read,Write,ReadWrite}Barrier() intrinsics.
>>>>>>>
>>>>>>> Signed-off-by: Tyler Retzlaff <roretzla at linux.microsoft.com>
>>>>>>> ---
>>>>>>>   lib/eal/include/generic/rte_atomic.h |  4 ++++
>>>>>>>   lib/eal/x86/include/rte_atomic.h     | 10 +++++++++-
>>>>>>>   2 files changed, 13 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
>>>>>>> index 234b268..e973184 100644
>>>>>>> --- a/lib/eal/include/generic/rte_atomic.h
>>>>>>> +++ b/lib/eal/include/generic/rte_atomic.h
>>>>>>> @@ -116,9 +116,13 @@
>>>>>>>    * Guarantees that operation reordering does not occur at compile time
>>>>>>>    * for operations directly before and after the barrier.
>>>>>>>    */
>>>>>>> +#ifndef RTE_TOOLCHAIN_MSVC
>>>>>>>   #define	rte_compiler_barrier() do {		\
>>>>>>>   	asm volatile ("" : : : "memory");	\
>>>>>>>   } while(0)
>>>>>>> +#else
>>>>>>> +#define rte_compiler_barrier() _ReadWriteBarrier()
>>>>>>> +#endif
>>>>>>>
>>>>>>>   /**
>>>>>>>    * Synchronization fence between threads based on the specified memory order.
>>>>>>> diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
>>>>>>> index f2ee1a9..5cce9ba 100644
>>>>>>> --- a/lib/eal/x86/include/rte_atomic.h
>>>>>>> +++ b/lib/eal/x86/include/rte_atomic.h
>>>>>>> @@ -27,9 +27,13 @@
>>>>>>>
>>>>>>>   #define	rte_rmb() _mm_lfence()
>>>>>>>
>>>>>>> +#ifndef RTE_TOOLCHAIN_MSVC
>>>>>>>   #define rte_smp_wmb() rte_compiler_barrier()
>>>>>>> -
>>>>>>>   #define rte_smp_rmb() rte_compiler_barrier()
>>>>>>> +#else
>>>>>>> +#define rte_smp_wmb() _WriteBarrier()
>>>>>>> +#define rte_smp_rmb() _ReadBarrier()
>>>>>>> +#endif
>>>>>>>
>>>>>>>   /*
>>>>>>>    * From Intel Software Development Manual; Vol 3;
>>>>>>> @@ -66,11 +70,15 @@
>>>>>>>   static __rte_always_inline void
>>>>>>>   rte_smp_mb(void)
>>>>>>>   {
>>>>>>> +#ifndef RTE_TOOLCHAIN_MSVC
>>>>>>>   #ifdef RTE_ARCH_I686
>>>>>>>   	asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
>>>>>>>   #else
>>>>>>>   	asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
>>>>>>>   #endif
>>>>>>> +#else
>>>>>>> +	rte_compiler_barrier();
>>>>>>> +#endif
>>>>>>
>>>>>> It doesn't look right to me: compiler_barrier is not identical to LOCK-ed operation,
>>>>>> and is not enough to serve as a proper memory barrier for SMP.
>>>>>
>>>>> i think i'm confused by the macro naming here.  i'll take another look
>>>>> thank you for raising it.
>>>>>
>>>>>>
>>>>>> Another ore generic comment - do we really need to pollute all that code with RTE_TOOLCHAIN_MSVC ifdefs?
>>>>>> Right now we have ability to have subdir per arch (x86/arm/etc.).
>>>>>> Can we treat x86+windows+msvc as a special arch?
>>>>>
>>>>> i asked this question previously and confirmed in the technical board
>>>>> meeting. the answer i received was that the community did not want new
>>>>> directory/headers introduced for compiler support matrix and i should
>>>>> use #ifdef in the existing headers.
>>>>
>>>> Ok, can I then ask at least to minimize number of ifdefs to absolute
>>>> minimum?
>>>
>>> in principal no objection at all, one question though is what to do with
>>> comment based documentation attached to macros? e.g.
>>>
>>> #ifdef SOME_FOO
>>> /* some documentation */
>>> #define some_macro
>>> #else
>>> #define some_macro
>>> #endif
>>>
>>> #ifdef SOME_FOO
>>> /* some documentation 2 */
>>> #define some_macro2
>>> #else
>>> #define some_macro2
>>> #endif
>>>
>>> i can either duplicate the documentation for every define so it stays
>>> "attached" or i can only document the first expansion. let me know what
>>> you expect.
>>>
>>> so something like this?
>>>
>>> #ifdef SOME_FOO
>>> /* some documentation */
>>> #define some_macro
>>> /* some documentation 2 */
>>> #define some_macro2
>>> #else
>>> #define some_macro
>>> #define some_macro2
>>> #endif
>>>
>>> or should all documentation be duplicated? which can become a teadious
>>> redundancy for anyone maintaining it. keep in mind we might have to make
>>> an exception for rte_common.h because it seems doing this would be
>>> really ugly there. take a look let me know.
>>
>> My personal preference would be to keep one documentation block for both cases
>> (yes, I suppose it needs to be updated if required):
>>
>> /* some documentation, probably for both SOME_FOO on/off */
>> #ifdef SOME_FOO
>> #define some_macro
>> #else
>> #define some_macro
>> #endif
>>
>>
>>>
>>>> It is really hard to read an follow acode that is heavily ifdefed.
>>>> Let say above we probably don't need to re-define
>>>> rte_smp_rmb/rte_smp_wmb, as both are boiled down to
>>>> compiler_barrier(), which is already redefined.
>>>
>>> can you take a look at v2 of the patch and re-prescribe your advise
>>> here? in v2 only the intel macros expand to the compiler barrier. though
>>> i find this vexing since as you pointed out it seems they aren't
>>> supposed to be compiler only barriers according to the documentation in
>>> generic/rte_atomic.h they are intended to be memory barriers.
>>
>> Commented, pls check if I explained my thoughts clear enough there.
>>   
>>>
>>> please help me if i've goofed up in this regard.
>>>
>>>> Another question - could it be visa-versa approach:
>>>> can we replace some inline assembly with common instincts whenever possible?
>>>
>>> msvc has only intrinsics and the conditional expansion for msvc is to
>>> use those intrinsics, gcc doesn't generally define intrinsics for processor
>>> specific code does it?
>>
>> AFAIK latest gcc (and clang) versions do support majority of these instincts: __rdtsc, _xbegin/_xend, etc.
>> So my thought was - might be we can use same instincts for all compilers...
>> One implication I can think about - older versions of gcc.
>> But might be we can re-order things and have inlines only for these oldere gcc versions?
> 
> i'm going to propose if we do this we do it as a separate change later.
> 
> i fear it could turn into the following dance which seems not a lot
> better given i'm sure some people will argue there is no benefit to
> removing inline assembly for gcc/clang. my preference is not to get
> side-tracked on refactoring with the short merge window.
> 
> #if (defined(__clang__) && clang version < x) ||
>      (defined(__GNUC__) && gcc version < x)
> __asm(whatever...
> #else
> __rdtsc()
> #endif


Played a bit with https://godbolt.org/.
It seems that __rdtsc()  and RMT builtins (xbegin/xend/xtest) are 
supported all way down to gcc 4.8.1.



More information about the dev mailing list