[dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA processor's rte_wmb/rte_rmb.

Wang Dong dong.wang.pro at hotmail.com
Thu May 7 17:28:26 CEST 2015


Hi Konstantin,

> Hi Dong,
>
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of WangDong
>> Sent: Tuesday, May 05, 2015 4:38 PM
>> To: dev at dpdk.org
>> Subject: [dpdk-dev] [PATCH] librte_eal:Using compiler memory barrier for IA processor's rte_wmb/rte_rmb.
>>
>> The current implementation of rte_wmb/rte_rmb for x86 uses processor memory barriers. That is unnecessary for IA processors; a compiler
>> memory barrier is enough.
>
> I wouldn't say they are 'unnecessary'.
> There are situations, even on IA, when you need _fence_ instructions.
> So, please leave rte_*mb() macros unmodified.
OK, I will leave them unmodified, but I really can't find a situation that
needs the sfence and lfence instructions.
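
One case that is commonly cited for sfence on IA is non-temporal (streaming) stores, which are weakly ordered and may become visible after a later ordinary store. The sketch below is only an illustration of that case, assuming an x86 target with SSE2 intrinsics; payload, ready, publish_nt and try_read are hypothetical names and not DPDK code.

```c
#include <assert.h>
#include <emmintrin.h>   /* _mm_stream_si32 (SSE2) */
#include <xmmintrin.h>   /* _mm_sfence (SSE) */

#define PAYLOAD_WORDS 4

static int payload[PAYLOAD_WORDS];   /* hypothetical shared buffer */
static volatile int ready;           /* hypothetical "data is valid" flag */

/* Producer: fill the buffer with streaming stores, then publish the flag. */
static void publish_nt(int base, int n)
{
    assert(n >= 0 && n <= PAYLOAD_WORDS);

    for (int i = 0; i < n; i++)
        _mm_stream_si32(&payload[i], base + i);   /* weakly ordered store */

    /* Without this fence the flag store below could become visible
     * to another core before the streaming stores above. */
    _mm_sfence();
    ready = 1;
}

/* Consumer: returns 1 and copies word idx once the flag is set, else 0. */
static int try_read(int idx, int *out)
{
    assert(idx >= 0 && idx < PAYLOAD_WORDS);

    if (!ready)
        return 0;

    /* Keep the compiler from hoisting the payload load above the flag load. */
    __asm__ __volatile__ ("" : : : "memory");
    *out = payload[idx];
    return 1;
}
```

For ordinary stores to write-back memory the _mm_sfence() call above would not be needed on x86, which is the case the patch is concerned with.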


> I still think that we need to create a new set of architecture dependent macros, as discussed before.
> Probably by analogy with linux kernel rte_smp_*mb() is a good name for them.
> Though if you have some better name in mind, I am open to suggestions here.
What about rte_dma_*mb()? I found dma_*mb() in linux-4.0.1, and it looks good.
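
To make the naming discussion concrete, here is a rough sketch of how a separate SMP-only set could sit next to the unmodified rte_*mb() macros on x86. It assumes ordinary write-back memory, a single producer and a single consumer; the rte_smp_* names follow the suggestion quoted above, rte_compiler_barrier() is redefined locally so the example is self-contained, and demo_ring, demo_enqueue and demo_dequeue are hypothetical helpers, not DPDK APIs.

```c
#include <assert.h>
#include <emmintrin.h>   /* _mm_mfence (SSE2) */
#include <stdint.h>

/* Local stand-in so the sketch compiles on its own. */
#define rte_compiler_barrier() \
    do { __asm__ __volatile__ ("" : : : "memory"); } while (0)

/* Hypothetical SMP-only barriers for x86: ordinary stores are not
 * reordered with older stores, and ordinary loads are not reordered
 * with older loads, so a compiler barrier is assumed to be enough. */
#define rte_smp_mb()  _mm_mfence()
#define rte_smp_wmb() rte_compiler_barrier()
#define rte_smp_rmb() rte_compiler_barrier()

#define RING_SIZE 8u   /* must be a power of two */

struct demo_ring {
    volatile uint32_t head;        /* written by the producer only */
    volatile uint32_t tail;        /* written by the consumer only */
    uint32_t slots[RING_SIZE];
};

/* Returns 0 on success, -1 if the ring is full. */
static int demo_enqueue(struct demo_ring *r, uint32_t v)
{
    uint32_t head = r->head;

    if (head - r->tail == RING_SIZE)
        return -1;

    r->slots[head & (RING_SIZE - 1)] = v;
    rte_smp_wmb();                 /* slot store before index store */
    r->head = head + 1;
    return 0;
}

/* Returns 0 on success, -1 if the ring is empty. */
static int demo_dequeue(struct demo_ring *r, uint32_t *v)
{
    uint32_t tail = r->tail;

    if (tail == r->head)
        return -1;

    rte_smp_rmb();                 /* index load before slot load */
    *v = r->slots[tail & (RING_SIZE - 1)];
    rte_smp_mb();                  /* conservative: finish the read before freeing the slot */
    r->tail = tail + 1;
    return 0;
}
```

Whether such macros should be called rte_smp_*mb() or rte_dma_*mb() is exactly the open question here; the sketch only shows the core-to-core case.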

>
>> But if DPDK is running on an AMD processor, maybe we should use a processor memory barrier.
>
> As far as I remember, AMD has the same memory ordering model.
It's quite hard to find AMD's software developer manual.

Dong

> So, I don't think we need #ifdef RTE_ARCH_X86_IA here.
>
> Konstantin
>
>> I added a macro to distinguish them: if we compile DPDK for an IA processor, defining the macro (RTE_ARCH_X86_IA) can improve performance
>> by using a compiler memory barrier. Alternatively, we could add RTE_ARCH_X86_AMD to select the processor memory barrier; in that case, if the
>> macro is not defined, memory ordering will not be guaranteed. Which macro is better?
>> If this patch is applied, the PMDs' old implementation of a compiler memory barrier (some volatile variables) can be fixed by using rte_rmb()
>> and rte_wmb() for any architecture.
>>
>> ---
>>   lib/librte_eal/common/include/arch/x86/rte_atomic.h | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/include/arch/x86/rte_atomic.h b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
>> index e93e8ee..52b1e81 100644
>> --- a/lib/librte_eal/common/include/arch/x86/rte_atomic.h
>> +++ b/lib/librte_eal/common/include/arch/x86/rte_atomic.h
>> @@ -49,10 +49,20 @@ extern "C" {
>>
>>   #define	rte_mb() _mm_mfence()
>>
>> +#ifdef RTE_ARCH_X86_IA
>> +
>> +#define rte_wmb() rte_compiler_barrier()
>> +
>> +#define rte_rmb() rte_compiler_barrier()
>> +
>> +#else
>> +
>>   #define	rte_wmb() _mm_sfence()
>>
>>   #define	rte_rmb() _mm_lfence()
>>
>> +#endif
>> +
>>   /*------------------------- 16 bit atomic operations -------------------------*/
>>
>>   #ifndef RTE_FORCE_INTRINSICS
>> --
>> 1.9.1
>

