[EXTERNAL] [PATCH v2 2/2] eal: add Arm WFET in power management intrinsics
Honnappa Nagarahalli
Honnappa.Nagarahalli at arm.com
Sun Jul 7 19:37:18 CEST 2024

> On Jul 5, 2024, at 5:10 PM, Pavan Nikhilesh Bhagavatula <pbhagavatula at marvell.com> wrote:
> 
>> 04/07/2024 16:55, Stephen Hemminger:
>>> On Thu, 04 Jul 2024 16:14:42 +0200
>>> Thomas Monjalon <thomas at monjalon.net> wrote:
>>> 
>>>>>> Let’s ask Pavan why this flag is used in cn10k driver.
>>>>>> 
>>>>>> From our perspective, WFE is available on all the supported Arm
>>>>>> platforms in DPDK. Therefore, RTE_ARM_USE_WFE should be treated as
>>>>>> a flag to choose between WFE and non-WFE code paths for performance
>>>>>> reasons, rather than as a flag that indicates the availability of
>>>>>> the instruction on the target CPU.
>>>>>> 
>>>>> 
>>>>> We are using this flag to allow the application to choose between the
>>>>> WFE and non-WFE code paths.
>>>>> The non-WFE path performs slightly better.
>>>> 
>>>> What's the benefit of the WFE path then?
>>> 
>>> WFE saves power at the expense of latency.
>> 
>> Yes, maybe there is a misunderstanding.
>> Pavan, can you confirm you were saying "throughput is better on non-WFE"
>> but "power consumption is lower on WFE path"?
>> 
> 
> Yes, throughput is better on non-WFE and power consumption is lower on WFE path.
> 
> But the statement can't be generalized to all use cases; it depends on a lot of factors.
> So, we use RTE_ARM_USE_WFE to allow applications to decide what they want.

When WFE was enabled in DPDK, it was introduced in the spinlock, ticket lock, ring, etc. We ran the relevant micro-benchmarks and found that performance was lower with WFE. Hence it was placed under a flag so that the user can choose whether to use the feature (not as a way to indicate that the feature is present in the CPU).

IMO, we should not use this flag for PMD power savings. In a PMD, WFE is used purely for power savings, not for performance. IIRC, there is already code, with enough configurable parameters, that controls when the PMD calls WFE (or its equivalent on other architectures). So there is no need for a compile-time flag for this.
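
To illustrate what "a flag to choose the code path" means, here is a rough, standalone sketch. It is not the DPDK source: DEMO_USE_WFE is a hypothetical stand-in for RTE_ARM_USE_WFE, and the demo_* names are made up for this example. The WFE branch follows the usual sevl / wfe / load-exclusive pattern and is only compiled on aarch64 when the flag is defined; otherwise the function busy-polls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * DEMO_USE_WFE is a hypothetical stand-in for RTE_ARM_USE_WFE. It selects a
 * code path at build time; it does not probe whether the CPU implements WFE.
 */
#if defined(DEMO_USE_WFE) && defined(__aarch64__)

/* Load-acquire exclusive: arms the monitor so a later store wakes WFE. */
static inline uint32_t demo_load_excl_32(_Atomic uint32_t *addr)
{
	uint32_t v;

	__asm__ volatile("ldaxr %w[v], [%x[a]]"
			 : [v] "=&r"(v)
			 : [a] "r"(addr)
			 : "memory");
	return v;
}

/* WFE path: lower power, typically somewhat higher wake-up latency. */
static inline void demo_wait_until_equal_32(_Atomic uint32_t *addr,
					    uint32_t expected)
{
	assert(addr != NULL);
	if (demo_load_excl_32(addr) == expected)
		return;
	/* Set the local event so the first WFE falls through. */
	__asm__ volatile("sevl" ::: "memory");
	do {
		/* Sleep until an event, e.g. a store to the monitored line. */
		__asm__ volatile("wfe" ::: "memory");
	} while (demo_load_excl_32(addr) != expected);
}

#else

/* Non-WFE path: plain busy poll, no power saving. */
static inline void demo_wait_until_equal_32(_Atomic uint32_t *addr,
					    uint32_t expected)
{
	assert(addr != NULL);
	while (atomic_load_explicit(addr, memory_order_acquire) != expected)
		;
}

#endif
```

Both branches expose the same function, so callers such as a lock or ring do not change; only the build option decides which trade-off they get.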
> 
>>> Maybe some form of hybrid approach would work best and could
>>> be always used.
>>> 
>>> For example, many implementations of mutex do a short spin poll
>>> then fall back to a waiting primitive (like futex).
> 
> This is already done across cnxk drivers and common layer I believe.
> 
>> 
> 
> 
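
For reference, the spin-then-wait idea quoted above can be sketched roughly as below. This is only an illustration under assumptions, not the cnxk code: demo_hybrid_wait, demo_slow_wait_fn and demo_fake_producer are hypothetical names, and the slow-wait callback stands in for whatever waiting primitive is used (WFE, futex, etc.).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hook for the waiting primitive (WFE, futex wait, ...). */
typedef void (*demo_slow_wait_fn)(void *arg);

/*
 * Poll up to spin_limit times, then fall back to slow_wait() until the
 * value matches. Returns how many times the slow wait was entered
 * (0 means the spin phase was enough).
 */
static unsigned int demo_hybrid_wait(_Atomic uint32_t *addr, uint32_t expected,
				     unsigned int spin_limit,
				     demo_slow_wait_fn slow_wait, void *arg)
{
	unsigned int slow_waits = 0;

	assert(addr != NULL && slow_wait != NULL);

	/* Phase 1: short spin, best latency if the value arrives quickly. */
	for (unsigned int i = 0; i < spin_limit; i++) {
		if (atomic_load_explicit(addr, memory_order_acquire) == expected)
			return 0;
	}

	/* Phase 2: waiting primitive, trades latency for power. */
	while (atomic_load_explicit(addr, memory_order_acquire) != expected) {
		slow_wait(arg);
		slow_waits++;
	}
	return slow_waits;
}

/*
 * Single-threaded demo only: pretends another core published the value
 * while we were waiting.
 */
static void demo_fake_producer(void *arg)
{
	atomic_store_explicit((_Atomic uint32_t *)arg, 1, memory_order_release);
}
```

Here spin_limit is a runtime parameter, which is the kind of knob that makes a compile-time flag unnecessary for this purpose.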
    
    