[PATCH 0/2] introduce PM QoS interface

lihuisong (C) lihuisong at huawei.com
Tue Mar 26 03:20:45 CET 2024


Hi Tyler,

On 2024/3/23 1:55, Tyler Retzlaff wrote:
> On Fri, Mar 22, 2024 at 04:54:01PM +0800, lihuisong (C) wrote:
>> +Tyler, +Alan, +Wei, +Long for asking this similar feature on Windows.
>>
>> On 2024/3/21 21:30, Morten Brørup wrote:
>>>> From: lihuisong (C) [mailto:lihuisong at huawei.com]
>>>> Sent: Thursday, 21 March 2024 04.04
>>>>
>>>> Hi Morten,
>>>>
>>>> Thanks for your review.
>>>>
>>>> On 2024/3/20 22:05, Morten Brørup wrote:
>>>>>> From: Huisong Li [mailto:lihuisong at huawei.com]
>>>>>> Sent: Wednesday, 20 March 2024 11.55
>>>>>>
>>>>>> The system-wide CPU latency QoS limit has a positive impact on the idle
>>>>>> state selection in cpuidle governor.
>>>>>>
>>>>>> Linux creates a cpu_dma_latency device under the '/dev' directory, through
>>>>>> which userspace can read the system-wide CPU latency QoS limit and submit
>>>>>> QoS requests.
>>>>>> Please see the PM QoS framework at the following link:
>>>>>> https://docs.kernel.org/power/pm_qos_interface.html?highlight=qos
>>>>>> This feature is supported as of kernel v2.6.25.
>>>>>>
>>>>>> The deeper the idle state, the lower the power consumption, but the longer
>>>>>> the resume time. Some services are delay sensitive and expect a low
>>>>>> resume time, such as interrupt packet receiving mode.
>>>>>>
>>>>>> So this series introduces a PM QoS interface.
>>>>> This looks like a 1:1 wrapper for a Linux kernel feature.
>>>> right
>>>>> Does Windows or BSD offer something similar?
>>>> How can we find out whether Windows or BSD supports a similar feature?
>>> Ask Windows experts or research using Google.
>> I downloaded the FreeBSD source code and didn't find a similar feature.
>> They don't even support the cpuidle feature (this QoS feature affects cpuidle).
>> I didn't find anything useful about this on Windows from Google.
>>
>>
>> @Tyler, @Alan, @Wei and @Long
>>
>> Do you know whether Windows lets userspace read and send a CPU latency
>> value that limits how deep a CPU idle state can be entered?
> it is unlikely you'll find an api that lets you manage things in terms
> of raw latency values as the linux knobs here do. windows more often employs
> policy centric schemes to permit the system to abstract implementation detail.
>
> powercfg is probably the closest thing you can use to tune the same
> things on windows. where you select e.g. the 'performance' scheme but it
> won't allow you to pick specific latency numbers.
>
> https://learn.microsoft.com/en-us/windows-hardware/design/device-experiences/powercfg-command-line-options

Thanks for your feedback. I will take a look at this tool.

>
>>>> The DPDK power lib only works on Linux, according to the meson.build under
>>>> lib/power.
>>>> If other platforms support this feature, they can enable it.
>>> The DPDK power lib currently only works on Linux, yes.
>>> But its API should still be designed to be platform agnostic, so the functions can be implemented on other platforms in the future.
>>>
>>> DPDK is on track to work across multiple platforms, including Windows.
>>> We must always consider other platforms, and not design DPDK APIs as if they are for Linux/BSD only.
>> totally understand you.
> since lib/power isn't built for windows at this time i don't think it's
> appropriate to constrain your innovation. i do appreciate the engagement
> though and would just offer general guidance that if you can design your
> api with some kind of abstraction in mind that would be great and by all
> means if you can figure out how to wrangle powercfg /Qh into satisfying the
> api in a policy centric way it might be kind of nice.
Testing this with powercfg on Windows would be very challenging for me.
So I don't plan to do this on Windows. If you need it, you can add it, OK?
>
> i'll let other windows experts chime in here if they choose.
>
> thanks!
>
>>>>> Furthermore, any high-res timing should use nanoseconds, not microseconds or
>>>>> milliseconds.
>>>>> I realize that the Linux kernel only uses microseconds for these APIs, but
>>>>> the DPDK API should use nanoseconds.
>>>> Nanoseconds are more precise, which is good.
>>>> But how can the DPDK API use nanoseconds when, as you said, the Linux kernel
>>>> only uses microseconds for these APIs?
>>>> The kernel interface only understands an integer value in microseconds.
>>> One solution is to expose nanoseconds in the DPDK API, and in the Linux specific implementation convert from/to microseconds.
>> If so, we have to modify the interface implementation on Linux, because this
>> changes the input/output unit of the interface.
>> DPDK would also have to do this based on the kernel version, which is not good.
>> The cpuidle governor selects an idle state based on the worst-case
>> latency of each idle state.
>> The worst-case latency of each C-state reported by the ACPI tables is in
>> microseconds, per sections 8.4.1.1 _CST (C States) and 8.4.3.3
>> _LPI (Low Power Idle States) of the ACPI spec [1].
>> So it is probably not meaningful to change this interface implementation.
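
For reference, the conversion discussed above (nanoseconds in the DPDK API,
microseconds toward the kernel) could be sketched roughly as follows. The
function names are hypothetical and not part of this series. Rounding down
is an assumption on my side: it means the kernel is never given a looser
limit than the caller asked for.

```c
#include <stdint.h>

#define NS_PER_US 1000ULL

/*
 * Hypothetical helper: convert a latency limit in nanoseconds to the
 * microsecond value the Linux interface expects.
 * Rounds down and clamps to the s32 range.
 */
static inline int32_t
latency_ns_to_us(uint64_t ns)
{
	uint64_t us = ns / NS_PER_US;

	return us > INT32_MAX ? INT32_MAX : (int32_t)us;
}

/*
 * Hypothetical helper: convert a microsecond value read back from the
 * kernel to nanoseconds. Negative values are treated as zero.
 */
static inline uint64_t
latency_us_to_ns(int32_t us)
{
	return us < 0 ? 0 : (uint64_t)us * NS_PER_US;
}
```

With this sketch any request below 1000 ns becomes 0 us, so sub-microsecond
values cannot be told apart on Linux.
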
>>
>> For the cases that need PM QoS in DPDK, I think it is better to set the CPU
>> latency to zero to prevent the service thread from entering deeper idle
>> states.
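
A minimal sketch of such a zero-latency request from userspace, assuming the
behaviour described in the kernel PM QoS document linked above: the process
writes an s32 value in microseconds to the device node, and the request
stays registered only while the file descriptor is held open. The function
names are hypothetical and not the API of this series. The path is passed in
as a parameter only so the sketch can be tried on a regular file; in real
use it would be "/dev/cpu_dma_latency", which usually needs root.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the PM QoS device node and register a CPU
 * latency request of latency_us microseconds.
 * Returns the open fd (keep it open to keep the request), or -1 on failure.
 */
static int
pm_qos_request_open(const char *path, int32_t latency_us)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	/* The value is written as a binary 32-bit signed integer. */
	if (write(fd, &latency_us, sizeof(latency_us)) !=
			(ssize_t)sizeof(latency_us)) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: closing the fd drops the request. */
static void
pm_qos_request_close(int fd)
{
	if (fd >= 0)
		close(fd);
}
```

A service thread would call pm_qos_request_open("/dev/cpu_dma_latency", 0)
at start and pm_qos_request_close() at exit.
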
>>> You might also want to add a note to the in-line documentation of the relevant functions that the Linux implementation only uses microsecond resolution.
>>>
>> [1] https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html
> .

