[dpdk-dev] [PATCH v6 05/10] power: add PMD power management API and callback
Burakov, Anatoly
anatoly.burakov at intel.com
Thu Oct 15 12:31:54 CEST 2020
On 14-Oct-20 7:41 PM, Ananyev, Konstantin wrote:
>> From: Liang Ma <liang.j.ma at intel.com>
>>
>> Add a simple on/off switch that will enable saving power when no
>> packets are arriving. It is based on counting the number of empty
>> polls and, when the number reaches a certain threshold, entering an
>> architecture-defined optimized power state that will either wait
>> until a TSC timestamp expires, or when packets arrive.
>>
>> This API mandates a core-to-single-queue mapping (that is, multiple
>> queued per device are supported, but they have to be polled on different
>> cores).
>>
>> This design is using PMD RX callbacks.
>>
>> 1. UMWAIT/UMONITOR:
>>
>> When a certain threshold of empty polls is reached, the core will go
>> into a power optimized sleep while waiting on an address of next RX
>> descriptor to be written to.
>>
>> 2. Pause instruction
>>
>> Instead of move the core into deeper C state, this method uses the
>> pause instruction to avoid busy polling.
>>
>> 3. Frequency scaling
>> Reuse existing DPDK power library to scale up/down core frequency
>> depending on traffic volume.
>>
>> Signed-off-by: Liang Ma <liang.j.ma at intel.com>
>> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
>> ---
>>
>> Notes:
>> v6:
>> - Added wakeup mechanism for UMWAIT
>> - Removed memory allocation (everything is now allocated statically)
>> - Fixed various typos and comments
>> - Check for invalid queue ID
>> - Moved release notes to this patch
>>
>> v5:
>> - Make error checking more robust
>> - Prevent initializing scaling if ACPI or PSTATE env wasn't set
>> - Prevent initializing UMWAIT path if PMD doesn't support get_wake_addr
>> - Add some debug logging
>> - Replace x86-specific code path to generic path using the intrinsic check
>>
<snip>
>
>
> I think you need to check state here, and _disable() have to set state with lock grabbed.
> Otherwise this lock wouldn't protect you from race conditions.
> As an example:
>
> CP at T0:
> rte_spinlock_lock(&queue_cfg->umwait_lock);
> if (queue_cfg->wait_addr != NULL) //wait_addr == NULL, fallthrough
> rte_spinlock_unlock(&queue_cfg->umwait_lock);
>
> DP at T1:
> rte_spinlock_lock(&queue_cfg->umwait_lock);
> queue_cfg->wait_addr = target_addr;
> monitor_sync(...); // DP was put to sleep
>
> CP at T2:
> queue_cfg->cur_cb = NULL;
> queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
> ret = 0;
>
> rte_power_pmd_mgmt_queue_disable() finished with success,
> but DP core wasn't wokenup.
>
> To be more specific:
> clb_umwait(...) {
> ...
> lock(&qcfg->lck);
> if (qcfg->state == ENABLED) {
> qcfg->wake_addr = addr;
> monitor_sync(addr, ...,&qcfg->lck);
> }
> unlock(&qcfg->lck);
> ...
> }
>
> _disable(...) {
> ...
> lock(&qcfg->lck);
> qcfg->state = DISABLED;
> if (qcfg->wake_addr != NULL)
> monitor_wakeup(qcfg->wake_addr);
> unlock(&qcfg->lock);
> ...
> }
True, didn't think of that. Will fix.
>> +
>> +if (!i.power_monitor) {
>> +RTE_LOG(DEBUG, POWER, "Monitoring intrinsics are not supported\n");
>> +ret = -ENOTSUP;
>> +goto end;
>> +}
>> +
>> +/* check if the device supports the necessary PMD API */
>> +if (rte_eth_get_wake_addr(port_id, queue_id,
>> +&dummy_addr, &dummy_expected,
>> +&dummy_mask, &dummy_sz) == -ENOTSUP) {
>> +RTE_LOG(DEBUG, POWER, "The device does not support rte_eth_rxq_ring_addr_get\n");
>> +ret = -ENOTSUP;
>> +goto end;
>> +}
>> +/* initialize UMWAIT spinlock */
>> +rte_spinlock_init(&queue_cfg->umwait_lock);
>
> I think don't need to do that.
> It supposed to be in valid state (otherwise you are probably in trouble anyway).
This is mostly for initialization, for when we first run the callback. I
suppose we could do it in an RTE_INIT() function or just leave it be
since the spinlocks are part of a statically allocated structure and
will default to 0 anyway (although it wouldn't be proper usage of the
API as that would be relying on implementation detail).
>
>> +
>> +/* initialize data before enabling the callback */
>> +queue_cfg->empty_poll_stats = 0;
>> +queue_cfg->cb_mode = mode;
>> +queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
>> +
>> +queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
>> +clb_umwait, NULL);
>
> Would be a bit cleaner/nicer to move add_rx_callback out of switch() {}
> As you have to do it always anyway.
> Same thought for disable() and remove_rx_callback().
The functions are different for each, so we can't move them out of
switch (unless you're suggesting to unify the callback to handle all
three modes?).
--
Thanks,
Anatoly
More information about the dev
mailing list