[dpdk-dev] [PATCH v1 5/7] power: support callbacks for multiple Rx queues
Ananyev, Konstantin
konstantin.ananyev at intel.com
Tue Jun 22 11:41:11 CEST 2021
> Currently, there is a hard limitation on the PMD power management
> support that only allows it to support a single queue per lcore. This is
> not ideal as most DPDK use cases will poll multiple queues per core.
>
> The PMD power management mechanism relies on ethdev Rx callbacks, so it
> is very difficult to implement such support because callbacks are
> effectively stateless and have no visibility into what the other ethdev
> devices are doing. This places limitations on what we can do within the
> framework of Rx callbacks, but the basics of this implementation are as
> follows:
>
> - Replace per-queue structures with per-lcore ones, so that any device
> polled from the same lcore can share data
> - Any queue that is going to be polled from a specific lcore has to be
> added to the list of cores to poll, so that the callback is aware of
> other queues being polled by the same lcore
> - Both the empty poll counter and the actual power saving mechanism are
> shared between all queues polled on a particular lcore, and are only
> activated when a specially designated "power saving" queue is polled. To
> put it another way, we have no idea which queues the user will poll and in
> what order, so we rely on them telling us that queue X is the last one
> in the polling loop, so any power management should happen there.
> - A new API is added to mark a specific Rx queue as "power saving".
Honestly, I don't understand the logic behind that new function.
I understand that depending on HW we can monitor either one or multiple queues.
That's OK, but why do we now need to mark one queue as a 'very special' one?
Why can't rte_power_ethdev_pmgmt_queue_enable() just:
check whether the number of monitored queues exceeds HW/SW capabilities,
and if so, return a failure;
otherwise add the queue to the list and treat all queues equally, i.e.
go to power-save mode when the number of sequential empty polls on
all monitored queues exceeds the EMPTYPOLL_MAX threshold?
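Just to make the suggestion concrete, a rough sketch of what such an Rx callback
could look like (the function name, queue_lookup() and the per-queue/per-lcore
counters below are made up for illustration, they are not taken from the patch):

	static uint16_t
	clb_multiqueue_pause(uint16_t port_id, uint16_t qidx,
			struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
			uint16_t max_pkts __rte_unused, void *arg __rte_unused)
	{
		struct pmd_core_cfg *cfg = &lcore_cfg[rte_lcore_id()];
		struct queue_stats *q = queue_lookup(cfg, port_id, qidx);

		if (nb_rx == 0) {
			q->empty_polls++;
			/* this queue has just crossed the threshold */
			if (q->empty_polls == EMPTYPOLL_MAX)
				cfg->n_empty_queues++;
			/* every queue monitored on this lcore is "empty" - sleep */
			if (cfg->n_empty_queues == cfg->n_queues)
				rte_pause();
		} else {
			/* queue is active again - take it out of the "empty" set */
			if (q->empty_polls >= EMPTYPOLL_MAX)
				cfg->n_empty_queues--;
			q->empty_polls = 0;
		}
		return nb_rx;
	}

That way all monitored queues stay equal, and the sleep decision is made purely
from the per-lcore state, regardless of which queue happens to be polled last.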
> Failing to call this API will result in no power management; however,
> when there is only one queue per core it is obvious which queue is the
> "power saving" one, so things will still work without this new API for
> use cases that were previously working without it.
> - The limitation on UMWAIT-based polling is not removed because UMWAIT
> is incapable of monitoring more than one address.
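To make the intended usage described in the list above concrete, here is a rough
sketch of an application polling loop under this scheme. The name and signature of
the new "power saving" marker API are assumed purely for illustration; only
rte_power_ethdev_pmgmt_queue_enable() is an existing call:

	static void
	poll_loop(uint16_t port_id)
	{
		struct rte_mbuf *pkts[32];
		const unsigned int lcore_id = rte_lcore_id();
		uint16_t nb;

		/* enable PMD power management for both queues polled on this lcore */
		rte_power_ethdev_pmgmt_queue_enable(lcore_id, port_id, 0,
				RTE_POWER_MGMT_TYPE_PAUSE);
		rte_power_ethdev_pmgmt_queue_enable(lcore_id, port_id, 1,
				RTE_POWER_MGMT_TYPE_PAUSE);
		/* assumed marker API: queue 1 is the last one polled in the loop,
		 * so it is where empty-poll accounting and sleeping should happen */
		rte_power_ethdev_pmgmt_queue_set_power_save(lcore_id, port_id, 1);

		for (;;) {
			nb = rte_eth_rx_burst(port_id, 0, pkts, RTE_DIM(pkts));
			/* ... process nb packets from queue 0 ... */
			nb = rte_eth_rx_burst(port_id, 1, pkts, RTE_DIM(pkts));
			/* ... process nb packets from queue 1; power saving, if any,
			 * is triggered from this queue's Rx callback ... */
		}
	}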
>
> Signed-off-by: Anatoly Burakov <anatoly.burakov at intel.com>
> ---
> lib/power/rte_power_pmd_mgmt.c | 335 ++++++++++++++++++++++++++-------
> lib/power/rte_power_pmd_mgmt.h | 34 ++++
> lib/power/version.map | 3 +
> 3 files changed, 306 insertions(+), 66 deletions(-)
>
> diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
> index 0707c60a4f..60dd21a19c 100644
> --- a/lib/power/rte_power_pmd_mgmt.c
> +++ b/lib/power/rte_power_pmd_mgmt.c
> @@ -33,7 +33,19 @@ enum pmd_mgmt_state {
> PMD_MGMT_ENABLED
> };
>
> -struct pmd_queue_cfg {
> +struct queue {
> + uint16_t portid;
> + uint16_t qid;
> +};
Just a thought: if that would help somehow, it could be changed to:

	union queue {
		uint32_t raw;
		struct {
			uint16_t portid;
			uint16_t qid;
		};
	};

That way, in the queue find/cmp functions below you can operate on a single raw 32-bit value.
Probably not that important, as all these functions are on the slow path, but it might look nicer.
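For example, the compare/copy helpers from the patch could then become simply
(illustrative only):

	static inline bool
	queue_equal(const union queue *l, const union queue *r)
	{
		return l->raw == r->raw;
	}

	static inline void
	queue_copy(union queue *dst, const union queue *src)
	{
		dst->raw = src->raw;
	}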
> +struct pmd_core_cfg {
> + struct queue queues[RTE_MAX_ETHPORTS];
If we have the ability to monitor multiple queues per lcore, will RTE_MAX_ETHPORTS entries always be enough?
On the other hand, this is updated on the control path only.
Wouldn't a normal list with malloc()/rte_malloc() be more suitable here?
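Something along these lines is what I have in mind - a sketch only, using a
sys/queue.h TAILQ and rte_malloc(); the entry and field names are made up:

	#include <sys/queue.h>
	#include <errno.h>
	#include <rte_malloc.h>

	struct queue_list_entry {
		struct queue queue;
		TAILQ_ENTRY(queue_list_entry) next;
	};

	struct pmd_core_cfg {
		TAILQ_HEAD(queue_list_head, queue_list_entry) head;
		size_t n_queues;
		/* ... rest of the per-lcore state stays as in the patch ... */
	};

	/* control path only, so dynamic allocation cost does not matter */
	static int
	queue_list_add(struct pmd_core_cfg *cfg, const struct queue *q)
	{
		struct queue_list_entry *e;

		e = rte_malloc(NULL, sizeof(*e), 0);
		if (e == NULL)
			return -ENOMEM;
		e->queue = *q;
		TAILQ_INSERT_TAIL(&cfg->head, e, next);
		cfg->n_queues++;
		return 0;
	}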
> + /**< Which port-queue pairs are associated with this lcore? */
> + struct queue power_save_queue;
> + /**< When polling multiple queues, all but this one will be ignored */
> + bool power_save_queue_set;
> + /**< When polling multiple queues, power save queue must be set */
> + size_t n_queues;
> + /**< How many queues are in the list? */
> volatile enum pmd_mgmt_state pwr_mgmt_state;
> /**< State of power management for this queue */
> enum rte_power_pmd_mgmt_type cb_mode;
> @@ -43,8 +55,97 @@ struct pmd_queue_cfg {
> uint64_t empty_poll_stats;
> /**< Number of empty polls */
> } __rte_cache_aligned;
> +static struct pmd_core_cfg lcore_cfg[RTE_MAX_LCORE];
>
> -static struct pmd_queue_cfg port_cfg[RTE_MAX_ETHPORTS][RTE_MAX_QUEUES_PER_PORT];
> +static inline bool
> +queue_equal(const struct queue *l, const struct queue *r)
> +{
> + return l->portid == r->portid && l->qid == r->qid;
> +}
> +
> +static inline void
> +queue_copy(struct queue *dst, const struct queue *src)
> +{
> + dst->portid = src->portid;
> + dst->qid = src->qid;
> +}
> +
> +static inline bool
> +queue_is_power_save(const struct pmd_core_cfg *cfg, const struct queue *q) {
Here and in other places - any reason why standard DPDK coding style is not used?
> + const struct queue *pwrsave = &cfg->power_save_queue;
> +
> + /* if there's only single queue, no need to check anything */
> + if (cfg->n_queues == 1)
> + return true;
> + return cfg->power_save_queue_set && queue_equal(q, pwrsave);
> +}
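For reference on the coding-style remark above, the same function with the
opening brace laid out per the DPDK coding style would be:

	static inline bool
	queue_is_power_save(const struct pmd_core_cfg *cfg, const struct queue *q)
	{
		const struct queue *pwrsave = &cfg->power_save_queue;

		/* if there's only a single queue, no need to check anything */
		if (cfg->n_queues == 1)
			return true;
		return cfg->power_save_queue_set && queue_equal(q, pwrsave);
	}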
> +