[dpdk-dev] QoS grinder vs pipe wrr_tokens

Dumitrescu, Cristian cristian.dumitrescu at intel.com
Wed Jun 8 17:23:32 CEST 2016

Hi Alexey,

The WRR context is compressed to reduce its memory footprint so that the entire pipe run-time context (struct rte_sched_pipe) fits into a single cache line for performance reasons. Basically, we trade WRR accuracy for performance.

For some typical Telco use-cases, the WRR/WFQ accuracy for the traffic class queues is not that important, as the traffic class queue weight ratio is usually large, e.g. 1:4:20. Whether the actual ratio observed at run-time is 1:4:20, 1:5:18 or 1:3:22 matters little: the real intention is to source most of the traffic from the queue with the largest weight, some traffic from the queue with the medium weight, and not to starve the lowest-weight queue. This mode is very similar to strict priority between traffic class queues, except that the lowest priority queues are not starved for long periods of time.

When WRR accuracy is more important than performance, this operation should be disabled.


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alexey
> Bogdanenko
> Sent: Tuesday, June 7, 2016 8:29 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] QoS grinder vs pipe wrr_tokens
> Hello,
> I have a question regarding QoS grinder implementation, specifically,
> about the way queue WRR tokens are copied from pipe to grinder and back.
> First, rte_sched_grinder uses uint16_t and rte_sched_pipe uses uint8_t
> to represent wrr_tokens. Second, instead of just copying the tokens, we
> shift bits by RTE_SCHED_WRR_SHIFT.
> What does it accomplish? Can it lead to lower scheduler accuracy due to
> a round-off error?
> version: v16.04
> Thanks,
> Alexey Bogdanenko
