[dpdk-users] scheduler issue

Alex Kiselev alex at therouter.net
Sat Dec 12 01:45:23 CET 2020


On 2020-12-12 01:20, Singh, Jasvinder wrote:
>> On 11 Dec 2020, at 23:37, Alex Kiselev <alex at therouter.net> wrote:
>> On 2020-12-11 23:55, Singh, Jasvinder wrote:
>> On 11 Dec 2020, at 22:27, Alex Kiselev <alex at therouter.net> wrote:
>>> On 2020-12-11 23:06, Singh, Jasvinder wrote:
>> On 11 Dec 2020, at 21:29, Alex Kiselev <alex at therouter.net> wrote:
>> On 2020-12-08 14:24, Singh, Jasvinder wrote:
>> <snip>
> 
>>> [JS] now, returning to the 1 mbps pipes situation, try reducing the
>>> tc period first at the subport and then at the pipe level, and see if
>>> that helps in getting even traffic across the low bandwidth pipes.
> 
>> reducing the subport tc period from 10 to 5 also solved the problem
>> with the 1 Mbit/s pipes.
>> 
>> so, my second problem has been solved,
>> but the first one, where some of the low bandwidth pipes stop
>> transmitting, still remains.
> 
>> I see, try removing the "pkt_len <= pipe_tc_ov_credits" condition in
>> the grinder_credits_check() code for the oversubscription case;
>> instead use this: pkt_len <= pipe_tc_credits + pipe_tc_ov_credits;
> 
>> if I do what you suggest, I will get this code:
>> 
>> enough_credits = (pkt_len <= subport_tb_credits) &&
>>     (pkt_len <= subport_tc_credits) &&
>>     (pkt_len <= pipe_tb_credits) &&
>>     (pkt_len <= pipe_tc_credits) &&
>>     (pkt_len <= pipe_tc_credits + pipe_tc_ov_credits);
>> 
>> And this doesn't make sense, since if the condition pkt_len <=
>> pipe_tc_credits is true, then the condition (pkt_len <= pipe_tc_credits
>> + pipe_tc_ov_credits) is also always true.
> 
>> [JS] my suggestion is to remove "pkt_len <= pipe_tc_credits" and
>> "pkt_len <= pipe_tc_ov_credits", and use only
>> "pkt_len <= pipe_tc_credits + pipe_tc_ov_credits",
>> while keeping the tc_ov flag on.
> 
>> Your suggestion just turns off the TC_OV feature.
> 
>>> I don't see your point.
>>> This new suggestion will also effectively turn off the TC_OV feature,
>>> since the only effect of enabling TC_OV is adding the additional
>>> condition
>>> pkt_len <= pipe_tc_ov_credits
>>> which doesn't allow a pipe to spend more resources than it should.
>>> And in the case of subport congestion a pipe should spend less than
>>> 100% of the pipe's maximum rate.
>>> And you suggest allowing a pipe to spend 100% of its rate plus some
>>> extra. I guess the effect of this would be an even more unfair
>>> distribution of the subport's bandwidth.
>>> 
>>> Btw, a pipe might stop transmitting even when there is no
>>> congestion at a subport.
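To make the two variants being argued about concrete, here is a
compile-only sketch (not the verbatim rte_sched.c source; the variable
names simply follow the snippet quoted above) of the oversubscription
credit check before and after the suggested change:

#include <stdint.h>

/* Sketch only. All values are byte counts; pkt_len is assumed to
 * already include the frame overhead. */

/* Current oversubscription check: a TC3 packet must fit within both the
 * pipe's guaranteed TC credits and its oversubscription allowance. */
static inline int
credits_check_current(uint32_t pkt_len,
        uint32_t subport_tb_credits, uint32_t subport_tc_credits,
        uint32_t pipe_tb_credits, uint32_t pipe_tc_credits,
        uint32_t pipe_tc_ov_credits)
{
        return (pkt_len <= subport_tb_credits) &&
               (pkt_len <= subport_tc_credits) &&
               (pkt_len <= pipe_tb_credits) &&
               (pkt_len <= pipe_tc_credits) &&
               (pkt_len <= pipe_tc_ov_credits);
}

/* Suggested change: drop the two separate pipe TC conditions and gate
 * on their sum, so a pipe may borrow beyond its guaranteed TC3 share. */
static inline int
credits_check_suggested(uint32_t pkt_len,
        uint32_t subport_tb_credits, uint32_t subport_tc_credits,
        uint32_t pipe_tb_credits, uint32_t pipe_tc_credits,
        uint32_t pipe_tc_ov_credits)
{
        return (pkt_len <= subport_tb_credits) &&
               (pkt_len <= subport_tc_credits) &&
               (pkt_len <= pipe_tb_credits) &&
               (pkt_len <= pipe_tc_credits + pipe_tc_ov_credits);
}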
> 
>> Although I didn't try this solution, the idea here is: in a particular
>> round, if pkt_len is less than pipe_tc_credits (which is a constant
>> value each time) but greater than pipe_tc_ov_credits, then it might hit
>> the situation where no packet will be scheduled from the pipe, even
>> though there are fixed credits greater than the packet size available.
> 
> But that is a perfectly normal situation and that's exactly the idea
> behind TC_OV. It means a pipe should wait for the next
> subport->tc_ov_period_id, when pipe_tc_ov_credits will be reset to a
> new value.
> 
> But here it's not guaranteed that the new value of pipe_tc_ov_credits
> will be sufficient for a low bandwidth pipe to send its packets, since
> pipe_tc_ov_credits is freshly computed each time.
> 
>> pipe->tc_ov_credits = subport->tc_ov_wm * params->tc_ov_weight;
>> 
>> which allows the pipe to continue transmitting.
> 
> No, that won't happen if the new tc_ov_credits value is again less
> than pkt_len; it will hit a deadlock.

the new tc_ov_credits can't be less than subport->tc_ov_wm_min,
and tc_ov_wm_min is equal to port->mtu.
all my scheduler ports are configured with mtu 1522. the etherdev ports
also use the same mtu, therefore there should be no packets bigger than
1522.

Maybe I should increase the port's MTU? to 1540?
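For reference, a minimal sketch of the watermark clamping being discussed
(field names are assumed and this mirrors the behaviour described above,
it is not the exact rte_sched.c code): the subport watermark never drops
below tc_ov_wm_min, which is set to the port MTU, so a freshly computed
pipe_tc_ov_credits should always admit at least one MTU-sized frame:

#include <stdint.h>

/* Assumed field names; a sketch of the clamping behaviour only. */
struct subport_ov_state {
        uint32_t tc_ov_wm;      /* current oversubscription watermark, bytes */
        uint32_t tc_ov_wm_min;  /* lower bound == port MTU (e.g. 1522) */
        uint32_t tc_ov_wm_max;  /* upper bound, derived from tc_period */
        int      tc3_congested; /* 1 when TC3 demand exceeds the subport TC3 rate */
};

/* Called once per subport tc_ov period: shrink the watermark under
 * congestion, grow it back otherwise, always staying within the bounds. */
static uint32_t
tc_ov_wm_update(struct subport_ov_state *s)
{
        if (s->tc3_congested) {
                uint32_t wm = s->tc_ov_wm - (s->tc_ov_wm >> 7);
                s->tc_ov_wm = wm < s->tc_ov_wm_min ? s->tc_ov_wm_min : wm;
        } else {
                uint32_t wm = s->tc_ov_wm + (s->tc_ov_wm >> 7) + 1;
                s->tc_ov_wm = wm > s->tc_ov_wm_max ? s->tc_ov_wm_max : wm;
        }
        /* each pipe is then refilled with tc_ov_wm * tc_ov_weight, so the
         * result can never be smaller than one MTU (for weight >= 1) */
        return s->tc_ov_wm;
}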

> 
>> And it could not cause a permanent pipe stop, which is what I am
>> facing.
> 
>>> In fairness, a pipe should send as many packets as its
>>> pipe_tc_credits allow, regardless of pipe_tc_ov_credits, which is
>>> extra on top of pipe_tc_credits.
>> 
>> I think it's quite the opposite. That's why after I reduced the
>> subport tc_period I got much more fairness, since reducing the subport
>> tc_period also reduces the tc_ov_wm_max value:
>> 
>> s->tc_ov_wm_max = rte_sched_time_ms_to_bytes(params->tc_period,
>>     port->pipe_tc3_rate_max);
>> 
>> As a result a pipe transmits fewer bytes in one round, so pipe
>> rotation inside a grinder happens much more often and a pipe can't
>> monopolise resources.
>> 
>> In other QoS implementations this is called a "quantum".
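A small sketch of that arithmetic (the rate below is an example value,
not the real profile settings): tc_ov_wm_max is one tc_period worth of
bytes at the fastest pipe TC3 rate, so halving the subport tc_period
halves the per-round quantum:

#include <stdio.h>
#include <stdint.h>

/* Conversion implied by the rte_sched_time_ms_to_bytes() call above:
 * a duration in ms times a rate in bytes/s, divided by 1000. */
static uint32_t
time_ms_to_bytes(uint32_t time_ms, uint32_t rate_bytes_per_sec)
{
        return (uint32_t)(((uint64_t)time_ms * rate_bytes_per_sec) / 1000);
}

int main(void)
{
        /* example: fastest pipe TC3 rate of 100 Mbit/s = 12.5 MB/s */
        uint32_t pipe_tc3_rate_max = 12500000;

        printf("tc_period 10 ms -> tc_ov_wm_max = %u bytes\n",
               time_ms_to_bytes(10, pipe_tc3_rate_max)); /* 125000 */
        printf("tc_period  5 ms -> tc_ov_wm_max = %u bytes\n",
               time_ms_to_bytes(5, pipe_tc3_rate_max));  /*  62500 */
        return 0;
}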
> 
> Yes, so reducing the tc period means that all pipes (high and low
> bandwidth) get lower tc_ov_credits values, which allows less
> transmission from the higher bw pipes and leaves bandwidth for the low
> bw pipes. So, here is the thing: either tune the tc period to a value
> which prevents a high bw pipe from hogging most of the bw, or make
> changes in the code so that oversubscription adds extra credits on top
> of the guaranteed ones.
> 
> One question: don't your low bw pipes have higher priority traffic on
> tc0, tc1, tc2? Packets from those tc must be going out. Isn't this the
> case?

well, it would be the case after I find out
what's going on. Right now I am using a tos2tc map configured
in such a way that all ipv4 packets with any TOS value
go into TC3.
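Purely as an illustration of that mapping (this is not the actual
TheRouter tos2tc configuration syntax), a classifier table sending every
TOS value to TC 3 could look like:

#include <stdint.h>

/* Illustration only, not the real tos2tc config format: every IPv4 TOS
 * value is mapped to traffic class 3, so nothing uses tc0..tc2. */
static const uint8_t tos2tc[256] = {
        /* GCC/Clang range-designated initializer; with plain ISO C you
         * would fill the table in a loop instead. */
        [0 ... 255] = 3,
};

static inline uint32_t
classify_tc(uint8_t tos)
{
        return tos2tc[tos];
}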

>>>>> rcv 0   rx rate 7324160 nb pkts 5722
>>>>> rcv 1   rx rate 7281920 nb pkts 5689
>>>>> rcv 2   rx rate 7226880 nb pkts 5646
>>>>> rcv 3   rx rate 7124480 nb pkts 5566
>>>>> rcv 4   rx rate 7324160 nb pkts 5722
>>>>> rcv 5   rx rate 7271680 nb pkts 5681
>>>>> rcv 6   rx rate 7188480 nb pkts 5616
>>>>> rcv 7   rx rate 7150080 nb pkts 5586
>>>>> rcv 8   rx rate 7328000 nb pkts 5725
>>>>> rcv 9   rx rate 7249920 nb pkts 5664
>>>>> rcv 10  rx rate 7188480 nb pkts 5616
>>>>> rcv 11  rx rate 7179520 nb pkts 5609
>>>>> rcv 12  rx rate 7324160 nb pkts 5722
>>>>> rcv 13  rx rate 7208960 nb pkts 5632
>>>>> rcv 14  rx rate 7152640 nb pkts 5588
>>>>> rcv 15  rx rate 7127040 nb pkts 5568
>>>>> rcv 16  rx rate 7303680 nb pkts 5706
>>>>> ....
>>>>> rcv 587 rx rate 2406400 nb pkts 1880
>>>>> rcv 588 rx rate 2406400 nb pkts 1880
>>>>> rcv 589 rx rate 2406400 nb pkts 1880
>>>>> rcv 590 rx rate 2406400 nb pkts 1880
>>>>> rcv 591 rx rate 2406400 nb pkts 1880
>>>>> rcv 592 rx rate 2398720 nb pkts 1874
>>>>> rcv 593 rx rate 2400000 nb pkts 1875
>>>>> rcv 594 rx rate 2400000 nb pkts 1875
>>>>> rcv 595 rx rate 2400000 nb pkts 1875
>>>>> rcv 596 rx rate 2401280 nb pkts 1876
>>>>> rcv 597 rx rate 2401280 nb pkts 1876
>>>>> rcv 598 rx rate 2401280 nb pkts 1876
>>>>> rcv 599 rx rate 2402560 nb pkts 1877
>>>>> rx rate sum 3156416000
> 
>>>>>>> ... despite that there is _NO_ congestion at the subport or pipe.
>>>>>>>> And the subport (!!) doesn't use about 42 Mbit/s of available
>>>>>>>> bandwidth.
>>>>>>>> The only difference between those test configurations is the TC
>>>>>>>> of the generated traffic.
>>>>>>>> Test 1 uses TC 1 while test 2 uses TC 3 (which uses the TC_OV
>>>>>>>> function).
>>>>>>>> So, enabling TC_OV changes the results dramatically.
>>>>>>>> 
>>>>>>>> ##
>>>>>>>> ## test 1
>>>>>>>> ##
>>>>>>>> hqos add profile 7 rate 2 M size 1000000 tc period 40
>>>>>>>> 
>>>>>>>> # qos test port
>>>>>>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>> hqos add port 1 subport 0 rate 300 M size 1000000 tc period 10
>>>>>>>> hqos add port 1 subport 0 pipes 2000 profile 7
>>>>>>>> hqos add port 1 subport 0 pipes 200 profile 23
>>>>>>>> hqos set port 1 lcore 3
>>>>>>>> 
>>>>>>>> port 1 subport rate 300 M
>>>>>>>> number of tx flows 300
>>>>>>>> generator tx rate 1M TC 1
>>>>>>>> ...
>>>>>>>> rcv 284 rx rate 995840  nb pkts 778
>>>>>>>> rcv 285 rx rate 995840  nb pkts 778
>>>>>>>> rcv 286 rx rate 995840  nb pkts 778
>>>>>>>> rcv 287 rx rate 995840  nb pkts 778
>>>>>>>> rcv 288 rx rate 995840  nb pkts 778
>>>>>>>> rcv 289 rx rate 995840  nb pkts 778
>>>>>>>> rcv 290 rx rate 995840  nb pkts 778
>>>>>>>> rcv 291 rx rate 995840  nb pkts 778
>>>>>>>> rcv 292 rx rate 995840  nb pkts 778
>>>>>>>> rcv 293 rx rate 995840  nb pkts 778
>>>>>>>> rcv 294 rx rate 995840  nb pkts 778
>>>>>>>> ...
>>>>>>>> sum of the pipes' rx rates is 298 494 720. OK.
>>>>>>>> The subport rate is equally distributed to 300 pipes.
>>>>>>>> 
>>>>>>>> ##
>>>>>>>> ## test 2
>>>>>>>> ##
>>>>>>>> hqos add profile 7 rate 2 M size 1000000 tc period 40
>>>>>>>> 
>>>>>>>> # qos test port
>>>>>>>> hqos add port 1 rate 10 G mtu 1522 frame overhead 24 queue sizes 64 64 64 64
>>>>>>>> hqos add port 1 subport 0 rate 300 M size 1000000 tc period 10
>>>>>>>> hqos add port 1 subport 0 pipes 2000 profile 7
>>>>>>>> hqos add port 1 subport 0 pipes 200 profile 23
>>>>>>>> hqos set port 1 lcore 3
>>>>>>>> 
>>>>>>>> port 1 subport rate 300 M
>>>>>>>> number of tx flows 300
>>>>>>>> generator tx rate 1M TC 3
>>>>>>>> 
>>>>>>>> h5 ~ # rcli sh qos rcv
>>>>>>>> rcv 0   rx rate 875520  nb pkts 684
>>>>>>>> rcv 1   rx rate 856320  nb pkts 669
>>>>>>>> rcv 2   rx rate 849920  nb pkts 664
>>>>>>>> rcv 3   rx rate 853760  nb pkts 667
>>>>>>>> rcv 4   rx rate 867840  nb pkts 678
>>>>>>>> rcv 5   rx rate 844800  nb pkts 660
>>>>>>>> rcv 6   rx rate 852480  nb pkts 666
>>>>>>>> rcv 7   rx rate 855040  nb pkts 668
>>>>>>>> rcv 8   rx rate 865280  nb pkts 676
>>>>>>>> rcv 9   rx rate 846080  nb pkts 661
>>>>>>>> rcv 10  rx rate 858880  nb pkts 671
>>>>>>>> rcv 11  rx rate 870400  nb pkts 680
>>>>>>>> rcv 12  rx rate 864000  nb pkts 675
>>>>>>>> rcv 13  rx rate 852480  nb pkts 666
>>>>>>>> rcv 14  rx rate 855040  nb pkts 668
>>>>>>>> rcv 15  rx rate 857600  nb pkts 670
>>>>>>>> rcv 16  rx rate 864000  nb pkts 675
>>>>>>>> rcv 17  rx rate 866560  nb pkts 677
>>>>>>>> rcv 18  rx rate 865280  nb pkts 676
>>>>>>>> rcv 19  rx rate 858880  nb pkts 671
>>>>>>>> rcv 20  rx rate 856320  nb pkts 669
>>>>>>>> rcv 21  rx rate 864000  nb pkts 675
>>>>>>>> rcv 22  rx rate 869120  nb pkts 679
>>>>>>>> rcv 23  rx rate 856320  nb pkts 669
>>>>>>>> rcv 24  rx rate 862720  nb pkts 674
>>>>>>>> rcv 25  rx rate 865280  nb pkts 676
>>>>>>>> rcv 26  rx rate 867840  nb pkts 678
>>>>>>>> rcv 27  rx rate 870400  nb pkts 680
>>>>>>>> rcv 28  rx rate 860160  nb pkts 672
>>>>>>>> rcv 29  rx rate 870400  nb pkts 680
>>>>>>>> rcv 30  rx rate 869120  nb pkts 679
>>>>>>>> rcv 31  rx rate 870400  nb pkts 680
>>>>>>>> rcv 32  rx rate 858880  nb pkts 671
>>>>>>>> rcv 33  rx rate 858880  nb pkts 671
>>>>>>>> rcv 34  rx rate 852480  nb pkts 666
>>>>>>>> rcv 35  rx rate 874240  nb pkts 683
>>>>>>>> rcv 36  rx rate 855040  nb pkts 668
>>>>>>>> rcv 37  rx rate 853760  nb pkts 667
>>>>>>>> rcv 38  rx rate 869120  nb pkts 679
>>>>>>>> rcv 39  rx rate 885760  nb pkts 692
>>>>>>>> rcv 40  rx rate 861440  nb pkts 673
>>>>>>>> rcv 41  rx rate 852480  nb pkts 666
>>>>>>>> rcv 42  rx rate 871680  nb pkts 681
>>>>>>>> ...
>>>>>>>> ...
>>>>>>>> rcv 288 rx rate 766720  nb pkts 599
>>>>>>>> rcv 289 rx rate 766720  nb pkts 599
>>>>>>>> rcv 290 rx rate 766720  nb pkts 599
>>>>>>>> rcv 291 rx rate 766720  nb pkts 599
>>>>>>>> rcv 292 rx rate 762880  nb pkts 596
>>>>>>>> rcv 293 rx rate 762880  nb pkts 596
>>>>>>>> rcv 294 rx rate 762880  nb pkts 596
>>>>>>>> rcv 295 rx rate 760320  nb pkts 594
>>>>>>>> rcv 296 rx rate 604160  nb pkts 472
>>>>>>>> rcv 297 rx rate 604160  nb pkts 472
>>>>>>>> rcv 298 rx rate 604160  nb pkts 472
>>>>>>>> rcv 299 rx rate 604160  nb pkts 472
>>>>>>>> rx rate sum 258839040. FAILED.
>>>>>>>> The subport rate is NOT distributed equally between the 300 pipes.
>>>>>>>> Some subport bandwidth (about 42 Mbit/s) is not being used!

