[dpdk-dev] Fwd: high latency detected in IP pipeline example

Victor Huertas vhuertas at gmail.com
Tue Feb 18 10:49:51 CET 2020


Dear James,

I have done two different tests with the following configuration:
[PIPELINE 0 MASTER core =0]
[PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=1] -----SWQ2---->
[PIPELINE 3 core=1]

The first test (sending a single ping to cross all the pipelines to measure
RTT) has been done by setting the burst_write to 32 in SWQ1 and SWQ2. NOTE:
All the times we use rte_ring_enqueue_burst in the pipelines 1 and 2 we set
the number of packets to write to 1.

The result of this first test is as shown subsquently:
64 bytes from 192.168.0.101: icmp_seq=343 ttl=63 time=59.8 ms
64 bytes from 192.168.0.101: icmp_seq=344 ttl=63 time=59.4 ms
64 bytes from 192.168.0.101: icmp_seq=345 ttl=63 time=59.2 ms
64 bytes from 192.168.0.101: icmp_seq=346 ttl=63 time=59.0 ms
64 bytes from 192.168.0.101: icmp_seq=347 ttl=63 time=59.0 ms
64 bytes from 192.168.0.101: icmp_seq=348 ttl=63 time=59.2 ms
64 bytes from 192.168.0.101: icmp_seq=349 ttl=63 time=59.3 ms
64 bytes from 192.168.0.101: icmp_seq=350 ttl=63 time=59.1 ms
64 bytes from 192.168.0.101: icmp_seq=351 ttl=63 time=58.9 ms
64 bytes from 192.168.0.101: icmp_seq=352 ttl=63 time=58.5 ms
64 bytes from 192.168.0.101: icmp_seq=353 ttl=63 time=58.4 ms
64 bytes from 192.168.0.101: icmp_seq=354 ttl=63 time=58.0 ms
64 bytes from 192.168.0.101: icmp_seq=355 ttl=63 time=58.4 ms
64 bytes from 192.168.0.101: icmp_seq=356 ttl=63 time=57.7 ms
64 bytes from 192.168.0.101: icmp_seq=357 ttl=63 time=56.9 ms
64 bytes from 192.168.0.101: icmp_seq=358 ttl=63 time=57.2 ms
64 bytes from 192.168.0.101: icmp_seq=359 ttl=63 time=57.5 ms
64 bytes from 192.168.0.101: icmp_seq=360 ttl=63 time=57.3 ms

As you can see, the RTT is quite high and the range of values is more or
less stable.

The second test is the same as the first one but setting burst_write to 1
for all SWQs. The result is this one:

64 bytes from 192.168.0.101: icmp_seq=131 ttl=63 time=10.6 ms
64 bytes from 192.168.0.101: icmp_seq=132 ttl=63 time=10.6 ms
64 bytes from 192.168.0.101: icmp_seq=133 ttl=63 time=10.5 ms
64 bytes from 192.168.0.101: icmp_seq=134 ttl=63 time=10.7 ms
64 bytes from 192.168.0.101: icmp_seq=135 ttl=63 time=10.8 ms
64 bytes from 192.168.0.101: icmp_seq=136 ttl=63 time=10.4 ms
64 bytes from 192.168.0.101: icmp_seq=137 ttl=63 time=10.7 ms
64 bytes from 192.168.0.101: icmp_seq=138 ttl=63 time=10.5 ms
64 bytes from 192.168.0.101: icmp_seq=139 ttl=63 time=10.4 ms
64 bytes from 192.168.0.101: icmp_seq=140 ttl=63 time=10.2 ms
64 bytes from 192.168.0.101: icmp_seq=141 ttl=63 time=10.4 ms
64 bytes from 192.168.0.101: icmp_seq=142 ttl=63 time=10.9 ms
64 bytes from 192.168.0.101: icmp_seq=143 ttl=63 time=11.4 ms
64 bytes from 192.168.0.101: icmp_seq=144 ttl=63 time=11.3 ms
64 bytes from 192.168.0.101: icmp_seq=145 ttl=63 time=11.5 ms
64 bytes from 192.168.0.101: icmp_seq=146 ttl=63 time=11.6 ms
64 bytes from 192.168.0.101: icmp_seq=147 ttl=63 time=11.0 ms
64 bytes from 192.168.0.101: icmp_seq=148 ttl=63 time=11.3 ms
64 bytes from 192.168.0.101: icmp_seq=149 ttl=63 time=12.0 ms
64 bytes from 192.168.0.101: icmp_seq=150 ttl=63 time=12.6 ms
64 bytes from 192.168.0.101: icmp_seq=151 ttl=63 time=12.4 ms
64 bytes from 192.168.0.101: icmp_seq=152 ttl=63 time=12.3 ms
64 bytes from 192.168.0.101: icmp_seq=153 ttl=63 time=12.8 ms
64 bytes from 192.168.0.101: icmp_seq=154 ttl=63 time=12.4 ms
64 bytes from 192.168.0.101: icmp_seq=155 ttl=63 time=12.8 ms
64 bytes from 192.168.0.101: icmp_seq=156 ttl=63 time=12.7 ms
64 bytes from 192.168.0.101: icmp_seq=157 ttl=63 time=12.6 ms
64 bytes from 192.168.0.101: icmp_seq=158 ttl=63 time=12.9 ms
64 bytes from 192.168.0.101: icmp_seq=159 ttl=63 time=13.4 ms
64 bytes from 192.168.0.101: icmp_seq=160 ttl=63 time=13.8 ms
64 bytes from 192.168.0.101: icmp_seq=161 ttl=63 time=13.4 ms
64 bytes from 192.168.0.101: icmp_seq=162 ttl=63 time=13.3 ms
64 bytes from 192.168.0.101: icmp_seq=163 ttl=63 time=13.3 ms
64 bytes from 192.168.0.101: icmp_seq=164 ttl=63 time=13.7 ms
64 bytes from 192.168.0.101: icmp_seq=165 ttl=63 time=13.7 ms
64 bytes from 192.168.0.101: icmp_seq=166 ttl=63 time=13.8 ms
64 bytes from 192.168.0.101: icmp_seq=167 ttl=63 time=14.7 ms
64 bytes from 192.168.0.101: icmp_seq=168 ttl=63 time=14.7 ms
64 bytes from 192.168.0.101: icmp_seq=169 ttl=63 time=14.7 ms
64 bytes from 192.168.0.101: icmp_seq=170 ttl=63 time=14.7 ms
64 bytes from 192.168.0.101: icmp_seq=171 ttl=63 time=14.6 ms
64 bytes from 192.168.0.101: icmp_seq=172 ttl=63 time=14.6 ms
64 bytes from 192.168.0.101: icmp_seq=173 ttl=63 time=14.5 ms
64 bytes from 192.168.0.101: icmp_seq=174 ttl=63 time=14.5 ms
64 bytes from 192.168.0.101: icmp_seq=175 ttl=63 time=15.1 ms
64 bytes from 192.168.0.101: icmp_seq=176 ttl=63 time=15.6 ms
64 bytes from 192.168.0.101: icmp_seq=177 ttl=63 time=16.0 ms
64 bytes from 192.168.0.101: icmp_seq=178 ttl=63 time=16.9 ms
64 bytes from 192.168.0.101: icmp_seq=179 ttl=63 time=17.7 ms
64 bytes from 192.168.0.101: icmp_seq=180 ttl=63 time=17.6 ms
64 bytes from 192.168.0.101: icmp_seq=181 ttl=63 time=17.9 ms
64 bytes from 192.168.0.101: icmp_seq=182 ttl=63 time=17.9 ms
64 bytes from 192.168.0.101: icmp_seq=183 ttl=63 time=18.5 ms
64 bytes from 192.168.0.101: icmp_seq=184 ttl=63 time=18.9 ms
64 bytes from 192.168.0.101: icmp_seq=185 ttl=63 time=19.8 ms
64 bytes from 192.168.0.101: icmp_seq=186 ttl=63 time=19.8 ms
64 bytes from 192.168.0.101: icmp_seq=187 ttl=63 time=10.7 ms
64 bytes from 192.168.0.101: icmp_seq=188 ttl=63 time=10.5 ms
64 bytes from 192.168.0.101: icmp_seq=189 ttl=63 time=10.4 ms
64 bytes from 192.168.0.101: icmp_seq=190 ttl=63 time=10.3 ms
64 bytes from 192.168.0.101: icmp_seq=191 ttl=63 time=10.5 ms
64 bytes from 192.168.0.101: icmp_seq=192 ttl=63 time=10.7 ms
As you mentioned, the delay has decreased a lot but it is still
considerably high (in a normal router this delay is less than 1 ms).
A second strange behaviour is seen in the evolution of the RTT detected. It
begins in 10 ms and goes increasing little by litttle to reach a peak of 20
ms aprox and then it suddely comes back to 10 ms again to increase again
till 20 ms.

Is this the behaviour you have in your case when the burst_write is set to
1?

Regards,

El mar., 18 feb. 2020 a las 8:18, James Huang (<jamsphon at gmail.com>)
escribió:

> No. We didn't see noticable throughput difference in our test.
>
> On Mon., Feb. 17, 2020, 11:04 p.m. Victor Huertas <vhuertas at gmail.com>
> wrote:
>
>> Thanks James for your quick answer.
>> I guess that this configuration modification implies that the packets
>> must be written one by one in the sw ring. Did you notice loose of
>> performance (in throughput) in your aplicación because of that?
>>
>> Regards
>>
>> El mar., 18 feb. 2020 0:10, James Huang <jamsphon at gmail.com> escribió:
>>
>>> Yes, I experienced similar issue in my application. In a short answer,
>>> set the swqs write burst value to 1 may reduce the latency significantly.
>>> The default write burst value is 32.
>>>
>>> On Mon., Feb. 17, 2020, 8:41 a.m. Victor Huertas <vhuertas at gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am developing my own DPDK application basing it in the dpdk-stable
>>>> ip_pipeline example.
>>>> At this moment I am using the 17.11 LTS version of DPDK and I amb
>>>> observing
>>>> some extrange behaviour. Maybe it is an old issue that can be solved
>>>> quickly so I would appreciate it if some expert can shade a light on
>>>> this.
>>>>
>>>> The ip_pipeline example allows you to develop Pipelines that perform
>>>> specific packet processing functions (ROUTING, FLOW_CLASSIFYING,
>>>> etc...).
>>>> The thing is that I am extending some of this pipelines with my own.
>>>> However I want to take advantage of the built-in ip_pipeline capability
>>>> of
>>>> arbitrarily assigning the logical core where the pipeline (f_run()
>>>> function) must be executed so that i can adapt the packet processing
>>>> power
>>>> to the amount of the number of cores available.
>>>> Taking this into account I have observed something strange. I show you
>>>> this
>>>> simple example below.
>>>>
>>>> Case 1:
>>>> [PIPELINE 0 MASTER core =0]
>>>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=2] -----SWQ2---->
>>>> [PIPELINE 3 core=3]
>>>>
>>>> Case 2:
>>>> [PIPELINE 0 MASTER core =0]
>>>> [PIPELINE 1 core=1] --- SWQ1--->[PIPELINE 2 core=1] -----SWQ2---->
>>>> [PIPELINE 3 core=1]
>>>>
>>>> I send a ping between two hosts connected at both sides of the pipeline
>>>> model which allows these pings to cross all the pipelines (from 1 to 3).
>>>> What I observe in Case 1 (each pipeline has its own thread in different
>>>> core) is that the reported RTT is less than 1 ms, whereas in Case 2 (all
>>>> pipelines except MASTER are run in the same thread) is 20 ms.
>>>> Furthermore,
>>>> in Case 2, if I increase a lot (hundreds of Mbps) the packet rate this
>>>> RTT
>>>> decreases to 3 or 4 ms.
>>>>
>>>> Has somebody observed this behaviour in the past? Can it be solved
>>>> somehow?
>>>>
>>>> Thanks a lot for your attention
>>>> --
>>>> Victor
>>>>
>>>>
>>>> --
>>>> Victor
>>>>
>>>

-- 
Victor


More information about the dev mailing list