[dpdk-dev] VMXNET3 on vmware, ping delay

Vass, Sandor (Nokia - HU/Budapest) sandor.vass at nokia.com
Thu Jun 25 11:14:53 CEST 2015


Hello,
I would like to create an IP packet processor program and I choose to use DPDK because it is promising wrt its speed aspect.

I am trying to build a test environment to make the development a cheaper (not to buy HW for each developer), so I created a test setup in
- VMWare Workstation 11
- using DPDK 2.0.0
- with linux kernel 3.10.0, CentOS7
- gcc 4.8.3
- and standard, centos7 provided VMXNET3 driver, with uio_pci_generic kernel module
(shall I use vmxnet3-usermap.ko with dpdk 2.0.0? Where is it, how could I compile it?)

I set up 3 machines:
- set all machines' network interface type to VMXNET3
- set up one machine (C1) for issuing ping, its interface has an IP: 192.168.3.21
- set up one machine (C2) for being the ping target, its interface has an IP: 192.168.3.23
- set up one machine (BR) to act a L2 bridge using some of the examples provided. DPDK is compiled properly, 256x  2MB hugetables created, example application is executed and running without (major) error.
- three machines are connected linearly:  C1 - BR - C2 using two private networks on each side of BR (VMnet2 and VMnet3), so the VMs are connected by vSwitches

Ping reply arrives, definitely goes through BR (extra console logs), but there are unexpected delays with example/skeleton/basicfwd...
[root at localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1018 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=18.7 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=8.87 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=10.2 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1012 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=12.7 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=1049 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=49.8 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=9.02 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=8.74 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1007 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=8.03 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=8.96 ms
64 bytes from 192.168.3.23: icmp_seq=19 ttl=64 time=1008 ms
64 bytes from 192.168.3.23: icmp_seq=20 ttl=64 time=9.27 ms
64 bytes from 192.168.3.23: icmp_seq=21 ttl=64 time=1008 ms
...

When I switched on BR to multi_process/client_server_mp, with 2 client processes the result was almost the same:
[root at localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=3.50 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=1002 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=3.94 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=1010 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=2003 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=2.29 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=3002 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=2.66 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=2.87 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=3003 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=2.88 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=1003 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=1001 ms
64 bytes from 192.168.3.23: icmp_seq=18 ttl=64 time=2.70 ms
...

And when I switched on BR to test-pdm, the ping result was kind of normal (every commandline switch left as default)
[root at localhost ~]# ping 192.168.3.23
PING 192.168.3.23 (192.168.3.23) 56(84) bytes of data.
64 bytes from 192.168.3.23: icmp_seq=1 ttl=64 time=3.52 ms
64 bytes from 192.168.3.23: icmp_seq=2 ttl=64 time=33.2 ms
64 bytes from 192.168.3.23: icmp_seq=3 ttl=64 time=3.97 ms
64 bytes from 192.168.3.23: icmp_seq=4 ttl=64 time=25.5 ms
64 bytes from 192.168.3.23: icmp_seq=5 ttl=64 time=61.1 ms
64 bytes from 192.168.3.23: icmp_seq=6 ttl=64 time=36.3 ms
64 bytes from 192.168.3.23: icmp_seq=7 ttl=64 time=35.5 ms
64 bytes from 192.168.3.23: icmp_seq=8 ttl=64 time=33.0 ms
64 bytes from 192.168.3.23: icmp_seq=9 ttl=64 time=5.32 ms
64 bytes from 192.168.3.23: icmp_seq=10 ttl=64 time=14.6 ms
64 bytes from 192.168.3.23: icmp_seq=11 ttl=64 time=34.5 ms
64 bytes from 192.168.3.23: icmp_seq=12 ttl=64 time=4.67 ms
64 bytes from 192.168.3.23: icmp_seq=13 ttl=64 time=55.0 ms
64 bytes from 192.168.3.23: icmp_seq=14 ttl=64 time=4.93 ms
64 bytes from 192.168.3.23: icmp_seq=15 ttl=64 time=5.98 ms
64 bytes from 192.168.3.23: icmp_seq=16 ttl=64 time=5.41 ms
64 bytes from 192.168.3.23: icmp_seq=17 ttl=64 time=21.0 ms
...

Though I think these values are still quite high I can accept that as this is a virtualized environment.

Could someone please explain to me what is going on with the basicfwd and client-server exapmles? According to my understanding each packet should go through BR as fast as possible, but it seems that the rte_eth_rx_burst retrieves packets only when there are at least 2 packets on the RX queue of the NIC. At least most of the times as there are cases (rarely - according to my console log) when it can retrieve 1 packet also and sometimes only 3 packets can be retrieved...

What is the difference that makes test-pdm working without major delay and the others don't?



Thanks,
Sandor



More information about the dev mailing list