[dpdk-users] How to tune configurations for measuring zero packet-loss performance of OVS-DPDK with vhost-user?
WBAHACER at 126.com
Fri Aug 25 10:40:51 CEST 2017
Hi! I've built a testbed to measure the zero packet-loss performance of
OVS-DPDK with vhost-user.
Here are the configurations of my testbed:
1. Host machine (ubuntu14.04.5, Linux-3.19.0-25):
a/ hardware: quad socket with Intel Xeon E5-4603v2 at 2.20GHz (4
cores/socket), 32GB DDR3 memory, dual-port Intel 82599ES NIC
(10Gbps/port, in socket0);
b/ BIOS settings: disable power management options including
C-state, P-state, Step Speedup and set cpu in performance mode;
c/ host OS booting parameters: isolcpus=0-7, nohz_full=0-7,
rcu_nocbs=0-7, intel_iommu=on, iommu=pt and 16 x 1G hugepages
1) version: OVS-2.6.1, DPDK-16.07.2 (using
2) configurations: 2 physical ports (dpdk0 and dpdk1, vfio-pci
driver) and 2 vhost-user ports (vhost0, vhost1) were added to the ovs
bridge (br0), and 1 PMD core (pinned to core 0, in socket0) was used for
forwarding. The forwarding rules were
e/ irq affinity: kill irqbalance and set smp_affinity of all irqs
to 0xff00 (core 8-15).
f/ RT priority: change RT priority of ksoftirqd (chrt -fp 2
$tid), rcuos (chrt -fp 3 $tid) and rcuob (chrt -fp 2 $tid).
2. VM setting
a/ hypervisor: QEMU-2.8.0 and KVM
b/ QEMU command:
qemu-system-x86_64 -enable-kvm -drive file=$IMAGE,if=virtio -cpu host
-smp 3 -m 4G -boot c \
-name $NAME -vnc :$VNC_INDEX -net none \
-mem-prealloc -numa node,memdev=mem \
-chardev socket,id=char1,path=$VHOSTDIR/vhost0 \
-netdev type=vhost-user,id=net1,chardev=char1,vhostforce \
-chardev socket,id=char2,path=$VHOSTDIR/vhost1 \
-netdev type=vhost-user,id=net2,chardev=char2,vhostforce \
c/ Guest OS: ubuntu14.04
d/ Guest OS booting parameters: isolcpus=0-1, nohz_full=0-1,
rcu_nocbs=0-1, and 1 x 1G hugepages
e/ irq affinity and RT priority: remove irqs and change RT priority
of isolated vcpus (vcpu0, vcpu1)
f/ Guest forwarding application: examples/l2fwd built on
dpdk-16.07.2 (using ivshmem target). l2fwd forwards packets from one
port to the other, and each port has its own polling thread to
receive packets.
g/ App configurations: two virtio ports (vhost0, vhost1, using
uio_pci_generic driver) were used by l2fwd, and l2fwd had 2 polling
threads that ran on vcpu0 and vcpu1 (pinned to physical core1 and core2,
3. Traffic generator
a/ Spirent TestCenter with 2 x 10G ports was used to generate traffic.
b/ 1 flow with 64B packet size was generated from one port and
sent to dpdk0; the packets were then received and counted at the other
port.
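As a sanity check on the irq affinity value in 1.e, the mask for a core
range can be derived with plain shell arithmetic. This is only a minimal
sketch; the helper name cpu_range_mask is hypothetical and not part of
any tool used in the testbed.

```shell
# Hypothetical helper: print the hex affinity mask covering cores
# first..last (inclusive), e.g. cores 8-15 -> ff00.
cpu_range_mask() (
    first=$1
    last=$2
    mask=0
    core=$first
    while [ "$core" -le "$last" ]; do
        mask=$(( mask | (1 << core) ))   # set the bit for this core
        core=$(( core + 1 ))
    done
    printf '%x\n' "$mask"
)

cpu_range_mask 8 15   # housekeeping cores that receive the irqs
cpu_range_mask 0 7    # isolated cores (isolcpus=0-7)
```

The first call prints ff00, which matches the smp_affinity value above,
and the second prints ff, so the two masks do not overlap. Writing the
value to /proc/irq/<irq>/smp_affinity requires root and is not done here.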
Here are my results:
1. Max throughput (non zero packet-loss case): 2.03Gbps
2. Max throughput (zero packet-loss case): 100 ~ 200Mbps
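To compare these numbers with per-packet budgets, it can help to convert
them to packets per second. The sketch below assumes the reported rates
are layer-1 rates, i.e. each 64B frame occupies 64 + 20 bytes on the
wire (preamble plus inter-frame gap); if the traffic generator reports
layer-2 rates instead, the results differ. The helper name mbps_to_pps
is hypothetical.

```shell
# Hypothetical helper: convert a rate in Mbps to packets per second.
# Assumption: the rate includes 20 bytes of per-frame wire overhead
# (8B preamble + 12B inter-frame gap) on top of the frame size.
mbps_to_pps() (
    mbps=$1
    frame_bytes=${2:-64}                     # default: 64B frames
    wire_bits=$(( (frame_bytes + 20) * 8 ))  # bits per frame on the wire
    echo $(( mbps * 1000000 / wire_bits ))   # integer division
)

mbps_to_pps 10000   # 10G line rate
mbps_to_pps 2030    # max throughput with loss
mbps_to_pps 200     # upper end of the zero-loss range
```

Under that assumption, 10Gbps corresponds to 14880952 pps, 2.03Gbps to
3020833 pps, and 200Mbps to 297619 pps.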
And I got some information about packet loss from packet statistics in
OVS and l2fwd:
When the input traffic was larger than 200Mbps, there appeared to be 3
packet-loss points: OVS rx from the physical NIC (RX queue was full),
OVS tx to the vhost port (vhost rx queue was full) and l2fwd tx to the
vhost port (vhost tx queue was full).
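Since every loss point above is a full queue, one rough way to reason
about it is to ask how long a consumer may stall before a queue of a
given depth overflows at a given packet rate. The sketch below is
illustrative only: the helper name stall_budget_us is hypothetical, the
queue depth of 256 entries is an assumed value and not taken from this
testbed, and the packet rates assume 64B frames with 20 bytes of wire
overhead.

```shell
# Hypothetical helper: microseconds until a queue with queue_len free
# entries fills if nothing is drained while packets arrive at pps.
stall_budget_us() (
    queue_len=$1
    pps=$2
    echo $(( queue_len * 1000000 / pps ))   # integer division
)

# Assumed queue depth: 256 entries (not measured on this testbed).
stall_budget_us 256 3020833   # at about 2.03Gbps of 64B frames
stall_budget_us 256 297619    # at about 200Mbps of 64B frames
```

With these assumed values the budget is 84 microseconds at the higher
rate and 860 microseconds at the lower one, so a single pause of a
polling thread of that order would be enough to cause drops.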
I don't know why the difference between the above 2 cases is so large.
I suspect that I've misconfigured my testbed. Could someone share their
experience with me?
Thanks a lot!