[dpdk-users] VIRTIO for containers

王志克 wangzhike at jd.com
Wed Oct 25 11:58:29 CEST 2017


Hi Jianfeng,

Thanks for your reply. Some feedback in line.

BR,
Wang Zhike

From: Tan, Jianfeng [mailto:jianfeng.tan at intel.com]
Sent: Wednesday, October 25, 2017 3:34 PM
To: 王志克; avi.cohen at huawei.com; users at dpdk.org
Subject: RE: VIRTIO for containers

Hi Zhike,

You are welcome to submit a patch to fix the bug.

From the sender's view, there are still some tricky configurations to make it faster, like enabling LRO, so that TCP segmentation can be put off to the NIC when sending out; similarly for checksum offload.
From the receiver's view, you might want to increase the ring size to get better performance. We are also looking at how to make “vhost tx zero copy” efficient.
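
For example (just a sketch, assuming the virtio_user queue_size devarg and the same port naming as in your earlier mail; 1024 is not a tuned value), the ring size can be raised when the port is added:

ovs-vsctl add-port br0 virtiouser0 -- set Interface virtiouser0 type=dpdk options:dpdk-devargs=virtio_user0,path=/dev/vhost-net,queue_size=1024

If I remember correctly, the virtio_user default queue size is 256, so a larger ring gives the vhost thread more headroom before packets are dropped.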

[Wang Zhike] I once saw you mention that something like an mmap-based solution may be used. Is it still on your roadmap? I am not sure whether it is the same as the “vhost tx zero copy”.
Can I know the forecast date when the optimization will be done? Will some upstream Linux kernel modules be updated, or DPDK modules? I just want to know which modules will be touched.

1) Yes, we have done some initial tests internally, with testpmd as the vswitch instead of OVS-DPDK, and we were comparing with KNI for the exception path.
[Wang Zhike] Can you please kindly indicate how to configure the KNI mode? I would like to compare it as well. (One possible setup is sketched below, after point 2.)

2) We also see a similar asymmetric result. For the user->kernel path, it not only copies data from mbuf to skb, but may also go up into the TCP stack (you can check this using perf, as sketched below).
[Wang Zhike] Yes, indeed. For the user->kernel path, the TCP/IP-related work is done by the vhost thread, while for the kernel->user path, the TCP/IP-related work is done by the application (in my case netperf) in a syscall.
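
For the KNI comparison mentioned in 1), one possible exception-path setup uses the standard DPDK kni example application. The following is only a rough sketch with assumed build paths and core/port parameters, not necessarily the configuration used in the internal tests:

insmod <dpdk_build_dir>/kmod/rte_kni.ko
./examples/kni/build/kni -c 0x0f -n 4 -- -P -p 0x1 --config="(0,1,2)"
ip link                          # find the vEth* interface created by the app
ip link set <vEth_interface> up

And for the perf check mentioned in 2), the vhost kernel thread can be profiled roughly as follows (the thread naming and the exact options are assumptions):

ps -eLo pid,tid,comm | grep vhost-     # locate the vhost-<owner pid> kernel thread
perf top -t <tid>                      # live view of where it spends cycles
# or: perf record -g -t <tid> -- sleep 10 && perf report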

Thanks,
Jianfeng

From: 王志克 [mailto:wangzhike at jd.com]
Sent: Tuesday, October 24, 2017 5:46 PM
To: Tan, Jianfeng <jianfeng.tan at intel.com>; avi.cohen at huawei.com; users at dpdk.org
Subject: RE: VIRTIO for containers

Hi  Jianfeng,

It has been proven that there is a software bug in DPDK 17.05.2 which leads to this issue. I will submit a patch later this week.

Now I can successfully configure the container. I did some simple tests with netperf (one server and one client on different hosts); here are some results comparing the container with kernel OVS (the test commands are sketched below):
1. From the netperf sender's view, the tx throughput increases by about 100%. On the sender, there is almost no packet loss. That means the kernel vhost thread transfers data from the kernel to userspace OVS+DPDK in a timely manner.
2. From the netperf receiver's view, the rx throughput increases by about 50%. On the receiver, packet loss happens on the virtiouser0 rx side. There is no packet loss on the tap port. That means the kernel vhost thread is slow to transfer data from userspace OVS+DPDK to the kernel.
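
Roughly, the test was of this shape (an illustrative sketch only; the exact netperf options and durations are omitted):

# on the remote host
netserver
# on the host/container under test
netperf -H <remote_host_ip> -t TCP_STREAM    # container as sender (tx view)
netperf -H <remote_host_ip> -t TCP_MAERTS    # container as receiver (rx view)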

May I ask some questions? Thanks.

1) Do you have any benchmark data for performance?
2) Is there any explanation of why the kernel vhost thread speed differs between the two directions (from kernel to user, and vice versa)?

Feedback is welcome if anyone has such data.

Br,
Wang Zhike
From: 王志克
Sent: Tuesday, October 24, 2017 11:16 AM
To: 'Tan, Jianfeng'; avi.cohen at huawei.com; users at dpdk.org
Subject: RE: VIRTIO for containers

Thanks Jianfeng.

I finally realized that I was using DPDK 16.11, which does NOT support this function.

Then I used the latest DPDK (17.05.2) and OVS (2.8.1), but it still does not work.

ovs-vsctl add-port br0 virtiouser0 -- set Interface virtiouser0 type=dpdk options:dpdk-devargs=virtio_user0,path=/dev/vhost-net
ovs-vsctl: Error detected while setting up 'virtiouser0': could not add network device virtiouser0 to ofproto (No such device).  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch".

lsmod |grep vhost
vhost_net              18152  0
vhost                  33338  1 vhost_net
macvtap                22363  1 vhost_net
tun                    27141  6 vhost_net

2017-10-23T19:00:42.743Z|00163|netdev_dpdk|INFO|Device 'virtio_user0,path=/dev/vhost-net' attached to DPDK
2017-10-23T19:00:42.743Z|00164|netdev_dpdk|WARN|Rx checksum offload is not supported on port 2
2017-10-23T19:00:42.743Z|00165|netdev_dpdk|ERR|Interface virtiouser0 MTU (1500) setup error: Invalid argument
2017-10-23T19:00:42.743Z|00166|netdev_dpdk|ERR|Interface virtiouser0(rxq:1 txq:1) configure error: Invalid argument
2017-10-23T19:00:42.743Z|00167|dpif_netdev|ERR|Failed to set interface virtiouser0 new configuration
2017-10-23T19:00:42.743Z|00168|bridge|WARN|could not add network device virtiouser0 to ofproto (No such device)
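
In case it helps anyone hitting the same error, more detail can be pulled out of the netdev_dpdk module by raising its log level before re-adding the port, e.g.:

ovs-appctl vlog/set netdev_dpdk:file:dbg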

Which versions (DPDK and OVS) are you using? Thanks.

Br,
Wang Zhike

From: Tan, Jianfeng [mailto:jianfeng.tan at intel.com]
Sent: Saturday, October 21, 2017 12:55 AM
To: 王志克; avi.cohen at huawei.com; dpdk-ovs at lists.01.org; users at dpdk.org
Subject: Re: VIRTIO for containers


Hi Zhike,

On 10/20/2017 5:24 PM, 王志克 wrote:
I read this thread and tried to do the same thing (connect legacy containers to OVS+DPDK). However, I met the following error when creating the OVS port.

ovs-vsctl add-port br0 virtiouser0 -- set Interface virtiouser0 type=dpdk options:dpdk-devargs=net_virtio_user0,path=/dev/vhost-net
ovs-vsctl: Error detected while setting up 'virtiouser0': Error attaching device 'net_virtio_user0,path=/dev/vhost-net' to DPDK.  See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch".

It should not try to connect() to /dev/vhost-net if that file exists; instead it will use ioctls on it. So please check whether you have the vhost and vhost-net kernel modules loaded.
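
(i.e., something like the following before starting ovs-vswitchd:)

modprobe vhost-net            # pulls in the vhost module as a dependency
lsmod | grep vhost            # both vhost and vhost_net should be listed
ls -l /dev/vhost-net          # the character device should exist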

Thanks,
Jianfeng


Debugging shows that it calls virtio_user_dev_init() -> vhost_user_setup(), and fails in connect() with target /dev/vhost-net. The errno is ECONNREFUSED.
The command below indeed shows that no one is listening:
lsof | grep vhost-net

In kernel OVS, I guess qemu-kvm would listen to /dev/vhost-net. But for OVS-DPDK and a container, what extra work needs to be done? Appreciate any help.
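
As a quick check of what virtio_user will actually see at that path (this is my reading of the chardev-vs-socket distinction Jianfeng describes above, not verified against this exact DPDK version):

stat -c '%F' /dev/vhost-net    # "character special file" -> vhost-kernel ioctl path; "socket" -> vhost-user connect() path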

Br,
Wang Zhike




