[dpdk-dev] [RFC PATCH 5/5] virtio: Extend virtio-net PMD to support container environment

Xie, Huawei huawei.xie at intel.com
Wed Jan 27 10:39:04 CET 2016

On 1/26/2016 10:58 AM, Tetsuya Mukawa wrote:
> On 2016/01/25 19:15, Xie, Huawei wrote:
>> On 1/22/2016 6:38 PM, Tetsuya Mukawa wrote:
>>> On 2016/01/22 17:14, Xie, Huawei wrote:
>>>> On 1/21/2016 7:09 PM, Tetsuya Mukawa wrote:
>>>>> virtio: Extend virtio-net PMD to support container environment
>>>>> The patch adds a new virtio-net PMD configuration that allows the PMD to
>>>>> work on the host as if the PMD were in a VM.
>>>>> Here is the new configuration for the virtio-net PMD.
>>>>> To use this mode, EAL needs physically contiguous memory. To allocate
>>>>> such memory, add the "--shm" option to the application command line.
>>>>> To prepare a virtio-net device on the host, users need to invoke a QEMU
>>>>> process in the special qtest mode. This mode is mainly used for testing
>>>>> QEMU devices from an external process. In this mode, no guest runs.
>>>>> Here is the QEMU command line.
>>>>>  $ qemu-system-x86_64 \
>>>>>              -machine pc-i440fx-1.4,accel=qtest \
>>>>>              -display none -qtest-log /dev/null \
>>>>>              -qtest unix:/tmp/socket,server \
>>>>>              -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \
>>>>>              -device virtio-net-pci,netdev=net0,mq=on \
>>>>>              -chardev socket,id=chr1,path=/tmp/ivshmem,server \
>>>>>              -device ivshmem,size=1G,chardev=chr1,vectors=1
>>>>>  * One QEMU process is needed per port.
>>>> Does qtest support hot-plugging virtio-net PCI devices, so that we could
>>>> run one QEMU process on the host, which provisions the virtio-net virtual
>>>> devices for the container?
>>> Theoretically, we can use hot plug in some cases.
>>> But I see 3 concerns here.
>>> 1. Security.
>>> If we share one QEMU process between multiple DPDK applications, this QEMU
>>> process will hold the fds of all the applications in different containers.
>>> In some cases, this will be a security concern.
>>> So, I guess we need to support the current 1:1 configuration at least.
>>> 2. Shared memory.
>>> Currently, QEMU and the DPDK application map the shared memory at the same
>>> virtual address.
>>> So if multiple DPDK applications connect to one QEMU process, each DPDK
>>> application must use a different address for its shared memory. I guess
>>> this will be a big limitation.
>>> 3. PCI bridge.
>>> So far, QEMU has one PCI bridge, so we can connect about 10 PCI devices
>>> to QEMU.
>>> (I forget the exact number, but it's about 10, because some slots are
>>> reserved by QEMU.)
>>> A DPDK application needs both a virtio-net and an ivshmem device, so I
>>> guess about 5 DPDK applications can connect to one QEMU process, so far.
>>> Adding more PCI bridges would solve this.
>>> But we would need a lot of implementation work to support cascaded PCI
>>> bridges and PCI devices.
>>> (We would also need to solve the 2nd concern above.)
>>> Anyway, if we use the virtio-net PMD and the vhost-user PMD, the QEMU
>>> process will not do anything after initialization.
>>> (QEMU will try to read the qtest socket, then block because there is
>>> no message after initialization.)
>>> So I guess we can ignore the overhead of these QEMU processes.
>>> If someone cannot ignore it, I guess this is one of the cases where it
>>> would be nice to use your lightweight container implementation.
>> Thanks for the explanation. Also, in your opinion, where is the best
>> place to run the QEMU instance? If we run QEMU instances on the host, for
>> vhost-kernel support, we could get rid of the root privilege issue.
> Do you mean the following?
> If we deploy the QEMU instance on the host, we can start a container
> without root privilege.
> (But on the host, the QEMU instance still needs the privilege to access
> vhost-kernel.)

There is no issue with running the QEMU instance with root privilege on the
host, but I think granting the container root privilege is not acceptable.

> If so, I agree that deploying the QEMU instance on the host or in another
> privileged container will be nice.
> In the case of vhost-user, deploying it on the host or in a non-privileged
> container will be good.
>> Another question: do you plan to support multiple virtio devices in a
>> container? Currently I find the code assumes only one virtio-net device
>> in QEMU, right?
> Yes, so far, 1 port needs 1 QEMU instance.
> So if you need multiple virtio devices, you need to invoke multiple QEMU
> instances.
> Do you want to deploy 1 QEMU instance for each DPDK application, even if
> the application has multiple virtio-net ports?
> So far, I am not sure whether we need it, because this type of DPDK
> application will need only one port in most cases.
> But if you need this, yes, I can implement it using the QEMU PCI hotplug
> feature.
> (But we can probably attach only about 10 ports. This will be a limitation.)

I am OK with supporting one virtio device for the first version.
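
To illustrate the one-QEMU-per-port arrangement discussed above, a launcher
might look roughly like the sketch below. It is only a sketch under
assumptions: the function name qemu_cmd_for_port and the per-port naming
scheme (/tmp/socket0, /tmp/ivshmem0, net0, chr0, and so on) are made up for
this example and are not defined by the patch. The QEMU options themselves
are copied from the command line quoted earlier in this thread. The script
only prints the commands, so they can be inspected before anything is started.

```shell
#!/bin/sh
# Print the QEMU command line for one virtio-net port.
# Assumption: socket paths and device ids are made unique by appending the
# port index. This naming scheme is hypothetical, not part of the patch.
qemu_cmd_for_port() {
    port="$1"
    printf '%s ' \
        qemu-system-x86_64 \
        -machine pc-i440fx-1.4,accel=qtest \
        -display none -qtest-log /dev/null \
        -qtest "unix:/tmp/socket${port},server" \
        -netdev "type=tap,script=/etc/qemu-ifup,id=net${port},queues=1" \
        -device "virtio-net-pci,netdev=net${port},mq=on" \
        -chardev "socket,id=chr${port},path=/tmp/ivshmem${port},server" \
        -device "ivshmem,size=1G,chardev=chr${port},vectors=1"
    printf '\n'
}

# One QEMU process per port: print the command for each of two ports.
for p in 0 1; do
    qemu_cmd_for_port "$p"
done
```

Each printed command uses its own qtest and ivshmem sockets, so a DPDK
application attaches only to the QEMU instance that belongs to its port.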

>> Btw, I have read most of your qtest code. No obvious issues found so far,
>> but quite a few nits. You must have spent a lot of time on this.
>> It is great work!
> I appreciate your review!
> BTW, my container implementation needed a QEMU patch in the case of
> vhost-user.
> But the patch has been merged into upstream QEMU, so we don't have this
> limitation any more.

Great. It would be better to put the QEMU dependency information in the
commit message.
> Thanks,
> Tetsuya
