[dpdk-dev] [PATCH v1 0/2] Virtio-net PMD Extension to work on host

Tan, Jianfeng jianfeng.tan at intel.com
Thu Dec 24 15:05:13 CET 2015


Hi Tetsuya,

After studying your patch for several days, I have the following questions:

1. Is physically contiguous memory really necessary?
This is too strong a requirement IMHO. IVSHMEM in its original form does not require it. So what do you think of
Huawei Xie's idea of using virtual addresses for address translation? (In addition, the virtual address of mem_table
could differ between the application and QTest, but this can be handled because the SET_MEM_TABLE message will be
intercepted by QTest.)
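
To make the virtual-address idea concrete, here is a minimal sketch of the lookup a backend could do if each
memory region carried the application's virtual address instead of a physical one. Region, its fields and
translate() are hypothetical names for illustration only; they are not taken from the patch or from the
vhost-user message layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Region:
    app_va: int   # start of the region in the DPDK application's address space
    size: int     # length of the region in bytes
    peer_va: int  # where the same memory is mapped in the peer (backend) process


def translate(regions: List[Region], addr: int, length: int = 1) -> Optional[int]:
    """Map an application virtual address to the peer's virtual address.

    Returns None when [addr, addr + length) is not fully inside one region.
    """
    for r in regions:
        if r.app_va <= addr and addr + length <= r.app_va + r.size:
            return r.peer_va + (addr - r.app_va)
    return None
```

With such a table, no physical address is needed at all; the only requirement is that both sides map the same
file, at whatever virtual address each one gets.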

2. Is root privilege OK in the container case?
Another reason we'd like to give up the physically contiguous requirement is that it needs root privilege to read the
/proc/self/pagemap file. Containers have already been widely criticized for weak security isolation, and enabling root
privilege makes this worse. On the other hand, it is not easy to remove root privilege either. If we use vhost-net as
the backend, the kernel will definitely require root privilege to create a tap device/raw socket. We tend to move such
work, which requires root, into the runtime preparation of a container. Do you agree?
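
For reference, a rough sketch of the pagemap lookup in question. It assumes the documented Linux layout of
64-bit pagemap entries (bits 0-54 hold the PFN, bit 63 is the "present" flag); the function names are mine, not
from the patch. Since Linux 4.2 the kernel reports the PFN as zero to readers without CAP_SYS_ADMIN, so the
sketch treats a zero PFN as "unavailable".

```python
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
PFN_MASK = (1 << 55) - 1   # bits 0-54: page frame number
PRESENT_BIT = 1 << 63      # bit 63: page is present in RAM


def decode_pagemap_entry(entry):
    """Split one 64-bit pagemap entry into (present, pfn)."""
    return bool(entry & PRESENT_BIT), entry & PFN_MASK


def virt_to_phys(vaddr, pagemap_path="/proc/self/pagemap"):
    """Return the physical address behind vaddr, or None if it is unavailable."""
    with open(pagemap_path, "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * 8)  # one 8-byte entry per virtual page
        raw = f.read(8)
    if len(raw) != 8:
        return None
    (entry,) = struct.unpack("=Q", raw)  # entries are in native byte order
    present, pfn = decode_pagemap_entry(entry)
    if not present or pfn == 0:
        # Not resident, or the PFN was hidden from an unprivileged reader.
        return None
    return pfn * PAGE_SIZE + vaddr % PAGE_SIZE
```

In an unprivileged container virt_to_phys() would be expected to return None for every address, which is exactly
the problem for a design that depends on physical addresses.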

3. Is one QTest process per virtio device too heavy?
Although we can foresee that each container will usually own only one virtio device, containers may be deployed at
high density: hundreds or even thousands of containers would require the same number of QTest processes. Since
you mentioned that port hotplug is supported, is it possible to use just one QTest process to emulate all virtio
devices?

As you know, we have another solution for this (which is under heavy internal review). But I think we have lots
of common problems to solve, right?

Thanks for your great work!

Thanks,
Jianfeng

> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Wednesday, December 16, 2015 4:37 PM
> To: dev at dpdk.org
> Cc: nakajima.yoshihiro at lab.ntt.co.jp; Tan, Jianfeng; Xie, Huawei;
> mst at redhat.com; marcandre.lureau at gmail.com; Tetsuya Mukawa
> Subject: [PATCH v1 0/2] Virtio-net PMD Extension to work on host
> 
> [Change log]
> 
> PATCH v1:
> (Just listing functionality changes and important bug fix)
> * Support virtio-net interrupt handling.
>   (It means virtio-net PMD on host and guest have same virtio-net features)
> * Fix memory allocation method to allocate contiguous memory correctly.
> * Port Hotplug is supported.
> * Rebase on DPDK-2.2.
> 
> 
> [Abstraction]
> 
> Normally, the virtio-net PMD only works in a VM, because there is no
> virtio-net device on the host.
> This RFC patch extends the virtio-net PMD so that it can work on the host as
> a virtual PMD.
> But we didn't implement a virtio-net device as part of the virtio-net PMD.
> To provide a virtio-net device for the PMD, start a QEMU process in the
> special QTest mode, then connect to it from the virtio-net PMD through a unix
> domain socket.
> 
> The virtio-net PMD on the host is fully compatible with the PMD on a guest.
> We can use the same functionality, and connect to anything a QEMU virtio-net
> device can.
> For example, the PMD can use the virtio-net multi-queue function. It can also
> connect to the vhost-net kernel module and a vhost-user backend application.
> As with the virtio-net PMD on QEMU, the memory of the application that uses
> the virtio-net PMD will be shared with the vhost backend application, but the
> vhost backend application's memory will not be shared.
> 
> The main targets of this PMD are containers such as docker, rkt and lxc.
> We can isolate the related processes (virtio-net PMD process, QEMU and
> vhost-user backend process) by container.
> But to communicate through a unix domain socket, a shared directory will be
> needed.
> 
> 
> [How to use]
> 
> So far, a QEMU patch is needed to connect to a vhost-user backend.
> See the patch below.
>  - http://patchwork.ozlabs.org/patch/552549/
> For usage, check its commit log.
> 
> 
> [Detailed Description]
> 
>  - virtio-net device implementation
> This host mode PMD uses the QEMU virtio-net device. To do that, the QEMU
> QTest functionality is used.
> QTest is a test framework for QEMU devices. It allows us to implement a
> device driver outside of QEMU.
> With QTest, we can implement the DPDK application and the virtio-net PMD as
> a standalone process on the host.
> When QEMU is invoked in QTest mode, no guest code runs.
> To know more about QTest, see below.
>  - http://wiki.qemu.org/Features/QTest
> 
>  - probing devices
> QTest provides a unix domain socket. Through this socket, the driver process
> can access the I/O ports and memory of the QEMU virtual machine.
> The PMD sends I/O port accesses to probe PCI devices.
> If virtio-net and ivshmem devices are found, the PMD initializes them.
> The I/O port accesses of the virtio-net PMD are also sent through the socket,
> so the virtio-net PMD can initialize the virtio-net device on QEMU correctly.
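
For readers unfamiliar with QTest, the probing step above can be sketched roughly as follows. This is not the
code in qtest.c of the patch; it only shows how a PCI vendor/device ID read might be expressed as QTest "outl" and
"inl" text commands on the standard 0xcf8/0xcfc configuration ports. The helper names are hypothetical, and
1af4:1000 / 1af4:1110 are the IDs I would expect for legacy virtio-net and ivshmem.

```python
PCI_CONFIG_ADDRESS = 0xCF8  # standard PCI configuration address port
PCI_CONFIG_DATA = 0xCFC     # standard PCI configuration data port

VIRTIO_VENDOR_ID = 0x1AF4
VIRTIO_NET_LEGACY_DEVICE_ID = 0x1000  # assumption: legacy virtio-net
IVSHMEM_DEVICE_ID = 0x1110            # assumption: ivshmem


def pci_config_address(bus, dev, func, reg):
    """Build the value written to 0xcf8 to select a config-space dword."""
    return 0x80000000 | (bus << 16) | (dev << 11) | (func << 8) | (reg & 0xFC)


def probe_commands(bus, dev, func):
    """Return the two QTest command lines that read the vendor/device ID dword."""
    addr = pci_config_address(bus, dev, func, 0)
    return [
        f"outl {PCI_CONFIG_ADDRESS:#x} {addr:#x}\n",
        f"inl {PCI_CONFIG_DATA:#x}\n",
    ]


def parse_ids(reply):
    """Parse an 'OK 0x...' reply to inl into (vendor_id, device_id), else None."""
    status, _, value = reply.strip().partition(" ")
    if status != "OK" or not value:
        return None
    dword = int(value, 16)
    return dword & 0xFFFF, dword >> 16
```

Each command line would be written to the QTest unix socket and the reply line read back before the next one is
sent. A vendor ID of 0xffff in the reply would mean that no device is present in that slot.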
> 
>  - ivshmem device to share memory
> To share the memory that the virtio-net PMD process uses, an ivshmem device
> is used.
> Because the ivshmem device can only handle one file descriptor, the shared
> memory must consist of a single file.
> To allocate such memory, EAL has a new option called "--contig-mem".
> If the option is specified, EAL will open a file and allocate memory from
> hugepages.
> While initializing the ivshmem device, we can set its BAR (Base Address
> Register).
> It specifies the address at which the QEMU vcpu can access this shared memory.
> We specify the host physical address of the shared memory as this address.
> It is very useful because we don't need to apply a patch to QEMU to calculate
> an address offset.
> (For example, if the virtio-net PMD process allocates memory from the shared
> memory and then writes its physical address to a virtio-net register, the QEMU
> virtio-net device can understand it without calculating an address offset.)
> 
> 
> [Known issues]
> 
>  - vhost-user
> So far, to use vhost-user, we need to apply a patch to QEMU.
> This is because QEMU does not send the memory information and file descriptor
> of the ivshmem device to the vhost-user backend.
> I have submitted the patch to QEMU.
> See "http://patchwork.ozlabs.org/patch/552549/".
> Also, we may have an issue in the DPDK vhost library with handling kickfd and
> callfd. A patch for this issue is needed. I have a workaround patch, but let
> me check it more.
> If someone wants to check the vhost-user behavior, I will describe it in more
> detail in a later email.
> 
> 
> 
> 
> Tetsuya Mukawa (2):
>   EAL: Add new EAL "--contig-mem" option
>   virtio: Extend virtio-net PMD to support container environment
> 
>  config/common_linuxapp                     |    1 +
>  drivers/net/virtio/Makefile                |    4 +
>  drivers/net/virtio/qtest.c                 | 1107 ++++++++++++++++++++++++++++
>  drivers/net/virtio/virtio_ethdev.c         |  341 ++++++++-
>  drivers/net/virtio/virtio_ethdev.h         |   12 +
>  drivers/net/virtio/virtio_pci.h            |   25 +
>  lib/librte_eal/common/eal_common_options.c |    7 +
>  lib/librte_eal/common/eal_internal_cfg.h   |    1 +
>  lib/librte_eal/common/eal_options.h        |    2 +
>  lib/librte_eal/linuxapp/eal/eal_memory.c   |   77 +-
>  10 files changed, 1543 insertions(+), 34 deletions(-)
>  create mode 100644 drivers/net/virtio/qtest.c
> 
> --
> 2.1.4


