[dpdk-dev] [PATCH v8 0/6] virtio support for container

Yuanhan Liu yuanhan.liu at linux.intel.com
Tue Jun 14 10:34:19 CEST 2016


Series Acked-by: Yuanhan Liu <yuanhan.liu at linux.intel.com>

	--yliu

On Mon, Jun 13, 2016 at 06:38:57AM +0000, Jianfeng Tan wrote:
> v8:
>  - Change to use max_queue_pairs instead of queue_pairs to initialize
>    and deinitialize queues.
>  - Remove vhost-kernel support.
> 
> v7:
>  - CONFIG_RTE_VIRTIO_VDEV -> CONFIG_RTE_VIRTIO_USER; and correspondingly,
>    RTE_VIRTIO_VDEV -> RTE_VIRTIO_USER.
>  - uint64_t -> uintptr_t, so that it can be compiled on 32-bit platform.
>  - Rebase on latest dpdk-next-virtio branch.
>  - Abandon abstracting related code into vring_hdr_desc_init(), instead,
>    just move it behind setup_queue().
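
The uint64_t -> uintptr_t change above can be illustrated with a minimal sketch. It assumes a 64-bit address field that must hold a process virtual address; `struct desc_sketch` and both helpers are hypothetical names for illustration, not code from the patch. Casting through uintptr_t keeps the conversion well-defined where pointers are 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical descriptor with a 64-bit address field (illustration only). */
struct desc_sketch {
	uint64_t addr;
};

/* Store a virtual address: go through uintptr_t first, so the cast from
 * pointer to integer has matching width on both 32-bit and 64-bit builds. */
static void
desc_set_addr(struct desc_sketch *d, void *buf)
{
	d->addr = (uint64_t)(uintptr_t)buf;
}

/* Recover the pointer: narrow to uintptr_t before converting back. */
static void *
desc_get_addr(const struct desc_sketch *d)
{
	return (void *)(uintptr_t)d->addr;
}
```

A direct `(uint64_t)buf` cast typically triggers a "cast from pointer to integer of different size" warning on 32-bit targets, which is what the intermediate uintptr_t avoids.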
> 
> v6:
>  - Move driver related code from driver/net/virtio/virtio-user/ to
>    driver/net/virtio/ directory, inside virtio_user_ethdev.c.
>  - Rename vdev to virtio_user in comments and code.
>  - Merge code, which lies in virtio_user_pci.c, into virtio_user_ethdev.c.
>  - Add some comments on virtio-user special handling in virtio_dev_ethdev.c.
>  - Merge document update into the 7nd commit where virtio-user is added.
>  - Add usage with vhost-switch in vhost.rst.
> 
> v5:
>  - Rename struct virtio_user_hw to struct virtio_user_dev.
>  - Rename "vdev_private" to "virtio_user_dev".
>  - Move special handling into virtio_ethdev.c from queue_setup().
>  - Add vring in virtio_user_dev (remove rte_eth_dev_data), so that
>    device does not depend on driver's data structure (rte_eth_dev_data).
>  - Remove update on doc/guides/nics/overview.rst, because virtio-user has
>    exactly the same feature set as virtio.
>  - Change "unsigned long int" to "uint64_t", "unsigned" to "uint32_t".
>  - Remove unnecessary cast in vdev_read_dev_config().
>  - Add functions in virtio_user_dev.c with prefix of "virtio_user_".
>  - Rebase on virtio-next-virtio.
> 
> v4:
>  - Avoid using dev_type, instead use (eth_dev->pci_device is NULL) to
>    judge if it's virtual device or physical device.
>  - Change the added device name to virtio-user.
>  - Split into vhost_user.c, vhost_kernel.c, vhost.c, virtio_user_pci.c,
>    virtio_user_dev.c.
>  - Move virtio-user specific data from struct virtio_hw into struct
>    virtio_user_hw.
>  - Add support to send reset_owner message.
>  - Change del_queue implementation. (This needs more checking.)
>  - Remove rte_panic() and replace it with logging.
>  - Add reset_owner into virtio_pci_ops.reset.
>  - Merge parameters "rx" and "tx" into "queues" to eliminate confusion.
>  - Move get_features to after set_owner.
>  - Redefine path in virtio_user_hw from char * to char [].
> 
> v3:
>  - Remove --single-file option; make no change to EAL memory.
>  - Remove the added API rte_eal_get_backfile_info(), instead we check all
>    opened files with HUGEFILE_FMT to find hugepage files owned by DPDK.
>  - Accordingly, add more restrictions at "Known issue" section.
>  - Rename parameter from queue_num to queue_size to avoid confusion.
>  - Rename vhost_embedded.c to rte_eth_virtio_vdev.c.
>  - Move code related to the newly added vdev to rte_eth_virtio_vdev.c, to
>    reuse eth_virtio_dev_init(), remove its static declaration.
>  - Implement dev_uninit() for rte_eth_dev_detach().
>  - WARN -> ERR, in vhost_embedded.c
>  - Add more commit message text to clarify the model.
> 
> v2:
>  - Rebase on the patchset of virtio 1.0 support.
>  - Fix cannot create non-hugepage memory.
>  - Fix wrong size of memory region when "single-file" is used.
>  - Fix setting of offset in virtqueue to use virtual address.
>  - Fix setting TUNSETVNETHDRSZ in vhost-user's branch.
>  - Add mac option to specify the mac address of this virtual device.
>  - Update doc.
> 
> This patchset provides a high performance networking interface (virtio)
> for container-based DPDK applications. How to start DPDK apps in containers
> with exclusive ownership of NIC devices is beyond the scope of this patchset.
> The basic idea here is to present a new virtual device (named virtio-user),
> which can be discovered and initialized by DPDK. To minimize the change,
> we reuse already-existing virtio PMD code (driver/net/virtio/).
> 
> Background: Previously, a virtio device was usually used in the context of
> QEMU/VM, as the figure below shows. The virtio NIC is emulated in QEMU, and
> usually presented to the VM as a PCI device.
> 
>   ------------------
>   |  virtio driver |  ----->  VM
>   ------------------
>         |
>         | ----------> (over PCI bus or MMIO or Channel I/O)
>         |
>   ------------------
>   | device emulate |
>   |                |  ----->  QEMU
>   | vhost adapter  |
>   ------------------
>         |
>         | ----------> (vhost-user protocol or vhost-net ioctls)
>         |
>   ------------------
>   | vhost backend  |
>   ------------------
>  
> Compared to the QEMU/VM case, virtio support for containers requires
> embedding a device framework inside the virtio PMD. So this converged driver
> actually plays three roles:
>   - virtio driver to drive this new kind of virtual device;
>   - device emulation to present this virtual device and respond to the
>     virtio driver, which was originally done by QEMU;
>   - and the role of communicating with the vhost backend, which was also
>     originally done by QEMU.
> 
> The code layout and functionality of each module:
>  
>   ----------------------
>   | ------------------ |
>   | | virtio driver  | |----> (virtio_user_ethdev.c)
>   | ------------------ |
>   |         |          |
>   | ------------------ | ------>  virtio-user PMD
>   | | device emulate |-|----> (virtio_user_dev.c)
>   | |                | |
>   | | vhost adapter  |-|----> (vhost_user.c, vhost_kernel.c, vhost.c)
>   | ------------------ |
>   ----------------------
>          |
>          | -------------- --> (vhost-user protocol)
>          |
>    ------------------
>    | vhost backend  |
>    ------------------
> 
> How to share memory? In the VM case, QEMU always shares the whole physical
> memory layout with the backend. But it's not feasible for a container, as a
> process, to share all of its virtual memory regions with the backend. So only
> specified virtual memory regions (of shared type) are sent to the backend.
> It's a limitation that only addresses in these areas can be used to transmit
> or receive packets.
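
The selection of shared regions described above can be sketched as follows. This is a minimal illustration, not the patch's implementation: it assumes candidate regions are found by reading lines in the Linux /proc/self/maps format and keeping only shared mappings whose file name looks like a hugepage file (a "map_" marker followed by digits, loosely modeled on the HUGEFILE_FMT pattern mentioned in this cover letter). `struct mem_region` and `parse_shared_hugefile()` are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one region that would be sent to the backend. */
struct mem_region {
	uint64_t start;   /* virtual start address of the mapping */
	uint64_t size;    /* length of the mapping in bytes */
	char path[256];   /* backing file of the mapping */
};

/*
 * Parse one line in /proc/self/maps format:
 *   start-end perms offset dev inode path
 * Return 1 and fill *r if the line is a shared mapping of a file whose
 * base name ends in "map_<digits>"; return 0 otherwise.
 */
static int
parse_shared_hugefile(const char *line, struct mem_region *r)
{
	uint64_t start, end;
	char perms[5] = "";
	char path[256] = "";
	const char *base, *p;

	assert(line != NULL && r != NULL);

	/* Anonymous mappings have no path; sscanf then assigns 3 fields. */
	if (sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s %*s %*s %*s %255s",
		   &start, &end, perms, path) < 3)
		return 0;
	if (path[0] == '\0' || perms[3] != 's')
		return 0;               /* not file-backed or not shared */

	base = strrchr(path, '/');
	base = (base != NULL) ? base + 1 : path;
	p = strstr(base, "map_");
	if (p == NULL || !isdigit((unsigned char)p[4]))
		return 0;
	for (p += 4; *p != '\0'; p++)
		if (!isdigit((unsigned char)*p))
			return 0;       /* trailing non-digit: different file */

	r->start = start;
	r->size = end - start;
	snprintf(r->path, sizeof(r->path), "%s", path);
	return 1;
}
```

A caller would feed each line of /proc/self/maps to this helper and collect the accepted regions. Private mappings are skipped because the backend could not see writes made to them.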
> 
> Known issues:
>  - Control queue and multi-queue are not supported yet.
>  - Cannot work with --huge-unlink.
>  - Cannot work with no-huge.
>  - Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8)
>    hugepages.
>  - Root privilege is a must (mainly because of sorting hugepages according
>    to physical address).
>  - Applications should not use file name like HUGEFILE_FMT ("%smap_%d").
>  - Cannot work with vhost kernel.
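
The VHOST_MEMORY_MAX_NREGIONS(8) restriction listed above can be pictured with a small sketch. The limit name and value come from this cover letter; the table layout and `region_table_add()` are hypothetical simplifications, assuming one region per hugepage file.

```c
#include <stdint.h>

#define VHOST_MEMORY_MAX_NREGIONS 8   /* limit quoted in the known issues */

/* Simplified, hypothetical region entry (illustration only). */
struct region {
	uint64_t addr;
	uint64_t size;
};

/* Fixed-size table: there is simply no slot for a ninth region. */
struct region_table {
	uint32_t nregions;
	struct region regions[VHOST_MEMORY_MAX_NREGIONS];
};

/* Append one region; return 0 on success, -1 when the table is full. */
static int
region_table_add(struct region_table *t, uint64_t addr, uint64_t size)
{
	if (t->nregions >= VHOST_MEMORY_MAX_NREGIONS)
		return -1;
	t->regions[t->nregions].addr = addr;
	t->regions[t->nregions].size = size;
	t->nregions++;
	return 0;
}
```

Under this assumption, using fewer and larger hugepages (for example 1 GB pages instead of 2 MB pages) is one way to stay within the limit.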
> 
> How to use?
> 
> a. Apply this patchset.
> 
> b. To compile container apps:
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> 
> c. To build a docker image using Dockerfile below.
> $: cat ./Dockerfile
> FROM ubuntu:latest
> WORKDIR /usr/src/dpdk
> COPY . /usr/src/dpdk
> ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
> $: docker build -t dpdk-app-l2fwd .
> 
> d. Used with vhost-user
> $: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
> 	--socket-mem 1024,1024 -- -p 0x1 --stats 1
> $: docker run -i -t -v <path_to_vhost_unix_socket>:/var/run/usvhost \
> 	-v /dev/hugepages:/dev/hugepages \
> 	dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
> 	--vdev=virtio-user0,path=/var/run/usvhost -- -p 0x1
> 
> By the way, it's not necessary to run in a container.
> 
> Signed-off-by: Huawei Xie <huawei.xie at intel.com>
> Signed-off-by: Jianfeng Tan <jianfeng.tan at intel.com>
> 
> 
> Jianfeng Tan (6):
>   virtio: hide phys addr check inside pci ops
>   virtio: enable use virtual address to fill desc
>   virtio-user: add vhost user adapter layer
>   virtio-user: add device emulation layer APIs
>   virtio-user: add new virtual pci driver for virtio
>   virtio-user: add a new vdev named virtio-user
> 
>  config/common_linuxapp                           |   1 +
>  doc/guides/rel_notes/release_16_07.rst           |  12 +
>  doc/guides/sample_app_ug/vhost.rst               |  17 +
>  drivers/net/virtio/Makefile                      |   6 +
>  drivers/net/virtio/virtio_ethdev.c               |  77 ++--
>  drivers/net/virtio/virtio_ethdev.h               |   2 +
>  drivers/net/virtio/virtio_pci.c                  |  30 +-
>  drivers/net/virtio/virtio_pci.h                  |   3 +-
>  drivers/net/virtio/virtio_rxtx.c                 |   5 +-
>  drivers/net/virtio/virtio_rxtx_simple.c          |  13 +-
>  drivers/net/virtio/virtio_user/vhost.h           | 141 ++++++++
>  drivers/net/virtio/virtio_user/vhost_user.c      | 404 +++++++++++++++++++++
>  drivers/net/virtio/virtio_user/virtio_user_dev.c | 227 ++++++++++++
>  drivers/net/virtio/virtio_user/virtio_user_dev.h |  62 ++++
>  drivers/net/virtio/virtio_user_ethdev.c          | 427 +++++++++++++++++++++++
>  drivers/net/virtio/virtqueue.h                   |  10 +
>  16 files changed, 1395 insertions(+), 42 deletions(-)
>  create mode 100644 drivers/net/virtio/virtio_user/vhost.h
>  create mode 100644 drivers/net/virtio/virtio_user/vhost_user.c
>  create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.c
>  create mode 100644 drivers/net/virtio/virtio_user/virtio_user_dev.h
>  create mode 100644 drivers/net/virtio/virtio_user_ethdev.c
> 
> -- 
> 2.1.4

