[dpdk-dev] [PATCH v3 1/2] virtio/vdev: add embeded device emulation

Yuanhan Liu yuanhan.liu at linux.intel.com
Fri Apr 22 00:01:21 CEST 2016


On Thu, Apr 21, 2016 at 02:56:35AM +0000, Jianfeng Tan wrote:
> Background: Previously, we usually use a virtio device in QEMU/VM's
> context as below pic shows. Virtio nic is emulated in QEMU, and usually
> presented in VM as a PCI device.
> 
> |-----------|
> |     vm    |
> |-----------| (over PCI bus or MMIO or Channel I/O)
> |    QEMU   | -> device emulation
> |-----------|
>       |
>       | (vhost-user protocol or vhost-net ioctls)
>       |
> |-----------|
> |   vhost   |
> |-----------|
> 
> Then we come to the topic that how to present a virtio device in an app
> or container, which uses virtio device to do inter process communication
> with vhost backend process. To achieve that, first of all, we need way
> in DPDK to interract with vhost backend. And then emulate a virtual
> virtio device in DPDK (which is addressed in following patch).
> 
> |-----------|
> |  DPDK app |
> |-----------|
> |  DPDK lib | -> device emulation (addressed by following patch)
> |-----------|
>       |
>       | (vhost-user protocol or vhost-net ioctls), addressed by this patch
>       |
> |-----------|
> |   vhost   |
> |-----------|
> 
> How: we implement another instance of struct virtio_pci_ops to intercept
> the communications between VM and QEMU. Instead of rd/wr ioport or PCI
> configuration space, here we directly talk with backend through the vhost
> file.

Nope, that's wrong, and this is where it gets a bit subtle. Let me try
to explain.

Let's talk about the normal case (with QEMU) first. There, virtio PMD
is a driver, and the virtio device is emulated inside QEMU and exposed
over PCI. So, virtio PMD talks to the device with ioport rd/wr (or MMIO
for virtio 1.0).

Up to this point, you are right.

However, the vhost-user socket is for establishing a connection and
providing the host with enough information, so that the host can
directly manipulate the vring to dequeue/enqueue buffers.

So, what you were saying about "directly talk with backend (the virtual
virtio device you created) through vhost file" is not right. Instead,
in your case, the (virtual) virtio device and the PMD driver are in
the same process space; therefore, you could actually access the
device info simply by normal read/write.
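
To illustrate the point, here is a minimal sketch of driver-side accessors
for a device that lives in the same process. All names (demo_virtio_dev,
demo_get_status, and so on) are hypothetical and are not part of the patch
or of DPDK; the only assumption is that the device state is an ordinary
struct the driver can reach through a pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device-side state; in the QEMU case this lives in QEMU. */
struct demo_virtio_dev {
	uint8_t  status;
	uint64_t features;
	uint8_t  config[8];	/* device-specific config space, e.g. the MAC */
};

/* Driver-side accessors: plain loads and stores, no ioport, no socket. */
static uint8_t
demo_get_status(const struct demo_virtio_dev *dev)
{
	return dev->status;
}

static void
demo_set_status(struct demo_virtio_dev *dev, uint8_t status)
{
	dev->status = status;
}

/* Copy len bytes of device config starting at offset; -1 if out of range. */
static int
demo_read_dev_config(const struct demo_virtio_dev *dev, size_t offset,
		     void *dst, size_t len)
{
	if (offset > sizeof(dev->config) ||
	    len > sizeof(dev->config) - offset)
		return -1;
	memcpy(dst, dev->config + offset, len);
	return 0;
}
```

Nothing here touches the vhost-user socket; in this sketch the socket would
only come into play when the device side hands vring information to the
vhost backend.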

As you can see, it's a bit messy to mix all of them (virtio PMD driver,
virtio device emulation, and vhost-user frontend) in one single directory
(or even in one single file, as you did). Therefore, I'd suggest you
make a new dir, say "virtio-user" (a good name from Thomas), and put all
files related to virtio device emulation and vhost-user frontend there.

Furthermore, I'd suggest dividing the code into the following files:

- virtio-user/virtio.c

  All virtio device emulation goes here.

- virtio-user/vhost-user.c

  The vhost-user frontend implementation

- virtio-user/vhost-kernel.c

  vhost kernel handling, including setting up the tap device.

- And, __maybe__ another standalone file for handling the talk
  between the driver and the device. (See more for the comments
  about virtio_pci_ops below).


That would make it much clearer, IMO.

Besides that, I have a few minor nits below. You might want to fix
them all in the next version.

> +static int
> +vhost_user_write(int fd, void *buf, int len, int *fds, int fd_num)
> +{
> +	int r;
> +	struct msghdr msgh;
> +	struct iovec iov;
> +	size_t fd_size = fd_num * sizeof(int);
> +	char control[CMSG_SPACE(fd_size)];
> +	struct cmsghdr *cmsg;
> +
> +	bzero(&msgh, sizeof(msgh));
> +	bzero(control, sizeof(control));

bzero is marked as deprecated (see the man page), use memset instead.
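
For reference, a minimal sketch of the same kind of send path using memset.
This is not the patch's vhost_user_write(); the function name
demo_send_with_fds and the DEMO_MAX_FDS bound are assumptions made for the
sketch. It also uses a union for the control buffer, which is the way the
cmsg(3) man page suggests to get correct alignment.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#define DEMO_MAX_FDS 8	/* hypothetical upper bound for this sketch */

/* Send buf over a Unix socket, optionally attaching fds via SCM_RIGHTS. */
static int
demo_send_with_fds(int fd, void *buf, size_t len, const int *fds, int fd_num)
{
	struct msghdr msgh;
	struct iovec iov;
	union {
		char buf[CMSG_SPACE(DEMO_MAX_FDS * sizeof(int))];
		struct cmsghdr align;	/* forces suitable alignment */
	} control;
	struct cmsghdr *cmsg;
	size_t fd_size;

	if (fd_num < 0 || fd_num > DEMO_MAX_FDS)
		return -1;
	fd_size = (size_t)fd_num * sizeof(int);

	/* memset instead of the deprecated bzero */
	memset(&msgh, 0, sizeof(msgh));
	memset(&control, 0, sizeof(control));

	iov.iov_base = buf;
	iov.iov_len = len;
	msgh.msg_iov = &iov;
	msgh.msg_iovlen = 1;

	if (fd_num > 0) {
		msgh.msg_control = control.buf;
		msgh.msg_controllen = CMSG_SPACE(fd_size);
		cmsg = CMSG_FIRSTHDR(&msgh);
		cmsg->cmsg_len = CMSG_LEN(fd_size);
		cmsg->cmsg_level = SOL_SOCKET;
		cmsg->cmsg_type = SCM_RIGHTS;
		memcpy(CMSG_DATA(cmsg), fds, fd_size);
	}

	return (int)sendmsg(fd, &msgh, 0);
}
```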

> +
> +static struct vhost_user_msg m __rte_unused;

Hmm, if it's not used, why define it? If it's used, why decorate it
with __rte_unused?

> +
> +static void
> +prepare_vhost_memory_user(struct vhost_user_msg *msg, int fds[])
> +{
> +	int i, num;
> +	struct hugepage_file_info huges[VHOST_MEMORY_MAX_NREGIONS];
> +	struct vhost_memory_region *mr;
> +
> +	num = get_hugepage_file_info(huges, VHOST_MEMORY_MAX_NREGIONS);
> +	if (num < 0)
> +		rte_panic("Failed to prepare memory for vhost-user\n");

Do not use rte_panic unless it's really needed. I see no good reason
to use it in a driver. If something we need is out of order, just
print a log message telling the user that this driver will not work,
and return an error. This would keep other components working. You may
then argue that we have only one driver in container usage, but still,
it's not a good habit.
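
A minimal sketch of the shape this could take: the function returns an int
and the caller decides what to do on failure. The names below
(demo_get_region_count, demo_prepare_memory) are hypothetical stand-ins, and
fprintf is used only to keep the sketch self-contained; in DPDK code a
logging macro such as RTE_LOG would be the natural choice.

```c
#include <errno.h>
#include <stdio.h>

#define DEMO_MAX_REGIONS 8	/* hypothetical limit for this sketch */

/*
 * Hypothetical stand-in for get_hugepage_file_info(): returns the number
 * of regions found, or a negative value on failure.
 */
static int
demo_get_region_count(int simulate_failure)
{
	return simulate_failure ? -1 : 2;
}

/* Log and return an error instead of aborting the whole process. */
static int
demo_prepare_memory(int simulate_failure, int *nregions)
{
	int num = demo_get_region_count(simulate_failure);

	if (num < 0 || num > DEMO_MAX_REGIONS) {
		fprintf(stderr,
			"demo: failed to prepare memory for vhost-user\n");
		return -EINVAL;
	}

	*nregions = num;
	return 0;
}
```

The caller can then fail the device init cleanly, and other drivers in the
same process keep running.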

> +static void
> +vdev_reset(struct virtio_hw *hw __rte_unused)
> +{
> +	/* do nothing according to qemu vhost user spec */

That's not the right way to cite the spec; it barely tells us anything
useful. You should quote the relevant content here.

> +
> +static const struct virtio_pci_ops vdev_ops = {
> +	.read_dev_cfg	= vdev_read_dev_config,
> +	.write_dev_cfg	= vdev_write_dev_config,
> +	.reset		= vdev_reset,
> +	.get_status	= vdev_get_status,
> +	.set_status	= vdev_set_status,
> +	.get_features	= vdev_get_features,
> +	.set_features	= vdev_set_features,
> +	.get_isr	= vdev_get_isr,
> +	.set_config_irq	= vdev_set_config_irq,
> +	.get_queue_num	= vdev_get_queue_num,
> +	.setup_queue	= vdev_setup_queue,
> +	.del_queue	= vdev_del_queue,
> +	.notify_queue	= vdev_notify_queue,
> +};

As stated above, this actually does NOT belong to the virtual virtio
device emulation. It should be part of the PMD driver code.
You should separate them.

> diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
> index b69785e..68097e6 100644
> --- a/drivers/net/virtio/virtio_pci.h
> +++ b/drivers/net/virtio/virtio_pci.h
> @@ -260,6 +260,20 @@ struct virtio_hw {
>  	struct virtio_pci_common_cfg *common_cfg;
>  	struct virtio_net_config *dev_cfg;
>  	const struct virtio_pci_ops *vtpci_ops;
> +#ifdef RTE_VIRTIO_VDEV
> +#define VHOST_KERNEL	0
> +#define VHOST_USER	1
> +	int		type; /* type of backend */
> +	uint32_t	queue_num;
> +	char		*path;
> +	int		mac_specified;
> +	int		vhostfd;
> +	int		backfd; /* tap device used in vhost-net */
> +	int		callfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
> +	int		kickfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
> +	uint8_t		status;
> +	struct rte_eth_dev_data *data;
> +#endif

And then put all of them into the virtio "device" context, in
virtio-user/.
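
One possible shape for that split, as a rough sketch only. The struct and
field names below (demo_virtio_user_dev, demo_virtio_hw, the opaque dev
pointer) are assumptions for illustration and are not taken from the patch
or from DPDK; the fields themselves mirror the ones in the hunk above.

```c
#include <stdint.h>

#define DEMO_MAX_VIRTQUEUES 8	/* placeholder for VIRTIO_MAX_VIRTQUEUES */

enum demo_backend_type { DEMO_VHOST_KERNEL, DEMO_VHOST_USER };

/* Device-side context: would live under virtio-user/ in this layout. */
struct demo_virtio_user_dev {
	enum demo_backend_type type;	/* type of backend */
	uint32_t queue_num;
	char	*path;
	int	mac_specified;
	int	vhostfd;
	int	backfd;			/* tap device used in vhost-net */
	int	callfds[DEMO_MAX_VIRTQUEUES * 2 + 1];
	int	kickfds[DEMO_MAX_VIRTQUEUES * 2 + 1];
	uint8_t	status;
};

/* Driver-side context: keeps only an opaque handle to the device. */
struct demo_virtio_hw {
	/* ... existing driver fields would stay here ... */
	void *virtio_user_dev;
};

/* Hypothetical helper the driver-to-device glue code could use. */
static inline struct demo_virtio_user_dev *
demo_get_dev(struct demo_virtio_hw *hw)
{
	return hw->virtio_user_dev;
}
```

With this layout struct virtio_hw no longer needs the RTE_VIRTIO_VDEV
block, and the backend details stay private to the device emulation.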

	--yliu

