[dpdk-dev] [PATCH v3 1/2] virtio/vdev: add embedded device emulation

Tan, Jianfeng jianfeng.tan at intel.com
Fri Apr 22 12:12:21 CEST 2016


Hi Yuanhan,

On 4/22/2016 6:01 AM, Yuanhan Liu wrote:
> On Thu, Apr 21, 2016 at 02:56:35AM +0000, Jianfeng Tan wrote:
>> Background: Previously, we usually use a virtio device in QEMU/VM's
>> context, as the picture below shows. The virtio NIC is emulated in
>> QEMU and usually presented in the VM as a PCI device.
>>
>> |-----------|
>> |     vm    |
>> |-----------| (over PCI bus or MMIO or Channel I/O)
>> |    QEMU   | -> device emulation
>> |-----------|
>>        |
>>        | (vhost-user protocol or vhost-net ioctls)
>>        |
>> |-----------|
>> |   vhost   |
>> |-----------|
>>
>> Then we come to the topic of how to present a virtio device in an app
>> or container, which uses the virtio device for inter-process
>> communication with the vhost backend process. To achieve that, we
>> first need a way in DPDK to interact with the vhost backend, and then
>> emulate a virtual virtio device in DPDK (which is addressed in the
>> following patch).
>>
>> |-----------|
>> |  DPDK app |
>> |-----------|
>> |  DPDK lib | -> device emulation (addressed by following patch)
>> |-----------|
>>        |
>>        | (vhost-user protocol or vhost-net ioctls), addressed by this patch
>>        |
>> |-----------|
>> |   vhost   |
>> |-----------|
>>
>> How: we implement another instance of struct virtio_pci_ops to intercept
>> the communications between the VM and QEMU. Instead of reading/writing
>> the ioport or PCI configuration space, here we directly talk with the
>> backend through the vhost file.
> Nope, that's wrong, and here it becomes a bit subtle. I will try to
> make some explanation here.
>
> Let's talk about the normal case (with QEMU) first. There, the virtio
> PMD is a driver, and the virtio device is emulated inside QEMU and
> exposed over PCI. So, the virtio PMD talks to the device with ioport
> rd/wr (or MMIO for virtio 1.0).
>
> Till now, you are right.
>
> However, the vhost-user socket is for establishing a connection and
> providing the host with enough information, so that the host can
> directly manipulate the vring to dequeue/enqueue buffers.
>
> So, what you were saying about "directly talk with backend (the virtual
> virtio device you created) through vhost file" is not right. Instead,
> in your case, the (virtual) virtio device and the PMD driver are in
> the same process space; therefore, you could actually access the
> device info simply by normal read/write.

Here by _backend_, I mean vhost. I know you take backend as the device 
emulation. I asked Huawei, and he thinks backend = device emulation + 
vhost. So maybe I should use the term vhost for accuracy.

>
> As you can see, it's a bit messy to mix all of them (virtio PMD driver,
> virtio device emulation, and vhost-user frontend) in one single directory
> (or even in one single file, as you did). Therefore, I'd suggest you
> make a new dir, say "virtio-user" (a good name from Thomas), and put all
> files related to virtio device emulation and the vhost-user frontend there.
>
> Further more, I'd suggest to divide the code into following files:
>
> - virtio-user/virtio.c
>
>    All virtio device emulation goes here.
>
> - virtio-user/vhost-user.c
>
>    The vhost-user frontend implementation
>
> - virtio-user/vhost-kernel.c
>
>    vhost kernel handling, including setting up the tap device.
>
> - And, __maybe__ another standalone file for handling the talk
>    between the driver and the device. (See more for the comments
>    about virtio_pci_ops below).
>
>
> That would make it much clearer, IMO.

Got your point here, but to be honest, I'm a little concerned about 
splitting a 1k-line file into 5+ files. Previously I preferred to keep 
them together, but your suggestion is great for clean code. Could 
anyone else give some comments?

>
> Besides that, I came up with few minor nits below. You might want to
> fix them all in the next version.
>
>> +static int
>> +vhost_user_write(int fd, void *buf, int len, int *fds, int fd_num)
>> +{
>> +	int r;
>> +	struct msghdr msgh;
>> +	struct iovec iov;
>> +	size_t fd_size = fd_num * sizeof(int);
>> +	char control[CMSG_SPACE(fd_size)];
>> +	struct cmsghdr *cmsg;
>> +
>> +	bzero(&msgh, sizeof(msgh));
>> +	bzero(control, sizeof(control));
> bzero is marked as deprecated (see the man page), use memset instead.

I should apologize for this; you actually told me once before, but I 
failed to fix it in this version.
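The fix is mechanical: each `bzero(p, n)` becomes `memset(p, 0, n)`. A minimal sketch, where `init_msg_buffers` is a hypothetical helper used only to illustrate the replacement:

```c
#include <string.h>
#include <sys/socket.h>

/* Sketch of the bzero -> memset change in vhost_user_write()'s setup.
 * init_msg_buffers is a hypothetical helper, not a name from the patch. */
static void
init_msg_buffers(struct msghdr *msgh, char *control, size_t control_len)
{
	/* memset(p, 0, n) is the portable, non-deprecated form of bzero(p, n) */
	memset(msgh, 0, sizeof(*msgh));
	memset(control, 0, control_len);
}
```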

>
>> +
>> +static struct vhost_user_msg m __rte_unused;
> Hmm, if it's not used, why define it? If it's used, why decorate it
> with __rte_unused?

This variable makes it easy to calculate the length of some fields 
inside this struct. But it seems that removing it does not lead to any 
compile error, so I'll remove it.
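For reference, the member lengths the dummy variable was used for can be derived from the type alone. A sketch, with a simplified `vhost_user_msg` layout assumed for illustration (the real struct in the patch has more payload variants):

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the patch's vhost_user_msg, for illustration */
struct vhost_user_msg {
	uint32_t request;
	uint32_t flags;
	uint32_t size;		/* length of the payload that follows */
	union {
		uint64_t u64;
	} payload;
};

/* Lengths computed from the type alone -- no dummy object required */
#define VHOST_USER_HDR_SIZE	offsetof(struct vhost_user_msg, payload)
#define VHOST_USER_U64_SIZE	sizeof(((struct vhost_user_msg *)0)->payload.u64)
```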

>
>> +
>> +static void
>> +prepare_vhost_memory_user(struct vhost_user_msg *msg, int fds[])
>> +{
>> +	int i, num;
>> +	struct hugepage_file_info huges[VHOST_MEMORY_MAX_NREGIONS];
>> +	struct vhost_memory_region *mr;
>> +
>> +	num = get_hugepage_file_info(huges, VHOST_MEMORY_MAX_NREGIONS);
>> +	if (num < 0)
>> +		rte_panic("Failed to prepare memory for vhost-user\n");
> Do not use rte_panic, unless it's really needed. I see no good reason
> to use it in a driver. If something we need is out of order, just
> return, print some log, and tell the user that this driver will not
> work. That would keep other components working. You may then argue that
> we have only one driver in container usage, but still, it's not a
> good habit.

Just want to make it "fail early, fail loudly". But I agree to keep the 
same style as other drivers.
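The suggested style might look like the sketch below: log and return an error code, letting the caller decide what to do. This standalone version passes the region count in (in the patch it comes from `get_hugepage_file_info()`) and substitutes `fprintf` for the DPDK logger:

```c
#include <stdio.h>

/* Sketch only: "fail loudly" via a log line, but return an error
 * instead of rte_panic() so other components keep running. */
static int
prepare_vhost_memory_user(int num_regions)
{
	if (num_regions < 0) {
		fprintf(stderr,
			"vhost-user: failed to prepare memory regions\n");
		return -1;	/* caller decides how to handle the failure */
	}
	/* ... fill in vhost_memory_region entries here ... */
	return 0;
}
```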

>
>> +static void
>> +vdev_reset(struct virtio_hw *hw __rte_unused)
>> +{
>> +	/* do nothing according to qemu vhost user spec */
> That's not the right way to quote spec, it barely tells us anything
> useful. So, you should quote the content here.

OK, I'll add more info here.

>
>> +
>> +static const struct virtio_pci_ops vdev_ops = {
>> +	.read_dev_cfg	= vdev_read_dev_config,
>> +	.write_dev_cfg	= vdev_write_dev_config,
>> +	.reset		= vdev_reset,
>> +	.get_status	= vdev_get_status,
>> +	.set_status	= vdev_set_status,
>> +	.get_features	= vdev_get_features,
>> +	.set_features	= vdev_set_features,
>> +	.get_isr	= vdev_get_isr,
>> +	.set_config_irq	= vdev_set_config_irq,
>> +	.get_queue_num	= vdev_get_queue_num,
>> +	.setup_queue	= vdev_setup_queue,
>> +	.del_queue	= vdev_del_queue,
>> +	.notify_queue	= vdev_notify_queue,
>> +};
> As stated above, this actually does NOT belong to the virtual virtio
> device emulation. It should be part of the code of the PMD driver.
> You should separate them.



>
>> diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
>> index b69785e..68097e6 100644
>> --- a/drivers/net/virtio/virtio_pci.h
>> +++ b/drivers/net/virtio/virtio_pci.h
>> @@ -260,6 +260,20 @@ struct virtio_hw {
>>   	struct virtio_pci_common_cfg *common_cfg;
>>   	struct virtio_net_config *dev_cfg;
>>   	const struct virtio_pci_ops *vtpci_ops;
>> +#ifdef RTE_VIRTIO_VDEV
>> +#define VHOST_KERNEL	0
>> +#define VHOST_USER	1
>> +	int		type; /* type of backend */
>> +	uint32_t	queue_num;
>> +	char		*path;
>> +	int		mac_specified;
>> +	int		vhostfd;
>> +	int		backfd; /* tap device used in vhost-net */
>> +	int		callfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
>> +	int		kickfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
>> +	uint8_t		status;
>> +	struct rte_eth_dev_data *data;
>> +#endif
> And put all of them to the virtio "device" context, in the virtio-user/
> then.

OK, I'll define a new struct to store these fields. I consider there 
are two methods to refer to these data. One is to store a pointer to 
the instance of the new struct in struct virtio_hw; the other is to 
keep them totally enclosed inside virtio-user, such as by using a 
linked list. Which do you prefer?
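The first option might look like the sketch below. The field names follow the patch; `struct virtio_user_dev` is a hypothetical name for the new container, and `VIRTIO_MAX_VIRTQUEUES` is given a placeholder value so the sketch stands alone:

```c
#include <stdint.h>
#include <stddef.h>

#define RTE_VIRTIO_VDEV			/* assume the build option is on */
#define VIRTIO_MAX_VIRTQUEUES	8	/* placeholder value for illustration */

/* Hypothetical container for the fields the patch put in virtio_hw */
struct virtio_user_dev {
	int		type;		/* VHOST_KERNEL or VHOST_USER */
	uint32_t	queue_num;
	char		*path;
	int		mac_specified;
	int		vhostfd;
	int		backfd;		/* tap device used by vhost-net */
	int		callfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
	int		kickfds[VIRTIO_MAX_VIRTQUEUES * 2 + 1];
	uint8_t		status;
};

struct virtio_hw {
	/* ... existing fields stay as they are ... */
#ifdef RTE_VIRTIO_VDEV
	struct virtio_user_dev *vdev;	/* NULL when driving a real PCI device */
#endif
};
```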

Thanks,
Jianfeng

>
> 	--yliu


