[dpdk-dev] [RFC PATCH v2] vhost: Add VHOST PMD

Loftus, Ciara ciara.loftus at intel.com
Tue Oct 20 16:13:08 CEST 2015


> 
> On 2015/09/24 2:47, Loftus, Ciara wrote:
> >> The patch introduces a new PMD. This PMD is implemented as thin
> wrapper
> >> of librte_vhost. It means librte_vhost is also needed to compile the PMD.
> >> The PMD can have 'iface' parameter like below to specify a path to
> connect
> >> to a virtio-net device.
> >>
> >> $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0' -- -i
> >>
> >> To connect above testpmd, here is qemu command example.
> >>
> >> $ qemu-system-x86_64 \
> >>         <snip>
> >>         -chardev socket,id=chr0,path=/tmp/sock0 \
> >>         -netdev vhost-user,id=net0,chardev=chr0,vhostforce \
> >>         -device virtio-net-pci,netdev=net0
> >>
> >> Signed-off-by: Tetsuya Mukawa <mukawa at igel.co.jp>
> >> ---
> >>  config/common_linuxapp                      |   6 +
> >>  drivers/net/Makefile                        |   4 +
> >>  drivers/net/vhost/Makefile                  |  61 +++
> >>  drivers/net/vhost/rte_eth_vhost.c           | 640
> >> ++++++++++++++++++++++++++++
> >>  drivers/net/vhost/rte_pmd_vhost_version.map |   4 +
> >>  mk/rte.app.mk                               |   8 +-
> >>  6 files changed, 722 insertions(+), 1 deletion(-)
> >>  create mode 100644 drivers/net/vhost/Makefile
> >>  create mode 100644 drivers/net/vhost/rte_eth_vhost.c
> >>  create mode 100644 drivers/net/vhost/rte_pmd_vhost_version.map
> >>
> >> +struct pmd_internal {
> >> +	TAILQ_ENTRY(pmd_internal) next;
> >> +	char *dev_name;
> >> +	char *iface_name;
> >> +	unsigned nb_rx_queues;
> >> +	unsigned nb_tx_queues;
> >> +	rte_atomic16_t xfer;
> > Is this flag just used to indicate the state of the virtio_net device?
> > Ie. if =0 then virtio_dev=NULL and if =1 then virtio_net !=NULL & the
> VIRTIO_DEV_RUNNING flag is set?
> 
> Hi Clara,
> 
> I am sorry for very late reply.
> 
> Yes, it is. Probably we can optimize it more.
> I will change this implementation a bit in next patches.
> Could you please check it?
Of course, thanks.

> 
> >> +
> >> +static uint16_t
> >> +eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t nb_bufs)
> >> +{
> >> +	struct vhost_queue *r = q;
> >> +	uint16_t i, nb_tx = 0;
> >> +
> >> +	if (unlikely(r->internal == NULL))
> >> +		return 0;
> >> +
> >> +	if (unlikely(rte_atomic16_read(&r->internal->xfer) == 0))
> >> +		return 0;
> >> +
> >> +	rte_atomic16_set(&r->tx_executing, 1);
> >> +
> >> +	if (unlikely(rte_atomic16_read(&r->internal->xfer) == 0))
> >> +		goto out;
> >> +
> >> +	nb_tx = (uint16_t)rte_vhost_enqueue_burst(r->device,
> >> +			VIRTIO_RXQ, bufs, nb_bufs);
> >> +
> >> +	rte_atomic64_add(&(r->tx_pkts), nb_tx);
> >> +	rte_atomic64_add(&(r->err_pkts), nb_bufs - nb_tx);
> >> +
> >> +	for (i = 0; likely(i < nb_tx); i++)
> >> +		rte_pktmbuf_free(bufs[i]);
> > We may not always want to free these mbufs. For example, if a call is made
> to rte_eth_tx_burst with buffers from another (non DPDK) source, they may
> not be ours to free.
> 
> Sorry, I am not sure what type of buffer you want to transfer.
> 
> This is a PMD that wraps librte_vhost.
> And I guess other PMDs cannot handle buffers from another non DPDK
> source.
> Should we take care such buffers?
> 
> I have also checked af_packet PMD.
> It seems the tx function of af_packet PMD just frees mbuf.

For example if using the PMD with an application that receives buffers from another source. Eg. a virtual switch receiving packets from an interface using the kernel driver.
I see that af_packet also frees the mbuf. I've checked the ixgbe and ring pmds though and they don't seem to free the buffers, although I may have missed something, the code for these is rather large and I am unfamiliar with most of it. If I am correct though, should this behaviour vary from PMD to PMD I wonder?
> 
> >> +
> >> +
> >> +	eth_dev = rte_eth_dev_allocated(internal->dev_name);
> >> +	if (eth_dev == NULL) {
> >> +		RTE_LOG(INFO, PMD, "failuer to find ethdev\n");
> > Typo: Failure. Same for the destroy_device function
> 
> Thanks, I will fix it in next patches.
> 
> >> +		return -1;
> >> +	}
> >> +
> >> +	internal = eth_dev->data->dev_private;
> >> +
> >> +	for (i = 0; i < internal->nb_rx_queues; i++) {
> >> +		vq = &internal->rx_vhost_queues[i];
> >> +		vq->device = dev;
> >> +		vq->internal = internal;
> >> +	}
> >> +	for (i = 0; i < internal->nb_tx_queues; i++) {
> >> +		vq = &internal->tx_vhost_queues[i];
> >> +		vq->device = dev;
> >> +		vq->internal = internal;
> >> +	}
> >> +
> >> +	dev->flags |= VIRTIO_DEV_RUNNING;
> >> +	dev->priv = eth_dev;
> >> +
> >> +	eth_dev->data->dev_link.link_status = 1;
> >> +	rte_atomic16_set(&internal->xfer, 1);
> >> +
> >> +	RTE_LOG(INFO, PMD, "New connection established\n");
> >> +
> >> +	return 0;
> > Some freedom is taken away if the new_device and destroy_device
> callbacks are implemented in the driver.
> > For example if one wishes to  call the rte_vhost_enable_guest_notification
> function when a new device is brought up. They cannot now as there is no
> scope to modify these callbacks, as is done in for example the vHost sample
> app. Is this correct?
> 
> So how about adding one more parameter to be able to choose guest
> notification behavior?
> 
> ex)
> ./testpmd --vdev 'eth_vhost0,iface=/tmp/sock0,guest_notification=0'
> 
> In above case, all queues in this device will have
> VRING_USED_F_NO_NOTIFY.

I'm not too concerned about this particular function, I was just making an example. The main concern I was expressing here and in the other thread with Bruce, is the risk that we will lose some functionality available in the library but not in the PMD. This function is an example of that. If we could find some way to retain the functionality available in the library, it would be ideal.

Thanks for the response! I will review and test further patches if they become available.

Ciara

> 
> Thanks,
> Tetsuya


More information about the dev mailing list