[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

Yuanhan Liu yuanhan.liu at linux.intel.com
Wed Aug 19 08:28:51 CEST 2015


On Wed, Aug 19, 2015 at 05:54:09AM +0000, Ouyang, Changchun wrote:
> Hi Yuanhan,
> 
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> > Sent: Wednesday, August 19, 2015 11:53 AM
> > To: Ouyang, Changchun
> > Cc: dev at dpdk.org; Xie, Huawei
> > Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev
> > 
> > Hi Changchun,
> > 
> > On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > > Enabling this feature allows a virtio device/port on the guest to use
> > > a different vCPU to receive/transmit packets from/to each queue.
> > >
> > > In multiple queues mode, virtio device readiness means all queues of
> > > this virtio device are ready; cleanup/destroy of a virtio device also
> > > requires clearing all queues belonging to it.
> > >
> > > Signed-off-by: Changchun Ouyang <changchun.ouyang at intel.com>
> > > ---
> > [snip ..]
> > >  /*
> > > + *  Initialise all variables in vring queue pair.
> > > + */
> > > +static void
> > > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> > > +{
> > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
> > > +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
> > > +
> > > +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > > +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > > +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > > +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > > +
> > > +	/* Backends are set to -1 indicating an inactive device. */
> > > +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > +}
> > > +
> > > +/*
> > >   *  Initialise all variables in device structure.
> > >   */
> > >  static void
> > > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > >  	/* Set everything to 0. */
> > 
> > There is a trick here. Let me fill the context first:
> > 
> > 283 static void
> > 284 init_device(struct virtio_net *dev)
> > 285 {
> > 286         uint64_t vq_offset;
> > 287
> > 288         /*
> > 289          * Virtqueues have already been malloced so
> > 290          * we don't want to set them to NULL.
> > 291          */
> > 292         vq_offset = offsetof(struct virtio_net, mem);
> > 293
> > 294         /* Set everything to 0. */
> > 295         memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > 296                 (sizeof(struct virtio_net) - (size_t)vq_offset));
> > 297
> > 298         init_vring_queue_pair(dev, 0);
> > 
> > This piece of code's intention is to memset everything to zero, except the
> > `virtqueue' field, for, as the comment stated, we have already allocated
> > virtqueue.
> > 
> > It works only when `virtqueue' field is before `mem' field, and it was
> > before:
> > 
> >     struct virtio_net {
> >             struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];        /**< Contains all virtqueue information. */
> >             struct virtio_memory    *mem;           /**< QEMU memory and memory region information. */
> >             ...
> > 
> > After this patch, it becomes:
> > 
> >     struct virtio_net {
> >             struct virtio_memory    *mem;           /**< QEMU memory and memory region information. */
> >             struct vhost_virtqueue  **virtqueue;    /**< Contains all virtqueue information. */
> >             ...
> > 
> > 
> > Which actually wipes everything inside `struct virtio_net`, setting
> > `virtqueue' to NULL as well.
> > 
> > While reading the code (without your patch applied), I thought it was
> > error-prone, as it is very likely that someone besides the author doesn't
> > know about such an undocumented rule. And you just gave me an example :)
> > 
> > Huawei, I'm proposing a fix to call rte_zmalloc() for allocating new_ll_dev,
> > to get rid of such an issue. What do you think?
> > 
> > 	--yliu
> > 
> > 
> 
> I suggest you finish reviewing the later patch first:
> [PATCH v4 04/12] vhost: set memory layout for multiple queues mode.
> After you finish reviewing that patch, I think you will change your mind :-)
> 
> That patch resolves your concern.

Sorry, I hadn't gone that far yet. And yes, I see: you moved the barrier
to the `features' field, which puts the `virtqueue' field back into the
"do not set to zero" zone.

It's still an undocumented rule, and hence error-prone, IMO. But you
reminded me that init_device() is also invoked elsewhere (reset_owner()).
Hence, my solution won't work either.

I'm wondering about saving the `virtqueue' (and `mem_arr' from patch
04/12) pointers explicitly before the memset() and restoring them
afterwards, to get rid of the undocumented rule. It may become uglier
as more and more fields need to be saved this way, but given that we
have only two such fields so far, I'm kind of okay with that.

What do you think? If that doesn't work, we should at least add some
comments inside the virtio_net struct, or even add a build-time check.

	--yliu

> > 
> > >  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > >  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> > > -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
> > > -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
> > >
> > > -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > > -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> > > -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > > -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> > > +	init_vring_queue_pair(dev, 0);
> > > +	dev->virt_qp_nb = 1;
> > > +}
> > > +
> > > +/*
> > > + *  Alloc mem for vring queue pair.
> > > + */
> > > +int
> > > +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> > > +{
> > > +	struct vhost_virtqueue *virtqueue = NULL;
> > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > >
> > > -	/* Backends are set to -1 indicating an inactive device. */
> > > -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> > > -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> > > +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
> > > +	if (virtqueue == NULL) {
> > > +		RTE_LOG(ERR, VHOST_CONFIG,
> > > +			"Failed to allocate memory for virt qp:%d.\n",
> > > +			qp_idx);
> > > +		return -1;
> > > +	}
> > > +
> > > +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> > > +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> > > +
> > > +	init_vring_queue_pair(dev, qp_idx);
> > > +
> > > +	return 0;
> > >  }
> > >
> > >  /*
> > > @@ -280,7 +333,6 @@ static int
> > >  new_device(struct vhost_device_ctx ctx)
> > >  {
> > >  	struct virtio_net_config_ll *new_ll_dev;
> > > -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
> > >
> > >  	/* Setup device and virtqueues. */
> > >  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll), 0);
> > > @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
> > >  		return -1;
> > >  	}
> > >
> > > -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > -	if (virtqueue_rx == NULL) {
> > > -		rte_free(new_ll_dev);
> > > +	new_ll_dev->dev.virtqueue =
> > > +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX *
> > > +			sizeof(struct vhost_virtqueue *), 0);
> > > +	if (new_ll_dev->dev.virtqueue == NULL) {
> > >  		RTE_LOG(ERR, VHOST_CONFIG,
> > > -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> > > +			"(%"PRIu64") Failed to allocate memory for dev.virtqueue.\n",
> > >  			ctx.fh);
> > > +		rte_free(new_ll_dev);
> > >  		return -1;
> > >  	}
> > >
> > > -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > -	if (virtqueue_tx == NULL) {
> > > -		rte_free(virtqueue_rx);
> > > +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> > > +		rte_free(new_ll_dev->dev.virtqueue);
> > >  		rte_free(new_ll_dev);
> > > -		RTE_LOG(ERR, VHOST_CONFIG,
> > > -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> > > -			ctx.fh);
> > >  		return -1;
> > >  	}
> > >
> > > -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> > > -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> > > -
> > >  	/* Initialise device and virtqueues. */
> > >  	init_device(&new_ll_dev->dev);
> > >
> > > @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
> > >   * Called from CUSE IOCTL: VHOST_RESET_OWNER
> > >   */
> > >  static int
> > > -reset_owner(struct vhost_device_ctx ctx)
> > > +reset_owner(__rte_unused struct vhost_device_ctx ctx)
> > >  {
> > >  	struct virtio_net_config_ll *ll_dev;
> > >
> > > @@ -434,6 +480,7 @@ static int
> > >  set_features(struct vhost_device_ctx ctx, uint64_t *pu)
> > >  {
> > >  	struct virtio_net *dev;
> > > +	uint32_t q_idx;
> > >
> > >  	dev = get_device(ctx);
> > >  	if (dev == NULL)
> > > @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
> > >  	dev->features = *pu;
> > >
> > >  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
> > > -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > -		LOG_DEBUG(VHOST_CONFIG,
> > > -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> > > -			dev->device_fh);
> > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > -	} else {
> > > -		LOG_DEBUG(VHOST_CONFIG,
> > > -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> > > -			dev->device_fh);
> > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr);
> > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > -			sizeof(struct virtio_net_hdr);
> > > +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> > > +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > +			LOG_DEBUG(VHOST_CONFIG,
> > > +				"(%"PRIu64") Mergeable RX buffers enabled\n",
> > > +				dev->device_fh);
> > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > +		} else {
> > > +			LOG_DEBUG(VHOST_CONFIG,
> > > +				"(%"PRIu64") Mergeable RX buffers disabled\n",
> > > +				dev->device_fh);
> > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr);
> > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > +				sizeof(struct virtio_net_hdr);
> > > +		}
> > >  	}
> > >  	return 0;
> > >  }
> > > @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
> > >  	return -1;
> > >  }
> > >
> > > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
> > > +{
> > > +	if (dev == NULL)
> > > +		return 0;
> > > +
> > > +	return dev->virt_qp_nb;
> > > +}
> > > +
> > >  /*
> > >   * Register ops so that we can add/remove device to data core.
> > >   */
> > > --
> > > 1.8.4.2
> > >

