[dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in virtio dev

Yuanhan Liu yuanhan.liu at linux.intel.com
Wed Aug 19 08:39:47 CEST 2015


On Wed, Aug 19, 2015 at 02:28:51PM +0800, Yuanhan Liu wrote:
> On Wed, Aug 19, 2015 at 05:54:09AM +0000, Ouyang, Changchun wrote:
> > Hi Yuanhan,
> > 
> > > -----Original Message-----
> > > From: Yuanhan Liu [mailto:yuanhan.liu at linux.intel.com]
> > > Sent: Wednesday, August 19, 2015 11:53 AM
> > > To: Ouyang, Changchun
> > > Cc: dev at dpdk.org; Xie, Huawei
> > > Subject: Re: [dpdk-dev] [PATCH v4 02/12] vhost: support multiple queues in
> > > virtio dev
> > > 
> > > Hi Changchun,
> > > 
> > > On Wed, Aug 12, 2015 at 04:02:37PM +0800, Ouyang Changchun wrote:
> > > > Each virtio device could have multiple queues, say 2 or 4, at most 8.
> > > > Enabling this feature allows a virtio device/port on the guest to
> > > > use a different vCPU to receive/transmit packets from/to each
> > > > queue.
> > > >
> > > > In multiple queues mode, a virtio device is ready only when all of
> > > > its queues are ready; likewise, cleaning up/destroying a virtio
> > > > device requires clearing all queues that belong to it.
> > > >
> > > > Signed-off-by: Changchun Ouyang <changchun.ouyang at intel.com>
> > > > ---
> > > [snip ..]
> > > >  /*
> > > > + *  Initialise all variables in vring queue pair.
> > > > + */
> > > > +static void
> > > > +init_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> > > > +{
> > > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > > +	memset(dev->virtqueue[virt_rx_q_idx], 0, sizeof(struct vhost_virtqueue));
> > > > +	memset(dev->virtqueue[virt_tx_q_idx], 0, sizeof(struct vhost_virtqueue));
> > > > +
> > > > +	dev->virtqueue[virt_rx_q_idx]->kickfd = (eventfd_t)-1;
> > > > +	dev->virtqueue[virt_rx_q_idx]->callfd = (eventfd_t)-1;
> > > > +	dev->virtqueue[virt_tx_q_idx]->kickfd = (eventfd_t)-1;
> > > > +	dev->virtqueue[virt_tx_q_idx]->callfd = (eventfd_t)-1;
> > > > +
> > > > +	/* Backends are set to -1 indicating an inactive device. */
> > > > +	dev->virtqueue[virt_rx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > > +	dev->virtqueue[virt_tx_q_idx]->backend = VIRTIO_DEV_STOPPED;
> > > > +}
> > > > +
> > > > +/*
> > > >   *  Initialise all variables in device structure.
> > > >   */
> > > >  static void
> > > > @@ -258,17 +294,34 @@ init_device(struct virtio_net *dev)
> > > >  	/* Set everything to 0. */
> > > 
> > > There is a trick here. Let me fill in the context first:
> > > 
> > > 283 static void
> > > 284 init_device(struct virtio_net *dev)
> > > 285 {
> > > 286         uint64_t vq_offset;
> > > 287
> > > 288         /*
> > > 289          * Virtqueues have already been malloced so
> > > 290          * we don't want to set them to NULL.
> > > 291          */
> > > 292         vq_offset = offsetof(struct virtio_net, mem);
> > > 293
> > > 294         /* Set everything to 0. */
> > > 295         memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > > 296                 (sizeof(struct virtio_net) - (size_t)vq_offset));
> > > 297
> > > 298         init_vring_queue_pair(dev, 0);
> > > 
> > > The intention of this piece of code is to memset everything to zero,
> > > except the `virtqueue' field, since, as the comment states, the
> > > virtqueues have already been allocated.
> > > 
> > > It works only when the `virtqueue' field comes before the `mem' field,
> > > which it did before:
> > > 
> > >     struct virtio_net {
> > >             struct vhost_virtqueue  *virtqueue[VIRTIO_QNUM];        /**< Contains all virtqueue information. */
> > >             struct virtio_memory    *mem;           /**< QEMU memory and memory region information. */
> > >             ...
> > > 
> > > After this patch, it becomes:
> > > 
> > >     struct virtio_net {
> > >             struct virtio_memory    *mem;           /**< QEMU memory and memory region information. */
> > >             struct vhost_virtqueue  **virtqueue;    /**< Contains all virtqueue information. */
> > >             ...
> > > 
> > > 
> > > This actually wipes everything inside `struct virtio_net', setting
> > > `virtqueue' to NULL as well.
> > > 
> > > While reading the code (without your patch applied), I thought it was
> > > error-prone, as it is very likely that someone besides the author
> > > wouldn't know such an undocumented rule. And you just gave me an example :)
> > > 
> > > Huawei, I'm proposing a fix that calls rte_zmalloc() when allocating
> > > new_ll_dev, to get rid of this issue. What do you think?
> > > 
> > > 	--yliu
> > > 
> > > 
> > 
> > Suggest you finish reviewing the later patch first:
> > [PATCH v4 04/12] vhost: set memory layout for multiple queues mode.
> > After you finish reviewing that patch, I think you will change your mind :-)
> > 
> > This patch resolves the concern you raised.
> 
> Sorry, I hadn't gone that far yet. And yes, I see. I found that you
> moved the barrier to the `features' field, which puts the `virtqueue'
> field back in the "do not set to zero" zone.
> 
> It's still an undocumented rule, and hence error-prone, IMO. But you
> reminded me that init_device() is also invoked elsewhere (from
> reset_owner()). Hence, my solution won't work either.
> 
> I'm wondering about saving `virtqueue' (and `mem_arr' from patch 04/12)
> explicitly before the memset() and restoring them afterwards, to get
> rid of the undocumented rule. It may become uglier as more and more
> fields need to be saved this way, but given that we have only two
> fields so far, I'm kind of okay with that.

I thought about it again. Nah, let's forget that. It's not flexible
for array fields like "virtqueue[]".

	--yliu
> 
> What do you think, then? If that doesn't work, we should at least add
> some comments inside the virtio_net struct, or even a build-time check.
> 
> 	--yliu
> 
> > > 
> > > >  	memset((void *)(uintptr_t)((uint64_t)(uintptr_t)dev + vq_offset), 0,
> > > >  		(sizeof(struct virtio_net) - (size_t)vq_offset));
> > > > -	memset(dev->virtqueue[VIRTIO_RXQ], 0, sizeof(struct vhost_virtqueue));
> > > > -	memset(dev->virtqueue[VIRTIO_TXQ], 0, sizeof(struct vhost_virtqueue));
> > > >
> > > > -	dev->virtqueue[VIRTIO_RXQ]->kickfd = (eventfd_t)-1;
> > > > -	dev->virtqueue[VIRTIO_RXQ]->callfd = (eventfd_t)-1;
> > > > -	dev->virtqueue[VIRTIO_TXQ]->kickfd = (eventfd_t)-1;
> > > > -	dev->virtqueue[VIRTIO_TXQ]->callfd = (eventfd_t)-1;
> > > > +	init_vring_queue_pair(dev, 0);
> > > > +	dev->virt_qp_nb = 1;
> > > > +}
> > > > +
> > > > +/*
> > > > + *  Alloc mem for vring queue pair.
> > > > + */
> > > > +int
> > > > +alloc_vring_queue_pair(struct virtio_net *dev, uint16_t qp_idx)
> > > > +{
> > > > +	struct vhost_virtqueue *virtqueue = NULL;
> > > > +	uint32_t virt_rx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > > +	uint32_t virt_tx_q_idx = qp_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > >
> > > > -	/* Backends are set to -1 indicating an inactive device. */
> > > > -	dev->virtqueue[VIRTIO_RXQ]->backend = VIRTIO_DEV_STOPPED;
> > > > -	dev->virtqueue[VIRTIO_TXQ]->backend = VIRTIO_DEV_STOPPED;
> > > > +	virtqueue = rte_malloc(NULL, sizeof(struct vhost_virtqueue) * VIRTIO_QNUM, 0);
> > > > +	if (virtqueue == NULL) {
> > > > +		RTE_LOG(ERR, VHOST_CONFIG,
> > > > +			"Failed to allocate memory for virt qp:%d.\n",
> > > > +			qp_idx);
> > > > +		return -1;
> > > > +	}
> > > > +
> > > > +	dev->virtqueue[virt_rx_q_idx] = virtqueue;
> > > > +	dev->virtqueue[virt_tx_q_idx] = virtqueue + VIRTIO_TXQ;
> > > > +
> > > > +	init_vring_queue_pair(dev, qp_idx);
> > > > +
> > > > +	return 0;
> > > >  }
> > > >
> > > >  /*
> > > > @@ -280,7 +333,6 @@ static int
> > > >  new_device(struct vhost_device_ctx ctx)
> > > >  {
> > > >  	struct virtio_net_config_ll *new_ll_dev;
> > > > -	struct vhost_virtqueue *virtqueue_rx, *virtqueue_tx;
> > > >
> > > >  	/* Setup device and virtqueues. */
> > > >  	new_ll_dev = rte_malloc(NULL, sizeof(struct virtio_net_config_ll), 0);
> > > > @@ -291,28 +343,22 @@ new_device(struct vhost_device_ctx ctx)
> > > >  		return -1;
> > > >  	}
> > > >
> > > > -	virtqueue_rx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > > -	if (virtqueue_rx == NULL) {
> > > > -		rte_free(new_ll_dev);
> > > > +	new_ll_dev->dev.virtqueue =
> > > > +		rte_malloc(NULL, VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX *
> > > > +			sizeof(struct vhost_virtqueue *), 0);
> > > > +	if (new_ll_dev->dev.virtqueue == NULL) {
> > > >  		RTE_LOG(ERR, VHOST_CONFIG,
> > > > -			"(%"PRIu64") Failed to allocate memory for rxq.\n",
> > > > +			"(%"PRIu64") Failed to allocate memory for dev.virtqueue.\n",
> > > >  			ctx.fh);
> > > > +		rte_free(new_ll_dev);
> > > >  		return -1;
> > > >  	}
> > > >
> > > > -	virtqueue_tx = rte_malloc(NULL, sizeof(struct vhost_virtqueue), 0);
> > > > -	if (virtqueue_tx == NULL) {
> > > > -		rte_free(virtqueue_rx);
> > > > +	if (alloc_vring_queue_pair(&new_ll_dev->dev, 0) == -1) {
> > > > +		rte_free(new_ll_dev->dev.virtqueue);
> > > >  		rte_free(new_ll_dev);
> > > > -		RTE_LOG(ERR, VHOST_CONFIG,
> > > > -			"(%"PRIu64") Failed to allocate memory for txq.\n",
> > > > -			ctx.fh);
> > > >  		return -1;
> > > >  	}
> > > >
> > > > -	new_ll_dev->dev.virtqueue[VIRTIO_RXQ] = virtqueue_rx;
> > > > -	new_ll_dev->dev.virtqueue[VIRTIO_TXQ] = virtqueue_tx;
> > > > -
> > > >  	/* Initialise device and virtqueues. */
> > > >  	init_device(&new_ll_dev->dev);
> > > >
> > > > @@ -396,7 +442,7 @@ set_owner(struct vhost_device_ctx ctx)
> > > >   * Called from CUSE IOCTL: VHOST_RESET_OWNER
> > > >   */
> > > >  static int
> > > > -reset_owner(struct vhost_device_ctx ctx)
> > > > +reset_owner(__rte_unused struct vhost_device_ctx ctx)
> > > >  {
> > > >  	struct virtio_net_config_ll *ll_dev;
> > > >
> > > > @@ -434,6 +480,7 @@ static int
> > > >  set_features(struct vhost_device_ctx ctx, uint64_t *pu)
> > > >  {
> > > >  	struct virtio_net *dev;
> > > > +	uint32_t q_idx;
> > > >
> > > >  	dev = get_device(ctx);
> > > >  	if (dev == NULL)
> > > > @@ -445,22 +492,26 @@ set_features(struct vhost_device_ctx ctx, uint64_t *pu)
> > > >  	dev->features = *pu;
> > > >
> > > >  	/* Set the vhost_hlen depending on if VIRTIO_NET_F_MRG_RXBUF is set. */
> > > > -	if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > > -		LOG_DEBUG(VHOST_CONFIG,
> > > > -			"(%"PRIu64") Mergeable RX buffers enabled\n",
> > > > -			dev->device_fh);
> > > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > -	} else {
> > > > -		LOG_DEBUG(VHOST_CONFIG,
> > > > -			"(%"PRIu64") Mergeable RX buffers disabled\n",
> > > > -			dev->device_fh);
> > > > -		dev->virtqueue[VIRTIO_RXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr);
> > > > -		dev->virtqueue[VIRTIO_TXQ]->vhost_hlen =
> > > > -			sizeof(struct virtio_net_hdr);
> > > > +	for (q_idx = 0; q_idx < dev->virt_qp_nb; q_idx++) {
> > > > +		uint32_t virt_rx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_RXQ;
> > > > +		uint32_t virt_tx_q_idx = q_idx * VIRTIO_QNUM + VIRTIO_TXQ;
> > > > +		if (dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF)) {
> > > > +			LOG_DEBUG(VHOST_CONFIG,
> > > > +				"(%"PRIu64") Mergeable RX buffers enabled\n",
> > > > +				dev->device_fh);
> > > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > > > +		} else {
> > > > +			LOG_DEBUG(VHOST_CONFIG,
> > > > +				"(%"PRIu64") Mergeable RX buffers disabled\n",
> > > > +				dev->device_fh);
> > > > +			dev->virtqueue[virt_rx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr);
> > > > +			dev->virtqueue[virt_tx_q_idx]->vhost_hlen =
> > > > +				sizeof(struct virtio_net_hdr);
> > > > +		}
> > > >  	}
> > > >  	return 0;
> > > >  }
> > > > @@ -826,6 +877,14 @@ int rte_vhost_feature_enable(uint64_t feature_mask)
> > > >  	return -1;
> > > >  }
> > > >
> > > > +uint16_t rte_vhost_qp_num_get(struct virtio_net *dev)
> > > > +{
> > > > +	if (dev == NULL)
> > > > +		return 0;
> > > > +
> > > > +	return dev->virt_qp_nb;
> > > > +}
> > > > +
> > > >  /*
> > > >   * Register ops so that we can add/remove device to data core.
> > > >   */
> > > > --
> > > > 1.8.4.2
> > > >

