[dpdk-dev] [PATCH 1/3] vhost: pre update used ring for Tx and Rx

Yuanhan Liu yuanhan.liu at linux.intel.com
Wed Jun 1 08:55:57 CEST 2016


On Wed, Jun 01, 2016 at 06:40:41AM +0000, Xie, Huawei wrote:
> >  	/* Retrieve all of the head indexes first to avoid caching issues. */
> >  	for (i = 0; i < count; i++) {
> > -		desc_indexes[i] = vq->avail->ring[(vq->last_used_idx + i) &
> > -					(vq->size - 1)];
> > +		used_idx = (vq->last_used_idx + i) & (vq->size - 1);
> > +		desc_indexes[i] = vq->avail->ring[used_idx];
> > +
> > +		vq->used->ring[used_idx].id  = desc_indexes[i];
> > +		vq->used->ring[used_idx].len = 0;
> > +		vhost_log_used_vring(dev, vq,
> > +				offsetof(struct vring_used, ring[used_idx]),
> > +				sizeof(vq->used->ring[used_idx]));
> >  	}
> >  
> >  	/* Prefetch descriptor index. */
> >  	rte_prefetch0(&vq->desc[desc_indexes[0]]);
> > -	rte_prefetch0(&vq->used->ring[vq->last_used_idx & (vq->size - 1)]);
> > -
> >  	for (i = 0; i < count; i++) {
> >  		int err;
> >  
> > -		if (likely(i + 1 < count)) {
> > +		if (likely(i + 1 < count))
> >  			rte_prefetch0(&vq->desc[desc_indexes[i + 1]]);
> > -			rte_prefetch0(&vq->used->ring[(used_idx + 1) &
> > -						      (vq->size - 1)]);
> > -		}
> >  
> >  		pkts[i] = rte_pktmbuf_alloc(mbuf_pool);
> >  		if (unlikely(pkts[i] == NULL)) {
> > @@ -916,18 +920,12 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
> >  			rte_pktmbuf_free(pkts[i]);
> >  			break;
> >  		}
> > -
> > -		used_idx = vq->last_used_idx++ & (vq->size - 1);
> > -		vq->used->ring[used_idx].id  = desc_indexes[i];
> > -		vq->used->ring[used_idx].len = 0;
> > -		vhost_log_used_vring(dev, vq,
> > -				offsetof(struct vring_used, ring[used_idx]),
> > -				sizeof(vq->used->ring[used_idx]));
> >  	}
> 
> I had tried post-updating the used ring in batch, but I forget what
> the perf change was.

I would assume pre-updating gives the better performance gain, since we
touch the avail and used rings together in the same loop, which should
be more cache friendly.
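
To make the access pattern concrete, here is a minimal sketch of the
pre-update loop. The names toy_vq, used_elem, RING_SIZE and
pre_update_used are simplified stand-ins made up for illustration, not
the DPDK definitions, and dirty logging is left out:

```c
#include <stdint.h>

/* Simplified stand-ins for the virtio ring layout (assumption). */
#define RING_SIZE 8	/* must be a power of two */

struct used_elem {
	uint32_t id;
	uint32_t len;
};

struct toy_vq {
	uint16_t avail_ring[RING_SIZE];
	struct used_elem used_ring[RING_SIZE];
	uint16_t last_used_idx;
};

/*
 * Read `count` head indexes from the avail ring and fill the matching
 * used ring slots in the same pass, so both rings are touched together.
 * last_used_idx is deliberately not advanced here.
 */
static void
pre_update_used(struct toy_vq *vq, uint16_t *desc_indexes, uint16_t count)
{
	uint16_t i;

	for (i = 0; i < count; i++) {
		uint16_t used_idx = (vq->last_used_idx + i) & (RING_SIZE - 1);

		desc_indexes[i] = vq->avail_ring[used_idx];
		vq->used_ring[used_idx].id  = desc_indexes[i];
		vq->used_ring[used_idx].len = 0;
	}
}
```

The point is only that each iteration reads one avail slot and writes
the used slot at the same index, instead of revisiting the used ring
later in the per-packet loop.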

> One optimization would be on vhost_log_used_vring.
> I have two ideas,
> a) On the QEMU side, we always assume the used ring will be changed, so
> that we don't need to log the used ring in vhost.
> 
> Michael: feasible in QEMU? comments on this?
> 
> b) We could always mark the whole used ring as modified rather than
> entry by entry.

I doubt it's worthwhile. The fact is that vhost_log_used_vring is a
no-op most of the time: it takes action only during the short window
of a live migration.

And FYI, I even tried removing all of the vhost_log_xxx calls, and it
showed no performance boost at all. Therefore, logging is not a factor
that impacts performance.
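
For illustration, a rough sketch of why such a logging helper costs
next to nothing outside migration. Everything here (toy_dev,
log_enabled, toy_log_write, the 4 KiB page size) is a hypothetical
simplification, not the actual vhost code:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* 4 KiB pages (assumption) */

struct toy_dev {
	int      log_enabled;	/* hypothetical: set only while migrating */
	uint8_t *log_base;	/* dirty bitmap, one bit per page */
	size_t   log_size;	/* bitmap size in bytes */
};

/* Mark the pages covered by [addr, addr + len) dirty, if logging is on. */
static void
toy_log_write(struct toy_dev *dev, uint64_t addr, uint64_t len)
{
	uint64_t page, last;

	/* Common case: no migration in progress, so return right away. */
	if (!dev->log_enabled || dev->log_size == 0 || len == 0)
		return;

	last = (addr + len - 1) >> TOY_PAGE_SHIFT;
	for (page = addr >> TOY_PAGE_SHIFT; page <= last; page++) {
		if (page / 8 >= dev->log_size)
			break;	/* outside the bitmap; nothing to mark */
		dev->log_base[page / 8] |= (uint8_t)(1u << (page % 8));
	}
}
```

With logging off, each call is a single well-predicted branch, which
fits with removing the calls showing no measurable difference.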

	--yliu

