[dpdk-dev] vhost: batch used descriptors chains write-back with packed ring

Ilya Maximets i.maximets at samsung.com
Wed Dec 5 17:01:23 CET 2018


On 28.11.2018 12:47, Maxime Coquelin wrote:
> Instead of writing back descriptors chains in order, let's
> write the first chain flags last in order to improve batching.

I'm not sure this is fully compliant with the virtio spec.
It says that 'each side (driver and device) are only required to poll
(or test) a single location in memory', but it does not forbid testing
other descriptors. So, if the driver checks not only
'the next device descriptor after the one they processed previously,
in circular order' but also a few descriptors ahead, it could read
inconsistent memory, because there is no longer a write barrier
between the updates of flags and of id/len for those descriptors.
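
To illustrate what I mean, here is a minimal driver-side sketch
(hypothetical code, not from this patch; desc_is_used() and process()
are made-up helpers declared only so the sketch is self-contained):

#include <stdint.h>

struct vring_packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/* Made-up helpers for illustration only. */
extern int desc_is_used(uint16_t flags, int wrap_counter);
extern void process(uint16_t id, uint32_t len);

static void
driver_poll_ahead(struct vring_packed_desc *ring, uint16_t size,
		  uint16_t next_idx, int wrap_counter)
{
	/* Peek one descriptor past the one processed previously.
	 * The spec only *requires* polling a single location,
	 * it does not forbid looking further ahead. */
	struct vring_packed_desc *desc = &ring[(next_idx + 1) % size];

	if (desc_is_used(desc->flags, wrap_counter)) {
		/* With this patch, for i > 0 there is no write
		 * barrier between the device's id/len stores and
		 * its flags store, so on a weakly ordered CPU the
		 * flags store may become visible first and the
		 * id/len read below could return stale values. */
		process(desc->id, desc->len);
	}
}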

What do you think?

> 
> With Kernel's pktgen benchmark, ~3% performance gain is measured.
> 
> Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> Tested-by: Jens Freimann <jfreimann at redhat.com>
> Reviewed-by: Jens Freimann <jfreimann at redhat.com>
> ---
>  lib/librte_vhost/virtio_net.c | 37 ++++++++++++++++++++++-------------
>  1 file changed, 23 insertions(+), 14 deletions(-)
> 
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 5e1a1a727..f54642c2d 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -135,19 +135,10 @@ flush_shadow_used_ring_packed(struct virtio_net *dev,
>  			struct vhost_virtqueue *vq)
>  {
>  	int i;
> -	uint16_t used_idx = vq->last_used_idx;
> +	uint16_t head_flags, head_idx = vq->last_used_idx;
>  
> -	/* Split loop in two to save memory barriers */
> -	for (i = 0; i < vq->shadow_used_idx; i++) {
> -		vq->desc_packed[used_idx].id = vq->shadow_used_packed[i].id;
> -		vq->desc_packed[used_idx].len = vq->shadow_used_packed[i].len;
> -
> -		used_idx += vq->shadow_used_packed[i].count;
> -		if (used_idx >= vq->size)
> -			used_idx -= vq->size;
> -	}
> -
> -	rte_smp_wmb();
> +	if (unlikely(vq->shadow_used_idx == 0))
> +		return;
>  
>  	for (i = 0; i < vq->shadow_used_idx; i++) {
>  		uint16_t flags;
> @@ -165,12 +156,22 @@ flush_shadow_used_ring_packed(struct virtio_net *dev,
>  			flags &= ~VRING_DESC_F_AVAIL;
>  		}
>  
> -		vq->desc_packed[vq->last_used_idx].flags = flags;
> +		vq->desc_packed[vq->last_used_idx].id =
> +			vq->shadow_used_packed[i].id;
> +		vq->desc_packed[vq->last_used_idx].len =
> +			vq->shadow_used_packed[i].len;
> +
> +		if (i > 0) {
> +			vq->desc_packed[vq->last_used_idx].flags = flags;
>  
> -		vhost_log_cache_used_vring(dev, vq,
> +			vhost_log_cache_used_vring(dev, vq,
>  					vq->last_used_idx *
>  					sizeof(struct vring_packed_desc),
>  					sizeof(struct vring_packed_desc));
> +		} else {
> +			head_idx = vq->last_used_idx;
> +			head_flags = flags;
> +		}
>  
>  		vq->last_used_idx += vq->shadow_used_packed[i].count;
>  		if (vq->last_used_idx >= vq->size) {
> @@ -180,7 +181,15 @@ flush_shadow_used_ring_packed(struct virtio_net *dev,
>  	}
>  
>  	rte_smp_wmb();
> +
> +	vq->desc_packed[head_idx].flags = head_flags;
>  	vq->shadow_used_idx = 0;
> +
> +	vhost_log_cache_used_vring(dev, vq,
> +				head_idx *
> +				sizeof(struct vring_packed_desc),
> +				sizeof(struct vring_packed_desc));
> +
>  	vhost_log_cache_sync(dev, vq);
>  }
>  
> 

