[dpdk-dev] [PATCH] vhost: batch used descriptors chains write-back with packed ring
Ilya Maximets
i.maximets at samsung.com
Wed Dec 12 16:23:22 CET 2018
On 12.12.2018 11:24, Maxime Coquelin wrote:
> Instead of writing back descriptors chains in order, let's
> write the first chain flags last in order to improve batching.
>
> With Kernel's pktgen benchmark, ~3% performance gain is measured.
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin at redhat.com>
> ---
> lib/librte_vhost/virtio_net.c | 39 +++++++++++++++++++++--------------
> 1 file changed, 24 insertions(+), 15 deletions(-)
>
Hi.
I did some rough testing on my ARMv8 system with this patch (v2) and with v1 of it.
Here is the performance difference with current master:
v1: +1.1 %
v2: -3.6 %
So, write barriers are quite heavy in practice.
My test case is three instances of testpmd on the same host (with v11 from Jens):
txonly (virtio_user0) --> fwd mode io (vhost0, vhost1) --> rxonly (virtio_user1)
Best regards, Ilya Maximets.
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 5e1a1a727..c0b3d1137 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -135,19 +135,10 @@ flush_shadow_used_ring_packed(struct virtio_net *dev,
> struct vhost_virtqueue *vq)
> {
> int i;
> - uint16_t used_idx = vq->last_used_idx;
> + uint16_t head_flags, head_idx = vq->last_used_idx;
>
> - /* Split loop in two to save memory barriers */
> - for (i = 0; i < vq->shadow_used_idx; i++) {
> - vq->desc_packed[used_idx].id = vq->shadow_used_packed[i].id;
> - vq->desc_packed[used_idx].len = vq->shadow_used_packed[i].len;
> -
> - used_idx += vq->shadow_used_packed[i].count;
> - if (used_idx >= vq->size)
> - used_idx -= vq->size;
> - }
> -
> - rte_smp_wmb();
> + if (unlikely(vq->shadow_used_idx == 0))
> + return;
>
> for (i = 0; i < vq->shadow_used_idx; i++) {
> uint16_t flags;
> @@ -165,12 +156,24 @@ flush_shadow_used_ring_packed(struct virtio_net *dev,
> flags &= ~VRING_DESC_F_AVAIL;
> }
>
> - vq->desc_packed[vq->last_used_idx].flags = flags;
> + vq->desc_packed[vq->last_used_idx].id =
> + vq->shadow_used_packed[i].id;
> + vq->desc_packed[vq->last_used_idx].len =
> + vq->shadow_used_packed[i].len;
> +
> + rte_smp_wmb();
>
> - vhost_log_cache_used_vring(dev, vq,
> + if (i > 0) {
> + vq->desc_packed[vq->last_used_idx].flags = flags;
> +
> + vhost_log_cache_used_vring(dev, vq,
> vq->last_used_idx *
> sizeof(struct vring_packed_desc),
> sizeof(struct vring_packed_desc));
> + } else {
> + head_idx = vq->last_used_idx;
> + head_flags = flags;
> + }
>
> vq->last_used_idx += vq->shadow_used_packed[i].count;
> if (vq->last_used_idx >= vq->size) {
> @@ -179,8 +182,14 @@ flush_shadow_used_ring_packed(struct virtio_net *dev,
> }
> }
>
> - rte_smp_wmb();
> + vq->desc_packed[head_idx].flags = head_flags;
> vq->shadow_used_idx = 0;
> +
> + vhost_log_cache_used_vring(dev, vq,
> + head_idx *
> + sizeof(struct vring_packed_desc),
> + sizeof(struct vring_packed_desc));
> +
> vhost_log_cache_sync(dev, vq);
> }
>
>