[dpdk-dev] [PATCH v2 00/16] vhost packed ring performance optimization

Liu, Yong yong.liu at intel.com
Mon Sep 23 11:29:34 CEST 2019


Sure, I have changed the state of V1.

> -----Original Message-----
> From: Gavin Hu (Arm Technology China) [mailto:Gavin.Hu at arm.com]
> Sent: Monday, September 23, 2019 5:05 PM
> To: Liu, Yong <yong.liu at intel.com>; maxime.coquelin at redhat.com; Bie, Tiwei
> <tiwei.bie at intel.com>; Wang, Zhihong <zhihong.wang at intel.com>
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 00/16] vhost packed ring performance
> optimization
> 
> Hi Marvin,
> 
> A general comment for the series, could you mark V1 Superseded?
> 
> /Gavin
> 
> > -----Original Message-----
> > From: dev <dev-bounces at dpdk.org> On Behalf Of Marvin Liu
> > Sent: Friday, September 20, 2019 12:36 AM
> > To: maxime.coquelin at redhat.com; tiwei.bie at intel.com;
> > zhihong.wang at intel.com
> > Cc: dev at dpdk.org; Marvin Liu <yong.liu at intel.com>
> > Subject: [dpdk-dev] [PATCH v2 00/16] vhost packed ring performance
> > optimization
> >
> > Packed ring has a more compact ring format and thus can significantly
> > reduce the number of cache misses, which can lead to better performance.
> > This has been proven in the virtio user driver: on a normal E5 Xeon CPU,
> > single-core performance can rise by 12%.
> >
> > http://mails.dpdk.org/archives/dev/2018-April/095470.html
> >
> > However, vhost performance with packed ring decreased. Analysis showed
> > that most of the extra cost came from calculating each descriptor flag,
> > which depends on the ring wrap counter. Moreover, both frontend and
> > backend need to write the same descriptors, which causes cache
> > contention. In particular, while vhost is running its enqueue function,
> > the virtio packed ring refill function may write the same cache line.
> > This extra cache cost reduces the benefit of the reduced cache misses.
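For illustration, the per-descriptor flag check that depends on the wrap counter can be sketched roughly as follows. This is a minimal sketch, not code from the series: it assumes the AVAIL/USED bit positions of the virtio 1.1 packed ring layout, and the names packed_desc, desc_is_avail and used_flags are hypothetical. A real implementation would also need memory barriers around the flag accesses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bit positions, assumed from the virtio 1.1 packed ring layout. */
#define VRING_DESC_F_AVAIL (1u << 7)
#define VRING_DESC_F_USED  (1u << 15)

/* Hypothetical 16-byte packed descriptor; four fit in a 64-byte cache line. */
struct packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/*
 * A descriptor is available to the device when its AVAIL bit equals the
 * current wrap counter and its USED bit does not.
 */
static inline bool
desc_is_avail(const struct packed_desc *desc, bool wrap_counter)
{
	uint16_t flags = desc->flags;
	bool avail = !!(flags & VRING_DESC_F_AVAIL);
	bool used = !!(flags & VRING_DESC_F_USED);

	return avail == wrap_counter && used != wrap_counter;
}

/*
 * Flags written back when marking a descriptor used: both bits take the
 * value of the wrap counter.
 */
static inline uint16_t
used_flags(bool wrap_counter)
{
	return wrap_counter ? (VRING_DESC_F_AVAIL | VRING_DESC_F_USED) : 0;
}
```

Because the expected flag value flips each time the ring wraps, the check cannot be a comparison against a constant, which is the per-descriptor cost mentioned above.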
> >
> > To optimize vhost packed ring performance, the vhost enqueue and dequeue
> > functions will be split into a fast path and a normal path.
> >
> > Several methods will be used in the fast path:
> >   Unroll the burst loop function into more pieces.
> >   Handle descriptors in one cache line simultaneously.
> >   Check beforehand whether I/O space can be copied directly into mbuf
> >     space and vice versa.
> >   Check beforehand whether descriptor mapping is successful.
> >   Distinguish the vhost used ring update function by enqueue and dequeue
> >     function.
> >   Buffer as many dequeue used descriptors as possible.
> >   Update enqueue used descriptors by cache line.
> >   Cache the memory region structure for fast conversion.
> >   Disable software prefetch if hardware can do better.
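The idea of checking a whole cache line of descriptors up front, and falling back to the normal path when any check fails, might look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the patches: a 64-byte cache line, a ring base aligned to it, the virtio 1.1 flag bit positions, and hypothetical names (packed_desc, batch_is_ready, BATCH_SIZE).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u      /* assumed cache line size */
#define DESC_F_AVAIL (1u << 7)   /* assumed virtio 1.1 bit positions */
#define DESC_F_USED  (1u << 15)

/* Hypothetical 16-byte packed descriptor. */
struct packed_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/* Number of descriptors sharing one cache line (4 with the sizes above). */
#define BATCH_SIZE (CACHE_LINE_SIZE / sizeof(struct packed_desc))

/*
 * Return true when the batch starting at idx qualifies for the fast path:
 * it starts on a cache line boundary, does not cross the end of the ring,
 * every descriptor is available, and every packet fits in its buffer.
 * On false the caller would fall back to the one-packet-at-a-time path.
 */
static bool
batch_is_ready(const struct packed_desc *ring, uint16_t ring_size,
	       uint16_t idx, bool wrap, const uint32_t *pkt_len)
{
	size_t i;

	if (idx % BATCH_SIZE != 0 || idx + BATCH_SIZE > ring_size)
		return false;

	for (i = 0; i < BATCH_SIZE; i++) {
		uint16_t flags = ring[idx + i].flags;
		bool avail = !!(flags & DESC_F_AVAIL);
		bool used = !!(flags & DESC_F_USED);

		if (avail != wrap || used == wrap)
			return false;
		if (pkt_len[i] > ring[idx + i].len)
			return false;
	}
	return true;
}
```

Doing all prerequisite checks before any copy keeps the fast path free of per-packet error handling; anything unusual is left to the normal path.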
> >
> > With all these methods applied, single-core vhost PvP performance with
> > 64B packets on Xeon 8180 can improve by 40%.
> >
> > v2:
> > - Utilize compiler's pragma to unroll loop, distinguish clang/icc/gcc
> > - Buffered dequeue used desc number changed to (RING_SZ - PKT_BURST)
> > - Optimize dequeue used ring update when in_order negotiated
> >
> > Marvin Liu (16):
> >   vhost: add single packet enqueue function
> >   vhost: unify unroll pragma parameter
> >   vhost: add burst enqueue function for packed ring
> >   vhost: add single packet dequeue function
> >   vhost: add burst dequeue function
> >   vhost: rename flush shadow used ring functions
> >   vhost: flush vhost enqueue shadow ring by burst
> >   vhost: add flush function for burst enqueue
> >   vhost: buffer vhost dequeue shadow ring
> >   vhost: split enqueue and dequeue flush functions
> >   vhost: optimize enqueue function of packed ring
> >   vhost: add burst and single zero dequeue functions
> >   vhost: optimize dequeue function of packed ring
> >   vhost: cache address translation result
> >   vhost: check whether disable software pre-fetch
> >   vhost: optimize packed ring dequeue when in-order
> >
> >  lib/librte_vhost/Makefile     |   24 +
> >  lib/librte_vhost/rte_vhost.h  |   27 +
> >  lib/librte_vhost/vhost.h      |   33 +
> >  lib/librte_vhost/virtio_net.c | 1071 +++++++++++++++++++++++++++------
> >  4 files changed, 960 insertions(+), 195 deletions(-)
> >
> > --
> > 2.17.1

