[dpdk-dev] [PATCH 0/8] Introduce emudev library and iavf emudev driver

Xia, Chenbo chenbo.xia at intel.com
Tue Dec 22 04:09:19 CET 2020


Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin at redhat.com>
> Sent: Monday, December 21, 2020 8:02 PM
> To: Xia, Chenbo <chenbo.xia at intel.com>; Thomas Monjalon <thomas at monjalon.net>;
> David Marchand <david.marchand at redhat.com>
> Cc: dev <dev at dpdk.org>; Stephen Hemminger <stephen at networkplumber.org>; Liang,
> Cunming <cunming.liang at intel.com>; Lu, Xiuchun <xiuchun.lu at intel.com>; Li,
> Miao <miao.li at intel.com>; Wu, Jingjing <jingjing.wu at intel.com>
> Subject: Re: [dpdk-dev] [PATCH 0/8] Introduce emudev library and iavf emudev
> driver
> 
> 
> 
> On 12/21/20 10:52 AM, Maxime Coquelin wrote:
> > Hi Chenbo,
> >
> > On 12/19/20 7:11 AM, Xia, Chenbo wrote:
> >> Hi David,
> >>
> >>> -----Original Message-----
> >>> From: David Marchand <david.marchand at redhat.com>
> >>> Sent: Friday, December 18, 2020 5:54 PM
> >>> To: Xia, Chenbo <chenbo.xia at intel.com>
> >>> Cc: dev <dev at dpdk.org>; Thomas Monjalon <thomas at monjalon.net>; Stephen
> >>> Hemminger <stephen at networkplumber.org>; Liang, Cunming
> >>> <cunming.liang at intel.com>; Lu, Xiuchun <xiuchun.lu at intel.com>; Li, Miao
> >>> <miao.li at intel.com>; Wu, Jingjing <jingjing.wu at intel.com>
> >>> Subject: Re: [PATCH 0/8] Introduce emudev library and iavf emudev driver
> >>>
> >>> On Fri, Dec 18, 2020 at 9:02 AM Chenbo Xia <chenbo.xia at intel.com> wrote:
> >>>>
> >>>> This series introduces a new device abstraction called emudev for
> >>>> emulated devices. A new library (librte_emudev) is implemented. The
> >>>> first emudev driver is also introduced, which emulates the Intel
> >>>> Adaptive Virtual Function (iavf) as a software network device.
> >>>>
> >>>> This series has a dependency on librte_vfio_user patch series:
> >>>> http://patchwork.dpdk.org/cover/85389/
> >>>>
> >>>> Background & Motivation
> >>>> -----------------------
> >>>> The disaggregated/multi-process QEMU is using VFIO-over-socket/vfio-user
> >>>> as the main transport mechanism to disaggregate IO services from QEMU.
> >>>> Therefore, librte_vfio_user is introduced in DPDK to accommodate
> >>>> emulated devices for high performance I/O. Although the vfio-user
> >>>> library provides the possibility of emulating devices in DPDK, DPDK
> >>>> does not have a device abstraction for emulated devices. A good device
> >>>> abstraction will be useful for applications or high performance data
> >>>> path drivers. With this consideration, the emudev library is designed
> >>>> and implemented. It also makes it possible to keep a modular design for
> >>>> emulated devices by implementing data path related logic in a
> >>>> standalone driver (e.g., an ethdev driver) while keeping the unrelated
> >>>> logic in the emudev driver.
> >>>
> >>> Since you mention performance, how does it compare to vhost-user/virtio?
> >>
> >> I think it depends on the device specification (i.e., how complex its
> >> data path handling is). A first try on the iavf spec shows better
> >> performance than virtio in our simple tests.
> >
> > That's interesting! How big is the performance difference? And how do
> > we explain it?
> >
> > If there are improvements that could be done in the Virtio
> > specification, it would be great to know and work on their
> > implementations. It worries me a bit that everyone is coming with
> > their own device emulation every release, making things like live
> > migration difficult to achieve in the future.
> 
> I did a quick review of the IAVF emudev driver to understand what other
> factors than ring layout could explain a performance gain.
> 
> My understanding is that part of the performance gain may come from
> the following things that are supported/implemented in the Vhost-user
> backend and not in the IAVF driver:
> 1. Memory hotplug. It seems the datapath is not safe against memory
> hotplug in the VM, which causes the memory tables to be updated
> asynchronously from the datapath. In order to support it in the
> Vhost-user library, we had to introduce locks to ensure the datapath
> isn't accessing the shared memory while it is being remapped.

I think it now uses a similar approach to what vhost-user does.

First, the vfio-user patch series introduces a lock_dp callback to lock the
data path when messages like DMA MAP/UNMAP arrive. Our data path driver
uses it to lock its data path.
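
To illustrate the idea (the struct and function names below are made up for
this sketch and are not the exact librte_vfio_user API), the pattern is
roughly:

#include <stdint.h>
#include <rte_spinlock.h>

/* Sketch only: names are illustrative, not the real API. */
struct iavf_emu_dp {
	rte_spinlock_t dp_lock;   /* protects the rx/tx burst path */
	/* queues, DMA memory table, ... */
};

/* Registered by the data path driver; the device emulation invokes it
 * when a control message such as DMA MAP/UNMAP arrives. */
static void
iavf_emu_lock_dp(void *priv, int lock)
{
	struct iavf_emu_dp *dp = priv;

	if (lock)
		rte_spinlock_lock(&dp->dp_lock);
	else
		rte_spinlock_unlock(&dp->dp_lock);
}

/* The burst path takes the same lock, so address translation never
 * races with a memory table update. */
static uint16_t
iavf_emu_rx_burst(struct iavf_emu_dp *dp)
{
	uint16_t nb_rx = 0;

	rte_spinlock_lock(&dp->dp_lock);
	/* translate descriptors and receive packets here */
	rte_spinlock_unlock(&dp->dp_lock);

	return nb_rx;
}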

Note that the data path handling is in our data path driver:
http://patchwork.dpdk.org/cover/85500/

To keep the design modular, the iavf_emu driver emulates the device, while
the iavf back-end driver handles the data path.

> 
> 2. Live-migration. This feature does not seem supported in the driver,
> as I could not find a dirty page tracking mechanism. On the Vhost-user
> side, supporting it implies adding more branch conditions in the hot
> path, even when it is not used.

Yes, we don't support this in the current version. And yes, when we do
support this feature, it will add complexity to the data path.
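
To give an idea of the cost, dirty page tracking usually means a check like
the one below on every guest-memory write in the hot path (a purely
conceptual sketch with made-up names, not code from this series):

#include <stdint.h>
#include <stdbool.h>

/* Conceptual sketch only; names are made up. */
struct emu_dev {
	bool log_enabled;   /* true while live migration is running */
	uint8_t *log_base;  /* dirty-page bitmap shared with migration code */
};

static inline void
emu_log_write(struct emu_dev *dev, uint64_t gpa, uint64_t len)
{
	uint64_t page, last;

	/* Extra branch taken on every write, even with migration off. */
	if (!dev->log_enabled || len == 0)
		return;

	last = (gpa + len - 1) >> 12;
	for (page = gpa >> 12; page <= last; page++)
		__atomic_fetch_or(&dev->log_base[page / 8],
				  (uint8_t)(1 << (page % 8)), __ATOMIC_RELAXED);
}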

> 
> 3. No IOMMU support. Same here, this is supported in the Vhost-user
> library, and adding its support in the IAVF driver would introduce some
> more branches in the hot path even when not used.

Yes, vIOMMU is not fully supported in the vfio-user spec for now, and I
agree that when we have to support it, it will slow down the data path.

> 
> 4. Out-of-bound memory access checks. While
> rte_iavf_emu_get_dma_vaddr() provides a way to ensure the full requested
> length is mapped, the data path does not use it. It does not even ensure

In fact, it does use it 😊. I think you may have missed our data path
driver. Here's an example use: http://patchwork.dpdk.org/patch/85504/

> the translated address is non-NULL. It makes it trivial for a malicious
> guest to make the hypervisor's vSwitch crash by simply passing random
> values as the buffer's address and length. Fixing it is trivial, but it
> will add several more checks and loops (if a buffer spans two pages) in
> the hot path.

I don't quite understand this one. First, rte_iavf_emu_get_dma_vaddr() is
the only way to translate an address, and I think this function ensures the
input address is valid. Looking at the vhost side, vhost_iova_to_vva() does
similar things when vIOMMU is not used. Am I missing something? Please
correct me if I am wrong.
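
For reference, the way we intend it to be used in the data path is roughly
as below. The prototype of rte_iavf_emu_get_dma_vaddr() here is my
paraphrase (treat the parameter types as assumptions), but it shows the
NULL check and the loop for a buffer that spans two mappings:

#include <stdint.h>
#include <string.h>

/* Assumed prototype (paraphrased; the exact signature is in the emudev
 * patches): returns the host virtual address for iova and shrinks *len
 * to the length that is contiguously mapped starting at iova. */
void *rte_iavf_emu_get_dma_vaddr(void *mem, uint64_t iova, uint64_t *len);

static int
copy_from_guest(void *mem, uint64_t buf_iova, uint64_t buf_len, uint8_t *dst)
{
	while (buf_len > 0) {
		uint64_t chunk = buf_len;
		void *vaddr = rte_iavf_emu_get_dma_vaddr(mem, buf_iova, &chunk);

		/* Refuse unmapped or short translations instead of
		 * dereferencing whatever the guest put in the descriptor. */
		if (vaddr == NULL || chunk == 0)
			return -1;

		memcpy(dst, vaddr, chunk);

		/* A guest buffer may span two host mappings, so keep
		 * translating until the whole length is covered. */
		dst += chunk;
		buf_iova += chunk;
		buf_len -= chunk;
	}

	return 0;
}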

> 
> Other than that, there is for sure a performance gain due to all the
> features Virtio-net supports that we have to check and handle in the
> hotpath, like indirect descriptors or mergeable buffers for example.

I think iavf has similar features to indirect descriptors and mergeable
buffers for receiving/transmitting large packets, but I believe there will
be some feature differences between iavf and virtio/vhost.

I think you are correct that for this version it's not fair to compare
virtio/vhost with the iavf back-end, because the iavf back-end does not yet
support some features and, besides, we have not optimized its data path. We
expect the performance of the iavf back-end to be on the same level as
virtio 1.1, and hopefully better because of the ring layout. But let's see
once we can do a complete performance analysis 😊.

Thanks!
Chenbo

> 
> Best regards,
> Maxime
> 
> > Regards,
> > Maxime
> >
> >> Thanks!
> >> Chenbo
> >>
> >>>
> >>>
> >>> --
> >>> David Marchand
> >>


