[dpdk-dev] [RFC] Accelerating Data Movement for DPDK vHost with DMA Engines
Maxime Coquelin
maxime.coquelin at redhat.com
Fri Apr 17 10:40:15 CEST 2020

On 4/17/20 10:29 AM, Fu, Patrick wrote:
> Hi Jerin,
> 
>> -----Original Message-----
>> From: Jerin Jacob <jerinjacobk at gmail.com>
>> Sent: Friday, April 17, 2020 4:02 PM
>> To: Fu, Patrick <patrick.fu at intel.com>
>> Cc: dev at dpdk.org; Maxime Coquelin <maxime.coquelin at redhat.com>; Ye,
>> Xiaolong <xiaolong.ye at intel.com>; Hu, Jiayu <jiayu.hu at intel.com>; Wang,
>> Zhihong <zhihong.wang at intel.com>; Liang, Cunming
>> <cunming.liang at intel.com>
>> Subject: Re: [dpdk-dev] [RFC] Accelerating Data Movement for DPDK vHost
>> with DMA Engines
>>
>> On Fri, Apr 17, 2020 at 12:56 PM Fu, Patrick <patrick.fu at intel.com> wrote:
>>>
>>> Background
>>> ====================================
>>> DPDK vhost library implements a user-space VirtIO net backend allowing
>>> host applications to directly communicate with VirtIO front-ends in VMs
>>> and containers. However, every vhost enqueue/dequeue operation requires
>>> copying packet buffers between guest and host memory. The overhead of
>>> copying large amounts of data makes the vhost backend the I/O
>>> bottleneck. DMA engines, including un-core DMA accelerators such as
>>> Crystal Beach DMA (CBDMA) and Data Streaming Accelerator (DSA), and
>>> general-purpose DMA engines on discrete cards, are extremely efficient
>>> at moving data within system memory. Therefore, we propose a set of
>>> asynchronous DMA data movement APIs in the vhost library for DMA
>>> acceleration. Offloading packet copies in the vhost data path from the
>>> CPU to the DMA engine can not only accelerate data transfers, but also
>>> save precious CPU core resources.
>>>
>>> New API Overview
>>> ====================================
>>> The proposed APIs in the vhost library support various DMA engines to
>>> accelerate data transfers in the data path. For higher performance, DMA
>>> engines work in an asynchronous manner, where DMA data transfers and
>>> CPU computations are executed in parallel. The proposed API consists of
>>> a control path API and a data path API. The control path API includes
>>> the registration API and the DMA operation callbacks, and the data path
>>> API includes the asynchronous API. To remove the dependency on
>>> vendor-specific DMA engines, the DMA operation callbacks provide generic
>>> DMA data transfer abstractions. To support asynchronous DMA data
>>> movement, the new async API provides asynchronous ring operation
>>> semantics in the data path. To enable/disable DMA acceleration for
>>> virtqueues, users need to use the registration API to
>>> register/unregister DMA callback implementations with the vhost library
>>> and bind DMA channels to virtqueues. The DMA channels used by
>>> virtqueues are provided by DPDK applications and are backed by virtual
>>> or physical DMA devices.
>>> The proposed APIs consist of 3 subsets:
>>> 1. DMA Registration APIs
>>> 2. DMA Operation Callbacks
>>> 3. Async Data APIs
>>>
>>> DMA Registration APIs
>>> ====================================
>>> DMA acceleration is enabled on a per-queue basis. DPDK applications need
>>> to explicitly decide whether a virtqueue needs DMA acceleration and
>>> which DMA channel to use. In addition, a DMA channel is dedicated to a
>>> virtqueue and cannot be bound to multiple virtqueues at the same time.
>>> To enable DMA acceleration for a virtqueue, DPDK applications need to
>>> implement DMA operation callbacks for a specific DMA type (e.g. CBDMA)
>>> first, then register the callbacks with the vhost library and bind a DMA
>>> channel to a virtqueue, and finally use the new async API to perform
>>> data-path operations on the virtqueue.
>>> The definitions of the registration API are shown below:
>>> int rte_vhost_async_channel_register(int vid, uint16_t queue_id,
>>>                                      struct rte_vdma_device_ops *ops);
>>>
>>> int rte_vhost_async_channel_unregister(int vid, uint16_t queue_id);
>>
>> We already have multiple DMA implementation over raw dev.
>> Why not make a new dmadev class for DMA acceleration and use it by virtio
>> and any other clients?
> 
> I believe it doesn't conflict. The purpose of this RFC is to create an async data path in vhost-user and provide a way for applications to work with this new path. dmadev is another topic which could be discussed separately. If we do have the dmadev available in the future, this vhost async data path could certainly be backed by the new dma abstraction without major interface change.
Maybe one advantage of a dmadev class is that it would be easier
and more transparent for the application to consume.
The application would register some DMA devices, pass them to the Vhost
library, and then rte_vhost_submit_enqueue_burst and
rte_vhost_poll_enqueue_completed would call the dmadev callbacks
directly.
Do you think that could work?
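
To make the idea a bit more concrete, here is a rough sketch of the flow I
have in mind. Everything in it is hypothetical: there is no dmadev class
today, so the sketch_* types and functions below are made-up stand-ins and
not existing DPDK or vhost APIs, and the "device" is just a memcpy-based
software model that completes every copy immediately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical generic DMA device class; NOT an existing DPDK API. */
struct sketch_dmadev;

struct sketch_dmadev_ops {
	/* Queue one copy; returns 0 on success, -1 on failure. */
	int (*copy)(struct sketch_dmadev *dev, void *dst,
		    const void *src, size_t len);
	/* Return the number of copies completed since the last call. */
	uint16_t (*completed)(struct sketch_dmadev *dev);
};

struct sketch_dmadev {
	const struct sketch_dmadev_ops *ops;
	uint16_t done; /* completions not yet reported to the caller */
};

/* Software stand-in for a DMA engine: the copy finishes immediately. */
static int sw_copy(struct sketch_dmadev *dev, void *dst,
		   const void *src, size_t len)
{
	memcpy(dst, src, len);
	dev->done++;
	return 0;
}

static uint16_t sw_completed(struct sketch_dmadev *dev)
{
	uint16_t n = dev->done;

	dev->done = 0;
	return n;
}

static const struct sketch_dmadev_ops sw_ops = {
	.copy = sw_copy,
	.completed = sw_completed,
};

/* Hypothetical virtqueue with at most one DMA device bound to it. */
struct sketch_vq {
	struct sketch_dmadev *dma;
};

/* One pending packet copy (destination, source, length in bytes). */
struct sketch_copy {
	void *dst;
	const void *src;
	size_t len;
};

/* The application passes a DMA device to the library for one queue. */
static int sketch_vq_bind_dma(struct sketch_vq *vq, struct sketch_dmadev *dev)
{
	if (vq == NULL || dev == NULL || dev->ops == NULL)
		return -1;
	vq->dma = dev;
	return 0;
}

/* Submit side: the library calls the device's copy op directly. */
static uint16_t sketch_submit_enqueue_burst(struct sketch_vq *vq,
					    const struct sketch_copy *c,
					    uint16_t count)
{
	uint16_t i;

	assert(vq != NULL && vq->dma != NULL);
	for (i = 0; i < count; i++) {
		if (vq->dma->ops->copy(vq->dma, c[i].dst, c[i].src,
				       c[i].len) != 0)
			break;
	}
	return i; /* number of copies accepted by the device */
}

/* Completion side: the library asks the device what has finished. */
static uint16_t sketch_poll_enqueue_completed(struct sketch_vq *vq)
{
	assert(vq != NULL && vq->dma != NULL);
	return vq->dma->ops->completed(vq->dma);
}
```

With something like this, the application would only create a device and
call sketch_vq_bind_dma() once per queue. It would not have to implement
any copy callbacks itself, since the submit and poll functions reach the
device ops on their own.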
Thanks,
Maxime
> Thanks,
> 
> Patrick
> 
    
    