[dpdk-dev] dmadev discussion summary

Morten Brørup mb at smartsharesystems.com
Mon Jul 5 12:28:18 CEST 2021


> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> Sent: Sunday, 4 July 2021 09.43
> 
> On Sat, Jul 3, 2021 at 5:54 PM Morten Brørup <mb at smartsharesystems.com>
> wrote:
> >
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jerin Jacob
> > > Sent: Saturday, 3 July 2021 11.09
> > >
> > > On Sat, Jul 3, 2021 at 2:23 PM Morten Brørup
> <mb at smartsharesystems.com>
> > > wrote:
> > > >
> > > > > From: fengchengwen [mailto:fengchengwen at huawei.com]
> > > > > Sent: Saturday, 3 July 2021 02.32
> > > > >
> > > > > On 2021/7/2 22:57, Morten Brørup wrote:
> > > > > >> In the DPDK framework, many data-plane API names contain
> queues.
> > > > > e.g.
> > > > > >> eventdev/crypto..
> > > > > >> The concept of virt queues has continuity.
> > > > > >
> > > > > > I was also wondering about the name "virtual queue".
> > > > > >
> > > > > > Usually, something "virtual" would be an abstraction of
> something
> > > > > physical, e.g. a software layer on top of something physical.
> > > > > >
> > > > > > Back in the days, a "DMA channel" used to mean a DMA engine
> on a
> > > CPU.
> > > > > If a CPU had 2 DMA channels, they could both be set up
> > > simultaneously.
> > > > > >
> > > > > > The current design has the "dmadev" representing a CPU or
> other
> > > chip,
> > > > > which has one or more "HW-queues" representing DMA channels (of
> the
> > > > > same type), and then "virt-queue" as a software abstraction on
> top,
> > > for
> > > > > using a DMA channel in different ways through individually
> > > configured
> > > > > contexts (virt-queues).
> > > > > >
> > > > > > > It makes sense to me, although I would consider renaming
> > > > > > > "HW-queue" to "channel" and perhaps "virt-queue" to "queue".
> > > > >
> > > > > > 'DMA channel' is used more than 'DMA queue'; Google shows at
> > > > > > least 20 times more hits for it.
> > > > > >
> > > > > > It's a good idea to build the abstraction layers: queue <>
> > > > > > channel <> dma-controller.
> > > > > > In this way, the meaning of each layer is relatively easy to
> > > > > > distinguish literally.
> > > > > >
> > > > > > Will fix in V2.
> > > > >
> > > >
> > > > After re-reading all the mails in this thread, I have found one
> more
> > > important high level detail still not decided:
> > > >
> > > > Bruce had suggested flattening the DMA channels, so each dmadev
> > > represents a DMA channel. And DMA controllers with multiple DMA
> > > channels will have to instantiate multiple dmadevs, one for each
> DMA
> > > channel.
> > > >
> > > > Just like a four port NIC instantiates four ethdevs.
> > > >
> > > > Then, like ethdevs, there would only be two abstraction layers:
> > > dmadev <> queue, where a dmadev is a DMA channel on a DMA
> controller.
> > > >
> > > > However, this assumes that the fast path functions on the
> individual
> > > DMA channels of a DMA controller can be accessed completely
> > > independently and simultaneously by multiple threads. (Otherwise,
> the
> > > driver would need to implement critical regions or locking around
> > > accessing the common registers in the DMA controller shared by the
> DMA
> > > channels.)
> > > >
> > > > Unless any of the DMA controller vendors claim that this
> assumption
> > > about independence of the DMA channels is wrong, I strongly support
> > > Bruce's flattening suggestion.
> > >
> > > It is wrong, at least from the octeontx2_dma PoV.
> > >
> > > # The PCI device is the DMA controller to which the driver/device
> > > is mapped. (As the device driver is based on the PCI bus, we don't
> > > want to have a vdev for this.)
> > > # The PCI device has HW queue(s).
> > > # Each HW queue has different channels.
> > >
> > > In the current configuration, we have only one queue per device,
> > > and it has 4 channels. The 4 channels are not thread-safe, as they
> > > are based on a single queue.
> >
> > Please clarify "current configuration": Is that a configuration
> > modifiable by changing some software/driver, or is it the chip that
> > was built that way in the RTL code?
> 
> We have 8 queues per SoC. On some HW versions, this can be configured
> as (a) or (b) using FW settings:
> a) One PCI device with 8 queues
> b) 8 PCI devices, each with one queue
> 
> Everyone is using mode (b), as it lets 8 different applications use
> DMA: once one application binds a PCI device, other applications
> cannot use that same PCI device.
> If one application needs 8 queues, all 8 DMA devices can be bound to
> that single application in mode (b).
> 
> 
> I think that, in the above way, we can flatten to
> <device> <> <channel/queue>
> 
> >
> > >
> > > I think, if we need to flatten it, it makes sense to have
> > > dmadev <> channel (where each channel can have a thread-safe
> > > capability, depending on how the device driver maps it onto the
> > > HW queues).
> >
> > The key question is how many threads can independently call
> > data-plane dmadev functions (rte_dma_copy() etc.) simultaneously. If
> > I understand your explanation correctly, only one - because you only
> > have one DMA device, and all access to it goes through a single
> > hardware queue.
> >
> > I just realized that although you only have one DMA Controller with
> > only one HW queue, your four DMA channels allow four sequentially
> > initiated transactions to be running simultaneously. Does the
> > application have any benefit by knowing that the dmadev can have
> > multiple ongoing transactions, or can the fast-path dmadev API hide
> > that ability?
> 
> In my view it is better to hide it, and I have a similar proposal at
> http://mails.dpdk.org/archives/dev/2021-July/213141.html
> --------------
> >   7) Because data-plane APIs are not thread-safe, and the user can
> >      determine the virt-queue to HW-queue mapping (at the queue-setup
> >      stage), it is the user's duty to ensure thread safety.
> 
> +1. But I am not sure how easy it is for the fast-path application to
> have this logic. Instead, I think it is better for the driver to
> report the capability per queue, and in the channel configuration the
> application can state its requirement (whether multiple producers
> enqueue to the same HW queue or not).
> Based on the request, the implementation can pick the correct function
> pointer for enqueue (locked vs. lockless version, if the HW does not
> support lockless enqueue).

+1 to that!
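
To make the idea concrete, here is a rough sketch of how a driver could
select the enqueue function once, at channel configuration time. All
names (sketch_chan, SKETCH_CAPA_MT_ENQ, sketch_chan_configure, etc.) are
hypothetical and purely illustrative; nothing here is an existing or
proposed dmadev API, and the array is only a software stand-in for a HW
queue.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not the dmadev API. */
#define SKETCH_SLOTS 64u
#define SKETCH_CAPA_MT_ENQ (1u << 0) /* HW itself accepts concurrent producers */

struct sketch_chan;
typedef int (*sketch_enq_fn)(struct sketch_chan *ch, uint64_t desc);

struct sketch_chan {
	uint64_t slots[SKETCH_SLOTS]; /* software stand-in for the HW queue */
	uint32_t tail;                /* next free slot */
	atomic_flag lock;             /* taken only by the locked path */
	sketch_enq_fn enq;            /* selected once at configure time */
};

/*
 * Plain enqueue. Models the case where the application has a single
 * producer, or where real HW would handle concurrent producers itself.
 * In this software stand-in it is NOT safe for concurrent callers.
 */
static int sketch_enq_nolock(struct sketch_chan *ch, uint64_t desc)
{
	if (ch->tail == SKETCH_SLOTS)
		return -1; /* queue full */
	ch->slots[ch->tail++] = desc;
	return 0;
}

/* The same enqueue, serialised with a spinlock for multiple producers. */
static int sketch_enq_locked(struct sketch_chan *ch, uint64_t desc)
{
	int ret;

	while (atomic_flag_test_and_set_explicit(&ch->lock,
						 memory_order_acquire))
		; /* spin until the lock is free */
	ret = sketch_enq_nolock(ch, desc);
	atomic_flag_clear_explicit(&ch->lock, memory_order_release);
	return ret;
}

/*
 * The driver reports drv_capa; the application says whether multiple
 * producers will enqueue to this channel. The locked variant is chosen
 * only when the application needs it and the HW cannot provide it.
 */
static void sketch_chan_configure(struct sketch_chan *ch, uint32_t drv_capa,
				  bool app_wants_mt)
{
	assert(ch != NULL);
	ch->tail = 0;
	atomic_flag_clear(&ch->lock);
	if (app_wants_mt && !(drv_capa & SKETCH_CAPA_MT_ENQ))
		ch->enq = sketch_enq_locked;
	else
		ch->enq = sketch_enq_nolock;
}
```

The fast path then simply calls ch->enq(ch, desc) without any branching
on thread-safety, so a single-producer application does not pay for a
lock it does not need.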

> 
> ------------------------
> >


