[25.11 PATCH v3 0/5] Introduce DMA enqueue/dequeue operations
Pavan Nikhilesh Bhagavatula
pbhagavatula at marvell.com
Wed Oct 1 11:22:47 CEST 2025
>> Hi Bruce,
>>
>> >On Sat, May 24, 2025 at 02:43:10PM +0530, <pbhagavatula at marvell.com> wrote:
>> >> From: Pavan Nikhilesh <pbhagavatula at marvell.com>
>> >>
>> >> Introduce DMA enqueue/dequeue operations to the DMA device library.
>> >>
>> >> Add configuration flags to rte_dma_config instead of boolean for
>> >> individual features.
>> >>
>> >> The enqueue/dequeue operations allow applications to communicate with the
>> >> DMA device using the rte_dma_op structure, providing a more flexible and
>> >> efficient way to manage DMA operations.
>> >>
>> >
>> >While I have no really strong objections to this addition to the dmadev
>> >API, I'd appreciate if you could explain WHY or how this method of working
>> >is more efficient in your usecase? When designing the dmadev APIs
>> >originally, we looked at using both an enqueue-type API as well as the
>> >implemented individual-op-based APIs. IIRC at that time testing showed that
>> >using the single ops directly was faster than using the enqueue APIs, so
>> >I'm wondering what exactly has changed, or is different about your usecase?
>> >
>>
>> Here is an example where we see enqueue/dequeue ops to be useful especially when
>> integrating with Graph library.
>>
>> We had to write an entire wrapper[1] for tracking sges with the current implementation
>> making our nodes[2] very complex.
>>
>
>Can you explain a bit more here. Why do you need the wrapper rather than
>just tracking in a circular ring all the copies offloaded? How does having
>an enqueue API make this better?

This is what we already do in our wrapper.
We found it to be unnecessary overhead, since the driver already does this tracking
internally and we can leverage that existing functionality.
Using the enqueue/dequeue ops also reduces the memory footprint, because in the case
below we use a lot of vchans and each one needs its own ring.
Instead of checking for completions and maintaining the circular ring, we can spend
those cycles doing other things in the application.

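To make the bookkeeping concrete, here is a minimal, self-contained sketch of the kind of
per-vchan circular ring an application has to maintain when it tracks offloaded copies
itself. It is not taken from the DAO wrapper and makes no dmadev calls; `copy_track`,
`track_submit`, `track_complete` and `RING_SZ` are hypothetical names used only for
illustration, and it assumes the device completes copies in submission order.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SZ 8 /* hypothetical ring depth, must be a power of two */

/* Hypothetical per-vchan tracker: remembers a context pointer per copy. */
struct copy_track {
	void *ctx[RING_SZ];
	uint16_t head; /* next slot to fill on submit */
	uint16_t tail; /* oldest copy still outstanding */
};

/* Record the context of a copy that was just handed to the device.
 * Returns 0 on success, -1 if the ring is full.
 */
static int
track_submit(struct copy_track *t, void *ctx)
{
	if ((uint16_t)(t->head - t->tail) == RING_SZ)
		return -1;
	t->ctx[t->head & (RING_SZ - 1)] = ctx;
	t->head++;
	return 0;
}

/* After the device reports nb_done completions (assumed in order),
 * hand back the matching contexts and release their slots.
 */
static uint16_t
track_complete(struct copy_track *t, uint16_t nb_done, void **out)
{
	uint16_t pending = (uint16_t)(t->head - t->tail);
	uint16_t i;

	/* The device cannot complete more copies than were submitted. */
	assert(nb_done <= pending);
	for (i = 0; i < nb_done; i++)
		out[i] = t->ctx[(uint16_t)(t->tail + i) & (RING_SZ - 1)];
	t->tail += nb_done;
	return nb_done;
}
```

With an op-based interface the context would travel with the op itself, so a ring like
this (one per vchan) would not be needed on the application side.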
>Can you perhaps give a trivial example
>showing the difference it makes here? The examples you give below are
>rather long to understand quickly.
>
The example below is a graph-based application that currently uses the wrapper
implementation, which we want to replace with enq/deq ops to reduce overhead.
Also, the ops descriptor already exists for the eventdev subsystem; we are just
importing it into the DMA device library and reusing it.
>Thanks,
>/Bruce
>
>> [1] https://github.com/MarvellEmbeddedProcessors/dao/blob/dao-devel/lib/common/dao_dma.h
>> [2] https://github.com/MarvellEmbeddedProcessors/dao/blob/3f364261de91e355699bd9af20d60ea6459f7d67/lib/virtio_net/virtio_net_deq_ext.c#L51
>>
>> >/Bruce
>>
>> Thanks,
>> Pavan.
>>